Model evaluation
1. Model evaluation
In this lesson, we will discuss model evaluation in more depth.
2. Precision and recall
Keeping the four categories of outcomes in mind, let's dive into precision and recall. Precision is defined as the number of true positives divided by the sum of true positives and false positives. In our setting, precision is therefore the proportion of actual clicks among all impressions served. This is important to maximize: if you are running an ad campaign, it directly determines the ROI on your ad spend. Recall is defined as the number of true positives divided by the sum of true positives and false negatives. The denominator here is the total number of clicks available, so recall represents the proportion of available clicks that the model captures. A higher recall means that the ads being run are reaching the relevant audience.
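To make the formulas concrete, here is a minimal sketch in Python with purely illustrative outcome counts (the tp, fp, and fn values are made up, not from the lesson's data):

tp, fp, fn = 80, 20, 40     # hypothetical true positives, false positives, false negatives
precision = tp / (tp + fp)  # 80 / 100 = 0.80: clicks per impression served
recall = tp / (tp + fn)     # 80 / 120 = 0.67 (approx.): share of available clicks captured
print(precision, recall)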
3. Calculating precision and recall
Although it is certainly possible to calculate precision and recall by hand from the four categories, sklearn has some handy functions that we can use with our models. To calculate precision, we can use the precision_score function, and for recall we can use the recall_score function. The result in each case is a float, as shown. Both functions take in y_test (the actual target values), y_pred (our model's predicted target values), and a parameter named average, which we set to "weighted". This parameter addresses the class imbalance by weighting each class's score by that class's frequency.
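As a minimal sketch of these calls, assuming y_test and y_pred hold binary click labels (the arrays below are illustrative only):

import numpy as np
from sklearn.metrics import precision_score, recall_score

y_test = np.array([0, 0, 1, 1, 0, 1])  # actual target values (illustrative)
y_pred = np.array([0, 1, 1, 0, 0, 1])  # predicted target values (illustrative)

# average="weighted" computes the score per class and weighs each class
# by its frequency, accounting for the click/no-click imbalance
print(precision_score(y_test, y_pred, average="weighted"))
print(recall_score(y_test, y_pred, average="weighted"))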
4. Baseline classifiers
With all models, it is important to evaluate them against some kind of baseline. Due to the imbalanced nature of click data, the baseline classifier is not a random coin flip, since a classifier that always predicts no click achieves a higher accuracy than a coin flip would. Therefore, the baseline we will assess our evaluation metrics, including precision and recall, against is the classifier that always predicts no-click. We can simulate such a model by populating the y_pred array, which is a numpy array, with zeros. Here we can use a lambda function that puts a 0 in every element of an array whose length matches that of X_test, and therefore of y_pred and y_test. The result is an array of zeros that has the same shape as the y_pred array of any other classifier, so it can be used as input for analysis versus the baseline, as sketched below.
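Here is a minimal sketch of that construction, where X_test is a stand-in feature matrix (np.zeros(len(X_test)) would produce the same array more directly):

import numpy as np

X_test = np.zeros((5, 3))  # stand-in for the real test feature matrix

# A 0 for every row of X_test, built with a lambda as described above
y_pred = np.array(list(map(lambda _: 0, range(len(X_test)))))
print(y_pred)  # [0 0 0 0 0]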
5. Implications on ROI analysis
Note that since the baseline classifier always predicts a non-click, or 0, the number of true positives (TP) and false positives (FP) will be zero. Therefore, in the ROI framework discussed in the last lesson, both the total return and total spend amounts are 0. In general, however, we can apply the same ROI analysis to any set of classifiers. As a recap, for each classifier we can build a confusion matrix using the confusion_matrix function, along with numpy's ravel function, to get the four categories of outcomes, which we can then plug into the ROI equations. Here are the relevant equations for reference.
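A minimal sketch of that recap, assuming the same illustrative y_test and y_pred as before; the return and cost figures are hypothetical placeholders, not the lesson's numbers:

import numpy as np
from sklearn.metrics import confusion_matrix

y_test = np.array([0, 0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 1, 0, 0, 1])

# ravel flattens the 2x2 confusion matrix into the four outcome counts
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

return_per_click = 0.50       # hypothetical return per click
cost_per_impression = 0.10    # hypothetical cost per impression served
total_return = tp * return_per_click           # only true positives generate return
total_spend = (tp + fp) * cost_per_impression  # every predicted click costs an impression
roi = total_return / total_spend if total_spend > 0 else 0.0
print(roi)

Note that for the all-zeros baseline, tp and fp are both 0, so total return and total spend are both 0, consistent with the discussion above.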
6. Let's practice!
Now that you've learned about precision and recall, baseline classifiers, and recapped comparing classifiers in terms of ROI on ad spend, let's jump right into some examples!