Precision and recall
Both precision and recall are derived from the four outcomes discussed in the prior lesson (true positives, false positives, true negatives, and false negatives) and are important evaluation metrics for any classification model. An ad CTR model should ideally have high precision (a high ROI on ad spend) and high recall (relevant audience targeting). Although it is possible to calculate precision and recall by hand, sklearn has handy implementations that you can easily plug into an existing workflow. In this exercise, you will set up a decision tree and calculate precision and recall.
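To make the "by hand" calculation concrete, here is a minimal sketch that counts the outcomes directly and compares the result with sklearn. The labels y_true and y_hat are made-up illustration data (1 = click, 0 = no click), not part of the exercise workspace.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels for illustration only: 1 = click, 0 = no click
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_hat = [1, 0, 0, 1, 1, 0, 1, 0]

# Count the outcomes that precision and recall depend on
tp = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_hat) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_hat) if t == 1 and p == 0)  # false negatives

# Precision: share of predicted clicks that were real clicks
precision_manual = tp / (tp + fp)
# Recall: share of real clicks that the model found
recall_manual = tp / (tp + fn)

# The sklearn functions (binary average by default) should agree
precision_sk = precision_score(y_true, y_hat)
recall_sk = recall_score(y_true, y_hat)

print("Manual:  precision=%s, recall=%s" % (precision_manual, recall_manual))
print("sklearn: precision=%s, recall=%s" % (precision_sk, recall_sk))
```

True negatives do not enter either formula, which is why both metrics stay informative when non-clicks vastly outnumber clicks.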
The pandas module is available as pd in your workspace and the sample DataFrame is loaded as df. The features are loaded in X and the target is loaded in y for use. Additionally, precision_score() and recall_score() from sklearn.metrics are available.
This exercise is part of the course Predicting CTR with Machine Learning in Python.
Exercise instructions
- Obtain the training and testing splits for X and y.
- Define a decision tree classifier and produce predictions y_pred by fitting the model.
- Use implementations from sklearn to get the precision and recall scores.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Set up training and testing split
X_train, X_test, y_train, y_test = ____(
____, ____, test_size = .2, random_state = 0)
# Create classifier and make predictions
clf = ____
y_pred = clf.____(____, _____).____(X_test)
# Evaluate precision and recall
prec = ____(y_test, ____, average = 'weighted')
recall = ____(y_test, ____, average = 'weighted')
print("Precision: %s, Recall: %s" %(prec, recall))
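One possible way to fill in the blanks is sketched below. It is not the official solution: the exercise's df, X, and y are not available outside the workspace, so the sketch assumes a synthetic, imbalanced stand-in built with make_classification, and it fixes random_state on the tree only to make the run repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the exercise's X and y
# (roughly 20% positives, loosely mimicking click data)
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=0)

# Set up training and testing split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Create classifier and make predictions
clf = DecisionTreeClassifier(random_state=0)
y_pred = clf.fit(X_train, y_train).predict(X_test)

# Evaluate precision and recall, averaged over classes by support
prec = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
print("Precision: %s, Recall: %s" % (prec, recall))
```

With average = 'weighted', each class's score is weighted by its number of true instances, so the weighted recall equals overall accuracy. To look at the click class alone, the default binary averaging may be more informative.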