Optimizing the threshold
You heard that the default value of 0.5 maximizes accuracy in theory, but you want to test what happens in practice. So you try out a number of different threshold values to see what accuracy each one gives, and hence determine the best-performing threshold. You then repeat the experiment for the F1 score. Is 0.5 the optimal threshold? Is the optimal threshold the same for accuracy and for the F1 score? Go ahead and find out! You have a scores matrix available, obtained by scoring the test data, and the ground truth labels for the test data are available as y_test. Finally, two numpy functions are preloaded, argmin() and argmax(), which retrieve the index of the minimum and maximum values in an array respectively, in addition to the metrics accuracy_score() and f1_score().
Exercise instructions
- Create a range of threshold values that include 0.0, 0.25, 0.5, 0.75 and 1.0.
- Via a double list comprehension, store the predictions for each threshold value in the range above. Recall that obtaining labels from a scores matrix with a threshold thr is possible using [s[1] > thr for s in scores] (see the short sketch after this list).
- Run through that list and compute the accuracy for each threshold. Repeat for the F1 score.
- Using either argmin() or argmax(), find the optimal threshold for accuracy, and for F1.
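As a quick illustration of that thresholding idiom, the sketch below builds a small, made-up scores matrix (three rows of class probabilities, purely hypothetical) and converts it into predicted labels at a threshold of 0.5.

import numpy as np

# Hypothetical scores matrix: each row holds [P(class 0), P(class 1)]
scores = np.array([[0.9, 0.1],
                   [0.4, 0.6],
                   [0.2, 0.8]])

thr = 0.5
# Compare the positive-class probability s[1] to the threshold for every row
labels = [s[1] > thr for s in scores]
print(labels)  # only the second and third rows exceed the threshold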
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a range of equally spaced threshold values
t_range = ____
# Store the predicted labels for each value of the threshold
preds = [[____ > thr for s in scores] for ____ in ____]
# Compute the accuracy for each threshold
accuracies = [____(____, ____) for p in preds]
# Compute the F1 score for each threshold
f1_scores = [____(____, ____) for p in preds]
# Report the optimal threshold for accuracy, and for F1
print(t_range[____(accuracies)], t_range[____(f1_scores)])
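For reference, one possible completion of the blanks is sketched below. It is not the course's official solution; in the exercise environment scores, y_test, accuracy_score(), f1_score(), argmin() and argmax() are already preloaded, so the imports and the synthetic scores and y_test here are only hypothetical stand-ins that let the script run on its own.

import numpy as np
from numpy import argmin, argmax
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical stand-ins for the preloaded data: a two-column scores matrix
# of class probabilities and matching ground-truth labels
rng = np.random.default_rng(42)
p1 = rng.random(200)
scores = np.column_stack([1 - p1, p1])
y_test = (p1 + rng.normal(0, 0.2, 200) > 0.5).astype(int)

# Create a range of equally spaced threshold values
t_range = np.linspace(0, 1, 5)   # 0.0, 0.25, 0.5, 0.75, 1.0

# Store the predicted labels for each value of the threshold
preds = [[s[1] > thr for s in scores] for thr in t_range]

# Compute the accuracy for each threshold
accuracies = [accuracy_score(y_test, p) for p in preds]

# Compute the F1 score for each threshold
f1_scores = [f1_score(y_test, p) for p in preds]

# Report the optimal threshold for accuracy, and for F1
print(t_range[argmax(accuracies)], t_range[argmax(f1_scores)])

Since both accuracy and F1 are to be maximized, argmax() is the function you need here; argmin() would pick the worst threshold instead. Note that at the extreme threshold of 1.0 every prediction is negative, so scikit-learn may emit an UndefinedMetricWarning and report an F1 score of 0.0 for that value.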