Exercise

# Optimizing the threshold

You heard that the default threshold of 0.5 maximizes accuracy in theory, but you want to test what happens in practice. So you try out a number of different threshold values to see what accuracy you get, and hence determine the best-performing threshold. You then repeat the experiment for the F1 score. Is 0.5 the optimal threshold? Is the optimal threshold the same for accuracy and for the F1 score? Go ahead and find out!

A `scores` matrix is available, obtained by scoring the test data. The ground-truth labels for the test data are also available as `y_test`. Finally, two `numpy` functions are preloaded, `argmin()` and `argmax()`, which retrieve the index of the minimum and maximum value in an array respectively, in addition to the metrics `accuracy_score()` and `f1_score()`.
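
As a quick reminder of what the two index functions return, here is a minimal sketch. The `values` list is made up for illustration and is not part of the exercise data.

```python
import numpy as np

# Hypothetical metric values, one per candidate threshold
values = [0.62, 0.81, 0.74]

i_max = int(np.argmax(values))  # position of the largest value
i_min = int(np.argmin(values))  # position of the smallest value

print(i_max, i_min)
```

Both functions return a position, not the value itself, so the result is typically used to index back into the list of thresholds.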

Instructions

- Create a range of threshold values that includes 0.0, 0.25, 0.5, 0.75 and 1.0.
- Via a double list comprehension, store the predictions for each threshold value in the range above. Recall that labels can be obtained from a scores matrix using a threshold `thr` via `[s[1] > thr for s in scores]`.
- Run through that list and compute the accuracy for each threshold. Repeat for the F1 score.
- Using either `argmin()` or `argmax()`, find the optimal threshold for accuracy, and for F1.
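
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the real `scores` and `y_test` are preloaded in the exercise, so the toy arrays below are invented stand-ins, and the variable names (`t_range`, `preds`, `accuracies`, `f1_scores`) are hypothetical. The sketch also assumes that column 1 of `scores` holds the score for the positive class, and it casts each prediction to `int` so the labels have the same type as the toy `y_test`.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Toy stand-ins for the preloaded data (assumption: one row per test
# example, column 1 = score for the positive class).
p_pos = np.array([0.05, 0.1, 0.2, 0.3, 0.35, 0.4, 0.45, 0.48, 0.7, 0.9])
scores = np.column_stack([1 - p_pos, p_pos])
y_test = np.array([0, 0, 0, 1, 0, 0, 1, 0, 1, 1])

# Candidate thresholds
t_range = [0.0, 0.25, 0.5, 0.75, 1.0]

# Double list comprehension: one list of predicted labels per threshold
preds = [[int(s[1] > thr) for s in scores] for thr in t_range]

# One metric value per threshold; zero_division=0 avoids a warning when
# no example is predicted positive (as happens at thr = 1.0)
accuracies = [accuracy_score(y_test, p) for p in preds]
f1_scores = [f1_score(y_test, p, zero_division=0) for p in preds]

# Higher is better for both metrics, so argmax picks the best threshold
best_acc_thr = t_range[np.argmax(accuracies)]
best_f1_thr = t_range[np.argmax(f1_scores)]

print("Best threshold for accuracy:", best_acc_thr)
print("Best threshold for F1:", best_f1_thr)
```

On this invented data the two metrics happen to peak at different thresholds (0.5 for accuracy, 0.25 for F1). That says nothing about the exercise data, where the outcome has to be checked by running the experiment.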