1. Class probabilities and predictions
In the previous video, we worked through an example confusion matrix using 50% as the classification cutoff threshold.
2. Different thresholds
However, we're not limited to using this threshold. For example, if we wanted to catch more mines (at the expense of more false positives), we could use a cutoff of 10%. On the other hand, if we wanted to be more certain of our predicted mines (at the expense of catching fewer of them) we could use 90% as our threshold.
In other words, choosing a threshold is an exercise in balancing the true positive rate (or percent of mines we catch) with the false positive rate (or percent of non-mines we incorrectly flag as mines). Choosing a threshold is therefore very important, and also somewhat dependent on a cost-benefit analysis of the problem at hand.
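As a quick sketch of what this looks like in code, assume a caret model object called model and a test data frame called test whose true labels live in a Class column (both names are hypothetical here, chosen for illustration). Turning predicted probabilities into class labels at a given cutoff is just a comparison:

```r
# Predicted probability of the positive class ("M" for mine) on the test set
probs <- predict(model, newdata = test, type = "prob")[, "M"]

# Permissive 10% cutoff: catches more mines, at the cost of more false positives
pred_10 <- ifelse(probs > 0.10, "M", "R")

# Strict 90% cutoff: fewer predicted mines, but with more certainty in each
pred_90 <- ifelse(probs > 0.90, "M", "R")
```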
Unfortunately, there's not a good heuristic for choosing prediction thresholds ahead of time. You usually have to use a confusion matrix on your test set to find a good threshold.
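One way to do that search, sketched here under the same assumptions as above, is to scan a grid of cutoffs and tabulate a statistic such as accuracy on the test set:

```r
# Evaluate test-set accuracy over a grid of candidate cutoffs
cutoffs <- seq(0.1, 0.9, by = 0.1)
accuracy <- sapply(cutoffs, function(cut) {
  pred <- ifelse(probs > cut, "M", "R")
  mean(pred == test$Class)
})
data.frame(cutoff = cutoffs, accuracy = accuracy)
```

The same loop works for any other confusion-matrix statistic, such as sensitivity or specificity, depending on which tradeoff matters for the problem at hand.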
3. Confusion matrix
Let's work through an example and suppose we want fewer predicted mines, with a greater degree of certainty in each prediction. To do this, we could use a larger cutoff on our predicted probabilities, for example 99% rather than 50%, and make the same 2-way frequency table we used in the previous exercise.
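In code, that table could be built along the following lines, again using the hypothetical probs and test$Class from above:

```r
# Classify as a mine only when we are at least 99% confident
pred_99 <- ifelse(probs > 0.99, "M", "R")

# 2-way frequency table of predicted vs. actual classes
table(predicted = pred_99, actual = test$Class)
```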
4. Confusion matrix with caret
As before, we can also use caret's helper functions to calculate the statistics associated with this confusion matrix. In this case, we get an accuracy of 30%, which is better than our last attempt, but still far below the 51% accuracy of the no-information model that always predicts mines.
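A minimal call to caret's confusionMatrix() helper, using the same assumed objects as above, might look like this; it expects factors with matching levels and reports accuracy, the no-information rate, sensitivity, specificity, and more:

```r
library(caret)

# Compare predicted classes against the true classes
confusionMatrix(
  data      = factor(pred_99, levels = c("M", "R")),
  reference = factor(test$Class, levels = c("M", "R"))
)
```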
5. Let’s practice!
Let's play around with some more classification thresholds and see if we can manually find a good one for our rocks vs. mines model.