Calculate a confusion matrix
As you saw in the video, a confusion matrix is a very useful tool for calibrating the output of a model and examining all possible outcomes of your predictions (true positive, true negative, false positive, false negative).
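For a two-class problem this is a 2 x 2 table. In caret's convention the rows are predictions and the columns are reference (actual) values, so with "M" as the positive class a layout like the following (illustrative counts) has true positives and true negatives on the diagonal, and false positives and false negatives off it:

          Reference
Prediction  M  R
         M 13  2
         R  3 12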
Before you make your confusion matrix, you need to "cut" your predicted probabilities at a given threshold to turn probabilities into a factor of class predictions. Combine ifelse() with factor() as follows:
pos_or_neg <- ifelse(probability_prediction > threshold, positive_class, negative_class)
p_class <- factor(pos_or_neg, levels = levels(test_values))
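To make this concrete, here is a minimal sketch with made-up probabilities and labels (probs and actual are illustrative names, not objects from the exercise):

probs <- c(0.92, 0.10, 0.65, 0.40)
actual <- factor(c("M", "R", "M", "R"), levels = c("M", "R"))
pos_or_neg <- ifelse(probs > 0.5, "M", "R")
p_class <- factor(pos_or_neg, levels = levels(actual))
p_class
#> [1] M R M R
#> Levels: M R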
confusionMatrix() in caret improves on table() from base R by adding lots of useful ancillary statistics in addition to the base rates in the table. You can calculate the confusion matrix (and the associated statistics) using the predicted outcomes as well as the actual outcomes, e.g.:
confusionMatrix(p_class, test_values)
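Continuing the sketch above (this assumes caret is installed and the toy probs/actual vectors are still in scope):

library(caret)
confusionMatrix(p_class, actual)

This prints the 2 x 2 table along with accuracy, kappa, sensitivity, specificity, and related statistics; by default the first factor level ("M" here) is treated as the positive class. With these toy values every prediction is correct, so all of the rates come out at 1.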
This exercise is part of the course Machine Learning with caret in R.
Exercise instructions
- Use ifelse() to create a character vector, m_or_r, that is the positive class, "M", when p is greater than 0.5, and the negative class, "R", otherwise.
- Convert m_or_r to be a factor, p_class, with levels the same as those of test[["Class"]].
- Make a confusion matrix with confusionMatrix(), passing p_class and the "Class" column from the test dataset.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# If p exceeds threshold of 0.5, M else R: m_or_r
# Convert to factor: p_class
# Create confusion matrix
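One possible completion (a sketch, assuming the exercise environment supplies the probability vector p and the test data frame, with caret already loaded, as the instructions describe):

m_or_r <- ifelse(p > 0.5, "M", "R")                          # positive class when p > 0.5
p_class <- factor(m_or_r, levels = levels(test[["Class"]]))  # match the test set's factor levels
confusionMatrix(p_class, test[["Class"]])                    # table plus ancillary statistics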