Predict and evaluate

1. Predict and evaluate

Welcome back! Now that you know how to create a data split and classification tree model, let's learn how to make predictions and evaluate how close they are to the truth.

2. Predicting on new data

Like most machine learning packages in R, parsnip has a predict() function. The first argument is the trained model, the second argument is the new or test dataset, and the third argument, type, controls whether the function returns predicted labels or probabilities.
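Here is a minimal sketch of that call, assuming a fitted classification tree called tree_model and a test split called diabetes_test (both names are assumptions for illustration):

```r
library(tidymodels)

# Trained model first, new data second; type selects labels vs. probabilities
predict(tree_model, new_data = diabetes_test, type = "class")
```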

3. Predicting on new data

Call the predict() function with the model and the test data. The default for type is "class", which produces class labels. If you write type = "prob" instead, the result contains one numeric column per outcome level, and each number is the predicted probability of that class. Note that you always get a tibble whose rows correspond to the rows of the new data, which makes working with tidymodels objects very handy.
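And a sketch of the probability version, under the same assumed names:

```r
# Probabilities instead of labels: one numeric column per outcome level,
# e.g. .pred_yes and .pred_no (the exact names depend on your factor levels),
# with one row per row of diabetes_test
predict(tree_model, new_data = diabetes_test, type = "prob")
```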

4. Confusion matrix

The confusion matrix is a really useful tool for evaluating binary classification models. It's called a confusion matrix because it reveals how confused the model is between the two classes, and highlights instances in which it confuses one class for the other.

5. Confusion matrix

The columns of a confusion matrix correspond to the truth labels,

6. Confusion matrix

and the rows represent the predictions.

7. Confusion matrix

In a binary classification problem, the confusion matrix will be a 2 by 2 table. The main diagonal contains the counts of correctly classified examples, that is "yes"-predictions that are in fact "yes", and "no" predictions that are in fact "no". A good model will contain most of the examples in the main diagonal (the green squares) and it will have a small number of examples, ideally zero, in the off-diagonal (the red squares).

8. Confusion matrix

Let's briefly review the four possible outcomes with a binary classification model: true positives, or TP, are cases where the model correctly predicted yes. True negatives, or TN, are cases where the model correctly predicted no. False positives, or FP, are cases where the model predicted yes, but the true label is no. False negatives, or FN, are cases where the model predicted no, but the true label is yes.
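As a quick illustration, here is a toy sketch (the vectors are invented for this example, not taken from the diabetes data) showing how the four outcomes land in the 2 by 2 layout:

```r
# Toy truth and prediction vectors
truth      <- factor(c("yes", "yes", "yes", "no", "no", "no"), levels = c("yes", "no"))
prediction <- factor(c("yes", "yes", "no",  "no", "no", "yes"), levels = c("yes", "no"))

# Rows are predictions, columns are truth, matching the slides:
# ("yes", "yes") = TP, ("yes", "no") = FP, ("no", "yes") = FN, ("no", "no") = TN
table(Prediction = prediction, Truth = truth)
```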

9. Create the confusion matrix

So, how do you create such a confusion matrix? Simply use the mutate() function to add the true outcomes from the diabetes test tibble to the predictions. The resulting tibble, pred_combined, has two columns: dot-pred_class and true_class. The yardstick package, which is part of the tidymodels framework, provides the conf_mat() function. You need to specify three arguments: data, the tibble containing your predictions and true values; estimate, your prediction column; and truth, the column of true values. Printing the result displays the matrix.
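A minimal sketch of these steps, assuming the model is called tree_model, the test split is diabetes_test, and its true labels live in a column called outcome (those names are assumptions; pred_combined, dot-pred_class, and true_class come from the slide):

```r
library(tidymodels)

# Combine the predictions with the true outcomes from the test set
pred_combined <- predict(tree_model, new_data = diabetes_test) %>%
  mutate(true_class = diabetes_test$outcome)

# Build the confusion matrix with yardstick
conf_mat(data = pred_combined,
         estimate = .pred_class,
         truth = true_class)
```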

10. Accuracy

There are quite a few ways of evaluating classification performance. Accuracy measures how often the classifier predicts the class correctly. It is defined as the ratio of the number of correct predictions to the total number of predictions made. It is intuitive and easy to calculate, and in Chapter 3, you'll get to know more helpful performance metrics! The yardstick package makes it very easy to assess accuracy. Just call the accuracy() function the same way you called the conf_mat() function: supply the pred_combined tibble containing predictions and true values, and specify the estimate and truth arguments as before. The result is a tibble containing the name of the metric, accuracy, and its value. In this case, the model is right about 70% of the time.
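Continuing the sketch above, accuracy() uses the same interface as conf_mat():

```r
# Accuracy = correct predictions / total predictions
accuracy(data = pred_combined,
         estimate = .pred_class,
         truth = true_class)
# Returns a one-row tibble with .metric, .estimator, and .estimate columns
```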

11. Let's evaluate!

Now that you've learned some techniques for evaluating classification models, it's your turn!