
More model measures

1. More model measures

Welcome back. So far, you have evaluated your binary classification models using accuracy.

2. Limits of accuracy

But do you recall that one exercise where the model that always predicted "no" achieved a very high accuracy? That is easily possible with an imbalanced data set.

3. Sensitivity or true positive rate

Fortunately, there are more binary classification metrics. One is sensitivity. Sensitivity, or true positive rate, is the proportion of all positive outcomes that your model found. For example, of the credit card customers that did churn, how many did our model predict correctly?
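In terms of the confusion matrix, with TP the number of true positives and FN the number of false negatives, this is:

    sensitivity = TP / (TP + FN)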

4. Specificity or true negative rate

Specificity, or true negative rate, measures the proportion of all negative outcomes that were correctly classified. For example, of the credit card customers that did not churn, what proportion did our model predict correctly?
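In the same notation, with TN the number of true negatives and FP the number of false positives:

    specificity = TN / (TN + FP)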

5. Different thresholds

So far, you have only used the predict() function to calculate predicted classes. It predicted "yes" if the probability was more than 0-point-5, and "no" otherwise. What if you calculated predicted probabilities instead? You could try different thresholds for your class predictions, and each threshold would give you a different confusion matrix and different performance measures. Is there one threshold among these that performs best?
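As a minimal sketch, assuming a fitted model named model, a test tibble named test_data, and a churn outcome with levels "yes" and "no" (all names illustrative), a custom threshold could be applied like this:

    # Predicted probabilities instead of predicted classes
    probs <- predict(model, new_data = test_data, type = "prob")

    # Classify as "yes" whenever the probability exceeds 0.3 instead of 0.5
    preds_03 <- factor(ifelse(probs$.pred_yes > 0.3, "yes", "no"),
                       levels = c("yes", "no"))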

6. ROC (Receiver-operating-characteristic) curve

The ROC curve, where ROC stands for receiver operating characteristic, visualizes the performance of a classification model across all possible probability thresholds. For each unique threshold, a point is added to the plot that represents the true positive rate, or sensitivity, and the false positive rate, or one minus the specificity.
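The false positive rate plotted on the x-axis is, in the same confusion-matrix notation:

    false positive rate = FP / (FP + TN) = 1 - specificity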

7. ROC curve and AUC

An ideal ROC curve would hug the top-left corner of the graph, with a true positive rate, or sensitivity, of 1 and a false positive rate, or 1 minus specificity, of 0. The overall performance of a classifier, summarized over all possible thresholds, is given by the area under this curve (AUC).

8. Area under the ROC curve

An AUC of 0-point-5 corresponds to a ROC curve along the diagonal and indicates that a model performs no better than random chance. An AUC of 1 corresponds to a model that perfectly classifies every example, while an AUC of 0 means that your model misclassifies every observation. So, ROC curves and the AUC are useful for comparing different classifiers, since they take all possible thresholds into account.

9. yardstick sensitivity: sens()

Let's see how all that is coded in tidymodels. Imagine you have a class predictions tibble containing the predicted and true classes. The sens() function calculates sensitivity and takes the same arguments as the conf_mat() and accuracy() functions. For data, we supply our predictions tibble; for the estimate column, we specify dot-pred_class; and for the truth column, we specify true_class. As you would expect, the function returns a tibble that gives the sensitivity in the column dot-estimate. In this case, 87-point-2 percent of all positive outcomes were found by the model.
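Putting the described call together as a sketch, with the predictions tibble and column names taken from the slide:

    # Sensitivity from predicted and true classes
    sens(predictions,
         truth = true_class,
         estimate = .pred_class)

    # A tibble: 1 x 3
    #   .metric .estimator .estimate
    #   <chr>   <chr>          <dbl>
    # 1 sens    binary         0.872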

10. yardstick ROC: roc_curve()

To plot a ROC curve, you first need predicted probabilities instead of predicted classes. Use the predict() function with the model, the test data, and type set to "prob". Then, add these predictions to the test data tibble using bind_cols(). The next step is to create a tibble with sensitivity and specificity for many different thresholds. This is done by the roc_curve() function. Pass the predictions tibble as the first argument, the dot-pred_yes column as the estimate, and the still_customer column as the truth argument. This returns a tibble with specificity and sensitivity for all unique thresholds. Using the autoplot() function, you get a graphical representation of this curve.
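A sketch of the whole sequence, again assuming a fitted model named model and a test tibble named test_data:

    library(tidymodels)

    # Predicted probabilities, joined back to the test data
    predictions <- predict(model, new_data = test_data, type = "prob") %>%
      bind_cols(test_data)

    # Sensitivity and specificity for every unique threshold
    roc <- roc_curve(predictions,
                     truth = still_customer,
                     .pred_yes)

    # Graphical representation of the ROC curve
    autoplot(roc)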

11. yardstick AUC: roc_auc()

Calculating the area under the curve is no different. The roc_auc() function takes the same arguments as usual: data, estimate, and truth. The result is the familiar tibble containing our result in the dot-estimate column. Our area under the curve here is 0-point-872, or 87-point-2 percent.
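The corresponding call, continuing the sketch from the previous slide:

    # Area under the ROC curve
    roc_auc(predictions,
            truth = still_customer,
            .pred_yes)

    # A tibble: 1 x 3
    #   .metric .estimator .estimate
    #   <chr>   <chr>          <dbl>
    # 1 roc_auc binary         0.872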

12. Let's measure!

Now it's your turn to draw receiver-operating-characteristic curves and calculate sensitivity and area under the curve.
