
Model evaluation

1. Model evaluation

Well done on reviewing logistic regression! In this final lesson of the course, we will go over model evaluation.

2. Introduction

We've fitted linear and logistic regression models, but we haven't yet checked how good their predictions are. Companies need to know whether the models work and, consequently, whether they can trust their predictions.

3. Validation set approach

The slide presents a dataset where y is the response variable. To validate your model, you can randomly divide the dataset into two parts:

4. Validation set approach

A training set and a test set.

5. Validation set approach

You fit the model on the training set.

6. Validation set approach

Then, you derive the predictions using the explanatory variables of the test set.

7. Validation set approach

The final step is to compare the predicted values with the actual values of the test set and compute the evaluation metric. You can iteratively change your model to improve the evaluation metric.
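The transcript doesn't show code for these steps, but the whole procedure can be sketched in plain Python. This is a minimal illustration with made-up toy data, using a simple least-squares line as the model and Mean Absolute Error as the evaluation metric:

```python
import random

# Toy dataset (hypothetical): x is the explanatory variable, y = 2x + 1 plus noise.
random.seed(42)
data = [(x, 2 * x + 1 + random.uniform(-1, 1)) for x in range(20)]

# Step 1: randomly divide the dataset into a training set and a test set.
random.shuffle(data)
train, test = data[:15], data[15:]

# Step 2: fit a simple linear model (least squares) on the training set.
def fit(points):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x, _ in points))
    return slope, my - slope * mx

slope, intercept = fit(train)

# Step 3: derive predictions from the test set's explanatory variable.
preds = [slope * x + intercept for x, _ in test]

# Step 4: compare predictions with the actual test values via a metric (MAE here).
mae = sum(abs(p - y) for p, (_, y) in zip(preds, test)) / len(test)
```

You would then tweak the model, rerun steps 2-4, and keep the version with the best metric.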

8. Cross-validation

Another approach is called k-fold cross-validation, where k is the number of subsamples, or folds. Let's use 5-fold cross-validation as an example.

9. Cross-validation

In 5-fold cross-validation, we divide the whole dataset into five random subsamples of approximately the same size.

10. Cross-validation

We then use one of the subsamples as a test set and the remaining four as a training set.

11. Cross-validation

We repeat that five times

12. Cross-validation

so that each subsample

13. Cross-validation

is used as a test set

14. Cross-validation

exactly once. We can calculate the average of metrics to have one number for comparison against other models.
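The loop above can be sketched in plain Python. This is a toy illustration with hypothetical response values, a deliberately trivial "model" (predicting the training mean), and MAE as the metric; only the fold bookkeeping is the point:

```python
import random

# Hypothetical response values; the "model" simply predicts the training mean.
random.seed(0)
y = [random.gauss(10, 2) for _ in range(20)]
random.shuffle(y)

k = 5
fold_size = len(y) // k
folds = [y[i * fold_size:(i + 1) * fold_size] for i in range(k)]

maes = []
for i in range(k):
    test_fold = folds[i]                                            # fold i is the test set
    train = [v for j, f in enumerate(folds) if j != i for v in f]   # remaining four folds
    prediction = sum(train) / len(train)                            # "fit" the model
    mae = sum(abs(v - prediction) for v in test_fold) / len(test_fold)
    maes.append(mae)

# Each fold served as the test set exactly once; average the five metrics
# into one number for comparison against other models.
avg_mae = sum(maes) / k
```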

15. Confusion matrix

Now, let's move to evaluation metrics. We will go over regression and classification metrics. Take a look at the confusion matrix. There are four possible classification results.

16. Confusion matrix

The diagonal elements represent the predictions for which the predicted label is equal to the true label. The two remaining elements represent incorrect predictions.

17. Confusion matrix

Either the actual result was true, and we predicted false,

18. Confusion matrix

or the actual result was false, and we predicted true.
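Those four outcomes can be counted directly in code. A minimal plain-Python sketch with made-up label vectors (1 = true, 0 = false):

```python
# Hypothetical actual and predicted labels for eight observations.
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # both true: diagonal
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # both false: diagonal
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # actual true, predicted false
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # actual false, predicted true
```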

19. Classification metrics

Accuracy is the number of correct predictions divided by the total number of predictions. Precision is the number of true positives divided by the sum of true and false positives. Recall, on the other hand, is the number of true positives divided by the sum of true positives and false negatives.
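As a quick sketch, the three formulas applied to some hypothetical confusion-matrix counts:

```python
# Hypothetical counts: true positives, false positives, false negatives, true negatives.
tp, fp, fn, tn = 6, 2, 3, 9

accuracy  = (tp + tn) / (tp + tn + fp + fn)   # correct predictions / all predictions
precision = tp / (tp + fp)                    # TP / (TP + FP)
recall    = tp / (tp + fn)                    # TP / (TP + FN)
```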

20. Classification metrics

Different metrics serve different purposes. Precision matters, for example, for a spam detector: we would rather classify a spam e-mail as non-spam than have the user miss something important. Recall may be more useful when screening for a rare disease, since it's better to raise the alarm when symptoms even resemble the disease than to ignore it.

21. Regression metrics

To evaluate classification, we count the number of correct and false predictions.

22. Regression metrics

In regression, we measure the distance between the actual and the predicted values.

23. Regression metrics

Root Mean Squared Error and Mean Absolute Error are commonly used metrics. Root Mean Squared Error measures the average magnitude of the error by taking the square root of the average of the squared differences between the predictions and the actual observations. Mean Absolute Error is the average of the absolute differences between the predictions and the actual values.
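Both metrics are a few lines of plain Python. A minimal sketch with made-up actual and predicted values:

```python
# Hypothetical actual and predicted values from a regression model.
actual    = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

errors = [p - a for p, a in zip(predicted, actual)]

# RMSE: square root of the average squared error.
rmse = (sum(e ** 2 for e in errors) / len(errors)) ** 0.5

# MAE: average absolute error.
mae = sum(abs(e) for e in errors) / len(errors)
```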

24. Regression metrics

The difference between these two metrics is that Root Mean Squared Error gives a relatively high weight to large errors. You should use this metric if large errors are particularly undesirable. For example, if it's worse for you to be wrong by ten than to be wrong by five twice, then choose Root Mean Squared Error over Mean Absolute Error as your metric. Mean Absolute Error may be preferable because its interpretation is straightforward.
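The "wrong by ten once versus wrong by five twice" comparison can be checked numerically. MAE rates the two cases identically, while RMSE penalizes the single large error more:

```python
def rmse(errors):
    return (sum(e ** 2 for e in errors) / len(errors)) ** 0.5

def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)

one_big   = [10, 0]  # wrong by ten once
two_small = [5, 5]   # wrong by five twice

# MAE is 5.0 in both cases; RMSE is sqrt(50) ~ 7.07 for the single
# large error versus 5.0 for the two small ones.
```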

25. Summary

To summarize, we've covered the validation set approach, cross-validation, the confusion matrix, classification metrics, and regression metrics.

26. Let's practice!

Let's practice model evaluation!