Exercise

Calculating the confusion matrix

A confusion matrix (occasionally called a confusion table) is the basis of all performance metrics for models with a categorical response (such as a logistic regression). It contains the counts of each actual response-predicted response pair. In this case, where there are two possible responses (churn or not churn), there are four overall outcomes.

  1. The customer churned and the model predicted that.
  2. The customer churned but the model didn't predict that.
  3. The customer didn't churn but the model predicted they did.
  4. The customer didn't churn and the model predicted that.

churn and mdl_churn_vs_relationship are available.

Instructions

100 XP
  • Get the actual responses from the has_churned column of the dataset. Assign to actual_response.
  • Get the "most likely" predicted responses from the model. Assign to predicted_response.
  • Create a table of counts from the actual and predicted response vectors. Assign to outcomes.