Exercise

Calculating the confusion matrix

A confusion matrix (occasionally called a confusion table) is the basis of all performance metrics for models with a categorical response (such as logistic regression). It contains the counts of each actual response-predicted response pair. In this case, where there are two possible responses (churn or not churn), there are four possible outcomes.

  1. True positive: The customer churned and the model predicted they would.
  2. False positive: The customer didn't churn, but the model predicted they would.
  3. True negative: The customer didn't churn and the model predicted they wouldn't.
  4. False negative: The customer churned, but the model predicted they wouldn't.
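The four outcomes above can be read off directly from each (actual, predicted) pair. A minimal sketch, using illustrative pairs (1 = churned, 0 = didn't churn) rather than the exercise's real data:

```python
from collections import Counter

# Illustrative (actual, predicted) pairs; 1 = churned, 0 = didn't churn.
pairs = [(1, 1), (0, 1), (0, 0), (1, 0), (1, 1), (0, 0)]

def outcome(actual, predicted):
    """Map one actual/predicted pair to its confusion-matrix cell."""
    if actual == 1 and predicted == 1:
        return "true positive"
    if actual == 0 and predicted == 1:
        return "false positive"
    if actual == 0 and predicted == 0:
        return "true negative"
    return "false negative"  # actual == 1, predicted == 0

# Tallying the outcomes gives the four cells of the confusion matrix.
counts = Counter(outcome(a, p) for a, p in pairs)
print(counts)
```

For these six pairs, the tally is two true positives, two true negatives, one false positive, and one false negative.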

churn and mdl_churn_vs_relationship are available.

Instructions
100 XP
  • Get the actual responses from the has_churned column of the dataset. Assign to actual_response.
  • Get the "most likely" predicted responses from the model. Assign to predicted_response.
  • Create a DataFrame from actual_response and predicted_response. Assign to outcomes.
  • Print outcomes as a table of counts, representing the confusion matrix. This has been done for you.
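The steps above can be sketched end to end. In the exercise, `churn` and the fitted model `mdl_churn_vs_relationship` are already available; here they are rebuilt from small made-up data (the column `time_since_first_purchase` and all values are illustrative assumptions), using statsmodels' formula API:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Stand-in data; in the exercise, `churn` is provided.
churn = pd.DataFrame({
    "has_churned":               [1,   0,   0,   1,   0,    1,   0,   1],
    "time_since_first_purchase": [0.5, 2.5, 1.0, 2.0, 3.0, -1.0, 0.2, 1.5],
})

# Stand-in model; in the exercise, `mdl_churn_vs_relationship` is provided.
mdl_churn_vs_relationship = smf.logit(
    "has_churned ~ time_since_first_purchase", data=churn
).fit()

# Get the actual responses
actual_response = churn["has_churned"]

# Get the "most likely" predicted responses: round each fitted
# probability to the nearest integer (0 or 1)
predicted_response = np.round(mdl_churn_vs_relationship.predict())

# Create a DataFrame of the actual and predicted responses
outcomes = pd.DataFrame({
    "actual_response": actual_response,
    "predicted_response": predicted_response,
})

# Print the counts of each actual/predicted pair: the confusion matrix
print(outcomes.value_counts(sort=False))
```

Each row of the printed counts corresponds to one cell of the confusion matrix: (1, 1) is the true positives, (0, 1) the false positives, (0, 0) the true negatives, and (1, 0) the false negatives.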