Calculating the confusion matrix

A confusion matrix (occasionally called a confusion table) is the basis of most performance metrics for models with a categorical response, such as logistic regression. It contains the count of observations for each pair of actual and predicted responses. In this case, where there are two possible responses (churn or not churn), there are four possible outcomes.

  1. The customer churned, and the model predicted churn.
  2. The customer churned, but the model predicted no churn.
  3. The customer didn't churn, but the model predicted churn.
  4. The customer didn't churn, and the model predicted no churn.
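
To make the four outcomes concrete, here is a minimal sketch in base R. The two vectors are made-up toy values (not the course's churn data), and coding 1 as "churned" and 0 as "did not churn" is an assumption for illustration.

```r
# Hypothetical toy data: 1 = churned, 0 = did not churn
actual    <- c(1, 1, 0, 0, 1, 0, 0, 1)
predicted <- c(1, 0, 1, 0, 1, 0, 0, 1)

# Cross-tabulate predicted responses (rows) against actual responses (columns)
confusion <- table(predicted = predicted, actual = actual)
confusion

# Each cell corresponds to one of the four outcomes listed above
confusion["1", "1"]  # outcome 1: churned, predicted churn
confusion["0", "1"]  # outcome 2: churned, predicted no churn
confusion["1", "0"]  # outcome 3: didn't churn, predicted churn
confusion["0", "0"]  # outcome 4: didn't churn, predicted no churn

# One metric built from these counts: the proportion of correct predictions
accuracy <- sum(diag(confusion)) / sum(confusion)
```

The correct predictions sit on the diagonal of the table, which is why summing the diagonal and dividing by the total gives the accuracy.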

The churn dataset and the model mdl_churn_vs_relationship are available.

This exercise is part of the course

Introduction to Regression in R

Exercise instructions

  • Get the actual responses from the has_churned column of the dataset. Assign to actual_response.
  • Get the "most likely" predicted responses from the model. Assign to predicted_response.
  • Create a table of counts from the actual and predicted response vectors. Assign to outcomes.
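
The three steps above can be sketched on simulated data. This is only an illustration under stated assumptions: churn_sim, mdl_sim, and the relationship_length column are hypothetical stand-ins, not the course's churn dataset or mdl_churn_vs_relationship, and the "most likely" response is taken to be the fitted probability rounded at a 0.5 threshold.

```r
set.seed(1)

# Simulated stand-in for the churn data (hypothetical values and column name)
n <- 200
relationship_length <- rnorm(n)
has_churned <- rbinom(n, size = 1, prob = plogis(-0.5 * relationship_length))
churn_sim <- data.frame(has_churned, relationship_length)

# Logistic regression of churn status on the single explanatory variable
mdl_sim <- glm(has_churned ~ relationship_length,
               data = churn_sim, family = binomial)

# Step 1: actual responses come straight from the dataset
actual_response <- churn_sim$has_churned

# Step 2: fitted() returns predicted probabilities for a binomial glm;
# rounding gives the most likely response (assumed 0.5 threshold)
predicted_response <- round(fitted(mdl_sim))

# Step 3: count each predicted/actual combination
outcomes <- table(predicted_response, actual_response)
outcomes
```

Rounding is a simple way to turn probabilities into classes. A different threshold would need an explicit comparison, such as as.numeric(fitted(mdl_sim) > 0.3).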

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Get the actual responses from the dataset
actual_response <- ___

# Get the "most likely" responses from the model
predicted_response <- ___

# Create a table of counts
outcomes <- ___

# See the result
outcomes