Finding the optimal threshold
Imagine you are running a campaign aimed at preventing customers from defaulting. You can plan the campaign using your model's predictions, and the choice of threshold is essential for the results. If you know the costs and rewards of the campaign, you can check empirically which threshold is most reasonable. In this exercise, we face the following scenario:
If a customer does not default because of our campaign, i.e. if we correctly predicted the default (true positive), we are rewarded with 1000€. If, however, we target a customer who would not have defaulted anyway, i.e. if we falsely predicted that the customer would default (false positive), we incur costs of 250€.
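The scenario can be written as a small payoff function. This is only a sketch: the name campaign_payoff and the counts in the example call are made up for illustration, and it assumes that customers we do not target (true negatives and false negatives) contribute nothing to the payoff, since the scenario states no reward or cost for them.

```r
# Payoff of the campaign given confusion-matrix counts.
# reward and cost are taken from the scenario; the function name is hypothetical.
campaign_payoff <- function(tp, fp, reward = 1000, cost = 250) {
  tp * reward - fp * cost
}

# Hypothetical counts: 120 true positives and 300 false positives
campaign_payoff(tp = 120, fp = 300)
# 120 * 1000 - 300 * 250 = 45000
```

With these made-up counts the campaign is still profitable, because a single true positive pays for four false positives.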
From the last exercise we know that the restricted model was the best one, so calculate the optimal threshold only for that model. The predictions are stored in the column predNew of the defaultData dataframe. Use the SDMTools package.
This exercise is part of the course Machine Learning for Marketing Analytics in R.
Have a go at this exercise by completing this sample code.
library(SDMTools)
# Confusion matrix with threshold 0.5
confMat <- confusion.matrix(defaultData$PaymentDefault,
                            defaultData$predNew,
                            threshold = ___)
confMat
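
Once a confusion matrix is available for one threshold, the same idea can be repeated over a grid of thresholds to compare payoffs. The following is a minimal sketch in base R, so it runs without SDMTools. It uses simulated data as a stand-in for defaultData; the vectors obs and pred, the threshold grid, and the rule that a customer is targeted when the prediction is at or above the threshold are all assumptions made for illustration, not part of the exercise.

```r
# Simulated stand-in for defaultData (hypothetical values, not the course data)
set.seed(1)
n    <- 1000
obs  <- rbinom(n, size = 1, prob = 0.2)            # 1 = customer defaults
pred <- plogis(rnorm(n, mean = -1.5 + 1.5 * obs))  # predicted default probabilities

# Candidate thresholds to compare (assumed grid)
payoffs <- data.frame(threshold = seq(0.1, 0.5, by = 0.1), payoff = NA_real_)

for (i in seq_len(nrow(payoffs))) {
  flagged <- pred >= payoffs$threshold[i]  # customers targeted by the campaign
  tp <- sum(flagged & obs == 1)            # targeted and would have defaulted
  fp <- sum(flagged & obs == 0)            # targeted but would not have defaulted
  payoffs$payoff[i] <- tp * 1000 - fp * 250
}

payoffs

# Threshold with the highest payoff on this grid
best <- payoffs$threshold[which.max(payoffs$payoff)]
best
```

With the real data, obs and pred would be replaced by defaultData$PaymentDefault and defaultData$predNew, and the counts could instead be read from the matrix returned by confusion.matrix.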