Specifying a cut-off

We have shown you how much the choice of cut-off affects the quality of the resulting confusion matrix. Now you will learn how to transform the vector of predicted probabilities into a vector of binary values indicating the predicted status of each loan. The ifelse() function in R can help you here.

Applied in the context of a cut-off, the ifelse() function would look something like this:

ifelse(predictions > 0.3, 1, 0)

The first argument tests whether each value in the predictions vector is greater than 0.3. Where the test is TRUE, R returns 1 (specified in the second argument); where it is FALSE, R returns 0 (specified in the third argument). These values represent "default" and "no default", respectively.
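To make the element-wise behaviour concrete, here is a minimal sketch on a small vector. The probabilities and the name pred_cutoff_30 are made up for illustration and do not come from the course data.

```r
# Hypothetical predicted default probabilities (illustrative values only)
predictions <- c(0.05, 0.31, 0.30, 0.72, 0.12)

# 1 = predicted default, 0 = predicted no default, using a cut-off of 0.3
pred_cutoff_30 <- ifelse(predictions > 0.3, 1, 0)

pred_cutoff_30
# [1] 0 1 0 1 0
```

Note that the comparison is strict: a value exactly equal to the cut-off (0.30 here) is classified as 0.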

This exercise is part of the course “Credit Risk Modeling in R”.

Exercise instructions

  • The code for the full logistic regression model along with the predictions-vector is given in your console.
  • Using a cut-off of 0.15, create the vector pred_cutoff_15 using the ifelse() function and predictions_all_full.
  • Look at the confusion matrix using table(), passing the true values (test_set$loan_status) as the first argument.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# The code for the logistic regression model and the predictions is given below
log_model_full <- glm(loan_status ~ ., family = "binomial", data = training_set)
predictions_all_full <- predict(log_model_full, newdata = test_set, type = "response")

# Make a binary predictions-vector using a cut-off of 15%


# Construct a confusion matrix
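
One possible way to complete the two remaining steps is sketched below. Because training_set and test_set are only available in the course console, this sketch substitutes simulated data: the columns income and age, the coefficients, and the 140/60 split are assumptions made purely so the example runs on its own. The name conf_matrix_15 is also introduced here for illustration.

```r
# Simulated stand-in for the course data (assumption: NOT the real loan data)
set.seed(1)
n <- 200
sim <- data.frame(income = rnorm(n), age = rnorm(n))
sim$loan_status <- rbinom(n, 1, plogis(-2 + 0.8 * sim$income))
training_set <- sim[1:140, ]
test_set <- sim[141:200, ]

# Model and predictions, as given in the exercise
log_model_full <- glm(loan_status ~ ., family = "binomial", data = training_set)
predictions_all_full <- predict(log_model_full, newdata = test_set, type = "response")

# Make a binary predictions-vector using a cut-off of 15%
pred_cutoff_15 <- ifelse(predictions_all_full > 0.15, 1, 0)

# Construct a confusion matrix (true values first, predictions second)
conf_matrix_15 <- table(test_set$loan_status, pred_cutoff_15)
conf_matrix_15
```

With the true values as the first argument, the rows of the table correspond to the actual loan status and the columns to the predicted status.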

Skill level: Intermediate

Apply statistical modeling in a real-life setting using logistic regression and decision trees to model credit risk.

Logistic regression is still a widely used method in credit risk modeling. In this chapter, you will learn how to apply logistic regression models on credit data in R.

  • Exercise 1: Logistic regression: introduction
  • Exercise 2: Basic logistic regression
  • Exercise 3: Interpreting the odds for a categorical variable
  • Exercise 4: Multiple variables in a logistic regression model
  • Exercise 5: Interpreting significance levels
  • Exercise 6: Logistic regression: predicting the probability of default
  • Exercise 7: Predicting the probability of default
  • Exercise 8: Making more discriminative models
  • Exercise 9: Evaluating the logistic regression model result
  • Exercise 10: Specifying a cut-off
  • Exercise 11: Comparing two cut-offs
  • Exercise 12: Wrap-up and remarks
  • Exercise 13: Comparing link functions for a given cut-off