Pruning the tree with the loss matrix

In this exercise, you will prune the tree that was built using a loss matrix, which penalizes a misclassified default ten times more heavily than a misclassified non-default.
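
To see why this matrix penalizes misclassified defaults, it helps to print it with labelled rows and columns. The sketch below does so; it assumes, as in the course data, that loan_status is a factor with levels 0 (non-default) and 1 (default), and that rpart reads the loss matrix with observed classes in the rows and predicted classes in the columns.

# A minimal sketch of the loss matrix passed to rpart() below
loss_matrix <- matrix(c(0, 10, 1, 0), ncol = 2)
rownames(loss_matrix) <- c("observed 0", "observed 1")
colnames(loss_matrix) <- c("predicted 0", "predicted 1")
loss_matrix
#            predicted 0 predicted 1
# observed 0           0           1
# observed 1          10           0
# Predicting a true default (1) as a non-default (0) costs 10,
# ten times the cost of the opposite error.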

This exercise is part of the course Credit Risk Modeling in R.

Exercise instructions

  • Run the code to set a seed and construct tree_loss_matrix again.
  • Use the function plotcp() to examine the cross-validated error structure (a sketch of reading the same numbers from the cp table follows this list).
  • Looking at the cp plot, you will notice that pruning the tree at the minimum cross-validated error would leave the tree as big as the unpruned tree, because the cross-validated error reaches its minimum at cp = 0.001, the value used to grow the tree. Because you would like to make the tree somewhat smaller, prune it using cp = 0.0012788 instead. For this complexity parameter, the cross-validated error approaches the observed minimum. Call the pruned tree ptree_loss_matrix.
  • The package rpart.plot is loaded in your workspace. Plot the pruned tree using the function prp(), including the argument extra = 1.
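
The numbers behind the cp plot are stored in the tree's cptable, so you can also inspect them directly. A minimal sketch, assuming tree_loss_matrix has been fitted as in the sample code below:

# One row per pruning step: the complexity parameter (CP), the number of
# splits, and the cross-validated error (xerror) with its standard error (xstd)
printcp(tree_loss_matrix)

# The cp value with the lowest cross-validated error; here it is 0.001,
# the value used to grow the tree, which is why the exercise picks a
# slightly larger cp by hand
cp_min <- tree_loss_matrix$cptable[which.min(tree_loss_matrix$cptable[, "xerror"]), "CP"]
cp_min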

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Set a seed and run the code to construct the tree with the loss matrix again
set.seed(345)
tree_loss_matrix <- rpart(loan_status ~ ., method = "class", data = training_set,
                          parms = list(loss = matrix(c(0, 10, 1, 0), ncol = 2)),
                          control = rpart.control(cp = 0.001))

# Plot the cross-validated error rate as a function of the complexity parameter


# Prune the tree using cp = 0.0012788


# Use prp() and argument extra = 1 to plot the pruned tree
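
For reference, a possible completion of the scaffold (a sketch: it assumes the fitted tree_loss_matrix from above, with the packages rpart and rpart.plot loaded):

# Plot the cross-validated error rate as a function of the complexity parameter
plotcp(tree_loss_matrix)

# Prune the tree using cp = 0.0012788
ptree_loss_matrix <- prune(tree_loss_matrix, cp = 0.0012788)

# Use prp() and argument extra = 1 to plot the pruned tree
prp(ptree_loss_matrix, extra = 1)

With extra = 1, prp() prints the number of observations of each class in every node, which makes it easy to check where the defaults end up after pruning.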