Preventing overgrown trees
The tree grown on the full set of applicant data became extremely large and complex, with hundreds of splits and leaf nodes containing only a handful of applicants. A tree like this would be almost impossible for a loan officer to interpret.
Using the pre-pruning methods for early stopping, you can prevent a tree from growing too large and complex. See how the rpart control options for maximum tree depth and minimum split count impact the resulting tree.
rpart has been pre-loaded.
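As a rough illustration of how these two controls limit tree size, the sketch below fits three trees to synthetic data and counts their leaf nodes. The data frame applicants, its column names, and the helper count_leaves are hypothetical stand-ins, not part of the exercise. The maxdepth and minsplit arguments of rpart.control() are the options for maximum tree depth and minimum split count; setting cp = 0 is an assumption made here so that only those limits stop the tree from growing.

```r
library(rpart)

# Synthetic stand-in for the applicant data (all column names are hypothetical)
set.seed(42)
n <- 2000
applicants <- data.frame(
  credit_score = rnorm(n, mean = 650, sd = 80),
  loan_amount  = runif(n, min = 1000, max = 40000)
)
p_default <- plogis((650 - applicants$credit_score) / 40 +
                      applicants$loan_amount / 20000 - 1)
applicants$outcome <- factor(ifelse(runif(n) < p_default, "default", "repaid"))

# Helper (hypothetical): count the leaf nodes of a fitted rpart tree
count_leaves <- function(model) sum(model$frame$var == "<leaf>")

# cp = 0 turns off complexity-based stopping, so the tree grows freely
full_tree <- rpart(outcome ~ ., data = applicants, method = "class",
                   control = rpart.control(cp = 0))

# Pre-pruning by depth: no node may sit more than 6 levels below the root
shallow_tree <- rpart(outcome ~ ., data = applicants, method = "class",
                      control = rpart.control(cp = 0, maxdepth = 6))

# Pre-pruning by node size: a node needs at least 500 observations to be split
coarse_tree <- rpart(outcome ~ ., data = applicants, method = "class",
                     control = rpart.control(cp = 0, minsplit = 500))

leaf_counts <- sapply(
  list(full = full_tree, maxdepth_6 = shallow_tree, minsplit_500 = coarse_tree),
  count_leaves
)
leaf_counts
```

Comparing the three counts shows how much each limit shrinks the tree relative to the unrestricted fit. The exact numbers depend on the data, so none are quoted here.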
This exercise is part of the course Supervised Learning in R: Classification.
Have a go at this exercise by completing this sample code.
# Grow a tree with maxdepth of 6
loan_model <- ___
# Make a class prediction on the test set
loans_test$pred <- ___
# Compute the accuracy of the simpler tree
mean(___)
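
One possible way to fill in the blanks is sketched below. It is not the official solution. The exercise's data are not shown on this page, so the sketch builds synthetic stand-ins: the training set name loans_train, the outcome column name outcome, the predictor columns, and the 75/25 split are all assumptions. Using cp = 0 is also an assumption, made so that maxdepth is the limit that actually stops growth.

```r
library(rpart)

# Hypothetical stand-in data: the exercise's real data frames are not shown
set.seed(1)
n <- 1500
loans <- data.frame(
  credit_score = rnorm(n, mean = 650, sd = 80),
  loan_amount  = runif(n, min = 1000, max = 40000)
)
p_default <- plogis((650 - loans$credit_score) / 40 +
                      loans$loan_amount / 20000 - 1)
loans$outcome <- factor(ifelse(runif(n) < p_default, "default", "repaid"))

# Assumed 75/25 train/test split
train_rows  <- sample(n, size = 0.75 * n)
loans_train <- loans[train_rows, ]
loans_test  <- loans[-train_rows, ]

# Grow a tree with maxdepth of 6
loan_model <- rpart(outcome ~ ., data = loans_train, method = "class",
                    control = rpart.control(cp = 0, maxdepth = 6))

# Make a class prediction on the test set
loans_test$pred <- predict(loan_model, loans_test, type = "class")

# Compute the accuracy of the simpler tree:
# the share of test rows where the predicted class equals the actual class
accuracy <- mean(loans_test$pred == loans_test$outcome)
accuracy
```

To try the minimum split count instead, replace maxdepth = 6 with a minsplit value in rpart.control() and rerun the prediction and accuracy steps.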