Building a simple decision tree
The loans dataset contains records for 11,312 randomly selected people who applied for and later received loans from Lending Club, a US-based peer-to-peer lending company.
You will use a decision tree to try to learn patterns in the outcome of these loans (either repaid or default) based on the requested loan amount and credit score at the time of application.
Then, see how the tree's predictions differ for an applicant with good credit versus one with bad credit.
The dataset loans has been loaded for you.
This exercise is part of the course
Supervised Learning in R: Classification
Exercise instructions
- Load the rpart package.
- Fit a decision tree model with the function rpart().
  - Supply the R formula that specifies outcome as a function of loan_amount and credit_score as the first argument.
  - Leave the control argument alone for now. (You'll learn more about that later!)
- Use predict() with the resulting loan model to predict the outcome for the good_credit applicant. Use the type argument to predict the "class" of the outcome.
- Do the same for the bad_credit applicant.
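The steps above can be sketched end to end on made-up data. This is only an illustration: toy_loans, toy_model, toy_good and toy_bad are hypothetical names, and the LOW/AVERAGE/HIGH and LOW/MEDIUM/HIGH coding and the rule that generates outcome are assumptions, not properties of the real loans dataset. Only rpart(), rpart.control() and predict() come from the exercise itself.

```r
library(rpart)

# Hypothetical stand-in for the loans data. The column coding is an
# assumption made for illustration, not the real dataset.
score_levels  <- c("LOW", "AVERAGE", "HIGH")
amount_levels <- c("LOW", "MEDIUM", "HIGH")

grid <- expand.grid(
  credit_score = score_levels,
  loan_amount = amount_levels,
  stringsAsFactors = FALSE
)
# Repeat each of the 9 combinations 20 times so nodes are large enough to split.
toy_loans <- grid[rep(seq_len(nrow(grid)), each = 20), ]

# Made-up rule: low credit defaults, and average credit defaults on large loans.
is_default <- toy_loans$credit_score == "LOW" |
  (toy_loans$credit_score == "AVERAGE" & toy_loans$loan_amount == "HIGH")
toy_loans$outcome <- factor(ifelse(is_default, "default", "repaid"),
                            levels = c("default", "repaid"))
toy_loans$credit_score <- factor(toy_loans$credit_score, levels = score_levels)
toy_loans$loan_amount  <- factor(toy_loans$loan_amount, levels = amount_levels)

# Classification tree: outcome as a function of the two predictors.
toy_model <- rpart(outcome ~ loan_amount + credit_score,
                   data = toy_loans, method = "class",
                   control = rpart.control(cp = 0))

# One-row data frames describing two hypothetical applicants.
toy_good <- data.frame(
  loan_amount = factor("LOW", levels = amount_levels),
  credit_score = factor("HIGH", levels = score_levels)
)
toy_bad <- data.frame(
  loan_amount = factor("LOW", levels = amount_levels),
  credit_score = factor("LOW", levels = score_levels)
)

# type = "class" returns the predicted outcome label for each applicant.
pred_good <- predict(toy_model, toy_good, type = "class")
pred_bad  <- predict(toy_model, toy_bad, type = "class")
print(pred_good)
print(pred_bad)
```

Because the toy outcome follows a fixed rule, the tree can separate the two applicants cleanly; real loan data is noisier, so the same calls will not necessarily give such clear-cut predictions.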
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load the rpart package
___

# Build a lending model predicting loan outcome versus loan amount and credit score
loan_model <- rpart(___, data = ___, method = "___", control = rpart.control(cp = 0))

# Make a prediction for someone with good credit
predict(___, ___, type = "___")

# Make a prediction for someone with bad credit
___
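To see what the type argument of predict() changes, the sketch below compares type = "class" with type = "prob" on a small made-up dataset. Everything here is hypothetical: prob_toy, prob_model and applicant are illustrative names, and the repaid/default counts are invented so that the leaves are not pure. It assumes the default rpart control settings rather than the cp = 0 used in the exercise.

```r
library(rpart)

# Hypothetical toy data (not the real loans dataset): 100 applicants per
# credit level, with invented repaid/default counts.
prob_toy <- data.frame(
  credit_score = factor(rep(c("HIGH", "LOW"), each = 100)),
  loan_amount = factor(rep(c("LOW", "HIGH"), times = 100)),
  outcome = factor(c(rep(c("repaid", "default"), times = c(90, 10)),
                     rep(c("repaid", "default"), times = c(30, 70))))
)

# Classification tree with the default control settings.
prob_model <- rpart(outcome ~ loan_amount + credit_score,
                    data = prob_toy, method = "class")

# A hypothetical applicant with high credit asking for a small loan.
applicant <- data.frame(
  loan_amount = factor("LOW", levels = levels(prob_toy$loan_amount)),
  credit_score = factor("HIGH", levels = levels(prob_toy$credit_score))
)

# type = "class" gives a single predicted label per row.
class_pred <- predict(prob_model, applicant, type = "class")

# type = "prob" gives one column of class proportions per outcome level.
prob_pred <- predict(prob_model, applicant, type = "prob")

print(class_pred)
print(prob_pred)
```

The predicted class is simply the outcome level with the largest proportion in the leaf the applicant falls into, so the two calls describe the same leaf at different levels of detail.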