Building a stepwise regression model
In the absence of subject-matter expertise, stepwise regression can assist with the search for the most important predictors of the outcome of interest.
In this exercise, you will use a forward stepwise approach to add predictors to the model one-by-one until no additional benefit is seen. The donors dataset has been loaded for you.
This exercise is part of the course
Supervised Learning in R: Classification
Exercise instructions
- Use the R formula interface with glm() to specify the base model with no predictors. Set the explanatory variable equal to 1.
- Use the R formula interface again with glm() to specify the model with all predictors.
- Apply step() to these models to perform forward stepwise regression. Set the first argument to null_model and set direction = "forward". This might take a while (up to 10 or 15 seconds) as your computer has to fit quite a few different models to perform stepwise selection.
- Create a vector of predicted probabilities using the predict() function.
- Plot the ROC curve with roc() and plot(), and compute the AUC of the stepwise model with auc().
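The first three steps can be sketched on simulated data. This is only an illustration under stated assumptions: the data frame sim and its columns x1, x2, noise, and y are hypothetical stand-ins, not the donors dataset, and trace = 0 is added only to silence the step-by-step log. By default, step() compares candidate models by AIC.

```r
# Simulated stand-in data (hypothetical columns; NOT the donors dataset)
set.seed(42)
n <- 500
sim <- data.frame(
  x1 = rnorm(n),     # strongly related to the outcome
  x2 = rnorm(n),     # moderately related to the outcome
  noise = rnorm(n)   # unrelated to the outcome
)
sim$y <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * sim$x1 + 0.8 * sim$x2))

# Null model: intercept only, written as "outcome ~ 1"
null_model <- glm(y ~ 1, data = sim, family = "binomial")

# Full model: "outcome ~ ." uses every other column as a predictor
full_model <- glm(y ~ ., data = sim, family = "binomial")

# Forward selection: start at the null model and add one term at a time,
# stopping when no addition lowers the AIC
step_model <- step(null_model,
                   scope = list(lower = null_model, upper = full_model),
                   direction = "forward",
                   trace = 0)

# Inspect which predictors were kept
formula(step_model)
```

Because the search starts from the null model, the lower bound of the scope matters little here; the upper bound is what limits which predictors step() may consider.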
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Specify a null model with no predictors
null_model <- ___(___, data = ___, family = "___")
# Specify the full model using all of the potential predictors
full_model <- ___
# Use a forward stepwise algorithm to build a parsimonious model
step_model <- step(___, scope = list(lower = null_model, upper = full_model), direction = "___")
# Estimate the stepwise donation probability
step_prob <- ___
# Plot the ROC of the stepwise model
library(pROC)
ROC <- ___
plot(___, col = "red")
auc(___)
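To make the last two steps more concrete, the sketch below shows what the predicted probabilities and the AUC represent, again on simulated data. The names sim, x, y, fit, and auc_manual are hypothetical. The AUC is computed by hand with the rank-based formula in base R, which is not how the exercise asks you to do it; the pROC comparison runs only if that package is installed.

```r
# Simulated stand-in data (hypothetical; NOT the donors dataset)
set.seed(1)
n <- 400
sim <- data.frame(x = rnorm(n))
sim$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * sim$x))

fit <- glm(y ~ x, data = sim, family = "binomial")

# type = "response" returns probabilities between 0 and 1;
# without it, predict() returns values on the log-odds scale
prob <- predict(fit, type = "response")

# Rank-based AUC: the chance that a randomly chosen positive case
# receives a higher predicted probability than a randomly chosen negative case
n_pos <- sum(sim$y == 1)
n_neg <- sum(sim$y == 0)
auc_manual <- (sum(rank(prob)[sim$y == 1]) - n_pos * (n_pos + 1) / 2) /
  (n_pos * n_neg)
auc_manual

# Optional cross-check against pROC, if it is available
if (requireNamespace("pROC", quietly = TRUE)) {
  roc_obj <- pROC::roc(sim$y, prob, quiet = TRUE)
  print(pROC::auc(roc_obj))
}
```

An AUC of 0.5 corresponds to random guessing and 1 to perfect separation, so a stepwise model is only useful to the extent that its AUC sits clearly above 0.5.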