Predict sparrow survival
In this exercise, you will predict the probability of survival using the sparrow survival model from the previous exercise.
Recall that when calling predict() to get the predicted probabilities from a glm() model, you must specify that you want the response:
predict(model, type = "response")
Otherwise, predict() on a logistic regression model returns the predicted log-odds of the event, not the probability.
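The difference between the two scales can be checked on a small example. The sketch below uses made-up data and hypothetical names (toy, toy_model), not the sparrow data set, and assumes only base R:

```r
# Synthetic data for illustration only (hypothetical, not the sparrow data)
set.seed(42)
toy <- data.frame(x = rnorm(200))
toy$y <- rbinom(200, size = 1, prob = plogis(0.5 + 1.5 * toy$x))

# Fit a logistic regression model
toy_model <- glm(y ~ x, data = toy, family = binomial)

# Default (type = "link"): predictions on the log-odds scale
log_odds <- predict(toy_model)

# type = "response": predictions on the probability scale
probs <- predict(toy_model, type = "response")

# The inverse logit, plogis(), maps log-odds to probabilities,
# so the two sets of predictions should agree after conversion
print(all.equal(plogis(log_odds), probs))
print(range(log_odds))
print(range(probs))
```

The log-odds can take any real value, while the probabilities always fall between 0 and 1.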
You will also use the GainCurvePlot() function to plot the gain curve from the model predictions. If the model's gain curve is close to the ideal ("wizard") gain curve, then the model sorted the sparrows well: that is, the model assigned higher predicted probabilities of survival to the sparrows that actually survived. The inputs to the GainCurvePlot() function are:
- frame: data frame with prediction column and ground truth column
- xvar: the name of the column of predictions (as a string)
- truthVar: the name of the column with actual outcome (as a string)
- title: a title for the plot (as a string)
GainCurvePlot(frame, xvar, truthVar, title)
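To make the idea of a gain curve concrete, the sketch below computes its coordinates by hand in base R: sort the rows by prediction, then track what fraction of the actual positive outcomes has been captured so far. The data frame toy and its values are made up for illustration, and the hand computation is only an approximation of what GainCurvePlot() draws. The plotting call at the end runs only if the WVPlots package, which provides GainCurvePlot(), is installed.

```r
# Hypothetical toy frame: predicted probabilities and actual outcomes (0/1)
toy <- data.frame(
  pred  = c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1),
  truth = c(1,   1,   0,   1,   0,   0,   1,   0)
)

# Sort rows from highest to lowest predicted probability
sorted <- toy[order(toy$pred, decreasing = TRUE), ]

# x-axis: fraction of rows examined so far
frac_items <- seq_len(nrow(sorted)) / nrow(sorted)

# y-axis: fraction of all positive outcomes captured so far
frac_captured <- cumsum(sorted$truth) / sum(sorted$truth)

gain <- data.frame(frac_items, frac_captured)
print(gain)

# If WVPlots is available, draw the gain curve for the same frame
if (requireNamespace("WVPlots", quietly = TRUE)) {
  print(WVPlots::GainCurvePlot(toy, "pred", "truth", "toy gain curve"))
}
```

A model that sorts well captures most of the positive outcomes early, so its curve rises quickly; a perfect ("wizard") ordering would place every positive outcome ahead of every negative one.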
The sparrow data frame and the model sparrow_model have been pre-loaded.
This exercise is part of the course Supervised Learning in R: Regression.
Exercise instructions
- Create a new column in sparrow called pred that contains the predictions on the training data.
- Call GainCurvePlot() to create the gain curve of predictions. Does the model do a good job of sorting the sparrows by whether or not they actually survived?
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# sparrow is available
summary(sparrow)
# sparrow_model is available
summary(sparrow_model)
# Make predictions
sparrow$pred <- ___
# Look at gain curve
___(___, ___, ___, "sparrow survival model")