Predict sparrow survival

In this exercise, you will predict the probability of survival using the sparrow survival model from the previous exercise.

Recall that when calling predict() to get predicted probabilities from a glm() model, you must specify that you want the response:

predict(model, type = "response")

Otherwise, predict() on a logistic regression model returns the predicted log-odds of the event, not the probability.
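
As a minimal sketch of the difference between the two scales, the following uses a small synthetic data set (the names toy, x, y, and toy_model are made up for illustration and are not part of the exercise). It fits a logistic regression, predicts on both scales, and checks that applying the inverse logit, plogis(), to the log-odds should reproduce the probabilities.

```r
# Synthetic data for illustration only (not the sparrow data)
set.seed(42)
toy <- data.frame(x = rnorm(100))
toy$y <- rbinom(100, size = 1, prob = plogis(0.5 + 1.5 * toy$x))

# Fit a logistic regression model
toy_model <- glm(y ~ x, data = toy, family = binomial)

# Default (type = "link"): predictions on the log-odds scale
log_odds <- predict(toy_model)

# type = "response": predicted probabilities
probs <- predict(toy_model, type = "response")

# The inverse logit maps log-odds to probabilities
all.equal(plogis(log_odds), probs)

# Log-odds can be any real number; probabilities lie between 0 and 1
range(log_odds)
range(probs)
```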

You will also use the GainCurvePlot() function (from the WVPlots package) to plot the gain curve from the model predictions. If the model's gain curve is close to the ideal ("wizard") gain curve, then the model sorted the sparrows well: that is, it assigned higher predicted probabilities of survival to the sparrows that actually survived. The inputs to the GainCurvePlot() function are:

  • frame: data frame with prediction column and ground truth column
  • xvar: the name of the column of predictions (as a string)
  • truthVar: the name of the column with actual outcome (as a string)
  • title: a title for the plot (as a string)

GainCurvePlot(frame, xvar, truthVar, title)
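
To make the idea behind the gain curve concrete, here is a rough base-R sketch on synthetic data (toy, x, survived, and toy_model are hypothetical names, not the exercise's objects). It is not how GainCurvePlot() is implemented; it only illustrates the sorting idea: order the rows from highest to lowest prediction, then track what fraction of the actual positives has been captured so far.

```r
# Synthetic data for illustration only (not the sparrow data)
set.seed(42)
toy <- data.frame(x = rnorm(200))
toy$survived <- rbinom(200, size = 1, prob = plogis(1.5 * toy$x))

# Fit a logistic regression and predict probabilities
toy_model <- glm(survived ~ x, data = toy, family = binomial)
toy$pred <- predict(toy_model, type = "response")

# Sort rows from highest to lowest predicted probability
ord <- order(toy$pred, decreasing = TRUE)

# Fraction of rows examined vs. fraction of all actual positives captured
frac_rows <- seq_along(ord) / nrow(toy)
frac_captured <- cumsum(toy$survived[ord]) / sum(toy$survived)

# A curve that rises quickly above the diagonal means the model sorts well
plot(frac_rows, frac_captured, type = "l",
     xlab = "fraction of rows (sorted by prediction)",
     ylab = "fraction of actual positives captured")
abline(0, 1, lty = 2)  # reference line: roughly what a random ordering gives
```

If the WVPlots package is installed, calling GainCurvePlot(toy, "pred", "survived", "toy model") on the same frame should draw the packaged version of this plot, including the ideal curve for comparison.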

The sparrow data frame and the model sparrow_model have been pre-loaded.

This exercise is part of the course

Supervised Learning in R: Regression

Exercise instructions

  • Create a new column in sparrow called pred that contains the predictions on the training data.
  • Call GainCurvePlot() to create the gain curve of predictions. Does the model do a good job of sorting the sparrows by whether or not they actually survived?

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# sparrow is available
summary(sparrow)

# sparrow_model is available
summary(sparrow_model)

# Make predictions
sparrow$pred <- ___

# Look at gain curve
___(___, ___, ___, "sparrow survival model")