Exercise

Fit a model of sparrow survival probability

In this exercise, you will estimate the probability that a sparrow survives a severe winter storm, based on physical characteristics of the sparrow. The dataset sparrow is loaded into your workspace. The outcome to be predicted is status ("Survived", "Perished"). The variables we will consider are:

  • total_length: length of the bird from tip of beak to tip of tail (mm)
  • weight: weight of the bird (grams)
  • humerus: length of the humerus, the "upper arm bone" that connects the wing to the body (inches)

Remember that when using glm() to create a logistic regression model, you must explicitly specify that family = binomial:

glm(formula, data = data, family = binomial)
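As a minimal sketch of that call, the block below fits a logistic regression to a small synthetic data frame. The names toy_data, x, y, toy_model, and toy_probs are hypothetical and the values are simulated; none of this is the sparrow data.

```r
# Synthetic data for illustration only (not the sparrow dataset).
set.seed(42)
toy_data <- data.frame(x = rnorm(100))
# Logical outcome: TRUE is treated as the "success" class by glm()
toy_data$y <- rbinom(100, size = 1, prob = plogis(0.5 + 1.2 * toy_data$x)) == 1

# family = binomial requests logistic regression
toy_model <- glm(y ~ x, data = toy_data, family = binomial)

# type = "response" returns probabilities rather than log-odds
toy_probs <- predict(toy_model, newdata = toy_data, type = "response")
range(toy_probs)
```

Without type = "response", predict() on a glm object returns values on the link (log-odds) scale, which are not probabilities.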

You will call summary() and broom::glance() to see two different ways of examining a logistic regression model. One of the diagnostics that you will look at is the analog to \(R^2\), called pseudo-\(R^2\).

$$ \text{pseudo-}R^2 = 1 - \frac{\text{deviance}}{\text{null.deviance}} $$

You can think of deviance as analogous to variance: it is a measure of the variation in categorical data. The pseudo-\(R^2\) is analogous to \(R^2\) for standard regression: \(R^2\) is a measure of the "variance explained" of a regression model. The pseudo-\(R^2\) is a measure of the "deviance explained".
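The formula can be checked directly on a fitted glm object, which stores both quantities as the components deviance and null.deviance. The sketch below uses simulated data (the names toy, m, m0, pseudo_r2, and null_r2 are hypothetical) and also fits an intercept-only model, whose deviance equals its null deviance, so its pseudo-\(R^2\) should come out as zero up to floating-point error.

```r
# Simulated data for illustration only.
set.seed(7)
toy <- data.frame(x = rnorm(200))
toy$y <- rbinom(200, size = 1, prob = plogis(1.5 * toy$x)) == 1

# Model with one predictor
m <- glm(y ~ x, data = toy, family = binomial)
pseudo_r2 <- 1 - m$deviance / m$null.deviance
pseudo_r2

# Intercept-only model: explains none of the deviance
m0 <- glm(y ~ 1, data = toy, family = binomial)
null_r2 <- 1 - m0$deviance / m0$null.deviance
null_r2
```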

Instructions

The data frame sparrow and the package broom are loaded in the workspace.

  • As suggested in the video, you will predict on the outcomes TRUE and FALSE. Create a new column survived in the sparrow data frame that is TRUE when status == "Survived".
  • Create the formula fmla that expresses survived as a function of the variables of interest. Print it.
  • Fit a logistic regression model to predict the probability of sparrow survival. Assign the model to the variable sparrow_model.
  • Call summary() to see the coefficients of the model, the deviance and the null deviance.
  • Call glance() on the model to see the deviances and other diagnostics in a data frame. Assign the output from glance() to the variable perf.
  • Calculate the pseudo-\(R^2\).
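
The steps above can be sketched end to end as follows. This is one possible shape for the workflow, not a reference solution: it runs on sparrow_toy, a hypothetical stand-in data frame with simulated values, so the numbers it produces say nothing about real sparrows. The name pseudoR2 is also an assumption. If broom is not installed, the sketch falls back to reading the same two deviances from the model object.

```r
# Hypothetical stand-in for the sparrow data frame; all values are simulated.
set.seed(1)
n <- 80
sparrow_toy <- data.frame(
  total_length = rnorm(n, mean = 160, sd = 4),
  weight       = rnorm(n, mean = 25, sd = 1.5),
  humerus      = rnorm(n, mean = 0.73, sd = 0.02)
)
p_survive <- plogis(-0.5 * (sparrow_toy$total_length - 160) +
                      0.3 * (sparrow_toy$weight - 25))
sparrow_toy$status <- ifelse(rbinom(n, 1, p_survive) == 1, "Survived", "Perished")

# Logical outcome column: TRUE when the bird survived
sparrow_toy$survived <- sparrow_toy$status == "Survived"

# Formula for the model; print it to check it
fmla <- survived ~ total_length + weight + humerus
print(fmla)

# Logistic regression
sparrow_model <- glm(fmla, data = sparrow_toy, family = binomial)

# Coefficients, null deviance, and residual deviance
summary(sparrow_model)

# One-row data frame of diagnostics (fallback if broom is unavailable)
if (requireNamespace("broom", quietly = TRUE)) {
  perf <- broom::glance(sparrow_model)
} else {
  perf <- data.frame(null.deviance = sparrow_model$null.deviance,
                     deviance      = sparrow_model$deviance)
}

# Pseudo-R^2: fraction of the null deviance explained by the model
pseudoR2 <- 1 - perf$deviance / perf$null.deviance
pseudoR2
```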