Fit a model of sparrow survival probability
In this exercise, you will estimate the probability that a sparrow survives a severe winter storm, based on physical characteristics of the sparrow. The dataset sparrow has been pre-loaded. The outcome to be predicted is status ("Survived", "Perished"). The variables we will consider are:
- total_length: length of the bird from tip of beak to tip of tail (mm)
- weight: weight of the bird (g)
- humerus: length of humerus ("upper arm bone" that connects the wing to the body) (inches)
Remember that when using glm() to create a logistic regression model, you must explicitly specify that family = binomial:
glm(formula, data = data, family = binomial)
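As a minimal sketch of that call, the example below fits a logistic regression to R's built-in mtcars data rather than to sparrow. The names car_model and car_probs are illustrative and are not part of the exercise.

```r
# Built-in mtcars data: am is 0/1 (automatic/manual), wt is weight in 1000 lbs.
# family = binomial makes glm() fit a logistic regression.
car_model <- glm(am ~ wt, data = mtcars, family = binomial)

# type = "response" returns predicted probabilities;
# the default (type = "link") returns log-odds instead.
car_probs <- predict(car_model, type = "response")
range(car_probs)
```

If family is left out, glm() falls back to its default gaussian family and fits an ordinary linear model, which is why the argument has to be given explicitly.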
You will call summary() and broom::glance() to see two different ways of examining a logistic regression model. One of the diagnostics that you will look at is the analog to \(R^2\), called pseudo-\(R^2\).
$$ \text{pseudo-}R^2 = 1 - \frac{\text{deviance}}{\text{null.deviance}} $$
You can think of deviance as analogous to variance: it is a measure of the variation in categorical data. The pseudo-\(R^2\) is analogous to \(R^2\) for standard regression: \(R^2\) is a measure of the "variance explained" of a regression model. The pseudo-\(R^2\) is a measure of the "deviance explained".
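To make "deviance explained" concrete, here is a small sketch that again uses the built-in mtcars data instead of sparrow. It assumes the null deviance is the deviance of an intercept-only model, which the code checks; the names car_model, null_model and pseudo_r2 are illustrative only.

```r
# Logistic regression on built-in data (not the sparrow data).
car_model <- glm(am ~ wt, data = mtcars, family = binomial)

# An intercept-only model: its deviance should match the null deviance
# stored in the fitted model object.
null_model <- glm(am ~ 1, data = mtcars, family = binomial)
all.equal(deviance(null_model), car_model$null.deviance)

# Pseudo-R-squared: the fraction of the null deviance that the model explains.
pseudo_r2 <- 1 - car_model$deviance / car_model$null.deviance
pseudo_r2
```

A value near 0 means the predictors barely improve on the intercept-only model, while a value near 1 means the model accounts for most of the deviance.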
This exercise is part of the course
Supervised Learning in R: Regression
Exercise instructions
- As suggested in the video, you will predict on the outcomes TRUE and FALSE. Create a new column survived in the sparrow data frame that is TRUE when status == "Survived".
- Create the formula fmla that expresses survived as a function of the variables of interest. Print it.
- Fit a logistic regression model to predict the probability of sparrow survival. Assign the model to the variable sparrow_model.
- Call summary() to see the coefficients of the model, the deviance and the null deviance.
- Call glance() on the model to see the deviances and other diagnostics in a data frame. Assign the output from glance() to the variable perf.
- Calculate the pseudo-\(R^2\).
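The steps above can be sketched end to end on a stand-in data frame. Everything about toy_sparrow below is an assumption: it is simulated with arbitrary values so the code runs without the real sparrow data, and its results say nothing about actual sparrows. The sketch also assumes the broom package may not be installed and falls back to base R in that case.

```r
# Simulated stand-in for the sparrow data (arbitrary values, illustration only).
set.seed(2024)
n <- 100
toy_sparrow <- data.frame(
  total_length = rnorm(n, mean = 160, sd = 4),
  weight       = rnorm(n, mean = 25, sd = 1.5),
  humerus      = rnorm(n, mean = 0.73, sd = 0.02)
)
p_survive <- plogis(-0.25 * (toy_sparrow$total_length - 160))
toy_sparrow$status <- ifelse(runif(n) < p_survive, "Survived", "Perished")

# Logical outcome column: TRUE when the bird survived.
toy_sparrow$survived <- toy_sparrow$status == "Survived"

# Formula expressing survived as a function of the three measurements.
(fmla <- survived ~ total_length + weight + humerus)

# Logistic regression model.
toy_model <- glm(fmla, data = toy_sparrow, family = binomial)
summary(toy_model)

# One-row data frame of diagnostics; base R fallback if broom is unavailable.
if (requireNamespace("broom", quietly = TRUE)) {
  perf <- broom::glance(toy_model)
} else {
  perf <- data.frame(deviance = toy_model$deviance,
                     null.deviance = toy_model$null.deviance)
}

# Pseudo-R-squared from the two deviances.
(pseudoR2 <- 1 - perf$deviance / perf$null.deviance)
```

Reading the deviances from the glance() output instead of from the model object keeps every diagnostic in one data frame, which is convenient when comparing several models.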
Hands-on interactive exercise
Try this exercise by completing this sample code.
# sparrow is available
summary(sparrow)
# Create the survived column
sparrow$survived <- ___
# Create the formula
(fmla <- _____)
# Fit the logistic regression model
sparrow_model <- ___
# Call summary
___
# Call glance
(perf <- ___)
# Calculate pseudo-R-squared
(pseudoR2 <- ___)