Fit a multivariable logistic regression
Using the knowledge gained in the video, you will revisit the crab dataset to fit a multivariable logistic regression model. In Chapter 2 you fitted a logistic regression with width as the explanatory variable. In this exercise you will analyze the effect of adding color as an additional explanatory variable.
The color variable has a natural ordering: medium light, medium, medium dark, and dark. As such, color is an ordinal variable, which in this example you will treat as a quantitative variable.
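To make concrete what treating color as quantitative implies, here is a minimal sketch. The coefficient values and the numeric coding of color (1 for medium light up to 4 for dark) are assumptions chosen only for illustration; they are not estimates from the crab data. With a single slope for color, each one-step increase in color multiplies the odds by the same factor, exp(BETA_COLOR).

```python
import math

# Hypothetical coefficients for illustration only (not fitted to the crab data).
INTERCEPT = -10.0
BETA_WIDTH = 0.45
BETA_COLOR = -0.5


def predicted_probability(width, color):
    """P(y = 1) under logit(p) = INTERCEPT + BETA_WIDTH * width + BETA_COLOR * color."""
    linear_predictor = INTERCEPT + BETA_WIDTH * width + BETA_COLOR * color
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


# Assumed numeric coding: 1 = medium light, 2 = medium, 3 = medium dark, 4 = dark.
probs = [predicted_probability(width=26.0, color=c) for c in (1, 2, 3, 4)]

# Odds ratio between each pair of adjacent color levels, at fixed width.
odds_ratios = [odds(probs[i + 1]) / odds(probs[i]) for i in range(3)]

for c, p in zip((1, 2, 3, 4), probs):
    print(f"color={c}: P(y=1) = {p:.3f}")
print("adjacent odds ratios:", [round(r, 4) for r in odds_ratios])
```

All three adjacent odds ratios are equal. That is the restriction a quantitative coding imposes; a categorical coding would instead allow a separate effect for each color level.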
The crab dataset is preloaded in the workspace. Note that the only difference in the code from the univariate case is in the formula argument, which you will now extend to include the new variable.
This exercise is part of the course
Generalized Linear Models in Python
Exercise instructions
- Import the necessary functions from the statsmodels library for GLMs.
- Define the formula argument where width and color are explanatory variables and y is the response.
- Fit a multivariable logistic regression model using the glm() function.
- Print the model results.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import statsmodels
import ____.____ as sm
from ____.____.____ import glm
# Define model formula
formula = '____ ~ ____'
# Fit GLM
model = glm(____, ____ = ____, ____ = sm.____.____).____
# Print model summary
____(____.____)