Fit a multivariable logistic regression
Using the knowledge gained in the video, you will revisit the crab dataset to fit a multivariable logistic regression model. In Chapter 2 you fitted a logistic regression with width as the explanatory variable. In this exercise you will analyze the effect of adding color as an additional variable.
The color variable has a natural ordering: medium light, medium, medium dark, and dark. Color is therefore an ordinal variable, which in this example you will treat as a quantitative variable.
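To make the "ordinal treated as quantitative" choice concrete, the sketch below compares the design-matrix columns produced by two formulas. It assumes color is coded as the integers 1 to 4 (an assumption about the coding, not something stated here) and uses a tiny made-up data frame rather than the real crab data. The `C(color)` variant is shown only for contrast and is not part of this exercise.

```python
import pandas as pd
from patsy import dmatrix

# Tiny illustrative frame; the values are made up, not taken from the crab dataset.
# Assumption: color is coded 1 (medium light) through 4 (dark).
df = pd.DataFrame({
    "width": [24.5, 26.0, 28.3, 25.1],
    "color": [1, 2, 3, 4],
})

# Quantitative treatment: color enters as a single numeric column,
# so the model estimates one slope per unit step in color.
quantitative = dmatrix("width + color", df)

# Categorical treatment (for contrast only): one dummy column per
# non-reference color level, so three columns for four levels.
categorical = dmatrix("width + C(color)", df)

quant_cols = quantitative.design_info.column_names
cat_cols = categorical.design_info.column_names

print(len(quant_cols), quant_cols)
print(len(cat_cols), cat_cols)
```

Treating color as quantitative costs one parameter instead of three, at the price of assuming the effect on the log-odds is the same for every step along the color scale.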
The crab dataset is preloaded in the workspace. Note that the only difference in the code from the univariate case is in the formula argument, which you will now extend to incorporate the new variable.
This exercise is part of the course
Generalized Linear Models in Python
Exercise instructions
- Import the necessary functions from the statsmodels library for GLMs.
- Define the formula argument where width and color are explanatory variables and y is the response.
- Fit a multivariable logistic regression model using the glm() function.
- Print the model results.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import statsmodels
import ____.____ as sm
from ____.____.____ import glm
# Define model formula
formula = '____ ~ ____'
# Fit GLM
model = glm(____, ____ = ____, ____ = sm.____.____).____
# Print model summary
____(____.____)