The effect of multicollinearity
Using the crab dataset, you will analyze the effects of multicollinearity. Recall that multicollinearity can show up in the following ways:
- A coefficient is not significant, even though its variable is highly correlated with \(y\).
- Adding or removing a variable significantly changes the other coefficients.
- A coefficient has an illogical sign.
- Explanatory variables have high pairwise correlation.
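The last symptom is the easiest to check directly. The sketch below is only an illustration: it uses synthetic numbers as a stand-in for the crab data (the names weight and width are borrowed from the exercise, the values are invented), computes their Pearson correlation by hand, and converts it into a variance inflation factor. The VIF is a common diagnostic that the list above does not name, and the formula 1 / (1 - r^2) used here holds only when there are exactly two explanatory variables.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Synthetic stand-in data (NOT the real crab dataset): width is built
# from weight plus noise, so the two predictors are strongly correlated.
random.seed(0)
weight = [random.gauss(2.4, 0.6) for _ in range(200)]
width = [20 + 2.5 * w + random.gauss(0, 1.0) for w in weight]

r = pearson(weight, width)

# With exactly two predictors, the VIF of either one is 1 / (1 - r^2).
vif = 1 / (1 - r ** 2)

print(f"pairwise correlation: {r:.3f}")
print(f"variance inflation factor: {vif:.2f}")
```

A correlation close to 1 (and a correspondingly large VIF) suggests that the two variables carry much of the same information, which is when the other symptoms in the list tend to appear.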
This exercise is part of the course
Generalized Linear Models in Python
Exercise instructions
- Import necessary functions from the statsmodels library for GLMs.
- Fit a multivariate logistic regression model with weight and width as explanatory variables and y as the response.
- View model results using the print() function.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import statsmodels
import ____.____ as sm
from ____.____.____ import glm
# Define model formula
formula = '____ ~ ____'
# Fit GLM
model = glm(____, ____ = ____, ____ = sm.____.____).____
# Print model summary
____(____.____)