The effect of multicollinearity

Using the crab dataset, you will analyze the effects of multicollinearity. Recall that multicollinearity can show up in the following ways:

  • A coefficient is not significant, even though the variable is highly correlated with \(y\).
  • Adding or removing a variable significantly changes the coefficients.
  • A coefficient has an illogical sign.
  • Variables have high pairwise correlation.
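
The last symptom in the list is the easiest to check directly. The sketch below is only an illustration: it uses synthetic data rather than the crab dataset, and the helper names pearson and vif_two_predictors are hypothetical. It computes the pairwise correlation between two made-up predictors and the resulting variance inflation factor, which for a model with exactly two predictors reduces to 1 / (1 - r^2).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(xs, ys):
    """Variance inflation factor when the model has exactly two predictors.

    In that case the R^2 from regressing one predictor on the other
    equals the squared pairwise correlation.
    """
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r ** 2)


# Synthetic stand-ins for two strongly related predictors (assumed values,
# not taken from the crab dataset).
random.seed(0)
width = [random.gauss(26.0, 2.0) for _ in range(200)]
weight = [0.25 * w - 4.0 + random.gauss(0.0, 0.2) for w in width]

r = pearson(width, weight)
vif = vif_two_predictors(width, weight)
print(f"pairwise correlation: {r:.3f}")
print(f"variance inflation factor: {vif:.1f}")
```

A common rule of thumb treats a variance inflation factor well above 2.5 as a sign that multicollinearity deserves a closer look, although the exact threshold is a judgment call.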

This exercise is part of the course

Generalized Linear Models in Python

Exercise instructions

  • Import the necessary functions from the statsmodels library for GLMs.
  • Fit a multivariate logistic regression model with weight and width as explanatory variables and y as the response.
  • View the model results using the print() function.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import statsmodels
import ____.____ as sm
from ____.____.____ import glm

# Define model formula
formula = '____ ~ ____'

# Fit GLM
model = glm(____, ____ = ____, ____ = sm.____.____).____

# Print model summary
____(____.____)