Compute VIF
As you learned in the video, one of the most widely used diagnostics for multicollinearity is the variance inflation factor, or VIF, which is computed for each explanatory variable.
Recall from the video that the rule-of-thumb threshold is a VIF of 2.5: if the VIF of a variable is above 2.5, you should consider that multicollinearity is affecting your fitted model.
The previously fitted model and crab dataset are preloaded in the workspace.
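To make the quantity concrete: the VIF of explanatory variable j is commonly defined as 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing variable j on the other explanatory variables. The following is a minimal sketch of that definition using only NumPy. It is not the statsmodels implementation, and the helper name vif_from_definition and the synthetic columns a, b and c are made up for illustration (they are not the crab data).

```python
import numpy as np

def vif_from_definition(X):
    """Compute VIF for each column of X (2-D array, no intercept column).

    Hypothetical helper: each column is regressed on the remaining
    columns plus a constant, and VIF = 1 / (1 - R^2).
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Design matrix: constant term plus the other explanatory variables
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic data (an assumption for illustration only)
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = a + 0.1 * rng.normal(size=n)  # nearly collinear with a
c = rng.normal(size=n)            # unrelated to a and b

vifs = vif_from_definition(np.column_stack([a, b, c]))
for name, value in zip(["a", "b", "c"], vifs):
    status = "above 2.5" if value > 2.5 else "at or below 2.5"
    print(f"{name}: VIF = {value:.2f} ({status})")
```

In this sketch the constant is added inside the helper. The statsmodels function used in the exercise regresses each column on the other columns of the matrix exactly as it is passed in, which is why the exercise adds an Intercept column of ones to X first.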
This exercise is part of the course
Generalized Linear Models in Python
Exercise instructions
- From statsmodels import variance_inflation_factor.
- From the crab dataset choose weight, width and color and save as X. Add an Intercept column of ones to X.
- Using the pandas function DataFrame(), create an empty vif dataframe and add the column names of X in the column Variables.
- For each variable compute the VIF using the variance_inflation_factor() function and save it in the vif dataframe under the column name VIF.
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Import functions
from statsmodels.stats.outliers_influence import ____
# Get variables for which to compute VIF and add intercept term
X = ____[[____, ____, ____]]
X[____] = 1
# Compute and view VIF
vif = pd.____
vif["variables"] = X.____
vif["VIF"] = [____(X.values, i) for i in range(X.shape[1])]
# View results using print
____(____)