Modeling with a categorical variable

In previous exercises you fitted a logistic regression model with color and width as explanatory variables, treating color as a quantitative variable. In this exercise you will treat color as a categorical variable: when the model matrix is constructed, color is encoded as 3 variables with 0/1 coding.

Recall that the default encoding in dmatrix() uses the first group as the reference group. To view the model matrix as a dataframe, set the additional dmatrix() argument return_type to 'dataframe'.
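
Writing out the linear predictors may help show how the two treatments of color differ. The symbols below (\pi for the probability of the response, \beta and \gamma for coefficients, \mathbf{1}[\cdot] for a 0/1 indicator) are notation assumed here for illustration. The categorical version is shown with color only and with level 1 as the reference group.

```latex
% Color as a quantitative variable: a single slope for color
\operatorname{logit}(\pi) = \beta_0 + \beta_1\,\text{color} + \beta_2\,\text{width}

% Color as a categorical variable, level 1 as the reference group:
% one 0/1 indicator for each non-reference level
\operatorname{logit}(\pi) = \gamma_0
  + \gamma_2\,\mathbf{1}[\text{color}=2]
  + \gamma_3\,\mathbf{1}[\text{color}=3]
  + \gamma_4\,\mathbf{1}[\text{color}=4]
```

In the categorical form, each \gamma_k is the difference in log-odds between color level k and the reference level, so no linear trend across the color codes is assumed.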

The color variable has a natural ordering as follows:
1: medium light
2: medium
3: medium dark
4: dark

The crab dataset is preloaded in the workspace.
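
The 0/1 encoding with a reference group can be illustrated without any libraries. The sketch below is not patsy's implementation: `treatment_code` is a hypothetical helper, the column names only imitate the style of a model matrix, and the color values are made up rather than taken from the crab dataset.

```python
# Dependency-free sketch of treatment (reference-group) coding.
# `treatment_code` is a hypothetical helper, not part of patsy.

def treatment_code(values, reference=None):
    """Return (column_names, rows) of 0/1 dummies, omitting the reference level."""
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]  # default: the first level is the reference group
    if reference not in levels:
        raise ValueError(f"reference {reference!r} is not an observed level")
    kept = [lvl for lvl in levels if lvl != reference]
    names = ["Intercept"] + [f"color[T.{lvl}]" for lvl in kept]
    # Each row: intercept column of 1s, then one indicator per kept level
    rows = [[1] + [int(v == lvl) for lvl in kept] for v in values]
    return names, rows


# Made-up color codes (1 = medium light ... 4 = dark), not the crab data
colors = [2, 3, 1, 4, 2]

# Default: level 1 is the reference, so it gets no column of its own
names, rows = treatment_code(colors)
print(names)
for row in rows:
    print(row)

# Choosing level 4 (dark) as the reference group instead
names_ref4, rows_ref4 = treatment_code(colors, reference=4)
print(names_ref4)
for row in rows_ref4:
    print(row)
```

Four color levels produce three indicator columns in either case. An observation in the reference group has zeros in all three, so its effect is absorbed by the intercept.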

This exercise is part of the course

Generalized Linear Models in Python

Interactive exercise

Try this exercise by completing the sample code below.

# Construct model matrix
model_matrix = ____('C(____, ____(____))' , data = ____, 
                       return_type = 'dataframe')

# Print first 5 rows of model matrix dataframe
print(____.____)

# Fit and print the results of a glm model with the above model matrix configuration
model = ____('____ ~ ____(____, ____(____))', data = ____, 
            family = ____).____

print(____.____)