Modeling with categorical variable
In previous exercises you have fitted a logistic regression model with color as explanatory variable along with width where you treated the color as quantitative variable. In this exercise you will treat color as a categorical variable which when you construct the model matrix will encode the color into 3 variables with 0/1 encoding.
Recall that the default encoding in dmatrix() uses the first group as a reference group. To view model matrix as a dataframe an additional argument in dmatrix(), namely, return_type will be set to 'dataframe'.
The color variable has a natural ordering as follows:
1: medium light
2: medium
3: medium dark
4: dark
The crab dataset is preloaded in the workspace.
Diese Übung ist Teil des Kurses
Generalized Linear Models in Python
Interaktive Übung
Vervollständige den Beispielcode, um diese Übung erfolgreich abzuschließen.
# Construct model matrix
model_matrix = ____('C(____, ____(____))' , data = ____,
return_type = 'dataframe')
# Print first 5 rows of model matrix dataframe
print(____.____)
# Fit and print the results of a glm model with the above model matrix configuration
model = ____('____ ~ ____(____, ____(____))', data = ____,
family = ____).____
print(____.____)