Modeling with a categorical variable
In previous exercises you fitted a logistic regression model with color and width as explanatory variables, treating color as a quantitative variable. In this exercise you will treat color as a categorical variable. When the model matrix is constructed, the four color levels are encoded into 3 indicator variables with 0/1 encoding, one for each level other than the reference.
Recall that the default encoding in dmatrix() uses the first group as the reference group. To view the model matrix as a dataframe, set the additional dmatrix() argument return_type to 'dataframe'.
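The encoding described above can be illustrated with a minimal sketch. The small dataframe `toy` is a hypothetical stand-in for the crab data (only a `color` column with levels 1 to 4 is assumed), and level 1 is chosen as the reference purely for illustration.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical toy data standing in for the crab dataset (assumption)
toy = pd.DataFrame({"color": [1, 2, 3, 4, 2, 3]})

# Treatment(1) names level 1 as the reference group explicitly;
# return_type='dataframe' returns a pandas DataFrame instead of a DesignMatrix
model_matrix = dmatrix("C(color, Treatment(1))", data=toy,
                       return_type="dataframe")

# One intercept column plus three 0/1 indicator columns (levels 2, 3, 4)
print(model_matrix.head())
```

Rows belonging to the reference level have zeros in all three indicator columns, so the intercept alone describes that group.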
The color variable has a natural ordering as follows:
1: medium light
2: medium
3: medium dark
4: dark
The crab dataset is preloaded in the workspace.
This exercise is part of the course
Generalized Linear Models in Python
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Construct model matrix
model_matrix = ____('C(____, ____(____))', data = ____,
                    return_type = 'dataframe')
# Print first 5 rows of model matrix dataframe
print(____.____)
# Fit and print the results of a glm model with the above model matrix configuration
model = ____('____ ~ ____(____, ____(____))', data = ____,
             family = ____).____
print(____.____)