Modeling with a categorical variable
In previous exercises you fitted a logistic regression model with color and width as explanatory variables, treating color as a quantitative variable. In this exercise you will treat color as a categorical variable. When you construct the model matrix, color will then be encoded into 3 variables with 0/1 encoding.
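To make the 0/1 encoding concrete, here is a minimal pure-Python sketch of reference-group (treatment) coding for a four-level color variable. The helper name treatment_code and the sample color values are hypothetical, chosen only for illustration. It is not how the model matrix is built internally.

```python
def treatment_code(values, levels, reference):
    """Return the indicator column levels and one 0/1 row per value.

    The reference level gets no column of its own (hypothetical helper).
    """
    # Every level except the reference gets its own indicator column.
    columns = [lvl for lvl in levels if lvl != reference]
    rows = [[1 if v == lvl else 0 for lvl in columns] for v in values]
    return columns, rows


# Made-up color codes (1 = medium light, ..., 4 = dark), not the crab data
colors = [1, 2, 3, 4, 2]
columns, rows = treatment_code(colors, levels=[1, 2, 3, 4], reference=1)

print(columns)
for color, row in zip(colors, rows):
    print(color, row)
```

With four levels and one reference group, three indicator columns remain. A row of all zeros identifies an observation in the reference group.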
Recall that the default encoding in dmatrix() uses the first group as the reference group. To view the model matrix as a dataframe, you will set an additional argument in dmatrix(), namely return_type, to 'dataframe'.
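The effect of the reference group and of return_type can be sketched on a small made-up dataframe. The toy frame below is an assumption standing in for the crab data (it contains only a color column), and the choice of level 4 as an alternative reference is purely illustrative.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical stand-in for the crab data: color codes 1-4 only
toy = pd.DataFrame({'color': [1, 2, 3, 4, 2]})

# Default treatment coding: the first level (1) is the reference group
default_matrix = dmatrix('C(color)', data=toy, return_type='dataframe')

# Explicit reference group: level 4 (dark) instead of the first level
ref4_matrix = dmatrix('C(color, Treatment(4))', data=toy,
                      return_type='dataframe')

# Each matrix has an intercept plus three 0/1 indicator columns
print(default_matrix.head())
print(ref4_matrix.head())
```

In both matrices the row belonging to the reference group has zeros in all three indicator columns. Only the choice of which color plays that role differs.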
The color variable has a natural ordering as follows:
1: medium light
2: medium
3: medium dark
4: dark
The crab dataset is preloaded in the workspace.
This exercise is part of the course
Generalized Linear Models in Python
Interactive exercise
Try this exercise by completing the sample code below.
# Construct model matrix
model_matrix = ____('C(____, ____(____))', data=____,
                    return_type='dataframe')
# Print first 5 rows of model matrix dataframe
print(____.____)
# Fit and print the results of a glm model with the above model matrix configuration
model = ____('____ ~ ____(____, ____(____))', data=____,
             family=____).____
print(____.____)