Modeling with a categorical variable
In previous exercises you fitted a logistic regression model with color and width as explanatory variables, treating color as a quantitative variable. In this exercise you will treat color as a categorical variable. When the model matrix is constructed, color is then encoded as 3 variables with 0/1 encoding, one for each level other than the reference level.
Recall that the default encoding in dmatrix() uses the first group as the reference group. To view the model matrix as a dataframe, set the additional dmatrix() argument return_type to 'dataframe'.
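To make the default encoding concrete, here is a minimal sketch on a small made-up dataframe. The name toy and its values are assumptions for illustration only and are not the crab data. It shows that C(color) yields an intercept plus three 0/1 columns, with level 1 absorbed as the reference group.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical stand-in for the crab data: only a color column coded 1-4.
toy = pd.DataFrame({"color": [1, 2, 3, 4, 2, 3]})

# C() marks color as categorical. The default treatment coding uses the
# first level (1) as the reference and creates one 0/1 column for each
# of the remaining levels (2, 3, 4).
default_matrix = dmatrix("C(color)", data=toy, return_type="dataframe")

# Column names identify which level each indicator column stands for.
print(default_matrix.columns.tolist())
print(default_matrix.head())
```

A row with color equal to 1 has zeros in all three indicator columns, which is what it means for level 1 to be the reference group.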
The color variable has a natural ordering as follows:
1: medium light
2: medium
3: medium dark
4: dark
The crab dataset is preloaded in the workspace.
This exercise is part of the course Generalized Linear Models in Python.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Construct model matrix
model_matrix = ____('C(____, ____(____))', data = ____,
                    return_type = 'dataframe')

# Print first 5 rows of model matrix dataframe
print(____.____)

# Fit and print the results of a glm model with the above model matrix configuration
model = ____('____ ~ ____(____, ____(____))', data = ____,
             family = ____).____
print(____.____)