Modeling with a categorical variable

In previous exercises you fitted a logistic regression model with color and width as explanatory variables, treating color as a quantitative variable. In this exercise you will treat color as a categorical variable. Because color has four levels and one of them serves as the reference group, the model matrix encodes color as 3 indicator variables with 0/1 coding.

Recall that the default encoding in dmatrix() uses the first group as the reference group. To view the model matrix as a dataframe, set the additional dmatrix() argument return_type to 'dataframe'.
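To make the reference-group encoding concrete, here is a minimal sketch on a small made-up dataframe. The name toy and its values are assumptions for illustration only and are not the crab dataset. It passes Treatment(1) so that level 1 is named explicitly as the reference group.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical toy data (not the crab dataset): one row per observation
toy = pd.DataFrame({'color': [1, 2, 3, 4, 2]})

# C(...) marks color as categorical; Treatment(1) sets level 1 as the reference
toy_matrix = dmatrix('C(color, Treatment(1))', data=toy,
                     return_type='dataframe')

# Each non-reference level gets its own 0/1 column next to the intercept
print(toy_matrix)
```

Rows with color 1 have zeros in all three indicator columns, so the intercept alone represents the reference group.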

The color variable has a natural ordering as follows:
1: medium light
2: medium
3: medium dark
4: dark
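The 0/1 coding of these four levels can also be spelled out by hand. The sketch below is a plain-Python illustration, not how patsy implements it; the helper treatment_code is a hypothetical name introduced here.

```python
# Color codes and labels as listed in the exercise
COLOR_LABELS = {1: 'medium light', 2: 'medium', 3: 'medium dark', 4: 'dark'}


def treatment_code(level, levels, reference):
    """Return 0/1 indicators for every level except the reference (hypothetical helper)."""
    return [1 if level == other else 0 for other in levels if other != reference]


levels = sorted(COLOR_LABELS)

# With level 1 as the reference, each color maps to 3 indicator values
for level in levels:
    row = treatment_code(level, levels, reference=1)
    print(f'{level} ({COLOR_LABELS[level]}): {row}')
```

Treating color this way drops the assumption that each step in the ordering changes the response by the same amount, at the cost of estimating one coefficient per non-reference level.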

The crab dataset is preloaded in the workspace.

This exercise is part of the course

Generalized Linear Models in Python

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Construct model matrix
model_matrix = ____('C(____, ____(____))' , data = ____, 
                       return_type = 'dataframe')

# Print first 5 rows of model matrix dataframe
print(____.____)

# Fit and print the results of a glm model with the above model matrix configuration
model = ____('____ ~ ____(____, ____(____))', data = ____, 
            family = ____).____

print(____.____)