
Modeling with a categorical variable

In previous exercises you fitted a logistic regression model with color and width as explanatory variables, treating color as a quantitative variable. In this exercise you will treat color as a categorical variable. When the model matrix is constructed, color is then encoded as 3 variables with 0/1 encoding, one for each level other than the reference group.

Recall that the default encoding in dmatrix() uses the first group as the reference group. To view the model matrix as a dataframe, set the additional dmatrix() argument return_type to 'dataframe'.
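The default encoding can be seen on a small example. The sketch below is an illustration only: it assumes patsy and pandas are installed, and the `toy` dataframe is hypothetical data standing in for the color column, not the preloaded crab dataset.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical toy data standing in for the color column (levels 1 to 4)
toy = pd.DataFrame({"color": [1, 2, 3, 4, 2, 3]})

# C() marks color as categorical; with the default treatment coding,
# the first level (1) is the reference group and gets no column of its own
default_matrix = dmatrix("C(color)", data=toy, return_type="dataframe")

# Column names show which levels received a 0/1 indicator
print(default_matrix.columns.tolist())
print(default_matrix.head())
```

The resulting dataframe has an intercept column plus one indicator column each for levels 2, 3 and 4. A row with color 1 has zeros in all three indicator columns.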

The color variable has a natural ordering as follows:
1: medium light
2: medium
3: medium dark
4: dark

The crab dataset is preloaded in the workspace.

This exercise is part of the course

Generalized Linear Models in Python


Hands-on interactive exercise

Try this exercise by completing this sample code.

# Construct model matrix
model_matrix = ____('C(____, ____(____))' , data = ____, 
                       return_type = 'dataframe')

# Print first 5 rows of model matrix dataframe
print(____.____)

# Fit and print the results of a glm model with the above model matrix configuration
model = ____('____ ~ ____(____, ____(____))', data = ____, 
            family = ____).____

print(____.____)