
Coding categorical variables

In previous exercises you practiced creating model matrices for continuous variables and applying variable transformations. In this exercise you will practice different ways of coding a categorical variable.

Categorical variables let you analyze and compare relationships across groups or factor levels. Coefficient estimates for a categorical variable are interpreted relative to a reference group, so the choice of reference group matters. Depending on the study, you may want to change it from the default, most often because a different baseline makes the coefficient estimates easier to interpret and more relevant to the question at hand.

For this exercise you will revisit the crab dataset, where color and spine are categorical variables.

The dataset crab is preloaded in the workspace.
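To make the role of the reference group concrete, here is a minimal sketch using patsy's `dmatrix`. It does not use the real crab data: `crab_demo` is a small hypothetical stand-in, and the assumption that `color` is stored as integer codes 1 to 4 is made only for illustration. `C()` marks a variable as categorical, and `Treatment(reference=...)` selects which level is used as the baseline.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical stand-in for the crab data (assumed integer codes, not the real values)
crab_demo = pd.DataFrame({
    "color": [1, 2, 3, 4, 2, 3],
    "spine": [3, 3, 1, 2, 1, 3],
})

# Default treatment coding: the first level (1) is the reference group,
# so the matrix has an intercept plus one dummy column for each other level
default_coding = dmatrix("C(color)", data=crab_demo, return_type="dataframe")
print(default_coding.head())

# Same variable with level 4 as the reference group instead
releveled = dmatrix("C(color, Treatment(reference=4))",
                    data=crab_demo, return_type="dataframe")
print(releveled.head())
```

In both matrices the reference level has no dummy column of its own: rows belonging to it have zeros in every dummy column and are represented by the intercept alone, so each remaining coefficient is read as a difference from that baseline.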

This exercise is part of the course

Generalized Linear Models in Python


Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import function dmatrix
from ____ import ____

# Construct and print model matrix for color as categorical variable
print(____('____', data=____,
           return_type='dataframe').head())