1. Learn
  2. /
  3. Courses
  4. /
  5. Generalized Linear Models in Python

Exercise

Coding categorical variables

In previous exercises you practiced creating model matrices for continuous variables and applying variable transformation. During this exercise you will practice the ways of coding a categorical variable.

Categorical data provide a way to analyze and compare relationships given different groups or factors. Hence, choosing a reference group is important and often, depending on the study at hand, you might want to change the reference group, from the default one. One frequently used reason for changing the reference group is that the interpretation of coefficient estimates is more applicable and interesting given the study.

For this exercise you will revisit the crab dataset where colorand spine are categorical variables.

The dataset crab is preloaded in the workspace.

Instructions 1/2

undefined XP
    1
    2
  • Import dmatrix from patsy.
  • Using dmatrix() construct and print a model matrix with color as categorical variable using C()function.