
Collapsing categories

Users of a local dog adoption website have complained that there are too many options: as they browse the different types of dogs, they get lost in the overwhelming amount of choice. To simplify the data, you are going through each column and collapsing categories where appropriate. To preserve the original data, you will store the results in new columns of the dogs dataset rather than overwriting the existing ones. You will start with the coat column, whose frequency table is listed here:

short          1969
medium          565
wirehaired      220
long            180
medium-long       3
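
A frequency table like this one is typically produced with the pandas value_counts() method. The sketch below is only an illustration: the course's actual dogs file is not shown here, so it rebuilds a hypothetical stand-in DataFrame from the counts listed above, and the names coat_counts and coat_freq are assumptions made for the example.

```python
import pandas as pd

# Hypothetical stand-in for the dogs dataset: one row per dog,
# rebuilt from the counts in the frequency table above.
coat_counts = {
    "short": 1969,
    "medium": 565,
    "wirehaired": 220,
    "long": 180,
    "medium-long": 3,
}
dogs = pd.DataFrame(
    {"coat": [coat for coat, n in coat_counts.items() for _ in range(n)]}
)

# value_counts() tallies each category, most frequent first
coat_freq = dogs["coat"].value_counts()
print(coat_freq)
```

The two smallest groups, wirehaired and medium-long, are the natural candidates for collapsing into a larger category.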

This exercise is part of the course

Working with Categorical Data in Python


Exercise instructions

  • Create a dictionary named update_coats to map both wirehaired and medium-long to medium.
  • Collapse the categories listed in this new dictionary and save this as a new column, coat_collapsed.
  • Convert this new column into a categorical Series.
  • Print the frequency table of this new Series.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create the update_coats dictionary
____

# Create a new column, coat_collapsed
dogs["coat_collapsed"] = ____

# Convert the column to categorical
____

# Print the frequency table
print(____)
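
One possible way to fill in the blanks is sketched below. It is not necessarily the course's reference solution: it assumes a hypothetical stand-in for the dogs DataFrame (rebuilt from the frequency table in the exercise description) and uses Series.replace() with a dictionary to do the collapsing, which is one of several approaches that would work.

```python
import pandas as pd

# Hypothetical stand-in for the dogs dataset, sized to match the
# frequency table in the exercise description.
dogs = pd.DataFrame(
    {
        "coat": ["short"] * 1969
        + ["medium"] * 565
        + ["wirehaired"] * 220
        + ["long"] * 180
        + ["medium-long"] * 3
    }
)

# Create the update_coats dictionary: old category -> new category
update_coats = {"wirehaired": "medium", "medium-long": "medium"}

# Create a new column, coat_collapsed; values not in the dictionary are kept as-is
dogs["coat_collapsed"] = dogs["coat"].replace(update_coats)

# Convert the column to categorical
dogs["coat_collapsed"] = dogs["coat_collapsed"].astype("category")

# Print the frequency table
print(dogs["coat_collapsed"].value_counts())
```

With these counts, medium absorbs the wirehaired and medium-long rows and grows from 565 to 788, leaving three categories (short, medium, long). The original coat column is left untouched.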