Collapsing categories
One problem that users of a local dog adoption website have voiced is that there are too many options. As they look through the different types of dogs, they get lost in the overwhelming amount of choice. To simplify the data, you are going through each column and collapsing categories where appropriate. To preserve the original data, you are going to make new, updated columns in the dogs dataset. You will start with the coat column. The frequency table is listed here:
short 1969
medium 565
wirehaired 220
long 180
medium-long 3
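A frequency table like the one above can be produced with the pandas value_counts() method. The sketch below is only an illustration: the real dogs dataset is not shown here, so a stand-in DataFrame containing just a coat column is rebuilt from the counts listed above.

```python
import pandas as pd

# Assumption: stand-in for the dogs dataset, rebuilt from the counts in the
# frequency table above. The real dataset has more columns than this.
coat_counts = {
    "short": 1969,
    "medium": 565,
    "wirehaired": 220,
    "long": 180,
    "medium-long": 3,
}
dogs = pd.DataFrame(
    {"coat": [coat for coat, n in coat_counts.items() for _ in range(n)]}
)

# Count how often each coat type occurs, most frequent first
coat_table = dogs["coat"].value_counts()
print(coat_table)
```

Note that medium-long appears only 3 times, which is what makes it a natural candidate for collapsing into a larger category.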
This exercise is part of the course
Working with Categorical Data in Python
Exercise instructions
- Create a dictionary named update_coats to map both wirehaired and medium-long to medium.
- Collapse the categories listed in this new dictionary and save this as a new column, coat_collapsed.
- Convert this new column into a categorical Series.
- Print the frequency table of this new Series.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the update_coats dictionary
____
# Create a new column, coat_collapsed
dogs["coat_collapsed"] = ____
# Convert the column to categorical
____
# Print the frequency table
print(____)
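One possible way to fill in the blanks is sketched below. It is not necessarily the course's reference solution: it assumes that Series.replace() with a dictionary is an acceptable way to collapse the categories, and it uses a stand-in dogs DataFrame rebuilt from the frequency table, since the real dataset is not available here.

```python
import pandas as pd

# Assumption: stand-in for the dogs dataset, rebuilt from the frequency table
coat_counts = {
    "short": 1969,
    "medium": 565,
    "wirehaired": 220,
    "long": 180,
    "medium-long": 3,
}
dogs = pd.DataFrame(
    {"coat": [coat for coat, n in coat_counts.items() for _ in range(n)]}
)

# Create the update_coats dictionary (old category -> new category)
update_coats = {"wirehaired": "medium", "medium-long": "medium"}

# Create a new column, coat_collapsed; values not in the dictionary are unchanged
dogs["coat_collapsed"] = dogs["coat"].replace(update_coats)

# Convert the column to categorical
dogs["coat_collapsed"] = dogs["coat_collapsed"].astype("category")

# Print the frequency table
collapsed_table = dogs["coat_collapsed"].value_counts()
print(collapsed_table)
```

With the counts listed earlier, the medium category in the collapsed column should hold 565 + 220 + 3 = 788 rows, while short and long keep their original counts. The original coat column is left untouched, so the uncollapsed data is still available.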