Counting categorical variables
Counting is a great way to get an overview of your data and to spot curiosities that you might not notice otherwise. In this exercise, you'll count the number of each type of store and the number of each department number using the DataFrames you created in the previous exercise:
# Drop duplicate store/type combinations
store_types = sales.drop_duplicates(subset=["store", "type"])
# Drop duplicate store/department combinations
store_depts = sales.drop_duplicates(subset=["store", "department"])
The store_types and store_depts DataFrames you created in the last exercise are available, and pandas is imported as pd.
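To see why the duplicates are dropped before counting, here is a minimal sketch on made-up data. The toy sales DataFrame below is an assumption for illustration only; the course's real sales data has more rows and columns. Each store appears on several rows, so counting types directly on sales would count the same store more than once.

```python
import pandas as pd

# Hypothetical toy data; NOT the course's real `sales` DataFrame.
sales = pd.DataFrame({
    "store": [1, 1, 1, 2, 2, 3],
    "type": ["A", "A", "A", "B", "B", "A"],
    "department": [1, 1, 2, 1, 2, 1],
})

# Keep one row per unique store/type combination
store_types = sales.drop_duplicates(subset=["store", "type"])

# Keep one row per unique store/department combination
store_depts = sales.drop_duplicates(subset=["store", "department"])

# Compare row counts before and after deduplication
print(len(sales), len(store_types), len(store_depts))
```

In this toy data, store_types ends up with one row per store, so any count of type values taken from it counts stores rather than raw sales rows.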
This exercise is part of the course Data Manipulation with pandas.
Exercise instructions
- Count the number of stores of each store type in store_types.
- Count the proportion of stores of each store type in store_types.
- Count the number of stores of each department in store_depts, sorting the counts in descending order.
- Count the proportion of stores of each department in store_depts, sorting the proportions in descending order.
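One common way to produce counts and proportions like these is the pandas Series method value_counts(), which accepts sort and normalize arguments. The sketch below uses a small made-up DataFrame (toy_store_types is a hypothetical name, not one of the exercise's variables), so it illustrates the method rather than the exercise's expected answer.

```python
import pandas as pd

# Hypothetical toy data standing in for a deduplicated store table
toy_store_types = pd.DataFrame({
    "store": [1, 2, 3, 4],
    "type": ["A", "B", "A", "A"],
})

# Number of rows for each distinct value of "type"
type_counts = toy_store_types["type"].value_counts()
print(type_counts)

# normalize=True returns fractions of the total instead of raw counts;
# sort=True (the default) orders the result from most to least frequent
type_props = toy_store_types["type"].value_counts(sort=True, normalize=True)
print(type_props)
```

With three type A stores and one type B store, the counts are 3 and 1, and the proportions are 0.75 and 0.25.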
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Count the number of stores of each type
store_counts = ____
print(store_counts)
# Get the proportion of stores of each type
store_props = ____
print(store_props)
# Count the number of stores for each department and sort
dept_counts_sorted = ____
print(dept_counts_sorted)
# Get the proportion of stores in each department and sort
dept_props_sorted = ____.____(sort=____, normalize=____)
print(dept_props_sorted)