Dropping duplicates
Removing duplicates is an essential skill to get accurate counts because often, you don't want to count the same thing multiple times. In this exercise, you'll create some new DataFrames using unique values from sales
.
sales
is available and pandas
is imported as pd
.
This exercise is part of the course
Data Manipulation with pandas
Exercise instructions
- Remove rows of
sales
with duplicate pairs ofstore
andtype
and save asstore_types
and print the head. - Remove rows of
sales
with duplicate pairs ofstore
anddepartment
and save asstore_depts
and print the head. - Subset the rows that are holiday weeks using the
is_holiday
column, and drop the duplicatedate
s, saving asholiday_dates
. - Select the
date
column ofholiday_dates
, and print.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Drop duplicate store/type combinations
store_types = ____
print(store_types.head())
# Drop duplicate store/department combinations
store_depts = ____
print(store_depts.head())
# Subset the rows where is_holiday is True and drop duplicate dates
holiday_dates = sales[sales[____]].drop_duplicates(____)
# Print date col of holiday_dates
print(____)