Dropping duplicates
Removing duplicates is an essential skill for getting accurate counts, because you often don't want to count the same thing more than once. In this exercise, you'll create some new DataFrames using unique values from sales.
sales is available and pandas is imported as pd.
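As a rough illustration of the idea, the sketch below uses a small made-up DataFrame (the name visits and its values are assumptions for illustration, not the course's sales data). It shows how the subset argument of drop_duplicates() keeps only the first row for each distinct combination of the listed columns.

```python
import pandas as pd

# Hypothetical toy data, not the course's sales dataset
visits = pd.DataFrame({
    "store": [1, 1, 1, 2, 2],
    "type": ["A", "A", "A", "B", "B"],
    "department": [1, 1, 2, 1, 1],
})

# Keep the first row for each distinct (store, type) pair
unique_store_type = visits.drop_duplicates(subset=["store", "type"])

# Keep the first row for each distinct (store, department) pair
unique_store_dept = visits.drop_duplicates(subset=["store", "department"])

print(unique_store_type)
print(unique_store_dept)
```

Which columns go into subset decides what counts as a duplicate: the same five rows collapse to two rows for store/type pairs but to three rows for store/department pairs.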
This exercise is part of the course
Data Manipulation with pandas
Exercise instructions
- Remove rows of sales with duplicate pairs of store and type, save the result as store_types, and print the head.
- Remove rows of sales with duplicate pairs of store and department, save the result as store_depts, and print the head.
- Subset the rows that are holiday weeks using the is_holiday column, and drop the duplicate dates, saving the result as holiday_dates.
- Select the date column of holiday_dates, and print it.
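The filter-then-deduplicate pattern in the last two steps can be sketched on made-up data. The DataFrame weeks and its values below are assumptions for illustration only; the column names simply mirror the ones mentioned above.

```python
import pandas as pd

# Hypothetical toy data, not the course's sales dataset
weeks = pd.DataFrame({
    "store": [1, 2, 1, 2, 1],
    "date": ["2010-02-05", "2010-02-05", "2010-02-12", "2010-02-12", "2010-02-19"],
    "is_holiday": [False, False, True, True, False],
})

# A Boolean column can be used directly as a row mask
holiday_rows = weeks[weeks["is_holiday"]]

# Several stores report the same holiday week, so keep one row per date
unique_holiday_rows = holiday_rows.drop_duplicates(subset="date")

# Selecting a single column with square brackets returns a Series
print(unique_holiday_rows["date"])
```

Filtering first and deduplicating second keeps the intent readable: the mask narrows the rows to holiday weeks, and drop_duplicates() then collapses the repeated dates that come from multiple stores.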
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Drop duplicate store/type combinations
store_types = ____
print(store_types.head())
# Drop duplicate store/department combinations
store_depts = ____
print(store_depts.head())
# Subset the rows where is_holiday is True and drop duplicate dates
holiday_dates = sales[sales[____]].drop_duplicates(____)
# Print date col of holiday_dates
print(____)