Grouping & aggregating
You'll be using .groupby() and .agg() a lot in this course, so it's important to become comfortable with them. In this exercise, your job is to calculate a set of summary statistics about the purchase data broken out by 'device' (Android or iOS) and 'gender' (Male or Female).
You'll then compare these statistics across the subsets, which gives you a baseline for them as potential KPIs to optimize going forward.
The purchase_data DataFrame from the previous exercise has been pre-loaded for you. As a reminder, it contains purchases merged with user demographics.
This exercise is part of the course Customer Analytics and A/B Testing in Python.
Exercise instructions
- Group the purchase_data DataFrame by 'device' and 'gender', in that order.
- Aggregate grouped_purchase_data, finding the 'mean', 'median', and standard deviation ('std') of the purchase price, in that order, across these groups.
- Examine the results. Does the mean differ drastically from the median? How much variability is in each group?
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Group the data
grouped_purchase_data = purchase_data.____(____ = ['____', '____'])
# Aggregate the data
purchase_summary = grouped_purchase_data.____({'price': ['____', '____', '____']})
# Examine the results
print(purchase_summary)
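
To see the same group-then-aggregate pattern end to end, here is a minimal sketch on a small made-up DataFrame. The name toy_purchases, its rows, and the derived names price_stats, mean_minus_median, and relative_spread are all hypothetical and stand in for the pre-loaded purchase_data, whose real values will differ. The last two lines are one possible way to approach the "examine the results" questions, not the only one.

```python
import pandas as pd

# Hypothetical toy data standing in for purchase_data (values are made up).
toy_purchases = pd.DataFrame({
    "device": ["Android", "Android", "Android", "Android", "Android",
               "iOS", "iOS", "iOS", "iOS"],
    "gender": ["Female", "Female", "Male", "Male", "Male",
               "Female", "Female", "Male", "Male"],
    "price":  [100, 300, 200, 200, 500,
               100, 500, 300, 300],
})

# Group by device first, then gender; the order sets the index levels.
grouped = toy_purchases.groupby(by=["device", "gender"])

# Compute mean, median, and sample standard deviation of price per group.
summary = grouped.agg({"price": ["mean", "median", "std"]})
print(summary)

# The result has two-level columns such as ("price", "mean").
# Selecting "price" drops the top level for easier arithmetic.
price_stats = summary["price"]

# A large gap between mean and median hints at a skewed price distribution.
mean_minus_median = price_stats["mean"] - price_stats["median"]

# Standard deviation relative to the mean gives a scale-free view of spread.
relative_spread = price_stats["std"] / price_stats["mean"]

print(mean_minus_median)
print(relative_spread)
```

In this toy data, the Android/Male group has a mean above its median because of one expensive purchase, while the iOS/Male group has identical prices and therefore zero spread.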