Using pandas functions effectively
You are creating a Python application that will calculate summary statistics based on user-selected variables. The complete dataset is quite large, so for now you are setting up your code using part of the dataset, preloaded as adult. As you create a reusable process, think through the most efficient way to set up the GroupBy object.
This exercise is part of the course Working with Categorical Data in Python.
Exercise instructions
- Create a list of the names for two user-selected variables: "Education" and "Above/Below 50k".
- Create a GroupBy object, gb, using the user_list as the grouping variables.
- Calculate the mean of "Hours/Week" across each group using the most efficient approach covered in the video.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a list of user-selected variables
user_list = ____
# Create a GroupBy object using this list
gb = ____
# Find the mean for the variable "Hours/Week" for each group - Be efficient!
print(____)
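
One possible way to fill in the blanks is sketched below. It is not the official solution. The small adult DataFrame here is a made-up stand-in for the preloaded dataset (including a hypothetical "Age" column), and hours_mean is an illustrative name. The sketch assumes that the "most efficient approach" means selecting the "Hours/Week" column on the GroupBy object before aggregating, so that pandas only computes the mean for that one column.

```python
import pandas as pd

# Hypothetical stand-in for the preloaded `adult` DataFrame; values are invented.
adult = pd.DataFrame({
    "Education": ["Bachelors", "Bachelors", "HS-grad", "HS-grad", "HS-grad"],
    "Above/Below 50k": [">50K", "<=50K", "<=50K", "<=50K", ">50K"],
    "Hours/Week": [45, 40, 38, 42, 50],
    "Age": [39, 28, 51, 33, 47],
})

# Create a list of user-selected variables
user_list = ["Education", "Above/Below 50k"]

# Create a GroupBy object using this list
gb = adult.groupby(by=user_list)

# Select the column first, then aggregate: only "Hours/Week" is averaged,
# instead of averaging every column and discarding the rest afterwards.
hours_mean = gb["Hours/Week"].mean()
print(hours_mean)
```

Selecting the column before calling .mean() avoids computing means for columns that are never used, which matters more as the dataset grows. Because user_list is an ordinary Python list, the same two lines work for any set of grouping variables a user picks.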