Summary statistics for both classes
Consider the following .groupby() code:
# Group by x and compute the standard deviation
df.groupby(['x']).std()
Here, a DataFrame df is grouped by a column 'x', and then the standard deviation is calculated for each of the remaining columns of df, separately for each value of 'x'. The .groupby() method is useful when you want to compare the columns of your dataset across the categories of another column. Here, you're going to explore the 'Churn' column further to see if there are differences between churners and non-churners. A subset of the telco DataFrame, consisting of the columns 'Churn', 'CustServ_Calls', and 'Vmail_Message', is available in your workspace.
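To make the grouping step concrete, here is a minimal sketch on a toy DataFrame. The data and the column names 'a' and 'b' are made up for illustration and are not part of the telco dataset. Note that pandas computes the sample standard deviation (ddof=1) by default.

```python
import pandas as pd

# Hypothetical toy data: 'x' is the grouping key, 'a' and 'b' are numeric columns.
df = pd.DataFrame({
    "x": ["p", "p", "q", "q"],
    "a": [1.0, 3.0, 10.0, 10.0],
    "b": [2.0, 2.0, 0.0, 4.0],
})

# Group by 'x' and compute the standard deviation of every remaining column.
# The result has one row per distinct value of 'x' and one column per remaining column.
stds = df.groupby(["x"]).std()
print(stds)
```

The grouping column becomes the index of the result, so a single value can be looked up with, for example, stds.loc["p", "a"].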
If you need a refresher on how .groupby() works, please refer back to the prerequisite Manipulating DataFrames with pandas course.
This exercise is part of the course Marketing Analytics: Predicting Customer Churn in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Group telco by 'Churn' and compute the mean
print(telco.____(['____']).____())
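For comparison, the same pattern can be tried outside the workspace on a small stand-in DataFrame. This is only a sketch: telco_demo and all of its values are hypothetical, chosen to mirror the three column names mentioned above, and say nothing about the real telco data.

```python
import pandas as pd

# Hypothetical stand-in for the telco subset; the values are made up for illustration.
telco_demo = pd.DataFrame({
    "Churn": ["no", "no", "yes", "yes"],
    "CustServ_Calls": [1, 3, 4, 6],
    "Vmail_Message": [0, 20, 0, 10],
})

# Group by 'Churn' and compute the mean of each remaining column per class.
churn_means = telco_demo.groupby(["Churn"]).mean()
print(churn_means)
```

Each row of the result corresponds to one class of 'Churn', which makes it easy to compare the two groups column by column.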