1. Grouping and summarizing data
In the telco dataset, there are two groups of customers we are interested in: the churners and the non-churners.
2. Churners and non-churners
As you discovered using the `.value_counts()` method, most of these customers did not churn.
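As a quick reminder of what that check looks like, here is a minimal sketch. The DataFrame below is a toy stand-in, not the real telco data, and the column name `Churn` and its `yes`/`no` values are assumptions made for illustration.

```python
import pandas as pd

# Toy stand-in for the telco data; the 'Churn' column name and
# its 'yes'/'no' values are assumptions for illustration only.
telco = pd.DataFrame({"Churn": ["no", "no", "yes", "no", "yes", "no"]})

# Count how many customers fall into each class
churn_counts = telco["Churn"].value_counts()
print(churn_counts)
```

Passing `normalize=True` to `.value_counts()` would return proportions instead of raw counts, which can be easier to read when the classes are imbalanced.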
3. Model outcomes
Our goal in this course is to build a model that uses the information about each customer in the dataset to classify whether or not a new customer will churn. This model therefore has two outcomes, or classes: a customer will either churn or not churn.
4. Differences between churners and non-churners
Before even getting to the model building stage, you can use exploratory data analysis to identify differences between these two classes that can help you better understand the drivers of customer churn. Do churners call customer service more often? Does one state have more churners than another? These are some questions you can ask of the data.
5. Grouping and summarizing data
To answer these questions, you need to be able to group and summarize your data. To group data, pandas has a useful method called `.groupby()`.
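The two questions raised earlier could be approached roughly like this. It is only a sketch on made-up data: the column names `Churn`, `CustServ_Calls`, and `State`, and every value in the toy DataFrame, are assumptions and may not match the actual telco dataset.

```python
import pandas as pd

# Toy stand-in for the telco data; column names and values are
# assumptions for illustration, not the real dataset.
telco = pd.DataFrame({
    "Churn": ["no", "no", "yes", "no", "yes", "no"],
    "CustServ_Calls": [1, 0, 4, 2, 5, 1],
    "State": ["KS", "OH", "KS", "NJ", "OH", "KS"],
})

# Do churners call customer service more often?
# Group by class, then average the call counts within each group.
mean_calls = telco.groupby("Churn")["CustServ_Calls"].mean()
print(mean_calls)

# Does one state have more churners than another?
# Group by state, then count churners and non-churners in each.
churn_by_state = telco.groupby("State")["Churn"].value_counts()
print(churn_by_state)
```

The pattern is the same in both cases: `.groupby()` splits the rows by the values of one column, and the method chained after it (`.mean()`, `.value_counts()`, and so on) summarizes each group separately.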
6. Let's group and summarize!
Let's see it in action!