1. Congratulations
Congratulations on completing the course! You've covered a lot.
2. Inspection and validation
You started off by learning how to inspect and validate data,
3. Aggregation
before performing aggregation and calculating summary statistics!
4. Address missing data
You saw how to check for missing values.
5. Address missing data
You then identified strategies to deal with them, including dropping missing values and imputation!
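As a quick refresher, here is a minimal sketch of those steps in pandas. The DataFrame and its column names (airline, price, duration) are made up for illustration, and imputing with the per-group median is just one possible strategy, not necessarily the one used in the course exercises.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({
    "airline": ["A", "A", "B", "B", "B"],
    "price": [100.0, np.nan, 250.0, 300.0, np.nan],
    "duration": [2.0, 2.5, np.nan, 5.0, 5.5],
})

# Check for missing values: count them per column.
missing_counts = df.isna().sum()
print(missing_counts)

# Strategy 1: drop rows that are missing a value in a chosen column.
dropped = df.dropna(subset=["duration"])

# Strategy 2: impute missing prices with the median price of each airline.
medians = df.groupby("airline")["price"].median().to_dict()
imputed = df.copy()
imputed["price"] = imputed["price"].fillna(imputed["airline"].map(medians))
print(imputed)
```

Dropping is simplest when only a small share of rows is affected; imputing keeps those rows at the cost of introducing estimated values.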
6. Analyze categorical data
You discovered how to create categories from strings,
7. Apply lambda functions
use lambda functions to conditionally calculate summary statistics based on categories and add values into the original DataFrame,
8. Handle outliers
and deal with outliers!
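The three ideas above can be sketched together as follows. The job titles, salaries, and category labels are invented for illustration, and the 1.5 × IQR rule is one common way to flag outliers, assumed here rather than taken from the transcript.

```python
import numpy as np
import pandas as pd

# Hypothetical data; values and column names are illustrative only.
df = pd.DataFrame({
    "job_title": ["Data Scientist", "Senior Data Engineer", "Data Scientist",
                  "ML Engineer", "Data Engineer", "Data Scientist"],
    "salary": [100, 120, 110, 130, 125, 400],
})

# Create categories from strings by matching substrings in the job title.
conditions = [
    df["job_title"].str.contains("Scientist"),
    df["job_title"].str.contains("Engineer"),
]
labels = ["Data Science", "Engineering"]
df["job_category"] = np.select(conditions, labels, default="Other")

# Use a lambda function to compute a per-category statistic and add it
# back to the original DataFrame, one value per row.
df["mean_salary_by_cat"] = (
    df.groupby("job_category")["salary"].transform(lambda x: x.mean())
)

# Handle outliers: keep only rows within 1.5 * IQR of the quartiles.
q1 = df["salary"].quantile(0.25)
q3 = df["salary"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
no_outliers = df[(df["salary"] >= lower) & (df["salary"] <= upper)]
print(no_outliers)
```

Note that the per-category mean here is computed before the outlier is removed; in practice you may prefer to filter first, depending on the question you are asking.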
9. Patterns over time
You progressed to examining relationships, including patterns over time,
10. Correlation
correlation between variables,
11. Distributions
and interpreting distributions!
12. Cross-tabulation
In the final chapter you learned the benefits of cross-tabulation,
13. pd.cut()
generated new features using pd.cut(),
14. Data snooping
and saw the impact of data snooping!
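For reference, cross-tabulation and binning might look like the sketch below. The flight data, bin edges, and labels are hypothetical and chosen only to keep the example small.

```python
import pandas as pd

# Hypothetical data; values and column names are illustrative only.
df = pd.DataFrame({
    "airline": ["A", "A", "B", "B", "B", "A"],
    "destination": ["Delhi", "Cochin", "Delhi", "Delhi", "Cochin", "Delhi"],
    "price": [3000, 7500, 12000, 4500, 9000, 15000],
})

# Cross-tabulation: count how often each airline/destination pair occurs.
counts = pd.crosstab(df["airline"], df["destination"])
print(counts)

# Cross-tabulation with aggregation: median price for each pair.
median_price = pd.crosstab(
    df["airline"], df["destination"], values=df["price"], aggfunc="median"
)
print(median_price)

# pd.cut(): turn a numeric column into a labeled categorical feature.
# Bins are right-inclusive by default: (0, 5000], (5000, 10000], (10000, max].
bins = [0, 5000, 10000, df["price"].max()]
labels = ["Economy", "Premium", "Business"]
df["price_category"] = pd.cut(df["price"], bins=bins, labels=labels)
print(df)
```

Using the column maximum as the last bin edge ensures every row receives a label; a value of 0 or below would fall outside the first bin and become missing.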
15. Generating hypotheses
You finished by identifying the limits of EDA and the next step of the data science workflow, hypothesis testing.
16. Next steps
Now that you understand EDA, you may wish to explore courses that build on these concepts, such as those covering the steps involved in hypothesis testing, or supervised learning, which is a form of machine learning!
17. Congratulations!
We hope you've enjoyed the course and feel confident in performing exploratory data analysis going forward!