1. Conclusion
In this course on Exploratory Data Analysis, our goal was to get you comfortable wading into a new dataset and to give you a sense of the considerations and techniques needed to find the most important structure in your data.
2. Pie chart vs. bar chart
We started with categorical data, often the domain of the pie chart, and hopefully convinced you that bar charts are a useful tool for finding associations between variables and for comparing levels to one another.
3. Faceting vs. stacking
We saw how the story can change depending on whether we're visualizing counts or proportions.
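To make the counts-versus-proportions point concrete, here is a minimal sketch in plain Python (standard library only). The groups, categories, and numbers are invented for illustration and do not come from the course datasets.

```python
from collections import Counter

# Hypothetical (group, answer) observations; not course data.
rows = (
    [("A", "yes")] * 30 + [("A", "no")] * 70
    + [("B", "yes")] * 15 + [("B", "no")] * 5
)

# Raw counts for each (group, answer) combination.
counts = Counter(rows)

# Total number of observations in each group.
group_totals = Counter(group for group, _ in rows)

# Proportion of each answer within its own group.
proportions = {
    (group, answer): n / group_totals[group]
    for (group, answer), n in counts.items()
}

for key in sorted(counts):
    print(key, counts[key], round(proportions[key], 2))
```

By count, group A has more "yes" answers than group B. By proportion within each group, B is far more likely to answer "yes". The same data supports either reading, which is why the choice of scale matters.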
4. Histogram
From there, we moved on to numerical data and three important graphical techniques: the histogram,
5. Density plot
the density plot, and the
6. Side-by-side box plots
box plot. Each has its strengths and weaknesses.
7. Center: mean, median, mode
In the third chapter, we discussed the common numerical measures of a distribution: measures of center, variability,
8. Shape of income
shape, plus the presence of outliers. Our lives were made much easier by using the combination
9. With group_by()
of group_by() and summarize() to compute statistics on different subsets of our data.
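The group-then-summarize idea can be sketched in plain Python as follows. This is an analogy using only the standard library, not the course's own group_by() and summarize() functions. The helper name `group_summarize` and the records are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical (group label, numeric value) records; not course data.
records = [
    ("north", 40), ("north", 60),
    ("south", 30), ("south", 50), ("south", 100),
]

def group_summarize(rows):
    """Split rows by group label, then compute summary statistics per group."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return {
        label: {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
        }
        for label, values in groups.items()
    }

summary = group_summarize(records)
for label, stats in summary.items():
    print(label, stats)
```

In the "south" group the mean sits above the median because of the single large value, the kind of difference between measures of center that per-group summaries make easy to spot.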
10. Spam and exclamation points
In the final chapter, we explored an email dataset to learn about the aspects of an email that are
11. Spam and images
associated with it being spam.
12. Let's practice!
It's been my pleasure to be your instructor, and I hope you'll continue with the next course in this intro stats series.