Conclusion

1. Conclusion

In this course on Exploratory Data Analysis, our goal was to get you comfortable wading into a new dataset and provide a sense of the considerations and techniques needed to find the most important structure in your data.

2. Pie chart vs. bar chart

We started with categorical data, often the domain of the pie chart, and hopefully convinced you that bar charts are a useful tool for finding associations between variables and comparing levels to one another.

3. Faceting vs. stacking

We saw how the story can change depending on whether we're visualizing counts or proportions.
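As a quick reminder of that point, here is a minimal sketch in base R. The `survey` data frame and its values are made up purely for illustration; they are not from the course datasets.

```r
# Hypothetical toy data: group A has 8 respondents, group B has 2
survey <- data.frame(
  group  = c(rep("A", 8), rep("B", 2)),
  answer = c("yes", "yes", "yes", "yes", "no", "no", "no", "no", "yes", "no")
)

# Counts: a two-way table of group by answer
counts <- table(survey$group, survey$answer)

# Proportions: condition on group, so each row sums to 1
props <- prop.table(counts, margin = 1)

print(counts)
print(props)
```

The counts make group A look very different from group B simply because it is larger, while the row proportions show that the two groups answered "yes" at the same rate.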

4. Histogram

From there, we moved on to numerical data and a collection of important graphical techniques: the histogram,

5. Density plot

the density plot, and the

6. Side-by-side box plots

box plot. Each has its strengths and weaknesses.

7. Center: mean, median, mode

In the third chapter, we discussed the common numerical measures of a distribution: measures of center, variability,

8. Shape of income

shape, plus the presence of outliers. Our life was made much easier by using the combination

9. With group_by()

of group_by() and summarize() to compute statistics on different subsets of our data.
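To recall what that pattern looks like, here is a small sketch assuming the dplyr package is installed. The `incomes` data frame, its column names, and its values are hypothetical, chosen only to show the shape of the code.

```r
library(dplyr)

# Hypothetical toy data: income (in thousands) for two made-up regions
incomes <- data.frame(
  region = c("North", "North", "North", "South", "South", "South"),
  income = c(30, 40, 50, 20, 25, 90)
)

# Split the rows by region, then collapse each group to one row of statistics
income_summary <- incomes %>%
  group_by(region) %>%
  summarize(
    mean_income   = mean(income),    # measure of center, sensitive to outliers
    median_income = median(income),  # measure of center, robust to outliers
    sd_income     = sd(income)       # measure of variability
  )

print(income_summary)
```

In this toy example the single large value in the South group pulls its mean (45) well above its median (25), which is the kind of skew and outlier behavior the third chapter focused on.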

10. Spam and exclamation points

In the final chapter, we explored an email dataset to learn about the aspects of an email that are

11. Spam and images

associated with it being spam.

12. Let's practice!

It's been my pleasure to be your instructor and I hope you'll continue on with the next course in this intro stats series.
