Wrap-up

1. Congratulations!

Congratulations! You made it to the end of the course! I hope that over these past four chapters you have gained a new appreciation for the subtle things that can take a visualization from good to great.

2. Chapter 1 recap

To recap: in chapter one we learned about proportions. We saw that pie charts are not as bad as they are made out to be, as long as you don't put too many classes in them or expect high precision. If you need more precision, or simply have the space, a waffle chart is a great alternative. If you want to compare multiple proportions, stacked bars are great for comparing across groups but poor for comparing within a group.

3. Chapter 2 recap

In chapter two we moved on to point data. We saw that bar charts are great, but that we should be careful to make sure the data they represent can be 'stacked'. If the data aren't stackable, we can simply replace the bars with points. Additionally, we saw some tricks for optimizing our plots: mainly, that we can avoid unnecessary grid lines with bars, and that we should always order the categories by value unless those categories already have some natural order to them.
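As a reminder of the ordering trick, here is a minimal sketch in plain Python. The category names and counts are made up for illustration and do not come from the course; the point is only that sorting by value before drawing makes the ranking readable at a glance.

```python
# Hypothetical category counts, invented for this illustration.
counts = {"pears": 12, "apples": 30, "kiwis": 7, "plums": 19}

# Order the categories by value, largest first, before plotting.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Draw a rough text bar chart in that order.
for name, value in ordered:
    print(f"{name:<8}{'#' * value} {value}")
```

If the categories had a natural order (months, age groups, survey scales), we would skip the sort and keep that order instead.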

4. Chapter 3 recap

In chapter three we moved into distributions, specifically looking at a single distribution. We saw that histograms are nice and intuitive but can be very sensitive to bin width and placement. A good alternative, especially in low-data situations, is the kernel density estimator. However, with these plots we should always try to show the raw data to the user so they can see what assumptions are being made.
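To make the bin sensitivity concrete, here is a small sketch using only the Python standard library. The `histogram` helper and the sample values are assumptions made up for this illustration, not code from the course. The same eight points are binned three ways, and each choice tells a different story.

```python
import math
from collections import Counter

def histogram(values, bin_width, origin=0.0):
    """Count values per bin; bin k covers [origin + k*w, origin + (k+1)*w)."""
    counts = Counter(math.floor((v - origin) / bin_width) for v in values)
    # Key each bin by its left edge.
    return {origin + k * bin_width: counts[k] for k in sorted(counts)}

# Hypothetical sample: two tight clusters, one near 2 and one near 4.
data = [1.8, 1.9, 2.1, 2.2, 3.8, 3.9, 4.1, 4.2]

wide = histogram(data, bin_width=2.0, origin=0.0)     # one central peak
shifted = histogram(data, bin_width=2.0, origin=1.0)  # two equal bins
narrow = histogram(data, bin_width=1.0, origin=0.0)   # four equal bins

for label, hist in [("wide", wide), ("shifted", shifted), ("narrow", narrow)]:
    print(label, hist)
```

With a width of 2 starting at 0, the two clusters are split and merged into what looks like a single peak in the middle; moving the bin edges by 1 recovers the two groups, and narrowing the bins makes the data look flat. This is why showing the raw points alongside the summary is worthwhile.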

5. Chapter 4 recap

In our last chapter, we moved from single distributions to looking at multiple distributions. We saw that stock boxplots hide their data and should almost always be augmented with jitter plots to show the underlying data. An alternative to boxplots with jitter that also shows density is the beeswarm plot, but since beeswarms show each and every point, they don't work well when we have a lot of data. In these scenarios a violin plot can work well, as individual points aren't explicitly drawn. Last, we saw that when we have spatially related, or ordinal, categories, we can use ridgeline plots to display a lot of data in a single plot and get an idea of shifts in distributional characteristics.

6. Going further

If you found this brief foray into the more involved details of data visualization exciting, there are plenty of resources for learning more. The FlowingData blog by Nathan Yao is a fantastic set of data visualization examples and tutorials on making advanced charts in R. The Datawrapper blog, run by Lisa Charlotte Rost, has wonderful deep dives into novel visualization types and also into some of the mistakes commonly made in visualizations. Twitter is another really great resource, with an active data visualization community. The hashtag #dataviz contains a wonderful assortment of inspirational data visualizations. Last, old-fashioned books. There are many, but two that stand out to me are Data Visualization by Andy Kirk, which offers a comprehensive sweep over lots of visualization techniques and best practices, and The Functional Art by Alberto Cairo, which provides a great window into data visualization from the perspective of data journalism and is a balanced and no-nonsense view of the field.

7. Thank you!

I hope you enjoyed taking this course a tenth as much as I enjoyed making it for you. Now go out and make some great data visualizations!
