Effective explanatory plots

1. Effective explanatory plots

For our last exercises, I want to walk through an example of producing an explanatory plot in an infoviz style, the kind of thing you'd see in a magazine or on a website for a mostly lay audience.

2. Our goal, an effective explanatory plot

These plots tend to have a small number of both observations and variables, include embellishments, and typically make a clear or dramatic statement. Our example comes from the gapminder data set and plots the countries with the highest and lowest life expectancies in 2007. The global mean is plotted for comparison. Here we focus on the large gap between the highest and lowest life expectancies, which is about 40 years!

3. Complete data

4. First exploratory plots - distributions

Our first exploratory plot would probably be a histogram, which isn't a bad choice. Recall that a histogram has already applied a binning statistic to the data.
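As a rough sketch, that first histogram could look like the code below. It assumes the data come from the gapminder R package and that ggplot2 is installed. The name gm2007 and the 5-year bin width are illustrative choices, not part of the original example.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

# Keep only the 2007 observations (one row per country)
gm2007 <- as.data.frame(subset(gapminder, year == 2007))

# geom_histogram() bins lifeExp before drawing; binwidth = 5 is an illustrative choice
p_hist <- ggplot(gm2007, aes(x = lifeExp)) +
  geom_histogram(binwidth = 5)

# print(p_hist) draws the plot
```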

5. First exploratory plots - distributions

An alternative would be to arrange the data according to life expectancy and plot that against an index, which allows us to see each point individually, without first binning the variable.

6. First exploratory plots - distributions

This has the advantage that we can color each point according to continent. This is already quite an informative plot: we can see differences in the distribution between continents. After getting familiar with our data, we need to reduce it to a compact and understandable format for a lay audience.
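One way to sketch the index plot is shown here, again assuming the gapminder package as the data source. The column name index is hypothetical, and base R sorting stands in for whatever data-wrangling tools you prefer.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))

# Arrange by life expectancy and add a running index (hypothetical column name)
gm2007 <- gm2007[order(gm2007$lifeExp), ]
gm2007$index <- seq_len(nrow(gm2007))

# Every country is its own point, so no binning is needed,
# and color can carry the continent
p_index <- ggplot(gm2007, aes(x = index, y = lifeExp, color = continent)) +
  geom_point()
```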

7. Our data

In this form we have only 20 observations: the 10 countries with the highest and the 10 with the lowest life expectancy.
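A minimal sketch of that reduction, assuming the gapminder package and using base R only; the name extremes is hypothetical.

```r
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]

# Bottom 10 and top 10 countries by life expectancy
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))

# Fix the factor order so countries later plot from lowest to highest
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))
```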

8. Life expectancy plot

Here, I'd map country to the y axis so that the names are easy to read. I'd map life expectancy onto both color and the x axis, which is redundant but helps a lay audience.

9. Use intuitive and attractive geoms

The line segments add some perspective; combined with points, this is sometimes referred to as a lollipop plot.
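The mappings and geoms described so far might be put together roughly like this. It assumes the gapminder package; the segment start at 30 years and the point size are illustrative values, and extremes is a hypothetical name for the 20-row subset.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))

# Country on y, life expectancy on both x and color (deliberately redundant)
p_lollipop <- ggplot(extremes, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  # Each segment runs from the point back to x = 30 (an illustrative baseline)
  geom_segment(aes(xend = 30, yend = country))
```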

10. Add text labels to your plot

Typically, we're happy to just read a value from the axis, but adding the actual value using a geom_text layer makes it immediately more intuitive for inexperienced viewers. You can already see that there are many things happening that we wouldn't typically do for a scientific audience.
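A sketch of the text layer, assuming the same gapminder subset as above. The rounding to one decimal, the white label color, and the sizes are illustrative choices that place the value inside each point.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))

p_text <- ggplot(extremes, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 8) +
  # Print the rounded value on top of each point; the fixed color here
  # overrides the color mapped in aes()
  geom_text(aes(label = round(lifeExp, 1)), color = "white", size = 2.5)
```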

11. Use appropriate scales

Next, I'd clean up the scales, using an intuitive color palette, removing unnecessary buffering, and changing the x axis location to the top of the plot.
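The scale clean-up could be sketched as follows. The x limits of 30 to 90 and the four hex colors are assumptions made for illustration; any palette that runs from a "low" to a "high" color would serve the same purpose.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))

p_scaled <- ggplot(extremes, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  geom_segment(aes(xend = 30, yend = country)) +
  # expand = c(0, 0) removes the buffer around the limits;
  # position = "top" moves the x axis to the top of the plot
  scale_x_continuous(expand = c(0, 0), limits = c(30, 90), position = "top") +
  # Illustrative red-to-blue palette: low values red, high values blue
  scale_color_gradientn(colours = c("#b2182b", "#f4a582", "#92c5de", "#2166ac"))
```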

12. Add useful titles and citations

Titles and captions help to make the plot complete if it will be viewed on its own.
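In ggplot2, both can be set with labs(). The wording of the title and caption below is a placeholder of my own, and the data setup assumes the gapminder package.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))

p_titled <- ggplot(extremes, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  # Placeholder wording: the title states the message, the caption cites the data
  labs(title = "Highest and lowest life expectancies, 2007",
       caption = "Source: gapminder")
```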

13. Remove non-data ink

And of course, removing non-data ink makes for a great looking plot. Notice that I removed the x and y axis labels as well as the legend. None of them are actually necessary.
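A possible theme for this step is sketched below, assuming the gapminder package for the data. Starting from theme_classic() is my own choice; the theme() arguments are the ones that drop the axis titles, the y axis line and ticks, and the legend.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))

p_clean <- ggplot(extremes, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  theme_classic() +
  theme(axis.title = element_blank(),      # no x or y axis labels
        axis.line.y = element_blank(),     # country names need no axis line
        axis.ticks.y = element_blank(),
        legend.position = "none")          # color repeats the x axis, so no legend
```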

14. Add threshold lines

Adding a threshold line helps to orientate the viewer. Here, it's the global mean from 2007.
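As a sketch, the threshold can be computed from the full 2007 data and drawn with geom_vline(). Taking the unweighted mean across countries is an assumption about how "global mean" is defined here, and the line color and type are illustrative.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))

# Assumption: unweighted mean over all countries in 2007, not just the 20 shown
global_mean <- mean(gm2007$lifeExp)

p_threshold <- ggplot(extremes, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  geom_segment(aes(xend = 30, yend = country)) +
  geom_vline(xintercept = global_mean, color = "grey40", linetype = 3)
```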

15. Add informative text

Of course, it's also helpful to label the threshold line. We'll do this with the annotate function, which allows us to access any geom and place it manually on a plot.
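A minimal sketch of such a label follows. The coordinates x_start and y_start are hypothetical values chosen by eye; on a discrete y axis, a numeric position such as 5.5 falls between the fifth and sixth countries.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))
global_mean <- mean(gm2007$lifeExp)

# Hypothetical label position, chosen by eye
x_start <- global_mean + 4
y_start <- 5.5

p_annotated <- ggplot(extremes, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  geom_vline(xintercept = global_mean, color = "grey40", linetype = 3) +
  # annotate() takes the geom name as a string and fixed positions, not a data frame
  annotate("text", x = x_start, y = y_start, label = "The\nglobal\naverage",
           vjust = 1, size = 3, color = "grey40")
```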

16. Add embellishments

For example, another geom we haven't seen yet is geom_curve, for drawing curved lines. This is really great for adding handy little arrows anywhere on our plot.
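Here is one way a curved arrow might be added, using annotate("curve"), which draws with geom_curve. The start and end coordinates are hypothetical values picked by eye, and the arrow pointing at the global mean is just an illustrative target.

```r
library(ggplot2)
library(gapminder)  # assumption: the data set is taken from this package

gm2007 <- as.data.frame(subset(gapminder, year == 2007))
gm2007 <- gm2007[order(gm2007$lifeExp), ]
extremes <- rbind(head(gm2007, 10), tail(gm2007, 10))
extremes$country <- factor(as.character(extremes$country),
                           levels = as.character(extremes$country))
global_mean <- mean(gm2007$lifeExp)

p_arrow <- ggplot(extremes, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  geom_vline(xintercept = global_mean, color = "grey40", linetype = 3) +
  # Curved arrow from a hypothetical label position to the threshold line
  annotate("curve",
           x = global_mean + 4, y = 5.5,   # start (chosen by eye)
           xend = global_mean, yend = 7.5, # end on the threshold line
           arrow = arrow(length = unit(0.2, "cm"), type = "closed"),
           color = "grey40")
```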

17. Let's practice!

By now you have the ggplot2 core competencies that will allow you to make beautiful and effective exploratory plots, but there is a lot more to data visualization and ggplot2. After you've finished the exercises, head over to the next course to learn about the remaining layers and round out your knowledge.