When good data makes bad plots
1. When good data makes bad plots
So far, we've focused on making good plots, but it's also worthwhile to be able to identify and correct bad plots.
2. Bad plots: style
There are many ways to make a bad plot. Simple formatting errors, like poor color and text choices, make it difficult to read the data correctly, or simply make our plots look ugly. All text should serve a purpose and be legible for our audience. Pause the video and take a minute to review the items listed here. You've probably seen them already in the wild.
3. Bad plots: structure and content
Deeper problems occur with structure and content. As the domain expert, it's your job to know if you are overloading a plot with too much information simply to impress your viewers, or if you are producing a useless plot just to fill up space. The axes, statistics and geometries must also be used effectively. It doesn't hurt to reduce non-data ink and, finally, 3D plots should be avoided. Pause the video again and take a minute to review these topics before continuing. Let's take a look at some common data viz pitfalls.
4. Wrong orientation
We typically read the y-axis of a plot as a function of x, denoted f(x). That means the variable on the y-axis is the dependent variable and the variable on the x-axis is the independent variable. Flipping the axes is confusing.
5. Wrong orientation?
But, actually, sometimes it works great! We saw this at the end of the last course,
6. Wrong orientation?
and in the last video. In both cases the axes were flipped to make them easier to read.
7. Broken y-axes
Broken y-axes are also popular. They compensate for a data set with a large range, where there is a large gap between the high and low values. Unfortunately, the upper and lower parts are on different scales!
8. Broken y-axes, replace with transformed data
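As a rough illustration of why a transformed scale copes with a large gap between high and low values, here is a minimal Python sketch rather than ggplot2 code. The numbers and the helper name `log10_transform` are made up for illustration; the only real call is `math.log10` from the standard library.

```python
import math

# Hypothetical values with a large gap between the low and high groups.
values = [2, 5, 8, 1500, 4200, 9000]

def log10_transform(xs):
    """Return log10 of each value; every input must be strictly positive."""
    return [math.log10(x) for x in xs]

transformed = log10_transform(values)

# Ratio between the extremes on the raw scale.
raw_span = max(values) / min(values)
# The same ratio expressed in orders of magnitude on the log scale.
log_span = max(transformed) - min(transformed)

# Print each raw value next to its position on the log10 scale.
for raw, logged in zip(values, transformed):
    print(f"{raw:>6} -> {logged:.2f}")
```

On the log scale every value sits on one continuous axis, so equal distances always mean equal ratios, which is not true of the two halves of a broken axis.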
We would rather transform the scales, for example using a log 10 transformation, as shown here,
9. Broken y-axes, use facets
or more typically, use facets with free scales.
10. 3D plots, without data on the 3rd axis
The 3D plot is another favorite, but often the 3rd axis serves no purpose other than to confuse the audience as to which part of the geometry should be read on the scale.
11. 3D plots, with data on the 3rd axis
Sometimes 3D plots really do contain information in the 3rd axis, like 3D scatter plots. But can you figure out the position of each dot in this plot? The third dimension just obscures our data. Ideally we'd provide this as an interactive object, or else as a series of two-dimensional plots.
12. Double y-axes
Double y-axes are also problematic but popular. Perceptual challenges make the data difficult to read, and the format also invites suspicious activity: since the scales are independent, the visual message can be manipulated to emphasize or diminish the perceived correlation by changing the range of values on each scale. If the two variables are to be correlated, then we should have an x-y plot that shows the correlation.
13. Double y-axis for transformations
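To make the point about independent scales concrete, here is a minimal Python sketch, not ggplot2 code. The two series, the axis limits, and the helper names `axis_position` and `pearson` are all made up for illustration. It shows that changing the limits of the second axis changes how much the second series appears to move, while the correlation between the two series stays exactly the same.

```python
import math

def axis_position(value, lower, upper):
    """Fraction of the axis height at which `value` is drawn (0 = bottom, 1 = top)."""
    return (value - lower) / (upper - lower)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical series measured at the same five x values.
left = [10, 12, 11, 15, 14]
right = [200, 210, 190, 260, 230]

# Draw the right-hand series with two different (arbitrary) axis ranges.
tight = [axis_position(v, 190, 260) for v in right]   # limits hug the data
loose = [axis_position(v, 0, 1000) for v in right]    # limits far wider than the data

# Visual swing: how much of the axis height the series covers.
swing_tight = max(tight) - min(tight)
swing_loose = max(loose) - min(loose)

# The correlation is unaffected by the choice of axis range.
r = pearson(left, right)
print(f"swing (tight limits): {swing_tight:.2f}")
print(f"swing (loose limits): {swing_loose:.2f}")
print(f"correlation in all cases: {r:.3f}")
```

The same data can be made to look dramatic or flat purely through the axis range, which is why an x-y plot of one variable against the other is the more honest way to show a correlation.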
We did actually see a great example of double x and y-axes in the second chapter, when we had a raw and transformed scale.
14. Guidelines not rules
But, remember, there are very few rules in data visualization, which is what makes it so interesting and difficult. Just use your common sense: if anything on your plot obscures communication, it is at worst unethical and at best poor execution. I hope that by now you are also a critical consumer of data visualization and are not so easily fooled by other people's poor judgement or purposeful misdirection.
15. Let's practice!
We'll explore the bits that we can fix in ggplot2 in the exercises, so let's get started.