Coordinates vs. scales

1. Coordinates vs. scales

In the next set of exercises, I want to look at how to use the coordinate layer to perform transformations, and how that differs from using the scale functions.

2. Plot the raw data

For these examples, I'm going to use the body weight variable from the msleep data set. I made more adjustments than what's shown here, but this is the basic code to get this univariate plot. We can see that this variable has a strong positive skew. In the first course, we saw how we can use the scale functions to modify things like the x-axis limits and breaks. Let's consider three ways in which we can transform our data. A common transformation for positively skewed data is the natural (base e) logarithm, or the more intuitive common (base 10) logarithm.
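The slide code isn't reproduced in this transcript, so here is a minimal sketch of what such a univariate plot could look like. The constant `y = 1`, the jitter settings, and the object name `p_raw` are my own assumptions for illustration, not necessarily what was on the slide.

```r
library(ggplot2)

# msleep ships with ggplot2; bodywt is body weight in kilograms.
# Mapping y to a constant and jittering only vertically spreads the
# points out without changing their x positions.
p_raw <- ggplot(msleep, aes(x = bodywt, y = 1)) +
  geom_jitter(width = 0, height = 0.2, alpha = 0.5)

# A quick numeric check of the positive skew: the mean sits far above the median.
skew_check <- c(mean = mean(msleep$bodywt), median = median(msleep$bodywt))

# print(p_raw) draws the plot.
```

Most points pile up near zero, with a few very heavy animals far out on the right.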

3. Transform the raw data

We can transform the data before we begin plotting, and update the actual data frame, or we can transform the variable on-the-fly when we specify it in the aes function, as shown here. The result is the same. So far, so good! Notice that the axis labels are the log-transformed values, where 0 is the log 10 of 1 kilogram, and 4 is the log 10 of 10,000 kilograms. This is a very common solution, but it is a bit misleading in that the transformed scale is linear and we have to do some mental arithmetic to get back to the original values. So we've lost a bit of precision here.
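Both routes can be sketched roughly like this. The column name `log_bodywt` and the object names are hypothetical, and the jitter settings are my own choice.

```r
library(ggplot2)

# Route 1: transform first and store the result in the data frame.
msleep_log <- msleep
msleep_log$log_bodywt <- log10(msleep_log$bodywt)

p_pre <- ggplot(msleep_log, aes(x = log_bodywt, y = 1)) +
  geom_jitter(width = 0, height = 0.2, alpha = 0.5)

# Route 2: transform on-the-fly inside aes().
p_fly <- ggplot(msleep, aes(x = log10(bodywt), y = 1)) +
  geom_jitter(width = 0, height = 0.2, alpha = 0.5)

# The mental arithmetic needed to read the axis: label 0 means 10^0 = 1 kg,
# label 4 means 10^4 = 10000 kg.
back_to_kg <- 10^c(0, 4)
```

Either way, the axis labels are in log 10 units, so the reader has to compute 10 to the power of the label to recover kilograms.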

4. Add logtick annotation

We could add log tick marks using the annotation_logticks function. This highlights that the data has been log-transformed. However, another solution is to have the data on a log scale, and label it with the actual original body weight value. We can do this in two ways.
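As a rough sketch, adding the tick annotation to the on-the-fly version might look like this. The `sides = "b"` choice (bottom axis only) and the object name are my assumptions.

```r
library(ggplot2)

# The x values are already log10-transformed in aes(), which matches the
# default scaled = TRUE behaviour of annotation_logticks().
p_ticks <- ggplot(msleep, aes(x = log10(bodywt), y = 1)) +
  geom_jitter(width = 0, height = 0.2, alpha = 0.5) +
  annotation_logticks(sides = "b")  # draw log ticks on the bottom axis only

# print(p_ticks) draws the plot.
```

The unevenly spaced ticks signal a log axis, but the labels themselves are still in log 10 units.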

5. Use scale_*_log10()

The first method uses the scale_x_log10 function. This transforms the data and then calculates any statistics needed.
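A minimal sketch of the scale approach, with my own object names and jitter settings. Inspecting the built layer data is just one way I'd check that the transformation really is applied to the data itself.

```r
library(ggplot2)

# Map the raw variable and let the scale do the log10 transformation.
p_scale <- ggplot(msleep, aes(x = bodywt, y = 1)) +
  geom_jitter(width = 0, height = 0.2, alpha = 0.5) +
  scale_x_log10()

# layer_data() returns the data as the layer sees it after the scale has
# been applied: x is now in log10 units.
built_x <- layer_data(p_scale)$x
```

Because the layer receives log 10 values, any statistic it computes works on those transformed values.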

6. Compare direct transform and scale_*_log10() output

The plots are almost identical, but pay attention to the axis labeling in the second plot, which uses the scale_x_log10 function. The labels correspond to the actual values in the data set. This is the default output; we saw how to clean up axis labels in the first course.

7. Use coord_trans()

As you could imagine, we also have a function in the coordinate layer: coord_trans, which is actually more flexible in that we can apply any transformation we'd like.
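Here is a sketch of the coordinate approach under the same assumptions as before (constant y, vertical jitter only, my own object names).

```r
library(ggplot2)

# Leave the scale alone and transform the coordinate system instead.
p_coord <- ggplot(msleep, aes(x = bodywt, y = 1)) +
  geom_jitter(width = 0, height = 0.2, alpha = 0.5) +
  coord_trans(x = "log10")

# With coord_trans() the layer still sees the raw kilogram values; the
# log10 transformation is only applied when the plot is drawn.
built_x <- layer_data(p_coord)$x
```

So the layer data stays in kilograms, and only the drawing of the axis is transformed.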

8. Compare scale_*_log10() and coord_trans() output

Using coord_trans and setting the x argument to "log10" results in the same plot as with the scale function. The default labels happen to be different, but the plot is the same.

9. Adjusting labels

As a final step, we can add the actual values of the data on the axis. This is a really nice way to show the transformed values in relation to the original value on the axis labels. This may give you the impression that scale and coord functions work in the same way, but just like zooming, there are some fundamental differences under the hood when applying transformations. We'll take a look at those in the exercises.
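One way to get original-unit labels at chosen positions is to pass explicit breaks and labels to the scale. This is only a sketch: the break positions and label strings below are my own picks, not necessarily what the slide used.

```r
library(ggplot2)

# Powers of ten that fall inside the range of bodywt (in kilograms).
kg_breaks <- 10^(-2:3)
kg_labels <- c("0.01", "0.1", "1", "10", "100", "1000")

p_labels <- ggplot(msleep, aes(x = bodywt, y = 1)) +
  geom_jitter(width = 0, height = 0.2, alpha = 0.5) +
  scale_x_log10(breaks = kg_breaks, labels = kg_labels) +
  annotation_logticks(sides = "b")

# print(p_labels) draws the plot.
```

The axis is still spaced on a log scale, but it now reads directly in kilograms.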

10. Time for exercises

Alright, now that you know how to use the scale and coord functions to apply transformations, let's look at bivariate plots and see how these functions affect our statistics.
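As a preview, here is a rough sketch of a bivariate comparison. Using brainwt as the second variable, dropping rows with missing brain weight, and fitting a linear smooth are all my own assumptions about what such a comparison could look like.

```r
library(ggplot2)

# Keep only rows where brain weight is recorded.
df <- msleep[!is.na(msleep$brainwt), ]

base <- ggplot(df, aes(x = bodywt, y = brainwt)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Scale version: the data are log-transformed first, then the model is fitted.
p_scale <- base + scale_x_log10() + scale_y_log10()

# Coord version: the model is fitted to the raw values, then the result is
# drawn in a log-transformed coordinate system.
p_coord <- base + coord_trans(x = "log10", y = "log10")

# The smooth layer (layer 2) is computed in different units in each plot.
smooth_scale <- layer_data(p_scale, 2)
smooth_coord <- layer_data(p_coord, 2)
```

In the scale version the fitted line is in log 10 units and comes out straight. In the coord version the line is fitted in kilograms, so it will generally no longer look straight once the axes are transformed.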

This exercise is part of the course

Intermediate Data Visualization with ggplot2
