Aan de slagGa gratis aan de slag

Using stat_sum

In the Vocab dataset, education and vocabulary are integer variables. In the first course, you saw that this is one of the four causes of overplotting. You'd get a single point at each intersection between the two variables.

One solution, shown in the step 1, is jittering with transparency. Another solution is to use stat_sum(), which calculates the total number of overlapping observations and maps that onto the size aesthetic.

stat_sum() allows a special variable, ..prop.., to show the proportion of values within the dataset.

Deze oefening maakt deel uit van de cursus

Intermediate Data Visualization with ggplot2

Cursus bekijken

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

# Run this, look at the plot, then update it
ggplot(Vocab, aes(x = education, y = vocabulary)) +
  # Replace this with a sum stat
  geom_jitter(alpha = 0.25)
Code bewerken en uitvoeren