Using stat_sum
In the Vocab dataset, education and vocabulary are integer variables. In the first course, you saw that integer data is one of the four causes of overplotting. A plain scatter plot would show just a single point at each intersection of the two variables, no matter how many observations share it.
One solution, shown in step 1, is jittering with transparency. Another solution is to use stat_sum(), which counts the overlapping observations at each location and maps that count onto the size aesthetic.
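As a rough sketch of the stat_sum() approach, the jitter layer can be swapped for a sum stat. This assumes Vocab is loaded from the carData package (in the exercise environment it is already available), and the object name p is only illustrative.

```r
library(ggplot2)
library(carData)  # assumption: Vocab is taken from the carData package

# stat_sum() counts the observations at each (education, vocabulary) pair
# and, by default, maps that count to point size.
p <- ggplot(Vocab, aes(x = education, y = vocabulary)) +
  stat_sum()

p
```

Larger points mark combinations shared by more respondents, so no jitter or transparency is needed to see where the data concentrate.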
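stat_sum() also computes a special variable, ..prop.., which you can map to size to show the proportion of values within the dataset instead of the raw count. (In ggplot2 3.3.0 and later, the same variable is written after_stat(prop).)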
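A minimal sketch of mapping the proportion to size might look like the following. It uses the after_stat() spelling, which needs ggplot2 3.3.0 or later, and again assumes Vocab comes from the carData package; p_prop is a hypothetical name.

```r
library(ggplot2)
library(carData)  # assumption: Vocab is taken from the carData package

# Map the computed proportion, rather than the raw count, to point size.
# With no grouping variable, each proportion is relative to the whole dataset.
p_prop <- ggplot(Vocab, aes(x = education, y = vocabulary)) +
  stat_sum(aes(size = after_stat(prop)))

p_prop
```

On older ggplot2 versions, aes(size = ..prop..) expresses the same mapping.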
This exercise is part of the course
Intermediate Data Visualization with ggplot2
Interactive hands-on exercise
Try this exercise by completing the sample code.
# Run this, look at the plot, then update it
ggplot(Vocab, aes(x = education, y = vocabulary)) +
  # Replace this with a sum stat
  geom_jitter(alpha = 0.25)