Changing y-axis to density
By default, the y-axis of a histogram shows the count of points that fall within each bin. This is nice and interpretable, but what if we want to read the plot as a true density, the thing the histogram is trying to estimate? That is, what if we want all the (bar width) * (bar height) products to sum to 1?
To do this, we simply add y = stat(density) to the aesthetic mappings. This rescales the y-axis from counts to an empirical probability density estimate. Note that this won't change the shape of the plot at all; it simply gives you a different interpretation of the y-axis.
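The rescaling described above can be checked numerically. The following is a minimal sketch in base R rather than ggplot2, using simulated hours as a hypothetical stand-in for the course data. It assumes the usual histogram definition of density, count / (number of points * bin width).

```r
# Simulated stand-in for the hour a driver was stopped (hypothetical data,
# not the course's md_speeding dataset).
set.seed(42)
hour_of_day <- sample(0:23, size = 500, replace = TRUE)

# One-hour bins centred on each whole hour; plot = FALSE returns the numbers only.
h <- hist(hour_of_day, breaks = seq(-0.5, 23.5, by = 1), plot = FALSE)

bin_width <- diff(h$breaks)

# Density is the count rescaled by (number of points * bin width).
manual_density <- h$counts / (length(hour_of_day) * bin_width)

# Total bar area on the density scale.
total_area <- sum(bin_width * h$density)

print(all.equal(manual_density, h$density))
print(total_area)
```

Because every bin has the same width here, each density is just the count divided by a constant, which is why the shape of the histogram stays the same and only the y-axis labels change.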
Let's try it out on the hour of the day that a speeder was pulled over (hour_of_day). In addition, lower the opacity of the bars a bit so the grid lines show through, which allows easier comparisons.
This exercise is part of the course “Visualization Best Practices in R”.
Exercise instructions
- Set the x-aesthetic to hour_of_day.
- Set the y-aesthetic to stat(density).
- Change the alpha value in geom_histogram() to 0.8.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
ggplot(md_speeding) +
  geom_histogram(
    # set x and y aesthetics to hour_of_day and stat(density) respectively.
    ___,
    # make bars see-through by setting alpha to 0.8
    ___
  )
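For reference, here is one way the completed call might look. This is a sketch under stated assumptions, not the course's official solution. The md_speeding data is not available outside the course, so a simulated data frame (the hypothetical speeding_sim) stands in for it. The sketch also uses after_stat(density), the replacement that ggplot2 3.3.0 and later provide for stat(density); on older versions, stat(density) as written in the exercise would be used instead.

```r
library(ggplot2)

# Hypothetical stand-in for md_speeding: simulated stop hours, not the course data.
set.seed(1)
speeding_sim <- data.frame(hour_of_day = sample(0:23, 1000, replace = TRUE))

p <- ggplot(speeding_sim) +
  geom_histogram(
    # map x to the hour and y to the computed density instead of the count
    aes(x = hour_of_day, y = after_stat(density)),
    # slightly transparent bars so the grid lines show through
    alpha = 0.8
  )

# Pull out the values ggplot2 computed for the bars and check their total area.
bars <- layer_data(p, 1)
total_area <- sum((bars$xmax - bars$xmin) * bars$density)
print(total_area)
```

Printing p draws the plot. The area check mirrors the claim in the exercise text that the bar widths times bar heights sum to 1.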
Learn to effectively convey your data with an overview of common charts, alternative visualization types, and perception-driven style enhancements.
We now move on to visualizing distributional data: we expose the fragility of histograms, discuss when it is better to switch to a kernel density plot, and show how to make both plots work best for your data.