Calculate spread measures
Let's extend the powerful group_by() and summarize() syntax to measures of spread. If you're unsure whether you're working with symmetric or skewed distributions, it's a good idea to consider a robust measure like the interquartile range (IQR) in addition to the usual measures of variance or standard deviation.
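To see why a robust measure can be worth reporting alongside the standard deviation, here is a minimal base R sketch. The two vectors are made-up values chosen for illustration and are not taken from the gapminder data; the second vector differs from the first only in that its largest value is swapped for a low outlier.

```r
# Hypothetical values for illustration only (not from gap2007)
symmetric <- c(60, 62, 64, 66, 68, 70, 72)
skewed    <- c(60, 62, 64, 66, 68, 70, 30)  # largest value replaced by a low outlier

# Standard deviation reacts strongly to the single outlier
sd_symmetric <- sd(symmetric)
sd_skewed    <- sd(skewed)

# IQR depends only on the middle half of the data, so it does not move here
iqr_symmetric <- IQR(symmetric)  # 6
iqr_skewed    <- IQR(skewed)     # 6

print(c(sd_symmetric = sd_symmetric, sd_skewed = sd_skewed))
print(c(iqr_symmetric = iqr_symmetric, iqr_skewed = iqr_skewed))
```

With these particular numbers the standard deviation more than triples, while the IQR (computed with R's default quantile method) stays the same.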
This exercise is part of the course
Exploratory Data Analysis in R
Exercise instructions
The gap2007 dataset that you created in an earlier exercise is available in your workspace.
- For each continent in gap2007, summarize life expectancies using the sd(), the IQR(), and the count of countries, n(). No need to name the new columns produced here. The n() function within your summarize() call does not take any arguments.
- Graphically compare the spread of these distributions by constructing overlaid density plots of life expectancy broken down by continent.
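The group-then-summarize pattern described in the first instruction can be sketched on a small made-up data frame. The names `toy`, `group`, and `value` are hypothetical stand-ins, not columns of gap2007, and the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data: two groups of different sizes
toy <- tibble(
  group = c("a", "a", "a", "b", "b", "b", "b"),
  value = c(1, 2, 3, 10, 20, 30, 40)
)

# One row per group; n() is called with no arguments and counts rows in each group
spread_by_group <- toy %>%
  group_by(group) %>%
  summarize(sd(value), IQR(value), n())

print(spread_by_group)
```

Because the summaries are left unnamed, dplyr labels each column with the expression that produced it, for example `n()`.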
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Compute groupwise measures of spread
gap2007 %>%
  group_by(___) %>%
  summarize(___,
            ___,
            ___)

# Generate overlaid density plots
gap2007 %>%
  ggplot(aes(x = ___, fill = ___)) +
  geom_density(alpha = 0.3)
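For reference, the same overlaid density pattern can be tried on R's built-in iris data rather than gap2007. This is only an illustrative sketch, assuming ggplot2 is installed; the variables `Sepal.Length` and `Species` belong to iris and are unrelated to the exercise.

```r
library(ggplot2)

# Map a numeric variable to x and a grouping variable to fill;
# a low alpha keeps overlapping densities visible
p <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_density(alpha = 0.3)

print(p)
```

Each level of the fill variable gets its own density curve, so differences in spread show up as wider or narrower shapes.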