Calculate spread measures
Let's extend the powerful group_by() and summarize() syntax to measures of spread. If you're unsure whether you're working with symmetric or skewed distributions, it's a good idea to consider a robust measure like IQR in addition to the usual measures of variance or standard deviation.
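To see why a robust measure can help, here is a minimal sketch in base R. The numbers are hypothetical toy values, not taken from gap2007. It compares sd() and IQR() on the same vector with and without one extreme value added.

```r
# Hypothetical toy data: ten values clustered near 70
x <- c(68, 69, 70, 70, 71, 71, 72, 72, 73, 74)

# The same values plus one extreme low value
x_out <- c(x, 20)

# Collect both spread measures side by side for comparison
spread <- data.frame(
  data = c("without outlier", "with outlier"),
  sd   = c(sd(x), sd(x_out)),
  IQR  = c(IQR(x), IQR(x_out))
)
print(spread)
```

In this toy case the standard deviation grows several-fold once the extreme value is added, while the IQR changes only slightly, which is the sense in which IQR is described as robust.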
This exercise is part of the course
Exploratory Data Analysis in R
Exercise instructions
The gap2007 dataset that you created in an earlier exercise is available in your workspace.
- For each continent in gap2007, summarize life expectancies using the sd(), the IQR(), and the count of countries, n(). No need to name the new columns produced here. The n() function within your summarize() call does not take any arguments.
- Graphically compare the spread of these distributions by constructing overlaid density plots of life expectancy broken down by continent.
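The pattern described in the first instruction can be sketched on a small made-up data frame. The data and the column names `region` and `score` are hypothetical and chosen only for illustration; they are not the columns of gap2007. The sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data; `region` and `score` are made-up column names
toy <- tibble(
  region = c("A", "A", "A", "B", "B", "B", "B"),
  score  = c(60, 70, 80, 50, 55, 60, 95)
)

# One row per group, with unnamed summary columns.
# n() is called with no arguments and counts the rows in each group.
spread_by_region <- toy %>%
  group_by(region) %>%
  summarize(sd(score), IQR(score), n())

spread_by_region
```

Because the summary columns are left unnamed, dplyr labels them with the expressions themselves, such as `sd(score)` and `n()`.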
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Compute groupwise measures of spread
gap2007 %>%
  group_by(___) %>%
  summarize(___,
            ___,
            ___)

# Generate overlaid density plots
gap2007 %>%
  ggplot(aes(x = ___, fill = ___)) +
  geom_density(alpha = 0.3)
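As a rough illustration of what an overlaid density plot looks like, the sketch below uses simulated data rather than gap2007. The data frame `sim` and its columns `group` and `value` are hypothetical, and the ggplot2 package is assumed to be installed.

```r
library(ggplot2)

# Simulated toy data: two groups with the same center but different spread
set.seed(1)
sim <- data.frame(
  group = rep(c("narrow", "wide"), each = 100),
  value = c(rnorm(100, mean = 70, sd = 2),
            rnorm(100, mean = 70, sd = 8))
)

# Map the grouping variable to fill so the densities overlap;
# a low alpha keeps both curves visible
p <- ggplot(sim, aes(x = value, fill = group)) +
  geom_density(alpha = 0.3)

print(p)
```

The wider group should appear as a lower, flatter curve, which is how differences in spread show up when densities are overlaid.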