Boxplots and density plots
The mileage of a car tends to be associated with the size of its engine (as measured by the number of cylinders). To explore the relationship between these two variables, you could stick to using histograms, but in this exercise you'll try your hand at two alternatives: the box plot and the density plot.
This exercise is part of the course Exploratory Data Analysis in R.
Exercise instructions
A quick look at unique(cars$ncyl) shows that there are more possible levels of ncyl than you might think. Here, restrict your attention to the most common levels.
- Filter cars to include only cars with 4, 6, or 8 cylinders and save the result as common_cyl. The %in% operator may prove useful here.
- Create side-by-side box plots of city_mpg separated out by ncyl.
- Create overlaid density plots of city_mpg colored by ncyl.
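As a hint for the first instruction, here is a minimal sketch of how %in% works inside filter(). It does not use the course's cars data. It assumes R's built-in mtcars data set as a stand-in, with its cyl column playing the role of ncyl, and the name four_or_eight is made up for illustration.

```r
library(dplyr)

# mtcars ships with R; it is only a stand-in for the course's cars data.
# Count how many cars fall into each cylinder level.
table(mtcars$cyl)

# x %in% set returns TRUE for each element of x found in set,
# so filter() keeps only the rows whose cyl value is 4 or 8.
four_or_eight <- filter(mtcars, cyl %in% c(4, 8))

# Only the requested levels should remain.
unique(four_or_eight$cyl)
```

Compared with chaining several == comparisons with |, %in% stays readable as the set of allowed values grows.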
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Filter cars with 4, 6, 8 cylinders
common_cyl <- filter(___, ___)

# Create box plots of city mpg by ncyl
ggplot(___, aes(x = as.factor(___), y = ___)) +
  geom_boxplot()

# Create overlaid density plots for same data
ggplot(common_cyl, aes(x = ___, fill = as.factor(___))) +
  geom_density(alpha = 0.3)
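To see why the template wraps the grouping variable in as.factor(), the sketch below builds the same two plot types on a different data set. It assumes R's built-in mtcars as a stand-in (mpg and cyl instead of city_mpg and ncyl), and the names p_box and p_density are chosen here for illustration. It is not the exercise solution.

```r
library(ggplot2)

# mtcars ships with R; cyl is stored as a number, so it is converted to a
# factor to make ggplot2 treat each cylinder count as its own category.

# Side-by-side box plots: one box per cylinder level along the x axis.
p_box <- ggplot(mtcars, aes(x = as.factor(cyl), y = mpg)) +
  geom_boxplot()

# Overlaid density plots: one curve per cylinder level, distinguished by
# fill color; alpha below 1 keeps overlapping curves visible.
p_density <- ggplot(mtcars, aes(x = mpg, fill = as.factor(cyl))) +
  geom_density(alpha = 0.3)

print(p_box)
print(p_density)
```

Box plots make it easy to compare medians and spread across groups, while overlaid densities show the shape of each distribution, including any overlap between groups.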