1. Box plots
In this final lesson, you'll learn how to make one more type of graph: a box plot. Recall that we used
2. Histograms
a histogram when we wanted to examine the distribution of one variable, such as life expectancy, across all countries.
Notice that a histogram combines all the life expectancies across all continents, without distinguishing them. But what if your goal is to compare the distribution of life expectancies among continents?
3. Box plots
This is a box plot, which shows the distribution of life expectancies within each continent, so that you can compare them. It is created with geom_boxplot(), and it has two aesthetics: x is the category (continent), and y is the value that we're comparing, which in this case is life expectancy.
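As a rough sketch, a plot like this one could be built as follows. The transcript does not name the dataset, so this assumes the gapminder package with its continent and lifeExp columns, and it assumes a single year (2007) so that each country contributes one value. The name gapminder_2007 is only illustrative.

```r
library(gapminder)  # assumed data source: one row per country per year
library(dplyr)
library(ggplot2)

# Assumption: keep a single year so each country appears once
gapminder_2007 <- gapminder %>%
  filter(year == 2007)

# x is the category (continent), y is the value being compared (lifeExp)
p <- ggplot(gapminder_2007, aes(x = continent, y = lifeExp)) +
  geom_boxplot()

print(p)
```

Swapping lifeExp for another numeric column would compare that variable across continents in the same way.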
A box plot takes a bit of practice to interpret, so here's what each of the components means. The black line in the middle of each white box is the median of that continent's distribution. The top and bottom of each box represent the 75th percentile and the 25th percentile of that group, meaning half of the distribution lies within that box. The lines going up and down from the box, called "whiskers", cover additional countries: by default, each whisker reaches the most extreme value that lies within 1.5 times the height of the box from the box's edge. Anything beyond the whiskers is drawn as a separate dot. The two dots below the whiskers for Asia and the Americas represent outliers: countries with unusually low life expectancy relative to the rest of the distribution.
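To make these components concrete, here is a small base R sketch that computes them by hand. The life_exp values are made up for illustration and are not real data. The 1.5 times rule mirrors geom_boxplot's default behavior.

```r
# Hypothetical life expectancies (years) for one group; NOT real data
life_exp <- c(43.8, 59.7, 62.1, 64.7, 70.3, 71.7, 72.4, 74.2, 78.6, 82.6)

med <- median(life_exp)                  # line in the middle of the box
q1  <- unname(quantile(life_exp, 0.25))  # bottom of the box (25th percentile)
q3  <- unname(quantile(life_exp, 0.75))  # top of the box (75th percentile)
box_height <- q3 - q1                    # interquartile range (IQR)

# Values further than 1.5 * IQR from the box are treated as outliers
lower_limit <- q1 - 1.5 * box_height
upper_limit <- q3 + 1.5 * box_height
is_outlier  <- life_exp < lower_limit | life_exp > upper_limit

outliers     <- life_exp[is_outlier]        # drawn as separate dots
whisker_low  <- min(life_exp[!is_outlier])  # end of the lower whisker
whisker_high <- max(life_exp[!is_outlier])  # end of the upper whisker

print(c(median = med, q1 = q1, q3 = q3,
        whisker_low = whisker_low, whisker_high = whisker_high))
print(outliers)
```

In this made-up group, the lowest value falls below the lower limit, so it would appear as a single dot beneath the whisker.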
So there's a lot that this plot tells us about differences in life expectancy across continents. We can see that the median life expectancy of Europe is one of the highest, and that the two countries in Oceania (Australia and New Zealand) both have very high values. We can also see that the distribution for Africa is unusually low, with about half of its countries having a life expectancy between 50 and 60 years.
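The same comparisons can be read off as numbers. This sketch again assumes the gapminder package and the year 2007, neither of which the transcript states. The summary column names are illustrative.

```r
library(gapminder)  # assumed data source
library(dplyr)

# One row per continent: the values a box plot draws as the box and its median
continent_summary <- gapminder %>%
  filter(year == 2007) %>%                        # assumption: single year
  group_by(continent) %>%
  summarize(countries     = n(),
            q25           = quantile(lifeExp, 0.25),  # bottom of the box
            medianLifeExp = median(lifeExp),          # middle line
            q75           = quantile(lifeExp, 0.75))  # top of the box

print(continent_summary)
```

The q25 and q75 columns bound the middle half of each continent's countries, which is the range the box covers.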
4. Histogram vs box plot
A box plot helps give more context to the shape of the earlier histogram, where there were two bumps: one for countries with life expectancies between 65 and 80, representing most of Europe, Asia, and the Americas, and another, lower bump that the box plot suggests comes largely from Africa. You'll use box plots to examine other differences in distribution between continents in the final exercises.
5. Let's practice!