
Histograms

1. Histograms

Each kind of graph offers a different way to investigate your data. So far we've been looking at relationships between two or more variables. But we can instead investigate one dimension of the data at a time, using a histogram.

2. Histogram

A histogram shows a distribution. In this case, it's the distribution of life expectancy across countries in the year 2007. Every bar represents a bin of life expectancies, and its height represents how many countries fall into that bin. This lets you get a sense of the distribution from the histogram's shape. We can see that most countries have a life expectancy between 70 and 80 years, but that another set of countries has life expectancies between 40 and 65. A histogram is created with geom_histogram(). It has only one aesthetic: the x-axis, the variable whose distribution you are examining. The width of each bin in the histogram is chosen automatically, and it has a large effect on how the histogram communicates the distribution. You may need to customize that width.

3. Adjusting bin width

You can do so with the binwidth option, which is set inside the parentheses of the geom_histogram() layer. Setting binwidth = 5 means that each bar in the histogram spans five years. Setting a wide binwidth like this makes the histogram a bit blockier, which emphasizes the general shape over the small details. As you gain experience with histograms, you'll learn how to customize this to give the clearest picture of your data. In some cases, you may need to put the x-axis of a histogram on a log scale for it to be understandable, just like you did in several of the scatter plots in Chapter 2.
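
The default histogram and the binwidth adjustment described above can be sketched as follows. This is a minimal example, not the exact code from the slides: it assumes the data come from the gapminder package, and the names gapminder_2007, p_default, and p_wide are chosen here for illustration.

```r
# Assumption: the life expectancy data come from the gapminder package,
# whose lifeExp column holds life expectancy in years.
library(gapminder)
library(dplyr)
library(ggplot2)

# Keep one row per country for the year 2007
gapminder_2007 <- gapminder %>%
  filter(year == 2007)

# Histogram with the automatically chosen bins
p_default <- ggplot(gapminder_2007, aes(x = lifeExp)) +
  geom_histogram()

# Same histogram, but each bar spans five years
p_wide <- ggplot(gapminder_2007, aes(x = lifeExp)) +
  geom_histogram(binwidth = 5)

print(p_default)
print(p_wide)
```

Comparing the two plots side by side is a quick way to judge whether a wider bin hides detail you care about or simply removes noise.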

4. Log x-axis

Recall that you do this by adding scale_x_log10() to the graph. You'll practice doing so in the exercises.
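
As a rough sketch of what that looks like, the example below puts a histogram on a log-scaled x-axis. The transcript does not say which variable needs the log scale, so the choice of population (the pop column of the gapminder package) is an assumption made here, as are the names gapminder_2007 and p_log.

```r
# Assumption: data from the gapminder package; pop (population) is used
# only as an example of a variable spanning several orders of magnitude.
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2007 <- gapminder %>%
  filter(year == 2007)

# Histogram of population with the x-axis on a log10 scale
p_log <- ggplot(gapminder_2007, aes(x = pop)) +
  geom_histogram() +
  scale_x_log10()

print(p_log)
```

With a log scale, the bins are computed on the transformed values, so each bar covers an equal multiplicative range rather than an equal number of people.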

5. Let's practice!