
Visualizing with ggplot2

1. Visualizing with ggplot2

In the last chapter, you used the dplyr package to answer some questions about the gapminder dataset. You've been able to filter for particular observations, arrange to find the highest or lowest values, and mutate to add new columns. However, so far you've engaged with the results only as a table printed out from your code. Often a better way to understand and present this kind of data is as a graph.

2. Data visualization

In this chapter, you'll learn the essential skill of data visualization using the ggplot2 package. In particular, this chapter will show you how to create scatterplots, like the one you see here, that compare two variables on an x- and y-axis. Visualization and data wrangling are often intertwined, so you'll see how the dplyr and ggplot2 packages work closely together to create informative graphs.

3. Variable Assignment

In this chapter, you'll mostly be visualizing subsets of the gapminder dataset. For example, you'll often be visualizing only data from 2007. When you're working with just that subset, it's useful to save the filtered data as a new data frame. To do this, you use the assignment operator. This is a less-than sign followed by a minus sign (<-), like an arrow pointing to the left. In this operation, you're taking the gapminder dataset, filtering it for the observations from the year 2007, and then saving it, with that arrow pointing to the left, into a dataset called gapminder_2007. Now if you print the gapminder_2007 dataset, you can see that it's another table. But this one has only 142 rows, and they come only from the year 2007. Now that you've saved this variable, you can use it to create your visualization.
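The code from the slide isn't included in this transcript, so here is a minimal sketch of what the assignment described above might look like. It assumes the gapminder and dplyr packages are installed and that the filtering is written with the pipe from the previous chapter.

```r
# Load the packages used so far (assumed to be installed)
library(gapminder)
library(dplyr)

# Filter gapminder for observations from 2007 and save the
# result into a new data frame with the assignment operator <-
gapminder_2007 <- gapminder %>%
  filter(year == 2007)

# Printing the new variable shows a table with only the 2007 rows
gapminder_2007
```

The original gapminder dataset isn't changed by this; gapminder_2007 is a separate data frame that you can reuse in later steps.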

4. Visualizing with ggplot2

Suppose you want to examine the relationship between a country's wealth and its life expectancy. You could do this with a scatterplot comparing two variables in the gapminder dataset: GDP per capita on the x-axis and life expectancy on the y-axis. You'll be creating this plot using the ggplot2 package. Just like the gapminder and dplyr packages, you'll have to load it first, with library(ggplot2). This is the code to create this scatterplot. There are three parts to a ggplot graph. First is the data that you're visualizing. In this case, that is the gapminder_2007 variable you just created. Second is the mapping of variables in your dataset to aesthetics in your graph. An aesthetic is a visual dimension of a graph that can be used to communicate information. In a scatterplot, your two dimensions are the x-axis and the y-axis, so you write aes(x = gdpPercap, y = lifeExp), where aes stands for "aesthetic", telling it which variables to place on which axes. The third part is specifying the type of graph you're creating. You do that by adding a layer to the graph: use a plus after the ggplot call, and then geom_point(). The "geom" means you're adding a type of geometric object to the graph, and the "point" indicates it's a scatterplot, where each observation corresponds to one point. Together, these three parts of the code (the data, the aesthetic mapping, and the layer) construct the scatterplot you see here. In the exercises, you'll practice creating other scatterplots to compare variables across countries, and in the rest of this chapter you'll learn more ways to communicate information in a graph.
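Since the slide itself isn't part of this transcript, here is a sketch of how the three parts described above could fit together. It assumes the gapminder, dplyr, and ggplot2 packages are installed; the variable name p is only there for illustration and isn't from the lesson.

```r
# Load the packages (assumed to be installed)
library(gapminder)
library(dplyr)
library(ggplot2)

# Recreate the 2007 subset so this example runs on its own
gapminder_2007 <- gapminder %>%
  filter(year == 2007)

# 1. The data: gapminder_2007
# 2. The aesthetic mapping: gdpPercap on x, lifeExp on y
# 3. The layer: geom_point() draws one point per observation
p <- ggplot(gapminder_2007, aes(x = gdpPercap, y = lifeExp)) +
  geom_point()

# Printing the plot object draws the scatterplot
print(p)
```

Saving the plot into a variable is optional. Running the ggplot() call on its own at the console draws the same graph.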

5. Let's practice!
