Visualizing the engagement data

1. Visualizing engagement data

As you explore the differences between the employee groups you've chosen to analyze, you'll often want to look at more than one attribute at a time.

2. Visualizing several variables at once

Instead of building a visualization for every metric, you can get a clearer overall look at your data by visualizing several at once. When you've finished exploring the data, being able to plot multiple attributes of your groups in one place can make your presentation more powerful. To use ggplot2 to create a visual like this one, your data needs to be in a slightly different format.

3. The tidyr package

This is where the tidyr package comes in. tidyr is part of the tidyverse, just like dplyr and ggplot2, and it provides several useful functions to help get your data arranged properly. In this course, you'll only be using one function from tidyr, the gather() function.

4. Using tidyr::gather()

To use gather(), you'll need to load the tidyr package. You use gather() when you want to collect, or gather, column headings into a single column and bring the corresponding data along with them in an adjacent column. When you use gather() in a pipe, the only required arguments are the columns you want to gather. By default, the resulting column of headings is named "key", and the column of data values is named "value". You can rename them by passing new names to the key and value arguments. Because key and value come first in gather()'s argument list, it is safest to name them explicitly whenever you also list the columns to gather.
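As a minimal sketch of the call shape described above, the snippet below gathers two columns and renames the output columns. The `scores` data frame, its column names, and its values are hypothetical and exist only for illustration.

```r
library(dplyr)   # provides the %>% pipe and tibble()
library(tidyr)   # provides gather()

# Hypothetical wide data: one row per team, one column per metric
scores <- tibble(
  team       = c("A", "B"),
  metric_one = c(1.5, 2.5),
  metric_two = c(10, 20)
)

# Name key and value explicitly, then list the columns to gather
scores_long <- scores %>%
  gather(key = "metric", value = "score", metric_one, metric_two)

# scores_long now has one row per team and metric,
# with the columns team, metric, and score
scores_long
```

Naming `key` and `value` keeps the bare column names from being read as the new column names, which is what would happen if they were passed positionally in the first two slots.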

5. Using tidyr::gather()

The easiest way to understand gather() is to see it in action. Here is the result of a group_by() %>% summarize() pipe called survey_summary. It has a department column, as well as two columns with attributes of those departments. When we gather() the average_engagement and average_promotions columns together, we end up with three columns again: department, key, and value. The department column holds the same departments as before, except there are now two rows for each department: one for average engagement, and one for average promotions. The key column tells us which attribute the value column refers to. With the data in this format, you can use ggplot2 to visually compare the differences between several attributes at a time.
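The transformation just described might look like the sketch below. The department names and numbers are made up for illustration; in the lesson, survey_summary comes from a group_by() %>% summarize() pipe on the survey data rather than being typed in by hand.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the summarized survey data
survey_summary <- tibble(
  department         = c("Engineering", "Finance", "Sales"),
  average_engagement = c(3.2, 3.0, 2.8),
  average_promotions = c(0.4, 0.5, 0.6)
)

# Gather the two attribute columns, keeping the default
# output names "key" and "value"
survey_gathered <- survey_summary %>%
  gather(key = "key", value = "value",
         average_engagement, average_promotions)

# Three columns (department, key, value) and two rows per department
survey_gathered
```

Note that the key column holds the original column names, such as average_engagement, exactly as they were spelled in survey_summary.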

6. Adding color to bar charts

Let's call the gathered data survey_gathered, and plot it as a bar chart with key on the x axis and value on the y axis. This will produce one bar for each variable of interest. To compare different departments, use the fill aesthetic to break up each bar into different colors by department. Do this by adding fill = department inside aes().
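A minimal sketch of this plot follows. The survey_gathered values are hypothetical placeholders, and geom_col() is assumed as the bar geom because the y values are already summarized.

```r
library(ggplot2)

# Hypothetical gathered data: two rows per department
survey_gathered <- data.frame(
  department = rep(c("Engineering", "Finance", "Sales"), times = 2),
  key        = rep(c("average_engagement", "average_promotions"), each = 3),
  value      = c(3.2, 3.0, 2.8, 0.4, 0.5, 0.6)
)

# key on the x axis, value on the y axis, bars colored by department
stacked_plot <- ggplot(survey_gathered,
                       aes(x = key, y = value, fill = department)) +
  geom_col()
```

Printing stacked_plot draws the chart, with each attribute's bar split into colored segments by department.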

7. Adding color to bar charts

By default, using fill will produce a single stacked bar for each attribute, split into the different groups you're comparing. It's easier to compare differences when the colored bars are side by side, such as in this second example.

8. Side-by-side bar charts

You can create the side-by-side bar chart by adding position = "dodge" inside geom_col(). This lets you compare departments more easily by checking the heights of the bars.
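The side-by-side version could be sketched as follows, again with hypothetical values standing in for the real survey summary.

```r
library(ggplot2)

# Hypothetical gathered data: two rows per department
survey_gathered <- data.frame(
  department = rep(c("Engineering", "Finance", "Sales"), times = 2),
  key        = rep(c("average_engagement", "average_promotions"), each = 3),
  value      = c(3.2, 3.0, 2.8, 0.4, 0.5, 0.6)
)

# position = "dodge" places the department bars next to each other
# instead of stacking them
dodged_plot <- ggplot(survey_gathered,
                      aes(x = key, y = value, fill = department)) +
  geom_col(position = "dodge")
```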

9. Side-by-side bar charts

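Notice that there is a single y-axis for both sets of bars. Using a single y-axis is usually good practice, but what if we'd also chosen to compare average bonus? Because bonus values are so much larger than the other two, the shared axis stretches to fit them, and the engagement and promotion bars become too small to see.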

10. Adding facets

The solution to this problem is to add a facet layer with facet_wrap(). The first argument is a formula: a tilde followed by the column used to create the facets, such as ~ key. Adding the scales = "free" argument lets each facet's y-axis have its own scale.
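A sketch of the faceted chart is shown below. The average_bonus rows and all of the numbers are hypothetical; they are chosen only so that one attribute dwarfs the others, as in the problem described above.

```r
library(ggplot2)

# Hypothetical gathered data with a third, much larger attribute
survey_gathered <- data.frame(
  department = rep(c("Engineering", "Finance", "Sales"), times = 3),
  key        = rep(c("average_bonus", "average_engagement",
                     "average_promotions"), each = 3),
  value      = c(1200, 1500, 900, 3.2, 3.0, 2.8, 0.4, 0.5, 0.6)
)

# One facet per attribute; scales = "free" gives each facet its own axes
faceted_plot <- ggplot(survey_gathered,
                       aes(x = key, y = value, fill = department)) +
  geom_col(position = "dodge") +
  facet_wrap(~ key, scales = "free")
```

With free scales, the engagement and promotion facets are drawn against their own y-axis ranges, so their bars stay readable next to the bonus facet.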

11. Faceted bar chart

The result is a bar chart with a different facet and y-axis for each attribute.

12. Let's practice!

Now you can practice these new tools on the survey data.