Tricks of ggplot2
1. Tricks of ggplot2
We’ve done a lot of graphing in this course with ggplot2. Along the way, you’ve seen a few helpful functions like coord_flip(), to flip the x- and y-axes, and labs(), to change the labels of your axes. In this lesson, we’ll review these and learn a couple of other functions that can help make your visualizations as clear and effective as possible.
2. Job title data
Let's take a look at a new dataset made from the Kaggle Survey. Here, we have summarized the job title information to get a dataset with one row for each title and a column, percent_with_title, to say what percentage of respondents have that title. What happens when we try to make a simple scatter plot?
3. Initial plot
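As a rough sketch of this first attempt: the real summarized survey data is not shown in the lesson, so the data frame job_titles, its job_title column, and every value in it are made up for illustration. Only the column name percent_with_title comes from the lesson, and storing it as a proportion is an assumption.

```r
library(ggplot2)

# Hypothetical stand-in for the summarized survey data:
# one row per job title, with the share of respondents holding it.
# Assumption: shares are stored as proportions (0.24 means 24%).
job_titles <- data.frame(
  job_title = c("Data Scientist",
                "Business Analyst",
                "Software Developer/Software Engineer",
                "Operations Research Practitioner",
                "Machine Learning Engineer"),
  percent_with_title = c(0.24, 0.07, 0.15, 0.02, 0.05)
)

# A simple scatter plot with no adjustments at all
initial_plot <- ggplot(job_titles,
                       aes(x = job_title, y = percent_with_title)) +
  geom_point()

initial_plot
```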
Well, that doesn't look very nice. We've got a lot of issues here, so let's tackle them one at a time.
4. Changing tick labels angle
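The rotation covered in this step might look like the sketch below. The data frame job_titles and its values are hypothetical stand-ins, since the lesson's data is not shown; the theme() call is the part that matters.

```r
library(ggplot2)

# Hypothetical data; names and values are illustrative only
job_titles <- data.frame(
  job_title = c("Data Scientist",
                "Business Analyst",
                "Software Developer/Software Engineer",
                "Operations Research Practitioner",
                "Machine Learning Engineer"),
  percent_with_title = c(0.24, 0.07, 0.15, 0.02, 0.05)
)

rotated_plot <- ggplot(job_titles,
                       aes(x = job_title, y = percent_with_title)) +
  geom_point() +
  # Rotate the x-axis tick labels to vertical and right-justify them
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

rotated_plot
```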
First, let's deal with the issue where we can't read the tick labels of the x-axis. In a previous lesson, we saw that we can flip the axes so that each label becomes readable. Another option is to rotate the tick labels from horizontal to vertical. We can do this with the theme() function. Here we set axis.text.x equal to the function element_text(), to which we pass two arguments, angle and hjust. We set angle equal to 90 to rotate the text by 90 degrees (180 would leave the text horizontal but upside down) and set hjust equal to 1 to right-justify the text.
5. Using fct_reorder()
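A minimal sketch of the reordering in this step, assuming the forcats package is installed and using a made-up data frame (job_titles and its values are hypothetical). The helper object ordered_titles is only there to make the new level order easy to inspect.

```r
library(ggplot2)
library(forcats)

# Hypothetical data; names and values are illustrative only
job_titles <- data.frame(
  job_title = c("Data Scientist",
                "Business Analyst",
                "Software Developer/Software Engineer",
                "Operations Research Practitioner",
                "Machine Learning Engineer"),
  percent_with_title = c(0.24, 0.07, 0.15, 0.02, 0.05)
)

# Inspect the level order that fct_reorder() produces:
# titles are sorted by percent_with_title, lowest first
ordered_titles <- fct_reorder(job_titles$job_title,
                              job_titles$percent_with_title)
levels(ordered_titles)

# Use the same reordering directly inside aes()
reordered_plot <- ggplot(job_titles,
                         aes(x = fct_reorder(job_title, percent_with_title),
                             y = percent_with_title)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

reordered_plot
```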
Another issue is that the scatter plot is all jumbled. It would be easier to read if the job titles were in order of popularity. We can use fct_reorder() to order the job titles by the variable percent_with_title. With that, we now see that the percent_with_title values are strictly increasing from left to right.
6. Adding fct_rev()
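Reversing the order could be sketched as follows, again with a hypothetical job_titles data frame standing in for the lesson's data. The object reversed_titles is only for inspecting the level order.

```r
library(ggplot2)
library(forcats)

# Hypothetical data; names and values are illustrative only
job_titles <- data.frame(
  job_title = c("Data Scientist",
                "Business Analyst",
                "Software Developer/Software Engineer",
                "Operations Research Practitioner",
                "Machine Learning Engineer"),
  percent_with_title = c(0.24, 0.07, 0.15, 0.02, 0.05)
)

# fct_rev() flips the ascending order from fct_reorder(),
# so the most common title comes first
reversed_titles <- fct_rev(fct_reorder(job_titles$job_title,
                                       job_titles$percent_with_title))
levels(reversed_titles)

reversed_plot <- ggplot(job_titles,
                        aes(x = fct_rev(fct_reorder(job_title, percent_with_title)),
                            y = percent_with_title)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

reversed_plot
```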
While that was better, we generally like the values to go from largest to smallest, left to right. To do so, we can wrap the fct_reorder() call in fct_rev(). As we've seen before, this reverses the order of our factor's levels.
7. Using labs()
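One way the relabeling in this step might look. The data frame and the label wording are illustrative choices, not taken from the lesson.

```r
library(ggplot2)
library(forcats)

# Hypothetical data; names and values are illustrative only
job_titles <- data.frame(
  job_title = c("Data Scientist",
                "Business Analyst",
                "Software Developer/Software Engineer",
                "Operations Research Practitioner",
                "Machine Learning Engineer"),
  percent_with_title = c(0.24, 0.07, 0.15, 0.02, 0.05)
)

labelled_plot <- ggplot(job_titles,
                        aes(x = fct_rev(fct_reorder(job_title, percent_with_title)),
                            y = percent_with_title)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Replace the raw expressions used in aes() with readable axis labels
  labs(x = "Job title", y = "Percent of respondents")

labelled_plot
```

Without labs(), the x-axis label would default to the full expression passed to aes(), which is hard to read once fct_rev() and fct_reorder() are involved.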
Generally, good column names are not the same as good axis labels. We usually want column names to contain no spaces and to be as short as possible while still being understandable. But once we graph those variables, we can be a little more descriptive. Let’s use the labs() function to relabel the axes. We set the argument y equal to what we want the y-axis label to be and the argument x equal to what we want the x-axis label to be.
8. Changing to % scales
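A sketch of the percent scale in this step, assuming the scales package is installed but not attached. The data is hypothetical and assumes percent_with_title is stored as a proportion, which matters because percent_format() multiplies values by 100 by default.

```r
library(ggplot2)
library(forcats)
# Note: no library(scales); percent_format() is reached via scales::

# Hypothetical data; shares stored as proportions (0.24 means 24%)
job_titles <- data.frame(
  job_title = c("Data Scientist",
                "Business Analyst",
                "Software Developer/Software Engineer",
                "Operations Research Practitioner",
                "Machine Learning Engineer"),
  percent_with_title = c(0.24, 0.07, 0.15, 0.02, 0.05)
)

# percent_format() returns a function that turns numbers into
# percent strings; try it on its own first
percent_labeller <- scales::percent_format()
example_labels <- percent_labeller(job_titles$percent_with_title)
example_labels

percent_plot <- ggplot(job_titles,
                       aes(x = fct_rev(fct_reorder(job_title, percent_with_title)),
                           y = percent_with_title)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(x = "Job title", y = "Percent of respondents") +
  # Format the y-axis tick labels as percentages
  scale_y_continuous(labels = scales::percent_format())

percent_plot
```

If the column already held values such as 24 rather than 0.24, the default scaling would overstate them, so the formatter's scale argument would need adjusting.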
Finally, we can change the y-axis ticks to explicitly show that they’re percentages. We can do this with another function, scale_y_continuous(). We’ll set the argument labels equal to scales::percent_format(). scales is a package, but if we don’t want to have library(scales) in our script, we can call the function percent_format() from scales without loading the package by putting the package name and a double colon before percent_format. Similarly, we could write dplyr::select or ggplot2::ggplot to call the select() function and ggplot() function without loading dplyr or ggplot2.
9. Let's practice!
Now let's try some examples.