1. Graphical visualizations in R using ggplot2
This video will give you a brief introduction to making graphs in R with the ggplot2 package, which you will use throughout this course.
It is important to always make plots of your data to visualize distributions and relationships between your variables.
These visualizations are necessary for understanding and communicating your results.
2. ggplot2 package
ggplot2 is a powerful graphics package for R.
The gg in ggplot2 stands for the grammar of graphics.
ggplot2 uses a layering approach to build graphics.
The process begins by creating a basic graphical environment to which one or more geometric objects are added.
3. Layers - base layer
Building graphs with ggplot2 begins by running the ggplot function to define the base layer options,
including the dataset abalone and
aesthetics for the sex and diameter variables.
The code shown here defines a graphics window where the horizontal axis is for sex and the vertical axis is for diameter.
The code also adds a grid for the plot.
Next you will add a graphical object to this plot.
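The slide's code is not part of this transcript, so here is a minimal sketch of what such a base layer might look like. The data frame is a small simulated stand-in for abalone, and the sex codes F, I and M are an assumption.

```r
library(ggplot2)

# Simulated stand-in for the course's abalone data (values are made up).
set.seed(1)
abalone <- data.frame(
  sex = rep(c("F", "I", "M"), each = 20),
  diameter = rnorm(60, mean = rep(c(0.45, 0.33, 0.44), each = 20), sd = 0.05)
)

# Base layer only: the axes and grid are drawn, but no data yet,
# because no geom has been added.
base <- ggplot(abalone, aes(x = sex, y = diameter))
base
```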
4. Layers - add boxplot geom
To the graphical environment defined by the base layer code, you can add a geometric object or geom like a boxplot using the geom_boxplot function.
The plus operator is used to add the boxplot geom to the base layer.
The boxplots for abalone diameters are shown here by sex for females, infants and males.
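As a rough sketch of this step, assuming the same kind of simulated stand-in data rather than the real abalone measurements, the boxplot geom is added to the base layer with the plus operator:

```r
library(ggplot2)

# Simulated stand-in for the course's abalone data (values are made up).
set.seed(1)
abalone <- data.frame(
  sex = rep(c("F", "I", "M"), each = 20),
  diameter = rnorm(60, mean = rep(c(0.45, 0.33, 0.44), each = 20), sd = 0.05)
)

# Base layer plus a boxplot geom: one box per level of sex.
p <- ggplot(abalone, aes(x = sex, y = diameter)) +
  geom_boxplot()
p
```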
5. Layers - add a theme
Another layer such as a theme can be added to the plot.
The theme_bw function removes the gray background and draws a black box around the outer edge of the plot.
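A theme is added the same way as a geom. The sketch below again uses simulated stand-in data, not the course dataset:

```r
library(ggplot2)

# Simulated stand-in for the course's abalone data (values are made up).
set.seed(1)
abalone <- data.frame(
  sex = rep(c("F", "I", "M"), each = 20),
  diameter = rnorm(60, mean = rep(c(0.45, 0.33, 0.44), each = 20), sd = 0.05)
)

# Same boxplot as before, with the black-and-white theme layered on top.
p <- ggplot(abalone, aes(x = sex, y = diameter)) +
  geom_boxplot() +
  theme_bw()
p
```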
6. Change boxplot geom to violin geom
A geom can easily be changed.
For example, geom_violin is similar to geom_boxplot,
except that it draws a violin-like shape that reflects the density of the data better than a box does.
Changing this geom easily creates a new figure.
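To illustrate how small the change is, here is a sketch on simulated stand-in data where only the geom function differs from a boxplot version:

```r
library(ggplot2)

# Simulated stand-in for the course's abalone data (values are made up).
set.seed(1)
abalone <- data.frame(
  sex = rep(c("F", "I", "M"), each = 20),
  diameter = rnorm(60, mean = rep(c(0.45, 0.33, 0.44), each = 20), sd = 0.05)
)

# Swapping geom_boxplot() for geom_violin() is the only change needed.
p <- ggplot(abalone, aes(x = sex, y = diameter)) +
  geom_violin() +
  theme_bw()
p
```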
7. Single variable histogram
To create a histogram, you only need to define one variable aesthetic.
To create a histogram of the abalone shuckedWeight distribution, you use geom_histogram and set the
aes aesthetic to shuckedWeight.
The default colors are not pretty. The next slide shows how to improve this figure.
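A minimal sketch of a one-variable histogram follows. The shuckedWeight values are simulated for illustration only.

```r
library(ggplot2)

# Simulated stand-in for the shuckedWeight column (values are made up).
set.seed(1)
abalone <- data.frame(shuckedWeight = rgamma(60, shape = 4, scale = 0.09))

# Only one aesthetic is mapped; geom_histogram() computes the counts itself.
# With no bin settings, ggplot2 falls back to 30 bins and prints a message.
p <- ggplot(abalone, aes(x = shuckedWeight)) +
  geom_histogram()
p
```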
8. Histogram add colors
To improve this figure, you will specify parameters such as color and fill inside the parentheses for geom_histogram.
Each option is set using the equals sign followed by the color choice in double quotation marks.
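As a sketch of setting these parameters on simulated stand-in data: the two color names below are illustrative choices, not necessarily the ones used on the slide.

```r
library(ggplot2)

# Simulated stand-in for the shuckedWeight column (values are made up).
set.seed(1)
abalone <- data.frame(shuckedWeight = rgamma(60, shape = 4, scale = 0.09))

# color sets the bar outline and fill sets the bar interior.
# Both are fixed values, so they go inside geom_histogram(), not inside aes().
p <- ggplot(abalone, aes(x = shuckedWeight)) +
  geom_histogram(color = "darkblue", fill = "lightblue")
p
```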
9. Histogram add title and axis labels
You can further improve this histogram figure by adding better labels for the x and y axes
using xlab() and ylab(),
and adding a title using ggtitle().
The text labels are provided inside the parentheses, in double quotation marks.
The resulting figure is ready to publish.
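Putting the labeling functions together might look like the sketch below. The data are simulated, and the label text and colors are placeholders chosen for illustration.

```r
library(ggplot2)

# Simulated stand-in for the shuckedWeight column (values are made up).
set.seed(1)
abalone <- data.frame(shuckedWeight = rgamma(60, shape = 4, scale = 0.09))

# Axis labels and a title are added as further layers with the plus operator.
p <- ggplot(abalone, aes(x = shuckedWeight)) +
  geom_histogram(color = "darkblue", fill = "lightblue") +
  xlab("Shucked weight") +
  ylab("Number of abalones") +
  ggtitle("Distribution of abalone shucked weight")
p
```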
10. Make scatterplot
To see the association between abalone shell weights and their number of rings, you can make a scatterplot.
For this plot, you need to define two aesthetics and add points using geom_point.
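Here is a minimal scatterplot sketch. The column names shellWeight and rings are assumed, as is the choice of which variable goes on which axis, and the values are simulated.

```r
library(ggplot2)

# Simulated stand-in data; column names shellWeight and rings are assumptions.
set.seed(1)
abalone <- data.frame(shellWeight = runif(60, min = 0.05, max = 0.6))
abalone$rings <- round(5 + 15 * abalone$shellWeight + rnorm(60))

# Two aesthetics (x and y) plus a point geom give a scatterplot.
p <- ggplot(abalone, aes(x = shellWeight, y = rings)) +
  geom_point()
p
```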
11. Scatterplot add smoothed fit line
To this scatterplot, a smoothed fit line can be added
using the geom_smooth layer,
which shows a positive trend line plus a shaded confidence area.
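A sketch of adding the smoother, again on simulated data with assumed column names. With no arguments, geom_smooth() picks its own smoothing method and prints a message saying which one it used.

```r
library(ggplot2)

# Simulated stand-in data; column names shellWeight and rings are assumptions.
set.seed(1)
abalone <- data.frame(shellWeight = runif(60, min = 0.05, max = 0.6))
abalone$rings <- round(5 + 15 * abalone$shellWeight + rnorm(60))

# geom_smooth() draws a fitted curve and, by default, a shaded band around it.
p <- ggplot(abalone, aes(x = shellWeight, y = rings)) +
  geom_point() +
  geom_smooth()
p
```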
12. Create panels by another variable
To this scatterplot, you can add another layer
to create panels for each abalone sex.
This is accomplished using the facet_wrap function.
vars(sex) defines the variable sex to be used for each panel of the facet_wrap.
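As a sketch of the faceting step, the example below uses simulated data with assumed column names and sex codes, and keeps the smoother from the earlier scatterplot:

```r
library(ggplot2)

# Simulated stand-in data; column names and sex codes are assumptions.
set.seed(1)
abalone <- data.frame(
  sex = rep(c("F", "I", "M"), each = 30),
  shellWeight = runif(90, min = 0.05, max = 0.6)
)
abalone$rings <- round(5 + 15 * abalone$shellWeight + rnorm(90))

# facet_wrap(vars(sex)) splits the plot into one panel per level of sex.
p <- ggplot(abalone, aes(x = shellWeight, y = rings)) +
  geom_point() +
  geom_smooth() +
  facet_wrap(vars(sex))
p
```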
13. Rest of course
Chapter 1 will finish with you using your ggplot2 plotting skills to visualize the abalone measurements by sex. These exercises establish your graphics foundation for the rest of the course.
In chapter 2 you will learn data wrangling skills to clean up the abalone dataset.
In chapter 3 you will further explore the abalone data using descriptive statistics, correlations and comparison tests.
Finally, in chapter 4 you will learn how to run models and present your results.
14. Let's make some plots for abalones
Let's make some plots for abalones