Visualization of results

1. Visualization of results

Congratulations! We have gone through the entire RNA-Seq workflow! Visualizations of the results often help give a big-picture overview of the analysis. We will explore a few standard visualization methods.

2. Visualizing results - Expression heatmap

The first method is the expression heatmap, which shows the expression of the significant genes. Unlike the earlier correlation heatmap, it plots the normalized count values of the significant genes rather than the sample correlation values. To create the heatmap, subset the normalized counts to only the significant DE genes. Then, after loading the RColorBrewer package, specify the color palette for the heatmap with the brewer.pal() function and save it to heat_colors. To see the full list of available palettes, you can run the display.brewer.all() function.
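As a minimal sketch of these two steps, the code below uses small made-up data in place of the course objects. The names normalized_counts_wt and wt_res_sig, the toy values, and the choice of the "YlOrRd" palette with six colors are assumptions for illustration; only sig_norm_counts_wt and heat_colors are named in the narration.

```r
library(RColorBrewer)

# Toy stand-ins (assumptions, not the course data):
# a genes x samples matrix of normalized counts ...
set.seed(1)
normalized_counts_wt <- matrix(
  rpois(40, lambda = 100), nrow = 10,
  dimnames = list(paste0("gene", 1:10),
                  c("normal1", "normal2", "fibrosis1", "fibrosis2"))
)
# ... and a table of significant DE genes with their adjusted p-values
wt_res_sig <- data.frame(ensgene = c("gene2", "gene5", "gene9"),
                         padj = c(0.001, 0.01, 0.03))

# Subset the normalized counts to the rows of the significant genes
sig_norm_counts_wt <- normalized_counts_wt[wt_res_sig$ensgene, ]

# Pick a palette; "YlOrRd" with 6 colors is an arbitrary choice here
heat_colors <- brewer.pal(n = 6, name = "YlOrRd")

# display.brewer.all()  # uncomment to preview every available palette
```

Indexing the matrix by gene ID works because the row names of the count matrix are the same identifiers stored in the ensgene column.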

3. Visualizing results - Expression heatmap

Finally, plot the heatmap using the pheatmap() function with the sig_norm_counts_wt data, coloring with heat_colors, clustering by row, annotating with condition information, and scaling by row, which plots Z-scores rather than the actual normalized count values. Generally, we would expect the expression levels of the significant genes to cluster by sample group, which is the case for our data.
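The call described above might look roughly like the following. This is a sketch on toy data: the metadata data frame, the sample names, and the simulated counts are assumptions, and show_rownames = FALSE is an optional extra that is not mentioned in the narration.

```r
library(pheatmap)
library(RColorBrewer)

# Toy stand-ins (assumptions): counts for 8 "significant" genes in 4 samples
set.seed(1)
samples <- c("normal1", "normal2", "fibrosis1", "fibrosis2")
sig_norm_counts_wt <- matrix(
  rpois(32, lambda = 200), nrow = 8,
  dimnames = list(paste0("gene", 1:8), samples)
)
# Inflate the fibrosis columns so the two groups separate
sig_norm_counts_wt[, 3:4] <- sig_norm_counts_wt[, 3:4] * 3

# Sample annotation: row names must match the column names of the matrix
metadata <- data.frame(
  condition = c("normal", "normal", "fibrosis", "fibrosis"),
  row.names = samples
)

heat_colors <- brewer.pal(n = 6, name = "YlOrRd")

hm <- pheatmap(sig_norm_counts_wt,
               color = heat_colors,
               cluster_rows = TRUE,      # cluster genes by expression pattern
               show_rownames = FALSE,    # optional: hide gene labels
               annotation_col = metadata,
               scale = "row")            # plot per-gene Z-scores
```

With scale = "row", each gene is centered and scaled across samples, so the colors compare samples within a gene rather than absolute expression between genes.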

4. Visualizing results - Volcano plot

In addition to the MA plot explored previously, another useful plot that gives a global view of the results is the volcano plot, which plots the fold changes against the adjusted p-values for all genes. First, using all results, wt_res_all, convert the row names to a column called ensgene, then use the mutate() function to create a column of logical values, called threshold, indicating whether each gene is DE, that is, whether its adjusted p-value is below 0.05. Then, use ggplot2 to plot the log2 fold change values against the -log10 adjusted p-values. The points for the genes are then colored by significance using the threshold column.
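A sketch of these steps on simulated results follows. The toy wt_res_all data frame is an assumption standing in for the real results table (which would first need converting to a data frame); log2FoldChange and padj are the column names DESeq2 uses in its results, and the axis labels and hidden legend are optional styling choices.

```r
library(dplyr)
library(tibble)
library(ggplot2)

# Toy stand-in (assumption): 200 genes with random fold changes and p-values
set.seed(1)
wt_res_all <- data.frame(
  log2FoldChange = rnorm(200),
  padj = runif(200),
  row.names = paste0("ENSG", seq_len(200))
)

# Move the gene IDs into a column, then flag genes with padj < 0.05
wt_res_all <- wt_res_all %>%
  rownames_to_column(var = "ensgene") %>%
  mutate(threshold = padj < 0.05)

# Volcano plot: effect size on x, significance on y, colored by threshold
volcano <- ggplot(wt_res_all) +
  geom_point(aes(x = log2FoldChange, y = -log10(padj), color = threshold)) +
  xlab("log2 fold change") +
  ylab("-log10 adjusted p-value") +
  theme(legend.position = "none")

print(volcano)
```

Taking -log10 of the adjusted p-value puts the most significant genes at the top of the plot, which gives the plot its volcano shape.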

5. Visualizing results - Volcano plot

We can zoom in on the volcano plot to better visualize the significance cut-off using the ylim() function within ggplot2.
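For illustration, zooming with ylim() could look like this. The toy_res data frame and the upper limit of 15 are assumptions chosen so that some points fall outside the zoomed range.

```r
library(ggplot2)

# Toy stand-in (assumption): adjusted p-values spanning many orders of magnitude
set.seed(1)
toy_res <- data.frame(
  log2FoldChange = rnorm(200),
  padj = 10^-runif(200, min = 0, max = 30)
)
toy_res$threshold <- toy_res$padj < 0.05

# Restrict the y-axis to 0-15 so the region near the cut-off is easier to see
zoomed <- ggplot(toy_res,
                 aes(x = log2FoldChange, y = -log10(padj), color = threshold)) +
  geom_point() +
  ylim(0, 15)
```

Note that ylim() drops points outside the limits (ggplot2 warns about removed rows when the plot is drawn). If the out-of-range points should be kept in the data and only the view changed, coord_cartesian(ylim = c(0, 15)) is an alternative.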

6. Visualizing results - Expression plot

One last plot that can be helpful is visualizing the expression of the top significant genes or any genes of interest. We will plot the top 20 significant genes to visualize the expression differences between our sample groups. To do this we will use our normalized counts for the significant genes ordered by adjusted p-values.

7. Visualizing results - Expression plot

To plot the normalized counts, we need to gather the normalized counts for our genes of interest into a single column. In our case, we are interested in plotting the first 20 genes. We can do this by turning our matrix into a data frame, subsetting the first 20 rows, then using the gather() function. For the gather() function, we need to specify our data frame and the names of our key and value columns. The last argument specifies the columns we want to gather into a single column of values.
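The reshaping step can be sketched as follows on toy data. The simulated matrix, the four sample names, and the column names samplename and normalized_counts are assumptions; the name top_20 is also hypothetical.

```r
library(dplyr)
library(tibble)
library(tidyr)

# Toy stand-in (assumption): 25 significant genes, already ordered by padj
set.seed(1)
samples <- c("normal1", "normal2", "fibrosis1", "fibrosis2")
sig_norm_counts_wt <- matrix(
  rpois(100, lambda = 150), nrow = 25,
  dimnames = list(paste0("gene", 1:25), samples)
)

# Matrix -> data frame, keep the first 20 genes, move gene IDs into a column
top_20 <- data.frame(sig_norm_counts_wt)[1:20, ] %>%
  rownames_to_column(var = "ensgene")

# Gather the four sample columns (columns 2 to 5) into key/value pairs
top_20 <- gather(top_20,
                 key = "samplename",
                 value = "normalized_counts",
                 2:5)

head(top_20)
```

The result has one row per gene and sample combination. In current tidyr, gather() is superseded by pivot_longer(), which could be used for the same reshaping.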

8. Visualizing results - Expression plot

To plot the expression, we want to merge in the metadata so that we can color the plot by sample group. We can use the inner_join() function to keep only the samples present in both datasets; since every sample should appear in both, we really could have used any of the join functions to merge the data frames. For the metadata, the sample names are stored as row names, so we need to turn them into a column to join on. Finally, to create the plot we can use ggplot() with geom_point(), plotting the gene IDs on the x-axis and the normalized counts on the y-axis. Then, we can color the points by condition. To more easily visualize the wide range of expression values, we use a log10 scale on the y-axis.
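A compact sketch of the join and the plot is shown here. The two-gene top_20 table, the metadata data frame, the join column name samplename, and the axis labels are assumptions for illustration.

```r
library(dplyr)
library(tibble)
library(ggplot2)

# Toy stand-ins (assumptions): long-format counts for two genes ...
top_20 <- data.frame(
  ensgene = rep(c("geneA", "geneB"), each = 4),
  samplename = rep(c("normal1", "normal2", "fibrosis1", "fibrosis2"), times = 2),
  normalized_counts = c(20, 25, 400, 380, 5, 8, 90, 120)
)
# ... and sample metadata with the sample names stored as row names
metadata <- data.frame(
  condition = c("normal", "normal", "fibrosis", "fibrosis"),
  row.names = c("normal1", "normal2", "fibrosis1", "fibrosis2")
)

# Turn the row names into a column, then join on the shared sample name
top_20 <- inner_join(top_20,
                     rownames_to_column(metadata, var = "samplename"),
                     by = "samplename")

# One point per gene and sample, colored by condition, on a log10 y-axis
expr_plot <- ggplot(top_20) +
  geom_point(aes(x = ensgene, y = normalized_counts, color = condition)) +
  scale_y_log10() +
  xlab("Genes") +
  ylab("Normalized counts") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(expr_plot)
```

Because every sample in the counts table has a matching metadata row, the join keeps all rows and simply adds the condition column.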

9. Visualizing results - Expression plot

We can see that all of the top 20 genes are up-regulated in the fibrosis condition, which could be informative about the pathways affected.

10. Let's practice!

Now let's try some examples.