DE analysis results

In the PCA and correlation heatmap, our samples clustered well on PC1, which appeared to capture the variation due to fibrosis, and on PC2, which appeared to capture the variation due to smoc2 overexpression. We found no additional sources of variation in the data and no outliers to remove. We can therefore proceed with running DESeq2, testing for differential expression, and shrinking the fold changes. We have performed these steps for you and stored the final results in res_all.

In this exercise, we'll subset the significant genes from the results and output the top DE genes ranked by adjusted p-value.

Use the subset() function to extract the genes in res_all with an adjusted p-value less than 0.05. Save the subset as a data frame named smoc2_sig by using the data.frame() function, and convert the row names into a column named geneID using the rownames_to_column() function.
Order the significant results by adjusted p-value using the arrange() function, select the columns containing the Ensembl gene ID and the adjusted p-value, and output the top significant genes using head().
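
The two steps above can be sketched as follows. This is a minimal illustration, not the exercise solution: it assumes the dplyr and tibble packages are installed, and it uses a small made-up data frame as a stand-in for res_all, since the real DESeq2 results object is not reproduced here. The gene IDs and values are hypothetical; the column name padj is the one DESeq2 uses for adjusted p-values.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for res_all: row names hold the Ensembl gene IDs,
# padj holds the adjusted p-values. All values are invented for illustration.
res_all <- data.frame(
  log2FoldChange = c(2.1, -0.3, 1.4, -1.8, 0.1),
  padj           = c(0.001, 0.40, 0.03, 0.0002, NA),
  row.names      = c("ENSMUSG00000000001", "ENSMUSG00000000002",
                     "ENSMUSG00000000003", "ENSMUSG00000000004",
                     "ENSMUSG00000000005")
)

# Step 1: keep genes with padj < 0.05, convert to a plain data frame,
# and move the row names into a geneID column.
# For a data frame, subset() drops rows where the condition is NA.
smoc2_sig <- subset(res_all, padj < 0.05) %>%
  data.frame() %>%
  rownames_to_column(var = "geneID")

# Step 2: sort by adjusted p-value (smallest first), keep the two columns
# of interest, and show the top rows. head() returns 6 rows by default;
# n = 10 is passed here to ask for up to 10.
top_genes <- smoc2_sig %>%
  arrange(padj) %>%
  select(geneID, padj) %>%
  head(n = 10)

print(top_genes)
```

With the real res_all, the same pipeline applies unchanged; only the mock data frame at the top would be dropped.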
