Matching metadata and counts data
To perform any analysis with DESeq2, we need to create a DESeq2 object by providing the raw counts, metadata, and design formula. To do this, we read in the raw counts data and the associated metadata we created previously, make sure the sample names are in the same order in both datasets, then create a DESeq2 object to use for differential expression analysis. We will use the design formula ~ condition to test for differential expression between conditions (normal and fibrosis).
The DESeq2 and dplyr libraries have been loaded for you, and the smoc2_rawcounts and smoc2_metadata files have been read in.
This is part of the course “RNA-Seq with Bioconductor in R”.
Exercise instructions
Use the match() function to return the indices for how to reorder the columns of the counts data to match the order of the row names of the metadata. Assign the result to reorder_idx.
Reorder the columns of the count data with reorder_idx such that the column names match the order of the row names in the metadata.
Create a DESeq2 object, dds_smoc2, with the DESeqDataSetFromMatrix() function using the metadata and reordered counts.
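The reordering step in the instructions above can be illustrated on toy data. This is a minimal sketch in base R; the objects toy_counts, toy_metadata, toy_idx, and toy_reordered, and the sample names in them, are hypothetical and are not the smoc2 data.

```r
# Toy data: the count columns are deliberately in a different order
# from the metadata rows. All names here are hypothetical.
toy_counts <- data.frame(
  sample_C = c(10L, 0L, 5L),
  sample_A = c(3L, 8L, 1L),
  sample_B = c(7L, 2L, 9L),
  row.names = c("gene1", "gene2", "gene3")
)
toy_metadata <- data.frame(
  condition = c("normal", "fibrosis", "fibrosis"),
  row.names = c("sample_A", "sample_B", "sample_C")
)

# For each metadata row name, find its position among the count column names.
toy_idx <- match(rownames(toy_metadata), colnames(toy_counts))
toy_idx  # 2 3 1

# Reorder the count columns with those indices.
toy_reordered <- toy_counts[, toy_idx]

# Confirm that the sample names now line up (TRUE).
all(rownames(toy_metadata) == colnames(toy_reordered))
```

Note the argument order: the first argument to match() holds the names in the target order, and the second holds the names to be searched, so the result indexes into the second.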
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Use the match() function to reorder the columns of the raw counts
reorder_idx <- match(___(___), ___(___))
# Reorder the columns of the count data
reordered_smoc2_rawcounts <- smoc2_rawcounts[ , ___]
# Create a DESeq2 object
dds_smoc2 <- DESeqDataSetFromMatrix(countData = ___,
                                    colData = ___,
                                    design = ~ condition)
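As a rough end-to-end illustration of the same pattern, the sketch below builds a DESeq2 object from a small made-up count matrix. It assumes the DESeq2 package is installed; toy_counts, toy_metadata, toy_idx, and dds_toy are hypothetical names, and the counts are invented for illustration only.

```r
library(DESeq2)

# Hypothetical toy inputs: 4 genes x 4 samples, with the count columns
# deliberately out of order relative to the metadata rows.
toy_counts <- matrix(
  c(12L, 0L, 7L, 30L,
    15L, 2L, 9L, 25L,
    40L, 1L, 3L, 10L,
    38L, 0L, 5L, 12L),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("s3", "s1", "s4", "s2"))
)
toy_metadata <- data.frame(
  condition = factor(c("normal", "normal", "fibrosis", "fibrosis")),
  row.names = c("s1", "s2", "s3", "s4")
)

# Reorder the count columns to follow the metadata row order.
toy_idx <- match(rownames(toy_metadata), colnames(toy_counts))
toy_counts <- toy_counts[, toy_idx]

# Stop early if the sample names still do not line up.
stopifnot(all(rownames(toy_metadata) == colnames(toy_counts)))

# Build the DESeq2 object with the same design formula as the exercise.
dds_toy <- DESeqDataSetFromMatrix(countData = toy_counts,
                                  colData = toy_metadata,
                                  design = ~ condition)
dim(dds_toy)
```

Checking the name order explicitly before constructing the object makes a mismatch visible at the point where it is easiest to diagnose.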
This exercise is part of the course RNA-Seq with Bioconductor in R
Use RNA-Seq differential expression analysis to identify genes likely to be important for different diseases or conditions.
In this chapter, we perform quality control on the RNA-Seq count data using heatmaps and principal component analysis. We explore the similarity of the samples to each other and determine whether there are any sample outliers.