
Matching metadata and counts data

To perform any analysis with DESeq2, we need to create a DESeq2 object by providing the raw counts, the metadata, and a design formula. To do this, we read in the raw counts and the associated metadata we created previously, make sure the sample names are in the same order in both datasets, and then create the DESeq2 object to use for differential expression analysis. We will use the design formula ~ condition to test for differential expression between conditions (normal and fibrosis).

The DESeq2 and dplyr libraries have been loaded for you, and the smoc2_rawcounts and smoc2_metadata files have been read in.

This is a part of the course

“RNA-Seq with Bioconductor in R”


Exercise instructions

  • Use the match() function to find the column indices that put the columns of the counts data in the same order as the row names of the metadata. Assign the result to reorder_idx.

  • Reorder the columns of the count data with reorder_idx such that the column names match the order of the row names in the metadata.

  • Create a DESeq2 object, dds_smoc2, with the DESeqDataSetFromMatrix() function, using the reordered counts and the metadata.
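The reordering step can be sketched on toy data. This is a minimal illustration in base R, not the exercise solution: the objects toy_counts and toy_metadata and their sample names are made up here, and only match() and standard matrix indexing are used.

```r
# Toy data; gene names, sample names, and values are invented for illustration.
toy_counts <- matrix(
  c(10, 0, 5,    # column "s3"
    20, 3, 8,    # column "s1"
    30, 1, 2),   # column "s2"
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("s3", "s1", "s2"))
)

toy_metadata <- data.frame(
  condition = c("normal", "fibrosis", "fibrosis"),
  row.names = c("s1", "s2", "s3")
)

# match(x, table) returns, for each element of x, its position in table.
# Putting the target order first gives the column indices needed to reorder.
idx <- match(rownames(toy_metadata), colnames(toy_counts))

# Reorder the columns so they follow the metadata row order.
reordered <- toy_counts[, idx]

print(idx)
print(colnames(reordered))
```

The argument order matters: the vector holding the desired order goes first, and the vector to be reordered goes second. If a sample name in the first vector is missing from the second, match() returns NA for it, so it is worth checking the result for NA values before indexing.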

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Use the match() function to reorder the columns of the raw counts
reorder_idx <- match(___(___), ___(___))

# Reorder the columns of the count data
reordered_smoc2_rawcounts <- smoc2_rawcounts[ , ___]

# Create a DESeq2 object
dds_smoc2 <- DESeqDataSetFromMatrix(countData = ___,
                                    colData = ___,
                                    design = ~ condition)
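Before building the DESeq2 object, it can help to confirm that the sample names really are aligned. The following is a sketch only: check_sample_order is a hypothetical helper written for this illustration, not a DESeq2 function, and the toy objects are made up.

```r
# Hypothetical helper (not part of DESeq2): stop with a clear message
# if the counts columns and metadata rows do not line up.
check_sample_order <- function(counts, metadata) {
  if (!setequal(colnames(counts), rownames(metadata))) {
    stop("Counts and metadata contain different sample names.")
  }
  if (!identical(colnames(counts), rownames(metadata))) {
    stop("Sample names match but are in a different order.")
  }
  invisible(TRUE)
}

# Toy inputs, invented for illustration.
aligned_counts <- matrix(1:6, nrow = 2,
                         dimnames = list(c("geneA", "geneB"),
                                         c("s1", "s2", "s3")))
shuffled_counts <- aligned_counts[, c("s3", "s1", "s2")]
toy_meta <- data.frame(condition = c("normal", "fibrosis", "fibrosis"),
                       row.names = c("s1", "s2", "s3"))

aligned_ok <- check_sample_order(aligned_counts, toy_meta)

# The shuffled matrix fails the check; capture the message instead of stopping.
shuffled_msg <- tryCatch(
  check_sample_order(shuffled_counts, toy_meta),
  error = function(e) conditionMessage(e)
)
print(shuffled_msg)
```

A one-line alternative with the same intent is all(rownames(metadata) == colnames(counts)), which should return TRUE once the columns have been reordered.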


Use RNA-Seq differential expression analysis to identify genes likely to be important for different diseases or conditions.

In this chapter, we perform quality control on the RNA-Seq count data using heatmaps and principal component analysis. We explore the similarity of the samples to each other and determine whether there are any sample outliers.

  • Exercise 1: Introduction to differential expression analysis

  • Exercise 2: Practice with the DESeq2 vignette

  • Exercise 3: Organizing the data for DESeq2

  • Exercise 4: Matching metadata and counts data

  • Exercise 5: Count normalization

  • Exercise 6: Normalizing counts with DESeq2

  • Exercise 7: Hierarchical heatmap

  • Exercise 8: Hierarchical heatmap by condition

  • Exercise 9: Hierarchical heatmap analysis

  • Exercise 10: Principal component analysis

  • Exercise 11: PCA analysis

  • Exercise 12: PCA practice: exploring variations

  • Exercise 13: PCA practice: exploring additional variations
