RNA-Seq DE analysis summary - setup
1. RNA-Seq DE analysis summary
Now that we have made our way through the entire differential expression workflow, we can explore our list of differentially expressed genes for relevant or expected genes, or use the results for downstream analyses. To solidify our understanding of the differential expression analysis, let's briefly review the steps leading to the identification of significant DE genes.
2. RNA-Seq Workflow: Sample prep
The first step in a successful RNA-Seq DE analysis is a well-planned experiment. A well-planned experiment should avoid batch effects, should divide known major sources of variation, such as different sexes or ages, equally between sample groups, and should have a good number of biological replicates, preferably more than 3. The more biological replicates we have, the better our estimates of mean expression and variation, and therefore the better our ability to detect DE genes. Also, if we need to remove an outlier sample, we will still have biological replicates for the analysis.
3. RNA-Seq Workflow: Sample prep
After you have designed a well-planned experiment, the sample libraries are created for sequencing. The samples are harvested, the RNA is isolated, and DNA contamination is removed. Either the rRNA is removed or mature mRNAs are selected by their polyA tails. For Illumina sequencing, the RNA is turned into cDNA, fragmented, and size-selected, and adapters are added to generate the RNA-Seq libraries. Either a single end or both ends of the fragments are sequenced, generating millions of nucleotide sequences called reads. The sequences of the reads and their quality information are output into FASTQ files.
4. RNA-Seq Workflow: Quality control
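As a concrete picture of what the FASTQ files hold, here is a minimal Python sketch, not part of the course workflow, that reads four-line FASTQ records and decodes the quality string into Phred scores. It assumes the common Phred+33 encoding; the two reads are made up for illustration, and the helper names `read_fastq` and `phred_scores` are hypothetical.

```python
import io

def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality  # drop the leading '@'

def phred_scores(quality, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 is an assumption)."""
    return [ord(ch) - offset for ch in quality]

# Made-up two-read example, not real sequencing data
example = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nTTGCA\n+\nII##!\n"
)
records = list(read_fastq(example))

# Mean base quality per read: a very rough first look at raw data quality
mean_quality = {rid: sum(phred_scores(q)) / len(q) for rid, _seq, q in records}
```

In practice a dedicated quality control tool would be run on the FASTQ files; the sketch only shows where the per-base quality information comes from.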
With the sequenced reads in the FASTQ files, a series of analytical steps is performed on the command line, beginning with the assessment of the raw data quality.
5. RNA-Seq Workflow: Raw data quality control
At this step, we make sure nothing went wrong at the sequencing facility and explore the data for contamination, such as vector, adapter, or ribosomal sequences.
6. RNA-Seq Workflow: Alignment
The next step is alignment or mapping of the reads to the genome to determine the location on the genome where the reads originated.
7. RNA-Seq Workflow: Alignment
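To illustrate how an alignment records where a read originated, here is a minimal Python sketch that interprets a SAM-style CIGAR string. In the SAM format, the `N` operation marks a region of the reference that is skipped, which is how aligners that handle RNA-Seq data represent an intron inside a read. The function names and the example coordinates are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases according to the SAM specification
CONSUMES_REFERENCE = set("MDN=X")

def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]

def split_at_introns(start, cigar):
    """Return (aligned_blocks, introns) as 1-based inclusive (start, end) tuples.

    Only 'N' splits a block; deletions stay inside the surrounding block.
    """
    blocks, introns = [], []
    block_start = start
    pos = start
    for length, op in parse_cigar(cigar):
        if op == "N":
            blocks.append((block_start, pos - 1))
            introns.append((pos, pos + length - 1))
            pos += length
            block_start = pos
        elif op in CONSUMES_REFERENCE:
            pos += length
    blocks.append((block_start, pos - 1))
    return blocks, introns

# Hypothetical read: 50 bases aligned, a 2000-base skipped region, 50 bases aligned
blocks, introns = split_at_introns(1000, "50M2000N50M")
# Hypothetical read that aligns without any skipped region
unspliced_blocks, unspliced_introns = split_at_introns(5, "100M")
```

A read with no `N` in its CIGAR string yields a single block and no introns.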
The reads derived from mRNA are often aligned to the organism's genome, but some of these reads span introns. Therefore, tools for aligning RNA-Seq reads to the genome need to be splice-aware, that is, able to align reads across introns. The output of alignment gives the genome coordinates where each read most likely originated, along with information about the quality of the mapping.
8. RNA-Seq Workflow: Quantitation
Following alignment, the reads aligning to the exons of each gene are quantified to yield a matrix of gene counts.
9. RNA-Seq Workflow: Quantitation
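The counting step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular counting tool works: the exon annotation, the sample names, and the aligned read coordinates are all made up, and reads that overlap no gene or more than one gene are simply discarded.

```python
from collections import Counter

# Hypothetical exon annotation: gene -> list of (chrom, start, end), 1-based inclusive
EXONS = {
    "geneA": [("chr1", 100, 200), ("chr1", 400, 500)],
    "geneB": [("chr1", 1000, 1200)],
}

def overlaps(a_start, a_end, b_start, b_end):
    """True if two inclusive intervals share at least one base."""
    return a_start <= b_end and b_start <= a_end

def assign_gene(chrom, start, end, exons=EXONS):
    """Return the single gene whose exons the read overlaps, else None."""
    hits = {
        gene
        for gene, gene_exons in exons.items()
        for e_chrom, e_start, e_end in gene_exons
        if e_chrom == chrom and overlaps(start, end, e_start, e_end)
    }
    return hits.pop() if len(hits) == 1 else None

def count_matrix(alignments_by_sample, exons=EXONS):
    """Build a genes x samples matrix of read counts."""
    genes = sorted(exons)
    samples = sorted(alignments_by_sample)
    counts = {s: Counter() for s in samples}
    for s in samples:
        for chrom, start, end in alignments_by_sample[s]:
            gene = assign_gene(chrom, start, end, exons)
            if gene is not None:
                counts[s][gene] += 1
    return genes, samples, [[counts[s][g] for s in samples] for g in genes]

# Made-up aligned reads as (chrom, start, end)
alignments = {
    "sample1": [("chr1", 150, 199), ("chr1", 450, 520),
                ("chr1", 1100, 1150), ("chr1", 700, 750)],
    "sample2": [("chr1", 120, 170), ("chr1", 1000, 1050), ("chr1", 1150, 1250)],
}
genes, samples, matrix = count_matrix(alignments)
```

Each row of `matrix` corresponds to a gene and each column to a sample, which is the shape of the count matrix used in the rest of the workflow.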
The number of reads aligning to each of the genes is given in the count matrix and represents the expression level of the gene. The more reads aligning to the gene, the higher the expression of the gene, indicating more RNA transcripts were expressed.
10. RNA-Seq Workflow: Differential expression
Once we have count data, differential expression analysis is performed. With differential expression analysis, our goal is to determine whether the gene counts differ significantly between the sample groups, given the variation in the counts within each sample group.
11. Preparation for differential expression analysis: DESeq2 object
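Before building the DESeq2 object, the idea of weighing the difference between groups against the variation within groups can be illustrated with a toy calculation in Python. This is deliberately not the DESeq2 method, which fits a negative binomial model to normalized counts; the sketch swaps in a Welch-style t-statistic on log2-transformed counts purely to show the principle. The counts and the function names are made up.

```python
import math
from statistics import mean, variance

def log2_counts(counts, pseudocount=1):
    """Log2-transform counts, adding a pseudocount so zeros are defined."""
    return [math.log2(c + pseudocount) for c in counts]

def welch_t(group_a, group_b):
    """Return (log2 difference of means, Welch-style t-statistic) for one gene."""
    a, b = log2_counts(group_a), log2_counts(group_b)
    diff = mean(b) - mean(a)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return diff, math.inf if diff != 0 else 0.0
    return diff, diff / se

# Made-up counts for one gene, three biological replicates per group
tight_control, tight_treated = [100, 110, 90], [400, 420, 380]
noisy_control, noisy_treated = [100, 300, 20], [400, 50, 900]

diff_tight, t_tight = welch_t(tight_control, tight_treated)
diff_noisy, t_noisy = welch_t(noisy_control, noisy_treated)
```

Both pairs of groups show higher average counts in the treated samples, but the statistic is much larger for the replicates that agree closely with one another, which is why within-group variation matters when calling a gene differentially expressed.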
To start the differential expression analysis, we use the `DESeqDataSetFromMatrix()` function, which takes a raw count matrix, the sample metadata, and a design formula as input and creates the DESeq2 object. The design formula should contain the major expected sources of variation to control for, with the condition of interest as the last term. If the count data is stored in a SummarizedExperiment object, comes from the htseq-count tool, or was generated by pseudo-alignment tools, DESeq2 provides other functions for creating the DESeq2 object, as detailed in the vignette.
12. Let's practice!
Time to put this into practice.