Filter genes
Now that the data have been log-transformed and quantile-normalized, you need to remove the lowly expressed genes that are not relevant to the system being studied.
This exercise is part of the course
Differential Expression Analysis with limma in R
Exercise instructions
The ExpressionSet object eset_norm with the normalized Populus data has been loaded in your workspace.
Use plotDensities to visualize the distribution of gene expression levels for each sample. Disable the legend.
Use rowMeans to determine which genes have a mean expression level greater than 5. Name this logical vector keep.
Filter the genes (i.e. rows) of the ExpressionSet object with the logical vector keep and re-visualize.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
library(limma)
# Create new ExpressionSet to store filtered data
eset <- eset_norm
# View the normalized gene expression levels
___(eset, legend = ___); abline(v = 5)
# Determine the genes with mean expression level greater than 5
keep <- ___(exprs(eset)) > ___
sum(keep)
# Filter the genes
eset <- eset[___]
___(eset, legend = ___)
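The row-mean filter described in the instructions can be tried outside the course workspace. The sketch below is only an illustration in base R: the matrix expr is simulated data standing in for exprs(eset), and the names expr, gene_means and expr_filtered are hypothetical, not part of the exercise. It leaves out the density plots and does not use the Populus data.

```r
# Simulated stand-in for exprs(eset): 1000 genes (rows) x 6 samples (columns).
# Assumption: 600 lowly expressed genes near 3 and 400 expressed genes near 9.
set.seed(1)
n_genes <- 1000
n_samples <- 6
gene_means <- c(rnorm(600, mean = 3, sd = 0.5),
                rnorm(400, mean = 9, sd = 1.5))

# The matrix is filled column by column, so the recycled mean vector
# gives every row (gene) its own mean across all samples.
expr <- matrix(rnorm(n_genes * n_samples, mean = gene_means, sd = 0.3),
               nrow = n_genes, ncol = n_samples,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("sample", seq_len(n_samples))))

# Logical vector: TRUE for genes whose mean expression exceeds the cutoff of 5
keep <- rowMeans(expr) > 5
sum(keep)  # number of genes that pass the filter

# Subset the rows with the logical vector; drop = FALSE keeps a matrix
# even if only one gene passes.
expr_filtered <- expr[keep, , drop = FALSE]
dim(expr_filtered)
```

With an ExpressionSet, the same logical vector can be used to subset rows directly, because ExpressionSet objects support matrix-style indexing by features and samples.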