Filtering reads
In this exercise, you will further clean up the data by removing reads with low-quality alignments. To do this, you need to load the alignment qualities from the BAM file, where they are stored in the mapq field.
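As a quick sanity check before loading anything, you can list the fields that may be requested from a BAM file. The sketch below assumes the Rsamtools package is installed. The variable name available_fields is only illustrative and is not part of the exercise.

```r
library(Rsamtools)

# List the BAM fields that can be requested via ScanBamParam(what=...)
available_fields <- scanBamWhat()
print(available_fields)

# Check that the mapping quality field is among them
print("mapq" %in% available_fields)
```

Any of the listed names can be passed to the what argument of ScanBamParam() to attach that field to the loaded reads.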
This exercise is part of the course
ChIP-seq with Bioconductor in R
Exercise instructions
- Load reads with information about the alignment qualities attached to each read.
- Identify all alignments with a quality of at least 20.
- Create a boxplot comparing alignment quality distributions between the high- and low-quality groups.
- Remove all low-quality alignments.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load reads with mapping qualities by requesting the "mapq" entries
reads <- readGAlignments(bam_file, param=ScanBamParam(what=___))
# Identify good quality alignments
high_mapq <- mcols(reads)$mapq >= ___
# Examine mapping quality distribution for high and low quality alignments
___(mcols(reads)$mapq ~ high_mapq, xlab="good quality alignments", ylab="mapping quality")
# Remove low quality alignments
reads_good <- subset(reads, ___)
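
For orientation, here is a minimal sketch of how the same four steps could look end to end. It is one possible approach, not necessarily the course's reference solution. It assumes the GenomicAlignments and Rsamtools packages are installed, and it uses the small example BAM file shipped with Rsamtools as a stand-in for the exercise's bam_file.

```r
library(GenomicAlignments)  # provides readGAlignments()
library(Rsamtools)          # provides ScanBamParam()

# Assumption: the example BAM bundled with Rsamtools stands in for the course data
bam_file <- system.file("extdata", "ex1.bam", package="Rsamtools")

# Load reads and attach the "mapq" field as a metadata column
reads <- readGAlignments(bam_file, param=ScanBamParam(what="mapq"))

# Flag alignments with a mapping quality of at least 20
high_mapq <- mcols(reads)$mapq >= 20

# Compare mapping quality distributions between the two groups
boxplot(mcols(reads)$mapq ~ high_mapq,
        xlab="good quality alignments", ylab="mapping quality")

# Keep only the alignments flagged as high quality
reads_good <- subset(reads, high_mapq)

# Report how many alignments were read and how many remain
print(c(all=length(reads), kept=length(reads_good)))
```

The threshold of 20 comes from the exercise instructions. Because high_mapq is an ordinary logical vector, the same filter can be reused later, for example to count how many alignments were discarded.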