Removing blacklisted regions
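Identifying and removing peaks that fall in blacklisted regions is an important step in preparing the data for further analysis, because these regions tend to produce artificially high signal that can be mistaken for real peaks. For this exercise, we use the blacklist included in the ChIPQC package. The same blacklist is also available directly from ENCODE.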
For the purpose of this exercise, peak calls are available in peaks, coverage data is in cover, and blacklisted regions are in blacklist.hg19. The findOverlaps() function will be useful here. You've encountered the concept of overlapping regions in the introductory Bioconductor course and we will revisit it later in this chapter.
It may take a moment to load all required data and R packages for this exercise. Please be patient.
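As a reminder of how findOverlaps() behaves, here is a minimal sketch on toy data. The objects toy_peaks and toy_blacklist and their coordinates are hypothetical and are not the course's peaks or blacklist.hg19. The sketch assumes the GenomicRanges package is installed and only illustrates how type="within" differs from the default type="any".

```r
library(GenomicRanges)

# Hypothetical toy data, not the exercise's objects
toy_peaks <- GRanges("chr21",
                     IRanges(start = c(100, 500, 900), end = c(200, 600, 1000)))
toy_blacklist <- GRanges("chr21",
                         IRanges(start = c(450, 950), end = c(700, 1200)))

# type = "within": a peak counts only if it lies entirely inside a blacklisted region
hits_within <- findOverlaps(toy_peaks, toy_blacklist, type = "within")

# type = "any" (the default): any shared base counts as an overlap
hits_any <- findOverlaps(toy_peaks, toy_blacklist, type = "any")

from(hits_within)  # 2: only the second peak is fully contained
from(hits_any)     # 2 3: the third peak overlaps a region only partially
```

With type="within", a peak that merely straddles the edge of a blacklisted region is not reported, so it would survive the filtering step. Whether that is desirable depends on the analysis.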
This exercise is part of the course
ChIP-seq with Bioconductor in R
Exercise instructions
- Find all overlaps between peaks and blacklisted regions.
- Plot read coverage, peak calls, and blacklisted regions using Gviz.
- Remove all blacklisted peaks.
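The plotting step in the instructions above could look roughly like the following sketch. It assumes the Gviz and GenomicRanges packages are installed, and it assumes that DataTrack() and AnnotationTrack() are suitable constructors for coverage and region tracks; the source does not say which functions the exercise expects. All toy_* objects and coordinates are hypothetical, and the ideogram track is left out because building one normally requires a download.

```r
library(GenomicRanges)
library(Gviz)

# Hypothetical toy coverage: one numeric metadata column used as the signal
toy_cover <- GRanges("chr21",
                     IRanges(start = seq(1000, 1900, by = 100), width = 100),
                     score = c(1, 2, 5, 9, 12, 10, 6, 3, 2, 1))
toy_peaks <- GRanges("chr21", IRanges(start = 1300, end = 1600))
toy_region <- GRanges("chr21", IRanges(start = 1200, end = 1700))

# Coverage as a filled polygon, peaks and blacklist as annotation boxes
cover_track <- DataTrack(toy_cover, genome = "hg19", type = "polygon",
                         name = "Coverage",
                         fill.mountain = c("lightgrey", "lightgrey"),
                         col.mountain = "grey")
peak_track <- AnnotationTrack(toy_peaks, genome = "hg19",
                              name = "Peaks", fill = "orange")
region_track <- AnnotationTrack(toy_region, genome = "hg19", name = "Blacklist")

# Draw all tracks with some padding around the region of interest
plotTracks(list(cover_track, peak_track, region_track, GenomeAxisTrack()),
           chromosome = "chr21",
           from = start(toy_region) - 200, to = end(toy_region) + 200)
```

Stacking the three tracks makes it easy to see whether a called peak sits inside a blacklisted region and how the coverage behaves there.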
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Find all overlaps between peaks and blacklisted regions
blacklisted <- ___(peaks, blacklist.hg19, type="within")
# Create a plot to display read coverage together with peak calls and blacklisted regions in the selected region
cover_track <- ___(cover, window=10500, type="polygon", name="Coverage",
                   fill.mountain=c("lightgrey", "lightgrey"), col.mountain="grey")
# Create peak_track and region_track, then plot all tracks with plotTracks()
peak_track <- ___(peaks, name="Peaks", fill="orange")
region_track <- ___(region, name="Blacklist")
plotTracks(list(ideogram, cover_track, peak_track, region_track, GenomeAxisTrack()),
           chromosome="chr21", from=start(region)-1000, to=end(region)+1000)
# Remove all blacklisted peaks
clean_peaks <- ___[-from(blacklisted)]
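
One caveat about the negative-index idiom in the last step: if no peak overlaps the blacklist, from(blacklisted) is empty, and indexing with an empty negative index returns zero peaks rather than all of them. The sketch below shows this on hypothetical toy data (toy_peaks and far_blacklist are not objects from the exercise) and offers a logical mask as one possible alternative. It assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)

# Hypothetical toy data in which no peak is blacklisted
toy_peaks <- GRanges("chr21",
                     IRanges(start = c(100, 500, 900), end = c(200, 600, 1000)))
far_blacklist <- GRanges("chr21", IRanges(start = 5000, end = 6000))

no_hits <- findOverlaps(toy_peaks, far_blacklist, type = "within")

# Negative indexing with an empty index selects nothing
length(toy_peaks[-from(no_hits)])  # 0

# A logical mask keeps every peak when there are no hits
keep <- !(seq_along(toy_peaks) %in% from(no_hits))
clean_toy <- toy_peaks[keep]
length(clean_toy)                  # 3
```

When at least one peak is blacklisted, both forms give the same result, so the mask only matters in the empty case.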