Removing blacklisted regions

Identifying and removing peaks in blacklisted regions is an important step in preparing the data for further analysis, because these regions tend to show artificially high signal regardless of the experiment. For this exercise, we use the blacklist included in the ChIPQC package. It is also available directly from ENCODE.

For this exercise, peak calls are available in peaks, coverage data in cover, and blacklisted regions in blacklist.hg19. The findOverlaps() function will be useful here. You encountered the concept of overlapping regions in the introductory Bioconductor course, and we will revisit it later in this chapter.
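
As a reminder of how overlap queries behave, here is a minimal sketch on toy data. The objects toy_peaks and toy_blacklist and all of their coordinates are made up for illustration and are not part of the exercise data. It contrasts type="within", which is used in the sample code, with the default type="any".

```r
# Toy illustration only: all ranges below are invented, not exercise data.
library(GenomicRanges)

toy_peaks <- GRanges("chr21",
                     IRanges(start = c(100, 500, 900), end = c(200, 650, 1000)))
toy_blacklist <- GRanges("chr21",
                         IRanges(start = c(450, 950), end = c(700, 980)))

# type = "within": report a peak only if it lies entirely inside a
# blacklisted region.
hits_within <- findOverlaps(toy_peaks, toy_blacklist, type = "within")

# Default type = "any": report every peak that shares at least one base
# with a blacklisted region.
hits_any <- findOverlaps(toy_peaks, toy_blacklist)

# from() gives indices into the query (peaks), to() gives indices into
# the subject (blacklist).
from(hits_within)
to(hits_within)
from(hits_any)
```

In this toy case the third peak is wider than the blacklisted region it covers, so it is reported with type="any" but not with type="within". Which of the two behaviours is appropriate for a real analysis is a judgment call.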

It may take a moment to load all required data and R packages for this exercise. Please be patient.

This exercise is part of the course

ChIP-seq with Bioconductor in R

Exercise instructions

  • Find all overlaps between peaks and blacklisted regions.
  • Plot read coverage, peak calls, and blacklisted regions using Gviz.
  • Remove all blacklisted peaks.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Find all overlaps between peaks and blacklisted regions
blacklisted <- ___(peaks, blacklist.hg19, type="within")

# Create a plot to display read coverage together with peak calls and blacklisted regions in the selected region
cover_track <- ___(cover, window=10500, type="polygon", name="Coverage",
                         fill.mountain=c("lightgrey", "lightgrey"), col.mountain="grey")

# Create peak_track and region_track, then draw all tracks with plotTracks()
peak_track <- ___(peaks, name="Peaks", fill="orange")
region_track <- ___(region, name="Blacklist")
plotTracks(list(ideogram, cover_track, peak_track, region_track, GenomeAxisTrack()),
           chromosome="chr21", from=start(region)-1000, to=end(region)+1000)

# Remove all blacklisted peaks
clean_peaks <- ___[-from(blacklisted)]
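
One caveat about negative indexing is worth knowing: if no peak overlaps the blacklist, from() returns an empty vector, and indexing with it drops every peak instead of keeping them all. The sketch below shows this on invented toy ranges (toy_peaks, toy_blacklist, and far_away are hypothetical names, not exercise objects), together with a logical mask built with overlapsAny() as one possible alternative.

```r
# Toy illustration only: all ranges below are invented, not exercise data.
library(GenomicRanges)

toy_peaks <- GRanges("chr21",
                     IRanges(start = c(100, 500, 900), end = c(200, 650, 1000)))
toy_blacklist <- GRanges("chr21", IRanges(start = 450, end = 700))

# Approach used in the exercise: drop peaks by their query index.
hits <- findOverlaps(toy_peaks, toy_blacklist, type = "within")
clean_by_index <- toy_peaks[-from(hits)]

# Alternative: a logical mask, which stays correct when there are no hits.
is_blacklisted <- overlapsAny(toy_peaks, toy_blacklist, type = "within")
clean_by_mask <- toy_peaks[!is_blacklisted]

# Pitfall: with zero hits, -from(no_hits) is an empty index, so the
# result contains no peaks at all.
far_away <- GRanges("chr21", IRanges(start = 5000, end = 6000))
no_hits <- findOverlaps(toy_peaks, far_away, type = "within")
length(toy_peaks[-from(no_hits)])
length(toy_peaks[!overlapsAny(toy_peaks, far_away, type = "within")])
```

With the exercise data there are blacklisted peaks, so the index-based form behaves as intended. The mask form is simply more defensive when the same code is reused on other data.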