Taking a closer look at peaks

1. Taking a closer look at peaks

Now that you know how to import ChIP-seq data into R, it is time for a closer look at what these peaks actually look like. In this video, you will learn how to visualise individual peaks in their genomic context.

2. Using Gviz

The *Gviz* package provides functions to facilitate the plotting of genomic data. *Gviz* organises data in tracks, each aligned to the same genomic coordinates, as you can see here. This makes it easy to combine data from different sources into a single plot. Before we get into the details of how to create a plot like this, let's take a closer look at what it is exactly that we are plotting here.

3. What are we plotting?

At the top of the plot is an ideogram of the chromosome we are looking at. This tells us which chromosome we are dealing with and where, roughly, the data plotted below are located on that chromosome.

4. What are we plotting?

Next, we have the coverage track. Read coverage is computed as the number of reads overlapping a given position in the genome. For example, a coverage of 5 means that 5 reads, possibly all starting at different positions, include this location in their alignment. This provides a summary of where reads are located without displaying details about individual alignments. You'll learn more about how to calculate coverage in R later in this chapter.
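To make the definition concrete, here is a minimal base R sketch that counts, for each position, how many reads include it. The read coordinates are made up for illustration, and this is not the method used later in the chapter, just the definition written out.

```r
# Hypothetical toy reads: 1-based, inclusive start and end positions
# on a single chromosome (values invented for illustration).
read_start <- c(1, 3, 3, 6)
read_end   <- c(5, 7, 4, 8)

# Coverage at a position is the number of reads whose alignment includes it.
positions <- 1:10
coverage <- vapply(
  positions,
  function(pos) sum(read_start <= pos & read_end >= pos),
  integer(1)
)
names(coverage) <- positions

print(coverage)
```

Positions 3 and 4 are included in three of the four reads, so they have the highest coverage in this toy example, while positions 9 and 10 are not covered at all.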

5. What are we plotting?

The coverage track is followed by a set of annotation tracks. These highlight the location of certain features relative to the read coverage. In this case we have one track showing peak calls and one visualizing transcript annotations for genes located in this part of the genome.

6. What are we plotting?

Finally, we have the axis track that provides more detailed information about the location on the chromosome. Now let's take a look at the steps involved in assembling a plot like this.

7. Setting up coordinates

Before we can do any plotting we'll have to load the *Gviz* package. We'll start by providing some context for the data we are about to plot. An ideogram at the top allows us to show the location on the chromosome we are looking at. You can create this with the `IdeogramTrack()` function by providing the name of the chromosome and reference genome. Next, we'll use the `GenomeAxisTrack()` function to show the coordinates of the plotted region. All of this is combined into a plot by the `plotTracks()` function. Using the `from` and `to` arguments we can restrict the range of the plot.
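These steps might look roughly like the sketch below. The chromosome, genome build and coordinates are placeholders chosen for illustration, not values taken from the video, and `IdeogramTrack()` downloads chromosome band data from UCSC, so it needs an internet connection.

```r
library(Gviz)

# Assumed region of interest (placeholder values, not from the video).
chrom <- "chrX"
region_start <- 66000000
region_end <- 67000000

# Ideogram of the chosen chromosome for the hg19 reference genome.
# The band information is fetched from UCSC, so this needs network access.
ideogram <- IdeogramTrack(chromosome = chrom, genome = "hg19")

# Axis showing the genomic coordinates of the plotted region.
axis <- GenomeAxisTrack()

# Combine both tracks into one plot, restricted to the region of interest.
plotTracks(list(ideogram, axis),
           chromosome = chrom, from = region_start, to = region_end)
```

The order of the list passed to `plotTracks()` determines the order of the tracks from top to bottom.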

8. Adding Data

Now we are ready to add our own data to the plot. Here we display read coverage data. *Gviz* expects this to be provided as a *GRanges* object. To add the coverage data to the plot we create a DataTrack that we'll place between the ideogram and axis. The `DataTrack()` function accepts a number of parameters to adjust the display. The `window` parameter sets the number of equally sized windows the data are aggregated into, so using a large value causes the data to be displayed at high resolution. We also request a histogram-like display by setting `type` to *h* and set the track name to *Coverage*.
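A coverage track along these lines could be sketched as follows. The coverage values are invented, the `GRanges` object is built by hand rather than computed from reads, and the `window` value of 1000 is an assumption chosen to suit this toy data set. The ideogram is left out so that the sketch runs without a network connection.

```r
library(GenomicRanges)
library(Gviz)

# Hypothetical coverage: 5000 adjacent 200 bp bins with made-up scores
# stored in a metadata column (values invented for illustration).
n_bins <- 5000
cover_gr <- GRanges(
  seqnames = "chrX",
  ranges = IRanges(start = seq(66000001, by = 200, length.out = n_bins),
                   width = 200),
  score = seq_len(n_bins) %% 50
)

# Data track: aggregate into 1000 windows (assumed value), draw
# histogram-like vertical lines, and label the track "Coverage".
cover_track <- DataTrack(cover_gr, genome = "hg19",
                         window = 1000, type = "h", name = "Coverage")

# Axis track placed below the data; an ideogram would go first in the list.
axis <- GenomeAxisTrack()

plotTracks(list(cover_track, axis),
           chromosome = "chrX", from = 66000001, to = 67000000)
```

Lowering `window` aggregates the scores into fewer, wider bins, which gives a coarser picture of the same data.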

9. Adding Annotations

In addition to the coverage, it is very useful to also display the peak calls. The `AnnotationTrack()` function allows us to create a track with this information, which we can then pass to the `plotTracks()` call.
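As a rough sketch, peak calls stored in a `GRanges` object can be turned into an annotation track like this. The two peaks below are hypothetical and stand in for real peak calls; the track name and the plotted region are likewise assumptions.

```r
library(GenomicRanges)
library(Gviz)

# Hypothetical peak calls (coordinates invented for illustration).
peaks <- GRanges(
  seqnames = "chrX",
  ranges = IRanges(start = c(66100000, 66400000),
                   end   = c(66120000, 66430000))
)

# Annotation track showing each peak as a box.
peak_track <- AnnotationTrack(peaks, genome = "hg19", name = "Peaks")

# Axis for orientation; other tracks would be added to the same list.
axis <- GenomeAxisTrack()

plotTracks(list(peak_track, axis),
           chromosome = "chrX", from = 66000000, to = 66500000)
```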

10. Gene Annotations

To provide further context it is helpful to also display existing genomic annotations. One useful annotation to add is the location of genes. Here we use the `TxDb` package for the hg19 reference genome. You can pass this directly to the `GeneRegionTrack()` function to plot transcript annotations for a region defined by the `chromosome`, `start` and `end` arguments.
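A gene annotation track might be set up as in the sketch below. It assumes the `TxDb.Hsapiens.UCSC.hg19.knownGene` Bioconductor package as the hg19 `TxDb` object, which the video does not name explicitly, and it uses placeholder coordinates.

```r
library(Gviz)
# Assumed annotation package for hg19 (not named in the video).
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# Transcript annotations for the region given by chromosome, start and end
# (placeholder coordinates chosen for illustration).
gene_track <- GeneRegionTrack(txdb,
                              chromosome = "chrX",
                              start = 66000000, end = 67000000,
                              name = "Genes")

# Axis for orientation, followed by the gene models.
axis <- GenomeAxisTrack()

plotTracks(list(axis, gene_track),
           chromosome = "chrX", from = 66000000, to = 67000000)
```

Restricting the track to a region when it is created avoids extracting annotations for the whole genome when only a small window is plotted.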

11. Let's practice!

Now let's try some examples.