Exercise

Try your own nucleotide frequency plot

Now it's time to take a closer look at the frequency of nucleotides per cycle. The best way to do this is with a visualization. The first few cycles usually fluctuate, after which the nucleotide frequencies should stabilize over the remaining cycles.
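
To make "frequency of nucleotides per cycle" concrete, here is a minimal base R sketch that counts each nucleotide at every read position. The three toy reads are made up for illustration, and this is not how ShortRead computes its counts internally; it only shows the shape of the result (one row per cycle, one column per nucleotide).

```r
# Toy reads (hypothetical, all of equal length); not taken from SRR1971253
reads <- c("ACGT", "AAGT", "TCGA")

# One row per read, one column per cycle
bases <- do.call(rbind, strsplit(reads, ""))

# For each nucleotide, count how often it occurs at each cycle.
# Result: rows = cycles, columns = nucleotides A, C, G, T
nucs <- c("A", "C", "G", "T")
toyByCycle <- vapply(nucs,
                     function(n) colSums(bases == n),
                     numeric(ncol(bases)))
toyByCycle
```

Each row sums to the number of reads, so dividing a row by that total would turn the counts into proportions.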

This exercise uses the complete fastq file SRR1971253, with some pre-processing already done for you:

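library(ShortRead)
library(tidyverse) # provides %>%, as_tibble() and mutate()

# read the fastq file
fqsample <- readFastq(dirPath = "data",
                      pattern = "SRR1971253.fastq")

# extract reads and count each letter per cycle
abc <- alphabetByCycle(sread(fqsample))

# Keep rows A, C, G, T and transpose: one row per cycle, one column per nucleotide
nucByCycle <- t(abc[1:4,])

# Tidy dataset
nucByCycle <- nucByCycle %>%
  as_tibble() %>% # convert to tibble (as.tibble() is deprecated)
  mutate(cycle = 1:50) # add cycle numbers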

Your task is to make a Nucleotide Frequency by Cycle plot using tidyverse functions!

Instructions
  • glimpse() the object nucByCycle to get a view of the data.
  • gather() the nucleotide columns into a key column named alphabet and a value column named count, leaving cycle untouched.
  • Make a line plot of cycle on the x-axis vs count on the y-axis, colored by alphabet.
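
One possible way to work through the three steps is sketched below. So that it runs without the fastq file, it uses a small made-up stand-in for nucByCycle (four cycles, hypothetical counts); with the real object from the pre-processing step, only the toy tibble would be dropped. The names nucLong and p are illustrative choices, not part of the exercise.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy stand-in for nucByCycle (hypothetical counts, not from SRR1971253)
nucByCycle <- tibble(
  A = c(30, 26, 25, 25),
  C = c(20, 24, 25, 25),
  G = c(22, 25, 25, 25),
  T = c(28, 25, 25, 25),
  cycle = 1:4
)

# Step 1: inspect the columns and their types
glimpse(nucByCycle)

# Step 2: reshape from wide to long, one row per (cycle, nucleotide) pair
nucLong <- nucByCycle %>%
  gather(key = "alphabet", value = "count", -cycle)

# Step 3: one line per nucleotide across cycles
p <- ggplot(nucLong, aes(x = cycle, y = count, color = alphabet)) +
  geom_line()
p
```

In current tidyr, gather() is superseded by pivot_longer(); gather() is used here because the instructions ask for it.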