Exercise

Exploring a fastq file

Fastq files usually contain thousands to millions of reads and can become very large. For this exercise, you will use a small fastq subsample of 500 reads, which fits easily into memory and can be read in full with the function readFastq().
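
As a rough sketch of the difference between reading a whole file and drawing a subsample, the snippet below builds a made-up five-read fastq file in a temporary location, reads it in full with readFastq(), and then draws three reads at random with FastqSampler() and yield() from ShortRead. The file contents and the names toy_path, fq_all and fq_sub are invented for illustration, and this is not necessarily how the 500-read subsample for this exercise was produced.

```r
suppressPackageStartupMessages(library(ShortRead))

# Build a tiny, made-up fastq file.
# Each read takes 4 lines: id, sequence, "+", and one quality character per base.
reads <- c("ACGTACGTAC", "TTGCATGCAA", "GGCATCGATC", "CATGCATGCA", "ATATCGCGTA")
records <- unlist(lapply(seq_along(reads), function(i) {
  c(paste0("@toy_read_", i), reads[i], "+", strrep("I", nchar(reads[i])))
}))
toy_path <- tempfile(fileext = ".fastq")  # hypothetical path, for illustration only
writeLines(records, toy_path)

# A small file can be read into memory in one call.
fq_all <- readFastq(toy_path)

# For a large file, a random subsample of n reads can be drawn instead.
set.seed(42)
sampler <- FastqSampler(toy_path, n = 3)
fq_sub <- yield(sampler)
close(sampler)

length(fq_all)  # number of reads in the whole file
length(fq_sub)  # number of reads in the subsample
```

Both fq_all and fq_sub are ShortReadQ objects, so anything that works on one works on the other.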

The original sequence file comes from Arabidopsis thaliana and was provided by the UC Davis Genome Center. It was downloaded from the Sequence Read Archive (SRA) under accession number SRR1971253. It contains DNA from leaf tissues, pooled and sequenced on an Illumina HiSeq 2000. These are single-read sequences, each 50 base pairs (bp) long.

fqsample is a ShortReadQ object and contains information about reads, quality scores, and ids. It's your turn to explore it!
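
To give a feel for what a ShortReadQ object holds, here is a minimal sketch that uses a made-up three-read file rather than the exercise data. The object toy_fq and its contents are hypothetical stand-ins for fqsample; the accessor functions sread(), quality(), id() and width() come from the ShortRead package.

```r
suppressPackageStartupMessages(library(ShortRead))

# Write a made-up three-read fastq file (id, sequence, "+", qualities per read).
toy_path <- tempfile(fileext = ".fastq")  # hypothetical path, for illustration only
writeLines(c(
  "@toy_read_1", "ACGTACGTAC", "+", "IIIIIHHHGG",
  "@toy_read_2", "TTGCATGCAA", "+", "IIIIIIIIH5",
  "@toy_read_3", "GGCATCGATC", "+", "HHHIIIII#5"
), toy_path)
toy_fq <- readFastq(toy_path)

class(toy_fq)    # the class of the object
length(toy_fq)   # number of reads
sread(toy_fq)    # the sequences, as a DNAStringSet
quality(toy_fq)  # the encoded quality scores, one character per base
id(toy_fq)       # the read ids
width(toy_fq)    # the length of each read in bp
```

The same calls can be applied to fqsample in the exercise.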

Instructions 1/3
  • Load the ShortRead package and print fqsample to view it.