Exploring a fastq file
Fastq files usually contain thousands or millions of reads and can become very large. For this exercise, you will use a small fastq subsample of 500 reads, which fits easily into memory and can be read in its entirety with the function readFastq().
The original sequence file comes from Arabidopsis thaliana and was provided by the UC Davis Genome Center. Its accession number is SRR1971253, and it was downloaded from the Sequence Read Archive (SRA). It contains DNA from leaf tissues, pooled and sequenced on an Illumina HiSeq 2000. The sequences are single-read, each 50 base pairs (bp) long.
fqsample is a ShortReadQ object and contains information about reads, quality scores, and ids. It's your turn to explore it!
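To make the three parts of a ShortReadQ object concrete, here is a minimal sketch on made-up data. It assumes the Bioconductor package ShortRead is installed; the two toy reads, the temporary file, and the variable names fq_path and toy are illustrative only and are not part of the exercise data.

```r
# Load ShortRead (a Bioconductor package; assumed to be installed)
library(ShortRead)

# Write a tiny two-read fastq file to a temporary location (made-up data)
fq_path <- tempfile(fileext = ".fastq")
writeLines(c(
  "@read1",
  "ACGTACGTAC",
  "+",
  "IIIIIIIIII",
  "@read2",
  "TTGCATGCAA",
  "+",
  "IIIIHHHHGG"
), fq_path)

# Read the whole file into memory as a ShortReadQ object
toy <- readFastq(fq_path)

# Printing the object gives a short summary of what it holds
toy

# Accessors for the three components
sread(toy)    # the reads
quality(toy)  # the quality scores
id(toy)       # the read ids
```

The same accessors should apply to fqsample, which holds 500 reads instead of two.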
This exercise is part of the course Introduction to Bioconductor in R.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load ShortRead
___
# Print fqsample
___