Torrent Browser Analysis Report Guide


Run Metrics Overview

This page provides background information on quality metrics, read lengths, and alignment. These concepts are required to understand your run report.

The Torrent Browser Analysis Report gives performance metrics for reads whose initial bases match the library key.

These reads are generated from the input library, not from the positive control Test Fragments.

Performance is measured based on either predicted quality or quality as measured following alignment. Q20 and AQ20 are explained as examples of predicted quality and quality following alignment.

Predicted Quality (Q20)

Quality Following Alignment (AQ20)

Predicted Quality (Q20)

The number of called bases with a predicted quality of Q20 is reported. The predicted quality values are reported on the Phred scale, defined as -10 × log10(error probability). Q20, therefore, corresponds to a predicted error rate of one percent.

Refer to http://en.wikipedia.org/wiki/Phred_quality_score for a more complete description of Phred values.
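
The Phred relationship described above can be sketched in a few lines of Python. This is only an illustration of the formula; the function names are hypothetical and are not part of Torrent Suite Software.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality value to a predicted error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a Phred quality value."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(p)


# Q20 corresponds to a predicted error rate of one percent.
print(f"{phred_to_error_probability(20):.2%}")
```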

Quality Following Alignment (AQ20)

Where an accurate reference is available, aligning reads is a useful way to assess the quality of the sequencing reaction and of the underlying library. Reads are aligned to a reference genome. Any discrepancy in alignment to the reference (whether biological or technical, meaning a real variant or a sequencing error) is listed as a mismatch. Alignment performance metrics are reported according to how many mismatched bases are permitted. Torrent Suite Software reports alignment performance at two quality levels:

  • AQ20
  • Perfect

How Is Aligned Read Length Calculated?

The aligned length of a read at a given accuracy threshold is defined as the greatest position in the read at which the accuracy of the bases up to and including that position meets the accuracy threshold. For example, the AQ20 length is the greatest length at which the error rate is 1% or less. The "perfect" length is simply the longest perfectly aligned segment. For all of these calculations, the alignment is constrained to start from position 1 in the read. In other words, no 5' clipping is permitted.

The underlying assumption is that the reference to which the read is aligned represents the true sequence that should have been seen. Suitable caution should be taken when interpreting AQ20 values in situations where the sample sequenced has substantial differences relative to the reference used, such as working with alignments to a rough draft genome or with samples that are expected to have high mutation rates relative to the reference used. In these situations the AQ20 lengths might be short even when sequencing quality is excellent.

Specifically, the AQ20 length is computed as follows:

  1. Every base in the read is classified as being correct or incorrect according to the alignment to the reference.
  2. At every position in the read the total error rate is computed up to and including that position.
  3. The greatest position at which the error rate is one percent or less is identified and that position defines the AQ20 length.

For example, if a 100 bp read consists of 80 perfect bases followed by 2 errors followed by 18 more perfect bases, the total error rate at position 80 is zero percent. At position 81 the total error rate is 1.2% (1/81), at position 82 it is 2.4% (2/82), and it then falls steadily until position 100, where it is two percent (2/100). The greatest length at which the error rate is one percent or less is 80, so the AQ20 length is 80 bases. For comparison, the greatest length at which the error rate is two percent or less (a threshold corresponding to AQ17) is 100 bases.
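
The three steps above, applied to the worked example, can be sketched in Python as follows. This is a minimal illustration and not the Torrent Suite implementation: the function name and the list-of-booleans input are assumptions made for the example, and exact fractions are used so that the comparison at the threshold is not affected by floating-point rounding.

```python
from fractions import Fraction


def aligned_length(is_correct, max_error_rate):
    """Return the greatest position (1-based) at which the cumulative
    error rate is at or below max_error_rate.

    is_correct: one boolean per read base, True if the base matches
                the reference in the alignment (hypothetical input form).
    max_error_rate: threshold as a Fraction, e.g. Fraction(1, 100) for AQ20.
    """
    errors = 0
    best = 0
    for position, ok in enumerate(is_correct, start=1):
        if not ok:
            errors += 1
        # Every position is checked, because the error rate can rise
        # above the threshold and later fall back below it.
        if Fraction(errors, position) <= max_error_rate:
            best = position
    return best


# 80 perfect bases, then 2 errors, then 18 more perfect bases.
read = [True] * 80 + [False] * 2 + [True] * 18

aq20_length = aligned_length(read, Fraction(1, 100))         # 80
two_percent_length = aligned_length(read, Fraction(2, 100))  # 100
perfect_length = aligned_length(read, Fraction(0))           # 80
```

Because the alignment is constrained to start at position 1, a threshold of zero gives the length of the error-free prefix, which is how the "perfect" length is interpreted in this sketch.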

How Is Alignment Performed?

Within Torrent Browser, the objective is to provide you with a view on alignment that helps determine run and library quality.

There are many alignment algorithms available within the marketplace and you are encouraged to consult with a bioinformatician for the most appropriate alignment algorithm for your downstream analysis needs. Alignment algorithms are also embedded in many of the commercial software tools available within the Ion Torrent Web store. You are also encouraged to experiment with these tools.

Alignment within Torrent Browser is performed using TMAP. TMAP is currently an unpublished alignment algorithm, created by the authors of the BFAST algorithm. Please contact your Ion Torrent representative or Technical Support for more information about TMAP.

Technical Note - Analysis Pipeline

Technical Note - TMAP Alignment

Although TMAP is unpublished and a reference is not currently available, the precursor to TMAP, BFAST, is based on the ideas in the following publications:

Homer N, Merriman B, Nelson SF. BFAST: An alignment tool for large scale genome resequencing. PLoS ONE. 2009 4(11): e7767. PMID: 19907642 http://dx.doi.org/10.1371/journal.pone.0007767

Homer N, Merriman B, Nelson SF. Local alignment of two-base encoded DNA sequence. BMC Bioinformatics. 2009 Jun 9;10(1):175. PMID: 19508732 http://dx.doi.org/10.1186/1471-2105-10-175

Which Reads Are Used in the Alignment Process?

The alignment stage involves aligning reads produced by the pipeline to a reference genome and extracting metrics from those alignments. By default, Torrent Suite Software aligns all reads to the genome. However, there may be situations, particularly with large genomes, where the alignment takes longer than you are willing to wait. For such circumstances, Torrent Suite Software can define, on a per-reference basis, the maximum number of reads that should be aligned from a run. For more detail on how to enable and specify this reference-specific limit, see the Adding a Reference Sequence section of Working with Reference Sequences.

When the number of reads in a run exceeds a genome-specific maximum, a random sample of reads is taken and results are extrapolated to the full run. By sampling a quickly-aligned subset of reads and extrapolating the values to the full run, the software gives you enough information to be able to judge the quality of the sample, library and sequencing run for quality assessment purposes.
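
The idea of sampling and extrapolation can be illustrated with a short Python sketch. The exact sampling and scaling method used by Torrent Suite Software is not described here, so the simple proportional scaling below is an assumption, and the function and parameter names are hypothetical.

```python
import random


def extrapolate_count(read_flags, max_reads, seed=None):
    """Estimate how many reads in a run satisfy some per-read criterion.

    read_flags: list with one boolean per read (for example, whether the
                read would align); hypothetical input for illustration.
    max_reads:  reference-specific maximum number of reads to align.
    """
    total = len(read_flags)
    if total <= max_reads:
        # The run is within the limit, so every read is used directly.
        return sum(read_flags)
    rng = random.Random(seed)
    sample = rng.sample(read_flags, max_reads)
    # Assumed extrapolation: scale the sampled count up to the full run.
    return sum(sample) * total / max_reads
```

For instance, if 90 of 100 sampled reads meet the criterion in a run of 1,000 reads, this sketch would estimate 900 such reads for the full run.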

The output of the alignment process is a BAM file. The BAM file includes an alignment of all reads, including the unmapped reads, with exactly one mapping per read. When a read maps to multiple locations, the mapping with the best mapping score is used. If more than one such mapping exists, a random mapping is used and given a mapping quality of zero.
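
The rule for choosing one mapping per read can be sketched as follows. This is an illustration of the selection rule as described above, not TMAP code; the Mapping class, its fields, and the choose_mapping function are hypothetical names introduced for the example.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Mapping:
    """A candidate alignment for one read (hypothetical structure)."""
    position: int
    score: int
    mapq: int


def choose_mapping(candidates, rng=random):
    """Return the single mapping to report for a read, or None if unmapped."""
    if not candidates:
        return None  # the read is still written to the BAM file, as unmapped
    best_score = max(m.score for m in candidates)
    best = [m for m in candidates if m.score == best_score]
    if len(best) == 1:
        return best[0]
    # Several mappings share the best score: pick one at random and
    # report it with a mapping quality of zero.
    return replace(rng.choice(best), mapq=0)
```

A mapping quality of zero signals to downstream tools that the reported location is one of several equally good candidates.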