Torrent Browser Analysis Report Guide


Coverage Analysis Plugin

The Coverage Analysis plugin provides statistics and graphs describing the level of sequence coverage produced for targeted genomic regions.

The plugin's documentation is embedded in its output page. Access the documentation through the help and options icons in the top right corner of each chart.

Run the Coverage Analysis plugin

You can run the Coverage Analysis plugin automatically or manually.

Include the Coverage Analysis plugin in a run plan

To run the Coverage Analysis plugin automatically, select the Coverage Analysis plugin during template setup. Refer to the Plan Tab and Templates pages section of the Torrent Browser User Interface Guide for information about how to set up a template and create a planned run.

Manually launch the Coverage Analysis plugin

To run the Coverage Analysis plugin manually, perform the following steps:

1. In the Torrent Browser, select a run report by clicking a run link, then clicking a report from the dropdown area. The run report opens.

2. On the run report page, scroll about halfway down the screen to the Plugin Summary area. Click Select plugins to run. The Select a plugin popup appears.

3. Select coverageAnalysis. The Coverage Analysis Plugin interface appears.

4. Select a library type.

5. If you have one and would like to use it, select a targeted regions file.

6. Fill out the other plugin options. These options vary depending on your Library Type selection:

  • Target Padding If you would like to pad the target by a number of bases, enter the desired number. If you do not enter a number, the default of 0 is used.
  • Use Only Uniquely Mapped Reads If you would like the plugin to examine only unique starts, select the checkbox.
  • Use Only Non-duplicate Reads Select the checkbox to avoid duplicates. The Torrent Suite analysis must have been run with Mark Duplicates enabled.
  • SampleID Tracking Check this only if the Ion AmpliSeq library employed sampleID tracking amplicons.

7. When you are satisfied with your selections, click Submit.

The analysis runs and a group of output reports is created.

The following sections of this document describe the output reports generated by the Coverage Analysis plugin.

Coverage Analysis Plugin output

The plugin generates a Coverage Analysis Report. This report includes read statistics and several charts. The statistics and charts presented depend on the library type for the analysis. In addition, in the File Links section at the bottom of the Coverage Analysis Report, you can download statistics files and the aligned reads BAM file.

Most Coverage Analysis charts have help and options icons in the top right corner.

Click a chart's help icon to open a description of the chart.

Click a chart's options icon to open a panel of options for the chart.

Most fields in the report offer hover help.

Example statistics

The following is an example of the plugin statistics for a whole genome run. Most field names offer hover help.

Click on the Coverage Overview graph to see a larger image (then click Back in your browser to return to the report).

Reads statistics

The library type determines which statistics are presented.

Statistic

Description

Number of mapped reads

Total number of reads mapped to the reference.

Number of reads on target

Total number of reads mapped to any targeted region of the reference. A read is considered to be on target if at least one aligned base overlaps a target region. A read that overlaps a targeted region but where only flanking sequence is aligned (for example, due to poor matching of the 5' bases of the read) is not counted.

Target Base Coverage

Summary statistics for targeted base reads of the reference.

A base covered by multiple target regions is only counted once per sequencing read.

Bases in target regions

The total number of bases in all specified target regions of the reference.

Percent of reads on target

The percentage of reads mapped to any targeted region relative to all reads mapped to the reference.

Total aligned base reads

The total number of bases covered by reads aligned to the reference.

Total base reads on target

The total number of target bases covered by any number of aligned reads.

Percent base reads on target

The percent of all bases covered by reads aligned to the reference that covered bases in target regions.

Bases in targeted reference

The total number of bases in all target regions of the reference.

Bases covered (at least 1x)

The total number of target bases that had at least one read aligned over the proximal sequence. Only the aligned parts of each read are considered. For example, unaligned (soft-cut) bases at the 5' ends of mapped reads are not considered. Covered target reference bases may include sample DNA read base mismatches, but do not include base deletions in the read, nor insertions between reference bases.

Average base coverage depth

The average number of reads of all targeted reference bases.

Uniformity of base coverage

The percentage of bases in all targeted regions (or whole genome) covered by at least 0.2x the average base coverage depth.

Maximum base read depth

The maximum number of times any single target base was read.

Average base read depth

The average number of reads of all targeted reference bases that were read at least once.

Std.Dev base read depth

The standard deviation (root variance) of the read depths of all targeted reference bases that were read at least once.

Genome Base Coverage

Summary statistics for base reads of the reference genome.

Genome base coverage at N x

The percentage of reference genome bases covered by at least N reads.

Target coverage at N x

The percentage of target bases covered by at least N reads.

Targets with no strand bias

The percentage of all targets that did not show a bias towards forward or reverse strand read alignments. An individual target is considered to have read bias if it has at least 10 reads and the fraction of forward or reverse reads to total reads is greater than 70%.
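As an illustration of how the base coverage statistics described above relate to each other, the following Python sketch computes several of them from a list of per-base read depths. This is not the plugin's code: the function name, the input list, and the sample depths are hypothetical, and the sketch assumes one depth value per target base.

```python
from statistics import mean


def base_coverage_stats(depths, n_values=(1, 20, 100)):
    """Summarise per-base read depths (one value per target base).

    `depths` and `n_values` are hypothetical inputs chosen for illustration.
    """
    total_bases = len(depths)
    avg_depth = mean(depths)  # average base coverage depth over all target bases
    covered = [d for d in depths if d >= 1]  # bases read at least once
    # Uniformity: percentage of bases covered by at least 0.2x the average depth.
    uniformity = 100.0 * sum(d >= 0.2 * avg_depth for d in depths) / total_bases
    # Coverage at N x: percentage of bases covered by at least N reads.
    coverage_at = {
        n: 100.0 * sum(d >= n for d in depths) / total_bases for n in n_values
    }
    return {
        "bases_covered_1x": len(covered),
        "average_base_coverage_depth": avg_depth,
        "average_base_read_depth": mean(covered) if covered else 0.0,
        "maximum_base_read_depth": max(depths),
        "uniformity_pct": uniformity,
        "coverage_at_pct": coverage_at,
    }


# Hypothetical depths for eight target bases.
stats = base_coverage_stats([0, 2, 10, 10, 12, 14, 10, 22])
print(stats)
```

Note that the average base coverage depth is taken over all target bases, while the average base read depth only includes bases that were read at least once, so the two values differ whenever some bases are uncovered.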

Amplicon Read Coverage

Summary statistics for reads assigned to specific amplicons.

Each sequence read will be assigned to exactly one of the amplicons specified by the targets file.

Reads are assigned to particular amplicon targets based on their (5') mapping location being sufficiently close to the end of the amplicon region, taking the read direction (mapping strand) into account.

Number of amplicons The number of amplicons specified in the target regions file.
Percent assigned amplicon reads

The percentage of reads that were assigned to individual amplicons.

A read is assigned to a particular (inner) amplicon region if any aligned bases overlap that region.

If a read might be associated with multiple amplicons this way it is assigned to the amplicon region that has the greatest overlap of aligned sequence.

Average reads per amplicon The average number of reads assigned to amplicons.
Uniformity of amplicon coverage The percentage of bases in all targeted regions (or whole genome) covered by at least 0.2x the average base read depth.
Amplicons with at least N reads The percentage of all amplicons that had at least N reads.
Amplicons with no strand bias The percentage of all amplicons that did not show a bias towards forward or reverse strand read alignments. An individual amplicon is considered to have read bias if it has at least 10 reads and the fraction of forward or reverse reads to total reads is greater than 70%.
Amplicons reading end-to-end The percentage of all amplicons that were considered to have a sufficient proportion of assigned reads (70%) that covered the whole amplicon target from 'end-to-end'. To allow for error the effective ends of the amplicon region for read alignment are within 2 bases of the actual ends of the region.
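The strand bias and end-to-end rules above can be sketched in Python as follows. This is a minimal illustration, not the plugin's implementation. The function names and sample counts are hypothetical, and the sketch assumes that "greater than 70%" is a strict comparison and that the 70% end-to-end proportion is inclusive, neither of which is stated explicitly above.

```python
def has_strand_bias(fwd_reads, rev_reads, min_reads=10, max_fraction=0.70):
    """True if an amplicon has at least `min_reads` reads and more than
    `max_fraction` of them map to a single strand."""
    total = fwd_reads + rev_reads
    if total < min_reads:
        return False
    return max(fwd_reads, rev_reads) / total > max_fraction


def reads_end_to_end(fwd_e2e, rev_e2e, total_reads, min_fraction=0.70):
    """True if a sufficient proportion of assigned reads cover the whole amplicon.
    Assumption: the 70% threshold is inclusive."""
    if total_reads == 0:
        return False
    return (fwd_e2e + rev_e2e) / total_reads >= min_fraction


def percent(flags):
    """Percentage of True values in an iterable of booleans."""
    flags = list(flags)
    return 100.0 * sum(flags) / len(flags) if flags else 0.0


# Hypothetical per-amplicon counts.
amplicons = [
    {"fwd_reads": 50, "rev_reads": 50, "fwd_e2e": 40, "rev_e2e": 40},
    {"fwd_reads": 90, "rev_reads": 10, "fwd_e2e": 30, "rev_e2e": 5},
    {"fwd_reads": 8, "rev_reads": 1, "fwd_e2e": 8, "rev_e2e": 1},
    {"fwd_reads": 0, "rev_reads": 0, "fwd_e2e": 0, "rev_e2e": 0},
]

pct_no_bias = percent(
    not has_strand_bias(a["fwd_reads"], a["rev_reads"]) for a in amplicons
)
pct_e2e = percent(
    reads_end_to_end(a["fwd_e2e"], a["rev_e2e"], a["fwd_reads"] + a["rev_reads"])
    for a in amplicons
)
print(pct_no_bias, pct_e2e)
```

An amplicon with fewer than 10 reads is never flagged as biased in this sketch, which matches the rule that bias is only assessed when at least 10 reads are present.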

Example charts

This section shows a couple of example charts. Many charts have a Plot menu that allows you to change characteristics of the chart, for instance, to show both strands.

Click a chart's options icon (in the top right corner of a chart) to open the chart's viewing options panel.



In the Depth of Coverage chart above, the left Y-axis (% reads) is the number of reads at a particular read depth (or bin of read depths) as a percentage of the total number of (base) reads. The right Y-axis (% cumulative reads) is the cumulative count of the number of reads at a given read depth or greater as a percentage of the total number of (base) reads. If your analysis includes a regions of interest file, this chart reflects only targeted reads (reads that fall within a region of interest).

In most charts you click on a data point to open a detail panel for that data:

In this chart, the blue curve measures the cumulative reads at that read depth or greater. Click a point on the blue curve to open the blue detail panel for that read depth:

The following Reference Coverage Chart is shown with the Strand Base Reads option:

Note: The Viewing options panel is revealed or hidden with the chart's options icon. The help icon opens a description of the chart.

Output files

You can download plugin results files from links in the File Links section. This example is from a generic sequence run:

Ion TargetSeq analyses also offer the option "Download the targetseq coverage summary file".

Ion AmpliSeq analyses also offer the option "Download the amplicon coverage summary file".

Click a file's question mark icon to open a description of the file:

The following table lists the output files with a description of each. Not all output files are generated on every type of analysis.

File Description
Coverage statistics summary

A summary of the statistics presented in the tables at the top of the plugin report. The first line is the title. Each subsequent line is either blank or a particular statistic title followed by a colon (:) and its value. See also Example Coverage Analysis Report.
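Given the line layout described for the coverage statistics summary (a title line, then blank lines or "title: value" lines), a small Python sketch for reading such a file might look like this. The function name and the sample text are hypothetical, and the sketch assumes that each statistic line is split at its first colon.

```python
def parse_stats_summary(text):
    """Split a statistics summary into its title and a dict of statistics.

    Assumption: each statistic line is split at its first colon.
    """
    lines = text.splitlines()
    title = lines[0].strip() if lines else ""
    stats = {}
    for line in lines[1:]:
        if not line.strip():
            continue  # blank separator line
        name, sep, value = line.partition(":")
        if sep:  # ignore any line that has no colon
            stats[name.strip()] = value.strip()
    return title, stats


# Hypothetical file content, for illustration only.
sample = """Coverage Analysis Report

Number of mapped reads: 1000
Percent of reads on target: 95.5%
"""

title, stats = parse_stats_summary(sample)
print(title)
print(stats)
```

Values are kept as strings here because some statistics carry a unit such as a percent sign; converting them to numbers is left to the caller.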

Base depth of coverage

Coverage summary data used to create the Depth of Coverage Chart. This file contains these fields:

  • read_depth The depth at which a (targeted) reference base has been read.
  • base_cov The number of times any base was read (covered) at this depth.
  • base_cum_cov The cumulative number of reads (coverage) at this read depth or greater.
  • norm_read_depth The normalized read depth (depth divided by average base read depth).
  • pc_base_cum_cov As base_cum_cov but represented as a percentage of the total base reads.
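To show how the derived fields in this file relate to read_depth and base_cov, here is a hedged Python sketch that rebuilds them from a mapping of read depth to number of bases. It is an interpretation, not the plugin's code. It assumes that base_cov counts bases at exactly that depth, that the average base read depth is taken over bases read at least once, and that pc_base_cum_cov is relative to the total number of bases in the table. The function name and sample counts are hypothetical.

```python
def depth_of_coverage_rows(base_cov_by_depth):
    """Rebuild the derived depth-of-coverage fields.

    `base_cov_by_depth` maps read_depth -> base_cov (hypothetical input).
    """
    total_bases = sum(base_cov_by_depth.values())
    covered_bases = sum(n for d, n in base_cov_by_depth.items() if d >= 1)
    total_base_reads = sum(d * n for d, n in base_cov_by_depth.items())
    # Assumption: average over bases that were read at least once.
    avg_read_depth = total_base_reads / covered_bases if covered_bases else 0.0

    rows = []
    cumulative = 0
    # Walk from the deepest bin down so the running sum means "this depth or greater".
    for depth in sorted(base_cov_by_depth, reverse=True):
        cumulative += base_cov_by_depth[depth]
        rows.append({
            "read_depth": depth,
            "base_cov": base_cov_by_depth[depth],
            "base_cum_cov": cumulative,
            "norm_read_depth": depth / avg_read_depth if avg_read_depth else 0.0,
            "pc_base_cum_cov": 100.0 * cumulative / total_bases if total_bases else 0.0,
        })
    rows.reverse()  # return rows in ascending read_depth order
    return rows


# Hypothetical counts: 1 base at depth 0, 2 at depth 1, 3 at depth 2, 4 at depth 4.
rows = depth_of_coverage_rows({0: 1, 1: 2, 2: 3, 4: 4})
for row in rows:
    print(row)
```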
Amplicon coverage summary

Coverage summary data used to create the Amplicon Coverage Chart. This file contains these fields:

  • contig_id The name of the chromosome or contig of the reference for this amplicon.
  • contig_srt The start location of the amplicon target region. Note: This coordinate is 1-based, unlike the corresponding 0-based coordinate in the original targets BED file.
  • contig_end The last base coordinate of this amplicon target region. Note: The length of the amplicon target is given as tlen = (contig_end - contig_srt + 1).
  • region_id The ID for this amplicon as given as the 4th column of the targets BED file.
  • gene_id The gene symbol as given as the last field of the targets BED file.
  • gc The number of G and C bases in the target region. Hence, %GC = 100% * gc / tlen.
  • overlaps The number of times this target was overlapped by any read by at least one base. Note: Individual reads might overlap multiple amplicons where the amplicon regions themselves overlap.
  • fwd_e2e The number of assigned forward strand reads that read from one end of the amplicon region to the other end.
  • rev_e2e The number of assigned reverse strand reads that read from one end of the amplicon region to the other end.
  • total_reads The total number of reads assigned to this amplicon. This value equals (fwd_reads + rev_reads) and is the field that rows of this file are ordered by (then by contig id, srt and end).
  • fwd_reads The number of forward strand reads assigned to this amplicon.
  • rev_reads The number of reverse strand reads assigned to this amplicon.
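The relationships between these fields (tlen, %GC, and the 1-based start versus the 0-based BED start) can be checked with a short Python sketch. The sketch assumes the file is tab-delimited with a header row naming the fields above, which is not confirmed here. The sample row and function name are hypothetical.

```python
import csv
import io

# Hypothetical file content; tab-delimited layout with a header row is an assumption.
SAMPLE = (
    "contig_id\tcontig_srt\tcontig_end\tregion_id\tgene_id\tgc\toverlaps\t"
    "fwd_e2e\trev_e2e\ttotal_reads\tfwd_reads\trev_reads\n"
    "chr1\t101\t200\tAMP_1\tGENE_A\t45\t520\t200\t210\t500\t250\t250\n"
)


def summarise_amplicon(row):
    """Derive target length, %GC, and BED-style coordinates for one amplicon row."""
    start = int(row["contig_srt"])  # 1-based start
    end = int(row["contig_end"])    # last base coordinate (inclusive)
    tlen = end - start + 1
    return {
        "region_id": row["region_id"],
        "tlen": tlen,
        "pct_gc": 100.0 * int(row["gc"]) / tlen,
        # BED intervals use a 0-based start and an exclusive end.
        "bed_start": start - 1,
        "bed_end": end,
        # total_reads is described as fwd_reads + rev_reads.
        "reads_consistent": int(row["total_reads"])
        == int(row["fwd_reads"]) + int(row["rev_reads"]),
    }


reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
summaries = [summarise_amplicon(row) for row in reader]
print(summaries)
```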
Target coverage summary

Coverage summary data used to create the Target Coverage Chart. This file contains these fields:

  • contig_id The name of the chromosome or contig of the reference for this target.
  • contig_srt The start location of the target region. Note: This coordinate is 1-based, unlike the corresponding 0-based coordinate in the original targets BED file.
  • contig_end The last base coordinate of this target region. Note: The length of the target is given as tlen = (contig_end - contig_srt + 1).
  • region_id The ID for this target as given as the 4th column of the targets BED file.
  • gene_id The gene symbol as given as the last field of the targets BED file.
  • gc The number of G and C bases in the target region. Hence, %GC = 100% * gc / tlen.
  • covered The number of bases of this target that were covered by at least one read. Hence the percentage coverage of this target is calculated as %cov = 100% * covered / tlen. Note that this might not be 100% because of base deletions in the sample vs. the reference genome.
  • uncov_3p The number of bases that are not covered at the 3' (downstream) end of the forward DNA strand. For Ion TargetSeq analyses, this may indicate poor probe coverage at this end of the target.
  • uncov_5p The number of bases that are not covered at the 5' (upstream) end of the forward DNA strand.
  • depth The average target base read depth. This value equals (fwd_reads + rev_reads) / tlen and is the field that rows of this file are ordered by (then by contig id, srt and end).
  • fwd_reads The number of forward strand reads assigned to this target.
  • rev_reads The number of reverse strand reads assigned to this target.
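The per-target formulas given above (%cov = 100% * covered / tlen and depth = (fwd_reads + rev_reads) / tlen) can be written out as a short Python sketch. It simply follows the formulas as stated. The function name and the sample values are hypothetical.

```python
def target_coverage(target):
    """Apply the tlen, %cov, and depth formulas to one target row (a dict)."""
    tlen = target["contig_end"] - target["contig_srt"] + 1
    return {
        "tlen": tlen,
        "pct_cov": 100.0 * target["covered"] / tlen,
        "depth": (target["fwd_reads"] + target["rev_reads"]) / tlen,
    }


# Hypothetical target row.
example = {
    "contig_srt": 1001,
    "contig_end": 1200,
    "covered": 190,
    "fwd_reads": 3000,
    "rev_reads": 3000,
}
result = target_coverage(example)
print(result)
```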
Chromosome base coverage summary

Base reads per chromosome summary data used to create the default view of the Reference Coverage Chart. This file contains these fields:

  • chrom The name of the chromosome or contig of the reference.
  • start Coordinate of the first base in this chromosome. This is always 1.
  • end Coordinate of the last base of this chromosome. Also its length in bases.
  • fwd_reads Total number of forward strand base reads for the chromosome.
  • rev_reads Total number of reverse strand base reads for the chromosome.
  • fwd_ontrg (if present) Total number of forward strand base reads that were in at least one target region.
  • rev_ontrg (if present) Total number of reverse strand base reads that were in at least one target region.
  • seq_reads Total sequencing (whole) reads that are mapped to individual contigs.
Aligned reads BAM file Contains all aligned reads used to generate this report page, in BAM format. BAM is the binary form of the SAM format file that records individual reads and their alignment to the reference genome. Refer to the current SAM tools documentation for more file format information.
Aligned reads BAI file Binary BAM index file as required by some analysis tools and alignment viewers such as IGV.
Primer-trimmed reads BAM file Binary primer-trimmed aligned reads. Created from the original alignment file by trimming reads to the specific amplicon regions they are assigned to, where necessary to resolve overlaps with multiple amplicon target regions.
Primer-trimmed reads BAI file Binary BAM index file as required by some analysis tools and alignment viewers such as IGV.
Example Coverage Analysis Report