Torrent Suite Software Data Management Guide

Data Management Configuration

Overview

You use the Torrent Browser Data Management utility for both automatic and on-demand control over your analysis files and disk space usage.

You can set different data age and disk space thresholds for different categories of analysis files (such as input files, debugging files, and output files) as well as different archive locations.

The Data Management actions are performed on pre-defined categories of files for a run, not on the whole run by default.

In a fantasy world with unlimited disk space, you could keep all files for all your completed analyses. At any time, you could reanalyze a run or launch a plugin on a run, without restriction. In the real world, the Data Management archive and deletion rules allow you to define the balance that makes sense for your site between sufficient disk space and limitations on future uses of completed runs.

Email notifications

Even if you are not ready to configure automatic Data Management features at this time, please visit the configuration page and complete the following:

  • Enter the email address of an administrator who will receive disk-full notifications. (Your I.T. department also needs to allow email from the Torrent Server to be received by the local email relay.)
  • Configure Archive Directory media for all file categories. These settings are required to support on-demand archive and export.

See also Configure the email address.

Simplest path for initial configuration

Too much information? To get started, follow the steps in Configure Data Management for your Torrent Server and in step 3 follow the basic setup.

To take full advantage of the Data Management system, you (or an administrator or I.T. person) have to mount archive media and export media.

Automatic and on-demand actions

Supported automatic actions are the following:

  • Automatic deletion based on data age and disk space thresholds
  • Automatic archival based on data age

Supported on-demand actions are the following:

  • Archival, deletion, or export of a specific run (in the Data tab > Data Management tab Disk Space Management section).
  • Archival, deletion, or export of one or more runs that are members of a project (in a Data > Projects > project_name page)

You use the configuration page mostly to configure the automatic actions that you want for your Torrent Server. However, the on-demand archive and deletion actions also depend on the Archive Directory configuration for each file category.

Archive locations

The Archive Directory menus define the following for each file category:

  • The target directory for the archive and export options.

The Torrent Browser creates the archive and export target directories following the conventions below. The destination location depends on both the action and the file category involved:

  • Signal Processing Input file category:
    • Archived files are moved to the directory <archive partition>/<RunName>.
    • Exported files are copied to the directory <archive partition>/<RunName>.
  • Basecalling, Output, and Intermediate file categories:
    • Archived reports are moved to the directory <archive partition>/archivedReports/<Report name_id>.
    • Exported reports are copied to the directory <archive partition>/exportedReports/<Report name_id>.
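
The directory conventions above can be sketched in a few lines of Python. This is only an illustration of the naming rules, not the Torrent Browser implementation: the function name destination_dir, its parameters, and the example paths are hypothetical, and report_dir stands for the <Report name_id> directory name.

```python
from pathlib import PurePosixPath

# File categories whose archive and export directories use the report name.
REPORT_CATEGORIES = {"Basecalling Input", "Output", "Intermediate"}


def destination_dir(archive_partition, category, action, run_name=None, report_dir=None):
    """Return the target directory for an on-demand or automatic action.

    action is "archive" or "export"; report_dir is the <Report name_id> name.
    """
    if action not in ("archive", "export"):
        raise ValueError(f"unsupported action: {action}")
    base = PurePosixPath(archive_partition)
    if category == "Signal Processing Input":
        # Archived and exported files both go to <archive partition>/<RunName>.
        return base / run_name
    if category in REPORT_CATEGORIES:
        subdir = "archivedReports" if action == "archive" else "exportedReports"
        return base / subdir / report_dir
    raise ValueError(f"unknown file category: {category}")
```

For example, destination_dir("/media/archive1", "Output", "export", report_dir="my_report_12") yields /media/archive1/exportedReports/my_report_12.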



Note: The Output file category uses the report name in the destination directory, while the Signal Processing Input file category uses the run name.

See also Requirements for archive and export media and directories.

If Data Management is not enabled

If Data Management is not enabled (in the Data tab > Data Management tab Configuration section), no automatic action is taken and email notifications related to Signal Processing Input files ready for deletion are not sent.

On-demand deletion is still supported (in the Data tab > Data Management tab Disk Space Management section and in a project page).

On-demand archive and export are still supported (in the Data tab > Data Management tab Disk Space Management section and in a project page), for file categories that have their Archive Directory menu configured with media that is currently mounted.
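
As a rough sketch of the rules above, the on-demand actions available for one file category while Data Management is disabled could be modeled as follows. The function name and parameters are hypothetical, and treating the configured Archive Directory as a mount point that os.path.ismount can test is an assumption made for illustration.

```python
import os


def available_on_demand_actions(archive_dir, is_mounted=os.path.ismount):
    """Return the on-demand actions possible for one file category.

    archive_dir is the Archive Directory selection (None or "" if blank).
    is_mounted is any callable that reports whether the media is mounted.
    """
    # On-demand deletion does not depend on the Archive Directory setting.
    actions = {"delete"}
    # Archive and export need configured media that is currently mounted.
    if archive_dir and is_mounted(archive_dir):
        actions.update({"archive", "export"})
    return actions
```

Passing a custom is_mounted callable makes the rule easy to try out without real media attached.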

Open the Data Management Configuration page in your Torrent Browser

Follow these steps to open the Data Management Configuration page:

  1. To access the Disk Usage section, click the Data tab and then the Data Management tab:







  2. In the Configuration section, click Configure:





Sections of the configuration page

The following is an example configuration page. Each section is explained below.

File category table

The File Category table shows the automatic archive and deletion rules for the various file categories. Automatic actions are based on file categories within a run rather than on individual runs.

Note: When the Archive Auto Action is selected, the Disk Full Threshold field disappears and the Archive Directory field appears. When the Delete Auto Action is selected, the Archive Directory field disappears.

You can perform the automatic data management actions (archive or delete) on all files of a run or on one or more of the file categories of the run. To act on all files of a run, select all four file categories.

File categories are explained in the Torrent Browser configuration page and also in this list:

  • Signal Processing Input: Required to reanalyze an Ion PGM or Ion Proton run from scratch. (The Output file category is also required.)
  • Basecalling Input: Required to reanalyze an Ion PGM or Ion Proton run from basecalling. (The Output file category is also required.)
  • Output: Required to see the run report in the Torrent Browser and either to reanalyze the run or to launch a plugin.
  • Intermediate: Useful to troubleshoot a run. Not required for the run report or for reanalysis. (This file category can safely be deleted after a run completes successfully.)
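
One way to reason about the list above is as a lookup from an operation to the file categories that must still be present. The sketch below is illustrative only: the operation labels and the can_perform helper are hypothetical names, not part of Torrent Suite Software.

```python
# File categories that must still be on disk for each operation,
# taken from the category descriptions in this guide.
REQUIRED_CATEGORIES = {
    "reanalyze from scratch": {"Signal Processing Input", "Output"},
    "reanalyze from basecalling": {"Basecalling Input", "Output"},
    "view run report": {"Output"},
    "launch plugin": {"Output"},
}


def can_perform(operation, available_categories):
    """Return True if every category the operation needs is still available."""
    return REQUIRED_CATEGORIES[operation] <= set(available_categories)
```

For instance, a run whose Intermediate and Signal Processing Input files were deleted can still be reanalyzed from basecalling as long as the Basecalling Input and Output categories remain.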

You use these file categories with both automatic actions and on-demand actions.

Note: Your selections in the Archive Directory menus also determine whether or not on-demand archive and export are enabled.

To enable on-demand archive and export, select a media location for each file category.

If the Archive Directory menu is blank, see Add a new media location to the Archive Directory menu.

See Data Management File Categories Details for file lists of each category.

Auto Actions

The Auto Actions menus in the configuration page are for automatic data management actions.

The supported automatic actions are Archive and Delete.

The following are dependencies for the Archive and Delete automatic actions:

  • Both: Both automatic actions depend on the Archive Directory selection being mounted.
  • Archive: Archival depends on the candidate file category meeting the Data Age Threshold.
  • Delete: Deletion depends on the candidate file category meeting both the Data Age Threshold and the Disk Full Threshold.

The Signal Processing Input file category has an additional dependency on the "Auto Acknowledge Delete?" checkbox. See Auto Acknowledge Delete? checkbox.

Export is supported as an on-demand feature and cannot be configured to happen automatically.

Thresholds

With the threshold menus, you set the rules for automatic archive and deletion:

  • Automatic archival is affected by the Data Age threshold.
  • Automatic deletion is affected by both Data Age and Disk Full thresholds.

The Disk Full threshold involves a trade-off between adequate disk space and the need to access older analysis results. Lower settings for this threshold ensure that adequate space is available for the data from new analyses, but also reduce the time that older analysis data is available.

Because system performance is affected when disk partitions reach or exceed 95% capacity, we do not recommend a setting much higher than 90%.
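
To get a feel for where a partition stands relative to a Disk Full threshold, you can compute its usage with the Python standard library. This is a minimal sketch, not how the Data Management utility measures disk usage; the function names are hypothetical, and the used/total ratio can differ slightly from the percentage that tools such as df report because of reserved blocks.

```python
import shutil


def percent_full(path):
    """Return how full the partition that holds `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def exceeds_disk_full_threshold(path, threshold_percent=90.0):
    """Return True if the partition is fuller than the given threshold."""
    return percent_full(path) > threshold_percent
```

On a Torrent Server you might call exceeds_disk_full_threshold("/results") to see whether the /results partition is above 90%.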

The system automatically archives a file category when the following are true for the category:

  • The Data Age threshold is exceeded.
  • The file category is not marked as Keep (in the Data tab > Data Management tab Disk Space Management section, the Data Management Configuration page, or the Completed Runs and Reports list view).

The system automatically deletes a file category when the following are true for the category:

  • The Disk Full threshold is exceeded.
  • The Data Age threshold is exceeded.
  • The file category is not marked as Keep (in the Data tab > Data Management tab Disk Space Management section, the Data Management Configuration page, or the Completed Runs and Reports list view).
  • For a Signal Processing Input file category, deletion must be acknowledged through the Data Management Configuration page checkbox (if auto-action is enabled) or through the Data tab > Data Management tab Disk Space Management section checkbox for the specific run (if auto-action is not enabled).
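
The two rule sets above can be summarized as a pair of predicates. The sketch below is an illustration under stated assumptions, not the Torrent Suite code: the CategoryState class and both function names are hypothetical, data age is assumed to be measured in days, and the checks for the Enabled checkbox and for mounted archive media are left out.

```python
from dataclasses import dataclass


@dataclass
class CategoryState:
    """State of one file category of one run (hypothetical model)."""
    name: str
    age_days: float
    keep: bool = False
    delete_acknowledged: bool = False


def should_auto_archive(state, age_threshold_days):
    """Archive when the Data Age threshold is exceeded and Keep is not set."""
    return state.age_days > age_threshold_days and not state.keep


def should_auto_delete(state, age_threshold_days, percent_full, disk_full_threshold):
    """Delete only when every condition in the deletion list holds."""
    if state.keep:
        return False
    if state.age_days <= age_threshold_days:
        return False
    if percent_full <= disk_full_threshold:
        return False
    # Signal Processing Input files need an explicit acknowledgement.
    if state.name == "Signal Processing Input" and not state.delete_acknowledged:
        return False
    return True
```

Writing the deletion rule as a series of early returns mirrors the list: any single unmet condition is enough to prevent deletion.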

Archive Directory menus

Your entries for Archive Directory define the target directory for the archive option (for both on-demand and automatic) and the on-demand export option.

You can configure the file categories to have the same or different Archive Directories.

See Archive locations for more information about the naming of archived and exported directories.

Note: In most cases, you should configure an Archive Directory for each file category. On-demand archive and export depend on a configured archive media for the file categories involved.

Enabled checkbox

The Enabled checkbox turns on or off all the automatic actions configured in the Auto Action menus. The checkbox is available in both the Data tab > Data Management tab Configuration section and in the Data Management Configuration page. Your selection in either location is effective immediately and reflected in the other location. (Your Enabled checkbox selection in the Data Management Configuration page takes effect even if you cancel out of that page.)

The Data tab > Data Management subtab Configuration section:

The Data Management Configuration page:

If the Enabled checkbox is not checked, none of the automatic actions happen. (Note that automatic action for a specific file category could be disabled while automatic actions in general are enabled.)

Configure the email address

Enter the email address of the administrator who handles disk space issues (or enter an administrators' distribution list). The recipients of these emails need to respond to disk-full scenarios that impact server operations.

Note: We strongly recommend that you use the Data Management utility for email notifications.

See Data Management Email Notifications for an explanation of the two scenarios when emails are sent and the prerequisites for email notifications.

Here are prerequisites for email notifications in general:

  • Postfix email is allowed from this Torrent Server.
  • Your I.T. department allows email from the Torrent Server to be received by your local email relay.

Auto Acknowledge Delete? checkbox

The "Auto Acknowledge Delete?" checkbox provides an extra layer of security for your Signal Processing Input files. This file category is considered more important because if the Signal Processing Input file category is preserved, the other file categories for a run can be regenerated.

An automatic deletion action for Signal Processing Input files is not performed unless this checkbox is selected.

Requirements for archive and export media and directories

Before you can use the archive and export features, mount one or more external file systems and select these locations in the Data Management Archive Directory menus (once for each file category). These locations do not appear in the Data Management Archive Directory menus until they are mounted. (The external file systems must be mounted for you to configure the Data Management system.)

Directory ownership

Directories that are to be used as an archive or export directory must be owned by user ionadmin and also belong to group ionadmin.
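
The ownership requirement can be checked with a short script before you select a directory in the Archive Directory menus. This is a minimal sketch using the Python standard library on a Linux host; the function name is hypothetical, and the example path in the usage note is only a placeholder.

```python
import grp
import os
import pwd


def is_owned_by(path, user="ionadmin", group="ionadmin"):
    """Return True if `path` is owned by `user` and belongs to `group`."""
    st = os.stat(path)
    try:
        uid = pwd.getpwnam(user).pw_uid
        gid = grp.getgrnam(group).gr_gid
    except KeyError:
        # The user or group does not exist on this host.
        return False
    return st.st_uid == uid and st.st_gid == gid
```

For example, is_owned_by("/media/archive1") returns False if the directory is owned by root, which suggests that its ownership needs to be changed before it is used as archive media.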

Mount points

Any remotely mounted drive can be used as an archive or export directory if the mount point is under either /mnt or /media.

Note: Do not include space characters in the names of your remotely mounted directories, because the archiving system command does not parse them correctly.
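
The mount point rule and the note about space characters can be expressed as a simple path check. The sketch below is illustrative only and the function name is hypothetical. It works purely on the path string, so it does not resolve symbolic links or ".." components and does not verify that anything is actually mounted.

```python
from pathlib import PurePosixPath

# Parent directories named in this guide for remotely mounted drives.
ALLOWED_ROOTS = (PurePosixPath("/mnt"), PurePosixPath("/media"))


def is_valid_archive_mount(path):
    """Return True if `path` is below /mnt or /media and contains no spaces."""
    p = PurePosixPath(path)
    if " " in str(p):
        return False
    return any(root in p.parents for root in ALLOWED_ROOTS)
```

With this check, /mnt/nas/runs passes, while /media/my archive fails because of the space character and /results/archive fails because of its location.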

NAS drives

Any Network Attached Storage (NAS) filesystem, regardless of mount point, can be used as an archive or export directory.

A NAS drive must be mounted manually. The mount point is not limited to the /media directory.

Your Torrent Server administrator (or local I.T. person) might need to use the sudo mount command to mount a NAS drive. After the drive is mounted, it appears in the Archive Directory drop-down menus. You may need to consult with your Linux system administrator.

USB drives

Your USB drive should be ext3- or ext4-formatted and uniquely labeled.

To mount a USB drive as archive media, plug the USB drive into the Torrent Server. The Torrent Server automatically mounts the drive under the /media directory.



When using many USB drives to archive data, it is important that you name your archive disks using logical, unique disk names. Over time a large collection of archive sources can accumulate. Uniquely-named disks help you keep track of your archives.

Archive Directory menus

Your external media must be mounted before you can configure file category archive or export actions for that media.



Configure Data Management for your Torrent Server

This section describes how to configure the data management features.

Prerequisite

The destination media must be mounted on the Torrent Server for you to configure media for automatic archive rules.

Overview of configuration steps

  1. Plan your configuration.
    • Check the Examples and other information on this page.
    • Decide on thresholds that are suitable for your Torrent Server usage patterns.
    • Or, for initial setup of the Data Management system, use our basic setup when you get to the "Configure file category rules" step below.



  2. Add your archive media locations to the Archive Directory menu.
    • You can skip this step if your archive media already appear in the Archive Directory drop-down menus in the Data Management Configuration page.
    • Please read Requirements for archive and export media and directories .
    • Before you can use the archive and export features, mount one or more external file systems and select these locations in the Data Management Archive Directory menus (once for each file category). These locations do not appear in the Data Management Archive Directory menus until they are mounted. (The external file systems must be mounted for you to configure the Data Management system.)



  3. Configure the file category rules table.
    • Enable automatic actions and threshold rules for the file categories you want archived or deleted.
    • You can try our example file category rules as a basis to get started. After you run the system for a while and see your site's disk usage patterns, you can return and customize the rules as needed. See "A typical configuration" in Examples of Configured File Categories Rules .



  4. Enable Data Management, email notification, and Signal Processing Input deletion.
    • Click the Enabled checkbox. (See also Enabled checkbox .)
    • Enter the email address to receive disk-full notifications. (See also Configure the email address.)
    • Either click the "Auto Acknowledge Delete?" checkbox here or later use the Data tab > Data Management tab Disk Space Management section to acknowledge deletion of Signal Processing Input files for a specific run. (See also Auto Acknowledge Delete? checkbox.)

A note about Ion S5, Ion PGM, and Ion Proton data files and locations

Data generated on the Personal Genome Machine (PGM) sequencing instruments are initially stored in acquisition files with a DAT extension. One DAT file is created per flow on the sequencing instrument. The DAT files are transferred from the Ion PGM instruments to the Torrent Server, where various algorithms are used to condense the information into a WELLS file, called 1.wells. Base calling and quality score algorithms are applied to the WELLS file to create an unmapped BAM file.

Ion Proton instruments transfer the 1.wells files, not DAT files, to the Torrent Server.

Ion PGM data files for all file categories are stored in the /results directory. For Ion Proton data files, the Signal Processing Input file category is stored in /rawdata. Note that deletion or archival of an Ion Proton Signal Processing Input file category does not help reduce the disk usage of the /results partition. (Other Proton file categories are stored in /results and do affect the /results partition.)
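
The storage locations described above determine which partition is relieved when a file category is archived or deleted. A small lookup makes that explicit. This is a sketch only: the function name is hypothetical, and it covers just the Ion PGM and Ion Proton cases that this section describes.

```python
def storage_partition(instrument, category):
    """Return the directory that holds a file category for an instrument."""
    if instrument not in ("Ion PGM", "Ion Proton"):
        raise ValueError(f"instrument not covered by this sketch: {instrument}")
    # Only Ion Proton Signal Processing Input files live outside /results.
    if instrument == "Ion Proton" and category == "Signal Processing Input":
        return "/rawdata"
    return "/results"
```

For example, storage_partition("Ion Proton", "Signal Processing Input") returns /rawdata, so removing those files frees no space on /results.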