InterARTIC is an interactive web application designed to simplify the use of the ARTIC bioinformatics pipelines for nanopore sequencing analysis on viral genomes. InterARTIC was initally designed and tested for analysis of SARS-CoV-2, but is suitable for analysis of any virus and/or amplicon scheme, including a user's own custom amplicons. InterARTIC spports both the Nanopolish and Medaka pipeline alternatives from ARTIC, with full parameter customisation enabled through a simple graphical interface.
Pre-print: InterARTIC: an interactive web application for whole-genome nanopore sequencing analysis of SARS-CoV-2 and other viruses
We provide a pre-compiled binary release for common Linux distributions on x86_64 architecture. The binary release is tested to work well on Ubuntu 14, 16, 18 and 20 distributions (on Windows Subsystem for Linux as well). The binary release should work on other distributions as long as GLIBC 2.17 (and basic shared libraries such as pthreads) or higher and /usr/bin/env
are present.
First, open an Ubuntu terminal and run the following commands to download the latest release, extract the tar ball and run the provided run.sh script:
wget https://github.com/Psy-Fer/interARTIC/releases/download/v0.3/interartic-v0.3-linux-x86-64-binaries.tar.gz -O interartic_bin.tar.gz
tar xf interartic_bin.tar.gz
cd interartic_bin
./run.sh
IMPORTANT: Make sure the interARTIC binaries reside at a location with no white characters and non-ASCII characters in directory names.
The run.sh script has now launched a new interactive interARTIC session. To see your session, visit http://127.0.0.1:5000 on your web browser. Here, you can configure and run your next job using the graphical interface. Make sure you keep the terminal open to keep your interARTIC session running.
Open a new terminal to download and extract the example test dataset. The commands below will extract the dataset to /data, assuming /data exists on the computer (sudo mkdir /data
, if not) and you have write permission to /data (sudo chmod 777 /data
, if not). The /data folder is the default location for sequencing outputs on an ONT GridION or PromethION device, but on your own machine you may use a custom location such as /home/username/data if you wish (hint: you may use the pwd
command on your terminal to get the path of your current working directory).
cd /data
wget https://cloudstor.aarnet.edu.au/plus/s/srVo6NEicclqQNE/download -O FLFL031920_sample_data.tar.gz
tar xf FLFL031920_sample_data.tar.gz
rm FLFL031920_sample_data.tar.gz
Once extracted, you should see two directories:
- FLFL031920 containing a data from a nanopore sequencing run of 10 multiplexed SARS-CoV-2 isolates, performed on an ONT GridION. The .fast5 files, .fastq files and the sequencing summary file are amongst the extracted data. This example dataset follows the same directory structure of a nanopore sequencing run with live base-calling enabled.
- sample-barcodes containing a .csv manifest file that matches sample names to sample barcodes.
For detailed information on the input data structure and .csv manifest file, please visit the InterARTIC usage guide here.
IMPORTANT: Make sure the the data directory and file names do not contain white space and non-ASCII characters.
Configuration is only required if you downloaded the dataset to a custom location instead of /data.
In your interARTIC web interface, click Set locations of input data
. Fill the first two fields (1. location of your input data, and 2. location of your sample-barcode .csv files are located). For example, if you used /home/username/data the fields should be /home/username/data and /home/username/data/sample-barcodes, respectively.
Click confirm
to save the settings, which will be used for all future runs.
Click Add Job
on the interARTIC web interface. Then fill the fields as given in the following table. Note that when you click (double click on some browsers) on fields for input data directory and Select a CSV file, a list of files/directories should appear from which you can select.
field | value | description |
---|---|---|
Job name | test | whatever name that you like for the run (only alpha numeric characters and underscore are allowed) |
input data directory | FLFL031920 | this is the directory containing the nanopore sequencing data |
This input contains | Multiple samples | our example test dataset contains 10 multiplexed samples |
Select a CSV file | FLFL031920-barcodes.csv | .csv manifest file that matches sample names to sample barcodes |
virus | SARS-CoV-2 (nCoV-2019) | |
Select your primer scheme | Eden V1 (2500bp) | our example test dataset used Eden V1 primers |
library preparation method | Ligation library prep (eg SQK-LSK109) | our example test dataset used ligation barcodes |
Select a pipeline to run | Both | we will test both medka and nanopolish pipelines, which will run one after the other |
Now click Submit job(s)
and you should see the pipeline running :)
Another example dataset containing ebola virus samples that you can use to directly test interARTIC can be downloaded from here. The relavent options for this data set are Multiple samples, ebola-barcodes.csv, IturiEBOV, Artic V1 and Ligation library prep (eg SQK-LSK109).
Output data generated from interARTIC (version 0.2-beta) for the above two example data set can be downloaded for your reference from here.
Before running interARTIC on your own nanopore samples, please refer to the detailed guide here.
See here for troubleshooting common issues.
Building from source is not straightforward, due to the dependency hell of Python versions (circumventing this hell was one of the motivations for developing interARTIC). Step-by-step instructions for building from source are given here.
Developers interested in learning how we create portable binary releases, read our packaging steps (aka the art of snake charming) detailed here.
interARTIC is a layer built on top of the ARTIC pipeline. Binary releases of interARTIC contain:
- Python 3.7 binaries (build: cpython-3.7.7-linux64-20200409T0045) and several Python 3.7 modules available through pypi (e.g., celery, redis, flask, redis-server)
- ARTIC pipeline binaries available through bioconda that includes many dependencies (e.g., Python 3.6, medaka, nanopolish)