Welcome to VirGenA home page

VirGenA is a reference-guided assembler for highly variable viral genomes. It combines iterative mapping with de novo reassembly of highly variable regions, and its specially designed read mapper lets it work with a distant reference sequence. VirGenA can separate mixtures of strains belonging to different intraspecies genetic groups (genotypes, subtypes, clades, etc.) and assemble a separate consensus sequence for each group in the mixture.

If provided with a multiple sequence alignment (MSA) of target references, VirGenA selects an optimal reference set, sorts reads to the selected references, and outputs the consensus sequences corresponding to these references. For each consensus sequence, the multiple sequence alignment of its constituent reads is written in BAM format.

If no MSA is provided, VirGenA works in single-reference mode and uses the user-provided reference.

Multi-fragment references are supported in single-reference mode.

You can use VirGenA for full-genome assembly or just to find the optimal reference set for given FASTQ files of Illumina paired-end reads.

Documentation

Complete documentation is provided in wiki format.

Installation

VirGenA is a Java application and runs on any platform that supports a JVM. Simply download the latest release file and run it according to the usage instructions.

Required dependencies

The following are required to run VirGenA:

- Java version 8 or higher

- VSEARCH binary in any location. The path to the binary is set in the configuration file. The recommended version is included in the distribution.

- BLAST installed locally
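
Before the first run it can help to confirm that the required programs are reachable. The following is a minimal sketch in Python, not part of VirGenA. It assumes the executables are called `java` and `blastn` (one of the programs shipped with BLAST+). This document does not say which BLAST program VirGenA calls, so adjust the names as needed. VSEARCH is left out because its path is set in the configuration file rather than found on PATH.

```python
import shutil

# Assumed executable names (not taken from the VirGenA documentation):
# "java" is the usual JVM launcher, "blastn" is one program from BLAST+.
ASSUMED_TOOLS = ["java", "blastn"]


def find_missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = find_missing_tools(ASSUMED_TOOLS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All assumed tools were found on PATH.")
```

The check only looks for the executables. It does not verify that the installed Java is version 8 or higher.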

Toy example

To run VirGenA with the test data, download and unzip the release files.

On Windows:

You can set the number of threads in config_test_win.xml by changing the value of the ThreadNumber element.

In the Windows command prompt, change directory to the unzipped folder and type:

java -jar ./VirGenA.jar assemble -c config_test_win.xml

On Linux:

You can set the number of threads in config_test_linux.xml by changing the value of the ThreadNumber element.

Change the permissions of ./tools/vsearch to make it executable. Then, in a shell, change directory to the unzipped folder and type:

java -jar VirGenA.jar assemble -c config_test_linux.xml
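
Both test configurations expose the thread count through the ThreadNumber element, so the value can also be changed from a script. The sketch below is one way to do it with Python's standard library. It assumes the element holds the number as plain text and that its position in the file does not matter. The function name `set_thread_number` and the sample XML are hypothetical, and the real configuration file contains other elements not shown here.

```python
import xml.etree.ElementTree as ET


def set_thread_number(xml_text, threads):
    """Return `xml_text` with every ThreadNumber element set to `threads`."""
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    root = ET.fromstring(xml_text)
    elements = list(root.iter("ThreadNumber"))
    if not elements:
        raise ValueError("no ThreadNumber element found")
    for element in elements:
        element.text = str(threads)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily reduced stand-in for a VirGenA configuration file.
sample = "<Config><ThreadNumber>1</ThreadNumber></Config>"
print(set_thread_number(sample, 4))
```

To apply it to a real file, read the file into a string, pass it through the function, and write the result back. Keep a copy of the original, because ElementTree drops XML comments and the XML declaration when it serializes.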

The test data is an artificial mixture containing 100000 paired HIV reads of three different subtypes (01_AE, B and C) in equal proportions. VirGenA should detect these components and assemble a genome-length consensus sequence for each of them.

Results will be stored in the ./res/ folder. The expected output is:

  1. FASTA files with the assemblies of the three mixture components, named after the selected references: 01_AE.TH.90.CM240.U54771_assembly.fasta, B.FR.83.HXB2_LAI_IIIB_BRU.K03455_assembly.fasta, C.BW.96.96BW0502.AF110967_assembly.fasta
  2. Sorted BAM files with read alignments and the corresponding index files (BAI): 'reference_name'_mapped_reads.bam and 'reference_name'_mapped_reads.bai
  3. A log file.
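
The file list above can be checked automatically after a test run. The following Python sketch builds the expected file names and reports any that are absent from the results folder. It assumes that 'reference_name' in the BAM and BAI names is the same string that prefixes the assembly files. The log file is not checked because its name is not given here. The helper names are hypothetical.

```python
from pathlib import Path

# Suffixes taken from the expected-output list above.
OUTPUT_SUFFIXES = ("_assembly.fasta", "_mapped_reads.bam", "_mapped_reads.bai")

# Reference names from the test data assemblies.
TEST_REFERENCES = [
    "01_AE.TH.90.CM240.U54771",
    "B.FR.83.HXB2_LAI_IIIB_BRU.K03455",
    "C.BW.96.96BW0502.AF110967",
]


def expected_outputs(reference_names):
    """Return the file names expected for each selected reference."""
    return [name + suffix for name in reference_names for suffix in OUTPUT_SUFFIXES]


def missing_outputs(res_dir, reference_names):
    """Return the expected file names that do not exist in `res_dir`."""
    folder = Path(res_dir)
    return [f for f in expected_outputs(reference_names) if not (folder / f).is_file()]


# Run from the unzipped release folder after the toy example has finished.
for file_name in missing_outputs("res", TEST_REFERENCES):
    print("missing:", file_name)
```

If nothing is printed, all nine expected files are present in ./res/.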

How to cite:

Fedonin GG, Fantin YS, Favorov AV, Shipulin GA, Neverov AD. VirGenA: a reference-based assembler for variable viral genomes. Brief Bioinform, 2017 Jul 28. doi: 10.1093/bib/bbx079.
