This project forked from nf-core/deepvariant

Google's DeepVariant variant caller as a Nextflow pipeline

Home Page: http://nf-co.re

License: MIT License

Dockerfile 0.84% HTML 7.37% R 3.31% Python 5.48% Nextflow 83.00%

nf-core/deepvariant

Deep Variant as a Nextflow pipeline

A Nextflow pipeline for running the Google DeepVariant variant caller.

What is DeepVariant and why in Nextflow?

In December 2017, the Google Brain team released DeepVariant, a variant caller based on deep learning.

In practice, DeepVariant first builds images from the BAM file, then applies a deep learning image recognition approach to call the variants, and finally converts the predictions into the standard VCF format.

Running DeepVariant as a Nextflow pipeline has several advantages. Preprocessing steps automatically create the additional indexed and compressed files that DeepVariant requires as input, which users would otherwise have to produce manually. Variant calling can run on multiple BAM files at the same time, and Nextflow's internal parallelization makes use of the available resources. Nextflow's Docker support makes the results computationally reproducible by running every step inside a Docker container.

For more detailed information about Google's DeepVariant, please refer to google/deepvariant or this blog post.
For more information about DeepVariant in Nextflow, please refer to this blog post.

Quick Start

Warning: DeepVariant can be very computationally intensive to run.

To test the pipeline you can run:

nextflow run nf-core/deepvariant -profile test,docker

A typical run on whole genome data looks like this:

nextflow run nf-core/deepvariant --genome hg19 --bam yourBamFile --bed yourBedFile -profile standard,docker

In this case, variants are called on the BAM file passed with --bam, using the hg19 version of the reference genome. One VCF file is produced and can be found in the "results" folder.

A typical run on whole exome data looks like this:

nextflow run nf-core/deepvariant --exome --genome hg19 --bam_folder myBamFolder --bed myBedFile -profile standard,docker
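When the pipeline is launched from a script rather than typed by hand, the two commands above can be assembled programmatically. The following is a minimal sketch, not part of the pipeline: `build_command` is a hypothetical helper, it uses only the options shown in this section, and the rule that exactly one of `--bam` or `--bam_folder` is given is an assumption made for illustration.

```python
from typing import List, Optional


def build_command(
    genome: str,
    bed: str,
    bam: Optional[str] = None,
    bam_folder: Optional[str] = None,
    exome: bool = False,
    profile: str = "standard,docker",
) -> List[str]:
    """Build the argument list for one pipeline run (hypothetical helper)."""
    # Assumption for this sketch: a run takes either a single BAM file
    # or a folder of BAM files, never both and never neither.
    if (bam is None) == (bam_folder is None):
        raise ValueError("pass exactly one of bam or bam_folder")

    cmd = ["nextflow", "run", "nf-core/deepvariant"]
    if exome:
        cmd.append("--exome")
    cmd += ["--genome", genome]
    if bam is not None:
        cmd += ["--bam", bam]
    else:
        cmd += ["--bam_folder", bam_folder]
    cmd += ["--bed", bed, "-profile", profile]
    return cmd


if __name__ == "__main__":
    # Reproduces the whole exome example from this section.
    print(" ".join(build_command("hg19", "myBedFile",
                                 bam_folder="myBamFolder", exome=True)))
```

The returned list can be passed directly to `subprocess.run`, which avoids quoting problems with paths that contain spaces.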

Documentation

The nf-core/deepvariant documentation is split into the following files:

  1. Installation
  2. Running the pipeline
  3. Pipeline configuration
  4. Output and how to interpret the results
  5. Troubleshooting
  6. More about DeepVariant

More about the pipeline

As shown in the following picture, the workflow contains both preprocessing steps (light blue) and the actual variant calling steps (darker blue).

Some input files are optional. If they are not given, they are created automatically during the preprocessing steps. If they are given, the corresponding preprocessing steps are skipped. For more information about preprocessing, please refer to the "INPUT PARAMETERS" section.
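The skip-if-provided behaviour can be illustrated with a small planning function. This is a sketch under stated assumptions, not the pipeline's implementation: the set of derived files (a `.fai` index, a bgzip-compressed copy, and the `.fai` and `.gzi` indexes of that copy) and the `samtools` and `bgzip` commands used to create them are what one would typically prepare by hand for a reference FASTA, and `plan_reference_prep` is a hypothetical name.

```python
from typing import Iterable, List


def plan_reference_prep(fasta: str, existing: Iterable[str]) -> List[str]:
    """Return the shell commands still needed for a reference FASTA.

    Derived files already listed in `existing` are skipped, mirroring the
    idea that preprocessing only runs for inputs that were not supplied.
    The file set and commands are assumptions made for this sketch.
    """
    have = set(existing)
    gz = f"{fasta}.gz"
    steps = [
        (f"{fasta}.fai", f"samtools faidx {fasta}"),
        (gz, f"bgzip -c {fasta} > {gz}"),
        # samtools faidx on a bgzipped FASTA writes both the .fai and the .gzi
        (f"{gz}.fai", f"samtools faidx {gz}"),
        (f"{gz}.gzi", f"samtools faidx {gz}"),
    ]
    commands: List[str] = []
    for target, command in steps:
        # Skip targets that exist, and do not schedule one command twice.
        if target not in have and command not in commands:
            commands.append(command)
    return commands


if __name__ == "__main__":
    # Only the plain index was supplied, so compression and the
    # indexes of the compressed file are still planned.
    for line in plan_reference_prep("ref.fa", ["ref.fa.fai"]):
        print(line)
```

The function only plans the work and runs nothing, so it is safe to try on any machine, with or without samtools installed.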

The workflow accepts one reference genome and multiple BAM files as input. Each input BAM file is processed completely independently and produces its own VCF result file. Because of this, Nextflow can parallelize variant calling across the BAM files internally and use all the cores of the machine to deliver the results as quickly as possible.
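The fan-out described above can be modelled in a few lines. In the pipeline, Nextflow does this scheduling itself; the sketch below only illustrates why independent per-BAM tasks parallelize cleanly. `call_variants_stub` is a stand-in that runs no variant calling, and the output naming rule (one VCF named after each BAM, placed in "results") is an assumption made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import PurePosixPath
from typing import Dict, List


def vcf_name(bam: str, outdir: str = "results") -> str:
    """Hypothetical naming rule: one VCF per BAM, named after the BAM file."""
    return str(PurePosixPath(outdir) / (PurePosixPath(bam).stem + ".vcf"))


def call_variants_stub(bam: str) -> str:
    """Stand-in for one variant calling run; it only computes the output path."""
    return vcf_name(bam)


def run_all(bams: List[str], max_workers: int = 4) -> Dict[str, str]:
    """Process every BAM as an independent task and map it to its VCF."""
    # No task reads another task's output, so they can run side by side.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, whatever the completion order.
        return dict(zip(bams, pool.map(call_variants_stub, bams)))


if __name__ == "__main__":
    for bam, vcf in run_all(["sample1.bam", "data/sample2.bam"]).items():
        print(bam, "->", vcf)
```

Since no task depends on another, adding more BAM files increases the number of tasks without adding any coordination between them.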

Credits

This pipeline was originally developed at Lifebit by @luisas to simplify variant calling analyses and reduce their cost.

Many thanks to nf-core and those who have helped out along the way too, including (but not limited to): @ewels, @MaxUlysse, @apeltzer, @sven1103 & @pditommaso.
