treetory / glnexus

This project forked from dnanexus-rnd/glnexus

Scalable gVCF merging and joint variant calling for population sequencing projects

License: Apache License 2.0

GLnexus

From DNAnexus R&D: scalable gVCF merging and joint variant calling for population sequencing projects. (GL, genotype likelihood)

Reading

Our 2018 manuscript with collaborators at Regeneron Genetics Center and Baylor College of Medicine details the design of GLnexus and scientific validation using up to 240,000 human exomes and 22,600 genomes. Compared to the DNAnexus cloud-native deployment used for such large projects, this open-source version produces identical scientific results but lacks some of the scalability and production-oriented features.

NEW for 2020: Accurate, scalable cohort variant calls using DeepVariant and GLnexus (by the Google Health team), including a public bucket with modern resequencing products from the 1000 Genomes Project.

The Getting Started wiki page has a tutorial for first-time users.

For each tagged revision, the Releases page has a static executable suitable for most Linux x86-64 hosts; just download it and chmod +x glnexus_cli. Each release also provides a lightweight Docker image wrapping glnexus_cli.

Build & test

The GLnexus build process has a number of dependencies, but produces a standalone, statically linked executable, glnexus_cli. The easiest way to build it is to use our Dockerfile to control all the compile-time dependencies, then copy the static executable out of the resulting Docker container and put it anywhere you like.

# Clone repo
git clone https://github.com/dnanexus-rnd/GLnexus.git
cd GLnexus
git checkout vX.Y.Z  # optional, check out desired revision

# Build GLnexus in docker
docker build --target builder -t glnexus_tests .

# Run GLnexus unit tests.
docker run --rm glnexus_tests

# Copy the static GLnexus executable to the current working directory.
docker run --rm -v $(pwd):/io glnexus_tests cp glnexus_cli /io

# Run it to see its usage message.
./glnexus_cli

To build GLnexus without Docker, make sure you have gcc 5+, CMake 3.2+, and all the dependencies indicated in the Dockerfile.

Then,

git clone https://github.com/dnanexus-rnd/GLnexus.git
cd GLnexus
cmake -Dtest=ON . && make -j$(nproc) && ctest -V

The build also leaves ./glnexus_cli in this directory.

Coding conventions

  • C++14 - take advantage of the goodies
  • Use smart pointers to avoid passing resources needing manual deallocation across function/class boundaries
  • Prefer references over pointers for values that must never be null and never be reseated
  • Avoid exceptions; prefer returning a Status, defined early in types.h (note the frequently used convenience macro S(), defined just below Status)
  • Avoid public constructors with nontrivial bodies; prefer static initializer function returning Status
  • Avoid elaborate templated class hierarchies
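
The conventions above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in written for this example: the Status class, the S_SKETCH macro, the Interval class and total_length are not the definitions from types.h, whose actual Status and S() may differ in names and behavior. It only shows the general shape: a Status return value instead of exceptions, a static initializer in place of a nontrivial public constructor, a smart pointer owning the result, and references for arguments that must not be null.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a Status type; the real one lives in types.h.
class Status {
public:
    static Status OK() { return Status(true, ""); }
    static Status Invalid(std::string msg) { return Status(false, std::move(msg)); }
    bool ok() const { return ok_; }
    bool bad() const { return !ok_; }
    const std::string& message() const { return msg_; }
private:
    Status(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
    bool ok_;
    std::string msg_;
};

// Hypothetical early-return helper in the spirit of S(): evaluate an
// expression yielding a Status and return it from the caller if it is bad.
#define S_SKETCH(expr) do { Status s_ = (expr); if (s_.bad()) return s_; } while (0)

class Interval {
public:
    // Static initializer returning Status, instead of a constructor that
    // would have to throw on invalid input.
    static Status Make(int begin, int end, std::unique_ptr<Interval>& out) {
        if (begin < 0 || end <= begin) {
            return Status::Invalid("interval must satisfy 0 <= begin < end");
        }
        out.reset(new Interval(begin, end));
        return Status::OK();
    }
    int length() const { return end_ - begin_; }
private:
    Interval(int begin, int end) : begin_(begin), end_(end) {}  // trivial body
    int begin_, end_;
};

// The output is passed by reference because it must never be null.
Status total_length(int b1, int e1, int b2, int e2, int& total) {
    std::unique_ptr<Interval> a, b;  // ownership stays inside this function
    S_SKETCH(Interval::Make(b1, e1, a));
    S_SKETCH(Interval::Make(b2, e2, b));
    total = a->length() + b->length();
    return Status::OK();
}

// Exercises one success path and one failure path; returns 0 if both behave.
int run_example() {
    int total = 0;
    Status good = total_length(0, 10, 20, 25, total);
    assert(good.ok() && total == 15);
    Status bad = total_length(0, 10, 5, 5, total);  // empty second interval
    assert(bad.bad());
    return (good.ok() && bad.bad()) ? 0 : 1;
}
```

The macro keeps the happy path readable: each fallible call occupies one line, and the first failure propagates to the caller without a try/catch block.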

Libraries used

Performance profiling

The Performance wiki page has practical advice for deploying GLnexus on a powerful server.

The code has some hooks for performance profiling using perf and FlameGraph.

To profile performance within the DNAnexus applet, run the applet as usual and add -i perf=true. This produces an output file, genotype.stacks, containing sample counts for common call stacks. To generate an SVG visualization with FlameGraph:

git clone https://github.com/brendangregg/FlameGraph
FlameGraph/flamegraph.pl < genotype.stacks > genotype.svg
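
For a quick textual summary without rendering an SVG, the stacks file can also be tallied directly. The sketch below assumes genotype.stacks uses the folded-stack format that flamegraph.pl reads, one line per call stack with semicolon-separated frames followed by a space and a sample count. The function name leaf_counts is made up for this example, and the frame names in the usage note are invented.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Sum sample counts by leaf (innermost) frame from folded-stack lines of the
// form "frameA;frameB;frameC 42". Lines without a parseable count are skipped.
std::map<std::string, std::uint64_t> leaf_counts(std::istream& in) {
    std::map<std::string, std::uint64_t> totals;
    std::string line;
    while (std::getline(in, line)) {
        // The count is the token after the last space on the line.
        const auto space = line.find_last_of(' ');
        if (space == std::string::npos || space + 1 >= line.size()) continue;
        std::uint64_t count = 0;
        std::istringstream num(line.substr(space + 1));
        if (!(num >> count)) continue;
        // The leaf frame is the text after the last semicolon of the stack.
        const std::string stack = line.substr(0, space);
        const auto semi = stack.find_last_of(';');
        const std::string leaf =
            semi == std::string::npos ? stack : stack.substr(semi + 1);
        totals[leaf] += count;
    }
    return totals;
}
```

Passing a std::ifstream opened on genotype.stacks yields a map from function name to total samples observed with that function at the top of the stack. For instance, two input lines "main;genotype;bcf_read 30" and "main;genotype;bcf_read 12" would contribute 42 samples to bcf_read.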

Contributors

cmclean, mlin, orodeh, tedyun, vorontsovie, xunjieli
