annovar2maf

Introduction

This is a tiny Python script that generates MAF files from the output of standard annotation programs. Currently, ANNOVAR (table_annovar.pl) output and bcftools csq output can be converted to MAF.

$ python annovar2maf.py -h
usage: annovar2maf [-h] [-t TSB] [-b BUILD] [-p {refGene,ensGene}] [-c] input

Convert annovar and bcftools-csq annotations to MAF

positional arguments:
  input                 Annovar anotations file [Ex: myanno.hg19_multianno.txt] or a csq formatted file.

optional arguments:
  -h, --help            show this help message and exit
  -t TSB, --tsb TSB     Sample name. Default parses from the file name
  -b BUILD, --build BUILD
                        Reference genome build [Default: hg38]
  -p {refGene,ensGene}, --protocol {refGene,ensGene}
                        Protocol used to generate annovar annotations [Default: refGene]
  -c, --csq             Input file is a bcftools csq formatted output
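
The options in the help text above could be declared with argparse roughly as follows. This is a minimal sketch reconstructed from the help output only, not the script's actual source; the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options shown in the help text (illustrative reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="annovar2maf",
        description="Convert annovar and bcftools-csq annotations to MAF",
    )
    parser.add_argument("input", help="Annovar annotations file or a csq formatted file")
    # No default here: the help text says the sample name is parsed from the file name.
    parser.add_argument("-t", "--tsb", default=None, help="Sample name")
    parser.add_argument("-b", "--build", default="hg38", help="Reference genome build")
    parser.add_argument(
        "-p", "--protocol",
        choices=["refGene", "ensGene"],
        default="refGene",
        help="Protocol used to generate annovar annotations",
    )
    parser.add_argument("-c", "--csq", action="store_true",
                        help="Input file is a bcftools csq formatted output")
    return parser


args = build_parser().parse_args(
    ["-t", "foo", "-b", "GRCh37", "tests/test_mutect.refseq.hg19_multianno.txt"]
)
print(args.tsb, args.build, args.protocol, args.csq)  # foo GRCh37 refGene False
```

Note that the build defaults to hg38, so hg19/GRCh37 inputs need an explicit `-b`.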

annovar2maf

python annovar2maf.py -t foo -b GRCh37 tests/test_mutect.refseq.hg19_multianno.txt 

# For annovar annotations generated with ensGene as a protocol
python annovar2maf.py -p ensGene -t foo -b GRCh37 tests/test_mutect.ens.hg19_multianno.txt
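
To give a rough idea of what such a conversion involves, the sketch below maps one row of a table_annovar.pl multianno table to a handful of MAF columns. It is an illustration under stated assumptions, not the script's implementation: the function `annovar_row_to_maf`, the `EXONIC_TO_MAF` table, and the demo row (gene `GENE1`) are all hypothetical, and the classification mapping follows the commonly used ANNOVAR-to-MAF convention.

```python
import csv
import io

# Assumed mapping from ANNOVAR ExonicFunc values to MAF Variant_Classification terms.
EXONIC_TO_MAF = {
    "nonsynonymous SNV": "Missense_Mutation",
    "synonymous SNV": "Silent",
    "stopgain": "Nonsense_Mutation",
    "stoploss": "Nonstop_Mutation",
    "frameshift insertion": "Frame_Shift_Ins",
    "frameshift deletion": "Frame_Shift_Del",
    "nonframeshift insertion": "In_Frame_Ins",
    "nonframeshift deletion": "In_Frame_Del",
}


def annovar_row_to_maf(row, sample, build, protocol="refGene"):
    """Map one multianno row (a dict) to a subset of MAF fields."""
    exonic = row[f"ExonicFunc.{protocol}"]
    return {
        "Hugo_Symbol": row[f"Gene.{protocol}"],
        "NCBI_Build": build,
        "Chromosome": row["Chr"],
        "Start_Position": row["Start"],
        "End_Position": row["End"],
        # Values without a known mapping are passed through unchanged.
        "Variant_Classification": EXONIC_TO_MAF.get(exonic, exonic),
        "Reference_Allele": row["Ref"],
        "Tumor_Seq_Allele2": row["Alt"],
        "Tumor_Sample_Barcode": sample,
    }


# Hypothetical one-row table; real multianno files carry more columns.
demo = (
    "Chr\tStart\tEnd\tRef\tAlt\tFunc.refGene\tGene.refGene\tExonicFunc.refGene\n"
    "1\t100\t100\tA\tT\texonic\tGENE1\tnonsynonymous SNV\n"
)
rows = list(csv.DictReader(io.StringIO(demo), delimiter="\t"))
maf = [annovar_row_to_maf(r, sample="foo", build="GRCh37") for r in rows]
print(maf[0]["Hugo_Symbol"], maf[0]["Variant_Classification"])
```

With `-p ensGene` the column suffix changes (for example `Gene.ensGene`), which is why the sketch takes the protocol as a parameter.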

csq2maf

Similar to VEP, the bcftools csq command annotates variants with their consequences. The program is lightweight and extremely fast. Its output can be split into fields with the split-vep plugin, exported to TSV, and then converted to MAF.

ref="Homo_sapiens.GRCh37.dna.primary_assembly.fa"

# Get the GFF files for your ref build
## GRCh38 with and without the chr prefix
#wget ftp://ftp.ensembl.org/pub/current_gff3/homo_sapiens/Homo_sapiens.GRCh38.110.chr.gff3.gz
#wget ftp://ftp.ensembl.org/pub/current_gff3/homo_sapiens/Homo_sapiens.GRCh38.110.gff3.gz

## GRCh37 with and without the chr prefix
#wget ftp://ftp.ensembl.org/pub/grch37/release-84/gff3/homo_sapiens/Homo_sapiens.GRCh37.82.chr.gff3.gz
wget ftp://ftp.ensembl.org/pub/grch37/release-84/gff3/homo_sapiens/Homo_sapiens.GRCh37.82.gff3.gz

## Step-1: The commands below left-normalize the VCF, split multi-allelic variants, and annotate the VCF with variant consequences, keeping the worst consequence for each variant.
bcftools norm -f ${ref} -m -both -Oz tests/test_mutect.vcf.gz | bcftools csq -c CSQ -f ${ref} -g Homo_sapiens.GRCh37.82.gff3.gz -p a | \
bcftools +split-vep /dev/stdin -Oz -o tests/test_mutect.csq.vcf.gz -c - -s worst

## Step-2: The command below converts the csq-annotated VCF to TSV
bcftools query -f '%CHROM\t%POS\t%REF\t%ALT\t%gene\t%transcript\t%Consequence\t%amino_acid_change\t%dna_change\n' tests/test_mutect.csq.vcf.gz > tests/test_mutect.csq.tsv

## Step-3: Now convert the TSV to MAF
python annovar2maf.py -c -t foo -b GRCh37 tests/test_mutect.csq.tsv
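
The TSV written in Step-2 has no header; its columns follow the order of the `bcftools query` format string. The sketch below reads one such line and converts VCF-style anchored alleles to MAF-style positions and alleles. It is a minimal illustration, assuming the widely used convention of dropping the anchor base and writing `-` for the absent allele; whether annovar2maf.py does exactly this is not stated here, and the names `parse_csq_line` and `vcf_to_maf_alleles` as well as the demo line are hypothetical.

```python
# Column order taken from the bcftools query format string in Step-2.
CSQ_COLUMNS = [
    "CHROM", "POS", "REF", "ALT", "gene", "transcript",
    "Consequence", "amino_acid_change", "dna_change",
]


def parse_csq_line(line):
    """Split one header-less TSV line into a dict keyed by column name."""
    return dict(zip(CSQ_COLUMNS, line.rstrip("\n").split("\t")))


def vcf_to_maf_alleles(pos, ref, alt):
    """Return (start, end, ref, alt, variant_type) in MAF-style coordinates."""
    if len(ref) == len(alt):
        # Substitution: SNP, DNP, TNP, or ONP depending on length.
        vtype = {1: "SNP", 2: "DNP", 3: "TNP"}.get(len(ref), "ONP")
        return pos, pos + len(ref) - 1, ref, alt, vtype
    if len(ref) > len(alt) and ref.startswith(alt):
        # Deletion: drop the shared anchor, report only the deleted bases.
        return pos + len(alt), pos + len(ref) - 1, ref[len(alt):], "-", "DEL"
    if len(alt) > len(ref) and alt.startswith(ref):
        # Insertion: start and end flank the insertion site.
        start = pos + len(ref) - 1
        return start, start + 1, "-", alt[len(ref):], "INS"
    # Complex indel without a shared prefix: keep the alleles as given.
    vtype = "DEL" if len(ref) > len(alt) else "INS"
    return pos, pos + len(ref) - 1, ref, alt, vtype


# Hypothetical line, not taken from tests/test_mutect.csq.tsv.
rec = parse_csq_line("1\t100\tAT\tA\tGENE1\tTRANSCRIPT1\tframeshift\t.\t.\n")
print(vcf_to_maf_alleles(int(rec["POS"]), rec["REF"], rec["ALT"]))
# (101, 101, 'T', '-', 'DEL')
```

Because Step-1 left-normalizes and splits multi-allelic sites, each line carries a single ALT allele, which keeps this kind of per-line conversion simple.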

Contributors

poisonalien