
raonyguimaraes / pynnotator


This is a Genome Annotation Framework developed with the goal of annotating VCF files (Exomes or Genomes) from patients with Mendelian Disorders.

Home Page: https://mendelmd.org

License: BSD 3-Clause "New" or "Revised" License

Python 97.81% Makefile 0.03% Shell 1.53% Dockerfile 0.63%

pynnotator's Introduction

Pynnotator


This is a Python library for annotating VCF files from exomes or genomes of individuals with Mendelian disorders.

It was built using state-of-the-art tools and databases for human genome annotation.

Development

git clone https://github.com/raonyguimaraes/pynnotator
cd pynnotator
python3 -m venv venv
source venv/bin/activate
python setup.py develop
pynnotator install
pynnotator test

This is what you should get:

pynnotator test
Testing Annotation...                             
Running Command gunzip -c -d /home/raony/dev/pynnotator/pynnotator/tests/sample.100.vcf.gz > /home/raony/dev/pynnotator/pynnotator/tests/ann_sample.100/sample.100.vcf
2021-03-21 03:18:20.152735 Starting sanity_check:  /home/raony/dev/pynnotator/pynnotator/tests/ann_sample.100/sample.100.vcf
sort -k1,1d -k2,2n                                
2021-03-21 03:18:20.169532 Finished sanity_check, it took:  0:00:00.016797
2021-03-21 03:18:20.170173 Starting snpEff annotation:  sanity_check/sorted.vcf
2021-03-21 03:18:20.170460 Starting vep annotation:  sanity_check/sorted.vcf
2021-03-21 03:18:20.171451 Starting snpsift annotation:  sanity_check/sorted.vcf
2021-03-21 03:18:20.389561 Finished snpsift annotation, it took:  0:00:00.218110
2021-03-21 03:18:54.005021 Finished snpEff annotation, it took:  0:00:33.834848
2021-03-21 03:20:26.368233 Finished vep annotation, it took:  0:02:06.197773
2021-03-21 03:20:26.368687 Merging all VCF Files...
2021-03-21 03:20:26.368969 Starting merge:  sanity_check/sorted.vcf

=============================================
vcfanno version 0.3.2 [built with go1.12.1]

see: https://github.com/brentp/vcfanno
=============================================
vcfanno.go:115: found 10 sources from 3 files
vcfanno.go:156: falling back to non-bgzip
vcfanno.go:194: Info Error: CSQ not found in INFO >> this error/warning may occur many times. reporting once here...
vcfanno.go:248: annotated 45 variants in 0.00 seconds (10489.1 / second)
2021-03-21 03:20:26.416464 Finished merge, it took:  0:00:00.047495
2021-03-21 03:20:26.416888 Convert VCF to CSV...
2021-03-21 03:20:26.448489 Finished Annotation, it took 0:02:06.299032

A       A G       T G       A       A G       T G       A
| C   C | | C   C | | A   C | C   C | | C   C | | A   C |
| | T | | | | A | | | | G | | | T | | | | A | | | | G | |
| G   G | | G   G | | T   G | G   G | | G   G | | T   G |
T       T C       A C       T       T C       A C       T

Installation

Using conda:

conda install pynnotator
pynnotator install
pynnotator -i sample.vcf

Using pip:

pip install pynnotator
pynnotator install
pynnotator -i sample.vcf

Using docker-compose

cd compose
bash run-pynnotator-with-docker.sh

Languages

  • Perl
  • Python
  • Java
  • Go

Tools

  • vep (version 91.1)
  • snpeff (SnpEff 4.3r)
  • htslib (1.5)
  • vcftools (0.1.15)
  • vcfanno (v0.3.2)

Databases

  • 1000Genomes (Phase 3) - ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf
  • dbSNP (including clinvar) - (human_9606_b150_GRCh37p13)
  • Exome Sequencing Project - ESP6500SI-V2-SSA137.GRCh38-liftover
  • dbNSFP 3.5a (including dbscSNV 1.1)
  • Ensembl 90 (phenotype and clinically associated variants)
  • Decipher (HI_Predictions_Version3 and DDG2P)

Features

  • Annotate an exome in only 10 minutes.
  • Supports .VCF and .VCF.GZ files.
  • Installation takes about 20 minutes.
  • Efficient multithreading.
  • Annotate a VCF file using multiple VCFs as a reference.
  • Combine the best tools and databases currently available for vcf annotation.
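
As an illustration of the .VCF and .VCF.GZ support listed above, here is a minimal sketch of how both kinds of file can be read through one code path. The helper names `open_vcf` and `count_variants` are hypothetical and are not part of pynnotator.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF in text mode (hypothetical helper)."""
    if str(path).lower().endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_variants(path):
    """Count the non-header, non-empty lines of a VCF file."""
    with open_vcf(path) as handle:
        return sum(1 for line in handle
                   if line.strip() and not line.startswith("#"))
```

Calling `count_variants("sample.vcf")` and `count_variants("sample.vcf.gz")` on the same data should give the same number.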

Files

.
├── 1000genomes
│   ├── ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz
│   └── ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz.tbi
├── dbnsfp
│   ├── dbNSFP3.4a.txt.gz
│   ├── dbNSFP3.4a.txt.gz.tbi
│   ├── dbscSNV1.1.txt.gz
│   └── dbscSNV1.1.txt.gz.tbi
├── dbsnp
│   ├── All_20170403.vcf.gz
│   ├── All_20170403.vcf.gz.tbi
│   ├── clinvar.vcf.gz
│   └── clinvar.vcf.gz.tbi
├── decipher
│   ├── DDG2P.csv.gz
│   ├── HI_Predictions_Version3.bed.gz
│   ├── HI_Predictions_Version3.bed.gz.tbi
│   └── population_cnv.txt.gz
├── ensembl
│   ├── Homo_sapiens_clinically_associated.vcf.gz
│   ├── Homo_sapiens_clinically_associated.vcf.gz.tbi
│   ├── Homo_sapiens_phenotype_associated.vcf.gz
│   └── Homo_sapiens_phenotype_associated.vcf.gz.tbi
├── esp6500
│   ├── esp6500si.vcf.gz
│   └── esp6500si.vcf.gz.tbi
├── snpeff_data
│   └── GRCh37.75
└── vep_cache
    └── homo_sapiens
        └── 88_GRCh37

705 directories, 11839 files

Examples of VCFs from patients with Mendelian Disorders

.
├── annotation.validated.vcf.gz
├── examples
│   ├── miller.vcf.gz
│   ├── NA12878.compound_heterozygous.vcf.gz
│   ├── NA12878.dominant.vcf.gz
│   ├── NA12878.recessive.vcf.gz
│   ├── NA12878.xlinked.vcf.gz
│   └── schinzel_giedion.vcf.gz
└── sample.1000.vcf

Requirements

  • Docker Compose or
  • Ubuntu 16.04 LTS or Red Hat/CentOS 7
  • Python 2 or 3

How to run it?

Requires at least 65GB of disk space during installation and 35GB after installed.
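
Before starting the installation it can be useful to confirm that the target disk has enough room. The sketch below uses the two figures quoted above. The function name and the choice of 1 GB = 10^9 bytes are assumptions.

```python
import shutil

INSTALL_GB = 65    # peak disk usage during installation (figure from this README)
INSTALLED_GB = 35  # disk usage once installed (figure from this README)


def has_room_for_install(path=".", required_gb=INSTALL_GB):
    """Return True if the filesystem holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 10**9  # assumes decimal gigabytes
    return free_gb >= required_gb
```

For example, `has_room_for_install("/data")` checks the disk that holds a hypothetical `/data` directory.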

Method 1:

docker-compose run pynnotator -i pynnotator/tests/sample.1000.vcf
or
docker-compose run pynnotator -i sample.vcf.gz

Method 2:

# Using Ubuntu 16.04 LTS

sudo apt-get install gcc git python3-dev zlib1g-dev make zip libssl-dev libbz2-dev liblzma-dev libcurl4-openssl-dev build-essential
python3 -m venv mendelmdenv
source mendelmdenv/bin/activate
pip install pynnotator
pynnotator install

# And then finally:
pynnotator -i sample.vcf
#or
pynnotator -i sample.vcf.gz
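
To annotate several files in one go, the `pynnotator -i` command shown above can be driven from Python. This is only a sketch: `build_command` and `annotate_all` are hypothetical helpers, and the only CLI option used is the `-i` flag documented in this README.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(vcf_path):
    """Return the argument list for annotating one VCF with the pynnotator CLI."""
    return ["pynnotator", "-i", str(vcf_path)]


def annotate_all(folder):
    """Run pynnotator on every .vcf and .vcf.gz file in a folder, one at a time."""
    if shutil.which("pynnotator") is None:
        raise RuntimeError("pynnotator is not on PATH; install it first")
    vcfs = sorted(p for p in Path(folder).iterdir()
                  if p.name.endswith((".vcf", ".vcf.gz")))
    for vcf in vcfs:
        # check=True raises CalledProcessError if an annotation run fails
        subprocess.run(build_command(vcf), check=True)
    return vcfs
```

Files are processed sequentially here because each pynnotator run already starts several annotation tools in parallel, as the test log earlier in this README shows.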

Options

You can change settings of memory usage and number of cores in settings.py

Test

pynnotator test

Others

pynnotator install
#this will download and install all libraries and data needed.
pynnotator build
#this will rebuild the whole required dataset from scratch (this takes about 8 hours and requires a lot of memory)

Development

 git clone https://github.com/raonyguimaraes/pynnotator
 python setup.py develop
 # And have fun!

Annotations you can get from dbNSFP

Major sources:

    Variant determination:
            Gencode release 22/Ensembl 79, released March, 2015 (hg38)
    Functional predictions:
            SIFT ensembl 66, released Jan, 2015 http://provean.jcvi.org/index.php
            PROVEAN 1.1 ensembl 66, released Jan, 2015 http://provean.jcvi.org/index.php
            Polyphen-2 v2.2.2, released Feb, 2012 http://genetics.bwh.harvard.edu/pph2/
            LRT, released November, 2009 http://www.genetics.wustl.edu/jflab/lrt_query.html
            MutationTaster 2, data retrieved in 2015 http://www.mutationtaster.org/
            MutationAssessor, release 3 http://mutationassessor.org/
            FATHMM, v2.3 http://fathmm.biocompute.org.uk
            fathmm-MKL, http://fathmm.biocompute.org.uk/fathmmMKL.htm
            CADD, v1.3 http://cadd.gs.washington.edu/
            VEST, v3.0 http://karchinlab.org/apps/appVest.html
            fitCons, v1.01 http://compgen.bscb.cornell.edu/fitCons/
            DANN, https://cbcl.ics.uci.edu/public_data/DANN/
            MetaSVM and MetaLR, doi: 10.1093/hmg/ddu733
            GenoCanyon, v1.0.3 http://genocanyon.med.yale.edu/index.html
            Eigen & Eigen PC, v1.1 http://www.columbia.edu/~ii2135/eigen.html
            M-CAP, v1.0 http://bejerano.stanford.edu/MCAP/
            REVEL, https://sites.google.com/site/revelgenomics/
            MutPred, v1.2 http://mutpred.mutdb.org/
    Conservation scores:
            phyloP100way_vertebrate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP100way/
            phyloP20way_mammalian (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP20way/
            phastCons100way_vertebrate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phastCons100way/
            phastCons20way_mammalian (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phastCons20way/
            GERP++ http://mendel.stanford.edu/SidowLab/downloads/gerp/
            SiPhy http://www.broadinstitute.org/mammals/2x/siphy_hg19/
    Other variant annotation sources:
            Interpro v56 http://www.ebi.ac.uk/interpro/
            1000 Genomes project http://www.1000genomes.org/
            ESP http://evs.gs.washington.edu/EVS/
            dbSNP 147 (hg38) ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh38p2/VCF/All_20160527.vcf.gz
            clinvar 20161101 (hg38) ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar_20161101.vcf.gz
            ExAC v0.3 http://exac.broadinstitute.org/
            UK10K COHORT http://www.uk10k.org/studies/cohorts.html
            Ancestral alleles (hg38) ftp://ftp.ensembl.org/pub/release-84/fasta/ancestral_alleles
            Altai Neanderthal genotypes: http://cdna.eva.mpg.de/neandertal/altai/AltaiNeandertal/VCF/
            Denisova genotypes: http://www.eva.mpg.de/denisova
            RSRS http://dx.doi.org/10.1016/j.ajhg.2012.03.002
            GTEx v6 http://www.gtexportal.org/static/datasets/gtex_analysis_v6/single_tissue_eqtl_data/
    Other gene annotation sources:
            HGNC, downloaded on March 15, 2016
            Uniprot, released 2016_2
            IntAct, downloaded on March 15, 2016
            GWAS catalog, downloaded on March 15, 2015
            egenetics and GNF/Atlas expression data, downloaded from BioMart on Oct. 1, 2013
            BioGRID, version 3.4.134
            Haploinsufficiency probability data, from doi:10.1371/journal.pgen.1001154
            Recessive probability data, from DOI:10.1126/science.1215040
            Residual Variation Intolerance Score (RVIS), from http://genic-intolerance.org/
            GO, downloaded on March 15, 2016
            ConsensusPathDB, Release 31
            Essential genes, based on doi:10.1371/journal.pgen.1003484
            Mouse genes, from ftp://ftp.informatics.jax.org/pub/reports/index.html on March 15, 2016
            Zebra fish genes, from http://zfin.org/downloads/pheno.txt on March 15, 2016
            KEGG pathway, from http://www.openbioinformatics.org/gengen/tutorial_calculate_gsea.html
            BioCarta pathway, from http://www.openbioinformatics.org/gengen/tutorial_calculate_gsea.html
            GTEx v6 http://www.gtexportal.org/static/datasets/gtex_analysis_v6/rna_seq_data/
            GDI doi: 10.1073/pnas.1518646112
            LoFtool: [email protected]
            SORVA: doi: 10.1101/103218

Annotation example

cd tests
pynnotator -i miller.vcf.gz
grep 'Miller' ann_miller/annotation.final.vcf

16      72050942        rs267606766     G       A       287.41  PASS    AC=1;AF=0.50;AN=2;BaseQRankSum=2.237;DB;DP=13;Dels=0.00;FS=5.119;HRun=0;HaplotypeScore=0.0000;MQ0=0;MQ=60.00;MQRankSum=0.231;QD=22.11;ReadPosRankSum=-0.077;set=variant2;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|Ggg/Agg|G152R|395|DHODH|protein_coding|CODING|ENST00000219240|4|A);CSQ=A|missense_variant|MODERATE|DHODH|ENSG00000102967|Transcript|ENST00000219240|protein_coding|4/9||||475/2065|454/1188|152/395|G/R|Ggg/Agg|||1||HGNC|2867|deleterious(0)|probably_damaging(1);SNP;HET;VARTYPE=SNP;HI_PREDICTIONS=DHODH|0.325470662|25.78%;dbsnp.RS=267606766;dbsnp.RSPOS=72050942;dbsnp.dbSNPBuildID=137;dbsnp.SSR=0;dbsnp.SAO=1;dbsnp.VP=0x050268000a05040002110100;dbsnp.GENEINFO=DHODH:1723;dbsnp.WGT=1;dbsnp.VC=SNV;dbsnp.PM;dbsnp.PMC;dbsnp.S3D;dbsnp.NSM;dbsnp.REF;dbsnp.ASP;dbsnp.VLD;dbsnp.LSD;dbsnp.OM;clinvar.RS=267606766;clinvar.RSPOS=72050942;clinvar.dbSNPBuildID=137;clinvar.SSR=0;clinvar.SAO=1;clinvar.VP=0x050268000a05040002110100;clinvar.GENEINFO=DHODH:1723;clinvar.WGT=1;clinvar.VC=SNV;clinvar.PM;clinvar.PMC;clinvar.S3D;clinvar.NSM;clinvar.REF;clinvar.ASP;clinvar.VLD;clinvar.LSD;clinvar.OM;clinvar.CLNALLE=1;clinvar.CLNHGVS=NC_000016.9:g.72050942G>A;clinvar.CLNSRC=OMIM_Allelic_Variant|UniProtKB_(protein);clinvar.CLNORIGIN=1;clinvar.CLNSRCID=126064.0004|Q02127#VAR_062414;clinvar.CLNSIG=5;clinvar.CLNDSDB=MedGen:OMIM:SNOMED_CT;clinvar.CLNDSDBID=C0265257:263750:66038001;clinvar.CLNDBN=Miller_syndrome;clinvar.CLNREVSTAT=no_criteria;clinvar.CLNACC=RCV000018294.28;esp6500.DBSNP=dbSNP_138;esp6500.EA_AC=1,8301;esp6500.AA_AC=0,3878;esp6500.TAC=1,12179;esp6500.MAF=0.012,0.0,0.0082;esp6500.GTS=AA,AG,GG;esp6500.EA_GTC=0,1,4150;esp6500.AA_GTC=0,0,1939;esp6500.GTC=0,1,6089;esp6500.DP=130;esp6500.GL=DHODH;esp6500.CP=0.8;esp6500.CG=5.8;esp6500.AA=G;esp6500.CA=.;esp6500.EXOME_CHIP=no;esp6500.GWAS_PUBMED=.;esp6500.FG=NM_001361.4:missense;esp6500.HGVS_CDNA_VAR=NM_001361.4:c.454G>A;esp6500.HGVS_PROTEIN_VAR=NM_001361.4:p.(G152R);esp6500.CDS_SIZES=NM_001361.4:1188;esp6500.GS=125;esp6500.PH=probably-damaging:1.0;esp6500.EA_AGE=.;esp6500.AA_AGE=.;esp6500.GRCh38_POSITION=16:72017043 GT:AD:DP:GQ:PL  0/1:4,9:13:99:317,0,101
16      72055110        rs267606767     G       C       287.41  PASS    AC=1;AF=0.50;AN=2;BaseQRankSum=2.237;DB;DP=13;Dels=0.00;FS=5.119;HRun=0;HaplotypeScore=0.0000;MQ0=0;MQ=60.00;MQRankSum=0.231;QD=22.11;ReadPosRankSum=-0.077;set=variant2;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|gGc/gCc|G202A|395|DHODH|protein_coding|CODING|ENST00000219240|5|C);CSQ=C|missense_variant|MODERATE|DHODH|ENSG00000102967|Transcript|ENST00000219240|protein_coding|5/9||||626/2065|605/1188|202/395|G/A|gGc/gCc|||1||HGNC|2867|tolerated(0.18)|possibly_damaging(0.893);SNP;HET;VARTYPE=SNP;HI_PREDICTIONS=DHODH|0.325470662|25.78%;dbsnp.RS=267606767;dbsnp.RSPOS=72055110;dbsnp.dbSNPBuildID=137;dbsnp.SSR=0;dbsnp.SAO=1;dbsnp.VP=0x050268000a05040002110100;dbsnp.GENEINFO=DHODH:1723;dbsnp.WGT=1;dbsnp.VC=SNV;dbsnp.PM;dbsnp.PMC;dbsnp.S3D;dbsnp.NSM;dbsnp.REF;dbsnp.ASP;dbsnp.VLD;dbsnp.LSD;dbsnp.OM;dbsnp.TOPMED=0.999828,0.000171715,.;clinvar.RS=267606767;clinvar.RSPOS=72055110;clinvar.dbSNPBuildID=137;clinvar.SSR=0;clinvar.SAO=1;clinvar.VP=0x050268000a05040002110100;clinvar.GENEINFO=DHODH:1723;clinvar.WGT=1;clinvar.VC=SNV;clinvar.PM;clinvar.PMC;clinvar.S3D;clinvar.NSM;clinvar.REF;clinvar.ASP;clinvar.VLD;clinvar.LSD;clinvar.OM;clinvar.CLNALLE=1,2;clinvar.CLNHGVS=NC_000016.9:g.72055110G>A,NC_000016.9:g.72055110G>C;clinvar.CLNSRC=OMIM_Allelic_Variant|UniProtKB_(protein),OMIM_Allelic_Variant|UniProtKB_(protein);clinvar.CLNORIGIN=1,1;clinvar.CLNSRCID=126064.0006|Q02127#VAR_062417,126064.0005|Q02127#VAR_062416;clinvar.CLNSIG=5,5;clinvar.CLNDSDB=MedGen:OMIM:SNOMED_CT,MedGen:OMIM:SNOMED_CT;clinvar.CLNDSDBID=C0265257:263750:66038001,C0265257:263750:66038001;clinvar.CLNDBN=Miller_syndrome,Miller_syndrome;clinvar.CLNREVSTAT=no_criteria,no_criteria;clinvar.CLNACC=RCV000018296.27,RCV000018295.27   GT:AD:DP:GQ:PL       0/1:4,9:13:99:317,0,101
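
The annotations in the records above all live in the INFO column as `key=value` pairs separated by semicolons, with bare keys such as `DB` acting as flags. A minimal sketch of turning that column into a dictionary follows. `parse_info` and `info_from_record` are hypothetical helpers, and the sketch assumes a tab-separated record as in a real VCF file.

```python
def parse_info(info_field):
    """Split a VCF INFO column into a dict; flag entries map to True."""
    info = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        # partition splits on the first "=" only, so values may contain "="
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def info_from_record(line):
    """Parse the INFO column (8th tab-separated field) of one VCF record."""
    columns = line.rstrip("\n").split("\t")
    return parse_info(columns[7])
```

With this, `info_from_record(line)["clinvar.CLNDBN"]` gives the ClinVar disease name for a record, for instance `Miller_syndrome` in the first record shown.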

pynnotator's People

Contributors

raony-guimaraes, raonyguimaraes


pynnotator's Issues

Tabix [E::hts_idx_push] Chromosome blocks not continuous sort error

OS : Centos 7
Tried to annotate a vcf and got this error

[E::hts_idx_push] Chromosome blocks not continuous
tbx_index_build failed: vep.output.sorted.vcf.gz

Checked vep.py and fixed it by commenting out all the sort commands and adding this:

command = '''grep -E -v '^X|^Y|^M|^#|^GL' vep.output.vcf | sort -k1,1V -k2,2n >> vep/vep.output.sorted.vcf'''
        call(command, shell=True)

This worked fine. Please check whether this fix solves the problem permanently and, if so, please incorporate it into vep.py.
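
For reference, tabix reports this error when records for the same chromosome are not stored next to each other. The sketch below shows one way to order records in pure Python so that each chromosome forms a single block sorted by position. It is not the fix proposed above and not pynnotator's code: `chrom_key` and `sort_vcf_lines` are hypothetical, and the chromosome order differs from `sort -k1,1V`.

```python
def chrom_key(chrom):
    """Sort key: numeric chromosomes in numeric order, then all others by name."""
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    if name.isdigit():
        return (0, int(name), "")
    return (1, 0, name)


def sort_vcf_lines(lines):
    """Return header lines followed by records sorted by chromosome and position."""
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines
               if line.strip() and not line.startswith("#")]

    def record_key(line):
        chrom, pos = line.split("\t", 2)[:2]
        return (chrom_key(chrom), int(pos))

    return header + sorted(records, key=record_key)
```

Unlike the command in the report, this keeps the X, Y, M and GL records rather than filtering them out.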

The version "0" not supported, assuming VCFv4.2

The version "0" not supported, assuming VCFv4.2
Broken VCF header, no column names?
at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.15/src/perl/Vcf.pm line 172, line 1.
Vcf::throw(Vcf4_2=HASH(0x2938720), "Broken VCF header, no column names?") called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.15/src/perl/Vcf.pm line 867
VcfReader::_read_column_names(Vcf4_2=HASH(0x2938720)) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.15/src/perl/Vcf.pm line 602
VcfReader::parse_header(Vcf4_2=HASH(0x2938720)) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.15/src/perl/vcf-annotate line 408
main::annotate(HASH(0x2931168)) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.15/src/perl/vcf-annotate line 33
2017-10-13 10:14:36.649397 Finished decipher annotation, it took: 0:00:10.386481
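
Both messages in this log are what one would expect when the file handed to vcf-annotate has no usable header, although the report does not confirm the cause. A small pre-flight check along these lines could catch such input early. `check_vcf_header` is a hypothetical helper, not part of pynnotator or vcftools.

```python
def check_vcf_header(lines):
    """Return a list of header problems found in a VCF (empty list if none)."""
    problems = []
    lines = [line.rstrip("\n") for line in lines]
    # The VCF format expects a ##fileformat declaration on the first line
    if not lines or not lines[0].startswith("##fileformat=VCFv"):
        problems.append("first line is not a ##fileformat=VCFv... declaration")
    # The column-name line starts with #CHROM and is tab-separated
    if not any(line.startswith("#CHROM\t") for line in lines):
        problems.append("no #CHROM column-name line")
    return problems
```

An empty or truncated intermediate file would report both problems.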

Genome build option not implemented yet?

parser.add_argument('-b', dest='build', required=False, metavar='hg19 or hg38', help='The genome build you want to use')

I don't see where this option is used anywhere in the codebase. Is it the case that this simply isn't implemented yet? If so, what is the default option? Do any of the downstream helpers enforce any kind of liftover?

Great project!

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1348: invalid continuation byte

Exception in thread Thread-15:
Traceback (most recent call last):
File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/projects/pynnotator/pynnotator/annotator.py", line 285, in merge
mg.run()
File "/projects/pynnotator/pynnotator/helpers/merge.py", line 84, in run
self.merge()
File "/projects/pynnotator/pynnotator/helpers/merge.py", line 145, in merge
for record in records:
File "pysam/libctabix.pyx", line 637, in pysam.libctabix.TabixIterator.next (pysam/libctabix.c:6988)
File "pysam/libcutils.pyx", line 131, in pysam.libcutils.charptr_to_str (pysam/libcutils.c:3219)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1348: invalid continuation byte
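
The byte 0xe9 is "é" in Latin-1, so this traceback is consistent with an annotation file that contains Latin-1 text while being decoded as UTF-8. That reading is an assumption, since the report does not name the file. The sketch below reproduces the error in plain Python and shows a fallback decode. `decode_record` is hypothetical and does not touch pysam.

```python
def decode_record(raw):
    """Decode bytes as UTF-8, falling back to Latin-1 if that fails.

    Latin-1 is an assumed source encoding; it maps every byte to a
    character, so the fallback itself cannot raise.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


# 0xe9 followed by an ASCII byte is invalid UTF-8 ("invalid continuation byte")
raw = b"caf\xe9 au lait"
text = decode_record(raw)
```

Valid UTF-8 input is returned unchanged, so the fallback only affects records that would otherwise fail.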

blocking bug in vcftocsv

hi,
thanks for the work on pynnotator. I just installed the software, and the installation worked well. For the first try I just ran pynnotator test and I got the following error:

 Testing Annotation...
Running Command gunzip -c -d /mnt/data/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/tests/sample.1000.vcf.gz > /mnt/data/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/tests/ann_sample.1000/sample.1000.vcf
2018-06-22 10:41:21.941225 Starting sanity_check:  /mnt/data/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/tests/ann_sample.1000/sample.1000.vcf
sort -k1,1d -k2,2n
2018-06-22 10:41:21.948425 Finished sanity_check, it took:  0:00:00.007200
2018-06-22 10:41:21.948885 Starting snpEff annotation:  sanity_check/sorted.vcf
2018-06-22 10:41:21.948985 Starting decipher annotation:  sanity_check/sorted.vcf
2018-06-22 10:41:21.949938 Starting vep annotation:  sanity_check/sorted.vcf
2018-06-22 10:41:21.950630 Starting snpsift annotation:  sanity_check/sorted.vcf
2018-06-22 10:41:22.116361 Finished decipher annotation, it took:  0:00:00.167376
2018-06-22 10:41:22.197224 Finished snpsift annotation, it took:  0:00:00.246594
Use of uninitialized value in concatenation (.) or string at /home/aurelien.beliard/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/data/vep_data/Plugins/dbNSFP.pm line 227.
Use of uninitialized value in concatenation (.) or string at /home/aurelien.beliard/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/data/vep_data/Plugins/dbNSFP.pm line 227.
Use of uninitialized value in concatenation (.) or string at /home/aurelien.beliard/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/data/vep_data/Plugins/dbNSFP.pm line 227.
Use of uninitialized value in concatenation (.) or string at /home/aurelien.beliard/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/data/vep_data/Plugins/dbNSFP.pm line 227.
Use of uninitialized value in concatenation (.) or string at /home/aurelien.beliard/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/data/vep_data/Plugins/dbNSFP.pm line 227.
2018-06-22 10:41:24.257465 Finished vep annotation, it took:  0:00:02.307527
2018-06-22 10:41:45.991472 Finished snpEff annotation, it took:  0:00:24.042587
2018-06-22 10:41:45.991741 Merging all VCF Files...
2018-06-22 10:41:45.991942 Starting merge:  sanity_check/sorted.vcf

=============================================
vcfanno version 0.2.9 [built with go1.10]

see: https://github.com/brentp/vcfanno
=============================================
vcfanno.go:112: [Flatten] unable to open file: //home/aurelien.beliard/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/data/dbsnp/clinvar_20180429.vcf.gz in 
2018-06-22 10:41:46.324350 Finished merge, it took:  0:00:00.332408
2018-06-22 10:41:46.324934 Convert VCF to CSV...
Traceback (most recent call last):
  File "/home/aurelien.beliard/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/scripts/vcf2csv.py", line 91, in <module>
    vcf_header = Get_vcfheader(vcffile)
  File "/home/aurelien.beliard/.virtualenvs/pynnotator_env/lib/python3.5/site-packages/pynnotator-1.6-py3.5.egg/pynnotator/scripts/vcf2csv.py", line 89, in Get_vcfheader
    return header_tags
UnboundLocalError: local variable 'header_tags' referenced before assignment
2018-06-22 10:41:46.365109 Finished Annotation, it took 0:00:24.427129

Thanks for your help

Aurélien Béliard
bioinformatics engineer
brain and spine institute
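
An UnboundLocalError of this kind usually means a variable is assigned only inside a loop or branch that never ran, for example because the input file had no header line. The actual code of vcf2csv.py is not shown in this report, so the following is only a sketch of the general pattern, with a hypothetical `get_vcf_header` that initialises its result before the loop.

```python
def get_vcf_header(lines):
    """Return the column names from the #CHROM line, or [] if there is none."""
    header_tags = []  # assigned up front so the return below is always valid
    for line in lines:
        if line.startswith("#CHROM"):
            header_tags = line.rstrip("\n").lstrip("#").split("\t")
            break
    return header_tags
```

A caller can then treat an empty list as "no header found" and stop with a clear message instead of a traceback.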

Refactor concurrency of pynnotator module

Exception in thread Thread-7:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/projects/mendelmd/mendelmd_source/mendelmdenv/lib/python3.5/site-packages/pynnotator/helpers/vcf_annotator.py", line 140, in annotate
vcf_file = open('%s' % (vcf_file), 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'pynnotator/part.00'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/projects/mendelmd/mendelmd_source/mendelmdenv/lib/python3.5/site-packages/pynnotator/annotator.py", line 348, in vcf_annotator
annotator_obj.run()
File "/projects/mendelmd/mendelmd_source/mendelmdenv/lib/python3.5/site-packages/pynnotator/helpers/vcf_annotator.py", line 62, in run
pool.map(self.annotate, range(1,self.cores+1))
File "/usr/lib/python3.5/multiprocessing/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
FileNotFoundError: [Errno 2] No such file or directory: 'pynnotator/part.00'

Varscan2 does not generate a valid VCF!

##source=VarScan2

Could not parse header line: FORMAT=<ID=ZYG,Number=1,Type=String,Description="Zygosity(Hetz, Homz, Other)")>
Stopped at [)].
at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.14/src/perl/Vcf.pm line 172, <ANONIO> line 25.
Vcf::throw(Vcf4_1=HASH(0xcc1d08), "Could not parse header line: FORMAT=<ID=ZYG,Number=1,Type=Str"...) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.14/src/perl/Vcf.pm line 2981
Vcf4_0::parse_header_line(Vcf4_1=HASH(0xcc1d08), "##FORMAT=<ID=ZYG,Number=1,Type=String,Description="Zygosity(H"...) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.14/src/perl/Vcf.pm line 625
VcfReader::_next_header_line(Vcf4_1=HASH(0xcc1d08)) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.14/src/perl/Vcf.pm line 598
VcfReader::parse_header(Vcf4_1=HASH(0xcc1d08)) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.14/src/perl/Vcf.pm line 2555
VcfReader::run_validation(Vcf4_1=HASH(0xcc1d08)) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.14/src/perl/vcf-validator line 60
main::do_validation(HASH(0x832f30)) called at /projects/pynnotator/pynnotator/libs/vcftools/vcftools-0.1.14/src/perl/vcf-validator line 14
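
The parser stops at the stray ")" that sits between the closing quote of the Description and the final ">". A possible workaround is to repair such header lines before annotation. The sketch below is an assumption about how that could be done, with a hypothetical `repair_header_line`; it is not something pynnotator is known to do.

```python
def repair_header_line(line):
    """Drop a stray ')' between the closing quote and '>' of a ## header line."""
    body = line.rstrip("\n")
    ending = line[len(body):]  # preserve the original line ending, if any
    if body.startswith("##") and body.endswith('")>'):
        body = body[:-3] + '">'
    return body + ending
```

Lines that do not end in `")>` are returned unchanged, so the function can be applied to every line of the file.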
