ding-lab / charger

Characterization of Germline variants

Home Page: https://ding-lab.github.io/CharGer/

License: GNU General Public License v3.0

Python 100.00%
diseases germline-variants pathogenicity variants characterization annotations clinvar vep exac acmg

charger's Issues

I can not get any output file using v0.6.0b1 version

vep command:
~/VEP/ensembl-vep/vep --assembly GRCh38 --cache --dir_plugins /home/yubau/.vep/Plugins --everything --fasta /home/yubau/.vep/homo_sapiens/99_GRCh38 --force_overwrite --fork 48 --format vcf --input_file HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vcf --offline --output_file HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf --symbol --terms SO --tsl --vcf

VEP-annotated VCF used as input (first 1,000 lines, from head -n 1000; the full file has 4,773,130 lines when run through charger):
HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep_1000.txt

install:
mkdir CharGer_test_python38
cd CharGer_test_python38
conda create --name CharGer_python_38 python=3.8
conda activate CharGer_python_38
pip install pysam

(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ conda list
# packages in environment at /home/yubau/anaconda2/envs/CharGer_python_38:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
attrs 19.3.0 pypi_0 pypi
ca-certificates 2020.1.1 0
certifi 2019.11.28 py38_0
charger 0.6.0b1 pypi_0 pypi
cyvcf2 0.11.6 pypi_0 pypi
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20181209 hc058e9b_0
libffi 3.2.1 hd88cf55_4
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
loguru 0.4.1 pypi_0 pypi
ncurses 6.2 he6710b0_0
numpy 1.18.1 pypi_0 pypi
openssl 1.1.1d h7b6447c_4
pip 20.0.2 py38_1
pysam 0.15.4 pypi_0 pypi
python 3.8.1 h0371630_1
readline 7.0 h7b6447c_5
setuptools 45.2.0 py38_0
sqlite 3.31.1 h7b6447c_0
tk 8.6.8 hbc83047_0
wheel 0.34.2 py38_0
xz 5.2.4 h14c3975_4
zlib 1.2.11 h7b6447c_3

(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ pip install --pre -i https://pypi.org/simple/ --extra-index-url https://test.pypi.org/simple/ charger

command:

(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ charger --input HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf --output HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.charger.tsv
INFO | Running CharGer v0.6.0b1 with parameters: --input HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf --output HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.charger.tsv

----screen info start-------------------------------------------------------------------------------------
INFO | Validate the given config
DEBUG | Given config: CharGerConfig(input=PosixPath('HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf'), output=PosixPath('HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.charger.tsv'), hotspot3d_cluster=None, pathogenic_variant=None, override_variant_info=False, include_vcf_details=False, clinvar_table=None, rare_threshold=0.0005, common_threshold=0.005, acmg_module_scores={'PVS1': 8, 'PS1': 7, 'PS2': 4, 'PS3': 4, 'PS4': 4, 'PM1': 2, 'PM2': 2, 'PM3': 2, 'PM4': 2, 'PM5': 2, 'PM6': 2, 'PP1': 1, 'PP2': 1, 'PP3': 1, 'PP4': 1, 'PP5': 1, 'BP1': -1, 'BP2': -1, 'BP3': -1, 'BP4': -1, 'BP5': -1, 'BP6': -1, 'BP7': -1, 'BS1': -4, 'BS2': -4, 'BS3': -4, 'BS4': -4, 'BA1': -8}, charger_module_scores={'PSC1': 4, 'PMC1': 2, 'PPC1': 1, 'PPC2': 1, 'BMC1': -2, 'BSC1': -6}, min_pathogenic_score=9, min_likely_pathogenic_score=5, max_likely_benign_score=-4, max_benign_score=-8, disease_specific=False, inheritance_gene_table=None, PP2_gene_list=None, BP1_gene_list=None)
INFO | Read input VCF from HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf
DEBUG | VEP version 99 with CSQ format [66 fields]: Allele,Consequence,IMPACT,SYMBOL,Gene,Feature_type,Feature,BIOTYPE,EXON,INTRON,HGVSc,HGVSp,cDNA_position,CDS_position,Protein_position,Amino_acids,Codons,Existing_variation,DISTANCE,STRAND,FLAGS,VARIANT_CLASS,SYMBOL_SOURCE,HGNC_ID,CANONICAL,MANE,TSL,APPRIS,CCDS,ENSP,SWISSPROT,TREMBL,UNIPARC,GENE_PHENO,SIFT,PolyPhen,DOMAINS,miRNA,HGVS_OFFSET,AF,AFR_AF,AMR_AF,EAS_AF,EUR_AF,SAS_AF,AA_AF,EA_AF,gnomAD_AF,gnomAD_AFR_AF,gnomAD_AMR_AF,gnomAD_ASJ_AF,gnomAD_EAS_AF,gnomAD_FIN_AF,gnomAD_NFE_AF,gnomAD_OTH_AF,gnomAD_SAS_AF,MAX_AF,MAX_AF_POPS,CLIN_SIG,SOMATIC,PHENO,PUBMED,MOTIF_NAME,MOTIF_POS,HIGH_INF_POS,MOTIF_SCORE_CHANGE
SUCCESS | Read total 4,772,905 variants from the input VCF
WARNING | Inheritance gene table is not provided, CharGer cannot make ACMG PVS1/PM4 calls or CharGer PSC1/PPC1 calls. Disable all these modules
WARNING | CharGer cannot make PP2 calls without the given gene list. Disable PP2 module
WARNING | CharGer cannot make BP1 calls without the given gene list. Disable BP1 module
INFO | Skip matching ClinVar
INFO | Run all ACMG modules
INFO | Skipped PVS1 module
INFO | Skipped PM4 module
INFO | Run all CharGer modules
INFO | Skipped PSC1 module
INFO | Running PMC1 module
INFO | Skipped PPC1 module
INFO | Running PPC2 module
----screen info end-------------------------------------------------------------------------------------
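
The cutoffs in the DEBUG config line above (min_pathogenic_score=9, min_likely_pathogenic_score=5, max_likely_benign_score=-4, max_benign_score=-8) suggest a score-based classification. Below is a minimal sketch of how such scores might be combined. It assumes that the scores of all triggered modules are simply summed and compared with the four cutoffs, which the log does not confirm. `classify` is an illustrative helper, not CharGer's actual code.

```python
# Subset of module scores and the four cutoffs, copied from the DEBUG config line.
MODULE_SCORES = {"PVS1": 8, "PS1": 7, "PM2": 2, "PP3": 1, "BP4": -1, "BS1": -4, "BA1": -8}
MIN_PATHOGENIC = 9
MIN_LIKELY_PATHOGENIC = 5
MAX_LIKELY_BENIGN = -4
MAX_BENIGN = -8


def classify(triggered_modules):
    """Sum the scores of the triggered modules and map the total to a label.

    Assumption: a plain sum compared against the cutoffs; the real tool
    may combine evidence differently.
    """
    total = sum(MODULE_SCORES[name] for name in triggered_modules)
    if total >= MIN_PATHOGENIC:
        label = "Pathogenic"
    elif total >= MIN_LIKELY_PATHOGENIC:
        label = "Likely Pathogenic"
    elif total <= MAX_BENIGN:
        label = "Benign"
    elif total <= MAX_LIKELY_BENIGN:
        label = "Likely Benign"
    else:
        label = "Uncertain Significance"
    return total, label
```

Under this assumption a single BA1 hit (score -8) is enough to reach the benign cutoff on its own.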

(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ ls
CHANGES.rst demo.vcf demo.vep.vcf_summary.html HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf poetry.lock README.md setup.cfg tests tox.ini
clinvar_20200210.vep.vcf demo.vep.vcf docs LICENSE.txt pyproject.toml scripts src tox_conda.ini

(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ python -V
Python 3.8.1
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ which python
~/anaconda2/envs/CharGer_python_38/bin/python
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ which pip
~/anaconda2/envs/CharGer_python_38/bin/pip
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ pip --version
pip 20.0.2 from /home/yubau/anaconda2/envs/CharGer_python_38/lib/python3.8/site-packages/pip (python 3.8)
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ charger --version
CharGer v0.6.0b1
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ which charger
~/anaconda2/envs/CharGer_python_38/bin/charger

I can not get any output file using v0.6.0b1 version

I used VEP to annotate the demo.vcf file from version 0.5.4 and ran charger v0.6.0b1 on it, but I did not get any output file.

Could you upload a demo.vcf for charger v0.6.0b1? Thanks!

yubau, from taiwan

Unable to install charger via pip or conda

When I try conda install charger, I get:
PackagesNotFoundError: The following packages are not available from current channels:

Also, pip install charger returns:
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
error in PyVCF setup command: use_2to3 is invalid.
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
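
For context, `use_2to3 is invalid` is the message setuptools gives since version 58, which removed the `use_2to3` option that the PyVCF setup script still passes. The sketch below only checks which side of that boundary a setuptools version falls on. The helper names are made up for illustration, and whether an older setuptools resolves the install in a given environment is untested here.

```python
from importlib import metadata


def accepts_use_2to3(setuptools_version):
    """Return True if this setuptools version predates the removal of use_2to3."""
    major = int(setuptools_version.split(".")[0])
    return major < 58


def installed_setuptools_version():
    """Return the installed setuptools version string, or None if it is absent."""
    try:
        return metadata.version("setuptools")
    except metadata.PackageNotFoundError:
        return None
```

For example, the setuptools 45.2.0 listed in the Python 3.8 environment of the first report would still accept the option, while any 58.x or later release would not.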

is this software alive?

Hey, are there any clear instructions on how to run the software? After digging through the Python files I put together this command:
charger --input tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz --pathogenic-variant tests/examples/annotations/grch37_pathogenic_variants.vcf.gz --inheritance-gene-table tests/examples/annotations/inheritance_gene_table.tsv.gz --PP2-gene-list tests/examples/annotations/pp2_gene_list.txt.gz --clinvar-table tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz --output out.tsv

However, I get no output file.

I only get messages like:

charger --input tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz --pathogenic-variant tests/examples/annotations/grch37_pathogenic_variants.vcf.gz --inheritance-gene-table tests/examples/annotations/inheritance_gene_table.tsv.gz --PP2-gene-list tests/examples/annotations/pp2_gene_list.txt.gz --clinvar-table tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz --output out.tsv
INFO | Running CharGer v0.6.0b1 with parameters: --input tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz --pathogenic-variant tests/examples/annotations/grch37_pathogenic_variants.vcf.gz --inheritance-gene-table tests/examples/annotations/inheritance_gene_table.tsv.gz --PP2-gene-list tests/examples/annotations/pp2_gene_list.txt.gz --clinvar-table tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz --output out.tsv
INFO | Validate the given config
DEBUG | Given config: CharGerConfig(input=PosixPath('tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz'), output=PosixPath('out.tsv'), hotspot3d_cluster=None, pathogenic_variant=PosixPath('tests/examples/annotations/grch37_pathogenic_variants.vcf.gz'), override_variant_info=False, include_vcf_details=False, clinvar_table=PosixPath('tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz'), rare_threshold=0.0005, common_threshold=0.005, acmg_module_scores={'PVS1': 8, 'PS1': 7, 'PS2': 4, 'PS3': 4, 'PS4': 4, 'PM1': 2, 'PM2': 2, 'PM3': 2, 'PM4': 2, 'PM5': 2, 'PM6': 2, 'PP1': 1, 'PP2': 1, 'PP3': 1, 'PP4': 1, 'PP5': 1, 'BP1': -1, 'BP2': -1, 'BP3': -1, 'BP4': -1, 'BP5': -1, 'BP6': -1, 'BP7': -1, 'BS1': -4, 'BS2': -4, 'BS3': -4, 'BS4': -4, 'BA1': -8}, charger_module_scores={'PSC1': 4, 'PMC1': 2, 'PPC1': 1, 'PPC2': 1, 'BMC1': -2, 'BSC1': -6}, min_pathogenic_score=9, min_likely_pathogenic_score=5, max_likely_benign_score=-4, max_benign_score=-8, disease_specific=False, inheritance_gene_table=PosixPath('tests/examples/annotations/inheritance_gene_table.tsv.gz'), PP2_gene_list=PosixPath('tests/examples/annotations/pp2_gene_list.txt.gz'), BP1_gene_list=None)
INFO | Read input VCF from tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz
DEBUG | VEP version 85 with CSQ format [62 fields]: Allele,Consequence,IMPACT,SYMBOL,Gene,Feature_type,Feature,BIOTYPE,EXON,INTRON,HGVSc,HGVSp,cDNA_position,CDS_position,Protein_position,Amino_acids,Codons,Existing_variation,DISTANCE,STRAND,FLAGS,VARIANT_CLASS,SYMBOL_SOURCE,HGNC_ID,CANONICAL,TSL,APPRIS,CCDS,ENSP,SWISSPROT,TREMBL,UNIPARC,GENE_PHENO,SIFT,PolyPhen,DOMAINS,HGVS_OFFSET,GMAF,AFR_MAF,AMR_MAF,EAS_MAF,EUR_MAF,SAS_MAF,AA_MAF,EA_MAF,ExAC_MAF,ExAC_Adj_MAF,ExAC_AFR_MAF,ExAC_AMR_MAF,ExAC_EAS_MAF,ExAC_FIN_MAF,ExAC_NFE_MAF,ExAC_OTH_MAF,ExAC_SAS_MAF,CLIN_SIG,SOMATIC,PHENO,PUBMED,MOTIF_NAME,MOTIF_POS,HIGH_INF_POS,MOTIF_SCORE_CHANGE
SUCCESS | Read total 550 variants from the input VCF
INFO | Read pathogenic VCF from tests/examples/annotations/grch37_pathogenic_variants.vcf.gz
DEBUG | VEP version 84 with CSQ format [52 fields]: Allele,Consequence,IMPACT,SYMBOL,Gene,Feature_type,Feature,BIOTYPE,EXON,INTRON,HGVSc,HGVSp,cDNA_position,CDS_position,Protein_position,Amino_acids,Codons,Existing_variation,DISTANCE,STRAND,FLAGS,SYMBOL_SOURCE,HGNC_ID,TSL,APPRIS,SIFT,PolyPhen,GMAF,AFR_MAF,AMR_MAF,EAS_MAF,EUR_MAF,SAS_MAF,AA_MAF,EA_MAF,ExAC_MAF,ExAC_Adj_MAF,ExAC_AFR_MAF,ExAC_AMR_MAF,ExAC_EAS_MAF,ExAC_FIN_MAF,ExAC_NFE_MAF,ExAC_OTH_MAF,ExAC_SAS_MAF,CLIN_SIG,SOMATIC,PHENO,PUBMED,MOTIF_NAME,MOTIF_POS,HIGH_INF_POS,MOTIF_SCORE_CHANGE
[W::vcf_parse] Contig '10' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '13' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '17' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '5' is not defined in the header. (Quick workaround: index the file with tabix.)
INFO | Read total 1,819 pathogenic variants from the VCF
INFO | Disable CharGer PMC1/PPC2 modules when inheritance gene table is provided
INFO | Read inheritance gene table from tests/examples/annotations/inheritance_gene_table.tsv.gz
INFO | Loaded inheritance mode of 152 genes
INFO | Read PP2 gene list from tests/examples/annotations/pp2_gene_list.txt.gz
INFO | Marked 152 genes for PP2
WARNING | CharGer cannot make BP1 calls without the given gene list. Disable BP1 module
INFO | Match input variants with ClinVar table at tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz
SUCCESS | Matched 2 out of 550 input variants to a ClinVar record
INFO | Run all ACMG modules
INFO | Running PVS1 module
INFO | Running PM4 module
INFO | Run all CharGer modules
INFO | Running PSC1 module
INFO | Skipped PMC1 module
INFO | Running PPC1 module
INFO | Skipped PPC2 module
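
To compare runs such as this one with the earlier report, the module lines can be pulled out of a captured log. This is a small sketch, assuming the log keeps the `INFO | Running X module` / `INFO | Skipped X module` wording shown above. `summarize_modules` is a hypothetical helper, not part of CharGer.

```python
import re

# Matches lines such as "INFO | Running PVS1 module" or "INFO | Skipped PMC1 module".
MODULE_LINE = re.compile(r"INFO\s*\|\s*(Running|Skipped)\s+(\w+)\s+module")


def summarize_modules(log_text):
    """Split the module names in a captured log into those run and those skipped."""
    ran, skipped = [], []
    for line in log_text.splitlines():
        match = MODULE_LINE.search(line)
        if match is None:
            continue  # e.g. "INFO | Run all ACMG modules" is not a per-module line
        target = ran if match.group(1) == "Running" else skipped
        target.append(match.group(2))
    return {"ran": ran, "skipped": skipped}
```

Saving the terminal output to a file and passing its contents to `summarize_modules` gives a compact record of which modules ran and which were skipped.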

Are you planning to maintain this program?

Do you support the cross-reference file?

Dear CharGer team,

I see that small versions of some cross-reference files are available in the "Demo" folder.
Do you plan to provide full versions of these files, e.g. the mode of inheritance file?

Thanks
Lee

CharGer - v0.5.2: Error running with MacArthur lab clinvar TSV

Hello,

We are encountering an error while running charger with the MacArthur lab ClinVar TSV.

Here are the details:

charger -m ./maf/APC_vcf_maf.maf -o APC_charger.tsv -H APC_inp.maf.3D_Proximity.pairwise.site.l0.ad10.r10.clusters -D --inheritanceGeneList inheritanceGeneList.txt --exac-vcf ~pyang/.vep/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz --mac-clinvar-tsv ./clinvar_alleles.single.b37.tsv.gz

Traceback (most recent call last):
 File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/bin/charger", line 743, in <module>
   main( sys.argv[1:] )
 File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/bin/charger", line 662, in main
   mutationTypes = mutationTypes , \
 File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/lib/python2.7/site-packages/charger/charger.py", line 821, in getExternalData
   self.getClinVar( **kwargs )
 File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/lib/python2.7/site-packages/charger/charger.py", line 842, in getClinVar
   clinvarSet = self.getMacClinVarTSV( macClinVarTSV )
 File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/lib/python2.7/site-packages/charger/charger.py", line 887, in getMacClinVarTSV
   [ description , status ] = self.parseMacPathogenicity( fields[12:17] )
 File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/lib/python2.7/site-packages/charger/charger.py", line 914, in parseMacPathogenicity
   isPathogenic = int( isPathogenic )
ValueError: invalid literal for int() with base 10: 'NM_005101.3:c.62G>A'

After digging a little into the code, we see that the header expected by the code differs from the header of our MacArthur Lab TSV.

The header listed in the script:

chrom pos ref alt measureset_type measureset_id rcv allele_id

is different from the header of our file, which looks like this:

zcat /sonas-hs/nwhgenomics/hpc/home/pyang/projects/charger/clinvar_alleles.single.b37.tsv.gz | head -n1
chrom	pos	ref	alt	start	stop	strand	variation_type	variation_id	rcv	scv	allele_id	symbol	hgvs_c	hgvs_p	molecular_consequence	clinical_significance	clinical_significance_ordered	pathogenic	likely_pathogenic	uncertain_significance	likely_benign	benign	review_status	review_status_ordered	last_evaluated	all_submitters	submitters_ordered	all_traits	all_pmids	inheritance_modes	age_of_onset	prevalence	disease_mechanism	origin	xrefs	dates_ordered	gold_stars	conflicted

If we are using the wrong ClinVar file, could you please point us to the right one?

Thank you in advance for your guidance and support.
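
The traceback is consistent with the parser slicing columns by fixed position: with the header printed above, `fields[12:17]` covers `symbol` through `clinical_significance`, so `int()` receives an HGVS string (`hgvs_c` sits at index 13). Below is a sketch of looking the columns up by name instead. It assumes the five count columns hold integers. The helper names and the sample row are hypothetical (only the `hgvs_c` value comes from the error message), and this is not CharGer's code.

```python
# First 23 column names of the header printed by zcat above.
HEADER = (
    "chrom pos ref alt start stop strand variation_type variation_id rcv scv "
    "allele_id symbol hgvs_c hgvs_p molecular_consequence clinical_significance "
    "clinical_significance_ordered pathogenic likely_pathogenic "
    "uncertain_significance likely_benign benign"
).split()

COUNT_COLUMNS = (
    "pathogenic",
    "likely_pathogenic",
    "uncertain_significance",
    "likely_benign",
    "benign",
)


def column_index(header):
    """Map each column name to its position in the header."""
    return {name: position for position, name in enumerate(header)}


def pathogenicity_counts(fields, index):
    """Read the five count columns by name rather than by a fixed slice."""
    return {name: int(fields[index[name]]) for name in COUNT_COLUMNS}


# Hypothetical data row; every value is invented except the hgvs_c string.
EXAMPLE_ROW = [
    "1", "100", "G", "A", "100", "100", "+", "Variant", "1", "RCV0", "SCV0",
    "1", "GENE1", "NM_005101.3:c.62G>A", "p.?", "missense_variant", "Benign",
    "benign", "0", "0", "0", "0", "1",
]
```

With this header, `HEADER[12:17]` contains no count column at all, whereas the name-based lookup keeps working if columns are added or reordered between releases of the table.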

problem running executable "charger" file

After installing the program, I type charger as suggested to view the executable's options, and I receive the following error:

File "/home/[username]/mycharger/bin/charger", line 256
print "CharGer ERROR: Command not recognized"
^
SyntaxError: Missing parentheses in call to 'print

Any advice on how to fix this? Thank you
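
This message is what Python 3 reports for a Python 2 `print` statement, which suggests that the installed script was written for Python 2 but is being started by a Python 3 interpreter. Here is a quick way to confirm that reading; the helper name is illustrative only.

```python
def compiles_under_current_python(source):
    """Return True if the running interpreter can parse the given source text."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


# The line from the traceback, and the same call written in function form.
PY2_STYLE = 'print "CharGer ERROR: Command not recognized"'
PY3_STYLE = 'print("CharGer ERROR: Command not recognized")'
```

On Python 3, `compiles_under_current_python(PY2_STYLE)` is False and `compiles_under_current_python(PY3_STYLE)` is True. Elsewhere in this list, v0.5.4 is run inside an environment created with python=2.7.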

module error

Hi

I installed CharGer successfully, but it failed to run. Here is what I ran:

[screenshot: q1]

error:

[screenshot: q2]

Null Variant classified as Benign

Hi, I have successfully run CharGer using the following command:

charger \
    -f ~/vcf/17D2625146_FB_hg19.vcf \
    -o test_charger_everything.tsv \
    -l -D \
    --mac-clinvar-tsv ~/clinvar_5b04ade/output/b37/single/clinvar_alleles.single.b37.tsv.gz \
    -z ~/charger_db/grch37_pathogenic_variants.vcf \
    --inheritanceGeneList ~/charger_db/inheritance_gene_table.tsv \
    --PP2 ~/charger_db/pp2_gene_list.txt

I got the database files from here: https://github.com/ding-lab/CharGer/tree/master/tests/examples/annotations
My VCF is annotated by VEP with the --everything flag. In this sample I have a particular variant, ENST00000267163.4:c.1954_1960+2del, which is a null variant and is not in the gnomAD database. However, CharGer reports that this variant has an allele frequency of 0.5 and assigns it BA1 evidence, hence classifying it as Benign.

This is the terminal output from CharGer run:

(skipping a lot of warning rows)
Running ClinVar took 30.4019100666seconds
Running exac took 2.28881835938e-05seconds
CharGer module PVS1
- truncations in genes where LOF is a known mechanism of the disease
- require the mode of inheritance to be dominant (assuming heterzygosity) and co-occurence with reduced gene expression
- run concurrently with PSC1, PMC1, PM4, PPC1, and PPC2 -
CharGer module PS1
- same peptide change as a previously established pathogenic variant
PS1 found 0 pathogenic variants
CharGer module PS2
- de novo with maternity and paternity confirmation and no family history
CharGer module PS3: Well-established in vitro or in vivo functional studies             supportive of a damaging effect on the gene or gene product
CharGer module PS4: not yet implemented
CharGer module PM1:  Located in a mutational hot spot and/or critical and well-established              functional domain (e.g., active site of an enzyme) without benign variation
CharGer::PM1 Warning: clustersFile is not supplied. PM1 was not executed.
CharGer module PM2
- absent or extremely low frequency in controls
CharGer module PM3: not yet implemented
CharGer module PM4
- protein length changes due to inframe indels or nonstop variant of selected genes -
CharGer module PM5
- different peptide change of a pathogenic variant at the same reference peptide
PM5 found 0 pathogenic variants
CharGer module PM6
- assumed de novo without maternity and paternity confirmation
CharGer module PP1
- cosegregation with disease in family members in a known disease gene
CharGer module PP2: Missense variant in a gene that has low rate of benign missense and in which missense are common mechanism of disease
CharGer module PP3
- multiple lines of in silico evidence of deliterous effect
Found 0 variants with >= 2 of in silico evidence
CharGer module PP4: not yet implemented
CharGer module PP5: not yet implemented
CharGer module BA1
- allele frequency >5%
CharGer module BS1: not yet implemented
CharGer module BS2: not yet implemented
CharGer module BS3: not yet implemented
CharGer module BS4: not yet implemented
CharGer module BP1: Missense variant in a gene for which primarily truncations cause disease
CharGer::BP1 Error: Cannot evaluate BP1: No BP1 gene list supplied.
CharGer module BP2: not yet implemented
CharGer module BP3: not yet implemented
CharGer module BP4
 - in silico evidence of no damage
Found 0 variants with >= 2 with in silico evidence
CharGer module BP5: not yet implemented
CharGer module BP6: not yet implemented
CharGer module BP7: not yet implemented
CharGer module PSC1
Recessive truncations of susceptible genes
CharGer module PMC1
Truncations of genes when no gene list provided
CharGer module PPC1
- protein length changes due to inframe indels or nonstop variant of other, not-specificied genes -
CharGer module PPC2
- protein length changes due to inframe indels or nonstop variant when no susceptibility genes given -
CharGer module BSC1
- same peptide change as a previously established benign variant
BSC1 found 0 benign variants
CharGer module BMC1
- different peptide change of a benign variant at the same reference peptide
BMC1 found 0 benign variants
0.0005 < 0.05
write 27 charged user variants to test_charger_everything.tsv
charger::writeSummary Warning: skipping pubmed link tests

CharGer run Times:
input parse time (s): 0.000585079193115
get input data time (s): 2.63706803322
get external data time (s): 30.4032850266
modules run time (s): 0.0069580078125
classification time (s): 0.00322985649109
CharGer full run time (s): 33.0511260033

What has gone wrong in my case? How can I improve the result? Thank you very much.
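
One thing worth checking in a case like this is how a missing frequency is represented in the annotated VCF. The sketch below illustrates the distinction. It assumes the frequency arrives as a string that is empty or `.` when the variant is absent from the population database. It shows the BA1 rule printed in the log (allele frequency >5%) and is not CharGer's implementation; both function names are hypothetical.

```python
def parse_allele_frequency(raw):
    """Return the allele frequency as a float, or None when no value is annotated."""
    if raw is None or raw.strip() in ("", "."):
        return None
    return float(raw)


def triggers_ba1(raw_frequency, threshold=0.05):
    """Apply the >5% rule only when a frequency is actually present."""
    frequency = parse_allele_frequency(raw_frequency)
    return frequency is not None and frequency > threshold
```

Here an absent annotation never counts as common: `triggers_ba1("")` is False, while `triggers_ba1("0.5")` is True.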

biomine warning

Hello,
I would like to ask about the reason for these biomine warnings:
biomine warning: EX not found in conversion tables
biomine warning: ECPIC not found in conversion tables
biomine warning: EX not found in conversion tables
biomine warning: ECPIC not found in conversion tables
biomine warning: EX not found in conversion tables
biomine warning: ECPIC not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables

This is my shell command:
charger -f ${in} -o ${out} -D --inheritanceGeneList ${pvs1} -E -l --mac-clinvar-tsv ${clinvar} --PP2GeneList ${pp2} --BP1GeneList ${bp1} -H ${pm5} -g ${mmGenes} -z ${mmVariants} --exac-vcf ${ExAC};
Looking forward to your reply, thanks.

could not find amino acid change or intronic change

Hi team,
I ran CharGer on a WGS VCF file (with CSQ annotation), and a warning message like
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
Hint: Is the input amino acid change column correct?
Problem variant: :None:None-None->-:::c.:::p. -- p.Pro22=

appears almost continually.
Should I filter out the intronic variants before running CharGer?
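
If pre-filtering turns out to be appropriate, the consequence terms needed for it are in the CSQ annotation itself. This sketch shows how intronic-only annotations could be identified. It assumes the pipe-separated field order given by the `Format:` string in the VCF header, with transcripts separated by commas and multiple consequence terms joined by `&`. The function names are made up for illustration.

```python
def parse_csq(csq_format, csq_value):
    """Turn one CSQ INFO value into a list of dicts, one per annotated transcript."""
    names = csq_format.split("|")
    entries = []
    for transcript in csq_value.split(","):
        values = transcript.split("|")
        entries.append(dict(zip(names, values)))
    return entries


def is_intronic_only(entry):
    """True when the only consequence term for this transcript is intron_variant."""
    terms = entry.get("Consequence", "").split("&")
    return terms == ["intron_variant"]
```

A record could then be kept when at least one of its entries is not intronic-only; whether that removes the warning above would need to be checked on real data.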

module options vs readme

There is a mismatch in the description of the module options. The readme should be updated to reflect the program help text.

can't input maf.file

Hello,
When I input my MAF file, charger always prints "CharGer Error: bad .maf file", so I would like to know whether there are errors in the file. I have pasted its content below. Many thanks!

maf.file:
1 Hugo_Symbol Entrez_Gene_Id Chromosome Start_position End_position Variant_Classification Variant_Type Reference_Allele Tumor
2 CDK11A 728642 1 1647893 1647894 In_Frame_Ins INS - - TTTCTT rs200224067|rs199866927|rs144636354
3 PRDM2 7799 1 14106394 14106395 In_Frame_Ins INS - - CTC rs2308040|rs148293494|rs59028030
4 SLC25A34 284723 1 16063253 16063265 Frame_Shift_Del DEL CAGCCTGGCGTGC CAGCCTGGCGTGC - BGI-0
5 SERINC2 347735 1 31905889 31905890 In_Frame_Ins INS - - CAG rs33956499|rs3050461|rs5773362 byFrequency
6 GJB4 127534 1 35227008 35227011 Frame_Shift_Del DEL TGTC TGTC - rs146812843 BGI-0103-ESCC-060N
7 MAP7D1 55700 1 36643701 36643703 In_Frame_Del DEL AGA AGA - rs3045695|rs141305015|rs200892098 byFre
8 KLF17 128209 1 44596380 44596382 In_Frame_Del DEL CAA CAA - rs200059598|rs34057178 byFrequency
9 CYP4B1 1580 1 47280747 47280748 Splice_Site DEL AT AT - rs55835239|rs397687617|rs3215983

Problems to run CharGer v0.5.4

Referring to commit 7d7d291.

I removed the entire anaconda2 directory:

rm -r ~/anaconda2

mkdir ~/CharGer
cd ~/CharGer
wget https://repo.anaconda.com/archive/Anaconda2-5.2.0-Linux-x86_64.sh
bash Anaconda2-5.2.0-Linux-x86_64.sh

Do you accept the license terms? [yes|no]
[no] >>> yes

Anaconda2 will now be installed into this location:
/home/yubau/anaconda2

Press ENTER to confirm the location
Press CTRL-C to abort the installation
Or specify a different location below
[/home/yubau/anaconda2] >>>

Do you wish the installer to prepend the Anaconda2 install location
to PATH in your /home/yubau/.bashrc ? [yes|no]
[no] >>>yes

Do you wish to proceed with the installation of Microsoft VSCode? [yes|no]
>>> no

(Anaconda installation finished)

conda create --name CharGer python=2.7
conda activate CharGer

(CharGer) [yubau@cmuh-i2 ~]$ which pip
~/anaconda2/envs/CharGer/bin/pip
(CharGer) [yubau@cmuh-i2 ~]$ pip --version
pip 19.3.1 from /home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/pip (python 2.7)
(CharGer) [yubau@cmuh-i2 ~]$ conda --version
conda 4.5.4
(CharGer) [yubau@cmuh-i2 ~]$ which conda
~/anaconda2/bin/conda

conda install pysam
pip install pysam

(CharGer) [yubau@cmuh-i2 ~]$ conda list
# packages in environment at /home/yubau/anaconda2/envs/CharGer:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
AdvancedHTMLParser 9.0.1
BioMine 0.9.5
ca-certificates 2020.1.1 0
certifi 2019.11.28 py27_0
chardet 3.0.4
CharGer 0.5.4
idna 2.9
libedit 3.1.20181209 hc058e9b_0
libffi 3.2.1 hd88cf55_4
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
ncurses 6.2 he6710b0_0
numpy 1.16.6
openssl 1.1.1d h7b6447c_4
pip 19.3.1 py27_0
pysam 0.6 py27_0
pysam 0.15.4
python 2.7.17 h9bab390_0
PyVCF 0.6.8
QueryableList 3.1.0
readline 7.0 h7b6447c_5
requests 2.23.0
scipy 1.2.3
setuptools 44.0.0 py27_0
sqlite 3.31.1 h7b6447c_0
tk 8.6.8 hbc83047_0
urllib3 1.25.8
wheel 0.33.6 py27_0
zlib 1.2.11 h7b6447c_3

(CharGer) [yubau@cmuh-i2 ~]$ pip list
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Package Version
------------------ -------------------
AdvancedHTMLParser 9.0.1
BioMine 0.9.5
bz2file 0.98
certifi 2019.11.28
chardet 3.0.4
CharGer 0.5.4
idna 2.9
numpy 1.16.6
pip 19.3.1
pysam 0.15.4
PyVCF 0.6.8
QueryableList 3.1.0
requests 2.23.0
scipy 1.2.3
setuptools 44.0.0.post20200106
urllib3 1.25.8
virtualenv 16.4.3
wheel 0.33.6
xopen 0.5.0

cd ~/CharGer
wget -O CharGer.zip https://github.com/ding-lab/CharGer/archive/master.zip
unzip CharGer.zip
mv CharGer-master/ CharGer
cd CharGer
pip install .

(CharGer) [yubau@cmuh-i2 ~]$ which charger
~/anaconda2/envs/CharGer/bin/charger
(CharGer) [yubau@cmuh-i2 ~]$ charger
CharGer ERROR: Command not recognized

CharGer - v0.5.4
...
..
.

Log in to a Linux (CentOS 7) terminal:

conda activate CharGer
cd ~/CharGer/CharGer/Demo
charger -f demo.vcf -o demo.tsv

and I got an output file.

Then I ran it with other flags for the different data-access options:
charger -f demo.vcf -o demo.t.tsv -t
charger -f demo.vcf -o demo.E.tsv -E
charger -f demo.vcf -o demo.x.tsv -x
charger -f demo.vcf -o demo.tEx.tsv -t -E -x

None of these runs produced a fatal error.

But when I run:

charger -f demo.vcf -o demo.tsv -l

I get this message:

Unsupported VEP version or no gnomAD AF annotation in input file; will search for ExAC frequencies...
Unsupported VEP version or no ExAC AF annotation in input file; will search for 1000 Genomes frequencies...
Unsupported VEP version or no gnomAD AF annotation in input file; will search for ExAC frequencies...
Unsupported VEP version or no ExAC AF annotation in input file; will search for 1000 Genomes frequencies...
Skipping: 0 for filters and 0 for AF and 0 for mutation types out of 550
No gene list file uploaded. CharGer will not make PVS1 calls.
No PP2 gene list file uploaded. CharGer will not make PP2 calls.
No BP1 gene list file uploaded. CharGer will not make BP1 calls.
No expression file uploaded. CharGer will allow all passed truncations without expression data in PVS1.
charger::getVEP Warning: skipping VEP
Running VEP took 2.09808349609e-05seconds
charger::getClinVar warning: ClinVar ReST search batch size given is greater than max allowed (50). Overriding to max search batch size.
Traceback (most recent call last):
  File "/home/yubau/anaconda2/envs/CharGer/bin/charger", line 744, in <module>
    main( sys.argv[1:] )
  File "/home/yubau/anaconda2/envs/CharGer/bin/charger", line 663, in main
    mutationTypes = mutationTypes , \
  File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/charger/charger.py", line 879, in getExternalData
    self.getClinVar( **kwargs )
  File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/charger/charger.py", line 905, in getClinVar
    self.getClinVarviaREST( **kwargs )
  File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/charger/charger.py", line 918, in getClinVarviaREST
    ent = entrezapi()
  File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py", line 74, in __init__
    self.setRequestLimits()
  File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py", line 332, in setRequestLimits
    self.setSummaryBatchSize( entrezaip.summaryBatchSize )
NameError: global name 'entrezaip' is not defined

and I got an empty output file.

I also ran:
charger -f demo.vcf -o demo.ltEx.tsv -l -t -E -x

[screenshot]

But when I run:
charger -f demo.vcf -o demo.l.tsv -l --exac-vcf ~/CharGer3/CharGer/Demo/ExAC.r1.sites.vep.vcf --mac-clinvar-tsv ~/CharGer3/CharGer/Demo/clinvar_alleles.multi.b37.tsv.gz

[screenshot]

it works. Why do I need to add "--exac-vcf ~/CharGer3/CharGer/Demo/ExAC.r1.sites.vep.vcf --mac-clinvar-tsv ~/CharGer3/CharGer/Demo/clinvar_alleles.multi.b37.tsv.gz" in order to access ClinVar data?

I also ran:
charger -f demo.vcf -o demo.ltEx.tsv -l -t -E -x --exac-vcf ~/CharGer3/CharGer/Demo/ExAC.r1.sites.vep.vcf --mac-clinvar-tsv ~/CharGer3/CharGer/Demo/clinvar_alleles.multi.b37.tsv.gz

[screenshot]

Why?

Someone else has reported 3 bugs
(https://www.jianshu.com/p/544caf92b24c)

For one of the 3 bugs, they say the following file needs to be modified:
/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py

On lines 332 and 333, entrezaip needs to be changed to entrezapi.

Is that really the case?
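
The traceback in this report does end in `NameError: global name 'entrezaip' is not defined` at `entrezapi.py` line 332, which is consistent with a misspelled class name. Below is a stripped-down stand-in showing why such a misspelling fails only when the method is called and how the corrected name resolves. It is not BioMine's actual code; the class body, the method names and the value 500 are invented.

```python
class entrezapi(object):
    """Simplified stand-in for the class defined in entrezapi.py."""

    summaryBatchSize = 500  # hypothetical value, for illustration only

    def set_request_limits_with_typo(self):
        # 'entrezaip' is not defined anywhere, so Python raises NameError
        # when this line executes, not when the module is imported.
        return entrezaip.summaryBatchSize  # noqa: F821

    def set_request_limits_fixed(self):
        # The correctly spelled class name resolves to the class attribute.
        return entrezapi.summaryBatchSize
```

This also matches the observation that runs supplying local files did not fail: the misspelled line is only reached on the code path that constructs the Entrez client.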

(CharGer) [yubau@cmuh-i2 Demo]$ which python
~/anaconda2/envs/CharGer/bin/python
(CharGer) [yubau@cmuh-i2 Demo]$ python -V
Python 2.7.17 :: Anaconda, Inc.

less ~/.bashrc

# added by Anaconda2 installer
export PATH="/home/yubau/anaconda2/bin:$PATH"
export PATH="/home/yubau/anaconda2/envs/CharGer/bin:$PATH"

source "/home/yubau/anaconda2/etc/profile.d/conda.sh"

And where is the "diseases file" or "gene\tdisease\tmode_of_inheritance.tsv"?

Access data
-l ClinVar (flag)
-x ExAC (flag)
-E VEP (flag)
-t TCGA cancer types (flag)
Using these flags turns on accession features built in. For the ClinVar, ExAC, and VEP flags, if no local VEP or database is provided, then BioMine will be used to access the ReST interface. CharGer is currently capable of handling all VEP releases up until release 97. [[[[[The TCGA flag allows disease determination from sample barcodes in a .maf when using a diseases file (see below).]]]]]

Cross-reference data files
-z pathogenic variants, .vcf
-e expression matrix file, .tsv
--inheritanceGeneList inheritance gene list file, (format: gene\tdisease\tmode_of_inheritance) .txt
--PP2GeneList PP2 gene list file, (format: column of genes) .txt
--BP1GeneList BP1 gene list file, (format: column of genes) .txt
[[[[[ -d diseases file, (format: gene\tdisease\tmode_of_inheritance) .tsv]]]]]
-n de novo file, standard .maf
-a assumed de novo file, standard .maf
-c co-segregation file, standard .maf
-H HotSpot3D clusters file, .clusters
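
Several of the options above expect a three-column, tab-separated file (gene, disease, mode_of_inheritance). A minimal sketch of reading and checking such a file before passing it to charger is shown here. The function name read_gene_table, the gene and disease names, and the values "dominant" and "recessive" are placeholders for illustration, not taken from any file shipped with CharGer.

```python
import io

def read_gene_table(handle):
    """Parse gene<TAB>disease<TAB>mode_of_inheritance rows into a dict keyed by gene."""
    table = {}
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        columns = line.split("\t")
        if len(columns) != 3:
            raise ValueError(
                "line %d: expected 3 tab-separated columns, got %d"
                % (line_number, len(columns))
            )
        gene, disease, mode = columns
        table[gene] = (disease, mode)
    return table

# Placeholder content; replace with open("your_gene_list.txt") for a real file.
example = io.StringIO(
    "GENE_A\tdisease_x\tdominant\n"
    "GENE_B\tdisease_y\trecessive\n"
)
genes = read_gene_table(example)
print(genes["GENE_A"])
```

A row with a missing column raises ValueError with the line number, which is easier to act on than a failure deep inside a later run.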

I cannot find the diseases file. Can you upload a "diseases file" or "gene\tdisease\tmode_of_inheritance.tsv", or tell me where the file is? Thanks.

I need to run charger because my whole team is waiting on it.
We have two thousand whole-genome sequencing VCF files and hundreds of cancer panel VCF files waiting to be run through charger, especially with access to ClinVar data.

Please reply to me. Thank you so much.

yubau, from taiwan

IndexError: list index out of range while running with Mac-Clinvar

Hi, I'm trying out CharGer for prediction and annotation of my VCF file (annotated by VEP). My command is as follows:


charger \
    -f test_charger_vep.vcf \
    -o test_charger_vep2.tsv \
    -l -D \
    --mac-clinvar-tsv ~/clinvar/output/b37/single/clinvar_alleles.single.b37.vcf.gz

But it resulted in an error:

charger::getClinVar
Traceback (most recent call last):
  File "~/anaconda3/envs/charger/bin/charger", line 743, in <module>
    main( sys.argv[1:] )
  File "~/anaconda3/envs/charger/bin/charger", line 662, in main
    mutationTypes = mutationTypes , \
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 821, in getExternalData
    self.getClinVar( **kwargs )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 842, in getClinVar
    clinvarSet = self.getMacClinVarTSV( macClinVarTSV )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 887, in getMacClinVarTSV
    [ description , status ] = self.parseMacPathogenicity( fields[12:17] )
  File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 909, in parseMacPathogenicity
    named = fields[0]
IndexError: list index out of range

When I remove the --mac-clinvar-tsv option, it runs fine.
I used the single file from the latest MacArthur lab ClinVar repository.

Could you please help me on this problem?
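
The last two frames of the traceback show fields[12:17] being passed on and then fields[0] failing. In Python, slicing past the end of a list returns an empty list, so this is the error you would see whenever a line splits into fewer than 13 tab-separated fields. Whether that is what happens with the .vcf.gz file passed to --mac-clinvar-tsv here is an assumption; the function name and the sample lines below are made up for illustration.

```python
def pathogenicity_fields(line):
    """Split a tab-separated line and return columns 12..16 (0-based), as in the traceback."""
    fields = line.rstrip("\n").split("\t")
    return fields[12:17]

# Hypothetical line with 17 columns: the slice has 5 entries.
wide = "\t".join("col%d" % i for i in range(17))
print(len(pathogenicity_fields(wide)))  # 5

# Hypothetical line with only 8 columns: the slice is empty.
narrow = "\t".join("col%d" % i for i in range(8))
short = pathogenicity_fields(narrow)
print(short)  # []

try:
    named = short[0]
except IndexError as err:
    print("IndexError: %s" % err)  # IndexError: list index out of range
```

If this is the cause, checking the column count of the first few data lines of the input file would confirm it.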

P.S. I also want to add HotSpot3D results to CharGer. How would I use https://github.com/ding-lab/hotspot3d to generate clusters for this task? What input file do I need to get the clusters?

Thank you very much

Warnings appear when running with a VEP-annotated input file

Hello,

We get warnings when running with VEP-annotated VCF or MAF files.
Here are the details:

charger -f ../../software/vcf2maf-1.6.16/gnomAD_APC_maf.vep.vcf -o APC_charger.tsv --PP2GeneList PP2.genes.hg19 --BP1GeneList BP1.genes.hg19 -H APC_inp.maf.3D_Proximity.pairwise.site.l0.ad10.r20.clusters -D --inheritanceGeneList inheritanceGeneList.txt --exac-vcf ~pyang/.vep/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz --mac-clinvar-tsv ./clinvar_alleles.single.b37.tsv.gz
Using default module scores and category thresholds:
   BA1 = -8
   BMC1 = -2
   BP1 = -1
   BP2 = -1
   BP3 = -1
   BP4 = -1
   BP5 = -1
   BP6 = -1
   BP7 = -1
   BS1 = -4
   BS2 = -4
   BS3 = -4
   BS4 = -4
   BSC1 = -6
   PM1 = 2
   PM2 = 2
   PM3 = 2
   PM4 = 2
   PM5 = 2
   PM6 = 2
   PMC1 = 2
   PP1 = 1
   PP2 = 1
   PP3 = 1
   PP4 = 1
   PP5 = 1
   PPC1 = 1
   PPC2 = 1
   PS1 = 7
   PS2 = 4
   PS3 = 4
   PS4 = 4
   PSC1 = 4
   PVS1 = 8
   maxBenignScore = -8
   maxLikelyBenignScore = -4
   minLikelyPathogenicScore = 5
   minPathogenicScore = 9
Will capture vcf details for output: False
This .vcf has VEP annotations!
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
  Hint: Is the input amino acid change column correct?
    Problem variant:  :None:None-None->-:::c.:::p.  --  p.Pro9=
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
  Hint: Is the input amino acid change column correct?
    Problem variant:  :None:None-None->-:::c.:::p.  --  p.Val10=
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
  Hint: Is the input amino acid change column correct?
    Problem variant:  :None:None-None->-:::c.:::p.  --  p.Pro14=
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
  Hint: Is the input amino acid change column correct?
    Problem variant:  :None:None-None->-:::c.:::p.  --  p.Pro14=
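
The problem variants in the warnings above all end in "=", the HGVS notation for a synonymous change (for example p.Pro9=). One way such a warning could arise is a pattern that expects an alternate residue after the position. The sketch below only illustrates that idea; both patterns are hypothetical and are not BioMine's actual parser, and p.Arg12Cys is a made-up missense example.

```python
import re

# Hypothetical pattern: reference residue, position, alternate residue.
MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")

# Hypothetical extension that also accepts '=' (synonymous) as the alternate.
WITH_SYNONYMOUS = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|=)$")

for hgvsp in ["p.Pro9=", "p.Val10=", "p.Arg12Cys"]:
    strict = MISSENSE.match(hgvsp) is not None
    relaxed = WITH_SYNONYMOUS.match(hgvsp) is not None
    print("%s strict=%s relaxed=%s" % (hgvsp, strict, relaxed))
```

Under this assumption the synonymous changes fail only the strict pattern, which would explain a warning without any effect on missense variants.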

Why is "charger" trying to access "biomine" when the input file already has VEP annotations? Moreover, our input file is limited to one gene, APC, but the standard output (see the enclosed file test.log) contains warnings that appear to involve many genes.

Thanks for your help.

NameError: global name 'entrezaip' is not defined when doing clinvar search

Hi CharGer Team!

Whenever I try to use ClinVar with the -l option, I encounter this error:

charger::getClinVar
warning: ClinVar ReST search batch size given is greater than max allowed (50). Overriding to max search batch size.
Traceback (most recent call last):
  File "/home/minku/anaconda3/envs/charger/bin/charger", line 744, in <module>
    main( sys.argv[1:] )
  File "/home/minku/anaconda3/envs/charger/bin/charger", line 663, in main
    mutationTypes = mutationTypes , \
  File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 879, in getExternalData
    self.getClinVar( **kwargs )
  File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 905, in getClinVar
    self.getClinVarviaREST( **kwargs )
  File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 918, in getClinVarviaREST
    ent = entrezapi()
  File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py", line 74, in __init__
    self.setRequestLimits()
  File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py", line 332, in setRequestLimits
    self.setSummaryBatchSize( entrezaip.summaryBatchSize )
NameError: global name 'entrezaip' is not defined

Is entrezaip supposed to be entrezapi on line 332 and 333?
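
The traceback points to a misspelled name inside the installed BioMine package. A minimal sketch of a local workaround follows, assuming the only problem is the entrezaip spelling shown in the traceback. The helper names fix_entrezapi_typo and patch_file are hypothetical, and the path to entrezapi.py must be taken from your own traceback.

```python
import re

def fix_entrezapi_typo(source_text):
    """Return source_text with the misspelled name 'entrezaip' replaced by 'entrezapi'."""
    # \b restricts the replacement to whole identifiers.
    return re.sub(r"\bentrezaip\b", "entrezapi", source_text)

def patch_file(path):
    """Rewrite the file at path in place, keeping a .bak copy; return True if it changed."""
    with open(path) as handle:
        original = handle.read()
    patched = fix_entrezapi_typo(original)
    if patched == original:
        return False
    with open(path + ".bak", "w") as backup:
        backup.write(original)
    with open(path, "w") as handle:
        handle.write(patched)
    return True

# Line copied from the traceback above.
line = "self.setSummaryBatchSize( entrezaip.summaryBatchSize )"
print(fix_entrezapi_typo(line))
```

Editing files under site-packages is only a stopgap; reinstalling the package overwrites the change.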

Thank you! :)

error: Hint: is the input amino acid change column correct , charger version 0.5.4

I'm running charger 0.5.4 on a gzipped file annotated with VEP version 99. I'm getting a biomine warning ("could not find amino acid change or intronic change").

Charger still completes, and I can see the scores for the variants, but I'm not sure whether any variants were missed. I've seen similar reports from other users (#5), and the reply was that this warning doesn't affect results and that it was fixed in version 0.5.4. However, I still see it.

Reference for inheritanceGeneList 20160301_Rahman_KJ_KH_gene_table_CharGer.txt.gz

Hello,

I am looking for the reference for the inheritance gene list (20160301_Rahman_KJ_KH_gene_table_CharGer.txt.gz).

I found this paper from 2016 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4975511/bin/NIHMS69105-supplement-Supplementary_Table_1.xlsx) but it contains fewer genes than the one here: https://github.com/ding-lab/CharGer/tree/7d7d2911b89261fa5dceea6395a5d188a82757f2/PanCanAtlasData

Can the exact reference be shared?

What Cross-reference data files to use and where to get for lung cancer

Hi,
Thank you for developing such a good tool! I was able to run CharGer successfully, but all variants are classified as "Benign" in the "CharGer_Classification" column unless I use the parameters -l -O --mac-clinvar-tsv.
I see that there are also many cross-reference data files, but I'm not sure which of them would improve my results or where to get them. My samples are from lung cancer patients. Would you mind providing some suggestions?

All Uncertain Significance

Hi -- I was able to successfully run CharGer, but all variants are classified as "Uncertain Significance." Is there a setting that must be tweaked for variant annotation to work correctly? I'm guessing at least some of the variants we have must not be "Uncertain Significance." I'm using the docker image on DockerHub and the command "charger -f /mount/sample_id.vt2_normalized_spanning_alleles.vcf -o /mount/charger_annotated.tsv"

errors

python test_chargervariant.py
E
======================================================================
ERROR: test_nonzero (__main__.testchargervariant)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_chargervariant.py", line 13, in test_nonzero
    if ( v ):
  File "/xxx/python2.7/site-packages/charger/chargervariant.py", line 201, in __nonzero__
    elif ( self.checkIfRefAltStrand( k ) ):
AttributeError: 'chargervariant' object has no attribute 'checkIfRefAltStrand'

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (errors=1)

PM5 found 0 pathogenic variants

I used the variant at chr11:108345818 (rs587779872, C>T). It is annotated with PM5 in the paper "Pathogenic Germline Variants in 10,389 Adult Cancers", but I can't get the same output.

Need most updated ClinVar files

To use the ClinVar database when running CharGer, we have to supply a 'clinvar_alleles.tsv.gz' file, but the one available for download from your website is out of date (2018 version). Do you have a more recent version of this ClinVar file? If not, is there another way to provide ClinVar annotations so that CharGer can evaluate PS1? Thanks a lot!

installed Charger 0.5.4 but version in help is 0.5.3

When I run charger 0.5.4 to see the options, this is the output (first few lines):

CharGer - v0.5.3

Usage: charger [options]

Accepted input data files:
-m Standard .maf
-f Standard .vcf
-T Custom .tsv

I checked if I downloaded the correct version.

Can you give me an example command for running CharGer with local VEP?

I want to use a local VEP installation when running charger.

My command is:

(CharGer) [yubau@cmuh-i2 Demo]$ charger -f clinvar_20200210.vep.vcf -o clinvar_20200210.vep.charger.ltEx.tsv -l -t -E -x --exac-vcf ExAC.r1.sites.vep.vcf --mac-clinvar-tsv clinvar_alleles.multi.b37.tsv.gz --perl /home/yubau/perl --vep-script /home/yubau/VEP/ensembl-vep/vep --vep-config /home/yubau/VEP/ensembl-vep/t/Config.t --vep-cache /home/yubau/.vep --vep-version 99 --vep-output clinvar_20200210_output.charger.vep.vcf --grch 37 --ensembl-release 75 --reference-fasta /home/yubau/.vep/homo_sapiens/99_GRCh37 --fork 48

Is anything wrong with it?

Can you give me an example command for running charger together with VEP?

Thanks

CharGer::runIndelModules Error:

Hello, I got this error when I ran charger:
CharGer::runIndelModules Error: Cannot evaluate PVS1 or PM4: No gene list supplied.
Which argument should I provide?
I used charger -f ${input} -o ${out} -D --inheritanceGeneList ${inherit} --mac-clinvar-tsv ${clinvar} --PP2GeneList ${pp2} -z ${mmVariants} --exac-vcf ${ExAC} -H $d --BP1GeneList ${bp1} -g ${mmGenes}.

Another question: I also see "Unsupported VEP version or no gnomAD AF annotation in input file; will search for ExAC frequencies...", although my input file was annotated with VEP. I hope you can reply soon, thanks!

I used CharGer. But I got different results.

Hello. I'm a CharGer user.
I appreciate your helpful program.

In this paper (Cell, 2018, 173, 355-370), the BRCA2 variant 13:g.32890660A>G (Supplementary xlsx file 2A) is listed among the rare pathogenic variants (CharGer score 11: PM2, PM5, PS1). But in my results, this variant's CharGer score is only 2 (PM2 only), so it is not called pathogenic or likely pathogenic.

And, in your Supplementary table, Charger score of ATM variant (11:g.108129749C>T) is 17 (PS1+PVS1+PM2). But in my result, this variant's CharGer score is only 6 (PM2 + PSC1).
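
The totals quoted above can be checked against the default module scores and thresholds that charger prints at startup. The sketch below does that arithmetic. The classify function is an assumption: it treats the thresholds as inclusive, and the labels "Pathogenic", "Likely Pathogenic" and "Likely Benign" are guessed by analogy with the "Benign" and "Uncertain Significance" labels seen in CharGer output.

```python
# Default module scores, copied from charger's startup log.
MODULE_SCORES = {"PVS1": 8, "PS1": 7, "PSC1": 4, "PM2": 2, "PM5": 2}

# Default category thresholds, copied from the same log.
MIN_PATHOGENIC = 9
MIN_LIKELY_PATHOGENIC = 5
MAX_LIKELY_BENIGN = -4
MAX_BENIGN = -8

def total_score(modules):
    """Sum the default scores of the modules that fired for one variant."""
    return sum(MODULE_SCORES[name] for name in modules)

def classify(score):
    """ASSUMPTION: thresholds are inclusive; this is not CharGer's own code."""
    if score >= MIN_PATHOGENIC:
        return "Pathogenic"
    if score >= MIN_LIKELY_PATHOGENIC:
        return "Likely Pathogenic"
    if score <= MAX_BENIGN:
        return "Benign"
    if score <= MAX_LIKELY_BENIGN:
        return "Likely Benign"
    return "Uncertain Significance"

cases = {
    "BRCA2, paper": ["PM2", "PM5", "PS1"],
    "BRCA2, my run": ["PM2"],
    "ATM, paper": ["PS1", "PVS1", "PM2"],
    "ATM, my run": ["PM2", "PSC1"],
}
for label, modules in sorted(cases.items()):
    score = total_score(modules)
    print("%s: %d %s" % (label, score, classify(score)))
```

The sums reproduce the quoted scores (11, 2, 17 and 6), so the difference lies in which modules fire (PS1, PM5, PVS1), not in the scoring itself.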

Could you point out what I am doing wrong?
I would like to get the same results as you.
I think the difference comes from "emptyRemoved_20160428_pathogenic_variants_HGVSg_VEP.vcf".
This file only has variants on the following chromosomes: chr10, chr13, chr17, chr5.

I added my scripts.

Thanks

Oh.


#cf. my input vcf : varscan2 vcf --> annotation by vep (v94, ref: GRCh37, Exac ; nonTCGA version r1)

mmGenes=$PanCanAtlasData/20160301_Rahman_KJ_KH_gene_table_CharGer.txt
mmVariants=$PanCanAtlasData/emptyRemoved_20160428_pathogenic_variants_HGVSg_VEP.vcf
hotspot=$PanCanAtlasData/MC3.noHypers.mericUnspecified.d10.r20.v114.clusters
clinvar=$PanCanAtlasData/clinvar_alleles.single.b37.tsv.gz
rareThreshold="0.01" # 1% threshold
commonThreshold="0.05" # 5% threshold

$bin/charger --include-vcf-details \
    -f $input_dir/${Sample}.$Pair.$VC.$Class.vep.vcf \
    -o $Output_dir/${Sample}.$Pair.$VC.$Class.vep.Hg19.CharGer.rare0.01.common0.05.tsv \
    -O \
    -D \
    -g ${mmGenes} \
    -z ${mmVariants} \
    -H ${hotspot} \
    -l \
    --rare-threshold $rareThreshold \
    --common-threshold $commonThreshold \
    --mac-clinvar-tsv ${clinvar}

Install error

Hi there!

I am trying to install CharGer using conda on my Mac. I have followed the instructions here (Installation using conda section).

I have created a conda environment that has Python 2.7.18.

When I type pip install ., this message appears:

Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done

ERROR: Package u'charger-0.6.0b1' requires a different Python: 2.7.18 not in '>=3.6'

What should I do to fix this error? Thank you so much! :)
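
The error message says the package metadata requires Python >=3.6 while the active interpreter is 2.7.18. A minimal sketch of that comparison is below; the function name satisfies_minimum is hypothetical, and pip performs its own, more general check.

```python
import sys

def satisfies_minimum(version, minimum=(3, 6)):
    """Return True if the (major, minor) part of version is at least minimum."""
    return tuple(version[:2]) >= tuple(minimum)

print(satisfies_minimum((2, 7, 18)))        # False: the environment from the error
print(satisfies_minimum((3, 8, 0)))         # True
print(satisfies_minimum(sys.version_info))  # depends on the interpreter running this
```

Running it inside the activated conda environment shows whether that environment can install a package with this requirement.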


update (20.12.05):

  • I tried conda install charger, but the package wasn't available.
  • When I do pip install charger, version 5.2.0 is installed regardless of the Python version.
  • If I download the master folder and do pip install . in a Python 3.9 environment, version 6 beta is installed.

What should I do to install the stable version (5.4.0)? Or is it OK to use 6 beta?

Thank you! :)


update (20.12.7):

  • I realized that I can just tweak the download command to get the 5.4.0 version (sorry, I am really new to this, lol):
wget -O CharGer.zip https://github.com/ding-lab/CharGer/archive/v0.5.4.zip

yay! :)

All variants classified as Benign or Uncertain Significance

Hello,

I tried to use CharGer on a WES VCF file processed with GATK 4.1.1.0 from hg38 BAM files and annotated with VEP 95. The VCF contains 547 samples. Looking at the results, all variants are classified as either "Benign" or "Uncertain Significance":

cat out.charger.txt | awk -F '\t' '{print $20}' | sort | uniq -c
 224571 Benign
 933207 Uncertain Significance
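
For reference, the same tally can be done in Python. This is a minimal sketch, assuming as in the awk command that the classification is in column 20 of a tab-separated file. The function name count_column and the sample rows are made up for illustration, and unlike awk it skips rows that are too short.

```python
import collections

def count_column(lines, column_number):
    """Count the values found in a 1-based column of tab-separated lines."""
    counts = collections.Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= column_number:
            counts[fields[column_number - 1]] += 1
    return counts

# Hypothetical 20-column rows; for a real run use: open("out.charger.txt")
rows = [
    "\t".join(["x"] * 19 + [label])
    for label in ("Benign", "Uncertain Significance", "Benign")
]
counts = count_column(rows, 20)
for value, n in sorted(counts.items()):
    print("%7d %s" % (n, value))
```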

In the log file I can see several warnings:

No gene list file uploaded. CharGer will not make PVS1 calls.
No PP2 gene list file uploaded. CharGer will not make PP2 calls.
No BP1 gene list file uploaded. CharGer will not make BP1 calls.
No expression file uploaded. CharGer will allow all passed truncations without expression data in PVS1.

Is it expected to get only Benign or Uncertain Significance variants? Do you have an (unannotated) test VCF (in hg38) with variants that should be classified as pathogenic, so that I can test my configuration?

Thanks

The log file :

charger -f input.sort.vcf.gz -o out.charger.txt --vep-cache /home/vep/.vep/ --vep-version 95 --grch 38 --reference-fasta /home/genomes/hg38/Homo_sapiens_assembly38.fasta

Using default module scores and category thresholds:
   BA1 = -8
   BMC1 = -2
   BP1 = -1
   BP2 = -1
   BP3 = -1
   BP4 = -1
   BP5 = -1
   BP6 = -1
   BP7 = -1
   BS1 = -4
   BS2 = -4
   BS3 = -4
   BS4 = -4
   BSC1 = -6
   PM1 = 2
   PM2 = 2
   PM3 = 2
   PM4 = 2
   PM5 = 2
   PM6 = 2
   PMC1 = 2
   PP1 = 1
   PP2 = 1
   PP3 = 1
   PP4 = 1
   PP5 = 1
   PPC1 = 1
   PPC2 = 1
   PS1 = 7
   PS2 = 4
   PS3 = 4
   PS4 = 4
   PSC1 = 4
   PVS1 = 8
   maxBenignScore = -8
   maxLikelyBenignScore = -4
   minLikelyPathogenicScore = 5
   minPathogenicScore = 9
Will capture vcf details for output: False
This .vcf has AF!


Skipping: 0 for filters and 0 for AF and 0 for mutation types out of 1157778
No gene list file uploaded. CharGer will not make PVS1 calls.
No PP2 gene list file uploaded. CharGer will not make PP2 calls.
No BP1 gene list file uploaded. CharGer will not make BP1 calls.
No expression file uploaded. CharGer will allow all passed truncations without expression data in PVS1.
charger::getVEP Warning: skipping VEP 
Running VEP took 7.10487365723e-05seconds
charger::getClinVar
Running ClinVar took 1.69277191162e-05seconds
Running exac took 1.59740447998e-05seconds
CharGer module PVS1
- truncations in genes where LOF is a known mechanism of the disease
- require the mode of inheritance to be dominant (assuming heterzygosity) and co-occurence with reduced gene expression
- run concurrently with PSC1, PMC1, PM4, PPC1, and PPC2 -
CharGer::runIndelModules Error: Cannot evaluate PVS1 or PM4: No gene list supplied.
CharGer module PS1
- same peptide change as a previously established pathogenic variant
PS1 found 0 pathogenic variants
CharGer module PS2
- de novo with maternity and paternity confirmation and no family history
CharGer module PS3: Well-established in vitro or in vivo functional studies             supportive of a damaging effect on the gene or gene product
CharGer module PS4: not yet implemented
CharGer module PM1:  Located in a mutational hot spot and/or critical and well-established               functional domain (e.g., active site of an enzyme) without benign variation
CharGer::PM1 Warning: clustersFile is not supplied. PM1 was not executed.
CharGer module PM2
- absent or extremely low frequency in controls
CharGer module PM3: not yet implemented
CharGer module PM4
- protein length changes due to inframe indels or nonstop variant of selected genes -
CharGer module PM5
- different peptide change of a pathogenic variant at the same reference peptide
PM5 found 0 pathogenic variants
CharGer module PM6
- assumed de novo without maternity and paternity confirmation
CharGer module PP1
- cosegregation with disease in family members in a known disease gene
CharGer module PP2: Missense variant in a gene that has low rate of benign missense and in which missense are common mechanism of disease
CharGer::PP2 Error: Cannot evaluate PP2: No PP2 gene list supplied.
CharGer module PP3
- multiple lines of in silico evidence of deliterous effect
Found 0 variants with >= 2 of in silico evidence
CharGer module PP4: not yet implemented
CharGer module PP5: not yet implemented
CharGer module BA1
- allele frequency >5%
CharGer module BS1: not yet implemented
CharGer module BS2: not yet implemented
CharGer module BS3: not yet implemented
CharGer module BS4: not yet implemented
CharGer module BP1: Missense variant in a gene for which primarily truncations cause disease
CharGer::BP1 Error: Cannot evaluate BP1: No BP1 gene list supplied.
CharGer module BP2: not yet implemented
CharGer module BP3: not yet implemented
CharGer module BP4
 - in silico evidence of no damage
Found 0 variants with >= 2 with in silico evidence
CharGer module BP5: not yet implemented
CharGer module BP6: not yet implemented
CharGer module BP7: not yet implemented
CharGer module PSC1
Recessive truncations of susceptible genes
CharGer module PMC1
Truncations of genes when no gene list provided
CharGer module PPC1
- protein length changes due to inframe indels or nonstop variant of other, not-specificied genes -
CharGer module PPC2
- protein length changes due to inframe indels or nonstop variant when no susceptibility genes given -
CharGer module BSC1
- same peptide change as a previously established benign variant
BSC1 found 0 benign variants
CharGer module BMC1
- different peptide change of a benign variant at the same reference peptide
BMC1 found 0 benign variants
0.0005 < 0.05
write 1157778 charged user variants to out.vep.charger.txt
charger::writeSummary Warning: skipping pubmed link tests

CharGer run Times:
input parse time (s): 0.000698089599609
get input data time (s): 37452.09834  
get external data time (s): 105.454962015
modules run time (s): 268.449123859   
classification time (s): 314.865689993

