ding-lab / charger
Characterization of Germline variants
Home Page: https://ding-lab.github.io/CharGer/
License: GNU General Public License v3.0
Please provide these default resources. Determining the gene lists for BP2 and PP2 is itself part of the interpretation workload. Thanks!
I installed CharGer and ran "runDemo.sh", but I get a lot of warning messages.
vep command:
~/VEP/ensembl-vep/vep --assembly GRCh38 --cache --dir_plugins /home/yubau/.vep/Plugins --everything --fasta /home/yubau/.vep/homo_sapiens/99_GRCh38 --force_overwrite --fork 48 --format vcf --input_file HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vcf --offline --output_file HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf --symbol --terms SO --tsl --vcf
VEP-annotated VCF (first 1,000 lines via head -n 1000; the full file has 4,773,130 lines when run through charger):
HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep_1000.txt
install:
mkdir CharGer_test_python38
cd CharGer_test_python38
conda create --name CharGer_python_38 python=3.8
conda activate CharGer_python_38
pip install pysam
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ conda list
# packages in environment at /home/yubau/anaconda2/envs/CharGer_python_38:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
attrs 19.3.0 pypi_0 pypi
ca-certificates 2020.1.1 0
certifi 2019.11.28 py38_0
charger 0.6.0b1 pypi_0 pypi
cyvcf2 0.11.6 pypi_0 pypi
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20181209 hc058e9b_0
libffi 3.2.1 hd88cf55_4
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
loguru 0.4.1 pypi_0 pypi
ncurses 6.2 he6710b0_0
numpy 1.18.1 pypi_0 pypi
openssl 1.1.1d h7b6447c_4
pip 20.0.2 py38_1
pysam 0.15.4 pypi_0 pypi
python 3.8.1 h0371630_1
readline 7.0 h7b6447c_5
setuptools 45.2.0 py38_0
sqlite 3.31.1 h7b6447c_0
tk 8.6.8 hbc83047_0
wheel 0.34.2 py38_0
xz 5.2.4 h14c3975_4
zlib 1.2.11 h7b6447c_3
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ pip install --pre -i https://pypi.org/simple/ --extra-index-url https://test.pypi.org/simple/ charger
command:
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ charger --input HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf --output HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.charger.tsv
INFO | Running CharGer v0.6.0b1 with parameters: --input HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf --output HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.charger.tsv
----screen info start-------------------------------------------------------------------------------------
INFO | Validate the given config
DEBUG | Given config: CharGerConfig(input=PosixPath('HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf'), output=PosixPath('HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.charger.tsv'), hotspot3d_cluster=None, pathogenic_variant=None, override_variant_info=False, include_vcf_details=False, clinvar_table=None, rare_threshold=0.0005, common_threshold=0.005, acmg_module_scores={'PVS1': 8, 'PS1': 7, 'PS2': 4, 'PS3': 4, 'PS4': 4, 'PM1': 2, 'PM2': 2, 'PM3': 2, 'PM4': 2, 'PM5': 2, 'PM6': 2, 'PP1': 1, 'PP2': 1, 'PP3': 1, 'PP4': 1, 'PP5': 1, 'BP1': -1, 'BP2': -1, 'BP3': -1, 'BP4': -1, 'BP5': -1, 'BP6': -1, 'BP7': -1, 'BS1': -4, 'BS2': -4, 'BS3': -4, 'BS4': -4, 'BA1': -8}, charger_module_scores={'PSC1': 4, 'PMC1': 2, 'PPC1': 1, 'PPC2': 1, 'BMC1': -2, 'BSC1': -6}, min_pathogenic_score=9, min_likely_pathogenic_score=5, max_likely_benign_score=-4, max_benign_score=-8, disease_specific=False, inheritance_gene_table=None, PP2_gene_list=None, BP1_gene_list=None)
INFO | Read input VCF from HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf
DEBUG | VEP version 99 with CSQ format [66 fields]: Allele,Consequence,IMPACT,SYMBOL,Gene,Feature_type,Feature,BIOTYPE,EXON,INTRON,HGVSc,HGVSp,cDNA_position,CDS_position,Protein_position,Amino_acids,Codons,Existing_variation,DISTANCE,STRAND,FLAGS,VARIANT_CLASS,SYMBOL_SOURCE,HGNC_ID,CANONICAL,MANE,TSL,APPRIS,CCDS,ENSP,SWISSPROT,TREMBL,UNIPARC,GENE_PHENO,SIFT,PolyPhen,DOMAINS,miRNA,HGVS_OFFSET,AF,AFR_AF,AMR_AF,EAS_AF,EUR_AF,SAS_AF,AA_AF,EA_AF,gnomAD_AF,gnomAD_AFR_AF,gnomAD_AMR_AF,gnomAD_ASJ_AF,gnomAD_EAS_AF,gnomAD_FIN_AF,gnomAD_NFE_AF,gnomAD_OTH_AF,gnomAD_SAS_AF,MAX_AF,MAX_AF_POPS,CLIN_SIG,SOMATIC,PHENO,PUBMED,MOTIF_NAME,MOTIF_POS,HIGH_INF_POS,MOTIF_SCORE_CHANGE
SUCCESS | Read total 4,772,905 variants from the input VCF
WARNING | Inheritance gene table is not provided, CharGer cannot make ACMG PVS1/PM4 calls or CharGer PSC1/PPC1 calls. Disable all these modules
WARNING | CharGer cannot make PP2 calls without the given gene list. Disable PP2 module
WARNING | CharGer cannot make BP1 calls without the given gene list. Disable BP1 module
INFO | Skip matching ClinVar
INFO | Run all ACMG modules
INFO | Skipped PVS1 module
INFO | Skipped PM4 module
INFO | Run all CharGer modules
INFO | Skipped PSC1 module
INFO | Running PMC1 module
INFO | Skipped PPC1 module
INFO | Running PPC2 module
----screen info end-------------------------------------------------------------------------------------
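The DEBUG line in the log above lists the default module scores and the four category thresholds. As a rough illustration of how those numbers could combine, here is a minimal Python sketch. It assumes the final call is a plain sum of the triggered module scores compared inclusively against the thresholds; that rule, the function name `classify`, and the label strings are assumptions for illustration, not something the log confirms.

```python
# Default scores copied from the DEBUG config line in the log above.
MODULE_SCORES = {
    "PVS1": 8, "PS1": 7, "PS2": 4, "PS3": 4, "PS4": 4,
    "PM1": 2, "PM2": 2, "PM3": 2, "PM4": 2, "PM5": 2, "PM6": 2,
    "PP1": 1, "PP2": 1, "PP3": 1, "PP4": 1, "PP5": 1,
    "BP1": -1, "BP2": -1, "BP3": -1, "BP4": -1, "BP5": -1, "BP6": -1, "BP7": -1,
    "BS1": -4, "BS2": -4, "BS3": -4, "BS4": -4, "BA1": -8,
    "PSC1": 4, "PMC1": 2, "PPC1": 1, "PPC2": 1, "BMC1": -2, "BSC1": -6,
}

# Thresholds copied from the same config line.
MIN_PATHOGENIC_SCORE = 9
MIN_LIKELY_PATHOGENIC_SCORE = 5
MAX_LIKELY_BENIGN_SCORE = -4
MAX_BENIGN_SCORE = -8


def classify(triggered_modules):
    """Sum the scores of the triggered modules and map the sum to a label.

    ASSUMPTION: plain summation and inclusive comparisons; the real tool
    may apply additional rules.
    """
    score = sum(MODULE_SCORES[name] for name in triggered_modules)
    if score >= MIN_PATHOGENIC_SCORE:
        label = "Pathogenic"
    elif score >= MIN_LIKELY_PATHOGENIC_SCORE:
        label = "Likely Pathogenic"
    elif score <= MAX_BENIGN_SCORE:
        label = "Benign"
    elif score <= MAX_LIKELY_BENIGN_SCORE:
        label = "Likely Benign"
    else:
        label = "Uncertain Significance"
    return score, label


if __name__ == "__main__":
    print(classify(["PVS1", "PM2"]))
    print(classify(["BA1"]))
```

Under these assumptions, a module that is disabled (as PVS1, PM4, PP2 and BP1 are in the log above) simply never contributes to the sum.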
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ ls
CHANGES.rst demo.vcf demo.vep.vcf_summary.html HCC297_S7_hg38.GATK.HaplotypeCaller.variants.vep.vcf poetry.lock README.md setup.cfg tests tox.ini
clinvar_20200210.vep.vcf demo.vep.vcf docs LICENSE.txt pyproject.toml scripts src tox_conda.ini
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ python -V
Python 3.8.1
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ which python
~/anaconda2/envs/CharGer_python_38/bin/python
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ which pip
~/anaconda2/envs/CharGer_python_38/bin/pip
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ pip --version
pip 20.0.2 from /home/yubau/anaconda2/envs/CharGer_python_38/lib/python3.8/site-packages/pip (python 3.8)
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ charger --version
CharGer v0.6.0b1
(CharGer_python_38) [yubau@cmuh-i2 CharGer]$ which charger
~/anaconda2/envs/CharGer_python_38/bin/charger
I cannot get any output file with v0.6.0b1.
I annotated the v0.5.4 demo.vcf with VEP and ran charger v0.6.0b1 on it, but still got no output file.
Could you upload a demo.vcf for charger v0.6.0b1? Thanks!
yubau, from Taiwan
When I try conda install charger, I get:
PackagesNotFoundError: The following packages are not available from current channels:
Also, pip install charger returns:
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
error in PyVCF setup command: use_2to3 is invalid.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
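For context on the message above: setuptools removed support for the `use_2to3` setup option in release 58, so a `setup.py` that still passes it (here PyVCF's) fails during metadata generation on newer setuptools. The following is a minimal Python sketch for checking which side of that cutoff an environment is on. The helper name `supports_use_2to3` is hypothetical, and whether pinning setuptools below 58 is enough to get this particular install working is untested here.

```python
from importlib import metadata


def supports_use_2to3(setuptools_version: str) -> bool:
    """Return True if this setuptools version still accepts `use_2to3`.

    Support for the option was removed in setuptools 58.
    """
    major = int(setuptools_version.split(".")[0])
    return major < 58


if __name__ == "__main__":
    try:
        installed = metadata.version("setuptools")
    except metadata.PackageNotFoundError:
        print("setuptools is not installed in this environment")
    else:
        print(installed, "accepts use_2to3:", supports_use_2to3(installed))
```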
Hey, are there any clear instructions on how to run the software? After digging through the Python files, I put together this command:
charger --input tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz --pathogenic-variant tests/examples/annotations/grch37_pathogenic_variants.vcf.gz --inheritance-gene-table tests/examples/annotations/inheritance_gene_table.tsv.gz --PP2-gene-list tests/examples/annotations/pp2_gene_list.txt.gz --clinvar-table tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz --output out.tsv
However, I get no output file.
I only get messages like:
charger --input tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz --pathogenic-variant tests/examples/annotations/grch37_pathogenic_variants.vcf.gz --inheritance-gene-table tests/examples/annotations/inheritance_gene_table.tsv.gz --PP2-gene-list tests/examples/annotations/pp2_gene_list.txt.gz --clinvar-table tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz --output out.tsv
INFO | Running CharGer v0.6.0b1 with parameters: --input tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz --pathogenic-variant tests/examples/annotations/grch37_pathogenic_variants.vcf.gz --inheritance-gene-table tests/examples/annotations/inheritance_gene_table.tsv.gz --PP2-gene-list tests/examples/annotations/pp2_gene_list.txt.gz --clinvar-table tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz --output out.tsv
INFO | Validate the given config
DEBUG | Given config: CharGerConfig(input=PosixPath('tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz'), output=PosixPath('out.tsv'), hotspot3d_cluster=None, pathogenic_variant=PosixPath('tests/examples/annotations/grch37_pathogenic_variants.vcf.gz'), override_variant_info=False, include_vcf_details=False, clinvar_table=PosixPath('tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz'), rare_threshold=0.0005, common_threshold=0.005, acmg_module_scores={'PVS1': 8, 'PS1': 7, 'PS2': 4, 'PS3': 4, 'PS4': 4, 'PM1': 2, 'PM2': 2, 'PM3': 2, 'PM4': 2, 'PM5': 2, 'PM6': 2, 'PP1': 1, 'PP2': 1, 'PP3': 1, 'PP4': 1, 'PP5': 1, 'BP1': -1, 'BP2': -1, 'BP3': -1, 'BP4': -1, 'BP5': -1, 'BP6': -1, 'BP7': -1, 'BS1': -4, 'BS2': -4, 'BS3': -4, 'BS4': -4, 'BA1': -8}, charger_module_scores={'PSC1': 4, 'PMC1': 2, 'PPC1': 1, 'PPC2': 1, 'BMC1': -2, 'BSC1': -6}, min_pathogenic_score=9, min_likely_pathogenic_score=5, max_likely_benign_score=-4, max_benign_score=-8, disease_specific=False, inheritance_gene_table=PosixPath('tests/examples/annotations/inheritance_gene_table.tsv.gz'), PP2_gene_list=PosixPath('tests/examples/annotations/pp2_gene_list.txt.gz'), BP1_gene_list=None)
INFO | Read input VCF from tests/examples/10.1056_NEJMoa1508054_S4_AD_vep85.sorted.vcf.gz
DEBUG | VEP version 85 with CSQ format [62 fields]: Allele,Consequence,IMPACT,SYMBOL,Gene,Feature_type,Feature,BIOTYPE,EXON,INTRON,HGVSc,HGVSp,cDNA_position,CDS_position,Protein_position,Amino_acids,Codons,Existing_variation,DISTANCE,STRAND,FLAGS,VARIANT_CLASS,SYMBOL_SOURCE,HGNC_ID,CANONICAL,TSL,APPRIS,CCDS,ENSP,SWISSPROT,TREMBL,UNIPARC,GENE_PHENO,SIFT,PolyPhen,DOMAINS,HGVS_OFFSET,GMAF,AFR_MAF,AMR_MAF,EAS_MAF,EUR_MAF,SAS_MAF,AA_MAF,EA_MAF,ExAC_MAF,ExAC_Adj_MAF,ExAC_AFR_MAF,ExAC_AMR_MAF,ExAC_EAS_MAF,ExAC_FIN_MAF,ExAC_NFE_MAF,ExAC_OTH_MAF,ExAC_SAS_MAF,CLIN_SIG,SOMATIC,PHENO,PUBMED,MOTIF_NAME,MOTIF_POS,HIGH_INF_POS,MOTIF_SCORE_CHANGE
SUCCESS | Read total 550 variants from the input VCF
INFO | Read pathogenic VCF from tests/examples/annotations/grch37_pathogenic_variants.vcf.gz
DEBUG | VEP version 84 with CSQ format [52 fields]: Allele,Consequence,IMPACT,SYMBOL,Gene,Feature_type,Feature,BIOTYPE,EXON,INTRON,HGVSc,HGVSp,cDNA_position,CDS_position,Protein_position,Amino_acids,Codons,Existing_variation,DISTANCE,STRAND,FLAGS,SYMBOL_SOURCE,HGNC_ID,TSL,APPRIS,SIFT,PolyPhen,GMAF,AFR_MAF,AMR_MAF,EAS_MAF,EUR_MAF,SAS_MAF,AA_MAF,EA_MAF,ExAC_MAF,ExAC_Adj_MAF,ExAC_AFR_MAF,ExAC_AMR_MAF,ExAC_EAS_MAF,ExAC_FIN_MAF,ExAC_NFE_MAF,ExAC_OTH_MAF,ExAC_SAS_MAF,CLIN_SIG,SOMATIC,PHENO,PUBMED,MOTIF_NAME,MOTIF_POS,HIGH_INF_POS,MOTIF_SCORE_CHANGE
[W::vcf_parse] Contig '10' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '13' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '17' is not defined in the header. (Quick workaround: index the file with tabix.)
[W::vcf_parse] Contig '5' is not defined in the header. (Quick workaround: index the file with tabix.)
INFO | Read total 1,819 pathogenic variants from the VCF
INFO | Disable CharGer PMC1/PPC2 modules when inheritance gene table is provided
INFO | Read inheritance gene table from tests/examples/annotations/inheritance_gene_table.tsv.gz
INFO | Loaded inheritance mode of 152 genes
INFO | Read PP2 gene list from tests/examples/annotations/pp2_gene_list.txt.gz
INFO | Marked 152 genes for PP2
WARNING | CharGer cannot make BP1 calls without the given gene list. Disable BP1 module
INFO | Match input variants with ClinVar table at tests/examples/annotations/clinvar_chrom_22_only.b37.tsv.gz
SUCCESS | Matched 2 out of 550 input variants to a ClinVar record
INFO | Run all ACMG modules
INFO | Running PVS1 module
INFO | Running PM4 module
INFO | Run all CharGer modules
INFO | Running PSC1 module
INFO | Skipped PMC1 module
INFO | Running PPC1 module
INFO | Skipped PPC2 module
Are you planning to maintain this program?
Dear CharGer team,
I see that some small versions of cross-reference files are available in the "Demo" folder.
Do you plan to provide full versions of these files, e.g. the mode-of-inheritance file?
Thanks
Lee
Hello,
We are encountering an error while running charger with the MacArthur Lab ClinVar file.
Here are the details:
charger -m ./maf/APC_vcf_maf.maf -o APC_charger.tsv -H APC_inp.maf.3D_Proximity.pairwise.site.l0.ad10.r10.clusters -D --inheritanceGeneList inheritanceGeneList.txt --exac-vcf ~pyang/.vep/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz --mac-clinvar-tsv ./clinvar_alleles.single.b37.tsv.gz
Traceback (most recent call last):
File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/bin/charger", line 743, in <module>
main( sys.argv[1:] )
File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/bin/charger", line 662, in main
mutationTypes = mutationTypes , \
File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/lib/python2.7/site-packages/charger/charger.py", line 821, in getExternalData
self.getClinVar( **kwargs )
File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/lib/python2.7/site-packages/charger/charger.py", line 842, in getClinVar
clinvarSet = self.getMacClinVarTSV( macClinVarTSV )
File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/lib/python2.7/site-packages/charger/charger.py", line 887, in getMacClinVarTSV
[ description , status ] = self.parseMacPathogenicity( fields[12:17] )
File "/sonas-hs/nwhgenomics/hpc/home/pyang/software/miniconda2/envs/py27/lib/python2.7/site-packages/charger/charger.py", line 914, in parseMacPathogenicity
isPathogenic = int( isPathogenic )
ValueError: invalid literal for int() with base 10: 'NM_005101.3:c.62G>A'
After digging a little into the code, we see that the header expected by the script (line 876 in ba34f1d) differs from the header of our MacArthur Lab TSV, which looks like this:
zcat /sonas-hs/nwhgenomics/hpc/home/pyang/projects/charger/clinvar_alleles.single.b37.tsv.gz | head -n1
chrom pos ref alt start stop strand variation_type variation_id rcv scv allele_id symbol hgvs_c hgvs_p molecular_consequence clinical_significance clinical_significance_ordered pathogenic likely_pathogenic uncertain_significance likely_benign benign review_status review_status_ordered last_evaluated all_submitters submitters_ordered all_traits all_pmids inheritance_modes age_of_onset prevalence disease_mechanism origin xrefs dates_ordered gold_stars conflicted
If we are using the wrong ClinVar file, could you please point us to the right one?
Thank you in advance for your guidance and support.
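The traceback and the header above fit together: with this header, the fixed slice `fields[12:17]` covers `symbol` through `clinical_significance`, so the second element is `hgvs_c`, which is the kind of value (`NM_005101.3:c.62G>A`) that `int()` rejected. A minimal Python sketch of looking columns up by name instead of by position is shown below. The helper `select_columns` and the choice of wanted columns are illustrative assumptions, not CharGer's actual parser.

```python
# Header copied from the `zcat ... | head -n1` output above
# (the real file is tab-separated; spaces are used here for readability).
HEADER = (
    "chrom pos ref alt start stop strand variation_type variation_id rcv scv "
    "allele_id symbol hgvs_c hgvs_p molecular_consequence clinical_significance "
    "clinical_significance_ordered pathogenic likely_pathogenic "
    "uncertain_significance likely_benign benign review_status "
    "review_status_ordered last_evaluated all_submitters submitters_ordered "
    "all_traits all_pmids inheritance_modes age_of_onset prevalence "
    "disease_mechanism origin xrefs dates_ordered gold_stars conflicted"
).split()

# ASSUMPTION: these are the columns a pathogenicity parser would want.
WANTED = (
    "pathogenic", "likely_pathogenic", "uncertain_significance",
    "likely_benign", "benign",
)


def select_columns(header, fields, wanted=WANTED):
    """Pick values by column name so a changed column order cannot shift them."""
    index = {name: position for position, name in enumerate(header)}
    missing = [name for name in wanted if name not in index]
    if missing:
        raise KeyError(f"columns missing from header: {missing}")
    return {name: fields[index[name]] for name in wanted}


if __name__ == "__main__":
    # What a fixed slice picks up with this header layout.
    print(HEADER[12:17])
    # Where the 'pathogenic' column actually sits.
    print(HEADER.index("pathogenic"))
```

A name-based lookup fails loudly with a `KeyError` when a column is absent, rather than silently reading the wrong field.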
After installing the program, I type
charger
as suggested to view executable options and I receive the following error:
File "/home/[username]/mycharger/bin/charger", line 256
print "CharGer ERROR: Command not recognized"
^
SyntaxError: Missing parentheses in call to 'print
Any advice on how to fix this? Thank you
Hi, I have successfully run CharGer using the following command:
charger \
-f ~/vcf/17D2625146_FB_hg19.vcf \
-o test_charger_everything.tsv \
-l -D \
--mac-clinvar-tsv ~/clinvar_5b04ade/output/b37/single/clinvar_alleles.single.b37.tsv.gz \
-z ~/charger_db/grch37_pathogenic_variants.vcf \
--inheritanceGeneList ~/charger_db/inheritance_gene_table.tsv \
--PP2 ~/charger_db/pp2_gene_list.txt
I got the database files from here: https://github.com/ding-lab/CharGer/tree/master/tests/examples/annotations
My VCF was annotated by VEP with the --everything flag. This sample contains a particular variant, ENST00000267163.4:c.1954_1960+2del, which is a null variant and is not in the gnomAD database. However, CharGer reports an allele frequency of 0.5 for this variant and assigns it BA1 evidence, so it is classified as Benign.
This is the terminal output from CharGer run:
(skipping a lot of warning rows)
Running ClinVar took 30.4019100666seconds
Running exac took 2.28881835938e-05seconds
CharGer module PVS1
- truncations in genes where LOF is a known mechanism of the disease
- require the mode of inheritance to be dominant (assuming heterzygosity) and co-occurence with reduced gene expression
- run concurrently with PSC1, PMC1, PM4, PPC1, and PPC2 -
CharGer module PS1
- same peptide change as a previously established pathogenic variant
PS1 found 0 pathogenic variants
CharGer module PS2
- de novo with maternity and paternity confirmation and no family history
CharGer module PS3: Well-established in vitro or in vivo functional studies supportive of a damaging effect on the gene or gene product
CharGer module PS4: not yet implemented
CharGer module PM1: Located in a mutational hot spot and/or critical and well-established functional domain (e.g., active site of an enzyme) without benign variation
CharGer::PM1 Warning: clustersFile is not supplied. PM1 was not executed.
CharGer module PM2
- absent or extremely low frequency in controls
CharGer module PM3: not yet implemented
CharGer module PM4
- protein length changes due to inframe indels or nonstop variant of selected genes -
CharGer module PM5
- different peptide change of a pathogenic variant at the same reference peptide
PM5 found 0 pathogenic variants
CharGer module PM6
- assumed de novo without maternity and paternity confirmation
CharGer module PP1
- cosegregation with disease in family members in a known disease gene
CharGer module PP2: Missense variant in a gene that has low rate of benign missense and in which missense are common mechanism of disease
CharGer module PP3
- multiple lines of in silico evidence of deliterous effect
Found 0 variants with >= 2 of in silico evidence
CharGer module PP4: not yet implemented
CharGer module PP5: not yet implemented
CharGer module BA1
- allele frequency >5%
CharGer module BS1: not yet implemented
CharGer module BS2: not yet implemented
CharGer module BS3: not yet implemented
CharGer module BS4: not yet implemented
CharGer module BP1: Missense variant in a gene for which primarily truncations cause disease
CharGer::BP1 Error: Cannot evaluate BP1: No BP1 gene list supplied.
CharGer module BP2: not yet implemented
CharGer module BP3: not yet implemented
CharGer module BP4
- in silico evidence of no damage
Found 0 variants with >= 2 with in silico evidence
CharGer module BP5: not yet implemented
CharGer module BP6: not yet implemented
CharGer module BP7: not yet implemented
CharGer module PSC1
Recessive truncations of susceptible genes
CharGer module PMC1
Truncations of genes when no gene list provided
CharGer module PPC1
- protein length changes due to inframe indels or nonstop variant of other, not-specificied genes -
CharGer module PPC2
- protein length changes due to inframe indels or nonstop variant when no susceptibility genes given -
CharGer module BSC1
- same peptide change as a previously established benign variant
BSC1 found 0 benign variants
CharGer module BMC1
- different peptide change of a benign variant at the same reference peptide
BMC1 found 0 benign variants
0.0005 < 0.05
write 27 charged user variants to test_charger_everything.tsv
charger::writeSummary Warning: skipping pubmed link tests
CharGer run Times:
input parse time (s): 0.000585079193115
get input data time (s): 2.63706803322
get external data time (s): 30.4032850266
modules run time (s): 0.0069580078125
classification time (s): 0.00322985649109
CharGer full run time (s): 33.0511260033
What went wrong in my case? How can I improve the result? Thank you very much.
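One way to investigate a surprising allele frequency like the one described above is to look at which frequency fields VEP actually wrote into the CSQ annotation for that record. Below is a minimal Python sketch that splits a CSQ entry into named fields using the format string from the VCF header. The function names and the example header and record are made up for illustration, and which frequency field CharGer v0.5.4 reads for BA1 is not stated in this thread.

```python
def csq_field_names(header_line):
    """Extract the '|'-separated field names from a ##INFO=<ID=CSQ,...> header line."""
    format_part = header_line.split("Format: ", 1)[1]
    return format_part.strip().rstrip('">').split("|")


def csq_entries(info_field, names):
    """Yield one dict per CSQ entry (one per transcript) from a VCF INFO string."""
    for item in info_field.split(";"):
        if item.startswith("CSQ="):
            for entry in item[len("CSQ="):].split(","):
                yield dict(zip(names, entry.split("|")))


def frequency_fields(entry):
    """Keep only non-empty fields whose name ends in 'AF' (AF, gnomAD_AF, ExAC_MAF, ...)."""
    return {name: value for name, value in entry.items()
            if name.endswith("AF") and value}


if __name__ == "__main__":
    # Made-up header and record, shortened to five CSQ fields.
    header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
              'annotations from Ensembl VEP. Format: '
              'Allele|Consequence|SYMBOL|AF|gnomAD_AF">')
    info = "DP=10;CSQ=-|frameshift_variant|GENE1|0.5|"
    names = csq_field_names(header)
    for entry in csq_entries(info, names):
        print(entry["SYMBOL"], entry["Consequence"], frequency_fields(entry))
```

Running this kind of check on the real record would show whether the 0.5 comes from one of the annotated frequency fields or from somewhere else.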
Hello,
I would like to know the reason for these biomine warnings:
biomine warning: EX not found in conversion tables
biomine warning: ECPIC not found in conversion tables
biomine warning: EX not found in conversion tables
biomine warning: ECPIC not found in conversion tables
biomine warning: EX not found in conversion tables
biomine warning: ECPIC not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
biomine warning: RVEEVQNVIN not found in conversion tables
This is my shell command:
charger -f ${in} -o ${out} -D --inheritanceGeneList ${pvs1} -E -l --mac-clinvar-tsv ${clinvar} --PP2GeneList ${pp2} --BP1GeneList ${bp1} -H ${pm5} -g ${mmGenes} -z ${mmVariants} --exac-vcf ${ExAC}
Looking forward to your reply, thanks.
Hi team,
I am running CharGer on a WGS VCF file (with CSQ annotations), and a warning message like
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
Hint: Is the input amino acid change column correct?
Problem variant: :None:None-None->-:::c.:::p. -- p.Pro22=
appears almost continually.
Should I filter out the intronic variants before running CharGer?
There is a mismatch in the description of the module options. The readme should be updated to reflect the program help text.
Instructions or guidelines are needed.
Hello,
When I input my .maf file, charger always prints "CharGer Error: bad .maf file", so I would like to know whether there are errors in my .maf file. I have pasted its contents below. Many thanks!
maf.file:
Hugo_Symbol Entrez_Gene_Id Chromosome Start_position End_position Variant_Classification Variant_Type Reference_Allele Tumor
CDK11A 728642 1 1647893 1647894 In_Frame_Ins INS - - TTTCTT rs200224067|rs199866927|rs144636354
PRDM2 7799 1 14106394 14106395 In_Frame_Ins INS - - CTC rs2308040|rs148293494|rs59028030
SLC25A34 284723 1 16063253 16063265 Frame_Shift_Del DEL CAGCCTGGCGTGC CAGCCTGGCGTGC - BGI-0
SERINC2 347735 1 31905889 31905890 In_Frame_Ins INS - - CAG rs33956499|rs3050461|rs5773362 byFrequency
GJB4 127534 1 35227008 35227011 Frame_Shift_Del DEL TGTC TGTC - rs146812843 BGI-0103-ESCC-060N
MAP7D1 55700 1 36643701 36643703 In_Frame_Del DEL AGA AGA - rs3045695|rs141305015|rs200892098 byFre
KLF17 128209 1 44596380 44596382 In_Frame_Del DEL CAA CAA - rs200059598|rs34057178 byFrequency
CYP4B1 1580 1 47280747 47280748 Splice_Site DEL AT AT - rs55835239|rs397687617|rs3215983
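The thread does not say what triggers "bad .maf file", so the following is only a generic structural check, sketched in Python: it confirms that every data row has the same number of tab-separated fields as the header row. The function name `check_maf` is hypothetical, and passing this check does not guarantee that CharGer will accept the file.

```python
def check_maf(lines):
    """Return (line_number, n_fields, n_header_fields) for each mismatched row.

    Blank lines and '#' comment lines are skipped; the first remaining line
    is treated as the header.
    """
    header = None
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if header is None:
            header = fields
            continue
        if len(fields) != len(header):
            problems.append((number, len(fields), len(header)))
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_maf.py my_variants.maf
    with open(sys.argv[1]) as handle:
        for number, found, expected in check_maf(handle):
            print(f"line {number}: {found} fields, header has {expected}")
```

A file whose columns were separated by spaces rather than tabs (for example after copy and paste) would show up here as rows with a single field.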
refer to commit 7d7d291
I removed the entire anaconda2 directory:
rm -rf ~/anaconda2
mkdir ~/CharGer
cd ~/CharGer
wget https://repo.anaconda.com/archive/Anaconda2-5.2.0-Linux-x86_64.sh
bash Anaconda2-5.2.0-Linux-x86_64.sh
Do you accept the license terms? [yes|no]
[no] >>> yes
Anaconda2 will now be installed into this location:
/home/yubau/anaconda2
Press ENTER to confirm the location
Press CTRL-C to abort the installation
Or specify a different location below
[/home/yubau/anaconda2] >>>
Do you wish the installer to prepend the Anaconda2 install location
to PATH in your /home/yubau/.bashrc ? [yes|no]
[no] >>>yes
Do you wish to proceed with the installation of Microsoft VSCode? [yes|no]
>>> no
(finish install anaconda)
conda create --name CharGer python=2.7
conda activate CharGer
(CharGer) [yubau@cmuh-i2 ~]$ which pip
~/anaconda2/envs/CharGer/bin/pip
(CharGer) [yubau@cmuh-i2 ~]$ pip --version
pip 19.3.1 from /home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/pip (python 2.7)
(CharGer) [yubau@cmuh-i2 ~]$ conda --version
conda 4.5.4
(CharGer) [yubau@cmuh-i2 ~]$ which conda
~/anaconda2/bin/conda
conda install pysam
pip install pysam
(CharGer) [yubau@cmuh-i2 ~]$ conda list
# packages in environment at /home/yubau/anaconda2/envs/CharGer:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
AdvancedHTMLParser 9.0.1
BioMine 0.9.5
ca-certificates 2020.1.1 0
certifi 2019.11.28 py27_0
chardet 3.0.4
CharGer 0.5.4
idna 2.9
libedit 3.1.20181209 hc058e9b_0
libffi 3.2.1 hd88cf55_4
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
ncurses 6.2 he6710b0_0
numpy 1.16.6
openssl 1.1.1d h7b6447c_4
pip 19.3.1 py27_0
pysam 0.6 py27_0
pysam 0.15.4
python 2.7.17 h9bab390_0
PyVCF 0.6.8
QueryableList 3.1.0
readline 7.0 h7b6447c_5
requests 2.23.0
scipy 1.2.3
setuptools 44.0.0 py27_0
sqlite 3.31.1 h7b6447c_0
tk 8.6.8 hbc83047_0
urllib3 1.25.8
wheel 0.33.6 py27_0
zlib 1.2.11 h7b6447c_3
(CharGer) [yubau@cmuh-i2 ~]$ pip list
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Package Version
------------------ -------------------
AdvancedHTMLParser 9.0.1
BioMine 0.9.5
bz2file 0.98
certifi 2019.11.28
chardet 3.0.4
CharGer 0.5.4
idna 2.9
numpy 1.16.6
pip 19.3.1
pysam 0.15.4
PyVCF 0.6.8
QueryableList 3.1.0
requests 2.23.0
scipy 1.2.3
setuptools 44.0.0.post20200106
urllib3 1.25.8
virtualenv 16.4.3
wheel 0.33.6
xopen 0.5.0
cd ~/CharGer
wget -O CharGer.zip https://github.com/ding-lab/CharGer/archive/master.zip
unzip CharGer.zip
mv CharGer-master/ CharGer
cd CharGer
pip install .
(CharGer) [yubau@cmuh-i2 ~]$ which charger
~/anaconda2/envs/CharGer/bin/charger
(CharGer) [yubau@cmuh-i2 ~]$ charger
CharGer ERROR: Command not recognized
CharGer - v0.5.4
...
..
.
Log in to a Linux (CentOS 7) terminal:
conda activate CharGer
cd ~/CharGer/CharGer/Demo
charger -f demo.vcf -o demo.tsv
and I got an output file.
Then I ran it with other flags for the different data-access options:
charger -f demo.vcf -o demo.t.tsv -t
charger -f demo.vcf -o demo.E.tsv -E
charger -f demo.vcf -o demo.x.tsv -x
charger -f demo.vcf -o demo.tEx.tsv -t -E -x
None of these produced a fatal error.
But when I ran:
charger -f demo.vcf -o demo.tsv -l
I got this message:
Unsupported VEP version or no gnomAD AF annotation in input file; will search for ExAC frequencies...
Unsupported VEP version or no ExAC AF annotation in input file; will search for 1000 Genomes frequencies...
Unsupported VEP version or no gnomAD AF annotation in input file; will search for ExAC frequencies...
Unsupported VEP version or no ExAC AF annotation in input file; will search for 1000 Genomes frequencies...
Skipping: 0 for filters and 0 for AF and 0 for mutation types out of 550
No gene list file uploaded. CharGer will not make PVS1 calls.
No PP2 gene list file uploaded. CharGer will not make PP2 calls.
No BP1 gene list file uploaded. CharGer will not make BP1 calls.
No expression file uploaded. CharGer will allow all passed truncations without expression data in PVS1.
charger::getVEP Warning: skipping VEP
Running VEP took 2.09808349609e-05seconds
charger::getClinVar warning: ClinVar ReST search batch size given is greater than max allowed (50). Overriding to max search batch size.
Traceback (most recent call last):
File "/home/yubau/anaconda2/envs/CharGer/bin/charger", line 744, in <module>
main( sys.argv[1:] )
File "/home/yubau/anaconda2/envs/CharGer/bin/charger", line 663, in main
mutationTypes = mutationTypes , \
File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/charger/charger.py", line 879, in getExternalData
self.getClinVar( **kwargs )
File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/charger/charger.py", line 905, in getClinVar
self.getClinVarviaREST( **kwargs )
File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/charger/charger.py", line 918, in getClinVarviaREST
ent = entrezapi()
File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py", line 74, in __init__
self.setRequestLimits()
File "/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py", line 332, in setRequestLimits
self.setSummaryBatchSize( entrezaip.summaryBatchSize )
NameError: global name 'entrezaip' is not defined
and I got an empty output file.
I also ran:
charger -f demo.vcf -o demo.ltEx.tsv -l -t -E -x
But when I run:
charger -f demo.vcf -o demo.l.tsv -l --exac-vcf ~/CharGer3/CharGer/Demo/ExAC.r1.sites.vep.vcf --mac-clinvar-tsv ~/CharGer3/CharGer/Demo/clinvar_alleles.multi.b37.tsv.gz
it works. Why do I have to add "--exac-vcf ~/CharGer3/CharGer/Demo/ExAC.r1.sites.vep.vcf --mac-clinvar-tsv ~/CharGer3/CharGer/Demo/clinvar_alleles.multi.b37.tsv.gz" in order to access ClinVar data?
I also ran:
charger -f demo.vcf -o demo.ltEx.tsv -l -t -E -x --exac-vcf ~/CharGer3/CharGer/Demo/ExAC.r1.sites.vep.vcf --mac-clinvar-tsv ~/CharGer3/CharGer/Demo/clinvar_alleles.multi.b37.tsv.gz
Why?
Someone has reported three bugs (https://www.jianshu.com/p/544caf92b24c).
For one of the three, they say the following file needs to be modified:
/home/yubau/anaconda2/envs/CharGer/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py
On lines 332 and 333, entrezaip should be changed to entrezapi.
Is that correct?
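The NameError in the traceback (`global name 'entrezaip' is not defined`, raised from line 332 of entrezapi.py) is consistent with that reported typo. A minimal Python sketch of applying the rename to a local copy of the file follows. The helper names are hypothetical, the script keeps a `.bak` backup, and whether this is the fix the maintainers intend is not confirmed in this thread.

```python
import re
from pathlib import Path


def fix_entrez_typo(source: str) -> str:
    """Replace the misspelled identifier 'entrezaip' with 'entrezapi'."""
    return re.sub(r"\bentrezaip\b", "entrezapi", source)


def patch_file(path: str) -> bool:
    """Patch the file in place, keeping a .bak copy. Return True if it changed."""
    target = Path(path)
    original = target.read_text()
    fixed = fix_entrez_typo(original)
    if fixed == original:
        return False
    Path(str(target) + ".bak").write_text(original)
    target.write_text(fixed)
    return True


if __name__ == "__main__":
    import sys

    # Usage: python patch_entrez.py /path/to/site-packages/biomine/webapi/entrez/entrezapi.py
    print("patched" if patch_file(sys.argv[1]) else "nothing to change")
```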
(CharGer) [yubau@cmuh-i2 Demo]$ which python
~/anaconda2/envs/CharGer/bin/python
(CharGer) [yubau@cmuh-i2 Demo]$ python -V
Python 2.7.17 :: Anaconda, Inc.
less ~/.bashrc
# added by Anaconda2 installer
export PATH="/home/yubau/anaconda2/bin:$PATH"
export PATH="/home/yubau/anaconda2/envs/CharGer/bin:$PATH"
source "/home/yubau/anaconda2/etc/profile.d/conda.sh"
Also, where is the "diseases file" or "gene\tdisease\tmode_of_inheritance.tsv"?
Access data
-l ClinVar (flag)
-x ExAC (flag)
-E VEP (flag)
-t TCGA cancer types (flag)
Using these flags turns on accession features built in. For the ClinVar, ExAC, and VEP flags, if no local VEP or database is provided, then BioMine will be used to access the ReST interface. CharGer is currently capable of handling all VEP releases up until release 97. [[[[[The TCGA flag allows disease determination from sample barcodes in a .maf when using a diseases file (see below).]]]]]
Cross-reference data files
-z pathogenic variants, .vcf
-e expression matrix file, .tsv
--inheritanceGeneList inheritance gene list file, (format: gene\tdisease\tmode_of_inheritance) .txt
--PP2GeneList PP2 gene list file, (format: column of genes) .txt
--BP1GeneList BP1 gene list file, (format: column of genes) .txt
[[[[[ -d diseases file, (format: gene\tdisease\tmode_of_inheritance) .tsv]]]]]
-n de novo file, standard .maf
-a assumed de novo file, standard .maf
-c co-segregation file, standard .maf
-H HotSpot3D clusters file, .clusters
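The option list above gives the inheritance gene list format as three tab-separated columns: gene, disease, mode_of_inheritance. A minimal Python sketch of reading such a file into a dictionary is shown below, which can help when assembling one by hand. The function name, the comment-line handling, and the example values (gene names, disease names, mode strings) are placeholders; the vocabulary CharGer expects in each column is not stated here.

```python
def parse_inheritance_table(lines):
    """Map gene -> {disease: mode_of_inheritance} from tab-separated lines.

    Blank lines and '#' comment lines are skipped (an assumption). A line
    with fewer than three columns raises ValueError.
    """
    table = {}
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) < 3:
            raise ValueError(f"line {number}: expected 3 columns, got {len(columns)}")
        gene, disease, mode = columns[:3]
        table.setdefault(gene, {})[disease] = mode
    return table


if __name__ == "__main__":
    # Placeholder rows, not real annotations.
    example = [
        "GENE1\tdisease A\tdominant\n",
        "GENE1\tdisease B\trecessive\n",
        "GENE2\tdisease A\tdominant\n",
    ]
    for gene, diseases in parse_inheritance_table(example).items():
        print(gene, diseases)
```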
I cannot find the diseases file. Could you upload a "diseases file" or "gene\tdisease\tmode_of_inheritance.tsv", or tell me where the file is? Thanks.
I need to run charger because my whole team is waiting on it.
We have two thousand whole-genome sequencing VCF files and hundreds of cancer panel VCF files waiting to be run through charger, especially with ClinVar data access.
Please reply to me. Thank you so much.
yubau, from Taiwan
Hi, I'm trying out CharGer for prediction and annotation of my VCF file (annotated by VEP). My command is as follows:
charger \
-f test_charger_vep.vcf \
-o test_charger_vep2.tsv \
-l -D \
--mac-clinvar-tsv ~/clinvar/output/b37/single/clinvar_alleles.single.b37.vcf.gz
But it resulted in an error:
charger::getClinVar
Traceback (most recent call last):
File "~/anaconda3/envs/charger/bin/charger", line 743, in <module>
main( sys.argv[1:] )
File "~/anaconda3/envs/charger/bin/charger", line 662, in main
mutationTypes = mutationTypes , \
File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 821, in getExternalData
self.getClinVar( **kwargs )
File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 842, in getClinVar
clinvarSet = self.getMacClinVarTSV( macClinVarTSV )
File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 887, in getMacClinVarTSV
[ description , status ] = self.parseMacPathogenicity( fields[12:17] )
File "~/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 909, in parseMacPathogenicity
named = fields[0]
IndexError: list index out of range
When I remove the option --mac-clinvar-tsv, it runs fine.
I used the single file from the latest Mac ClinVar repository.
Could you please help me with this problem?
P.S. I also want to add HotSpot3D to CharGer. How would I use https://github.com/ding-lab/hotspot3d to generate clusters for this task? What input file do I need to get the clusters?
Thank you very much
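One thing worth checking with this traceback: the failing call slices fields[12:17] and then indexes the result, so an IndexError there suggests a line with 12 or fewer fields, and the command above passes a .vcf.gz to --mac-clinvar-tsv. The sketch below lists lines that are too short. It assumes the parser splits each line on tabs, and the minimum of 13 columns is inferred from the traceback alone, not from CharGer documentation; find_short_rows is a hypothetical helper name.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_short_rows(path, min_columns=13, limit=5):
    """Return up to `limit` (line_number, n_fields) pairs with fewer than `min_columns` fields."""
    short = []
    with open_text(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.rstrip("\n").split("\t")
            if len(fields) < min_columns:
                short.append((line_number, len(fields)))
                if len(short) >= limit:
                    break
    return short
```

Calling find_short_rows on the file given to --mac-clinvar-tsv and getting a non-empty list back (for example, VCF header lines with a single field) would point to the input file rather than to CharGer itself.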
Hello,
We get warnings when running with VEP-annotated VCF or MAF files.
Here are the details:
charger -f ../../software/vcf2maf-1.6.16/gnomAD_APC_maf.vep.vcf -o APC_charger.tsv --PP2GeneList PP2.genes.hg19 --BP1GeneList BP1.genes.hg19 -H APC_inp.maf.3D_Proximity.pairwise.site.l0.ad10.r20.clusters -D --inheritanceGeneList inheritanceGeneList.txt --exac-vcf ~pyang/.vep/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz --mac-clinvar-tsv ./clinvar_alleles.single.b37.tsv.gz
Using default module scores and category thresholds:
BA1 = -8
BMC1 = -2
BP1 = -1
BP2 = -1
BP3 = -1
BP4 = -1
BP5 = -1
BP6 = -1
BP7 = -1
BS1 = -4
BS2 = -4
BS3 = -4
BS4 = -4
BSC1 = -6
PM1 = 2
PM2 = 2
PM3 = 2
PM4 = 2
PM5 = 2
PM6 = 2
PMC1 = 2
PP1 = 1
PP2 = 1
PP3 = 1
PP4 = 1
PP5 = 1
PPC1 = 1
PPC2 = 1
PS1 = 7
PS2 = 4
PS3 = 4
PS4 = 4
PSC1 = 4
PVS1 = 8
maxBenignScore = -8
maxLikelyBenignScore = -4
minLikelyPathogenicScore = 5
minPathogenicScore = 9
Will capture vcf details for output: False
This .vcf has VEP annotations!
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
Hint: Is the input amino acid change column correct?
Problem variant: :None:None-None->-:::c.:::p. -- p.Pro9=
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
Hint: Is the input amino acid change column correct?
Problem variant: :None:None-None->-:::c.:::p. -- p.Val10=
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
Hint: Is the input amino acid change column correct?
Problem variant: :None:None-None->-:::c.:::p. -- p.Pro14=
biomine::variant::mafvariant Warning: could not find amino acid change or intronic change
Hint: Is the input amino acid change column correct?
Problem variant: :None:None-None->-:::c.:::p. -- p.Pro14=
Why is "charger" trying to access "biomine" when the input file already has VEP annotations? Moreover, our input file is limited to one gene, i.e. APC, but the standard output (see the enclosed file test.log) with warnings appears to include many genes.
Thanks for your help.
Hi CharGer Team!
Whenever I try to use ClinVar with the -l option, I encounter this error:
charger::getClinVar
warning: ClinVar ReST search batch size given is greater than max allowed (50). Overriding to max search batch size.
Traceback (most recent call last):
File "/home/minku/anaconda3/envs/charger/bin/charger", line 744, in <module>
main( sys.argv[1:] )
File "/home/minku/anaconda3/envs/charger/bin/charger", line 663, in main
mutationTypes = mutationTypes , \
File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 879, in getExternalData
self.getClinVar( **kwargs )
File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 905, in getClinVar
self.getClinVarviaREST( **kwargs )
File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/charger/charger.py", line 918, in getClinVarviaREST
ent = entrezapi()
File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py", line 74, in __init__
self.setRequestLimits()
File "/home/minku/anaconda3/envs/charger/lib/python2.7/site-packages/biomine/webapi/entrez/entrezapi.py", line 332, in setRequestLimits
self.setSummaryBatchSize( entrezaip.summaryBatchSize )
NameError: global name 'entrezaip' is not defined
Is entrezaip supposed to be entrezapi on lines 332 and 333?
Thank you! :)
I'm running charger 0.5.4 on a gzipped VEP version 99 annotated file. I'm getting a biomine error (could not find amino acid change or intronic change) as shown in the image below.
CharGer completes, however, and I'm able to see the scores for the variants. I'm not sure whether any variants were missed, though. I've seen similar errors posted by other users (#5), and the reply was that this error doesn't affect results and that it was fixed by version 0.5.4. However, I still see the error.
Hello,
I was looking for the reference for the inheritance gene list (20160301_Rahman_KJ_KH_gene_table_CharGer.txt.gz).
I found this paper from 2016 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4975511/bin/NIHMS69105-supplement-Supplementary_Table_1.xlsx) but it contains fewer genes than the one here: https://github.com/ding-lab/CharGer/tree/7d7d2911b89261fa5dceea6395a5d188a82757f2/PanCanAtlasData
Can the exact reference be shared?
Hi,
Thank you for developing such a good tool! I was able to run CharGer successfully, but all variants are classified as "Benign" in the column "CharGer_Classification" when I don't use the parameters -l -O --mac-clinvar-tsv.
I see that there are also many cross-reference data files, but I'm not sure which of them can help improve my results or where to acquire them. The disease of my samples is lung cancer. Would you mind providing some suggestions?
Why is clinvar done always true when reading a vcf file?
Line 489 in 9d0c2c6
Hi -- I was able to successfully run CharGer, but all variants are classified as "Uncertain Significance." Is there a setting that must be tweaked for variant annotation to work correctly? I'm guessing at least some of the variants we have must not be "Uncertain Significance." I'm using the docker image on DockerHub and the command "charger -f /mount/sample_id.vt2_normalized_spanning_alleles.vcf -o /mount/charger_annotated.tsv"
python test_chargervariant.py
E
======================================================================
ERROR: test_nonzero (__main__.testchargervariant)
Traceback (most recent call last):
File "test_chargervariant.py", line 13, in test_nonzero
if ( v ):
File "/xxx/python2.7/site-packages/charger/chargervariant.py", line 201, in __nonzero__
elif ( self.checkIfRefAltStrand( k ) ):
AttributeError: 'chargervariant' object has no attribute 'checkIfRefAltStrand'
Ran 1 test in 0.001s
FAILED (errors=1)
I used the variant at chr11 108345818 (rs587779872, C>T). In the paper Pathogenic Germline Variants in 10389 Adult it is annotated as PM5, but I can't get the same output.
To use the ClinVar database when running CharGer, we have to supply a 'clinvar_alleles.tsv.gz' file, but the one we can download from your website is out of date (2018 version). Do you have the latest version of this ClinVar file? Otherwise, is there an alternative way to provide ClinVar annotations so that CharGer can evaluate the PS1 level? Thanks a lot!
When I run charger 0.5.4 to see the options, this is the output (first few lines):
CharGer - v0.5.3
Usage: charger [options]
Accepted input data files:
-m Standard .maf
-f Standard .vcf
-T Custom .tsv
I checked if I downloaded the correct version.
I want to use a local VEP when running charger.
My command is:
(CharGer) [yubau@cmuh-i2 Demo]$ charger -f clinvar_20200210.vep.vcf -o clinvar_20200210.vep.charger.ltEx.tsv -l -t -E -x --exac-vcf ExAC.r1.sites.vep.vcf --mac-clinvar-tsv clinvar_alleles.multi.b37.tsv.gz --perl /home/yubau/perl --vep-script /home/yubau/VEP/ensembl-vep/vep --vep-config /home/yubau/VEP/ensembl-vep/t/Config.t --vep-cache /home/yubau/.vep --vep-version 99 --vep-output clinvar_20200210_output.charger.vep.vcf --grch 37 --ensembl-release 75 --reference-fasta /home/yubau/.vep/homo_sapiens/99_GRCh37 --fork 48
Is anything wrong?
Can you give me an example command for running charger together with VEP?
Thanks
Hello, I got this message when I ran charger:
CharGer::runIndelModules Error: Cannot evaluate PVS1 or PM4: No gene list supplied. Which argument should I provide?
I used charger -f ${input} -o ${out} -D --inheritanceGeneList ${inherit} --mac-clinvar-tsv ${clinvar} --PP2GeneList ${pp2} -z ${mmVariants} --exac-vcf ${ExAC} -H
Another question: I get "Unsupported VEP version or no gnomAD AF annotation in input file; will search for ExAC frequencies...", although my input file was annotated with VEP. I hope you can reply soon, thanks!
Hello. I'm a CharGer user.
I appreciate your helpful program.
In this paper (Cell, 2018, 173, 355-370), this BRCA2 variant (Supplement xlsx file 2A, 13:g.32890660A>G) is one of pathogenic and rare variants (Charger score; 11, PM2, PM5, PS1). But, in my results, this variant's CharGer score is only 2 (only PM2). So in my results, this variant is not called pathogenic or likely pathogenic.
And, in your Supplementary table, Charger score of ATM variant (11:g.108129749C>T) is 17 (PS1+PVS1+PM2). But in my result, this variant's CharGer score is only 6 (PM2 + PSC1).
Could you point out my faults?
I want to get the same results as you.
I think this difference results from "emptyRemoved_20160428_pathogenic_variants_HGVSg_VEP.vcf".
This file only has variants on the following chromosomes: chr10, chr13, chr17, chr5.
I have added my script.
Thanks
Oh.
#cf. my input vcf : varscan2 vcf --> annotation by vep (v94, ref: GRCh37, Exac ; nonTCGA version r1)
mmGenes=$PanCanAtlasData/20160301_Rahman_KJ_KH_gene_table_CharGer.txt
mmVariants=$PanCanAtlasData/emptyRemoved_20160428_pathogenic_variants_HGVSg_VEP.vcf
hotspot=$PanCanAtlasData/MC3.noHypers.mericUnspecified.d10.r20.v114.clusters
clinvar=$PanCanAtlasData/clinvar_alleles.single.b37.tsv.gz
rareThreshold="0.01" # 1% threshold
commonThreshold="0.05" # 5% threshold
$bin/charger --include-vcf-details
-f
-o
-O
-D
-g ${mmGenes}
-z ${mmVariants}
-H ${hotspot}
-l
--rare-threshold $rareThreshold
--common-threshold $commonThreshold
--mac-clinvar-tsv ${clinvar}
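The score differences described above can be reproduced by hand from the default module scores and thresholds printed in the logs quoted in these reports. The sketch below assumes that a variant's CharGer score is the plain sum of its module scores and that thresholds are applied inclusively; both are assumptions for illustration, as are the function names and the "Likely" labels, which are inferred from the threshold names.

```python
# Default module scores and thresholds copied from the quoted log output.
MODULE_SCORES = {"PVS1": 8, "PS1": 7, "PSC1": 4, "PM2": 2, "PM5": 2, "BA1": -8}
MIN_PATHOGENIC = 9
MIN_LIKELY_PATHOGENIC = 5
MAX_LIKELY_BENIGN = -4
MAX_BENIGN = -8


def total_score(modules):
    """Sum the default scores of the modules that fired (assumed additive)."""
    return sum(MODULE_SCORES[name] for name in modules)


def classify(score):
    """Map a summed score to a class, assuming inclusive thresholds."""
    if score >= MIN_PATHOGENIC:
        return "Pathogenic"
    if score >= MIN_LIKELY_PATHOGENIC:
        return "Likely Pathogenic"
    if score <= MAX_BENIGN:
        return "Benign"
    if score <= MAX_LIKELY_BENIGN:
        return "Likely Benign"
    return "Uncertain Significance"


for modules in (["PM2", "PM5", "PS1"], ["PM2"], ["PS1", "PVS1", "PM2"], ["PM2", "PSC1"]):
    score = total_score(modules)
    print("+".join(modules), score, classify(score))
```

Under these assumptions, PM2+PM5+PS1 gives 11 and PS1+PVS1+PM2 gives 17, matching the published scores, while PM2 alone gives 2 and PM2+PSC1 gives 6, matching the scores reported here. The gap is therefore the missing PS1, PM5, and PVS1 calls, which depend on the pathogenic variant and gene list inputs.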
Hi there!
I am trying to install CharGer using conda on my Mac. I have followed the instructions here (Installation using conda section).
I have created a conda environment that has Python 2.7.18.
When I type pip install ., this message appears:
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
ERROR: Package u'charger-0.6.0b1' requires a different Python: 2.7.18 not in '>=3.6'
What should I do to fix this error? Thank you so much! :)
update (20.12.05):
conda install charger: it wasn't available.
pip install charger: regardless of Python version, version 5.2.0 is installed.
pip install . in a Python 3.9 environment: version 6 beta is installed.
What should I do to install the stable version (5.4.0)? Or is it OK to use 6 beta?
Thank you! :)
update (20.12.7):
wget -O CharGer.zip https://github.com/ding-lab/CharGer/archive/v0.5.4.zip
yay! :)
Hello, I want to ask whether the pathogenicVariants.vcf file for demo/runDemo.sh has been lost. I can't find the pathogenicVariants.vcf mentioned in runDemo.sh.
Thanks for your help.
Hello,
I tried to use CharGer on a WES VCF file processed with GATK 4.1.1.0 from hg38 BAM files and annotated with VEP 95. The VCF contains 547 samples. Looking at the results, all variants are classified as "Uncertain":
cat out.charger.txt | awk -F '\t' '{print $20}' | sort | uniq -c
224571 Benign
933207 Uncertain Significance
Looking in the log file, I can see that some warnings pop up:
No gene list file uploaded. CharGer will not make PVS1 calls.
No PP2 gene list file uploaded. CharGer will not make PP2 calls.
No BP1 gene list file uploaded. CharGer will not make BP1 calls.
No expression file uploaded. CharGer will allow all passed truncations without expression data in PVS1.
Is it expected to have only Benign or Uncertain Significance variants? Do you perhaps have an (unannotated) test VCF (in hg38) with variants that should be annotated as pathogenic, so that I can test my config?
Thanks
The log file :
charger -f input.sort.vcf.gz -o out.charger.txt --vep-cache /home/vep/.vep/ --vep-version 95 --grch 38 --reference-fasta /home/genomes/hg38/Homo_sapiens_assembly38.fasta
Using default module scores and category thresholds:
BA1 = -8
BMC1 = -2
BP1 = -1
BP2 = -1
BP3 = -1
BP4 = -1
BP5 = -1
BP6 = -1
BP7 = -1
BS1 = -4
BS2 = -4
BS3 = -4
BS4 = -4
BSC1 = -6
PM1 = 2
PM2 = 2
PM3 = 2
PM4 = 2
PM5 = 2
PM6 = 2
PMC1 = 2
PP1 = 1
PP2 = 1
PP3 = 1
PP4 = 1
PP5 = 1
PPC1 = 1
PPC2 = 1
PS1 = 7
PS2 = 4
PS3 = 4
PS4 = 4
PSC1 = 4
PVS1 = 8
maxBenignScore = -8
maxLikelyBenignScore = -4
minLikelyPathogenicScore = 5
minPathogenicScore = 9
Will capture vcf details for output: False
This .vcf has AF!
Skipping: 0 for filters and 0 for AF and 0 for mutation types out of 1157778
No gene list file uploaded. CharGer will not make PVS1 calls.
No PP2 gene list file uploaded. CharGer will not make PP2 calls.
No BP1 gene list file uploaded. CharGer will not make BP1 calls.
No expression file uploaded. CharGer will allow all passed truncations without expression data in PVS1.
charger::getVEP Warning: skipping VEP
Running VEP took 7.10487365723e-05seconds
charger::getClinVar
Running ClinVar took 1.69277191162e-05seconds
Running exac took 1.59740447998e-05seconds
CharGer module PVS1
- truncations in genes where LOF is a known mechanism of the disease
- require the mode of inheritance to be dominant (assuming heterzygosity) and co-occurence with reduced gene expression
- run concurrently with PSC1, PMC1, PM4, PPC1, and PPC2 -
CharGer::runIndelModules Error: Cannot evaluate PVS1 or PM4: No gene list supplied.
CharGer module PS1
- same peptide change as a previously established pathogenic variant
PS1 found 0 pathogenic variants
CharGer module PS2
- de novo with maternity and paternity confirmation and no family history
CharGer module PS3: Well-established in vitro or in vivo functional studies supportive of a damaging effect on the gene or gene product
CharGer module PS4: not yet implemented
CharGer module PM1: Located in a mutational hot spot and/or critical and well-established functional domain (e.g., active site of an enzyme) without benign variation
CharGer::PM1 Warning: clustersFile is not supplied. PM1 was not executed.
CharGer module PM2
- absent or extremely low frequency in controls
CharGer module PM3: not yet implemented
CharGer module PM4
- protein length changes due to inframe indels or nonstop variant of selected genes -
CharGer module PM5
- different peptide change of a pathogenic variant at the same reference peptide
PM5 found 0 pathogenic variants
CharGer module PM6
- assumed de novo without maternity and paternity confirmation
CharGer module PP1
- cosegregation with disease in family members in a known disease gene
CharGer module PP2: Missense variant in a gene that has low rate of benign missense and in which missense are common mechanism of disease
CharGer::PP2 Error: Cannot evaluate PP2: No PP2 gene list supplied.
CharGer module PP3
- multiple lines of in silico evidence of deliterous effect
Found 0 variants with >= 2 of in silico evidence
CharGer module PP4: not yet implemented
CharGer module PP5: not yet implemented
CharGer module BA1
- allele frequency >5%
CharGer module BS1: not yet implemented
CharGer module BS2: not yet implemented
CharGer module BS3: not yet implemented
CharGer module BS4: not yet implemented
CharGer module BP1: Missense variant in a gene for which primarily truncations cause disease
CharGer::BP1 Error: Cannot evaluate BP1: No BP1 gene list supplied.
CharGer module BP2: not yet implemented
CharGer module BP3: not yet implemented
CharGer module BP4
- in silico evidence of no damage
Found 0 variants with >= 2 with in silico evidence
CharGer module BP5: not yet implemented
CharGer module BP6: not yet implemented
CharGer module BP7: not yet implemented
CharGer module PSC1
Recessive truncations of susceptible genes
CharGer module PMC1
Truncations of genes when no gene list provided
CharGer module PPC1
- protein length changes due to inframe indels or nonstop variant of other, not-specificied genes -
CharGer module PPC2
- protein length changes due to inframe indels or nonstop variant when no susceptibility genes given -
CharGer module BSC1
- same peptide change as a previously established benign variant
BSC1 found 0 benign variants
CharGer module BMC1
- different peptide change of a benign variant at the same reference peptide
BMC1 found 0 benign variants
0.0005 < 0.05
write 1157778 charged user variants to out.vep.charger.txt
charger::writeSummary Warning: skipping pubmed link tests
CharGer run Times:
input parse time (s): 0.000698089599609
get input data time (s): 37452.09834
get external data time (s): 105.454962015
modules run time (s): 268.449123859
classification time (s): 314.865689993
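The awk tally at the top of this report relies on the classification being the 20th column. A variant of the same count that looks the column up by name is sketched below. It assumes the output .tsv starts with a header row containing a "CharGer_Classification" column (the name quoted in another report on this page); tally_classifications is a hypothetical helper.

```python
import csv
from collections import Counter


def tally_classifications(handle, column="CharGer_Classification"):
    """Count the values of the classification column in a tab-separated file."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError("column not found in header: " + column)
    return Counter(row[column] for row in reader)
```

For example, passing open("out.charger.txt", newline="") as the handle returns a Counter whose most_common() output lists each class with its count.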