Hi @ekofman
Thank you for your question and for using our tool.
All of your variants are being classified as uncertain significance because you are not passing any additional parameters to CharGer. Put simply: the more information about your variants you give CharGer, the better your variant classification will be. For more information on the ACMG guidelines implemented in CharGer, please read: https://www.nature.com/articles/gim201530.pdf
Is your input VCF file annotated with VEP? If not, you can VEP-annotate your file within CharGer (please refer to the README). This should improve your results a bit.
Adding some of the different parameters described in our README file should also make your analysis more precise.
For example, you can have CharGer access the ClinVar database by using the -l flag accompanied by the --mac-clinvar-tsv file, which you can download from the MacArthur lab GitHub page (https://github.com/macarthur-lab/clinvar/tree/5b04ade4fb4d2f13ffd39e4a8d9ade9af28fdaf9). This will allow CharGer to gather information for your variants from the ClinVar database and improve variant classification. CharGer will soon allow input files downloaded directly from ClinVar, but you can use the MacArthur lab file for now.
You can also input some of the cross-reference data files described, or an allele frequency threshold for rarity (please refer to the README). For an example of the CharGer tool being applied to one of our studies, please refer to our PanCan Atlas germline paper: https://www.sciencedirect.com/science/article/pii/S0092867418303635?via%3Dihub#sec4
The cross-reference data-files used in this study (pathogenic variants .vcf file, inheritanceGeneList (which includes a list of 152 known cancer predisposition genes), and a HotSpot3D clusters file) are present here: https://github.com/ding-lab/CharGer/tree/master/PanCanAtlasData
These files should give you a good example of their expected formats.
For a more in-depth description of some of the cross-reference data files you can use as input, please read below:
-z pathogenic variants, .vcf : this is a .vcf file with known pathogenic variants that you may compile yourself. This list is taken into account by CharGer when implementing the PS1 and PM5 ACMG evidence levels.
Depending on your study, you may compile a list of known pathogenic variants (confirmed in the literature and/or ClinVar) that are specific and/or relevant to your disease.
-e expression matrix file, .tsv : this is a .tsv file with a column for each sample and a row for each gene. If you have expression data for the genes you're targeting or genes of interest, you can generate a matrix like this using RSEM, for example.
If you do not input an expression matrix, CharGer will allow eligible truncations in your data set without expression data in the PVS1 evidence level.
If you provide expression data, a threshold of 0.2 is used. If expression is lower than the threshold, truncation is allowed in the PVS1 evidence level.
Note that the PVS1 evidence level requires the mode of inheritance to be dominant (assuming heterozygosity) and co-occurrence with reduced gene expression if expression data is provided.
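To make the expression rule above concrete, here is a minimal sketch of the decision as described in this reply. It is not CharGer's actual code: the function name and arguments are hypothetical, and the scale of the 0.2 threshold is not specified here, so the value is simply compared as-is.

```python
from typing import Optional

# Threshold quoted in this reply; its units/scale are not specified here.
EXPRESSION_THRESHOLD = 0.2


def pvs1_truncation_allowed(
    is_dominant: bool,
    expression: Optional[float] = None,
    threshold: float = EXPRESSION_THRESHOLD,
) -> bool:
    """Hypothetical helper mirroring the PVS1 expression rule described above."""
    if not is_dominant:
        # PVS1 requires a dominant mode of inheritance.
        return False
    if expression is None:
        # No expression matrix supplied: eligible truncations are allowed.
        return True
    # Expression supplied: allowed only when it is below the threshold.
    return expression < threshold
```

For example, `pvs1_truncation_allowed(True)` models a run without an expression matrix, while `pvs1_truncation_allowed(True, expression=0.5)` models a truncation in a gene whose expression is above the threshold.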
--inheritanceGeneList: this is a tab-delimited file that should contain three columns: gene, disease, and mode of inheritance (autosomal dominant, autosomal recessive). Make sure to use approved HUGO symbols.
This file should be used when you have a list of known predisposition genes you would like to input to CharGer. This list is taken into account by several evidence levels (PVS1, PSC1, PM4, PP2, and PPC1).
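If you are compiling this list yourself, a small script can help keep the three columns consistent. The sketch below is only an illustration: the function name, the example rows, the absence of a header row, and the exact spelling of the mode-of-inheritance labels are assumptions, so please compare the output against the example file in PanCanAtlasData before using it.

```python
import csv

# Assumed labels, taken from the description above; check the example file.
ALLOWED_MODES = {"autosomal dominant", "autosomal recessive"}


def write_inheritance_gene_list(rows, path):
    """Write (gene, disease, mode of inheritance) rows as tab-delimited text."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for gene, disease, mode in rows:
            if mode not in ALLOWED_MODES:
                raise ValueError(
                    f"unexpected mode of inheritance for {gene}: {mode!r}"
                )
            writer.writerow([gene.strip(), disease.strip(), mode])


# Illustrative rows only; build your own list from curated sources.
example_rows = [
    ("BRCA1", "hereditary breast and ovarian cancer", "autosomal dominant"),
    ("MUTYH", "MUTYH-associated polyposis", "autosomal recessive"),
]
```

Calling `write_inheritance_gene_list(example_rows, "inheritanceGeneList.tsv")` would then produce a two-line, three-column file.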
--PP2 Gene list: this is just a file with one gene per line (be sure to use approved HUGO symbols). Following the ACMG guidelines description, this list should include susceptibility genes that have a low rate of benign missense variation and in which missense variants are a common mechanism of disease. Missense variants in any of these genes will fall into the PP2 evidence level.
--BP1 Gene list: same format as the PP2 Gene list. Following the ACMG guidelines description, this list should include genes for which primarily truncating variants are known to cause disease. Missense variants falling in any of these genes will fall into the BP1 evidence level.
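Since the PP2 and BP1 gene lists share the same one-gene-per-line layout, a short helper like the following can write either of them from a Python list. This is a sketch with a hypothetical function name; it only strips whitespace and drops duplicates, and it does not check that the symbols are approved HUGO symbols.

```python
def write_gene_list(genes, path):
    """Write unique, whitespace-stripped gene symbols to path, one per line."""
    seen = set()
    cleaned = []
    for gene in genes:
        symbol = gene.strip()
        # Skip blank entries and repeated symbols, keeping first-seen order.
        if symbol and symbol not in seen:
            seen.add(symbol)
            cleaned.append(symbol)
    with open(path, "w") as handle:
        handle.write("\n".join(cleaned) + "\n")
    return cleaned
```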
-n de novo file: this is a standard maf (mutation annotation format) file; this file should contain de novo variants with maternity and paternity confirmation and no family history. This file, if provided, is taken into account in the PS2 ACMG evidence level. If you have this information for your dataset, please provide it using this argument.
-a assumed de novo file: this is a standard maf file as above; this file should contain assumed de novo variants from your dataset, i.e. variants for which you have evidence that they are de novo, but without maternity or paternity confirmation. This file, if provided, is taken into account in the PM6 ACMG evidence level.
-c co-segregation file: this is also a standard maf file; this file should include variants cosegregating with disease in multiple affected family members in a gene definitively known to cause the disease (according to ACMG guidelines).
-H HotSpot3D clusters file: this file can be generated by our HotSpot3D tool (https://github.com/ding-lab/hotspot3d), which identifies mutation hotspots from linear protein sequence and correlates the hotspots with known or potentially interacting domains, mutations, or drugs. If provided, this file is taken into account in the PM1 evidence level. If a germline variant is located in a mutational hot spot and/or critical and well-established functional domain (e.g. active site of an enzyme) without benign variation, the variant is flagged with a pathogenic characterization of PM1.
An example of this file, which was used in our PanCan study, is present here: https://github.com/ding-lab/CharGer/tree/master/PanCanAtlasData
Applying some of these parameters and files should improve your results.
Hope this helps. Please let us know if you have any additional questions.
- Fernanda