Comments (6)
It looks like the "AF" (allele frequency, e.g. 0.5 or 1) INFO field is not emitted by VarScan, so I'll have to modify this part of CNVkit to handle it.
Could you post a couple of lines from your VarScan VCF file here? I don't need the whole thing, just a few lines and ideally the header too, if you can.
from cnvkit.
Thanks, here it is:
##fileformat=VCFv4.1
##source=VarScan2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth of quality bases">
##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Indicates if record is a somatic mutation">
##INFO=<ID=SS,Number=1,Type=String,Description="Somatic status of variant (0=Reference,1=Germline,2=Somatic,3=LOH, or 5=Unknown)">
##INFO=<ID=SSC,Number=1,Type=String,Description="Somatic score in Phred scale (0-255) derived from somatic p-value">
##INFO=<ID=GPV,Number=1,Type=Float,Description="Fisher's Exact Test P-value of tumor+normal versus no variant for Germline calls">
##INFO=<ID=SPV,Number=1,Type=Float,Description="Fisher's Exact Test P-value of tumor versus normal for Somatic/LOH calls">
##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand">
##FILTER=<ID=indelError,Description="Likely artifact due to indel reads at this position">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Depth of reference-supporting bases (reads1)">
##FORMAT=<ID=AD,Number=1,Type=Integer,Description="Depth of variant-supporting bases (reads2)">
##FORMAT=<ID=FREQ,Number=1,Type=String,Description="Variant allele frequency">
##FORMAT=<ID=DP4,Number=1,Type=String,Description="Strand read counts: ref/fwd, ref/rev, var/fwd, var/rev">
CHROM,FROM,REF,ALT,-,-,-,FILTER,-
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR
1 69511 . A G . PASS DP=50;SS=1;SSC=0;GPV=9.9117E-30;SPV=1E0 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:20:0:20:100%:0,0,15,5 1/1:.:30:0:30:100%:0,0,23,7
1 139213 . A G . PASS DP=19;SS=1;SSC=5;GPV=5.9414E-9;SPV=2.6316E-1 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:9:0:9:100%:0,0,3,6 1/1:.:10:2:8:80%:1,1,1,7
1 762273 . G A . PASS DP=109;SS=1;SSC=0;GPV=4.3979E-65;SPV=1E0 GT:GQ:DP:RD:AD:FREQ:DP4 1/1:.:54:0:54:100%:0,0,8,46 1/1:.:55:0:55:100%:0,0,9,46
I've just merged a pull request (#11) that should fix this. Can you try it now?
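The pull request itself is not reproduced in this thread. As a rough illustration of the underlying idea only: in VarScan output like the lines above, the allele frequency sits in the per-sample FREQ field (or can be derived from RD and AD) rather than in an INFO "AF" tag. The sketch below is plain Python, not CNVkit's actual code, and the function name `sample_allele_freq` is hypothetical.

```python
def sample_allele_freq(format_field, sample_field):
    """Return the variant allele frequency (0.0-1.0) for one sample column.

    Hypothetical helper, not part of CNVkit. Expects the VCF FORMAT string
    and one sample column, both colon-delimited as in VarScan output.
    """
    values = dict(zip(format_field.split(":"), sample_field.split(":")))
    freq = values.get("FREQ", ".")
    if freq.endswith("%"):
        # VarScan writes FREQ as a percentage string, e.g. "80%"
        return float(freq[:-1]) / 100.0
    # Fall back to read counts: AD / (RD + AD)
    try:
        ref_depth = int(values["RD"])
        alt_depth = int(values["AD"])
    except (KeyError, ValueError):
        return None
    total = ref_depth + alt_depth
    return alt_depth / total if total else None


# TUMOR column of the second record posted above
fmt = "GT:GQ:DP:RD:AD:FREQ:DP4"
tumor = "1/1:.:10:2:8:80%:1,1,1,7"
print(sample_allele_freq(fmt, tumor))
```

If FREQ is missing or ".", the helper falls back to the RD and AD counts, and it returns None when neither source is usable.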
Thanks, it is working now. However, I am getting different chromosomes with a significant LOH shift when I use "scatter" or "loh" with the same VCF file. Shouldn't the result be the same? From the plot I think "scatter" is wrong, as I cannot see why it marks the chromosome as significant (all VAFs < 0.8).
I agree with your assessment of it. I'll take a closer look.
The `loh` command had an option `--min-depth` which filtered out SNVs with depth < 20, while `scatter` was missing this option and did no filtering. Fixed in: bbe9af2
The deeper problem was that the test statistic I used was garbage, so I disabled it here: b339ca1
I will at some point implement a better test for LOH and then re-enable reporting and colorization in both plots.
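To illustrate why the depth filter alone can change which SNVs are plotted, here is a minimal sketch in plain Python. It is not the code from the commits above: the record layout (dicts with `pos` and `depth` keys) and the function name `filter_by_depth` are assumptions for illustration, and the default threshold of 20 is taken from the description in this comment.

```python
def filter_by_depth(records, min_depth=20):
    """Keep only SNV records whose read depth is at least min_depth.

    Hypothetical helper, not part of CNVkit. Each record is assumed to be
    a dict with a "depth" key.
    """
    return [rec for rec in records if rec["depth"] >= min_depth]


# Tumor depths (FORMAT DP) from the three records posted earlier in the thread
snvs = [
    {"pos": 69511, "depth": 30},
    {"pos": 139213, "depth": 10},
    {"pos": 762273, "depth": 55},
]

kept = filter_by_depth(snvs)
print([rec["pos"] for rec in kept])
```

With the filter applied, the SNV at position 139213 (depth 10) is dropped, so a plot built with the filter and one built without it see different sets of points.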