Comments (5)
Hi, @quito418,
Thanks for your interest and the great questions!
- The training design of ClairS might differ from that of VarNet or NeuSomatic. ClairS relies on reliable synthetic data for model training. To our knowledge, VarNet was trained on hundreds of real tumor panels to achieve robust performance. NeuSomatic was trained on synthetic data with spike-in mutations as well as real panel data, as described here. Training all three callers on the same dataset would be challenging because of these design differences. Please kindly let us know if you have any findings on this.
- Our comparison with NeuSomatic and VarNet is based on short-read data. We were eager to know how the data synthesis workflow performs on short-read data, since there is currently no reliable long-read somatic variant caller to evaluate against. We hope that ClairS can serve as a complement to other short-read callers, especially in specific regions where they may fall short.
from clairs.
Yes, we used the latest version of NeuSomatic (v0.2.1).
I checked, and it seems their training material is not publicly available at the moment. Those models were trained on SEQC data, which includes the HCC1395/BL dataset that we used for benchmarking. Hence, for a fair comparison, we did not include those models in the benchmark.
from clairs.
Thank you for providing the specification!
from clairs.
Awesome, thank you for the kind response!
from clairs.
Hi,
I've noticed that the latest publicly available version of NeuSomatic appears to be v0.2.1 from 2019. Can you confirm this was used for the comparison with ClairS?
Additionally, regarding the linked paper that discusses the updated results of NeuSomatic trained on various datasets: is it currently not publicly accessible? Or did you use a model from those?
Thank you for your assistance.
from clairs.