Comments (4)
An interesting use case. In theory, there should be an algorithm that can generate a phased assembly with just one parent (at lower accuracy than trio binning). However, hifiasm doesn't support this use case yet, and implementing it will take time. We will keep this in mind. Thanks.
from hifiasm.
Yes, adding to Heng's comment: I think the WHdenovo algorithm (https://github.com/shilpagarg/WHdenovo) can, in principle, generate phased assemblies for any pedigree variant. We designed this tool for Illumina and long-read data. In our experiments, the method works well for genomes up to 500 Mb with 10-20% repeat content. I am happy to work with you on your use case. Let me know.
The genome we are assembling has 2% heterozygosity, and the k-mer count histogram shows a clear bimodal distribution. To produce the paternal yak file for hifiasm, we would need to perform three operations on the HiFi and maternal yak files: first, split the HiFi k-mers into homozygous and heterozygous sets using minimum and maximum coverage thresholds (a kind of yak split); second, find the heterozygous k-mers that are not in the maternal file (a kind of yak diff); and last, join the homozygous and paternal heterozygous k-mers (a kind of yak merge).
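The three operations described above can be sketched on plain Python sets. This is only a toy illustration under stated assumptions: the function names, the in-memory dict of k-mer counts, and the coverage thresholds are all hypothetical, and it neither reads nor writes the yak format.

```python
def split_by_coverage(counts, het_range, hom_range):
    """'split': separate k-mers into heterozygous and homozygous sets by count.

    het_range and hom_range are inclusive (min, max) coverage thresholds,
    assumed to be chosen by hand from the bimodal k-mer histogram.
    """
    het = {k for k, c in counts.items() if het_range[0] <= c <= het_range[1]}
    hom = {k for k, c in counts.items() if hom_range[0] <= c <= hom_range[1]}
    return het, hom


def paternal_kmers(child_counts, maternal_kmers, het_range, hom_range):
    """Build a candidate paternal k-mer set from child HiFi counts and maternal k-mers."""
    het, hom = split_by_coverage(child_counts, het_range, hom_range)
    # 'diff': heterozygous k-mers that are absent from the maternal set
    paternal_het = het - maternal_kmers
    # 'merge': homozygous k-mers plus paternal heterozygous k-mers
    return hom | paternal_het


# Toy data: hypothetical 3-mers with made-up counts
child = {"AAA": 30, "AAC": 15, "AAG": 14, "AAT": 2}
maternal = {"AAA", "AAC"}
result = paternal_kmers(child, maternal, het_range=(10, 20), hom_range=(21, 40))
print(sorted(result))
```

In this toy input, the k-mer with count 2 falls below both ranges and is dropped as a likely sequencing error. Real data would also need canonical k-mers and a memory-efficient counter, which this sketch ignores.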
The yak code (https://github.com/lh3/yak) does not seem to have these functions.
Another solution would be to do this with another tool, for example kmc, and then convert the resulting file to yak format.
Is there a script to convert a k-mer count file to yak format?
I guess the second solution might be better. If you want to do it manually, I recommend partitioning the corrected reads yourself and then feeding the partitioned lists to hifiasm with the options '-3' and '-4'.
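A minimal sketch of what manual partitioning could look like, assuming you already have two sets of haplotype-specific marker k-mers. The function names, the simple majority-vote rule, and the toy reads are all hypothetical; this is not hifiasm's own binning logic, and reverse complements are ignored for brevity.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence (forward strand only)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def partition_reads(reads, paternal_markers, maternal_markers, k):
    """Assign each read to a parent by a majority vote of marker k-mers.

    reads maps read name -> sequence. Reads with a tie (including reads
    with no markers at all) are left unassigned.
    """
    paternal_names, maternal_names = [], []
    for name, seq in reads.items():
        read_kmers = kmers(seq, k)
        p = len(read_kmers & paternal_markers)
        m = len(read_kmers & maternal_markers)
        if p > m:
            paternal_names.append(name)
        elif m > p:
            maternal_names.append(name)
    return paternal_names, maternal_names


# Toy data: hypothetical reads and 3-mer markers
reads = {"r1": "ACGTA", "r2": "TTGCA", "r3": "GGGGG"}
pat, mat = partition_reads(reads, {"CGT"}, {"TGC"}, k=3)
print(pat, mat)
```

Each resulting list of read names could then be written to a text file, one name per line, and passed to hifiasm through '-3' and '-4' as suggested above.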