
Comments (7)

zhenzhenyang-psu commented on May 30, 2024

I compared the N50 of the FASTA sequences from the following three files:
comb.hifiasm.p_ctg.fa
haplotype_binning_-3-4out.hap1.p_ctg.fa
haplotype_binning_-3-4out.hap2.p_ctg.fa

The N50 values are nearly identical:
combined primary contig N50:
"totalContigLength": "1419992",
"numberOfContigs": "3",
"contigN50": "492778",
"longestContig": "657757",
"totalScaffoldLength": "1419992",
"numberOfScaffolds": "3",
"scaffoldN50": "492778",
"longestScaffold": "657757",

haplotype1 primary contig N50:
"totalContigLength": "1419063",
"numberOfContigs": "3",
"contigN50": "491528",
"longestContig": "658078",
"totalScaffoldLength": "1419063",
"numberOfScaffolds": "3",
"scaffoldN50": "491528",
"longestScaffold": "658078",

haplotype2 primary contig N50:
"totalContigLength": "1416418",
"numberOfContigs": "3",
"contigN50": "492613",
"longestContig": "655973",
"totalScaffoldLength": "1416418",
"numberOfScaffolds": "3",
"scaffoldN50": "492613",
"longestScaffold": "655973",
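
As a cross-check on the figures above, here is a minimal sketch of how a contig N50 is computed. The function name `n50` is illustrative, and the third contig length is not reported directly: it is inferred by assuming the reported total is the sum of the three contig lengths.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


# Combined primary assembly: longest contig and N50 are taken from the report;
# the third length is inferred as total - longest - N50 (an assumption).
total, longest, reported_n50 = 1419992, 657757, 492778
combined = [longest, reported_n50, total - longest - reported_n50]
print(n50(combined))  # 492778
```

With only three contigs per assembly, the N50 is simply the second-longest contig, so it says little about how well the haplotypes were separated.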

The three assemblies are nearly the same. I guess this means that the haplotype binning information I provided does not improve on what hifiasm can do on its own.

Presumably, binning would help more if the haplotype information I provide spanned a much longer haplotype block than HiFi reads alone can phase.

Anyway, sorry to bother you.

I was also wondering how the primary unitigs (p_utg.gfa) and the alternate unitigs (r_utg.gfa) differ from the primary contigs.
Is there a way to visualize the .gfa files for these different outputs, for illustration?

Finally, how can I use the overlap information
(comb.hifiasm.ovlp.source.bin/comb.hifiasm.ovlp.reverse.bin) to identify any HiFi reads that overlap with the HiFi reads of interest (among the input HiFi reads)?
Thanks a lot,
zhenzhen

from hifiasm.

chhylp123 commented on May 30, 2024

The assemblies with and without -3 and -4 are fundamentally different:

  1. If you run hifiasm with default settings, it outputs a primary assembly that includes switches between the two haplotypes. So the primary assembly is not fully phased; it is a mixture of the two haplotypes.
  2. If you run hifiasm with -3 and -4, it tries to output two fully phased assemblies, one for each haplotype (see Figure 1 and Figure 2 in https://arxiv.org/pdf/2008.01237.pdf).

If the N50s of hap1, hap2 and the primary assembly are similar, I guess the reason is that your sample is too simple. Try a larger, more complex sample.

As for the overlaps, hifiasm can output all overlaps in PAF format with the option --write-paf.
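
Once the overlaps are available as PAF, finding every read that overlaps a read of interest is a matter of scanning two columns. The sketch below relies only on the standard PAF layout (column 1 is the query name, column 6 is the target name); the function name and the read names are hypothetical, and the name of the PAF file hifiasm writes is not assumed here.

```python
from collections import defaultdict


def overlapping_reads(paf_lines, reads_of_interest):
    """Map each read of interest to the set of reads it overlaps.

    paf_lines: iterable of PAF text lines (for example, an open file).
    reads_of_interest: iterable of read names to look up.
    """
    wanted = set(reads_of_interest)
    partners = defaultdict(set)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:  # PAF has 12 mandatory columns
            continue
        query, target = fields[0], fields[5]
        if query == target:  # ignore self-overlaps
            continue
        if query in wanted:
            partners[query].add(target)
        if target in wanted:
            partners[target].add(query)
    return dict(partners)


# Toy records with made-up read names, only to show the call.
toy = [
    "readA\t1000\t0\t800\t+\treadB\t1200\t400\t1200\t700\t800\t60",
    "readC\t900\t100\t900\t-\treadA\t1000\t0\t800\t650\t800\t60",
]
print(overlapping_reads(toy, ["readA"]))
```

For a real run, pass an open file handle instead of the toy list. The records are streamed line by line, so a large overlap file does not need to fit in memory.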


zhenzhenyang-psu commented on May 30, 2024

Thanks, Haoyu!
By large and complex samples, do you mean an input file containing HiFi reads that cover a larger region of the genome?


lh3 commented on May 30, 2024

Options -1/-2 and -3/-4 are intended for whole-genome phasing.


zhenzhenyang-psu commented on May 30, 2024

I see. Thanks!
A follow-up on -3/-4: suppose the ability to designate reads from the two haplotypes is limited, i.e. only a portion of the reads are marked as haplotype-specific and provided via -3/-4. How does hifiasm handle the remaining reads? Does it treat all of them as homozygous and add them to each of the two haplotypes?


chhylp123 commented on May 30, 2024

No, in this case, it can still check which read is heterozygous, and then assign heterozygous reads to one haplotype randomly.


zhenzhenyang-psu commented on May 30, 2024

No, in this case, it can still check which read is heterozygous, and then assign heterozygous reads to one haplotype randomly.

That's wonderful! Thank you!
