Giter Site home page Giter Site logo

Comments (17)

bistace avatar bistace commented on June 25, 2024

Hello,
the polished assembly is in hapog_results/hapog.fasta.

from hapo-g.

bistace avatar bistace commented on June 25, 2024

Do some logs in logs/hapog_{chunk}.e contain errors?

Also, did you polish the assembly with long reads? If so, it is still very experimental and I don't recommend doing it with Hapo-G at the moment.

from hapo-g.

zmz1988 avatar zmz1988 commented on June 25, 2024

Hi, sorry, I realised that I got one sample finished. and the one can't be finished has this screen output

Checking dependencies...
        Found BWA.
        Found Samtools.

Indexing the BAM file...
Done in 13 seconds

Fragmenting the genome into 30 chunks of 4,240,170 bases (depending of scaffold sizes)
Done in 1 seconds

Extracting bam for each chunk
Done in 30 seconds

Launching Hapo-G on each chunk
ERROR: Hapo-G didn't finish successfully, exit code: -11
Faulty command: /miniconda3/envs/hapog/bin/build/hapog -b chunks_bam/chunks_3.bam -f chunks/chunks_3.fasta -o hapog_chunks/chunks_3.fasta -c hapog_chunks/chunks_3.changes

I had checked logs/hapog_chunks_3.e, and it's empty. What do you think that could cause this? I had this error -11 several times already.

from hapo-g.

bistace avatar bistace commented on June 25, 2024

Could you launch the faulty command independently to precisely see the error? I mean this one:

/miniconda3/envs/hapog/bin/build/hapog -b chunks_bam/chunks_3.bam -f chunks/chunks_3.fasta -o hapog_chunks/chunks_3.fasta -c hapog_chunks/chunks_3.changes

from hapo-g.

zmz1988 avatar zmz1988 commented on June 25, 2024

hmm, it seems strange... I run the faulty command mentioned above, and then I think it finished successfully... So I guess it's somehow random? Will it be a problem if I run several hapog command at the same time in our cluster? There should be no memory problem I think because we have a lot of nodes and a lot of memory per node

(hapog) [ucbtzz1@login15 polish2]$ /miniconda3/envs/hapog/bin/build/hapog -b chunks_bam/chunks_3.bam -f chunks/chunks_3.fasta -o hapog_chunks/chunks_3.fasta -c hapog_chunks/chunks_3.changes
Reference sequence : contig_110_polished
|**************************************************| 100%
Reference sequence : contig_111_polished
|**************************************************| 100%
Reference sequence : contig_112_polished
|**************************************************| 100%
Reference sequence : contig_114_polished
|**************************************************| 100%
Reference sequence : contig_117_polished
|**************************************************| 100%
Reference sequence : contig_119_polished
Segmentation fault

from hapo-g.

bistace avatar bistace commented on June 25, 2024

It did not finish successfully, you can see the Segmentation fault at the end of the log. Would it be possible for you to send us the fasta file and the bam file that are used in the faulty command to investigate the issue?

from hapo-g.

zmz1988 avatar zmz1988 commented on June 25, 2024

Hello, sorry for the late reply! Sure. How can I send it to you?

from hapo-g.

bistace avatar bistace commented on June 25, 2024

I'm sorry, I was sure I replied to your message. Could you please shoot an email to [email protected]? I will send you a link to securely upload your data with Renater Filesender.

from hapo-g.

wjq1981 avatar wjq1981 commented on June 25, 2024

I'm sorry, I was sure I replied to your message. Could you please shoot an email to [email protected]? I will send you a link to securely upload your data with Renater Filesender.

Hi, I have the same problem, how can I solve it? Thank you very much!

Checking dependencies...
Found BWA.
Found Samtools.

Generating bwa index...
Done in 897 seconds

Launching mapping on genome...
Done in 20070 seconds

Indexing the BAM file...
Done in 691 seconds

Fragmenting the genome into 23 chunks of 50,191,572 bases (depending of scaffold sizes)
Done in 8 seconds

Extracting bam for each chunk
Done in 388 seconds

Launching Hapo-G on each chunk
ERROR: Hapo-G didn't finish successfully, exit code: -9
Faulty command: /HAPO-G/build/hapog -b chunks_bam/chunks_5.bam -f chunks/chunks_5.fasta -o hapog_chunks/chunks_5.fasta -c hapog_chunks/chunks_5.changes

from hapo-g.

bistace avatar bistace commented on June 25, 2024

HI @wjq1981,
could you please open up a new issue with all log files?

from hapo-g.

bistace avatar bistace commented on June 25, 2024

Hello,
we've pushed a few changes on Conda and on Github that specifically solve some problems encountered when polishing with long reads. Could you please check it out?

from hapo-g.

zmz1988 avatar zmz1988 commented on June 25, 2024

Hello, sorry for the very late reply! I was occupied by other projects for a while. Now I tried the new version with combined HiFi and Illumina bam file, and it got stuck on "launching Hapo-G on each chunk" for more than 12 hours already... the bam file is 9G...

Here are my codes

minimap2 -t 20 -ax map-hifi $assembly.fasta ${HiFi.reads} > ${name}.hifi.sam
samtools view -Sb ${name}.hifi.sam | samtools sort -@ 20 > ${name}.hifi.bam
bwa index $assembly.fasta
bwa mem -t 20 -o ${name}.illumina.sam ${assembly.fasta} ${Illumina_seq1} ${Illumina_seq2}
samtools view -Sb ${name}.illumina.sam | samtools sort -@ 20 > ${name}.illumina.bam
samtools merge -@ 20 ${name}.merged.bam ${name}.hifi.bam ${name}.illumina.bam
samtools sort -@ 20 ${name}.merged.bam > ${name}.merged.sorted.bam
samtools index ${name}.merged.sorted.bam
hapog.py --genome ${assembly.fasta} -b ${name}.merged.sorted.bam -o ${name}_hapog -t 30 -u

and the output for hapog.py

$ hapog.py --genome assembly.fasta -b name.merged.sorted.bam -o name_hapog -t 30 -u

Checking dependencies...
	Found BWA.
	Found Samtools.

Indexing the BAM file...
Done in 157 seconds

Fragmenting the genome into 30 chunks of 5,073,883 bases (depending of scaffold sizes)
Done in 2 seconds

Extracting bam for each chunk
Done in 253 seconds

Launching Hapo-G on each chunk

Is it normal for long-reads data to take this long time?

from hapo-g.

bistace avatar bistace commented on June 25, 2024

Hello,

it should be pretty fast (i.e less than a day) on a small bam file. I would recommend that you wait for it to finish but just to be sure, do you see hapog processes on the machine when you execute top? Also, I would recommend running minimap2 with the --secondary=no options as some lines in the bam won't be formatted in the same way than with BWA and could cause some crashes.

from hapo-g.

bistace avatar bistace commented on June 25, 2024

Just to add on that, we never tested to give hapog short reads and long reads as a single input so there may be some side effects that we have not encountered yet.

from hapo-g.

zmz1988 avatar zmz1988 commented on June 25, 2024

Thanks!! Yes, it's running. Before I was not sure whether to add --secondary=no in minimap2, and now it's clear. Thanks for mentioning that! I did Illumina reads successfully before with the older version of hapog, and it was very fast. that's why I asked this time when it took more than 12 hours. I will update later whether it's finished or not. Thanks!

from hapo-g.

zmz1988 avatar zmz1988 commented on June 25, 2024

Hello! I'm happy to confirm that hapog works well with the merged bam file from hifi long-read and illumina short-read! Thanks a lot for the easy-to-use tool! btw, when we have both hifi and illumina, could you recommend how to use both in polishing? combined or separately?

from hapo-g.

bistace avatar bistace commented on June 25, 2024

I am glad it worked well!

I would say that separately would be better as there is a coverage cutoff set to 50x to limit the number of calculations we have to do. That means that for each position of the assembly, we only look at the first 50 alignments over that position. This could cause problems if the long-reads are after the short reads in the bam file because that could lead to long-reads not being used for correcting the assembly.

As for the order in which to use the reads, we did not look into that but intuitively, I would say that it may be better to polish first with short reads and then with long reads as repeats may not all be well corrected with only short reads.

from hapo-g.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.