
no_map reads seem mappable (readfish issue, closed)

looselab commented on August 22, 2024
no_map reads seem mappable

from readfish.

Comments (6)

mattloose commented on August 22, 2024

Hi,

Could you please provide the TOML file you used to configure the experiment? In addition, can you grep the specific read ID from all the log files and share the output? Finally, are you able to share the fast5 of the read itself?
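If plain grep is awkward on your system, the same search can be sketched in Python. This is only an illustration: the function name `grep_read_id`, the `*.log` file pattern, and the log directory are assumptions, not part of readfish.

```python
from pathlib import Path


def grep_read_id(read_id, log_dir, pattern="*.log"):
    """Collect every log line that mentions read_id.

    Returns a list of (file name, line number, line text) tuples.
    The "*.log" pattern is an assumption; adjust it to your log file names.
    """
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the search going if a log has odd bytes
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if read_id in line:
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits
```

Calling `grep_read_id("<your read id>", "<your log directory>")` returns the matching lines in file order, which can be pasted into the issue.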

Is this on playback or a real run?

If you can't share the fast5 signal of the read, could you please inspect it to see if the start of the read looks unusual in any way?

Thanks,

Matt


sleepyknife commented on August 22, 2024

Hi,

We modified only the targets and max_chunk settings in the TOML file and left the other parameters at their defaults.
Our TOML file is shown below.
image

We also ran ru_validate to verify our settings.
image

The grep output for the read from the logs:
grep_read_log.txt

This experiment was a playback test using the fast5 you provided. We are simulating our custom TOML settings before moving on to a real run.

Thanks.


mattloose commented on August 22, 2024

Hi,

Thanks for this, it is helpful. Is there any chance you can provide us with the fast5 for this specific read?

From your mapping we can tell there is clipping at one end of the read, but we can't determine if this is the start or the end of the read. We suspect it is the start of the read and if so you have likely got abnormal signal at the start (this can happen with playback but also is something that can be trimmed by basecallers normally).
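As an illustration of how the clipped end can be worked out, here is a minimal sketch that assumes the alignment is available as a SAM-style CIGAR string plus its strand. The helper name `soft_clips` is hypothetical and not part of readfish. In SAM, a reverse-strand alignment is stored reverse-complemented, so the leftmost clip in the CIGAR corresponds to the end of the read as it was sequenced.

```python
import re

# One CIGAR operation: a length followed by an operation letter
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar, reverse_strand=False):
    """Return (bases clipped at read start, bases clipped at read end).

    Assumes a SAM-style CIGAR. Hard clips (H) are ignored because they
    can flank the soft clips.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    if reverse_strand:
        # CIGAR is in reference orientation, so swap for reverse-strand hits
        left, right = right, left
    return left, right


print(soft_clips("3000S4000M"))                       # (3000, 0)
print(soft_clips("3000S4000M", reverse_strand=True))  # (0, 3000)
```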

We note that we can basecall the read during the read until steps, and those basecalls make sense: the basecall length increases with each step, as we would expect.

We have seen this from time to time and we don't worry about it. On a real run, these reads would likely be cleared from the sequencer for other reasons.

Crucially, we don't recommend setting max chunks as high as 64. This isn't in the documentation, but I will ask @alexomics to consider adding it. In essence, we believe there is no point rejecting a read after around 2-3kb of sequence, as you run the risk of tangles on the trans side of the pore. So we would advise setting max chunks to a maximum of 16 if you are using 0.4 second chunks. This would equate to about the sequence length we expect. You may end up rejecting a read that you subsequently wanted, but overall this has little negative impact.
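The arithmetic behind that advice can be sketched as follows. The translocation speed of 450 bases per second is an assumption (a commonly quoted figure; it is not stated in this thread), and `approx_bases` is a hypothetical helper, so treat the numbers as rough estimates.

```python
def approx_bases(max_chunks, chunk_seconds=0.4, bases_per_second=450):
    """Rough number of bases sequenced after max_chunks chunks.

    bases_per_second=450 is an assumed typical speed, not a value
    taken from this thread; adjust it for your chemistry.
    """
    return max_chunks * chunk_seconds * bases_per_second


for chunks in (4, 16, 64):
    print(f"{chunks} chunks: about {round(approx_bases(chunks))} bases")
```

Under that assumption, 16 chunks comes to a little under 3kb, in line with the 2-3kb figure above, while 64 chunks lets a read run to well over 10kb before a decision is forced.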

If you can share the fast5, we're happy to look into this further, so I will leave this open for now.

Best

Matt


ythuang0522 commented on August 22, 2024

Hi Matt,

The playback fast5 is the same one in your GitHub doc.
Thanks for the explanation. We understand that the read may be unmapped because of bad signal in the first 3kbp (soft clipped). But we can't understand why it is still unmapped after 7kb. Quite a few long reads were unmapped and thus were not unblocked. We therefore worry that this is due to using fast basecalling instead of HAC (our GPU can't keep up with HAC).

The max chunk parameter was increased for sensitivity because many reads were unmapped in four chunks, and we suspected that the read quality of fast basecalling may be lower. We will follow your suggestions.

Many thanks,
Yao-Ting


mattloose commented on August 22, 2024

Hi Yao-Ting,
I'm after the fast5 for the specific read that was generated from the bulk file, so we can have a look at the signal actually in that read.

The issue about it being unmapped after 7kb is surprising - it may be due to the normalisation of the data.

We see little difference in basecalling with fast vs HAC on the results of a read until experiment.

You will see lots of reads being unmapped in four chunks. You see far more reads via read until than you do at the end of a run, as it tests everything that might be read-like. Also, when running playback you get misled, because rejecting a read doesn't actually change the signal source (you don't swap to a new read), so it behaves differently to a real run.

Hope this helps.

Matt


ythuang0522 commented on August 22, 2024

Thanks Matt. That's very helpful. We will move on to a real run and see if everything goes well.
Yao-Ting
