
Comments (15)

github-actions avatar github-actions commented on August 22, 2024

Thank you for your issue. Give us a little time to review it.

PS. You might want to check the FAQ if you haven't done so already.

This is an automated reply, generated by FAQtory

from readfish.

mattloose avatar mattloose commented on August 22, 2024

When you run readfish on playback with a barcoded sample you can get very unexpected results. What you are seeing here is a consequence of how playback works. If you imagine a single channel on the sequencer, playback provides the signal that passed through the channel exactly as it happened on the original run. If you send a message to the sequencer to throw away the read, the sequencer will break the playback read at that point and then start a new read from the signal that is still coming from the playback file. That is not a true read, so it will not begin with a barcode - you are starting halfway through a read, and it is correctly reported as unclassified (if that makes sense!).

I hope that helps.


lborcard avatar lborcard commented on August 22, 2024

Thank you for the answer, I was wondering how it worked. So in general a simulated run will underestimate the number of true hits? Do you usually run several simulated runs to get an idea of the performance of AS?


lborcard avatar lborcard commented on August 22, 2024

Do you usually get a better idea of the performance by not testing with barcoded experiments?


mattloose avatar mattloose commented on August 22, 2024

If you wish to simulate barcoded runs (or non-barcoded runs) you can use our Icarust tool:

https://academic.oup.com/bioinformatics/article/40/4/btae141/7628125

Which approach is better depends on what you are trying to test with respect to performance.


lborcard avatar lborcard commented on August 22, 2024

The idea for us is basically to compare a WGS run to adaptive sampling. So we perform a WGS run and then run a simulated AS run for the same amount of time using the bulk file.


mattloose avatar mattloose commented on August 22, 2024

OK - assume you are targeting a 1 Mb region of a 10 Mb genome. When you run your normal WGS (and record a bulk file) you obtain 100x coverage of the 10 Mb genome (and therefore 100x of your 1 Mb region). When you run adaptive sampling on the playback of that bulk file you will still end up with approximately 100x coverage of your 10 Mb genome. The reason for this is that in playback you don't actually throw away the read - you merely break a read that you don't want into smaller bits. So to see the effect of adaptive sampling on your run in playback, you would need to compare the read lengths on your 1 Mb target region with those on the 9 Mb of off-target sequence. You should see long reads "on-target" and short reads "off-target".
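As a rough illustration of that point, here is a toy Python sketch (not readfish or MinKNOW code). It assumes every off-target read is unblocked after a fixed number of bases and ignores the small amount of signal lost at each break. The function name `playback_reads`, the `chunk` parameter and the read lengths are all made up for the example.

```python
# Toy model of playback: an unblock splits a read, it does not remove it.
# All names and numbers here are illustrative assumptions, not measurements.

def playback_reads(read_length, on_target, chunk=1_000):
    """Return the read lengths seen in playback for one original read.

    On-target reads are left intact. Off-target reads are broken into
    pieces of at most `chunk` bases, one piece per unblock.
    """
    if on_target:
        return [read_length]
    full, rest = divmod(read_length, chunk)
    return [chunk] * full + ([rest] if rest else [])


# Hypothetical original WGS reads: (length in bases, overlaps the target?)
original = [(10_000, True), (10_000, False), (8_500, False), (12_000, True)]

total_before = sum(length for length, _ in original)

on, off = [], []
for length, on_target in original:
    (on if on_target else off).extend(playback_reads(length, on_target))

total_after = sum(on) + sum(off)

# Yield (and so coverage) is unchanged; only the length distribution differs.
print("bases before:", total_before, "bases after:", total_after)
print("shortest on-target read:", min(on))
print("longest off-target read:", max(off))
```

In this model the total number of bases is identical before and after, which is why playback shows no enrichment, while the off-target reads never exceed the unblock length.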

In contrast, Icarust will simulate reads being removed, but does so from simulated data. With Icarust you will see actual enrichment - but it is more theoretical.


lborcard avatar lborcard commented on August 22, 2024

I was also wondering how the playback works when your number of "on target" reads is limited (because of WGS). What happens when there are no more reads corresponding to your reference (say your 1 Mb region)?

I saw the release of Icarust. I was under the impression that it was not yet finished, but I will try it eventually. My initial thought was that playback runs were a more realistic approach to testing, but apparently I was wrong.
Thanks for the explanation, it helps a lot!


mattloose avatar mattloose commented on August 22, 2024

I wouldn't say you are wrong... it's just that playback doesn't actually remove the molecule from the sequencer and so it's really easy to misinterpret the results. We use playback to look at the relative lengths of the molecules we get on and off target. They should be as short as possible off target and as long as they originally were on target. You can also check mapping efficiency using playback. You just won't see any actual enrichment!
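One way to make that length comparison concrete is sketched below in Python. It assumes you already have, for each read, its length and an on/off-target flag (for example from aligning the reads and intersecting with your target regions, which is not shown here). The function name `summarise_lengths` and the example reads are hypothetical.

```python
from statistics import median


def summarise_lengths(reads):
    """Summarise read lengths on and off target.

    `reads` is an iterable of (length, on_target) pairs. How `on_target`
    is decided (alignment plus target regions) is assumed to happen upstream.
    """
    on = [length for length, on_target in reads if on_target]
    off = [length for length, on_target in reads if not on_target]
    return {
        "on_target_reads": len(on),
        "off_target_reads": len(off),
        # median() raises on an empty list, so guard against that case.
        "on_target_median": median(on) if on else None,
        "off_target_median": median(off) if off else None,
    }


# Made-up playback reads, for illustration only.
reads = [(9_800, True), (12_100, True), (1_000, False), (950, False), (1_100, False)]
summary = summarise_lengths(reads)
print(summary)
```

If adaptive sampling is behaving in playback, the off-target median should sit close to the length at which reads are unblocked and the on-target median should look like the original run.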

Hope all these comments are useful :-)


Adoni5 avatar Adoni5 commented on August 22, 2024

image

If you look at the yield columns for playback, you can see what Matt is talking about. You can't ever enrich in playback as you can't create new reads, only chunks of the original reads.


lborcard avatar lborcard commented on August 22, 2024

OK, gotcha! Thanks for the plots, very informative. How does MinKNOW pick the next read to come in when using playback? Does the initial "timeline" hold, or is it randomly picked from the bag of molecules that went through this channel during the original run?


mattloose avatar mattloose commented on August 22, 2024

The initial timeline holds - so reads play out exactly as they were recorded. All that happens is the signal from a read gets broken up into smaller chunks.


lborcard avatar lborcard commented on August 22, 2024

So what happens to reads that are broken down? Will the next read come in at the same time as it was originally sequenced? Does the pore just wait?


mattloose avatar mattloose commented on August 22, 2024

No. Imagine you have a read of 10 kb. The sequencer plays back the first 1,000 bases, but you then "unblock", so the sequencer ends the read at 1 kb. But the signal from the original 10 kb read keeps playing, so you will get a new read starting at 1 kb (plus a small bit) into the old read. And so on - until the original read finishes - and then it goes on to the next read. So, your original 10 kb read could be chopped up into 10 fragments of 1 kb each.

Does that make sense?

That is why you end up with reads without a barcode when you run adaptive sampling on a playback run.
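The 10 kb example can be written out as a small Python sketch. This is only a model of the behaviour described above, not how MinKNOW implements playback. It assumes every fragment is unblocked after exactly `unblock_after` bases, ignores the "small bit" of signal lost at each break, and assumes the barcode is only detectable at the start of the original molecule. The name `chop_read` is made up for the example.

```python
def chop_read(length, has_barcode, unblock_after=1_000):
    """Split one recorded read into the fragments playback would emit.

    Returns a list of (start, end, starts_with_barcode) tuples. Only the
    fragment that begins at position 0 can carry the barcode, because the
    later fragments start partway through the original molecule.
    """
    fragments = []
    start = 0
    while start < length:
        end = min(start + unblock_after, length)
        fragments.append((start, end, has_barcode and start == 0))
        start = end
    return fragments


fragments = chop_read(10_000, has_barcode=True)
barcoded = sum(1 for _, _, starts_with_barcode in fragments if starts_with_barcode)
unclassified = len(fragments) - barcoded

print("fragments:", len(fragments))      # 10
print("with barcode:", barcoded)         # 1
print("unclassified:", unclassified)     # 9
```

So a single barcoded, off-target 10 kb read turns into one barcoded fragment and nine fragments that a demultiplexer would report as unclassified.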


lborcard avatar lborcard commented on August 22, 2024

I see what you mean, it is much clearer now. So while the signal of the unblocked read is still playing, another read can start playing in the same "pore", thus chopping it up?

