Dear Readfish team, Thank you for your work. I am using readfish to

High count of no-barcode that map to my targets (readfish issue, CLOSED)

lborcard commented on August 22, 2024

from readfish.

Comments (15)

github-actions commented on August 22, 2024

Thank you for your issue. Give us a little time to review it.

PS. You might want to check the FAQ if you haven't done so already.

This is an automated reply, generated by FAQtory

mattloose commented on August 22, 2024

When you run readfish on playback with a barcoded sample you can get very unexpected results. What you are seeing here is a consequence of how playback works. If you imagine a single channel on the sequencer, playback provides the signal that passed through the channel exactly as it happened on the original run. If you send a message to the sequencer to throw away the read, the sequencer will break the playback read at that point and then start a new read from the signal that is still coming from the playback file. That is not a true read, so it will not begin with a barcode - you are starting halfway through a read, and it is correctly reported as unclassified (if that makes sense!).

I hope that helps.

lborcard commented on August 22, 2024

Thank you for the answer, I was wondering how it worked. So in general a simulated run will underestimate the number of true hits? Do you usually run several simulated runs to get an idea of the performance of AS?

lborcard commented on August 22, 2024

Do you usually get a better idea of the performance by not testing with barcoded experiments?

mattloose commented on August 22, 2024

If you wish to simulate barcoded runs (or non-barcoded runs) you can use our Icarust tool:

https://academic.oup.com/bioinformatics/article/40/4/btae141/7628125

Which approach is better all depends on what you are trying to test with respect to performance.

lborcard commented on August 22, 2024

The idea for us is basically to compare a WGS run to adaptive sampling. So we perform a WGS run and then run a simulated AS run for the same amount of time using the bulk file.

mattloose commented on August 22, 2024

OK - assume you are targeting a 1 Mb region of a 10 Mb genome. When you run your normal WGS (and record a bulk file) you obtain 100x coverage of the 10 Mb genome (and therefore 100x of your 1 Mb region). When you run adaptive sampling on the playback of that bulk file you will still end up with approximately 100x coverage of your 10 Mb genome. The reason for this is that in playback you don't actually throw away the read - you merely break a read that you don't want into smaller bits. So to see the effect of adaptive sampling on your run on playback you would need to look at the read lengths on your 1 Mb target region vs your 9 Mb of off-target. You should see long reads "on-target" and short reads "off-target".

In contrast, Icarust will simulate reads being removed, but does so from simulated data. With Icarust you will see actual enrichment - but it is more theoretical.
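The playback arithmetic above can be sketched in a few lines. This is only an illustration under stated assumptions: coverage is taken to be perfectly uniform, and the 10 kb original read length and 1 kb unblocked chunk length are hypothetical numbers chosen for the example, not values from readfish or MinKNOW.

```python
# Toy arithmetic for the 1 Mb target / 10 Mb genome example.
# Assumptions (hypothetical): uniform coverage, 10 kb original reads,
# off-target reads broken into 1 kb chunks by unblocking in playback.

GENOME = 10_000_000        # 10 Mb genome
TARGET = 1_000_000         # 1 Mb target region
WGS_COVERAGE = 100         # coverage of the original WGS run

def coverage(total_bases, region_size):
    """Mean depth of a region given the bases that map to it."""
    return total_bases / region_size

wgs_bases = WGS_COVERAGE * GENOME
on_target_bases = wgs_bases * TARGET / GENOME
off_target_bases = wgs_bases - on_target_bases

# Playback splits unwanted reads but never removes their signal,
# so the bases per region (and hence the coverage) do not change.
playback_on_cov = coverage(on_target_bases, TARGET)
playback_off_cov = coverage(off_target_bases, GENOME - TARGET)

# What does change is how many pieces the off-target bases arrive in.
ORIGINAL_READ_LEN = 10_000   # hypothetical
CHUNK_LEN = 1_000            # hypothetical
off_target_reads_wgs = off_target_bases / ORIGINAL_READ_LEN
off_target_reads_playback = off_target_bases / CHUNK_LEN

print(playback_on_cov, playback_off_cov)
print(off_target_reads_wgs, off_target_reads_playback)
```

Under these assumptions the coverage is identical on and off target, while the off-target read count rises and the off-target read length falls, which is why read length rather than coverage is the thing to compare in playback.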

lborcard commented on August 22, 2024

I was also wondering how playback works when the number of "on target" reads is limited (because of WGS). What happens when there are no more reads corresponding to your ref (say your 1 Mb region)?

I saw the release of Icarust. I was under the impression that it was not yet finished, but I will try it eventually. My initial thought was that playback runs were a more realistic approach to testing, but apparently I was wrong.
Thanks for the explanation, it helps a lot!

mattloose commented on August 22, 2024

I wouldn't say you are wrong... it's just that playback doesn't actually remove the molecule from the sequencer, so it's really easy to misinterpret the results. We use playback to look at the relative lengths of the molecules we get on and off target. They should be as short as possible off target and as long as they originally were on target. You can also check mapping efficiency using playback. You just won't see any actual enrichment!

Hope all these comments are useful :-)
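As a rough sketch of the on-target vs off-target length comparison described above: the function below summarises read count, yield, and median length per group. The `summarise` helper, the `(length, on_target)` record format, and the toy values are all hypothetical and made up for illustration; they are not readfish output or real data.

```python
from statistics import median

def summarise(reads):
    """Summarise reads given as (length_in_bases, on_target) tuples."""
    summary = {}
    for label, flag in (("on_target", True), ("off_target", False)):
        lengths = [length for length, on_target in reads if on_target == flag]
        summary[label] = {
            "n_reads": len(lengths),
            "yield_bases": sum(lengths),
            "median_length": median(lengths) if lengths else 0,
        }
    return summary

# Made-up toy records: long reads on target, short chunks off target.
toy_reads = [
    (9_800, True), (10_200, True),
    (1_000, False), (950, False), (1_100, False), (1_050, False),
]

summary = summarise(toy_reads)
for label, stats in summary.items():
    print(label, stats)
```

In a playback run that is behaving as expected, the on-target median should stay close to the original read length while the off-target median should be much shorter.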

Adoni5 commented on August 22, 2024

If you look at the yield columns for playback, you can see what Matt is talking about. You can't ever enrich in playback as you can't create new reads, only chunks of the original reads.

lborcard commented on August 22, 2024

OK, gotcha! Thanks for the plots, very informative. How does MinKNOW pick the next read to come in when using playback: does the initial "timeline" hold, or is it randomly picked from the bag of molecules that went through this channel during the original run?

mattloose commented on August 22, 2024

The initial timeline holds - so reads play out exactly as they were recorded. All that happens is the signal from a read gets broken up into smaller chunks.

lborcard commented on August 22, 2024

So what happens to reads that are broken down? Will the next read come in at the same time as it was originally sequenced? Does the pore just wait?

mattloose commented on August 22, 2024

No. Imagine you have a read of 10 kb. The sequencer plays back the first 1,000 bases, but you then "unblock", so the sequencer ends the read at 1 kb. But the signal from the original 10 kb read keeps playing, so you will get a new read starting at 1 kb (plus a small bit) into the old read. And so on - until the original read finishes - and then it goes on to the next read. So, your original 10 kb read could be chopped up into 10 fragments of 1 kb each.

Does that make sense?

That is why you end up with reads without a barcode when you run adaptive sampling on a playback run.
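The chopping described above can be mimicked with a small toy model. It is a minimal sketch, not how MinKNOW or readfish implement playback: `chop_read`, `Fragment`, and the `gap` parameter are hypothetical names, lengths are counted in bases rather than signal samples, and the barcode is assumed to sit only at the start of the original read.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    start: int          # offset into the original read, in bases
    end: int
    has_barcode: bool   # assumption: barcode only at the original read start

def chop_read(read_length, unblock_after, gap=0):
    """Split one recorded read into the pieces playback would emit
    if every piece were unblocked after `unblock_after` bases.
    `gap` models the "small bit" of signal lost between pieces."""
    fragments = []
    start = 0
    while start < read_length:
        end = min(start + unblock_after, read_length)
        fragments.append(Fragment(start, end, has_barcode=(start == 0)))
        start = end + gap
    return fragments

# The 10 kb read from the example, unblocked after every 1 kb.
fragments = chop_read(10_000, 1_000)
print(len(fragments))                           # number of pieces
print(sum(f.has_barcode for f in fragments))    # pieces that start with a barcode
```

With these numbers the single 10 kb read becomes ten pieces, and only the first one begins where the barcode was, so the remaining pieces would be expected to come out as unclassified.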

lborcard commented on August 22, 2024

I see what you mean, it is much clearer now. So while the signal of the unblocked read is playing, another read can be playing in the same "pore" and thus chopping it?
