Comments (9)
Note: At the moment setting the max_chunk_size to inf is effectively enrich - setting the chunk size to anything less than inf will give a deplete behaviour.
from readfish.
Can you expand on this behavior a bit and explain max_chunk_size a bit more? I have it set to inf, but would like to know if I should be toggling it in order to improve my targeting/output.
from readfish.
Hi,
This is tricky. What are you trying to do?
I would advise against using inf chunks at this time. I would suggest taking no more than 2 kb worth of data before rejecting a read. You will get better performance, as this will reduce blocking.
from readfish.
Thanks Matt, just trying to optimize and figure out what parameters I should be looking at; max_chunk_size was one I was having trouble understanding. One specific challenge I'm working through is that I'm recovering a lot of long reads (>10 kb) that don't map anywhere in the genome I'm using. It sounds like changing max_chunk_size might reduce this. If I set it to 2, is that equivalent to 2 kb? Thanks.
from readfish.
Hi Danny,
The number of chunks depends on the chunk size, so if you have set it to 0.4 s per chunk then 2 kb is approximately 12 chunks.
If your chunk size is 1 s then 2 kb is approximately 4 chunks.
Hope that makes sense!
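
For anyone wanting to redo this arithmetic for their own settings, here is a minimal sketch. It assumes a translocation speed of roughly 450 bases per second, which is not stated in this thread and varies with chemistry and run conditions; the function name `estimate_chunks` is hypothetical and not part of readfish.

```python
def estimate_chunks(target_bases, chunk_seconds, bases_per_second=450.0):
    """Estimate how many chunks cover target_bases of sequence.

    bases_per_second is an ASSUMED translocation speed (about 450 b/s);
    substitute the value that applies to your own run.
    """
    bases_per_chunk = chunk_seconds * bases_per_second
    return target_bases / bases_per_chunk


# Compare the two chunk sizes mentioned above for a 2 kb budget.
for chunk_seconds in (0.4, 1.0):
    estimate = estimate_chunks(2000, chunk_seconds)
    print(f"{chunk_seconds} s chunks: about {estimate:.1f} chunks for 2 kb")
```

Under that assumption the estimates come out near 11 and 4.4 chunks, in line with the approximate figures of 12 and 4 given above; round up or down depending on how conservative you want the cutoff to be.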
from readfish.
Hi Matt, I set up a new run and can confirm that setting max_chunk_size to 12 resolved my issue where I was recovering reads >10 kb that did not map anywhere. I now get reads up to 3 kb that do not map anywhere, but not larger than that. I'll see if this improves coverage of my target regions, then experiment with turning this down more. Is there any reason I shouldn't take only 1 kb or so of data before rejecting a read (setting max_chunk_size to 6, for example)?
BTW, thanks to you and your team for the excellent work on this. I remember you discussing it at PoreCamp in 2017 and thinking about how cool it would be if it worked.
from readfish.
Hi @danrdanny what is the best way to check the length of the reads that didn't map anywhere?
I don't see a max_chunk_size option, but there is a max_chunks. Is that the same?
from readfish.
Whoops, you're correct @rdwrt, the option is max_chunks. The easiest way to find reads that don't map is to align to a genome then identify reads that don't map. Alternatively, you can blast randomly selected reads, but that's not very efficient.
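
As a rough sketch of the "align, then identify reads that don't map" approach: once the reads are aligned to SAM with an aligner of your choice, unmapped records carry the 0x4 bit in the FLAG field, so their lengths can be read straight from the SEQ column. The function name `unmapped_read_lengths` and the example records below are made up for illustration; this is plain SAM text parsing, not part of readfish.

```python
def unmapped_read_lengths(sam_lines):
    """Yield (read_name, length) for each unmapped record in SAM text."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        seq = fields[9]
        # 0x4 is the SAM FLAG bit meaning "segment unmapped".
        if flag & 0x4 and seq != "*":
            yield fields[0], len(seq)


# Illustrative records only; in practice iterate over an open SAM file.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t10\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTACGT\tIIIIIIIIIIII",
]

for name, length in unmapped_read_lengths(example_sam):
    print(name, length)
```

Sorting or plotting the reported lengths then shows whether long unmapped reads are still getting through.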
from readfish.
Closed as inactive.
from readfish.