Comments (7)
Good catch! The fix to make sure large writes were handled ended up skipping the EOF marker. As you noted, it's not technically needed, but the HTS suite of tools will complain if it isn't present.
v0.9.5 has the fixes. Thanks for following up!
from gzp.
@mrvollger Sorry for the delay in getting to this! I'll take a look later tonight or tomorrow. I don't see anything immediately wrong with your example code.
Thanks so much for taking a look! I am having a hard time confirming this, but I think it might have to do with the length of the sequences. I have run basically this code on short reads and never had a problem, but when I tried some contigs from assemblies I started to get this error.
Thanks again!
Yep, I have a fix incoming to gzp. It looks like when the write calls have more bytes than the BGZF max buffer size, the final call to flush tries to send all the remaining bytes at once, which then fails in BGZF because the number of bytes exceeds the max buffer size allowed by the BGZF spec.
Thank you for opening this issue! I don't think I would have run into this normally, and I missed it in all my test cases as well.
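To illustrate the failure mode described above, here is a minimal sketch of splitting pending bytes so that no single piece exceeds a per-block limit before it is handed to a BGZF compressor. This is not gzp's actual fix: `split_for_blocks` and `MAX_BLOCK_INPUT` are hypothetical names, and the `0xff00` limit is an assumption borrowed from the uncompressed block size htslib uses.

```rust
/// Assumed cap on uncompressed bytes per BGZF block (hypothetical constant;
/// 0xff00 mirrors the value htslib uses, not necessarily what gzp uses).
const MAX_BLOCK_INPUT: usize = 0xff00;

/// Split `pending` into slices of at most `max_len` bytes so that each one
/// can be compressed into its own block. The last slice may be shorter.
fn split_for_blocks(pending: &[u8], max_len: usize) -> Vec<&[u8]> {
    assert!(max_len > 0, "block size must be non-zero");
    pending.chunks(max_len).collect()
}
```

With something like this in the flush path, a long contig left in the buffer would be emitted as several blocks instead of one oversized block.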
Try the v0.9.3 release.
And thank you again for the issue!
Just tried it and it got through the file and wrote all the records! Thanks!
However, I do get this warning from samtools when I try to index the resulting file:
$ cat .test/large.test.fa.gz | cargo run --release --bin rb -- -vvv fasta-split 1.fa.gz
$ samtools faidx 1.fa.gz
[W::bgzf_read_block] EOF marker is absent. The input is probably truncated
The index does appear to be created after this, and it is correct in this case, but it seems a little worrying.
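For anyone who wants to check an output file without samtools, the warning corresponds to the file not ending in the 28-byte empty BGZF block that the SAM/BAM specification defines as the EOF marker. A minimal sketch of that check follows; `has_bgzf_eof` is a hypothetical helper name and not part of gzp or htslib.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// The 28-byte empty BGZF block used as the end-of-file marker.
const BGZF_EOF: [u8; 28] = [
    0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0x06, 0x00, 0x42, 0x43,
    0x02, 0x00, 0x1b, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
];

/// Return true if the stream ends with the BGZF EOF marker.
/// Streams shorter than the marker are reported as lacking it.
fn has_bgzf_eof<R: Read + Seek>(mut reader: R) -> io::Result<bool> {
    let len = reader.seek(SeekFrom::End(0))?;
    if len < BGZF_EOF.len() as u64 {
        return Ok(false);
    }
    // Jump to the last 28 bytes and compare them with the marker.
    reader.seek(SeekFrom::End(-(BGZF_EOF.len() as i64)))?;
    let mut tail = [0u8; 28];
    reader.read_exact(&mut tail)?;
    Ok(tail == BGZF_EOF)
}
```

Passing it `std::fs::File::open("1.fa.gz")?` would report whether the file written above carries the marker.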