Comments (3)
Thanks for bringing this up! Could you provide a test file along with a test program that shows the current behaviour?
My assumption is that it's not actually possible to decode the last member of such a file, as it emits an error on the last read, but it would be great to validate that. It may instead decode the last member successfully and only fail on the next read, because it can't decode the garbage. That would mean all compressed data can be read with the current API, and the real question is how to indicate that a 'garbage tail' occurred.
With such an example, it would be possible to set up a test case and eventually find a solution.
from flate2-rs.
Sure. Here is a test file I created with the following commands:
    echo "Hello, world" | gzip > hello.gz
    echo x >> hello.gz
Running gzcat on the file:
    % gzcat hello.gz
    Hello, world
    gzcat: hello.gz: trailing garbage ignored
A short program that demonstrates the issue:
    use flate2::read::MultiGzDecoder;
    use std::fs::File;
    use std::io::{BufReader, Read, Write};

    fn main() {
        // hello.gz is one gzip member followed by the trailing bytes "x\n".
        let file = File::open("hello.gz").unwrap();
        let mut buf_reader = BufReader::new(MultiGzDecoder::new(file));
        let mut buf = [0_u8; 100];
        loop {
            match buf_reader.read(&mut buf) {
                // A clean end of stream.
                Ok(0) => break,
                Ok(n) => {
                    println!("read {} bytes:", n);
                    std::io::stdout().write_all(&buf[0..n]).unwrap();
                    std::io::stdout().flush().unwrap();
                }
                // The trailing garbage surfaces here as an error.
                Err(e) => {
                    println!("{:?}", e);
                    break;
                }
            }
        }
    }
And the output:
    read 13 bytes:
    Hello, world
    Kind(UnexpectedEof)
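The output suggests that the payload is fully decoded before the error surfaces. Below is a minimal sketch of how an application might take advantage of that today. It assumes only that some `Read` implementation (for instance the `MultiGzDecoder` from the program above) is passed in; `StreamEnd` and `read_tolerating_tail` are hypothetical names and not part of flate2. It is a heuristic: a truncated second member would presumably also end in `UnexpectedEof`, so this cannot tell the two cases apart.

```rust
use std::io::{self, ErrorKind, Read};

/// How the stream ended (hypothetical type, not part of flate2).
#[derive(Debug, PartialEq)]
pub enum StreamEnd {
    /// The reader reported a clean end of stream.
    Clean,
    /// Some data was read, then the reader failed with `UnexpectedEof`.
    /// This may be trailing garbage, but it may also be a truncated member.
    PossibleTrailingGarbage,
}

/// Reads everything from `reader` into `out`, tolerating an `UnexpectedEof`
/// that occurs after at least one byte was read successfully.
/// Any other error, or an error before any data, is returned unchanged.
pub fn read_tolerating_tail<R: Read>(mut reader: R, out: &mut Vec<u8>) -> io::Result<StreamEnd> {
    let mut buf = [0u8; 4096];
    let mut total = 0usize;
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return Ok(StreamEnd::Clean),
            Ok(n) => {
                out.extend_from_slice(&buf[..n]);
                total += n;
            }
            // Retry reads that were merely interrupted.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) if e.kind() == ErrorKind::UnexpectedEof && total > 0 => {
                return Ok(StreamEnd::PossibleTrailingGarbage);
            }
            Err(e) => return Err(e),
        }
    }
}
```

With the test file above, one would expect `out` to hold the 13 decoded bytes and the result to be `PossibleTrailingGarbage`, though that depends on the decoder behaving as shown in the output.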
Thanks a lot for your help!
Thoughts
It looks like the question here is whether UnexpectedEof could instead be another kind of error, one that indicates more clearly that there was indeed nothing to read, or that the stream it tried to read (x here) isn't actually a deflated stream.
Another guess is that if the trailing garbage is longer, the error will be a different, perhaps more specific one. After all, the magic signature seems to be 2 bytes long, so with this little trailing data we may run out of input before the header has even been read.
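For reference, RFC 1952 defines the fixed part of a gzip member header as 10 bytes, starting with the magic bytes 0x1f 0x8b and the compression method 8 (deflate). Here is a rough sketch of how leftover bytes could be classified on that basis, assuming the application can get hold of them somehow. `TailKind` and `classify_tail` are hypothetical names, and this is not how flate2 reports anything today.

```rust
/// Rough classification of bytes left over after the last gzip member.
/// Hypothetical helper, not part of flate2.
#[derive(Debug, PartialEq)]
pub enum TailKind {
    /// Nothing follows the last member.
    Empty,
    /// The available bytes contradict the gzip magic or compression method.
    NotGzip,
    /// The bytes match so far, but are shorter than the fixed 10-byte header.
    TruncatedHeader,
    /// The bytes start like a gzip member header using deflate.
    LooksLikeMember,
}

/// ID1, ID2 and CM (deflate) as specified in RFC 1952.
const HEADER_PREFIX: [u8; 3] = [0x1f, 0x8b, 0x08];
/// Length of the fixed part of a gzip member header.
const FIXED_HEADER_LEN: usize = 10;

pub fn classify_tail(tail: &[u8]) -> TailKind {
    if tail.is_empty() {
        return TailKind::Empty;
    }
    // Compare as many of the first three bytes as are available.
    if tail.iter().zip(HEADER_PREFIX.iter()).any(|(a, b)| a != b) {
        return TailKind::NotGzip;
    }
    if tail.len() < FIXED_HEADER_LEN {
        TailKind::TruncatedHeader
    } else {
        TailKind::LooksLikeMember
    }
}
```

For the test file above the tail is `x\n`, which this sketch classifies as `NotGzip` because the first byte already differs from 0x1f.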
Next Steps
I think it would be worth setting up a couple of test cases with varying lengths of trailing garbage to get an idea of how it can be detected at the application level right now.
From there, we might be able to figure out how to optionally adjust or enhance the API in a backward-compatible manner to make detecting this situation easier.
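Test inputs with varying lengths of trailing garbage could be generated in memory rather than with the gzip command line tool. The sketch below uses only the standard library: it hand-builds one gzip member containing a single stored (uncompressed) deflate block, following RFC 1952 and RFC 1951, and appends `x` bytes. All function names are hypothetical. In flate2's own test suite, the crate's encoder would likely be the more natural way to produce the member.

```rust
/// CRC-32 (reflected polynomial 0xEDB88320), as used in the gzip trailer.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones if the lowest bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Builds one gzip member holding `payload` in a single stored deflate block.
/// The payload must fit into 65535 bytes.
pub fn gzip_stored(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u16::MAX as usize, "payload too large for one stored block");
    let len = payload.len() as u16;
    // ID1, ID2, CM = deflate, FLG = 0, MTIME = 0, XFL = 0, OS = unknown.
    let mut out = vec![0x1f, 0x8b, 0x08, 0x00, 0, 0, 0, 0, 0x00, 0xff];
    // BFINAL = 1, BTYPE = 00 (stored), padded to a byte boundary.
    out.push(0x01);
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(&(!len).to_le_bytes());
    out.extend_from_slice(payload);
    // Trailer: CRC-32 and length of the uncompressed data, little-endian.
    out.extend_from_slice(&crc32(payload).to_le_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out
}

/// Returns one `(garbage_len, bytes)` pair per requested garbage length.
pub fn garbage_cases(payload: &[u8], lens: &[usize]) -> Vec<(usize, Vec<u8>)> {
    let member = gzip_stored(payload);
    lens.iter()
        .map(|&n| {
            let mut bytes = member.clone();
            bytes.extend(std::iter::repeat(b'x').take(n));
            (n, bytes)
        })
        .collect()
}
```

Each `bytes` value can be borrowed as a `&[u8]`, which implements `Read`, and passed to the decoder under test. Lengths around the 2-byte magic and the 10-byte fixed header (for example 1, 2, 3, 9, 10 and 11) seem like the interesting ones to try.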