flate2-rs's Introduction

flate2

A streaming compression/decompression library for DEFLATE-based streams in Rust.

This crate by default uses the miniz_oxide crate, a port of miniz.c to pure Rust. This crate also supports other backends, such as the widely available zlib library or the high-performance zlib-ng library.

Supported formats:

  • deflate
  • zlib
  • gzip
# Cargo.toml
[dependencies]
flate2 = "1.0"

MSRV (Minimum Supported Rust Version) Policy

This crate supports the current stable and the last stable for the latest version. For example, if the current stable is 1.64, this crate supports 1.64 and 1.63. Older stables may work, but we don't guarantee these will continue to work.

Compression

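use std::io::prelude::*;
use flate2::Compression;
use flate2::write::ZlibEncoder;

fn main() {
    // Compress into an in-memory Vec<u8> at the default compression level.
    let mut e = ZlibEncoder::new(Vec::new(), Compression::default());
    e.write_all(b"foo").unwrap();
    e.write_all(b"bar").unwrap();
    // finish() completes the stream and returns the underlying writer.
    let compressed_bytes = e.finish().unwrap();
    println!("{} compressed bytes", compressed_bytes.len());
}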

Decompression

use std::io::prelude::*;
use flate2::read::GzDecoder;

fn main() {
    let mut d = GzDecoder::new("...".as_bytes());
    let mut s = String::new();
    d.read_to_string(&mut s).unwrap();
    println!("{}", s);
}

Backends

The default miniz_oxide backend has the advantage of being pure Rust. If you want maximum performance, you can use the zlib-ng C library:

[dependencies]
flate2 = { version = "1.0.17", features = ["zlib-ng"], default-features = false }

Note that the "zlib-ng" feature works even if some other part of your crate graph depends on zlib.

However, if you're already using another C or Rust library that depends on zlib, and you want to avoid including both zlib and zlib-ng, you can use zlib for Rust code as well:

[dependencies]
flate2 = { version = "1.0.17", features = ["zlib"], default-features = false }

Or, if you have C or Rust code that depends on zlib and you want to use zlib-ng via libz-sys in zlib-compat mode, use:

[dependencies]
flate2 = { version = "1.0.17", features = ["zlib-ng-compat"], default-features = false }

Note that when using the "zlib-ng-compat" feature, if any crate in your dependency graph explicitly requests stock zlib, or uses libz-sys directly without default-features = false, you'll get stock zlib rather than zlib-ng. See the libz-sys README for details. To avoid that, use the "zlib-ng" feature instead.

For compatibility with previous versions of flate2, the Cloudflare optimized version of zlib is available, via the cloudflare_zlib feature. It's not as fast as zlib-ng, but it's faster than stock zlib. It requires an x86-64 CPU with SSE 4.2 or ARM64 with NEON & CRC. It does not support 32-bit CPUs at all and is incompatible with mingw. For more information check the crate documentation. Note that cloudflare_zlib will cause breakage if any other crate in your crate graph uses another version of zlib/libz.

License

This project is licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

flate2-rs's People

Contributors

alexcrichton, andygauge, anforowicz, byron, chrisvittal, coolreader18, davidkorczynski, dependabot-support, dodomorandi, est31, fafhrd91, guillaumegomez, johntitor, jongiddy, joshtriplett, kper, lukazoid, mdsteele, nivkner, nyurik, opilar, oyvindln, pierrev23, quininer, rreverser, sbstp, sunshowers, tshepang, twittner, wcampbell0x2a

flate2-rs's Issues

miniz-sys makes x86_64-w64-mingw32-* freak out

Compiling a project that transitively depends on flate2. Got a university project to deliver by monday.

Doing it like this:

cargo rustc --release --target=x86_64-pc-windows-gnu -- -C linker=x86_64-w64-mingw32-gcc

This always works flawlessly. Not this time: I got a linker error instead.

  = note: collect2: error: ld returned 1 exit status

A scary one. Not even a peep about an undefined reference or some such.

Instead, it turns out some ELF objects got into the miniz rlib:

$ x86_64-w64-mingw32-objdump  -t "path/target/x86_64-pc-windows-gnu/release/deps/libminiz_sys-f01b4045f82726c9.rlib" 
In archive path/target/x86_64-pc-windows-gnu/release/deps/libminiz_sys-f01b4045f82726c9.rlib:

miniz_sys-f01b4045f82726c9.0.o:     file format pe-x86-64
<snip>
miniz.o:     file format elf64-x86-64

Reading self-terminating streams

First off, awesome library, thanks!

My use-case: reading git pack files, which contain multiple concatenated, yet undelimited, zlib/deflate streams of data. In other words, the pack files don't contain any information about the length of the streams - so to detect where the start of the next object in the pack is, I have to know exactly how many bytes the underlying deflate implementation consumed, so I can know where in the underlying file to start reading the next segment of data from.

This poses two separate problems:

  • Constructing a ZlibDecoder takes ownership of the underlying file stream, which means I have to re-open the pack file to read the next object from it. This is an annoyance, but it works fine for my scenario.
  • I need to get the number of bytes the ZlibDecoder consumed (which is likely less than the number of bytes it read off the underlying stream, due to buffering).

In an ideal world, I think ZlibDecoder would only take a &mut reference to the underlying Read, and by some magic, when it's destroyed, it leaves that underlying stream positioned at the exact end of the zlib data. This would likely involve requiring the underlying stream to also implement Seek, which in turn either requires code duplication in the API (so far as I am aware), or imposes unwanted restrictions on everyone else. I don't think that's realistic, so let's move on to option 2:

Make ZlibDecoder take a &mut reference to the underlying Read. It does nothing special when it's destroyed, but it exposes an extra zlibDecoder.consumed_bytes() method (or field, or whatever), calculated from the total bytes it's read, less what's remaining in miniz's input buffer.

The last option for me is to scrap the higher-level API and directly use your miniz-sys bindings, which is ugly for me, but less ugly for everyone else.

Thoughts?
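
One way to picture the consumed_bytes() idea from option 2, without touching the decoder itself, is a counting wrapper around a BufRead. The sketch below uses only std; CountingReader, consumed_bytes and demo_counting are hypothetical names, not part of flate2. It only helps if the consumer pulls data through fill_buf()/consume() and consumes exactly what it uses, which is an assumption that would need checking for any particular decoder.

```rust
use std::io::{self, BufRead, Read};

/// Hypothetical helper (not part of flate2): wraps any `BufRead` and counts
/// the bytes the consumer has actually consumed, not the bytes buffered.
pub struct CountingReader<R> {
    inner: R,
    consumed: u64,
}

impl<R: BufRead> CountingReader<R> {
    pub fn new(inner: R) -> Self {
        CountingReader { inner, consumed: 0 }
    }

    /// Number of bytes handed over via `consume` so far.
    pub fn consumed_bytes(&self) -> u64 {
        self.consumed
    }
}

impl<R: BufRead> Read for CountingReader<R> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Route plain reads through fill_buf/consume so they are counted too.
        let n = {
            let available = self.fill_buf()?;
            let n = available.len().min(out.len());
            out[..n].copy_from_slice(&available[..n]);
            n
        };
        self.consume(n);
        Ok(n)
    }
}

impl<R: BufRead> BufRead for CountingReader<R> {
    fn fill_buf(&mut self) -> io::Result<&[u8]> {
        self.inner.fill_buf()
    }

    fn consume(&mut self, amt: usize) {
        self.consumed += amt as u64;
        self.inner.consume(amt);
    }
}

/// Illustration: a '|' delimiter stands in for "end of the first stream".
pub fn demo_counting() -> (Vec<u8>, u64) {
    let data: &[u8] = b"first|second";
    let mut reader = CountingReader::new(data); // &[u8] implements BufRead
    let mut first = Vec::new();
    reader.read_until(b'|', &mut first).unwrap();
    (first, reader.consumed_bytes())
}
```

After the first segment has been read, consumed_bytes() gives the offset at which the next segment starts, which is the piece of information the pack-file use case needs.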

Performance regression?

I updated rust / cargo / flate2 to the most recent versions and un-gzipping a small (165B) file takes multiple seconds. Is it me or did something change?

versions:

% rustc --version
rustc 1.0.0-nightly (199bdcfef 2015-03-26) (built 2015-03-26)
% cargo --version
cargo 0.0.1-pre-nightly (9f265c8 2015-03-25) (built 2015-03-26)

   Compiling miniz-sys v0.1.4
   Compiling flate2 v0.2.1

Library comment contains extra r's

It says:

//! This crate consists mainly of two modules, `reader` and `writer`. Each
//! module contains a number of types used to encode and decode various streams
//! of data. All types in the `writer` module work on instances of `Writer`,
//! whereas all types in the `reader` module work on instances of `Reader`.

All this stuff has been renamed: reader should be read etc.

undefined reference to `__assert_func'

For some reason I cannot compile this project: https://github.com/viperscape/font-atlas-example/tree/master/atlas-gen, specifically the atlas-gen sub-project. Below is the output during linking. I am using the MinGW gcc from Cygwin (x86_64-w64-mingw32-gcc) with 32-bit rust-nightly. It's very possible that this is an issue on my end, as I was able to compile this yesterday; I'm not sure what changed :-( I'm hesitant to install MinGW directly because it's hosted on SourceForge, which I try to avoid nowadays. Any clues as to what is going on here? Thanks!

note: C:\Users\chris\font-atlas-example\atlas-gen\target\debug\deps\libminiz_sys-d19b88f9ef21a81d.rlib(miniz.o): In function `tinfl_decompress':
/cygdrive/c/Users/Chris/.cargo/registry/src/github.com-121aea75f9ef2ce2/miniz-sys-0.1.6/miniz.c:1707: undefined reference to `__assert_func'
C:\Users\chris\font-atlas-example\atlas-gen\target\debug\deps\libminiz_sys-d19b88f9ef21a81d.rlib(miniz.o): In function `tdefl_start_dynamic_block':
/cygdrive/c/Users/Chris/.cargo/registry/src/github.com-121aea75f9ef2ce2/miniz-sys-0.1.6/miniz.c:2024: undefined reference to `__assert_func'
/cygdrive/c/Users/Chris/.cargo/registry/src/github.com-121aea75f9ef2ce2/miniz-sys-0.1.6/miniz.c:2026: undefined reference to `__assert_func'
/cygdrive/c/Users/Chris/.cargo/registry/src/github.com-121aea75f9ef2ce2/miniz-sys-0.1.6/miniz.c:2027: undefined reference to `__assert_func'
/cygdrive/c/Users/Chris/.cargo/registry/src/github.com-121aea75f9ef2ce2/miniz-sys-0.1.6/miniz.c:2030: undefined reference to `__assert_func'
C:\Users\chris\font-atlas-example\atlas-gen\target\debug\deps\libminiz_sys-d19b88f9ef21a81d.rlib(miniz.o):/cygdrive/c/Users/Chris/.cargo/registry/src/github.com-121aea75f9ef2ce2/miniz-sys-0.1.6/miniz.c:2031: more undefined references to `__assert_func' follow
$ gcc --version
gcc (GCC) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ rustc --version
rustc 1.5.0-nightly (168a23ebe 2015-10-01)

$ cargo --version
cargo 0.6.0-nightly (7f21e73 2015-10-01)

Add a get_eof() method to encoders

When an encoder tries to read from a channel, it needs a minimum of two Ok(0)'s to end the encoding. One to trigger the footer being generated which yields Ok(footerSize), the second to get an Ok(0) response back from the encoder. If I could get the EOF status of the encoder, I could use something like " if myencoder.get_eof() && encodedbytes == 8 { break; } " to prevent the channel from hanging.

About Zlib-sys

I see you are working on zlib-sys too...

Do you have some kind of long-term goal to replace miniz with zlib? Or to support two backends through features? Miniz's limitations have really been a pain recently.

miniz: panic when trying to decompress data after stream ended

I'm getting a panic 'unknown return code: -2' from src/mem.rs:309, when I run the following snippet. I don't like my program crashing when decompressing untrusted input, so I think this should just return an error.

#[test]
fn test_extract_panic() {
    let data = vec![
        0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0xb3, 0xc9,
        0x28, 0xc9, 0xcd, 0xb1, 0xe3, 0xe5, 0xb2, 0xc9, 0x48, 0x4d, 0x4c, 0xb1,
        0xb3, 0x29, 0xc9, 0x2c, 0xc9, 0x49, 0xb5, 0x33, 0x31, 0x30, 0x51, 0xf0,
        0xcb, 0x2f, 0x51, 0x70, 0xcb, 0x2f, 0xcd, 0x4b, 0xb1, 0xd1, 0x87, 0x08,
        0xda, 0xe8, 0x83, 0x95, 0x00, 0x95, 0x26, 0xe5, 0xa7, 0x54, 0x2a, 0x24,
        0xa5, 0x27, 0xe7, 0xe7, 0xe4, 0x17, 0xd9, 0x2a, 0x95, 0x67, 0x64, 0x96,
        0xa4, 0x2a, 0x81, 0x8c, 0x48, 0x4e, 0xcd, 0x2b, 0x49, 0x2d, 0xb2, 0xb3,
        0xc9, 0x30, 0x44, 0x37, 0x01, 0x28, 0x62, 0xa3, 0x0f, 0x95, 0x06, 0xd9,
        0x05, 0x54, 0x04, 0xe5, 0xe5, 0xa5, 0x67, 0xe6, 0x55, 0xe8, 0x1b, 0xea,
        0x99, 0xe9, 0x19, 0x21, 0xab, 0xd0, 0x07, 0xd9, 0x01, 0x32, 0x53, 0x1f,
        0xea, 0x3e, 0x00, 0x94, 0x85, 0xeb, 0xe4, 0xa8, 0x00, 0x00, 0x00
    ];

    let mut decoded = Vec::with_capacity(data.len()*2);

    let mut d = Decompress::new(false);
    // decompressed whole deflate stream
    assert!(d.decompress_vec(&data[10..], &mut decoded, Flush::Finish).is_ok());

    // decompress data that has nothing to do with the deflate stream (this panics)
    assert!(d.decompress_vec(&[0], &mut decoded, Flush::None).is_err());
}
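
The request above, returning an error instead of panicking on an unexpected return code, could look roughly like the following sketch. The enum and function names are hypothetical and are not flate2's actual types; the numeric values follow zlib.h (Z_OK = 0, Z_STREAM_END = 1, Z_STREAM_ERROR = -2, Z_DATA_ERROR = -3, Z_BUF_ERROR = -5), and it is assumed here that the miniz backend uses the same numbering.

```rust
/// Hypothetical status type for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum InflateStatus {
    Ok,
    StreamEnd,
    BufError,
}

/// Hypothetical error type for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum InflateError {
    /// The stream state was inconsistent (zlib's Z_STREAM_ERROR).
    StreamError,
    /// The input data was corrupt (zlib's Z_DATA_ERROR).
    DataError,
    /// Any code this mapping does not know about.
    Unknown(i32),
}

/// Map a raw backend return code to a `Result` rather than panicking.
pub fn status_from_code(code: i32) -> Result<InflateStatus, InflateError> {
    match code {
        0 => Ok(InflateStatus::Ok),
        1 => Ok(InflateStatus::StreamEnd),
        -5 => Ok(InflateStatus::BufError),
        -2 => Err(InflateError::StreamError),
        -3 => Err(InflateError::DataError),
        other => Err(InflateError::Unknown(other)),
    }
}
```

With a mapping like this, feeding more input to a finished stream would surface as Err(InflateError::StreamError), which the caller can handle when processing untrusted data.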

Some problems

1. The example in README.md uses std::io::MemWriter; this should now be changed to old_io.

2. CompressionLevel::Default is not included in pub use CompressionLevel::{BestCompression,BestSpeed,NoCompression};, so the example in README.md might need to be

let mut e = ZlibEncoder::new(MemWriter::new(), flate2::CompressionLevel::Default);

3. By the way, I'm going to write a distributed NoSQL database in Rust. Is flate2 suitable for data storage, and should I use BestSpeed or Default?
// Forgive me, my English is not very good :)

Could not compile with MinGW; explicit panic

I am getting this error when I try to compile flate2-rs as an implicit dependency:

C:\Users\Limeth\workspace\rust\euclider>cargo run --release
   Compiling phf_codegen v0.7.16
   Compiling scoped_threadpool v0.1.7
   Compiling shell32-sys v0.1.1
   Compiling miniz-sys v0.1.7
Build failed, waiting for other jobs to finish...
error: failed to run custom build command for `miniz-sys v0.1.7`
process didn't exit successfully: `C:\Users\Limeth\workspace\rust\euclider\target\release\build\miniz-sys-60c8d67696f63a43\build-script-build` (exit code: 101)
--- stdout
TARGET = Some("x86_64-pc-windows-gnu")
OPT_LEVEL = Some("3")
PROFILE = Some("release")
TARGET = Some("x86_64-pc-windows-gnu")
debug=false opt-level=3
HOST = Some("x86_64-pc-windows-gnu")
TARGET = Some("x86_64-pc-windows-gnu")
TARGET = Some("x86_64-pc-windows-gnu")
HOST = Some("x86_64-pc-windows-gnu")
CC_x86_64-pc-windows-gnu = None
CC_x86_64_pc_windows_gnu = None
HOST_CC = None
CC = None
TARGET = Some("x86_64-pc-windows-gnu")
HOST = Some("x86_64-pc-windows-gnu")
CFLAGS_x86_64-pc-windows-gnu = None
CFLAGS_x86_64_pc_windows_gnu = None
HOST_CFLAGS = None
CFLAGS = None
running: "gcc.exe" "-O3" "-ffunction-sections" "-fdata-sections" "-m64" "-o" "C:\\Users\\Limeth\\workspace\\rust\\euclider\\target\\release\\build\\miniz-sys-60c8d67696f63a43\\out\\miniz.o" "-c" "miniz.c"
cargo:warning=miniz.c:1:0: sorry, unimplemented: 64-bit mode not compiled in
cargo:warning= /* miniz.c v1.16 beta r1 - public domain deflate/inflate, zlib-subset, ZIP reading/writing/appending, PNG writing
cargo:warning= ^
ExitStatus(ExitStatus(1))


command did not execute successfully, got: exit code: 1



--- stderr
thread 'main' panicked at 'explicit panic', C:\Users\Limeth\.cargo\registry\src\github.com-1ecc6299db9ec823\gcc-0.3.35\src\lib.rs:897
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Eagerly implement common traits

  • GzHeader should implement PartialEq and maybe Default.
  • Compression should implement PartialEq, Eq, Default.
  • Flush should implement Copy, Clone, PartialEq, Eq.
  • Status should implement Copy, Clone, PartialEq, Eq.
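
As a rough picture of what this list asks for, the traits can usually be derived. The types below are hypothetical stand-ins with made-up variants and fields, not flate2's real definitions; the default level of 6 is likewise an assumption, chosen to match zlib's conventional default.

```rust
/// Hypothetical stand-in for a flush-mode enum; the variants are illustrative.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Flush {
    None,
    Sync,
    Finish,
}

/// Hypothetical stand-in for a compression-level newtype.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Compression(pub u32);

impl Default for Compression {
    fn default() -> Self {
        // Assumed default level for this sketch.
        Compression(6)
    }
}

/// With `Copy` and `PartialEq`, values can be passed by value and compared directly.
pub fn is_finish(mode: Flush) -> bool {
    mode == Flush::Finish
}
```

Deriving these eagerly costs nothing at runtime and lets callers write assertions and comparisons without matching on every variant.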

failed to run custom build command for `miniz-sys v0.1.7`

Some PistonDeveloper projects implicitly depend on miniz-sys. On my desktop I run into problems compiling them (namely the conrod examples and the imgproc examples).

OS: Windows 8.1 (64Bit)
rustc: 1.5.0 (Stable)

failed to run custom build command for `miniz-sys v0.1.7`
Process didn't exit successfully: `C:\Users\username\Documents\GitHub\conrod\target\
release\build\miniz-sys-f1c39e8e406fa25f\build-script-build` (exit code: 101)
--- stdout
TARGET = Some("x86_64-pc-windows-gnu")
OPT_LEVEL = Some("3")
PROFILE = Some("release")
TARGET = Some("x86_64-pc-windows-gnu")
debug=false opt-level=3
HOST = Some("x86_64-pc-windows-gnu")
TARGET = Some("x86_64-pc-windows-gnu")
TARGET = Some("x86_64-pc-windows-gnu")
HOST = Some("x86_64-pc-windows-gnu")
CC_x86_64-pc-windows-gnu = None
CC_x86_64_pc_windows_gnu = None
HOST_CC = None
CC = None
TARGET = Some("x86_64-pc-windows-gnu")
HOST = Some("x86_64-pc-windows-gnu")
CFLAGS_x86_64-pc-windows-gnu = None
CFLAGS_x86_64_pc_windows_gnu = None
HOST_CFLAGS = None
CFLAGS = None
running: "gcc.exe" "-O3" "-ffunction-sections" "-fdata-sections" "-m64" "-o" "C:
\\Users\\username\\Documents\\GitHub\\conrod\\target\\release\\build\\miniz-sys-f1c3
9e8e406fa25f\\out\\miniz.o" "-c" "miniz.c"
ExitStatus(ExitStatus(1))

command did not execute successfully, got: exit code: 1

--- stderr
gcc.exe: error: CreateProcess: No such file or directory
thread '<main>' panicked at 'explicit panic', C:\Users\username\.cargo\registry\src\
github.com-0a35038f75765ae4\gcc-0.3.21\src\lib.rs:772

My %PATH% (relevant parts) looks like this:

C:\Program Files\Rust stable 1.5\bin;C:\Program Files\Rust stable 1.5\bin\rustlib\x86_64-pc-windows-gnu\lib\;C:\Program Files\Rust stable 1.5\bin\rustlib\x86_64-pc-windows-gnu\bin\;C:\Program Files\Rust stable1.5\bin\;

Manually executing build-script-build.exe:

C:\Users\username\Documents\GitHub\conrod\target\debug\build\miniz-sys-f1c39e8e406fa
25f>build-script-build.exe
thread '<main>' panicked at 'called `Option::unwrap()` on a `None` value', ../sr
c/libcore\option.rs:366

gcc -v output:

Using built-in specs.
COLLECT_GCC=gcc
Target: x86_64-w64-mingw32
Configured with: ../../../src/gcc-4.9.1/configure --host=x86_64-w64-mingw32 --bu
ild=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --prefix=/mingw64 --with-sysr
oot=/c/mingw491/x86_64-491-win32-seh-rt_v3-rev1/mingw64 --with-gxx-include-dir=/
mingw64/x86_64-w64-mingw32/include/c++ --enable-shared --enable-static --disable
-multilib --enable-languages=ada,c,c++,fortran,objc,obj-c++,lto --enable-libstdc
xx-time=yes --enable-threads=win32 --enable-libgomp --enable-libatomic --enable-
lto --enable-graphite --enable-checking=release --enable-fully-dynamic-string --
enable-version-specific-runtime-libs --disable-isl-version-check --disable-cloog
-version-check --disable-libstdcxx-pch --disable-libstdcxx-debug --enable-bootst
rap --disable-rpath --disable-win32-registry --disable-nls --disable-werror --di
sable-symvers --with-gnu-as --with-gnu-ld --with-arch=nocona --with-tune=core2 -
-with-libiconv --with-system-zlib --with-gmp=/c/mingw491/prerequisites/x86_64-w6
4-mingw32-static --with-mpfr=/c/mingw491/prerequisites/x86_64-w64-mingw32-static
 --with-mpc=/c/mingw491/prerequisites/x86_64-w64-mingw32-static --with-isl=/c/mi
ngw491/prerequisites/x86_64-w64-mingw32-static --with-cloog=/c/mingw491/prerequi
sites/x86_64-w64-mingw32-static --enable-cloog-backend=isl --with-pkgversion='x8
6_64-win32-seh-rev1, Built by MinGW-W64 project' --with-bugurl=http://sourceforg
e.net/projects/mingw-w64 CFLAGS='-O2 -pipe -I/c/mingw491/x86_64-491-win32-seh-rt
_v3-rev1/mingw64/opt/include -I/c/mingw491/prerequisites/x86_64-zlib-static/incl
ude -I/c/mingw491/prerequisites/x86_64-w64-mingw32-static/include' CXXFLAGS='-O2
 -pipe -I/c/mingw491/x86_64-491-win32-seh-rt_v3-rev1/mingw64/opt/include -I/c/mi
ngw491/prerequisites/x86_64-zlib-static/include -I/c/mingw491/prerequisites/x86_
64-w64-mingw32-static/include' CPPFLAGS= LDFLAGS='-pipe -L/c/mingw491/x86_64-491
-win32-seh-rt_v3-rev1/mingw64/opt/lib -L/c/mingw491/prerequisites/x86_64-zlib-st
atic/lib -L/c/mingw491/prerequisites/x86_64-w64-mingw32-static/lib '
Thread model: win32

Add a pure-Rust backend

flate2 gets its actual compression from either libz or miniz, both C libraries. Ultimately, a pure-Rust stack is better for Rust, mostly because of build simplicity.

Write a miniz replacement in Rust, publish it to crates.io and add another compile-time feature to select it.

Then make the performance better than miniz and make it the default.

Allow re-use of buffers provided to `new_with_buf`

I'm working on using flate2 to add support for the deflate-based serialization formats for HdrHistogram to this Rust implementation. However, some parts of the flate2::read::DeflateDecoder and flate2::write::DeflateEncoder API are awkward for my use case.

The Deserializer struct in that library exists to let the user amortize buffer allocation costs across all deserializations (no heap allocation once steady state has been achieved), with similar goals for the serialization equivalent. The struct itself is not generic, but it has a generic deserialize method:

pub fn deserialize<T: Counter, R: Read>(&mut self, reader: &mut R)
                                            -> Result<Histogram<T>, DeserializeError> 

My understanding of the DeflateDecoder API leads me towards a few options, none of which is particularly attractive:

  1. Make Deserializer follow the same "consume the writer, then return it later" structure. This means Deserializer must itself be generified on R: Read, must always wrap a Read (which means it can't be created during startup before there might be anything to read from), etc.
  2. Make a new DeflateDecoder for each deserialization. This spoils my hope to avoid heap allocation, but allows me to keep the Deserializer API the way it is.
  3. Require BufRead instead of Read. That would let me create a flate2::bufread::DeflateDecoder for each deserialization, but it pushes buffer management to the user: now they have to figure out how to re-use buffers across Reads, assuming they put forth the effort to avoid the heap allocation. It doesn't look like std::io::BufReader will let you reclaim its buffer space, for instance. On the other hand, perhaps it's likely that in practice people will have a BufRead already?
  4. Copy-and-paste flate2::bufreader::BufReader and modify it to take (and release) a Box<[u8]> (or maybe just a &mut [u8]?) and use that with flate2::bufread::DeflateDecoder.

DeflateDecoder::new_with_buf almost makes option 2 above feasible, but I can't get the buffer back out again to use it again later. So, if there was a way to either not require that the Vec be consumed (preferable, as I wouldn't have to shuffle the Vec in and out of the Deserializer struct), or at least offer a way to consume the DeflateDecoder to get the Vec back out (like into_inner() but for the buffer), that would let me keep the Deserializer API I prefer.

Surely I can't be the only one who wants this form of re-use... What do you think? Am I missing something?
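
To make the "don't consume the buffer" variant concrete, here is a std-only sketch of a buffered reader that borrows its scratch space instead of owning it. BorrowedBufReader and demo_buffer_reuse are hypothetical names, not flate2 API; the point is only that a borrowed buffer is automatically available again once the reader is dropped.

```rust
use std::io::{self, Read};

/// Hypothetical buffered reader that borrows its scratch buffer, so the
/// caller keeps ownership and can reuse the allocation afterwards.
pub struct BorrowedBufReader<'a, R> {
    inner: R,
    buf: &'a mut [u8],
    pos: usize, // next unread byte in `buf`
    cap: usize, // number of valid bytes in `buf`
}

impl<'a, R: Read> BorrowedBufReader<'a, R> {
    pub fn new(inner: R, buf: &'a mut [u8]) -> Self {
        BorrowedBufReader { inner, buf, pos: 0, cap: 0 }
    }
}

impl<'a, R: Read> Read for BorrowedBufReader<'a, R> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Refill the borrowed buffer once everything in it has been handed out.
        if self.pos == self.cap {
            self.cap = self.inner.read(&mut *self.buf)?;
            self.pos = 0;
        }
        let n = (self.cap - self.pos).min(out.len());
        out[..n].copy_from_slice(&self.buf[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}

/// Two independent reads that share one scratch allocation.
pub fn demo_buffer_reuse() -> (String, String) {
    let mut scratch = vec![0u8; 4]; // allocated once, reused below

    let mut first = String::new();
    BorrowedBufReader::new(&b"hello"[..], &mut scratch)
        .read_to_string(&mut first)
        .unwrap();

    let mut second = String::new();
    BorrowedBufReader::new(&b"world"[..], &mut scratch)
        .read_to_string(&mut second)
        .unwrap();

    (first, second)
}
```

The trade-off is a lifetime parameter on the reader type, which is the kind of API complication the options above try to avoid; an into_inner()-style method that returns an owned buffer is the alternative.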

Link fails with msys2/multirust

I can't get an executable to link using either back end.

With miniz, __assert_func can't be found:

error: linking with `gcc` failed: exit code: 1
note: "gcc" "-Wl,--enable-long-section-names" "-fno-use-linker-plugin" "-Wl,--nxcompat" "-nostdlib" "-m64" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\crt2.o" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\rsbegin.o" "-L" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib" "D:\\Projects\\aurum\\draw\\target\\debug\\examples\\image.0.o" "-o" "D:\\Projects\\aurum\\draw\\target\\debug\\examples\\image.exe" "-Wl,--gc-sections" "-nodefaultlibs" "-L" "D:\\Projects\\aurum\\draw\\target\\debug" "-L" "D:\\Projects\\aurum\\draw\\target\\debug\\deps" "-L" "D:\\Projects\\aurum\\draw\\target\\debug\\build\\miniz-sys-d03126dbc9ee0074\\out" "-L" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib" "-Wl,-Bstatic" "-Wl,-Bdynamic" "D:\\Projects\\aurum\\draw\\target\\debug\\libaurum_draw.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_display-c2219c7625040322.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libuser32-5b257a594df68a77.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libgdi32-69e315c17a3908b1.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libbitflags-69448112f0ca8232.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_image-635c95e01fa05f6d.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_unicode-abd69641aec3f6e4.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_color-45538d7570a142be.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_linear-e1b0e372e378fd33.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_numeric-02eca3f26ddef8a6.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\liblog-30a8a27ec161f1be.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libflate2-c4ffd8a47aabab9a.rlib" 
"D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libcrc-84833374161e9fc4.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\liblazy_static-e69e55dcc7527931.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_winutil-bd9d94ddcf37bec3.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libkernel32-72bc68efd3a36233.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libwinapi-6fc61f3c438a06cc.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libminiz_sys-d03126dbc9ee0074.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\liblibc-7f0d63b960234050.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\libstd-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\libcollections-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\librustc_unicode-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\librand-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\liballoc-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\liballoc_system-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\liblibc-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\libcore-4fda350b.rlib" "-l" "user32" "-l" "gdi32" "-l" "kernel32" "-l" "gcc_eh" "-l" "ws2_32" "-l" "userenv" "-l" "shell32" "-l" "advapi32" "-l" "compiler-rt" "-lmingwex" "-lmingw32" "-lgcc" "-lmsvcrt" "-luser32" "-lkernel32" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\rsend.o"
note: D:\Projects\aurum\draw\target\debug\deps\libminiz_sys-d03126dbc9ee0074.rlib(miniz.o): In function `tinfl_decompress':
/home/Daggerbot/.multirust/toolchains/stable/cargo/registry/src/github.com-88ac128001ac3a9a/miniz-sys-0.1.7/miniz.c:1707: undefined reference to `__assert_func'
D:\Projects\aurum\draw\target\debug\deps\libminiz_sys-d03126dbc9ee0074.rlib(miniz.o): In function `tdefl_start_dynamic_block':
/home/Daggerbot/.multirust/toolchains/stable/cargo/registry/src/github.com-88ac128001ac3a9a/miniz-sys-0.1.7/miniz.c:2024: undefined reference to `__assert_func'
/home/Daggerbot/.multirust/toolchains/stable/cargo/registry/src/github.com-88ac128001ac3a9a/miniz-sys-0.1.7/miniz.c:2026: undefined reference to `__assert_func'
/home/Daggerbot/.multirust/toolchains/stable/cargo/registry/src/github.com-88ac128001ac3a9a/miniz-sys-0.1.7/miniz.c:2027: undefined reference to `__assert_func'
/home/Daggerbot/.multirust/toolchains/stable/cargo/registry/src/github.com-88ac128001ac3a9a/miniz-sys-0.1.7/miniz.c:2030: undefined reference to `__assert_func'
D:\Projects\aurum\draw\target\debug\deps\libminiz_sys-d03126dbc9ee0074.rlib(miniz.o):/home/Daggerbot/.multirust/toolchains/stable/cargo/registry/src/github.com-88ac128001ac3a9a/miniz-sys-0.1.7/miniz.c:2031: more undefined references to `__assert_func' follow

And with zlib, the linker can't find -lz even though '/usr/lib/libz.a' exists:

error: linking with `gcc` failed: exit code: 1
note: "gcc" "-Wl,--enable-long-section-names" "-fno-use-linker-plugin" "-Wl,--nxcompat" "-nostdlib" "-m64" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\crt2.o" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\rsbegin.o" "-L" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib" "D:\\Projects\\aurum\\draw\\target\\debug\\examples\\image.0.o" "-o" "D:\\Projects\\aurum\\draw\\target\\debug\\examples\\image.exe" "-Wl,--gc-sections" "-nodefaultlibs" "-L" "D:\\Projects\\aurum\\draw\\target\\debug" "-L" "D:\\Projects\\aurum\\draw\\target\\debug\\deps" "-L" "/usr/lib" "-L" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib" "-Wl,-Bstatic" "-Wl,-Bdynamic" "D:\\Projects\\aurum\\draw\\target\\debug\\libaurum_draw.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_display-c2219c7625040322.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libuser32-5b257a594df68a77.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libgdi32-69e315c17a3908b1.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libbitflags-69448112f0ca8232.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_image-635c95e01fa05f6d.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_unicode-abd69641aec3f6e4.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_color-45538d7570a142be.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_linear-e1b0e372e378fd33.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_numeric-02eca3f26ddef8a6.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\liblog-30a8a27ec161f1be.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libflate2-c4ffd8a47aabab9a.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libcrc-84833374161e9fc4.rlib" 
"D:\\Projects\\aurum\\draw\\target\\debug\\deps\\liblazy_static-e69e55dcc7527931.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libaurum_winutil-bd9d94ddcf37bec3.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libkernel32-72bc68efd3a36233.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\libwinapi-6fc61f3c438a06cc.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\liblibz_sys-e9c5285337e33905.rlib" "D:\\Projects\\aurum\\draw\\target\\debug\\deps\\liblibc-7f0d63b960234050.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\libstd-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\libcollections-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\librustc_unicode-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\librand-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\liballoc-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\liballoc_system-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\liblibc-4fda350b.rlib" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\libcore-4fda350b.rlib" "-l" "user32" "-l" "gdi32" "-l" "kernel32" "-l" "z" "-l" "gcc_eh" "-l" "ws2_32" "-l" "userenv" "-l" "shell32" "-l" "advapi32" "-l" "compiler-rt" "-lmingwex" "-lmingw32" "-lgcc" "-lmsvcrt" "-luser32" "-lkernel32" "D:\\MSYS\\home\\Daggerbot\\.multirust\\toolchains\\stable\\lib\\rustlib\\x86_64-pc-windows-gnu\\lib\\rsend.o"
note: ld: cannot find -lz

I'm not sure whether these problems are related to flate2's build script or msys2, but I'll try using an msvc toolchain in the meantime.

Can encoders work with &mut references that implement Writer?

fn new(w: W, level: CompressionLevel) -> EncoderWriter<W> consumes w. Is that necessary? I have a situation with an &mut BufferedStream which doesn't implement Writer because it's a borrowed reference, not the thing itself. I have to hack around with a MemWriter, and then write memory.get_ref() into my &mut BufferedStream. I'd rather be able to call DeflateEncoder::new(w, CompressionLevel::Default) (or similar) with the &mut BufferedStream.
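For readers on current Rust: the standard library implements `Write` for `&mut W` whenever `W: Write`, so a constructor that takes its sink by value can still be handed a borrowed one. The sketch below uses a hypothetical `PassThrough` wrapper in place of a real encoder (it does no compression); only the ownership pattern is the point.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for an encoder: it takes its sink by value,
/// like the `new(w: W, ...)` constructor quoted above.
struct PassThrough<W: Write> {
    inner: W,
}

impl<W: Write> PassThrough<W> {
    fn new(inner: W) -> Self {
        PassThrough { inner }
    }
}

impl<W: Write> Write for PassThrough<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.inner.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Writes through a borrowed sink; the caller keeps ownership of `sink`.
fn write_borrowed<W: Write>(sink: &mut W, data: &[u8]) -> io::Result<()> {
    // `&mut W` is itself a `Write`, so it can be passed by value here.
    let mut enc = PassThrough::new(&mut *sink);
    enc.write_all(data)
}
```

Calling `write_borrowed` twice on the same `Vec<u8>` appends both payloads, since the wrapper only ever held a reborrow.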

"unknown return code: -2"

I have a write::ZlibEncoder which always panics with unknown return code: -2 on mem.rs:226. I'm using the zlib backend (zlib v1.2.11, libz-sys 1.0.13) with flate2 v0.2.17.

In zlib's header, error code -2 is Z_STREAM_ERROR which means that the stream state was inconsistent on a call to deflate(). The relevant backtrace is:

7: 0x55ecdd79bdc8 - flate2::mem::Compress::compress::h944fc62bd5cc0d94
at /.cargo/registry/src/github.com-1ecc6299db9ec823/flate2-0.2.17/src/mem.rs:226
8: 0x55ecdd79bf23 - flate2::mem::Compress::compress_vec::h8e04b6895a662a0e
at /.cargo/registry/src/github.com-1ecc6299db9ec823/flate2-0.2.17/src/mem.rs:251
9: 0x55ecdd79b705 - <flate2::mem::Compress as flate2::zio::Ops>::run_vec::h0547d0514c764c38
at /.cargo/registry/src/github.com-1ecc6299db9ec823/flate2-0.2.17/src/zio.rs:31
10: 0x55ecdd516ac0 - <flate2::zio::Writer<W, D> as std::io::Write>::write::h52ff979bdb398a3e
at /.cargo/registry/src/github.com-1ecc6299db9ec823/flate2-0.2.17/src/zio.rs:143
11: 0x55ecdd5162ab - <flate2::zlib::EncoderWriter as std::io::Write>::write::h8e5590d20a34db48
at /.cargo/registry/src/github.com-1ecc6299db9ec823/flate2-0.2.17/src/zlib.rs:102
12: 0x55ecdd47f2ef - std::io::Write::write_all::h95d0008b89f6db59
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/io/mod.rs:944
13: 0x55ecdd48033e - std::io::impls::<impl std::io::Write for &'a mut W>::write_all::hf6742d17386c5ea7
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/io/impls.rs:51

The code using ZlibEncoder is:

let mut zlib = ZlibEncoder::new(Vec::new(), Compression::Best);
serde_json::to_writer(&mut zlib, data).unwrap();
zlib.finish().unwrap()

The ZlibEncoder works flawlessly with the miniz backend.
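For reference, the numeric return codes defined in zlib.h can be mapped back to their constant names. This is a small lookup sketch; the function name `zlib_code_name` is made up for illustration and is not part of flate2.

```rust
/// Maps a zlib return code to the constant name used in zlib.h.
fn zlib_code_name(code: i32) -> &'static str {
    match code {
        0 => "Z_OK",
        1 => "Z_STREAM_END",
        2 => "Z_NEED_DICT",
        -1 => "Z_ERRNO",
        -2 => "Z_STREAM_ERROR",
        -3 => "Z_DATA_ERROR",
        -4 => "Z_MEM_ERROR",
        -5 => "Z_BUF_ERROR",
        -6 => "Z_VERSION_ERROR",
        _ => "unknown",
    }
}
```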

flate2 happily "decompresses" a corrupt gzip file

I have a corrupt gzipped file which gunzip fails to decompress (it reports a CRC error and a length error), but which flate2 happily decompresses (the result is of course corrupt).

My application requires detecting this kind of thing. I can provide a sample file on request but apparently not upload it here.
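The two checks gunzip reports correspond to the gzip trailer described in RFC 1952: a little-endian CRC-32 of the uncompressed data followed by its length modulo 2^32. Below is a minimal sketch of that validation, assuming the caller has already decompressed the payload and isolated the final 8 bytes; `check_trailer` is a hypothetical helper, not flate2's implementation.

```rust
/// Bitwise CRC-32 (reflected polynomial 0xEDB88320), as used by gzip.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0xEDB8_8320 } else { crc >> 1 };
        }
    }
    !crc
}

/// Checks the 8-byte gzip trailer against already-decompressed data.
fn check_trailer(trailer: &[u8; 8], decompressed: &[u8]) -> Result<(), &'static str> {
    let stored_crc = u32::from_le_bytes([trailer[0], trailer[1], trailer[2], trailer[3]]);
    let stored_len = u32::from_le_bytes([trailer[4], trailer[5], trailer[6], trailer[7]]);
    if stored_crc != crc32(decompressed) {
        return Err("crc error");
    }
    // ISIZE is the length modulo 2^32, so truncating with `as u32` is intended.
    if stored_len != decompressed.len() as u32 {
        return Err("length error");
    }
    Ok(())
}
```

A bitwise CRC is slow; it is used here only to keep the sketch free of dependencies.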

Documenting BufRead

I'm trying to read a stream containing several gzip files concatenated.
After some tests, I believe the functions in bufread::GzDecoder read exactly the number of bytes they need, whereas read::GzDecoder might read more. Apparently, I can call flate2::bufread::GzDecoder::new in a loop to get all my files.

However, either I missed something (and I'm sorry), or this is not too explicit in the current documentation. For example, the headers of the read and bufread modules appear to be the same, but since BufRead inherits from Read, the two modules seem redundant.
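The distinction can be illustrated without flate2. In the sketch below, newline-terminated records stand in for gzip members (an assumption made purely for illustration). A parser driven through the caller's `BufRead` consumes exactly one record, while a parser that wraps a plain `Read` in its own private buffer pulls in more than it needs and strands the excess.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Reads one newline-terminated record through the caller's own buffer.
/// `read_until` consumes through the delimiter and nothing after it.
fn next_record<R: BufRead>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut record = Vec::new();
    input.read_until(b'\n', &mut record)?;
    Ok(record)
}

/// Same task for a plain `Read`: a private buffer is needed, and whatever it
/// pulled in beyond the record is lost when that buffer is dropped.
/// Returns the record and the number of stranded bytes.
fn next_record_private_buffer<R: Read>(input: &mut R) -> io::Result<(Vec<u8>, usize)> {
    let mut buffered = BufReader::new(input);
    let mut record = Vec::new();
    buffered.read_until(b'\n', &mut record)?;
    Ok((record, buffered.buffer().len()))
}
```

With the input `b"one\ntwo\n"`, the first function leaves `two\n` available to the caller; the second returns the same record but has already drained the source.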

panic: arithmetic operation overflowed with data > 4GB

I get this traceback when unzipping data larger than 4 GB:

thread '<main>' panicked at 'arithmetic operation overflowed', ~/.cargo/registry/src/github.com-0a35038f75765ae4/flate2-0.2.9/src/crc.rs:28
stack backtrace:
   1:     0x7fa79e39b4a9 - sys::backtrace::tracing::imp::write::he18882fa84e6b00ePnt
   2:     0x7fa79e39a688 - panicking::on_panic::h495226a97f084523enx
   3:     0x7fa79e38f69e - sys_common::unwind::begin_unwind_inner::h7a4ee06c0d57e26affs
   4:     0x7fa79e38fb88 - sys_common::unwind::begin_unwind_fmt::hb590a1df1c16a800les
   5:     0x7fa79e399f01 - rust_begin_unwind
   6:     0x7fa79e3ca70f - panicking::panic_fmt::hc3363d565c1048da4HJ
   7:     0x7fa79e3c68f8 - panicking::panic::h14af70be4f3d4feaBGJ
   8:     0x7fa79e308ed5 - crc::Crc::update::h44b516d2a74907c79aa
                        at ~/.cargo/registry/src/github.com-0a35038f75765ae4/flate2-0.2.9/src/crc.rs:28
   9:     0x7fa79e096bfd - crc::CrcReader<R>.Read::read::h13621940182568327352
                        at ~/.cargo/registry/src/github.com-0a35038f75765ae4/flate2-0.2.9/src/crc.rs:47
  10:     0x7fa79e0966df - gz::DecoderReader<R>.Read::read::h5470388900186957621
                        at ~/.cargo/registry/src/github.com-0a35038f75765ae4/flate2-0.2.9/src/gz.rs:446
  11:     0x7fa79e0964c9 - io::buffered::BufReader<R>.BufRead::fill_buf::h13060170593286078394
                        at ../src/libstd/io/buffered.rs:188
  12:     0x7fa79e09606f - io::read_until::h9599120926166658730
                        at ../src/libstd/io/mod.rs:1170
  13:     0x7fa79e095ea1 - io::BufRead::read_line::closure.14338
                        at ../src/libstd/io/mod.rs:1405
  14:     0x7fa79e095c59 - io::append_to_string::h16196573818836934124
                        at ../src/libstd/io/mod.rs:309
  15:     0x7fa79e095beb - io::BufRead::read_line::h3863305996463976303
                        at ../src/libstd/io/mod.rs:1405
...

the line it refers to is this one
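The referenced line is not reproduced here, so the following is only a guess at the kind of fix involved: assuming the panic comes from a plain `+=` on a `u32` byte counter, a wrapping add avoids the debug-build overflow check. RFC 1952 defines the gzip ISIZE field as the input size modulo 2^32, so wrapping matches the format. `ByteCount` is a hypothetical type.

```rust
/// Hypothetical running byte counter for a gzip stream.
#[derive(Default)]
struct ByteCount {
    amt: u32,
}

impl ByteCount {
    fn update(&mut self, chunk_len: usize) {
        // `self.amt += chunk_len as u32` would panic in debug builds once the
        // total passes 2^32; wrapping keeps the value modulo 2^32 instead.
        self.amt = self.amt.wrapping_add(chunk_len as u32);
    }
}
```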

Optional compression

Sometimes software (minisat, for example) uses optional compression: it checks whether the provided data file is a valid gzip archive. If so, it unpacks and reads it; if not, it reads the file as is. Zlib even has gzopen/gzdopen functions that support this behaviour.

GzDecoder::new consumes the stream when gzip header parsing fails, so we are forced to reopen the file, or are in big trouble if the source is not a file. It would be nice to support optional compression in flate2, or at least to get the stream and the bytes already read back when header parsing fails.
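One workaround that needs no library support is to peek at the gzip magic bytes (0x1f 0x8b, per RFC 1952) through a `BufRead` before deciding which reader to construct. This is a sketch under that assumption; `looks_like_gzip` is a hypothetical helper.

```rust
use std::io::{self, BufRead};

const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// Returns true if the next bytes look like a gzip header.
/// Nothing is consumed, so the stream can still be read as plain data.
fn looks_like_gzip<R: BufRead>(input: &mut R) -> io::Result<bool> {
    let buf = input.fill_buf()?;
    Ok(buf.starts_with(&GZIP_MAGIC))
}
```

One caveat: `fill_buf` may legally return a single byte, in which case this reports `false` for a real gzip stream. A more careful version would keep filling until two bytes are available. A matching magic number also does not guarantee that the rest of the header is valid.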

Cargo.toml for miniz-sys should document Unlicense

The Cargo.toml file for miniz-sys lists the license as MIT/Apache-2.0; however, miniz.c itself uses the Unlicense.

Since the Unlicense puts the code in the public domain, if you intended to relicense it, you can do so; however, if you didn't intend to relicense it, you should document the original license.

multi-chunk gzip files are truncated at end of first block

'Block' gzip files (see page 10 here: http://samtools.github.io/hts-specs/SAMv1.pdf) are commonly used in bioinformatics. They are valid gzip files, but use the somewhat rare scheme of storing the data in many consecutive gzip blocks. System gunzip or zcat will yield all the data in all blocks, whereas flate2::GzDecoder will only yield data from the first block. I believe the correct behavior is to continue reading the stream to find another block until EOF.

Implement get_ref/get_mut/total_in/total_out/into_inner for gzip DecoderReader

The DEFLATE DecoderReader has a bunch of useful methods. It'd be nice to have the same methods on the gzip DecoderReader as well.

Relatedly, it would also be really nice to have traits that all the decoders and encoders here (and in your bzip2-rs repo, but not sure how easy that would be) implement, so that users could (statically or dynamically) switch out compression engines :)

Premature EOF on WARC files

When parsing part of the CommonCrawl corpus (which consists of ~1G WARC files where each record is individually compressed), flate2 will return EOF after the first chunk has been decompressed rather than continuing to read the rest of the file. Sample data:

s3://aws-publicdatasets/common-crawl/crawl-data/CC-MAIN-2016-07/segments/1454701166570.91/warc/CC-MAIN-20160205193926-00310-ip-10-236-182-209.ec2.internal.warc.gz

(Downloadable with 'aws s3 cp s3://aws-publicdatasets/common-crawl/crawl-data/CC-MAIN-2016-07/segments/1454701166570.91/warc/CC-MAIN-20160205193926-00310-ip-10-236-182-209.ec2.internal.warc.gz' if you have the AWS CLI installed. 837M, no fees.)

Sample code:

extern crate flate2;

use std::env;
use std::fs::File;
use std::io::Read;

use flate2::read::GzDecoder;

fn main() {
  let filename = env::args().nth(1).unwrap_or(
      "../CC-MAIN-20160205193926-00310-ip-10-236-182-209.ec2.internal.warc.gz"
      .to_string());
  let input_file = File::open(&filename).unwrap();
  let mut gz_decoder = GzDecoder::new(input_file).unwrap();
  // Zero-initialized buffer, reused across reads (no unsafe set_len needed).
  let mut buffer = vec![0u8; 1000000];
  loop {
    match gz_decoder.read(&mut buffer) {
        Ok(0) => {
          println!("EOF");
          break;
        },
        Ok(n) => {
          // Print only the bytes filled by this call.
          println!("Read {}", String::from_utf8_lossy(&buffer[..n]))
        },
        Err(err) => {
          // Stop on error instead of looping forever.
          println!("Error filling buffer: {}", err);
          break;
        }
    }
  }
}

Arguably the files shouldn't do this, but the WARC spec recommends record-at-a-time compression, and it's pretty common practice in the Hadoop world to operate on big files that are the concatenation of individually-gzipped records so that Hadoop can split the input without reading it. gunzip/gzcat can read it, and re-compressing it with gzip allows flate2 to as well. Given that these files exist, maybe flate2 could avoid returning EOF until the underlying stream does, instead returning the stream of decompressed bytes from the next record?

When deserializing, flate2 incorrectly declares a file corrupt

Here is the code that I wanted to run:

let mut buf : io::BufReader<File> = io::BufReader::new(try!(File::open(path)));
let mut gzbuf = try!(GzDecoder::new(buf));
let decoded = try!(serde_json::de::from_reader(&mut gzbuf));

This fails, with the error corrupt gzip stream does not have a matching checksum.

This seems to only occur when deserializing with serde_json, so maybe serde_json is also at fault, but that error is definitely wrong: there's nothing wrong with the gzip stream.

To show this, I made an entire repository around this to try and figure it out, and I still haven't quite gotten to the bottom. If you run cargo test on that repository, the test test_gz_basic2b() fails, while test_gz_basic2b() succeeds.

If this bug is more appropriate to give to serde, please let me know! And also what I should tell them, because this only occurs when using both serde_json and flate2.

Stream::read does not read as much as needed

Hello,

Cargo was unable to extract the contents of one of my packages. It turned out that tar expected to read 512 bytes but flate2 gave it only 135. That was not due to EOF, however, but because miniz did not get enough input and was therefore unable to produce as much output as was needed. It seems that flate2 does not try to fill the entire buffer even when it can.

The problem can be reproduced here.
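Independently of what flate2 does, the `Read::read` contract allows a reader to return fewer bytes than requested, so a consumer that needs a full block has to loop (or use `read_exact`). A minimal sketch of such a loop follows; `Trickle` is a hypothetical reader that hands out at most 3 bytes per call to simulate short reads.

```rust
use std::io::{self, Read};

/// Keeps calling `read` until `buf` is full or the reader reports EOF.
/// Returns the number of bytes actually read.
fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF
            Ok(n) => filled += n,
            Err(ref e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Hypothetical reader that returns at most 3 bytes per call.
struct Trickle<'a>(&'a [u8]);

impl<'a> Read for Trickle<'a> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = buf.len().min(self.0.len()).min(3);
        buf[..n].copy_from_slice(&self.0[..n]);
        self.0 = &self.0[n..];
        Ok(n)
    }
}
```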

Thank you.

Regards,
Ivan

GZDecoder fails to decode

GzDecoder fails silently (and weirdly) on a file that gzcat decodes correctly.

The file is huge (way over 4GB) so I hope this is not a limitation of the underlying implementation.

Anyway, the file can be obtained here:
http://dumps.wikimedia.org/other/wikidata/20150608.json.gz

and I just have a simple src/main.rs like this:

extern crate flate2;

fn main() {
    let mut input = ::flate2::read::GzDecoder::new(
        ::std::fs::File::open("wikidata-20150608.json.gz").unwrap()
    ).unwrap();
    ::std::io::copy(&mut input, &mut ::std::io::stdout()).unwrap();
}

with

[dependencies]
flate2 = "0.2.7"

The output shows a single line containing an opening square bracket (which is expected) but stops there instead of going on.

The file should be a huge JSON array: a header line with an opening bracket, then one valid JSON object per line followed by a comma, and a matching closing line.

Extract gzip header into a crate? Maybe other metadata?

In my zopfli implementation, it would be useful to have your gzip into_header and do_finish (footer) code, but without the actual compression logic. The C implementation of zopfli was actually kind of lazy and doesn't set the mtime or OS codes in the gzip header, so that adds weight in my mind that a crate would be useful for people who don't want to implement it themselves!

Mostly opening this to get your opinion: do you think it'd be worthwhile to extract this into a separate crate, and would you accept a PR that used that crate in this library?

Crash: "thread panicked while panicking" on drop of ZlibDecoder

I am able to reproduce this on OS X with stable Rust (rustc 1.10.0 (cfcb716cf 2016-07-03)):

thread panicked while panicking. aborting.
Process 83172 stopped
* thread #3: tid = 0xd54277, 0x000000010018df8b updater-772e65f3bf6d041d`std::panicking::rust_panic_with_hook::h5dd7da6bb3d06020 + 459, name = 'client::updater::tests::test_decryption_engine', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x000000010018df8b updater-772e65f3bf6d041d`std::panicking::rust_panic_with_hook::h5dd7da6bb3d06020 + 459
updater-772e65f3bf6d041d`std::panicking::rust_panic_with_hook::h5dd7da6bb3d06020:
->  0x10018df8b <+459>: ud2
    0x10018df8d <+461>: movq   %rax, %rbx
    0x10018df90 <+464>: movabsq $0x1d1d1d1d1d1d1d1d, %rax ; imm = 0x1D1D1D1D1D1D1D1D
    0x10018df9a <+474>: cmpq   %rax, %r15
(lldb) bt
* thread #3: tid = 0xd54277, 0x000000010018df8b updater-772e65f3bf6d041d`std::panicking::rust_panic_with_hook::h5dd7da6bb3d06020 + 459, name = 'client::updater::tests::test_decryption_engine', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x000000010018df8b updater-772e65f3bf6d041d`std::panicking::rust_panic_with_hook::h5dd7da6bb3d06020 + 459
    frame #1: 0x000000010019ec77 updater-772e65f3bf6d041d`std::panicking::begin_panic::h9bf160aee246b9f6 + 103
    frame #2: 0x000000010018ecd9 updater-772e65f3bf6d041d`std::panicking::begin_panic_fmt::haf08a9a70a097ee1 + 169
    frame #3: 0x000000010019e8d0 updater-772e65f3bf6d041d`rust_begin_unwind + 32
    frame #4: 0x00000001001c49b1 updater-772e65f3bf6d041d`core::panicking::panic_fmt::h93df64e7370b5253 + 129
    frame #5: 0x000000010001743d updater-772e65f3bf6d041d`core::result::unwrap_failed::h25b2577edc9d7a6c + 285 at macros.rs:29
    frame #6: 0x00000001000172ed updater-772e65f3bf6d041d`_$LT$std..result..Result$LT$T$C$$u20$E$GT$$GT$::unwrap::h0e00937697d3e190 + 77 at result.rs:723
    frame #7: 0x000000010001685e updater-772e65f3bf6d041d`_$LT$flate2..zio..Writer$LT$W$C$$u20$D$GT$$GT$::finish::hdf2987956f07f7b1 + 446 at zio.rs:98
    frame #8: 0x000000010001663d updater-772e65f3bf6d041d`_$LT$flate2..zio..Writer$LT$W$C$$u20$D$GT$$u20$as$u20$std..ops..Drop$GT$::drop::ha4ab2ab7643772a3 + 45 at zio.rs:183
    frame #9: 0x00000001000165d0 updater-772e65f3bf6d041d`flate2..zio..Writer$LT$$RF$$u27$static$u20$mut$u20$client..updater..UpdateIntegrityChecker$LT$$u27$static$GT$$C$$u20$flate2..Decompress$GT$::drop.25688::h5418856e2095ee2b + 64
    frame #10: 0x0000000100016586 updater-772e65f3bf6d041d`flate2..write..DecoderWriter$LT$$RF$$u27$static$u20$mut$u20$client..updater..UpdateIntegrityChecker$LT$$u27$static$GT$$GT$::drop.25685::he37ae034cd202ba5 + 6
    frame #11: 0x0000000100023399 updater-772e65f3bf6d041d`updater::client::updater::tests::test_decryption_engine::h3d45de0eb0a029e7 + 1129 at updater.rs:421
    frame #12: 0x000000010009220c updater-772e65f3bf6d041d`_$LT$F$u20$as$u20$std..boxed..FnBox$LT$A$GT$$GT$::call_box::h6f41cb4b13d7cb46 + 28
    frame #13: 0x0000000100094697 updater-772e65f3bf6d041d`std::panicking::try::call::h3736ed37d7dc244b + 439
    frame #14: 0x00000001001a126c updater-772e65f3bf6d041d`__rust_try + 12
    frame #15: 0x00000001001a1206 updater-772e65f3bf6d041d`__rust_maybe_catch_panic + 38
    frame #16: 0x0000000100094b12 updater-772e65f3bf6d041d`_$LT$F$u20$as$u20$std..boxed..FnBox$LT$A$GT$$GT$::call_box::h70b0cca3005960f6 + 594
    frame #17: 0x000000010019d8a9 updater-772e65f3bf6d041d`std::sys::thread::Thread::new::thread_start::h9e5bde00f3b3e2e2 + 57
    frame #18: 0x00007fff97bef99d libsystem_pthread.dylib`_pthread_body + 131
    frame #19: 0x00007fff97bef91a libsystem_pthread.dylib`_pthread_start + 168
    frame #20: 0x00007fff97bed351 libsystem_pthread.dylib`thread_start + 13
(lldb) up 11
frame #11: 0x0000000100023399 updater-772e65f3bf6d041d`updater::client::updater::tests::test_decryption_engine::h3d45de0eb0a029e7 + 1129 at updater.rs:421
   418                  let mut decryption_engine = UpdateDecryptionEngine::new(SMIME_PRIVKEY_PATH, &mut deflate_decoder).unwrap();
   419                  let mut cursor = Cursor::new(&update[..]);
   420                  io::copy(&mut cursor, &mut decryption_engine).unwrap();
-> 421              }
   422              integrity_checker.finish().unwrap();
   423          }
   424
(lldb) down
frame #10: 0x0000000100016586 updater-772e65f3bf6d041d`flate2..write..DecoderWriter$LT$$RF$$u27$static$u20$mut$u20$client..updater..UpdateIntegrityChecker$LT$$u27$static$GT$$GT$::drop.25685::he37ae034cd202ba5 + 6
updater-772e65f3bf6d041d`flate2..write..DecoderWriter$LT$$RF$$u27$static$u20$mut$u20$client..updater..UpdateIntegrityChecker$LT$$u27$static$GT$$GT$::drop.25685::he37ae034cd202ba5:
    0x100016586 <+6>: popq   %rax
    0x100016587 <+7>: retq
    0x100016588 <+8>: nopl   (%rax,%rax)

updater-772e65f3bf6d041d`flate2..zio..Writer$LT$$RF$$u27$static$u20$mut$u20$client..updater..UpdateIntegrityChecker$LT$$u27$static$GT$$C$$u20$flate2..Decompress$GT$::drop.25688::h5418856e2095ee2b:
    0x100016590 <+0>: subq   $0x28, %rsp
(lldb) down
frame #9: 0x00000001000165d0 updater-772e65f3bf6d041d`flate2..zio..Writer$LT$$RF$$u27$static$u20$mut$u20$client..updater..UpdateIntegrityChecker$LT$$u27$static$GT$$C$$u20$flate2..Decompress$GT$::drop.25688::h5418856e2095ee2b + 64
updater-772e65f3bf6d041d`flate2..zio..Writer$LT$$RF$$u27$static$u20$mut$u20$client..updater..UpdateIntegrityChecker$LT$$u27$static$GT$$C$$u20$flate2..Decompress$GT$::drop.25688::h5418856e2095ee2b:
    0x1000165d0 <+64>: jmp    0x1000165d2               ; <+66>
    0x1000165d2 <+66>: movq   0x10(%rsp), %rdi
    0x1000165d7 <+71>: callq  0x100017460               ; flate2..zio..Writer$LT$$RF$$u27$static$u20$mut$u20$client..updater..UpdateIntegrityChecker$LT$$u27$static$GT$$C$$u20$flate2..Decompress$GT$::drop_contents.25702::h5418856e2095ee2b
    0x1000165dc <+76>: jmp    0x1000165c1               ; <+49>
(lldb) down
frame #8: 0x000000010001663d updater-772e65f3bf6d041d`_$LT$flate2..zio..Writer$LT$W$C$$u20$D$GT$$u20$as$u20$std..ops..Drop$GT$::drop::ha4ab2ab7643772a3 + 45 at zio.rs:183
   180  impl<W: Write, D: Ops> Drop for Writer<W, D> {
   181      fn drop(&mut self) {
   182          if self.obj.is_some() {
-> 183              let _ = self.finish();
   184          }
   185      }
   186  }
(lldb) down
frame #7: 0x000000010001685e updater-772e65f3bf6d041d`_$LT$flate2..zio..Writer$LT$W$C$$u20$D$GT$$GT$::finish::hdf2987956f07f7b1 + 446 at zio.rs:98
   95               try!(self.dump());
   96
   97               let before = self.data.total_out();
-> 98               self.data.run_vec(&[], &mut self.buf, Flush::Finish).unwrap();
   99               if before == self.data.total_out() {
   100                  return Ok(())
   101              }
(lldb) down
frame #6: 0x00000001000172ed updater-772e65f3bf6d041d`_$LT$std..result..Result$LT$T$C$$u20$E$GT$$GT$::unwrap::h0e00937697d3e190 + 77 at result.rs:723
(lldb) down
frame #5: 0x000000010001743d updater-772e65f3bf6d041d`core::result::unwrap_failed::h25b2577edc9d7a6c + 285 at macros.rs:29
(lldb) down
frame #4: 0x00000001001c49b1 updater-772e65f3bf6d041d`core::panicking::panic_fmt::h93df64e7370b5253 + 129
updater-772e65f3bf6d041d`core::panicking::panic_fmt::h93df64e7370b5253:
    0x1001c49b1 <+129>: nopw   %cs:(%rax,%rax)

updater-772e65f3bf6d041d`core::fmt::num::_$LT$impl$u20$fmt..Display$u20$for$u20$u32$GT$::fmt::h4dfd3ff09c843da4:
    0x1001c49c0 <+0>:   pushq  %rbp
    0x1001c49c1 <+1>:   movq   %rsp, %rbp
    0x1001c49c4 <+4>:   subq   $0x20, %rsp
(lldb) down
frame #3: 0x000000010019e8d0 updater-772e65f3bf6d041d`rust_begin_unwind + 32
updater-772e65f3bf6d041d`_$LT$core..fmt..Write..write_fmt..Adapter$LT$$u27$a$C$$u20$T$GT$$u20$as$u20$core..fmt..Write$GT$::write_str::h1c935df4f4dabbd3:
    0x10019e8d0 <+0>: pushq  %rbp
    0x10019e8d1 <+1>: movq   %rsp, %rbp
    0x10019e8d4 <+4>: pushq  %r15
    0x10019e8d6 <+6>: pushq  %r14
(lldb) down
frame #2: 0x000000010018ecd9 updater-772e65f3bf6d041d`std::panicking::begin_panic_fmt::haf08a9a70a097ee1 + 169
updater-772e65f3bf6d041d`std::panicking::begin_panic_fmt::haf08a9a70a097ee1:
    0x10018ecd9 <+169>: movq   %rax, %rbx
    0x10018ecdc <+172>: movq   -0x50(%rbp), %rsi
    0x10018ece0 <+176>: testq  %rsi, %rsi
    0x10018ece3 <+179>: je     0x10018ed02               ; <+210>
(lldb) down
frame #1: 0x000000010019ec77 updater-772e65f3bf6d041d`std::panicking::begin_panic::h9bf160aee246b9f6 + 103
updater-772e65f3bf6d041d`std::panicking::begin_panic::h9bf160aee246b9f6:
    0x10019ec77 <+103>: movq   %rax, %r12
    0x10019ec7a <+106>: testq  %r14, %r14
    0x10019ec7d <+109>: je     0x10019ec9f               ; <+143>
    0x10019ec7f <+111>: cmpq   %rbx, %r14

I cannot share the full code around the failure at present, but here is the test that is failing:

    #[test]
    fn test_decryption_engine() {
        let update_data: Vec<u8> = (0..255).cycle().take(4 * 1024 * 1024).collect();
        let (update, signature, hash, pubkey) = generate_signed_update(&update_data[..]);
        let sha224_hash = Sha224::new_from_slice(&hash[..]);
        let mut out: Vec<u8> = Vec::new();
        {
            let mut integrity_checker = UpdateIntegrityChecker::new(&sha224_hash, &signature, &pubkey, &mut out).unwrap();
            {
                let mut deflate_decoder = ZlibDecoder::new(&mut integrity_checker);
                let mut decryption_engine = UpdateDecryptionEngine::new(SMIME_PRIVKEY_PATH, &mut deflate_decoder).unwrap();
                let mut cursor = Cursor::new(&update[..]);
                io::copy(&mut cursor, &mut decryption_engine).unwrap();
            }
            integrity_checker.finish().unwrap();
        }
        assert_eq!(out, update_data);
    }
}
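The backtrace shows an `unwrap()` inside `finish` (zio.rs:98) running from a `Drop` impl; if that drop happens while another panic is unwinding, the process aborts. A common pattern is for `finish` to return its error and for `Drop` to discard it. The sketch below uses a hypothetical `Finishing` wrapper and a deliberately failing `Failing` sink; it is not flate2's code.

```rust
use std::io::{self, Write};

/// Hypothetical writer wrapper whose `finish` can fail.
struct Finishing<W: Write> {
    inner: Option<W>,
}

impl<W: Write> Finishing<W> {
    fn new(inner: W) -> Self {
        Finishing { inner: Some(inner) }
    }

    /// Flushes and returns the inner writer, reporting errors to the caller.
    fn finish(&mut self) -> io::Result<W> {
        let mut inner = self
            .inner
            .take()
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "already finished"))?;
        inner.flush()?; // propagate instead of unwrap()
        Ok(inner)
    }
}

impl<W: Write> Drop for Finishing<W> {
    fn drop(&mut self) {
        if self.inner.is_some() {
            // Best effort only: a destructor must not panic.
            let _ = self.finish();
        }
    }
}

/// Hypothetical sink whose flush always fails.
struct Failing;

impl Write for Failing {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Err(io::Error::new(io::ErrorKind::Other, "flush failed"))
    }
}
```

Callers who care about the error call `finish` explicitly; dropping the wrapper without doing so silently ignores it.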

Use distinct Flush types for Compress::compress vs Decompress::decompress

Currently there is a single flate2::Flush type that is used by both flate2::Compress::compress and flate2::Decompress::decompress.

The methods are different in that compress supports Flush::None, Sync, Partial, Full, Finish while decompress supports Flush::None, Sync, Finish. Neither method supports Flush::Block.

It would be nicer to enforce these restrictions statically by having two different Flush types with only the variants that are allowed in each case.
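A sketch of what the split could look like, using hypothetical type names and the variant sets listed above. The integer values are the flush constants from zlib.h (Z_NO_FLUSH = 0, Z_PARTIAL_FLUSH = 1, Z_SYNC_FLUSH = 2, Z_FULL_FLUSH = 3, Z_FINISH = 4).

```rust
/// Hypothetical flush modes accepted when compressing.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FlushCompress {
    None,
    Sync,
    Partial,
    Full,
    Finish,
}

/// Hypothetical flush modes accepted when decompressing.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FlushDecompress {
    None,
    Sync,
    Finish,
}

impl FlushCompress {
    /// Value of the corresponding zlib.h constant.
    fn as_zlib(self) -> i32 {
        match self {
            FlushCompress::None => 0,
            FlushCompress::Partial => 1,
            FlushCompress::Sync => 2,
            FlushCompress::Full => 3,
            FlushCompress::Finish => 4,
        }
    }
}

impl FlushDecompress {
    /// Value of the corresponding zlib.h constant.
    fn as_zlib(self) -> i32 {
        match self {
            FlushDecompress::None => 0,
            FlushDecompress::Sync => 2,
            FlushDecompress::Finish => 4,
        }
    }
}
```

With this shape, passing `Partial` or `Full` to a decompress call is a type error rather than a runtime failure.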

Rename internal types to match the public types

The write/read/bufread modules all repeat the same type names:

pub use gz::EncoderReader as flate2::read::GzEncoder;
pub use deflate::EncoderReader as flate2::read::DeflateEncoder;

Unfortunately the rustdoc for flate2::read::GzEncoder shows the private names:

impl<R: Read> EncoderReader<R>
fn new(r: R, level: Compression) -> EncoderReader<R>

This is a rustdoc bug but even aside from that, this pattern is confusing to people browsing the code because what they see clicking through a [src] link does not match how they will be using the library. Let's try to make the real type names match the public names they are exported as.

Relevant API guideline: rust-lang/api-guidelines#5
Rustdoc bug: rust-lang/rust#41072

Add a way to reuse Readers/Writers

For my case the zlib ones.

Each reader (and, I assume, each writer) allocates a 32 KB buffer internally. It would be nice to have a way to reset an old reader so that we don't have to allocate that buffer every time.
By the looks of it mz_deflateReset exists for this case.
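The shape of such an API can be sketched with a hypothetical `Buffered` reader that stands in for a decoder: `reset` swaps in a new source and clears the read position but keeps the 32 KB allocation. A real decoder would also have to reset its compression state, which this sketch does not model.

```rust
use std::io::{self, Read};
use std::mem;

/// Hypothetical buffered reader standing in for a decoder with a 32 KB buffer.
struct Buffered<R> {
    inner: R,
    buf: Vec<u8>,
    pos: usize,
    filled: usize,
}

impl<R: Read> Buffered<R> {
    fn new(inner: R) -> Self {
        Buffered { inner, buf: vec![0; 32 * 1024], pos: 0, filled: 0 }
    }

    /// Swaps in a new source, discards buffered bytes, keeps the allocation.
    /// Returns the previous source.
    fn reset(&mut self, inner: R) -> R {
        self.pos = 0;
        self.filled = 0;
        mem::replace(&mut self.inner, inner)
    }
}

impl<R: Read> Read for Buffered<R> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        if self.pos == self.filled {
            // Internal buffer exhausted: refill it from the source.
            self.filled = self.inner.read(&mut self.buf)?;
            self.pos = 0;
        }
        let n = out.len().min(self.filled - self.pos);
        out[..n].copy_from_slice(&self.buf[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}
```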

Write usage examples

The crate-level documentation at https://docs.rs/flate2 as well as the documentation for the modules / types / traits / methods should contain examples. Currently the only two pieces of example code are in the readme.

export crc32 and crc32_combine

crc32 and crc32_combine are required for reading and writing ZIP files. It would be nice if they were exported by flate2. The alternative for implementations is to use libz-sys directly.

miniz does not seem to have a function for combining crc32 values. crc32_combine is used in parallel implementations of ZIP where files are divided into pieces and each piece is deflated separately and the deflated partial streams and partial crc32 values are combined in the end.
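To make the request concrete, here is a dependency-free sketch of both operations. The combine step follows the GF(2) matrix-squaring approach used by zlib's crc32_combine; it is an illustrative port written for this note, not code taken from flate2 or miniz, and the bitwise CRC is chosen for brevity rather than speed.

```rust
const POLY: u32 = 0xEDB8_8320;

/// Bitwise CRC-32 over `data`, continuing from a previous CRC (0 to start).
fn crc32(crc: u32, data: &[u8]) -> u32 {
    let mut c = !crc;
    for &byte in data {
        c ^= u32::from(byte);
        for _ in 0..8 {
            c = if c & 1 != 0 { (c >> 1) ^ POLY } else { c >> 1 };
        }
    }
    !c
}

/// Multiplies a 32x32 GF(2) matrix (one u32 per column) by a vector.
fn gf2_times(mat: &[u32; 32], mut vec: u32) -> u32 {
    let mut sum = 0;
    let mut i = 0;
    while vec != 0 {
        if vec & 1 != 0 {
            sum ^= mat[i];
        }
        vec >>= 1;
        i += 1;
    }
    sum
}

/// Squares a GF(2) matrix.
fn gf2_square(mat: &[u32; 32]) -> [u32; 32] {
    let mut sq = [0u32; 32];
    for n in 0..32 {
        sq[n] = gf2_times(mat, mat[n]);
    }
    sq
}

/// Given crc(A), crc(B) and len(B), returns crc(A followed by B).
fn crc32_combine(mut crc1: u32, crc2: u32, mut len2: u64) -> u32 {
    if len2 == 0 {
        return crc1;
    }
    // Operator that appends a single zero bit to the message.
    let mut odd = [0u32; 32];
    odd[0] = POLY;
    for n in 1..32 {
        odd[n] = 1u32 << (n - 1);
    }
    let mut even = gf2_square(&odd); // two zero bits
    odd = gf2_square(&even); // four zero bits
    // Apply len2 zero bytes to crc1, squaring the operator for each bit of len2.
    loop {
        even = gf2_square(&odd);
        if len2 & 1 != 0 {
            crc1 = gf2_times(&even, crc1);
        }
        len2 >>= 1;
        if len2 == 0 {
            break;
        }
        odd = gf2_square(&even);
        if len2 & 1 != 0 {
            crc1 = gf2_times(&odd, crc1);
        }
        len2 >>= 1;
        if len2 == 0 {
            break;
        }
    }
    crc1 ^ crc2
}
```

In the parallel ZIP scenario described above, each worker would compute `crc32(0, piece)` for its piece, and the results would be folded left to right with `crc32_combine`.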

GzEncoder hangs on `flush`.

Consider the following example

extern crate flate2;

use std::fs::File;
use std::io::Write;

fn main() {
    let mut f = flate2::write::GzEncoder::new(File::create("test.txt.gz").unwrap(), flate2::Compression::Default);
    write!(f, "Hello world").unwrap();
    f.flush().unwrap();
}

This test program hangs at the last line f.flush().unwrap(). It seems as if flush never returns.

My system: rust 1.14, miniz-sys 0.1.7, flate2-rs 0.2.14

invalid gzip header

Hi @alexcrichton and thank you for your work.

I have tried your library with the standard 2to3-2.6.1.gz file, but I got this error:

$ rustc -V
rustc 1.0.0-beta (9854143cb 2015-04-02) (built 2015-04-02)
$ cargo run
thread '<main>' panicked at 'called `Result::unwrap()` on an `Err` value: Error { repr: Custom(Custom { kind: InvalidInput, error: StringError("invalid gzip header") }) }', /home/rustbuild/src/rust-buildbot/slave/beta-dist-rustc-linux/build/src/libcore/result.rs:774
An unknown error occurred

Consequently, here is my test main.rs:

extern crate flate2;

use std::io::prelude::*;
use flate2::read::GzDecoder;

fn main() {
    let mut d = GzDecoder::new("/usr/share/man/man1/2to3-2.6.1.gz".as_bytes()).unwrap();
    let mut s = String::new();

    d.read_to_string(&mut s).unwrap();
    println!("{}", s);
}

Do you want more information?
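One thing worth checking in the snippet above: `"/usr/share/man/man1/2to3-2.6.1.gz".as_bytes()` yields the bytes of the path string itself, not the contents of the file at that path, so a decoder given those bytes sees `/usr/...` where it expects a gzip header. A minimal sketch of opening the file instead, using only the standard library; `open_for_decoding` is a hypothetical helper whose result would then be handed to the decoder.

```rust
use std::fs::File;
use std::io::{self, BufReader};
use std::path::Path;

/// Opens the file at `path` and returns a buffered reader over its contents,
/// which is the kind of input a decoder expects.
fn open_for_decoding<P: AsRef<Path>>(path: P) -> io::Result<BufReader<File>> {
    Ok(BufReader::new(File::open(path)?))
}
```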
