Giter Site home page Giter Site logo

gildor2 / fast_zlib Goto Github PK

View Code? Open in Web Editor NEW
132.0 21.0 24.0 201 KB

Heavily optimized zlib compression algorithm

Home Page: http://www.gildor.org/en/projects/zlib

License: BSD 3-Clause "New" or "Revised" License

Assembly 14.43% C++ 8.10% Shell 5.19% C 72.28%
zlib compression algorithm optimization deflate c x86-assembly cross-plattform

fast_zlib's Introduction

Optimized version of longest_match for zlib

Summary

Fast zlib longest_match function. Produces slightly smaller compressed files for significantly faster time. For more details and updates please visit this page. Brief algorithm description is available on the home page of the project.

Source code repository is located on Github. Wiki page contains some additional notes and test results.

You may get precompiled binaries for Windows platform at this location

Build instructions

This library consists of 2 versions: C and assembly.

Building C version

You should replace longest_match() function in deflate.c with one located in Sources/match.h. One of possible ways is to making a file with contents looking like

#define ASMV
#include "deflate.c"

#undef local
#define local

#include "match.h"

void match_init()
{
}

and linking zlib with this file instead of deflate.c.

Zlib 1.2.13 or newer

Note: since zlib 1.2.13 (October 2022), ASMV option has been removed from zlib source, therefore there's no possibility to replace longest_match function without patching the original source code. So, you'll need to apply patch to the zlib source code in order to make things buildable. You may find the patch in Sources/zlib_1.2.13.patch.

Building a 32-bit assembly version

This version is compatible only with x86 platform. To build a matcher, please start with obtaining the copy of the Netwide Assembler

For use with the Microsoft Visual Studio compile it with the following command line:

nasm match32.asm -f win32 -o match.obj

For the Watcom C use this:

nasm match32.asm -f obj -o match.obj -D OMF_FORMAT

For Linux use this command line:

nasm match32.asm -f elf32 -o match.o

Running tests

The library goes with built-in test application. Some information about its features could be found on wiki page.

On Unix platforms (those which has bash and perl out-of-the-box) you may simply run build.sh script located at the root directory of the project. This will build several versions of the test application and put them to obj/bin. There's a script which will run tests in automated mode, test.sh. Run test.sh --help to see available options. By default it will run several predefined tests for data locations from my own PC. If you'll need to change locations, just modify several bottom lined in the script (see DoTests <directory>).

For Windows platform, you'll need bash and perl to be installed and available via PATH environment variable. This could be achieved by installing Gygwin or MSYS projects to your computer. You may also get a set of required binaries here (you'll need BuildTools.zip).

License

This library is licensed under the BSD license.

fast_zlib's People

Contributors

gildor2 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

fast_zlib's Issues

Recommended compiling flags

Hey, @gildor2 thanks for the awesome code. I'm using zlib(boost iostreams) in a small game server to compress(deflate) the game packets and i want to know if this flags("CFLAGS="-march=native -O2" ./configure") are good enough for optimize zlib together with your C code.

[question] Description of the algorithm

@gildor2

Hi Konstantin,

I have tested your fast-longest-match with the data set I need compress, and fast-longest-match does outperforms the original zlib in terms of speed. I want to understand how it works and so I read your brief description of the algorithm and checked the source code. Now I still have a few questions about your algorithm, could you help me better understand it?

When algorithm finds a match (intermediate one), it computes hashes for all matching bytes, and selects a different hash chain - that one which points farther from current position. ...

Q1: Do you mean when finding a match, the algorithm computes hashes for every 3 bytes in the matching string? Then I think when these hash values are inserted into the hash table head[] (or the prev[] if head[x] already exists, they will fall into different positions. Is my understanding correct?

Q2: Regarding to "select a different hash chain - that one which points farther from current position. If we have 2 hash chains, one points at distance 10, and another one at distance 1000, it is not worth checking distance 10 because other bytes will not match ..."

  • What do you mean when you refer to "hash chain"? Do you mean the elements in prev[], eg. checking prev[1000] betters checking prev[100]?
  • When you say "it not worth checking distance 10 because other bytes will not match", my understanding is like the following case:
current position 1100, string: abcdabcd......
position 1000, string: abcdabcd......
position 100, string: abcdabcd.......

Here we can check the string starting on position 100, because the possible match length is 1000+. If checking position 1000, the possible match length is 100+.
Is this what you mean?

Q3: To this piece of code:

cur_match -= offset;
offset = 0;
next_pos = cur_match;
for (i = 0; i <= len - MIN_MATCH; i++) {
    pos = prev[(cur_match + i) & wmask];
    if (pos < next_pos) {
        /* this hash chain is more distant, use it */
        next_pos = pos;
        offset = i;
    } 
}
/* Switch cur_match to next_pos chain */
cur_match = next_pos;

I think this piece of code implements the magic "jumping to the hash chain which points to farther distance". I don't quite understand what it tries to do though. Looks to me it does something like this:

when a matching string "abcdefg" at position 1000 is found, try to find a longest match, such as "abcdefg123456"
- if prev[1000] exists, record prev[1000] as p1 (a farther position with the same match, using hash value of "abc")
- if prev[1001] ("bcd") exists, record prev[1001] as p2
- if p2 < p1, record next_pos = p2
- do similar thing for prev[1002] ("cde"), prev[1003] ("def"), prev[1004] ("efg"), find the *farthest* next_pos 

Is this process the one you mentioned as "jumping to the hash chain which points to farther distance"? What's the fundamental difference between this one and the zlib behavior, which is keep checking p = prev[1000], p1 = prev[p], p2 = prev[p1]?

My questions might be too long, but I really want to understand this algorithm better, and look forward to getting some feedback from you.

Thank you.

Euccas

Неожиданный Z_STREAM_ERROR

Привет, не знаю куда копать дальше...
Написал managed-обертку над zlib (очень нужна скорость в одном месте). С версией 1.2.3 c сайта WinImage работает. С твоими dll (что с cdecl что c WinApi) работает но "неустойчиво".

Я написал цикл 100 раз чтобы скорость померить, вот где-то на 12 шаге все ломается...
Т.е. изначально несколько раз полный цикл
deflateInit2_ -> deflate(Z_NO_FLUSH)->deflate(Z_FINISH)->deflateEnd() отрабатывает...
а иногда (итерации на 12 например) выбрасывает Z_STREAM_ERROR (-2)
чаще всего на deflate(Z_FINISH) но бывает и на deflateEnd и на deflate(Z_NO_FLUSH)

Может быть сборочку сделаешь чтобы диагностику в консольку писало что этой скотине не нравится? или могу свою тестовую прогу дать...

64 bit ASM

Hi Konstantin,

your work is a great improvement to compression.
Can you port the match32.asm to 64 bit?

Yours Sincerly
Ed

How to use fast_zlib in Mac OS

it's very usefull this lib, and your blog articles.

i want use this lib in mac os,but i don't konw how.

did you have the solutaion? very thanks.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.