Comments (8)
It's optimized by default. Please give details. What CPU? How did you measure the speed, exactly? What input sizes did you test? Is there any way that I could reproduce your results? Are you sure it was a controlled test?
from libdeflate.
I first built it with MSVC, but the optimized functions do not seem to be included in the resulting library, so that build is not optimized. When building with GCC, on the other hand, I confirmed that the optimized functions are included without any special handling.
As for how I measured: I used lodepng, an open-source PNG library, and replaced its DEFLATE decompression code with libdeflate. Decoding images then takes about half the time of the original lodepng, but about the same time as libpng.
from libdeflate.
Yes, due to limited resources to develop and test with MSVC, libdeflate is only properly optimized when built with gcc or clang.
Anyway, if I understand you correctly, your results were:
- lodepng with its own DEFLATE implementation is slow.
- lodepng with libdeflate is fast.
- libpng with zlib is fast.
I don't see where you actually compared libdeflate to zlib directly. It could be that libpng is faster for other reasons. Can you please clarify whether you actually did a controlled test that compared libdeflate to zlib?
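A controlled comparison of the kind asked about here could be sketched roughly as follows: every candidate decompresses the same compressed buffer on the same machine, only the decompression call is timed, the best of several runs is kept, and each output is checked against the original. This is only an illustration, not the benchmark used in this thread. The names `best_time`, `streaming_inflate`, `compare`, and `SAMPLE` are hypothetical, and because no libdeflate binding is assumed to be installed, the second candidate is just zlib's streaming API standing in for it.

```python
import time
import zlib


def best_time(decompress, compressed, repeats=5):
    """Return (best wall-clock seconds, output) over several runs of one decompressor."""
    best = float("inf")
    output = b""
    for _ in range(repeats):
        start = time.perf_counter()
        output = decompress(compressed)
        best = min(best, time.perf_counter() - start)
    return best, output


def streaming_inflate(compressed):
    """Stand-in second candidate: zlib's streaming API (this is NOT libdeflate)."""
    d = zlib.decompressobj()
    return d.decompress(compressed) + d.flush()


def compare(candidates, original, level=6):
    """Time every candidate on the same compressed buffer and verify its output."""
    compressed = zlib.compress(original, level)
    results = {}
    for name, fn in candidates.items():
        seconds, output = best_time(fn, compressed)
        if output != original:
            raise ValueError(f"{name} produced wrong output")
        # Throughput in MB/s of uncompressed data; guard against a zero reading.
        results[name] = len(original) / max(seconds, 1e-9) / 1e6
    return results


# Hypothetical test input: repetitive text, on the order of 1 MB uncompressed.
SAMPLE = b"libdeflate versus zlib, same input, same machine. " * 26000

if __name__ == "__main__":
    candidates = {"zlib one-shot": zlib.decompress, "zlib streaming": streaming_inflate}
    for name, mbps in compare(candidates, SAMPLE).items():
        print(f"{name}: {mbps:.0f} MB/s")
```

To compare libdeflate against zlib directly, one entry in `candidates` would be replaced by a callable that wraps libdeflate; everything else (input, timing, verification) stays identical, which is what makes the test controlled.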
from libdeflate.
If you are compiling with GCC for Windows x86 (and not x64), you might also want to add -msse2, unless you specifically need to support old machines. Compiling for x86 with SSE2 enabled has been the default in MSVC since 2012.
It is unlikely to affect the decompression speed, since that is handled by inspecting CPU features, but it does appear to speed up compression (as a quick test, the included benchmark tool reports 63 MB/s with the default x86 options, and 76 MB/s with -msse2 for compression level 1 on silesia.tar).
from libdeflate.
Correct, I haven't made the matchfinder optimizations detect CPU features at runtime yet. So adding -msse2 (for x86) or -mavx2 (for x86 and x64) will help a bit, if you know the code will only be run on a CPU with those features. However, that only affects compression, whereas the original question here was about decompression.
from libdeflate.
Hi,
After profiling lodepng since last time, I noticed that lodepng's default adler32 and unfilter processing are slow (the PNG images use filters). After replacing the zlib part (inflate) with libdeflate and replacing the unfilter step with the one from libpng, I confirmed that images can be decoded 1.35 times faster than with stock libpng.
To speed it up further, lodepng itself would need to be modified (I found some unnecessary processing).
I also tried adding -O3, -Ofast, -mtune, etc. when building libdeflate with MSYS2's GCC, but it made no difference in performance.
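For readers unfamiliar with the two steps mentioned above, here is a minimal reference sketch of what they compute. It is not lodepng's or libpng's actual code: it is a plain, unoptimized version of the Adler-32 checksum (which closes every zlib stream) and of reversing PNG filter type 1 (Sub) on one scanline. Both are byte-by-byte passes over the whole decoded image, which is why a slow implementation shows up in a profile. The function names and the `bpp` parameter (bytes per pixel) are chosen here for illustration only.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16


def adler32(data: bytes) -> int:
    """Straightforward Adler-32: two running sums, both taken modulo 65521."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a


def unfilter_sub(filtered: bytes, bpp: int) -> bytes:
    """Undo PNG filter type 1 (Sub) for one scanline; bpp = bytes per pixel."""
    recon = bytearray(filtered)
    for i in range(bpp, len(recon)):
        # Each byte was stored as its difference from the byte one pixel to the left.
        recon[i] = (recon[i] + recon[i - bpp]) & 0xFF
    return bytes(recon)


def filter_sub(raw: bytes, bpp: int) -> bytes:
    """Apply the Sub filter (encoder side); used here only to build test data."""
    return bytes(
        (raw[i] - (raw[i - bpp] if i >= bpp else 0)) & 0xFF for i in range(len(raw))
    )


if __name__ == "__main__":
    # The reference checksum should agree with zlib's own implementation.
    print(adler32(b"Wikipedia") == zlib.adler32(b"Wikipedia"))
    # Filtering and then unfiltering a scanline should give back the original.
    row = bytes(range(0, 240, 10))
    print(unfilter_sub(filter_sub(row, 3), 3) == row)
```

Optimized libraries typically speed these loops up by deferring the modulo operations and by using SIMD instructions, but the results they produce are the same as in this sketch.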
from libdeflate.
Okay, so you're saying that libdeflate is faster after all, in a proper comparison?
Is there any remaining issue here?
from libdeflate.
No.
Thank you very much for solving my first question :)
from libdeflate.