Comments (16)
The fix is in the v1.1 release now.
from libdeflate.
Are you using the cdecl calling convention? See this section of the README file:
Windows developers: note that the calling convention of libdeflate.dll is cdecl.
This differs from the Windows API, which uses stdcall. If you call into
libdeflate.dll using a non-C/C++ language, or dynamically using LoadLibrary(),
make sure to use the cdecl calling convention. Using the wrong calling
convention may crash your application.
from libdeflate.
Yes, I am.
The signatures are identical for deflate and gzip. Since deflate (and 64-bit gzip) does work I don't think that's the issue.
See the x64 output of the validation / benchmark program in x64.txt
The program compresses, decompresses and validates, then it compresses with one encoder and decompresses with another and validates.
It does so with different compression levels (that loosely maps to the .NET levels).
x86.txt contains the output for deflate only, since it works.
interop.txt is the C# interop class (as you can see the signatures are identical between deflate and gzip besides the name):
I've tried to increase output buffers, but that didn't work.
Something I should try is to increase the size of the input buffer, to check if the code reads past the end.
I'll get back with results for these tests.
I'm really just interested in the X64 case anyway (with a fallback to the "slow" version on X86), but this COULD be a buffer overrun issues which could be used for exploits etc, hence I wanted to report / get to the bottom with this :)
from libdeflate.
After testing some more, I'm pretty sure that it's reading past (or before) the input buffer.
If I allocate a bigger data segment and compresses from that it works fine.,.
If I had the C++ version up and running I would have allocated 3 continuous memory pages (VirtualAlloc as readonly). Marked page 0 and 2 as fully protected (VirtualProtect), then compress page 1 and see what happens.
But I don't have any more time to spend on this issue and again since X86 isn't my concern I'll leave it at this... I hope you find the cause :)
from libdeflate.
It works for me. I tested gzip compression and decompression with the 32-bit Windows DLL using guard pages like you suggested, and all compression levels and CRC32 implementations. Previously, libdeflate has also been tested using valgrind, UBSAN, and afl-fuzz.
There's probably a bug in your C# code, such as providing the wrong buffer size. Don't assume there is no bug just because the 64-bit version seems to work.
from libdeflate.
@dosh1974 any update? If not I will close this, since as far as I can tell this isn't a libdeflate problem...
from libdeflate.
I didn't dig much deeper, but I'm still not 100% convinced that there isn't something fishy going on (like reading past input buffer in some cases, like odd input size or so).
But, I'm not using 32 bit so I don't really care so please close if you wish.
from libdeflate.
Well now you had me write a repro case..
I've attached a Visual Studio 2017 project with a case showing the issue.
LibDeflateTest.zip
It seems like it's data dependent, so my guess is it depends on what part of the compressor reads the final bytes.
Edit: Btw I get the same issue in debug / release and running with / without debugger attached.
Next step would be to include the source files instead of using the DLL's.
from libdeflate.
Including and using the latest source (instead of using the pre-built DLL for windows), seems to work, leaving a couple of options.
- Something has been fixed (changed) since the v1.0 release.
- The compiler used for building the v1.0 release binaries is doing some bad codegen (optimizations?).
I don't have any compiler but VS installed atm and I guess you're using some very specific, fine tuned, compiler setup to maximize performance, so I let it rest for now :)
from libdeflate.
Unfortunately I cannot reproduce it even with your test program, although I'm compiling it with a MinGW toolchain rather than Visual Studio. Can you send me your compiled LibDeflateTest.exe
file which demonstrates the crash? Also, what type of CPU are you running this on?
Also, I don't think this is related to the crash, but I noticed the VS project file for your test program actually uses the static library version of libdeflate, not the DLL version. Coming mostly from a non-Windows background, I seem to have neglected to include an import library in the Windows binary releases: in v1.0 libdeflate.lib
is actually a real static library, not an import library. I'll fix this in the next release by including the import library as libdeflate.lib
and the static library as libdeflatestatic.lib
.
from libdeflate.
Here's the error in VS.
The address is tries to read is -1, in __libdeflate_aligned_free, this suggests that my initial assumption of reading before / past the input buffer is wrong.
See screenshot:
Further I noticed that sometimes it does work!
That suggests to me that there might be an uninitialized memory/variable problem since I don't expect a compressor to use any randomness in it's process and I haven't seen any indication of it being multi-threaded (should be 100% deterministic IMO).
See:
It makes sense that the issue persist in both the statically linked (C++) and the DLL (my initial issue, from C#) if it's the same compiler.
Odd that I can't reproduce when I build everything using VS and you can't reproduce when building everything using MinGW.
This suggests that there are some interfacing issues, but since the deflate works 100% using identical signatures etc it doesn't seem very likely.
I've also attached (debug / release) binaries.
LibDeflateTestBinaries.zip
Edit: The function name: aligned free, does it make any assumptions on memory alignment given by the OS?
This could explain why VS / MinGW give different results (suppose the .exe stub generated by MinGW gives out 8 byte aligned memory, vs .NET and VS C++ stub gives out a different alignment?).
from libdeflate.
As for my CPU, it's a modest I5:
from libdeflate.
Figured out why building from source on VS works, it not using any of the SSE / AVX optimized versions... made a half arsed attempt of getting it working in VS but gave up... but at least that's probably why it worked.
from libdeflate.
Digging around a bit, I think this may be caused by an unaligned store to a temporary in crc32_pclmul_avx()
(makes sense, since computing crc32 is pretty much the only difference between gzip and deflate). Visual Studio misinterprets the debug info and claims the error is in libdeflate_aligned_free()
, but the example never calls the free function.
01320D40 lea ebx,[edx+40h]
01320D43 vmovdqa xmm2,xmmword ptr [edx+10h]
01320D48 cmp edi,ebx
01320D4A vmovdqa xmm4,xmmword ptr [edx+20h]
01320D4F vmovdqa xmm5,xmmword ptr [edx+30h]
01320D54 je __libdeflate_aligned_free+32Dh (01320DCDh)
01320D56 mov eax,ebx
01320D58 vmovdqa xmm6,xmmword ptr ds:[13BAC20h]
01320D60 vpclmulqdq xmm1,xmm4,xmm6,0
01320D66 add eax,40h
01320D69 vpclmulqdq xmm4,xmm4,xmm6,11h
01320D6F vpclmulqdq xmm7,xmm0,xmm6,0
01320D75 vmovaps xmmword ptr [esp],xmm1 ; <-- esp is 00EFF8DC
The problem might be that GCC assumes the stack is 16-byte aligned on entry, but x86 MSVC does not. So when you call the function in libdeflate.lib (which is compiled with GCC 7.3.0) from x86 MSVC this can happen.
from libdeflate.
A little further investigation suggests the problem is that GCC does not add stack alignment code when you use the target attribute alone. If you compile with the proper flags, or add the force_align_arg_pointer
attribute, it does.
Small example:
#include <wmmintrin.h>
__attribute__((target("sse2")))
int foo(__m128i a)
{
__m128i n = _mm_srli_si128(a, 8);
return *(unsigned char *) &n;
}
compiles with gcc -O2 -m32 -S -c foo.c
to
_foo:
movdqa %xmm0, %xmm1
subl $44, %esp
psrldq $8, %xmm1
movaps %xmm1, (%esp)
movzbl (%esp), %eax
addl $44, %esp
ret
which will crash if the stack is not aligned on entry. Whereas if you add the force_align_arg_pointer
attribute (or compile with -msse2
or -mincoming-stack-boundary=2
), you get
_foo:
pushl %ebp
movdqa %xmm0, %xmm1
psrldq $8, %xmm1
movl %esp, %ebp
andl $-16, %esp
subl $32, %esp
movaps %xmm1, (%esp)
movzbl (%esp), %eax
leave
ret
from libdeflate.
Thanks for the extra information @dosh1974, and thanks for the detailed analysis @jibsen! It seems that's exactly what the problem was. gcc assumes the stack is 16-byte aligned and can take advantage of that to generate aligned SSE or AVX load/stores to the stack. But actually MSVC only guarantees 4-byte alignment. I've fixed this by adding __attribute__((force_align_arg_pointer))
to libdeflate's API entry points when it's compiled for 32-bit Windows with gcc.
I published a prerelease build of the fixed Windows binaries to https://github.com/ebiggers/libdeflate/releases/tag/v1.0-8-gd6d50c6. Please verify that they work for you now -- thanks!
from libdeflate.
Related Issues (20)
- ESA matchfinder HOT 4
- crc32 is twice as fast in zlib-ng HOT 19
- how to change sliding window size at runtime? HOT 1
- arm_acle.h compiler errors when building for iOS with Xcode 15.3 HOT 4
- analyzer warning regression: The left operand of '+' is a garbage value due to array index out of bounds HOT 1
- Unaligned access on PowerPC: forgotten macros in `common_defs.h` HOT 11
- 32-bit build error when using cctools 949.0.1 with assembler that doesn't support AVX instructions HOT 7
- VNNI support seems to require something later than GCC 11.1 HOT 3
- error: inlining failed in call to 'always_inline' in some builds HOT 1
- BUILD: target specific option mismatch HOT 3
- 1.20 does not compile HOT 2
- compression & decompression usage from browser to libdeflate
- Static builds and exported symbols HOT 2
- Compilation on aarch64-linux requires GCC 7.1
- libdeflate_crc32 performance when buffer is 15 bytes or fewer HOT 8
- benchmark against another zlib fork HOT 1
- ARM feature issues on GCC 13.3.1? HOT 7
- Compilation Error Despite Disabling AVX Instructions HOT 3
- build with GCC13 on NetBSD failed due to Error: unsupported instruction `vpdpbusd' HOT 6
- Building error in ARM v8 system HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from libdeflate.