animetosho / rapidyenc
SIMD accelerated yEnc en/decode C library
I saw some code changes here and was wondering if you think it's worth updating the code in sabctools with these updates?
I'm having a couple of issues on macOS Sonoma 14.0 on an M2 that I'm hoping you can help with.
Line 113 in 78d71c4
In file included from /Users/mnightingale/personal/workspace/rapidyenc/rapidyenc/src/platform.cc:11:
In file included from /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/sys/sysctl.h:83:
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.0.sdk/usr/include/sys/ucred.h:101:2: error: unknown type name 'u_int'
u_int cr_version; /* structure layout version */
...
(snip same for a few other files)
Removing the line seems to resolve it; at least it builds, but I don't know whether there are any implications.
Since updating to Xcode 15 (Apple clang version 15.0.0 (clang-1500.0.40.1)), I get a crash when calling rapidyenc_decode_incremental from Go.
SIGILL: illegal instruction
PC=0x103cc5000 m=0 sigcode=2
signal arrived during cgo execution
instruction bytes: 0x53 0xd9 0x3 0x4f 0x34 0xe7 0x1 0x4f 0x95 0xe4 0x0 0x4f 0x16 0xe4 0x2 0x4f
I'm not sure, but I don't think those bytes are valid ARM instructions?
Taking a look at the dylib in Ghidra:
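One way to check this is to regroup the reported bytes into 32-bit little-endian words, since AArch64 instructions are fixed-width and stored that way, and then look the words up in a disassembler. The sketch below only does the regrouping; the helper name toWords is hypothetical, and it does not decide whether the words are valid instructions.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// toWords (hypothetical helper) regroups raw bytes into 32-bit
// little-endian words, the in-memory layout of AArch64 instructions.
// Trailing bytes that do not fill a whole word are ignored.
func toWords(b []byte) []uint32 {
	words := make([]uint32, 0, len(b)/4)
	for i := 0; i+4 <= len(b); i += 4 {
		words = append(words, binary.LittleEndian.Uint32(b[i:i+4]))
	}
	return words
}

func main() {
	// Bytes copied from the "instruction bytes" line of the crash report.
	raw := []byte{
		0x53, 0xd9, 0x03, 0x4f, 0x34, 0xe7, 0x01, 0x4f,
		0x95, 0xe4, 0x00, 0x4f, 0x16, 0xe4, 0x02, 0x4f,
	}
	for i, w := range toWords(raw) {
		fmt.Printf("PC+0x%02x: 0x%08x\n", i*4, w)
	}
}
```

Each printed word can then be pasted into a disassembler to see whether it decodes as code or looks more like constant data.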
Error | Bad Instruction | Unable to resolve constructor at 00005000 (flow from 00004ffc) | 00005000 | | ?? 53h S
There is a chunk from 0x5000 - 0x56e0 which hasn't been decompiled.
Looking at the disassembled code preceding it, it is the function do_decode_simd<>(uchar **param_1,uchar **param_2,ulong param_3,YencDecoderState *param_4)
If I comment out the neon64 decode it works correctly, no bad instructions.
#if(IS_ARM64)
# set(DECODER_NEON_FILE decoder_neon64.cc)
#else()
set(DECODER_NEON_FILE decoder_neon.cc)
#endif()
At this point I'm kind of lost. I thought maybe there's a missing compiler flag, or maybe something has changed with Apple's new compiler.
Thanks,
Mike
I've figured out that to use a static build of the library with Go on Windows I must build it with mingw-w64; builds from MSVC are not supported.
So I've set up MSYS2 and am building with gcc; however, I get a load of similar errors such as:
...
[ 3%] Building CXX object CMakeFiles/rapidyenc.dir/src/platform.cc.obj
In file included from C:\[snip]\rapidyenc\src\common.h:123,
from C:\[snip]\rapidyenc\src\platform.cc:1:
C:/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/13.2.0/include/tmmintrin.h: In function '__m128i _mm_hadd_epi16(__m128i, __m128i)':
C:/msys64/mingw64/lib/gcc/x86_64-w64-mingw32/13.2.0/include/tmmintrin.h:42:20: error: '__builtin_ia32_phaddw128' was not declared in this scope; did you mean '__builtin_ia32_paddw128'?
42 | return (__m128i) __builtin_ia32_phaddw128 ((__v8hi)__X, (__v8hi)__Y);
...
From what I can gather, GCC isn't letting it use intrinsics without the relevant compiler flags (-march=native, -msse4, etc.).
If I pass -DCMAKE_CXX_FLAGS=-msse4 I can build it, but I don't know if that's any better than -march=native.
If I understand correctly then the problem is that common.h is getting included in files which don't actually need intrinsics?
I've no idea why I've successfully made Linux x64 builds with gcc but on Windows it complains.
I'm attempting to build with:
Install msys2-x86_64-20231026.exe and add C:\msys64\mingw64\bin to PATH
pacman -S mingw-w64-x86_64-gcc mingw-w64-x86_64-make
cmake -S rapidyenc -B rapidyenc/build -G "MinGW Makefiles"
cmake --build rapidyenc/build --config Release --target rapidyenc_static
Thanks for looking, this should be the last platform giving me problems.
I think there might be a bug with decoding on linux/arm64.
I'm testing on a Raspberry Pi 4.
In my Go module I have a test which just throws random data at encode followed by decode and I'm seeing failures fairly regularly.
I've tried compiling with GCC 13.2.0 and GCC 11.4.0.
I don't see the same failure on macOS arm, Windows x64 or Linux x64.
The files below are available in rapidyenc-arm-issue.zip.
These are reproducible steps which are just using the rapidyenc_cli tool to rule out any of my own Go issues.
$ echo -e -n '\xB5\x3F\xE1\x20\xD1\x54\xA3\x5F\xC3\x2A\xFF\xF3\x25\x79\x45\x41\x3C\x55\xB8\x65\x8E\x31\x93\x06\xF5\x3C\x00\xC5\xF1\xB8\xD4\xC3\xC0\x7E\x64\xC4\x91\x84\x48\xF6\x38\xE2\x58\x94\x3E\xF4\x5F\xE9\xC4\x85\x03\x5F\xA8\x24\x92\xF5\x4F\x7E\x80\x62\xEB\x9D\xEC\xEA\x49\x49\xE5\x9A\x04\xC0\xDA\x7F\x7C\x9E\xF1\x3E\x62\x14\xA6\x40\xC6\x5C\xBE\xA9\xFE\x47\xF5\xAB\x3F\x5A\x0A\x9C\x1D\xCA\x61\x99\x2B\x8A\xAC\xA1\x20\x73\xDD\x4D\x36\x2F\x9F\xBE\x4E\x1A\xCC\x32\xB3\x9B\x86\xB8\x62\x39\xE6\x45\x38\x1C\x43\x04\xFE\x91\x56\x5C' > raw.dat
$ ./rapidyenc/build/rapidyenc_cli e < raw.dat > raw.yenc
Computed CRC32: b1c30d10
$ ./rapidyenc/build/rapidyenc_cli d < raw.yenc > decoded.dat
End of input reached
Computed CRC32: 90d9ecb8
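The two CRC32 values differ, so the decoded output does not match the input. To confirm this independently of the CLI, the checksums of raw.dat and decoded.dat can be computed directly. The sketch below assumes the CLI reports the standard IEEE CRC-32 of the unencoded data (the variant yEnc uses); crcHex is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"os"
)

// crcHex (hypothetical helper) returns the IEEE CRC-32 of data as
// eight lowercase hex digits, matching the format printed by the CLI.
func crcHex(data []byte) string {
	return fmt.Sprintf("%08x", crc32.ChecksumIEEE(data))
}

func main() {
	// File names taken from the reproduction steps above.
	for _, name := range []string{"raw.dat", "decoded.dat"} {
		data, err := os.ReadFile(name)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("%s: %d bytes, CRC32 %s\n", name, len(data), crcHex(data))
	}
}
```

If the two files also differ in length, that would point to dropped or extra bytes rather than a mis-decoded value.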