Comments (2)
My default position is to be supportive of adding any intrinsic found on the Intel Intrinsics Guide https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html so adding those seems reasonable to me.
from nimsimd.
That's fine with me. I had tried to add some tags like BMI1/BMI2/F16C just to get a feeling for how the Nim package works. I don't know if these are still important.
I've forked nimsimd and added a branch 'ASC' where I added the changes I mentioned:
- f16c.nim (just four functions for Float16, i.e. 'half-precision')
- BMI1 and BMI2 are WIP; some of the bit operations might be interesting.
- Most important are mm_malloc/mm_free and maybe mm_fence, which I have added to /nimsimd/sse2.nim. The alloc/free work for me; the fence does not yet. I use the aligned malloc, which according to the Intel Intrinsics Guide has been around since SSE1, i.e. the stone age. The Guide is massive and there are 'hidden' sections like 'Other', which is where the three tags BMI1/BMI2/F16C were sitting :))
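For reference, here is a minimal C sketch of the aligned allocation discussed above, using the underlying C intrinsics `_mm_malloc`/`_mm_free` rather than the Nim wrappers. It assumes GCC or Clang on x86-64, where `<xmmintrin.h>` pulls in `_mm_malloc`. The helper names `alloc_aligned_floats`, `is_aligned` and `sum_first_four` are made up for illustration and are not part of nimsimd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h> /* _mm_malloc, _mm_free, _mm_load_ps (GCC/Clang, x86-64 assumed) */

/* Hypothetical helper: allocate `count` floats aligned to `alignment` bytes.
 * Returns NULL on failure. The result must be released with _mm_free, not free. */
static float *alloc_aligned_floats(size_t count, size_t alignment) {
    /* _mm_malloc expects a power-of-two alignment. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (float *)_mm_malloc(count * sizeof(float), alignment);
}

/* Hypothetical helper: true if `p` is a multiple of `alignment`. */
static int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}

/* Sums the first four floats using an aligned load.
 * _mm_load_ps requires a 16-byte aligned address, which is the
 * reason for using an aligned allocator in the first place. */
static float sum_first_four(const float *aligned16) {
    __m128 v = _mm_load_ps(aligned16);
    float out[4];
    _mm_storeu_ps(out, v); /* unaligned store into a plain stack array */
    return out[0] + out[1] + out[2] + out[3];
}
```

A caller would typically request 32-byte alignment for AVX2 work or 64 bytes to match a cache line, and must pair every `_mm_malloc` with `_mm_free`.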
I've tried to use CPUID to read the cache-line size, and after some peeking and poking I see a value that is correct for my machine, but that may be just a coincidence. Besides Felix Cloutier's reference I found another source, a Rust version of CPUID, with different leaves and numbers, so this is all a bit shaky. I've added the CPUID checks for BMI1/BMI2/F16C and CMPXCHG16B according to the docs from Felix Cloutier.
Being able to (reliably) retrieve the cache-line size is a plus. The Compare-Exchange-16-Byte instruction never gained support from the compiler side. Technically it is available on Haswell (2013), and on most earlier x86-64 CPUs as well.
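One way to cross-check the CPUID values is a small C sketch like the following. It assumes GCC or Clang on x86-64 and their `<cpuid.h>` helpers `__get_cpuid` and `__get_cpuid_count`. The struct `CpuInfo` and the function `query_cpu_info` are hypothetical names. The bit positions are the documented ones: leaf 1 reports CMPXCHG16B in ECX bit 13, F16C in ECX bit 29, and the CLFLUSH line size in EBX bits 15:8 (in units of 8 bytes, valid when EDX bit 19 is set); leaf 7, sub-leaf 0 reports BMI1 in EBX bit 3 and BMI2 in EBX bit 8.

```c
#include <cpuid.h>   /* GCC/Clang helpers: __get_cpuid, __get_cpuid_count */
#include <stdbool.h>

/* Hypothetical result type, for illustration only. */
typedef struct {
    unsigned cache_line_bytes; /* CLFLUSH line size in bytes, 0 if not reported */
    bool cmpxchg16b;
    bool f16c;
    bool bmi1;
    bool bmi2;
} CpuInfo;

static CpuInfo query_cpu_info(void) {
    CpuInfo info = {0};
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 1: basic feature flags. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        if ((edx >> 19) & 1u) {
            /* EBX[15:8] holds the CLFLUSH line size in 8-byte units. */
            info.cache_line_bytes = ((ebx >> 8) & 0xFFu) * 8u;
        }
        info.cmpxchg16b = ((ecx >> 13) & 1u) != 0;
        info.f16c = ((ecx >> 29) & 1u) != 0;
    }

    /* Leaf 7, sub-leaf 0: structured extended feature flags. */
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        info.bmi1 = ((ebx >> 3) & 1u) != 0;
        info.bmi2 = ((ebx >> 8) & 1u) != 0;
    }
    return info;
}
```

Strictly speaking, leaf 1 reports the CLFLUSH line size, which matches the cache-line size on common CPUs. Per-cache details live in other leaves (leaf 4 on Intel, the 0x8000001D / 0x80000005 range on AMD), which may explain why different sources quote different leaves and numbers.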
So I hope I've not messed up too much :) I'm in a bit of a mess, that's why I pushed and pulled these changes via the GitHub web frontend, not exactly knowing what I'm doing.. :)) Anyway, pick what you think makes sense. The aligned alloc via mm_malloc makes sense to me.
One more thing: as already mentioned, I started something, an LRU cache with vector operations. I made myself a common_avx2.nim for development and to learn the intrinsics. I'll put it into the 'ASC' branch. It's more of a workbench idea; I doubt the Nim generics will make the code faster ;) but readability is much better now..
Ahh, and the mm_prefetch thing is something for people who really know what they are doing: interesting, but a last resort. The SIMD champ from Algorithmica has an example where he uses a prefetch instruction during tree walking ...
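To illustrate the idea of prefetching during a tree walk, here is a rough C sketch in the spirit of the Algorithmica example, not a copy of it: a sorted array is stored as an implicit binary tree in breadth-first (Eytzinger) order, and while descending, the search issues `_mm_prefetch` for the cache line holding the descendants four levels further down. The function names and the distance of 16 ints (one 64-byte line) are assumptions for illustration. Whether this helps at all depends on the data size and the machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h> /* _mm_prefetch, _MM_HINT_T0 */

/* Copies sorted[0..n-1] into t[1..n] in breadth-first (Eytzinger) order.
 * t[0] is unused. Call with i = 0, k = 1; returns the number of items placed. */
static size_t eytzinger_build(const int *sorted, int *t, size_t n, size_t i, size_t k) {
    if (k <= n) {
        i = eytzinger_build(sorted, t, n, i, 2 * k);     /* left subtree */
        t[k] = sorted[i++];                              /* this node */
        i = eytzinger_build(sorted, t, n, i, 2 * k + 1); /* right subtree */
    }
    return i;
}

/* Returns the index in t of the first element >= x, or 0 if there is none. */
static size_t eytzinger_lower_bound(const int *t, size_t n, int x) {
    size_t k = 1;
    while (k <= n) {
        /* The descendants four levels below node k start at index 16 * k.
         * Prefetch is only a hint and does not fault on addresses past the
         * array; the address is built via uintptr_t to avoid out-of-bounds
         * pointer arithmetic. */
        _mm_prefetch((const char *)((uintptr_t)t + k * 16 * sizeof(int)), _MM_HINT_T0);
        k = 2 * k + (size_t)(t[k] < x); /* go left (0) or right (1) */
    }
    /* Undo the trailing right turns and the last left turn. */
    while (k & 1) {
        k >>= 1;
    }
    return k >> 1;
}
```

For the prefetched line to cover all 16 descendants, `t` would need to be 64-byte aligned, which is where an aligned allocator such as `_mm_malloc` comes in.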
greet & beats, Andreas