vectorchief / unisimd-assembler

SIMD macro assembler unified for ARM, MIPS, PPC and x86

License: MIT License

Makefile 0.44% C++ 1.93% Batchfile 0.01% Shell 0.16% C 97.46%
simd x86 sse sse2 x86-64 avx avx2 avx512 armv7 neon

unisimd-assembler's Introduction

UniSIMD assembler is a high-level C/C++ macro assembler framework unified across
ARM, MIPS, POWER and x86 architectures. It establishes a subset of both BASE and
SIMD instruction sets with clearly defined common API, so that application logic
can be written and maintained in one place without code replication.
The assembler itself is not a separate tool, but rather a collection of C/C++
header files that applications include directly in order to use it.
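
As a rough illustration of the header-only idea, the sketch below maps one set
of operation names onto different backends at compile time. All names in it
(UNI_LOAD, UNI_STORE, UNI_ADD, uni_f32x4, add_arrays) are hypothetical and are
not the UniSIMD API; compiler intrinsics stand in for the backends here, and
the project's real macros are described in core/config/rtdocs.h.

```c
#include <stddef.h>

/* Hypothetical names for illustration only, not the UniSIMD API. */
#if defined(__SSE2__) || defined(_M_X64)
  #include <emmintrin.h>
  typedef __m128 uni_f32x4;
  #define UNI_LOAD(p)      _mm_loadu_ps(p)
  #define UNI_STORE(p, v)  _mm_storeu_ps((p), (v))
  #define UNI_ADD(a, b)    _mm_add_ps((a), (b))
#elif defined(__ARM_NEON)
  #include <arm_neon.h>
  typedef float32x4_t uni_f32x4;
  #define UNI_LOAD(p)      vld1q_f32(p)
  #define UNI_STORE(p, v)  vst1q_f32((p), (v))
  #define UNI_ADD(a, b)    vaddq_f32((a), (b))
#else
  /* Portable scalar fallback exposing the same three operations. */
  typedef struct { float lane[4]; } uni_f32x4;
  static uni_f32x4 uni_load(const float *p)
  {
      uni_f32x4 v;
      for (int i = 0; i < 4; i++) v.lane[i] = p[i];
      return v;
  }
  static void uni_store(float *p, uni_f32x4 v)
  {
      for (int i = 0; i < 4; i++) p[i] = v.lane[i];
  }
  static uni_f32x4 uni_add(uni_f32x4 a, uni_f32x4 b)
  {
      for (int i = 0; i < 4; i++) a.lane[i] += b.lane[i];
      return a;
  }
  #define UNI_LOAD(p)      uni_load(p)
  #define UNI_STORE(p, v)  uni_store((p), (v))
  #define UNI_ADD(a, b)    uni_add((a), (b))
#endif

/* Application logic written once against the common names. */
static void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4)      /* four lanes per step */
        UNI_STORE(dst + i, UNI_ADD(UNI_LOAD(a + i), UNI_LOAD(b + i)));
    for (; i < n; i++)              /* scalar tail */
        dst[i] = a[i] + b[i];
}
```

Calling add_arrays(dst, a, b, n) gives the same element-wise sums whichever
branch the preprocessor selects, so the calling code does not change per target.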

Initial documentation for the assembler is provided in core/config/rtdocs.h.

At present, the following targets are mostly implemented, including horizontal
reductions and byte/half SIMD+BASE ops:
 - Intel SSE/SSE2/SSE4 and AVX/AVX2/AVX-512 (32/64-bit x86 ISAs)
 - ARMv7 NEON/NEONv2, ARMv8 AArch32 and AArch64 NEON, SVE (32/64-bit ARM ISAs)
 - MIPS 32/64-bit r5/r6 MSA and POWER 32/64-bit VMX/VSX (little/big-endian ISAs)

Planned extensions to the current 2/3-operand SPMD-driven vertical SIMD ISA are
scalar improvements, wider SIMD vectors with zeroing/merging predicates in
3/4-operand instructions, and cross-precision fp-converters on modern CPU
targets.
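
For readers unfamiliar with the terms, the difference between zeroing and
merging predicates can be modelled in plain scalar C. This is a sketch of the
general semantics found in predicated SIMD ISAs, not of UniSIMD's planned
instructions; the function names, the 4-lane width and the bitmask encoding are
assumptions made for the example.

```c
#include <stddef.h>

enum { LANES = 4 };  /* assumed vector width for the illustration */

/* Merging predicate: lanes whose mask bit is clear keep the old dst value. */
static void add_masked_merge(float *dst, const float *a, const float *b,
                             unsigned mask)
{
    for (size_t i = 0; i < LANES; i++)
        if (mask & (1u << i))
            dst[i] = a[i] + b[i];
}

/* Zeroing predicate: lanes whose mask bit is clear are set to zero. */
static void add_masked_zero(float *dst, const float *a, const float *b,
                            unsigned mask)
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = (mask & (1u << i)) ? a[i] + b[i] : 0.0f;
}
```

With a mask that enables only lanes 0 and 2, the merging form leaves lanes 1
and 3 of the destination untouched, while the zeroing form clears them.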

The project has a test framework for Linux/GCC/Clang and Windows/VC++/TDM64-GCC.
Support for macOS is provided via Command Line Tools with GCC and Clang options.
Instructions for resolving dependencies and building the binaries
for supported platforms can be found in the accompanying INSTALL file.

UniSIMD core features:
 - Unified, Universal, Portable, Compatible code
 - Explicit register allocation, predictable performance
 - Three register sets for code: 8, 16, 32 (free: 8, 15, 30)
 - High-level SIMD registers/ops as singles, pairs and quads
 - SIMD-aligned backend structures with offsets/factors
 - Vector-length agnostic vertical SIMD ISA, configurable
 - Simultaneous scalar + 128/256-bit + configurable SIMD ops
 - ISA implementation for fp16/fp128 (half/quad) SIMD ops
 - C/C++, Compute, SPMD on 4 major archs
 - Intel SSE/SSE2/SSE4 and AVX/AVX2/AVX-512
 - ARMv7 NEON/NEONv2, ARMv8 AArch32/AArch64 NEON, SVE
 - MIPS r5/r6 MSA (Warrior P5600, I6400/P6600)
 - POWER VMX/VSX (PowerPC G4/G5, POWER6/7/8/9)
 - CISC, RISC, CISC on RISC, little/big-endian ISA
 - Support for reg-reg, load/store, load-op instructions
 - Plain, indexed and scaled-indexed addressing modes
 - FMA3 support (native or higher-precision emulation)
 - 32/64-bit hybrid mode for native 64-bit ABI
 - 32/64-bit addressing for BASE and SIMD ops
 - 32/64-bit configurable SIMD elements (fp+int)
 - Simultaneous 32/64-bit BASE (bridges, rules) and SIMD ops
 - ISA implementation for int8/int16 (byte/half) BASE ops
 - Full control over code, compiler steps out of the way
 - Potential for bit-exact fp-compute across modern targets
 - Used in QuadRay engine
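
The "higher-precision emulation" of FMA mentioned in the list above can be
pictured with a small scalar sketch. It assumes single-precision inputs and
emulates the fused operation by computing in double; whether UniSIMD does it
this way is not stated here, and both function names are hypothetical. The
product of two floats is exact in double, but the final sum is rounded twice
(to double, then to float), so this is not guaranteed to match a hardware
fused multiply-add bit for bit in every case.

```c
/* Hypothetical helpers for illustration only, not UniSIMD macros. */

/* Emulated fused multiply-add: the product of two floats fits exactly
 * in a double, so no precision is lost before the addition. */
static float fma_emulated(float a, float b, float c)
{
    double wide = (double)a * (double)b + (double)c;
    return (float)wide;
}

/* Unfused reference: the product is rounded to float before the add.
 * The volatile store forces that intermediate rounding to happen. */
static float mul_add_unfused(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}
```

The two functions differ only when the intermediate rounding of the product
discards bits that the addition would otherwise have preserved, for example
when c nearly cancels a * b.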
