Comments (3)
Yes: we previously tuned something like https://github.com/pytorch/FBGEMM/pull/82/files (that one is for AVX2). You can adjust it for your custom hardware.
from fbgemm.
This is based on tuning results on x86 CPUs, so you can change it to whichever works better for the RISC-V processor you're optimizing for.
Basically, this Diff switches the register layout in the C accumulation buffer inside the micro-kernel from MR * 1 to MR * 2. Check the reasons in T40816746.
Could you provide the reasons in T40816746?