Comments (2)
This has now been merged, ty @andylolu2 for pointing it out too.
from llm.c.
The cuBLASLt API started with CUDA 10.1, which was released in 2019. I've been trying to use code that can, afaik, work with fairly old versions of CUDA/cuBLAS. I think this is probably worth doing if it's faster though, which it should be. Will take a look at using this in /dev/cuda/matmul.cu as an additional kernel that uses this API (or would gladly welcome a PR, too).
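One way to keep older CUDA versions working while still adding a cuBLASLt kernel is to gate that path on the toolkit version. Below is a minimal sketch of the idea, not code from llm.c: the helper names `cublaslt_available` and `pick_matmul_backend` and the backend strings are hypothetical. It relies only on the fact that CUDA encodes its version as major * 1000 + minor * 10 (so 10.1 is 10010), which is the value `CUDART_VERSION` carries when the CUDA headers are included.

```c
#include <stdio.h>

/* CUDA encodes versions as major * 1000 + minor * 10, so 10.1 -> 10010.
   cuBLASLt first shipped with CUDA 10.1. */
#define CUBLASLT_MIN_CUDA_VERSION 10010

/* Without the CUDA headers this macro is undefined; fall back to 0 so the
   sketch still compiles with a plain C compiler (assumption for illustration). */
#ifndef CUDART_VERSION
#define CUDART_VERSION 0
#endif

/* Hypothetical helper: returns 1 if the given toolkit version has cuBLASLt. */
static int cublaslt_available(int cuda_version) {
    return cuda_version >= CUBLASLT_MIN_CUDA_VERSION;
}

/* Hypothetical helper: name of the matmul path to use for a toolkit version. */
static const char *pick_matmul_backend(int cuda_version) {
    return cublaslt_available(cuda_version) ? "cublaslt" : "cublas";
}
```

Calling `pick_matmul_backend(CUDART_VERSION)` would then select the cuBLASLt path on CUDA 10.1 or newer and fall back to plain cuBLAS otherwise. In a real build the same check would more likely be an `#if CUDART_VERSION >= 10010` guard around the cuBLASLt kernel.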