Comments (1)
Hey @bsulyok, you'll be happy to hear that we've added prototype support for semi-structured sparsity here :)
This uses a slightly different user API: we've created SemiStructuredSparseLinear, a drop-in replacement for nn.Linear, instead of a pure tensor subclass. This is because we sometimes want to do activation sparsity, and also because it lets us handle autograd support with torch.autograd.Function.
There are some meaningful differences between this code and to_sparse_semi_structured. Namely, we apply sparsity to a 4x4 tile so that we can accelerate both the forward and backward passes: the forward pass computes Wx and the backward pass computes W' dL/dy (W' being the transpose of W), so W needs to be 2:4 sparse in both directions.
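To make the "both directions" point concrete, here is a minimal pure-Python sketch (not the actual kernel code; the helper names `is_2to4_sparse` and `transpose` are made up for illustration). It checks whether a 4x4 tile has at most 2 nonzeros in every group of 4 along its rows, and whether the same holds for its transpose.

```python
def is_2to4_sparse(matrix):
    """True if every group of 4 consecutive entries in each row has at most 2 nonzeros."""
    for row in matrix:
        if len(row) % 4 != 0:
            raise ValueError("row length must be a multiple of 4")
        for start in range(0, len(row), 4):
            nonzeros = sum(1 for v in row[start:start + 4] if v != 0)
            if nonzeros > 2:
                return False
    return True


def transpose(matrix):
    """Plain list-of-lists transpose."""
    return [list(col) for col in zip(*matrix)]


# 2:4 sparse along rows only: every column is fully dense.
row_only_tile = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [5, 6, 0, 0],
    [7, 8, 0, 0],
]

# 2:4 sparse along rows AND columns: usable for both W and W'.
both_ways_tile = [
    [1, 2, 0, 0],
    [0, 0, 3, 4],
    [5, 0, 6, 0],
    [0, 7, 0, 8],
]

for name, tile in (("row_only", row_only_tile), ("both_ways", both_ways_tile)):
    print(name, is_2to4_sparse(tile), is_2to4_sparse(transpose(tile)))
# row_only True False
# both_ways True True
```

The first tile would only help the forward pass, since its transpose is not 2:4 sparse; the second satisfies the constraint in both directions.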
Additionally, we've written fast sparsification kernels that do runtime sparsity for training. These kernels do 2:4 pruning + compression very quickly at runtime, which makes distributed support much simpler. Note that you'll need cuSPARSELt support to see e2e speedups; CUTLASS is not sufficient.
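As a rough illustration of what "2:4 pruning + compression" means, here is a pure-Python sketch. It assumes magnitude-based selection along a single row and a simple values-plus-positions layout. The real kernels run on the GPU, work on 4x4 tiles, and use cuSPARSELt's own packed format, so treat the function name and the output layout here as hypothetical.

```python
def prune_and_compress_2to4(row):
    """Keep the 2 largest-magnitude entries in each group of 4.

    Returns (pruned, values, indices):
      pruned  - the row with the dropped entries set to 0.0
      values  - the kept entries only (half the original length)
      indices - position (0..3) of each kept entry inside its group
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned, values, indices = [], [], []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Rank positions by magnitude, keep the top two in original order.
        by_magnitude = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)
        keep = sorted(by_magnitude[:2])
        pruned.extend(group[i] if i in keep else 0.0 for i in range(4))
        values.extend(group[i] for i in keep)
        indices.extend(keep)
    return pruned, values, indices


row = [0.1, -2.0, 0.3, 1.5, 4.0, 0.0, -0.2, 0.5]
pruned, values, indices = prune_and_compress_2to4(row)
print(pruned)   # [0.0, -2.0, 0.0, 1.5, 4.0, 0.0, 0.0, 0.5]
print(values)   # [-2.0, 1.5, 4.0, 0.5]
print(indices)  # [1, 3, 0, 3]
```

The compressed form stores half the values plus a small index per kept value, which is where the memory and bandwidth savings come from.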
I am writing a blog post about this that should be publicly available shortly; I'll share it when it's out.
Eventually upstreaming this into PyTorch core is something we're thinking about now.
from pytorch.