Comments (6)
Hi @IdoZach, thank you for the question. Quantization support is currently planned for our beta version of Sparsify, slated for roughly two months from now. We are also looking for active users to design-test the Sparsify beta, so let us know if you would be interested in running through the current designs and giving feedback!
In the interim, the current workarounds would be one of the following:
- Take a current recipe for one of our pretrained models in SparseZoo and use that on your own data (a rough sketch of this is below)
- Modify the Sparsify recipe to include a quantization modifier after pruning completes
- Create a new recipe from scratch in YAML using SparseML's modifiers
We are quickly adding functionality to SparseML to make this more streamlined over the next few weeks, including APIs for creating recipes. We will also be releasing tutorials covering all three of these approaches, along with the new APIs, in the next two weeks. I'll update here once those are launched; in the meantime, please let us know if you have any other feedback!
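In the meantime, here is a rough sketch of the first workaround: applying a downloaded recipe inside a standard training loop. Treat it as illustrative only — the model, data, and recipe path are toy placeholders, and the manager calls shown (ScheduledModifierManager.modify and finalize) exist in recent SparseML releases but may differ in yours.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from sparseml.pytorch.optim import ScheduledModifierManager

# toy stand-ins so the sketch is self-contained; swap in your real model/data
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
data = DataLoader(
    TensorDataset(torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,))),
    batch_size=8,
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# load the downloaded recipe and wrap the optimizer so the pruning and
# quantization modifiers fire on their schedule during a normal loop
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=len(data))

for epoch in range(manager.max_epochs):
    for inputs, labels in data:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(inputs), labels)
        loss.backward()
        optimizer.step()

manager.finalize(model)  # detach modifier hooks once training completes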
Thanks,
Mark
Correct, you'll also want to set the submodules of your model that should be quantized. Additionally, setting disable_quantization_observer_epoch and freeze_bn_stats_epoch can help recovery. disable_quantization_observer_epoch freezes the observer params; it should generally be set after your last or second-to-last LR step so you can continue to train without variable quantization statistics. freeze_bn_stats_epoch is in the same vein: it freezes the batch norm stats to allow for better recovery when quantizing and should be set just after the disable_quantization_observer_epoch value. An example implementation from our ResNet-50 recipe looks like this:
- !QuantizationModifier
    start_epoch: 100
    submodules:
        - input
        - sections
    disable_quantization_observer_epoch: 115
    freeze_bn_stats_epoch: 116
Here the total epochs are 135 and pruning stopped at epoch 60, so we let the model recover for roughly 40 epochs after pruning before beginning quantization. We have a few experiments suggesting that this pruning fine-tuning phase is unnecessary and can be replaced purely with quantization, but not enough to say so definitively. So, if you see recovery issues during quantization, you may want to train for a bit after pruning before starting quantization.
For your example, I would recommend:
- !QuantizationModifier
    start_epoch: 50
    submodules:
        # fill in based on your model
    disable_quantization_observer_epoch: 90
    freeze_bn_stats_epoch: 91
You will also need to add the following to the pruning modifier if you are planning to run in the DeepSparse engine on Intel VNNI CPUs:
mask_type: [1, 4]
This makes pruning happen in blocks of four weights, which is a requirement for getting a speedup from sparse quantization on Intel CPUs; a sketch of where the option sits in a pruning modifier follows.
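For context, here is an illustrative placement of that option. The GMPruningModifier hyperparameters below are placeholders, not values from a tuned recipe; the YAML is embedded as a string and written to a file that ScheduledModifierManager.from_yaml can load.
# placeholder pruning recipe showing mask_type placement; tune the
# epoch and sparsity values for your own model
pruning_yaml = """
modifiers:
  - !GMPruningModifier
      start_epoch: 0.0
      end_epoch: 50.0
      update_frequency: 1.0
      init_sparsity: 0.05
      final_sparsity: 0.85
      params: __ALL_PRUNABLE__
      mask_type: [1, 4]    # prune in blocks of 4 for Intel VNNI
"""

with open("recipe.yaml", "w") as handle:
    handle.write(pruning_yaml)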
Let me know if you have any further issues and happy to help more!
Thanks. If I have a custom recipe, prepared through Sparsify, with a final pruning epoch of 50 and 100 total epochs, then I would add the following:
- !QuantizationModifier
    start_epoch: 50.0
Correct? Then, after 50 pruning epochs, it would start to quantize the model for the remaining epochs? Or are more options needed?
Hi, after training finishes, how can I verify that my model is indeed quantized?
Similarly, when converting to ONNX via ModuleExporter, should I set convert_qat?
Thanks.
Hi @IdoZach, sorry about the delay; I did not see this notification come through. You can verify that the model is quantized by looking at the wrapped modules: they should have quantization wrappers around them instead of the original convs. To export, setting convert_qat is correct. A full example can be seen in our yolov5 integration: https://github.com/neuralmagic/yolov5/blob/master/models/export.py#L225
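For a concrete illustration, here is a hedged sketch of both checks. The toy model is prepared with PyTorch's own QAT utilities as a stand-in for one trained through a SparseML quantization recipe (both insert FakeQuantize observers); the ModuleExporter call mirrors SparseML's exporter, but double-check the exact kwargs against your installed version.
import torch
import torch.nn as nn
from torch.quantization import FakeQuantize, get_default_qat_qconfig, prepare_qat
from sparseml.pytorch.utils import ModuleExporter

# toy QAT model standing in for one trained with a quantization recipe
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
model.train()
model.qconfig = get_default_qat_qconfig("fbgemm")
prepare_qat(model, inplace=True)  # wraps the conv with fake-quant observers

# (a) a quantized model contains FakeQuantize modules around wrapped layers
fake_quants = [name for name, mod in model.named_modules()
               if isinstance(mod, FakeQuantize)]
print(f"{len(fake_quants)} fake-quantize modules found")  # > 0 means QAT applied

# (b) convert_qat=True folds the QAT graph into true quantized ops on export
exporter = ModuleExporter(model, output_dir="exported")
exporter.export_onnx(torch.randn(1, 3, 32, 32), convert_qat=True)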
Thanks,
Mark
Hello @IdoZach
As it's been some time with no response, I am going to go ahead and close this issue. Please re-open if you have a follow-up. Also, I invite you to "star" our Sparsify repo if you like! We enjoy seeing the community support.
https://github.com/neuralmagic/sparsify/
Thank you!
Jeannie / Neural Magic