
Quantization in UI (sparsify) · 6 comments · closed

neuralmagic commented on June 14, 2024
Quantization in UI


Comments (6)

markurtz commented on June 14, 2024

Hi @IdoZach, thank you for the question. Quantization support is planned for the beta version of Sparsify, slated for roughly two months from now. We are looking for active users to design-test the Sparsify beta, so let us know if you would be interested in running through the current designs and giving feedback!

In the interim, the current workarounds would be one of the following:

  • Take a current recipe for one of our pretrained models in SparseZoo and use that on your own data
  • Modify the Sparsify recipe to include a quantization modifier after pruning completes
  • Create a new recipe from scratch in YAML

We are quickly adding functionality to SparseML to make this more streamlined over the next few weeks, including APIs for creating recipes. In addition, we will be releasing tutorials on how to do all three of these steps and use the APIs within the next two weeks. I'll update here once those are launched; please let us know if you have any other feedback on them!
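To make the second and third workarounds concrete, here is a hedged sketch of what a prune-then-quantize recipe might look like. The modifier names follow SparseML's recipe conventions; the epochs and sparsity values are placeholders for illustration, not recommendations:

```yaml
# Sketch of a recipe that prunes first, then quantizes.
# Values below are placeholders; tune them for your model and dataset.
modifiers:
  - !EpochRangeModifier
    start_epoch: 0.0
    end_epoch: 100.0

  - !GMPruningModifier
    params: __ALL_PRUNABLE__   # prune all prunable layers
    init_sparsity: 0.05
    final_sparsity: 0.85
    start_epoch: 0.0
    end_epoch: 50.0
    update_frequency: 1.0

  - !QuantizationModifier
    start_epoch: 60.0          # begin QAT after pruning has finished
```

The key structural point is simply that the quantization modifier's start_epoch comes after the pruning modifier's end_epoch.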

Thanks,
Mark


markurtz commented on June 14, 2024

Correct, you'll also want to set the submodules that should be quantized in your model. Additionally, setting disable_quantization_observer_epoch and freeze_bn_stats_epoch can help recovery. disable_quantization_observer_epoch freezes the observer parameters and generally should be set just after your last or second-to-last LR step, so you can continue to train without variable quantization statistics. freeze_bn_stats_epoch works in the same vein: it freezes the batch norm statistics to allow better recovery when quantizing, and should be set to just after disable_quantization_observer_epoch. An example implementation from our ResNet-50 recipe looks like this:

- !QuantizationModifier
    start_epoch: 100
    submodules:
      - input
      - sections
    disable_quantization_observer_epoch: 115
    freeze_bn_stats_epoch: 116

Where the total epochs are 135 and pruning stopped at epoch 60.

Note that we let it recover for roughly 40 epochs in the pruning stage before beginning quantization. We do have a few experiments to show that the pruning fine-tuning phase is unnecessary and can be replaced purely with quantization; however, not enough to say definitively. So, if there are issues with recovery during quantization, you may want to train for a bit after pruning before starting quantization.

For your example, I would recommend:

- !QuantizationModifier
    start_epoch: 50
    submodules:
      - fill in based on your model
    disable_quantization_observer_epoch: 90
    freeze_bn_stats_epoch: 91

You will also need to add the following to the pruning modifier if you plan to run in the DeepSparse engine on Intel VNNI CPUs:

    mask_type: [1, 4]

This makes pruning happen in blocks of four weights, which is required to see speedup from sparse quantization on Intel CPUs.

Let me know if you have any further issues; happy to help more!


IdoZach commented on June 14, 2024

Thanks. If I have a custom recipe, prepared through Sparsify, with a final pruning epoch of 50 and 100 total epochs, then I would add the following:

- !QuantizationModifier
    start_epoch: 50.0

Correct? Then, after 50 pruning epochs, it would quantize the model for the rest of training? Or are more options needed?


IdoZach commented on June 14, 2024

Hi, after training finishes, how can I verify that my model is indeed quantized?
Similarly, when converting to onnx via ModuleExporter, should I set convert_qat?
Thanks.


markurtz commented on June 14, 2024

Hi @IdoZach, sorry about the delay; I did not see this notification come through. You can verify that the model is quantized by looking at the wrapped modules: they should have quantization wrappers around them instead of the original convs. To export, setting convert_qat is correct. A full example can be seen in our yolov5 integration: https://github.com/neuralmagic/yolov5/blob/master/models/export.py#L225
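As an illustration of what to look for, here is a hedged sketch using plain PyTorch's eager-mode QAT utilities. SparseML's QuantizationModifier applies similar fake-quant wrappers, but the exact wrapper types are an assumption based on standard PyTorch QAT, not taken from the Sparsify source, so treat the check below as a heuristic:

```python
# Sketch: detect QAT fake-quant wrappers in a PyTorch model.
import torch.nn as nn
import torch.quantization as tq

def quantized_module_names(model: nn.Module):
    """Return names of modules that carry fake-quant observers after QAT prep."""
    return [name for name, module in model.named_modules()
            if hasattr(module, "weight_fake_quant")
            or isinstance(module, tq.FakeQuantize)]

# Toy model standing in for your network.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 8, 3))
model.train()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")
tq.prepare_qat(model, inplace=True)

# A quantized model reports wrapped modules here; an unquantized one
# would return an empty list.
print(quantized_module_names(model))
```

If the list is empty after training, the quantization modifier never took effect (for example, start_epoch was never reached or the submodules did not match).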

Thanks,
Mark


jeanniefinks commented on June 14, 2024

Hello @IdoZach
As it's been some time with no response, I am going to go ahead and close this issue. Please re-open if you have a follow-up. Also, I invite you to "star" our Sparsify repo if you like! We enjoy seeing the community support.
https://github.com/neuralmagic/sparsify/

Thank you!
Jeannie / Neural Magic

