FP8 Quantization: The Power of the Exponent

This repository contains the implementation and experiments for the paper:

Andrey Kuzmin*1, Mart van Baalen*1, Yuwei Ren1, Markus Nagel1, Jorn Peters1, Tijmen Blankevoort1 "FP8 Quantization: The Power of the Exponent", NeurIPS 2022. [ArXiv]

*Equal contribution 1 Qualcomm AI Research (Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.)

You can use this code to recreate the results in the paper.

Method and Results

In this repository we share the code to reproduce the analytical and experimental results on the performance of the FP8 format with different mantissa/exponent bit divisions versus INT8. The first part of the repository reproduces the analytical SQNR computations for uniform, Gaussian, and Student's t-distributions. Varying the mantissa/exponent bit-width division changes the trade-off between accurately representing the data around the mean of the distribution and capturing its tails. The more outliers are present in the data, the more bits it is useful to allocate to the exponent. In the second part we provide the code to reproduce the post-training quantization (PTQ) results for MobileNetV2 and ResNet-18 pre-trained on ImageNet.
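To make the mantissa/exponent trade-off concrete, here is a minimal sketch that is not the repository's implementation. It simulates an 8-bit float grid (1 sign bit, E exponent bits, M mantissa bits) with round-to-nearest and saturation, and estimates the SQNR from random samples rather than analytically. The function names, the integer bias of 0, the choice not to reserve any codes for inf/NaN, and the rescaling of the grid so that its largest value matches the largest absolute sample are all assumptions made for illustration.

```python
import math
import random


def fp8_quantize(x, mantissa_bits, exponent_bits, bias):
    """Round x to the nearest value of a simulated 1 + E + M bit float format."""
    if x == 0.0:
        return 0.0
    max_exp = 2 ** exponent_bits - 1 - bias      # largest exponent (assumes no inf/NaN codes)
    min_exp = 1 - bias                           # exponent shared with the subnormal range
    max_val = 2.0 ** max_exp * (2.0 - 2.0 ** -mantissa_bits)
    mag = min(abs(x), max_val)                   # saturate instead of overflowing
    exp = math.frexp(mag)[1] - 1                 # exact floor(log2(mag))
    exp = max(exp, min_exp)                      # below 2**min_exp the grid is uniform
    step = 2.0 ** (exp - mantissa_bits)          # spacing of representable values here
    return math.copysign(round(mag / step) * step, x)


def int8_quantize(x, max_val):
    """Symmetric uniform 8-bit quantizer on [-max_val, max_val]."""
    scale = max_val / 127.0
    return max(-127, min(127, round(x / scale))) * scale


def sqnr_db(samples, quantize):
    """Signal-to-quantization-noise ratio in dB, estimated from samples."""
    signal = sum(v * v for v in samples)
    noise = sum((v - quantize(v)) ** 2 for v in samples)
    return 10.0 * math.log10(signal / noise)


def fp8_sqnr_db(samples, mantissa_bits, bias=0):
    """SQNR of the simulated FP8 grid, rescaled so its maximum equals max|x|."""
    exponent_bits = 7 - mantissa_bits            # 1 sign bit + E + M = 8 bits
    grid_max = 2.0 ** (2 ** exponent_bits - 1 - bias) * (2.0 - 2.0 ** -mantissa_bits)
    s = grid_max / max(abs(v) for v in samples)
    return sqnr_db(
        samples,
        lambda v: fp8_quantize(v * s, mantissa_bits, exponent_bits, bias) / s,
    )


if __name__ == "__main__":
    rng = random.Random(0)
    gauss = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    # Student's t with 3 degrees of freedom: heavier tails, more outliers
    student = [
        rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(1.5, 2.0) / 3.0)
        for _ in range(20000)
    ]
    for name, data in (("gaussian", gauss), ("student-t", student)):
        absmax = max(abs(v) for v in data)
        print(name, "INT8", round(sqnr_db(data, lambda v: int8_quantize(v, absmax)), 1))
        for m in (5, 4, 3, 2):
            print(name, f"FP8 {m}M{7 - m}E", round(fp8_sqnr_db(data, m), 1))
```

Running the script prints one SQNR estimate per format and distribution, so the effect of moving bits from the mantissa to the exponent can be compared between the light-tailed and heavy-tailed samples. For the numbers reported in the paper, use compute_quant_error.py.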

How to install

Make sure to have Python ≥3.8 (tested with Python 3.8.10) and ensure the latest version of pip (tested with 21.3.1):

python3 -m venv env
source env/bin/activate
pip install --upgrade --no-deps pip

Next, install PyTorch 1.11.0 with the appropriate CUDA version (tested with CUDA 10.0):

pip install torch==1.11.0 torchvision==0.12.0

Finally, install the remaining dependencies using pip:

pip install -r requirements.txt

Running experiments

Analytical expected SQNR computations

The main run file to compute the expected SQNR for different distributions using different formats is compute_quant_error.py. The script takes no input arguments and computes the SQNR for different distributions and formats:

python compute_quant_error.py
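For reference, the quantity being computed can be written as follows. This is the usual definition of SQNR for a round-to-nearest quantizer with saturation; the notation is an assumption and may differ from the paper's: $X$ is the random input with density $p$, $Q$ is the quantizer, $q_1 < \dots < q_N$ are the representable values, and $t_i = (q_i + q_{i+1})/2$ are the decision thresholds, with $t_0 = -\infty$ and $t_N = +\infty$ so that clipping error is included.

```latex
\mathrm{SQNR}_{\mathrm{dB}}
  = 10 \log_{10} \frac{\mathbb{E}\left[X^2\right]}{\mathbb{E}\left[(X - Q(X))^2\right]},
\qquad
\mathbb{E}\left[(X - Q(X))^2\right]
  = \sum_{i=1}^{N} \int_{t_{i-1}}^{t_i} (x - q_i)^2 \, p(x) \, dx
```

An INT8 grid has equally spaced $q_i$, while an FP8 grid spaces them more densely near zero and more sparsely toward the extremes, which is where the distribution's tails enter the sum.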

ImageNet experiments

The main run file to reproduce the ImageNet experiments is image_net.py. It contains commands for validating models quantized with post-training quantization. You can see the full list of options for each command using python image_net.py [COMMAND] --help.

Usage: image_net.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  validate-quantized

To reproduce the experiments run:

# --model-dir is only needed for MobileNet-V2
python image_net.py validate-quantized --images-dir </PATH/TO/IMAGENET> \
    --architecture <ARCHITECTURE_NAME> --batch-size 64 --seed 10 \
    --model-dir </PATH/TO/PRETRAINED/MODEL> \
    --n-bits 8 --cuda --load-type fp32 --quant-setup all --qmethod fp_quantizer --per-channel \
    --fp8-mantissa-bits=5 --fp8-set-maxval --no-fp8-mse-include-mantissa-bits \
    --weight-quant-method=current_minmax --act-quant-method=allminmax --num-est-batches=1

where <ARCHITECTURE_NAME> can be mobilenet_v2_quantized or resnet18_quantized. Note that only MobileNet-V2 requires pre-trained weights, which can be downloaded here (the tar file is used as is; there is no need to untar it).
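To compare several mantissa/exponent splits, the validation command can be generated in a loop. The sketch below only assembles and prints command lines using the flags shown above; it does not run them. The helper name build_command, the placeholder paths, and the assumption that --fp8-mantissa-bits accepts the values 2 to 5 are illustrative and should be checked against python image_net.py validate-quantized --help.

```python
import shlex


def build_command(images_dir, architecture, mantissa_bits, model_dir=None):
    """Assemble the validate-quantized command line as an argument list (hypothetical helper)."""
    cmd = [
        "python", "image_net.py", "validate-quantized",
        "--images-dir", images_dir,
        "--architecture", architecture,
        "--batch-size", "64", "--seed", "10",
    ]
    if model_dir is not None:
        # Per the README, only MobileNet-V2 needs pre-trained weights
        cmd += ["--model-dir", model_dir]
    cmd += [
        "--n-bits", "8", "--cuda", "--load-type", "fp32",
        "--quant-setup", "all", "--qmethod", "fp_quantizer", "--per-channel",
        f"--fp8-mantissa-bits={mantissa_bits}",
        "--fp8-set-maxval", "--no-fp8-mse-include-mantissa-bits",
        "--weight-quant-method=current_minmax",
        "--act-quant-method=allminmax",
        "--num-est-batches=1",
    ]
    return cmd


if __name__ == "__main__":
    # Placeholder path; the range of mantissa bits is an assumption
    for m in (5, 4, 3, 2):
        cmd = build_command("/path/to/imagenet", "resnet18_quantized", m)
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument list can be passed to subprocess.run to execute the runs one after another.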

Reference

If you find our work useful, please cite

@article{kuzmin2022fp8,
  title={FP8 Quantization: The Power of the Exponent},
  author={Kuzmin, Andrey and Van Baalen, Mart and Ren, Yuwei and Nagel, Markus and Peters, Jorn and Blankevoort, Tijmen},
  journal={arXiv preprint arXiv:2208.09225},
  year={2022}
}
