mxbonn / inq-pytorch
A PyTorch implementation of "Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights"
Hi,
I have a trained model in pytorch,
Hi,
I implemented your INQ framework on a UNet, but the inference time didn't decrease. The weights are still stored as float32, although their values are powers of 2. Can the bit-shift operation only be exploited on specific devices such as FPGAs, so that on CPU/GPU the quantized model still performs float multiplications? Thank you!
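That is indeed what happens: INQ constrains the weight values to powers of two but leaves the storage type as float32, so standard CPU/GPU kernels still perform ordinary float multiplications. A minimal sketch (with hypothetical weight values, not the repo's API) illustrating the distinction:

```python
import math

# Hypothetical INQ-quantized weights: power-of-two values, float storage.
weights = [0.25, -0.5, 2.0, 0.125]

def is_power_of_two(x: float) -> bool:
    e = math.log2(abs(x))
    return e == round(e)

# The *values* satisfy the power-of-two constraint...
assert all(is_power_of_two(w) for w in weights)

# ...but a normal kernel still multiplies them as floats. Replacing these
# multiplies with bit shifts needs dedicated hardware/kernels (e.g. FPGA).
activation = 3.0
out = sum(w * activation for w in weights)  # ordinary float math, out == 5.625
```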
After the last iterative step, I get a good accuracy (say 69.6%). However, the final quantization step that follows, where 100% of the weights are quantized, drops the accuracy to 40%. You can replicate this by running the code without any changes.
Hi there, wonderful work. I have a question about the random strategy: can I set the probabilities to be the same in this model, e.g. [0.2, 0.2, 0.6, 0.6, 0.8, 1]? I have read the code in quantization_scheduler.py, and line 116 reads:
else: probability = (self.iterative_steps[self.idx] - self.iterative_steps[self.idx - 1]) / (1 - self.iterative_steps[self.idx - 1])
I take that to mean no, but I did set the probabilities above and the results were not bad. I wonder what is really happening here. Can you help me, please?
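The quoted line computes, for each step, the conditional probability that a still-unquantized weight is picked now, so that the *cumulative* quantized fractions match iterative_steps. A small sketch of that logic (a standalone reimplementation, not the scheduler's actual code):

```python
def step_probabilities(iterative_steps):
    # iterative_steps are cumulative fractions of quantized weights.
    # At step i, a not-yet-quantized weight is picked with probability
    # (s_i - s_{i-1}) / (1 - s_{i-1}), matching line 116 quoted above.
    probs = [iterative_steps[0]]
    for i in range(1, len(iterative_steps)):
        prev, cur = iterative_steps[i - 1], iterative_steps[i]
        probs.append((cur - prev) / (1 - prev))
    return probs

# The paper's schedule yields a constant 0.5 until the final full step:
print(step_probabilities([0.5, 0.75, 0.875, 1.0]))  # [0.5, 0.5, 0.5, 1.0]
```

Note that iterative_steps like [0.2, 0.2, 0.6, 0.6, 0.8, 1] contain repeated fractions, so the repeated steps quantize nothing new (conditional probability 0), which is different from setting the per-step probabilities directly.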
In the paper, once a weight has been quantized it is never changed again during the subsequent retraining and quantization steps that expand the quantized proportion.
In the code, when the last two quantization proportions are both 50%, weights that were already quantized can be placed into the second (trainable) group again at the next partition, so they receive gradient updates during retraining.
Is my understanding correct?
After quantization, the weight parameters should be converted to an INT8 representation, so the parameter file should become smaller. Why does my output weight file become bigger instead (75.2 MB to 96.2 MB)? Thanks in advance.
I retrained resnet18 on cifar100 and quantized its parameters using INQ, but the accuracy dropped severely after quantization, usually to less than 10%. Has anyone had similar problems?
Hello,
I noticed that to implement INQ, you plug d_p.mul_(group['Ts'][idx]) into the end of SGD's step function, where Ts is the binary mask from quantization. I was wondering whether the same thing could be achieved by simply setting the quantized weights to requires_grad = False (which would then live in the quantization_scheduler class), so we wouldn't need to hook into the optimizer code.
Have you tried this, and would it work?
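One reason the mask is needed: requires_grad is a per-tensor flag in PyTorch, while INQ freezes individual *elements* within each weight tensor, so you cannot express the partition with requires_grad alone. A minimal sketch of elementwise freezing via a gradient mask (the mask values here are hypothetical; this is not the repo's actual code):

```python
import torch

# 0 = already quantized (frozen), 1 = still trainable.
w = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
mask = torch.tensor([0.0, 1.0, 0.0, 1.0])

loss = (w ** 2).sum()   # toy loss; gradient is 2*w
loss.backward()
w.grad.mul_(mask)       # zero the gradients of frozen elements

with torch.no_grad():
    w -= 0.1 * w.grad   # SGD step: frozen elements stay exactly as they were
```

After the step, only the elements with mask 1 have moved; requires_grad = False on the whole tensor would instead freeze (or fail to freeze) all four at once.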
Hello, I am confused about the training epochs. In your code you set epochs to 4; if I want to quantize resnet18, do I need to change this? And do you have quantized resnet18 models at bit widths other than 5? Thank you!
Hello, after the first quantization epoch my acc@1 is 24.65. Is that normal?
I changed nothing in the code you provided.
Can you provide the final quantized model of resnet18? Thanks.
I modified weight_bits to 4 and iterative_steps to [0.3, 0.5, 0.8, 0.9, 0.95, 1], but I got an accuracy of 83.12%, while the original paper reports 89.01%.
Sorry to bother you, but I think the logic of the example is not right.
The example quantizes every epoch, and every quantization operation advances the iterative_step. So every epoch the iterative_step moves to the next level, which I think is wrong.
I think it is fine to quantize every epoch (necessarily with the same Ts), but doing so should not advance the iterative_step. The iterative_step should only change once all epochs of the current iterative_step are done; then it can advance and a new set of epochs begins.
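The schedule proposed above can be sketched as follows (all names are hypothetical stand-ins, not the repo's API): quantize once per iterative step, retrain for a fixed number of epochs with the same mask, and only then advance to the next step.

```python
iterative_steps = [0.5, 0.75, 0.875, 1.0]
epochs_per_step = 4
log = []

def quantize(fraction):          # stand-in for quantization_scheduler.step()
    log.append(("quantize", fraction))

def train_one_epoch():           # stand-in for one ordinary training epoch
    log.append("epoch")

for fraction in iterative_steps:
    quantize(fraction)           # the schedule advances exactly once here
    for _ in range(epochs_per_step):
        train_one_epoch()        # the mask Ts stays fixed for these epochs

# 4 quantization events, each followed by 4 retraining epochs with one mask.
assert log.count("epoch") == 16
```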