I'm really interested in your work. Could you please release the pretrained dense model and the pruned model? I'm not sure whether your training setting is the same as the default one in the code. Thanks!
After running the command:
python ManiDP/main.py --ngpu=1 --arch=dyresnet56 --dataset=cifar10 --target_remain_rate=0.6 --lambda_lasso 5e-3 --lambda_graph 1.0 --pretrain_path='ManiDP/pretrain_path/' --data_path='...'
the loss becomes NaN.
I think this is caused by the lasso loss. Can you give some advice? Thanks!
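In case it helps, here is a minimal sketch of the kind of guard I would try around the lasso term (the names `mask` and `lambda_lasso` are my assumptions about the code, not the actual ManiDP internals):

```python
import torch

def safe_lasso_loss(mask: torch.Tensor, lambda_lasso: float = 5e-3) -> torch.Tensor:
    """L1 (lasso) penalty on the channel mask, with guards against NaN/Inf.

    Sketch only: `mask` is assumed to be the per-channel saliency from the
    gate module; the real ManiDP loss may differ.
    """
    # Replace NaN/Inf that leaked in from earlier layers before taking the norm.
    mask = torch.nan_to_num(mask, nan=0.0, posinf=1e3, neginf=0.0)
    return lambda_lasso * mask.abs().mean()

# Locating where the first NaN appears is also easier with anomaly detection
# (slow, so enable only while debugging):
# torch.autograd.set_detect_anomaly(True)
```

Lowering --lambda_lasso or adding gradient clipping might also be worth trying.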
Hello. I am using the ManiDP code, and I want to train it on CIFAR-10. However, I can't find the pretrained model. Could you provide the pretrained model or the code for the original (dense) training?
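In the meantime, here is a minimal sketch of a standard CIFAR-10 pretraining recipe (the hyperparameters and the resnet18 stand-in are my assumptions; the paper's dense ResNet-56 baseline may have used different settings):

```python
import torch
import torchvision
import torchvision.transforms as T

# Standard CIFAR-10 augmentation/normalization used by most ResNet baselines.
train_tf = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10('./data', train=True, download=True,
                                         transform=train_tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True,
                                     num_workers=4)

# Stand-in backbone; the repo presumably expects a CIFAR-style ResNet-56.
model = torchvision.models.resnet18(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[100, 150],
                                                 gamma=0.1)
```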
Hi,
Could you please provide more information on how you calculated the FID score? For example, which code did you use, and how many images did you use to compute it?
Thanks in advance.
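For reference, one widely used implementation (an assumption about what might have been used here, not a confirmation) is the pytorch-fid package, which compares Inception-v3 statistics of two image folders:

```python
# pip install pytorch-fid
from pytorch_fid.fid_score import calculate_fid_given_paths

# Compares Inception-v3 activation statistics of two folders of images.
fid = calculate_fid_given_paths(
    ['path/to/real_images', 'path/to/generated_images'],
    batch_size=50,
    device='cuda',
    dims=2048,  # final average-pool features of Inception-v3
)
print(f'FID: {fid:.2f}')
# Equivalent CLI:
# python -m pytorch_fid path/to/real_images path/to/generated_images
```

Was it this package, and roughly how many images per side (e.g. 10k-50k is common)?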
Hello, I was running search.py. On Windows 10, the value of pred_fake = netD_B(fake_B.detach()) was positive, while on Ubuntu it was negative.
Can pred_fake = netD_B(fake_B.detach()) be negative? Is that normal?
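For what it's worth, if netD_B follows the usual CycleGAN design (an LSGAN objective with a PatchGAN head and no final sigmoid), negative raw outputs are perfectly normal, and the sign difference between machines is most likely just different random initialization rather than an OS issue. A minimal sketch of why the output is unbounded (a generic PatchGAN-style head, my assumption about the actual netD_B):

```python
import torch
import torch.nn as nn

# PatchGAN-style discriminator head: the last layer is a plain convolution
# with no sigmoid, so outputs are raw scores and can be negative.
netD = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),
    nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, kernel_size=4, stride=1, padding=1),  # unbounded scores
)

fake_B = torch.randn(1, 3, 256, 256)
pred_fake = netD(fake_B.detach())
print(pred_fake.min().item(), pred_fake.max().item())  # both signs possible
```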
Hello. Thanks for your work. I'm confused about how ManiDP achieves acceleration.
I found that the convolution is forwarded without masks.
In that case, the FLOPs will always be the sum of the original model and the gate module, which is slower than the original model alone.
So how is the FLOPs reduction calculated? Could you provide the code for the FLOPs evaluation?
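To make the question concrete, here is how I would expect the reported FLOPs to be computed for dynamic pruning: per input sample, only the channels the gate keeps are counted, rather than timing the masked forward pass (the formula below is my assumption, not code from the repo):

```python
import torch

def conv_flops(in_ch: int, out_ch: int, k: int, out_h: int, out_w: int) -> int:
    """Multiply-accumulate count of one k x k convolution layer."""
    return in_ch * out_ch * k * k * out_h * out_w

# Hypothetical example: a 3x3 conv with 64 -> 64 channels on a 32x32 map,
# where the gate keeps ~60% of the output channels for this sample.
mask = torch.rand(64) > 0.4            # stand-in for a per-sample channel gate
full = conv_flops(64, 64, 3, 32, 32)
dynamic = conv_flops(64, int(mask.sum()), 3, 32, 32)
print(f'per-sample FLOPs reduction: {1 - dynamic / full:.1%}')
```

Is this roughly what the reported numbers do, i.e. a theoretical count averaged over the dataset?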
Separately, I have found that some entries of the mask grow extremely large and become NaN when a single sample is fed into the MaskBlock, even though clamp_max is set to 1000.
Excuse me, could you release the pretrained pruned models for the different hyperparameter values γ (0.1, 1, 10), such as netG_A2B_prune_200.pth and netG_B2A_prune_200.pth for each γ? I'd like to study them, but training them on my machine would take a long time. Thank you!
Hello,
Thank you for your work. It looks amazing, and I'd like to try it on a custom dataset.
Could you please share the training script for the generator on ImageNet or on another dataset?
Hi, I'm a researcher working on CNN light-weighting, and I'm building a filter-pruning leaderboard for better comparison.
I think the GPU hours for search and training are important for evaluating pruning algorithms, but they aren't reported.
Would you let me know the GPU hours for each ImageNet configuration?