jiangtaoxie / fast-mpn-cov

@CVPR2018: Efficient unrolling iterative matrix square-root normalized ConvNets, implemented in PyTorch (with code for B-CNN, compact bilinear pooling, etc.) for training from scratch and fine-tuning.

Home Page: http://peihuali.org/iSQRT-COV/index.html

License: MIT License

Python 85.92% Shell 14.08%

fast-mpn-cov's People

Contributors

akindofyoga, jiangtaoxie, lvyilin, warbean

fast-mpn-cov's Issues

Should we use SVMs for FGVC?

Hi, I am reading your iSQRT paper and I think it is quite interesting. However, I am confused about the use of SVMs.
You wrote "After finetuning, the outputs of iSQRT-COV layer are ℓ2−normalized before inputted to train k one-vs-all linear SVMs with hyperparameter C = 1" in your paper, but I didn't find it in your released code.
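
The quoted sentence describes a step applied to extracted features after fine-tuning, not a layer inside the network. Below is a minimal sketch of the ℓ2-normalization part only, assuming the features have already been extracted into a NumPy array. The array and the function name are hypothetical and do not come from the released code. The normalized rows could then be passed to any one-vs-all linear SVM trainer with C = 1.

```python
import numpy as np


def l2_normalize(features, eps=1e-12):
    """Scale each row (one sample's feature vector) to unit Euclidean length."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # eps guards against division by zero for an all-zero row.
    return features / np.maximum(norms, eps)


# Hypothetical stand-in for extracted iSQRT-COV features: 4 samples, 6 dimensions.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 6))
feats_unit = l2_normalize(feats)

# Every row now has norm 1 (up to floating-point rounding).
print(np.linalg.norm(feats_unit, axis=1))
```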

Loss and accuracy don't change

Hi, thank you for a great paper and code.

I tried to run the code on the Cars196 dataset with MPNCOV and mpncovresnet50 in a Jupyter Notebook. At first I got an error in loss.backward(). After adjusting the code with detach(), I could run it.

However, the loss and accuracy don't change from epoch to epoch; they just oscillate around the same value. The best top-1 accuracy is 0.006 and the loss stays at 5.28 for all epochs.
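
A loss of 5.28 is what a classifier scores when it assigns equal probability to all 196 classes, and the reported accuracy is close to random guessing. That suggests the weights are not being updated. One possible cause is that detach() removes a tensor from the autograd graph, so no gradient flows back through it. A quick check of the chance-level numbers:

```python
import math

num_classes = 196  # Cars196 has 196 classes

# Cross-entropy of a uniform prediction over all classes.
chance_loss = math.log(num_classes)
# Expected top-1 accuracy when guessing uniformly at random.
chance_acc = 1.0 / num_classes

print(f"chance loss  = {chance_loss:.2f}")   # chance loss  = 5.28
print(f"chance top-1 = {chance_acc:.4f}")    # chance top-1 = 0.0051
```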

Could you help me to make the code run properly?

Thank you in advance.

Why is your manual implementation via autograd.Function even faster than PyTorch's autograd engine?

In order to make my code clean and easy to read, I tried to reimplement covpool, sqrtm and triuvec with native PyTorch operators as a simple plain python function, as shown in #7.

After ensuring the forward and backward results are equivalent between my auto backward version (with autograd engine) and your manual backward version (with autograd.Function), I tested their speed and surprisingly found my auto backward version slower.

Have you compared these two approaches before? Do you have any idea why the manual backward implementation is even faster than PyTorch's autograd engine?

The loss tends to be NaN

When I used mpncovresnet50 with MPNCOV, training converged.
But if I change the backbone to resnet* or VGG* and keep MPNCOV unchanged, the training loss is NaN.
Besides, my dataset is for a mini-FGVC task. It contains 9 classes and is extremely unbalanced. When I fine-tune in two stages, the test accuracy is about 78, which is lower than plain VGG. Could you give me some advice?
Thank you for your amazing work.

gradcheck for Sqrtm in MPNCOV.py

Hello,
I ran autograd.gradcheck on Sqrtm in MPNCOV.py and it returned False, but if I delete the line 'der_NSiter = der_NSiter.transpose(1, 2)', it returns True.

Could anybody explain this?

Thanks so much.
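
One point that may be relevant: a numerical gradient check perturbs every matrix entry independently, so its test inputs are generally not symmetric, whereas covariance matrices always are. A gradient formula that differs from the true one only by a transpose agrees with it on symmetric inputs and disagrees on general ones. The sketch below is not an analysis of Sqrtm itself. It uses a toy function, tr(A·A), whose gradient is 2Aᵀ, to show that effect in NumPy.

```python
import numpy as np


def f(a):
    """Scalar test function f(A) = trace(A @ A)."""
    return np.trace(a @ a)


def numerical_grad(func, a, h=1e-6):
    """Central finite differences, perturbing each entry independently."""
    grad = np.zeros_like(a)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            step = np.zeros_like(a)
            step[i, j] = h
            grad[i, j] = (func(a + step) - func(a - step)) / (2 * h)
    return grad


rng = np.random.default_rng(0)
general = rng.normal(size=(3, 3))   # not symmetric
symmetric = general + general.T     # symmetric by construction

results = {}
for name, a in [("general", general), ("symmetric", symmetric)]:
    num = numerical_grad(f, a)
    results[name] = {
        "matches_2A_transposed": bool(np.allclose(num, 2 * a.T, atol=1e-5)),
        "matches_2A": bool(np.allclose(num, 2 * a, atol=1e-5)),
    }
    print(name, results[name])
```

On the symmetric input both candidate formulas pass; on the general input only the transposed one does. A check that uses unconstrained inputs is therefore sensitive to a transpose that is invisible in normal use.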

Model parameters

I noticed a problem in finetune.sh.
If I don't change the settings, the model used is this:
(features): WITH ALL LAYER
(classifier): Linear(in_features=32896, out_features=70, bias=True)
(representation): MPNCOV()

I don't understand why the representation layer is listed after the classifier.

RuntimeError: Error(s) in loading state_dict for DataParallel:

When I fine-tuned mpncovresnet50 in the second stage, this error occurred.

loading checkpoint 'Finetune-c9-mpncovresnet50-MPNCOV-reproduce-lr0.001-bs40/checkpoint.pth.tar'
Traceback (most recent call last):
  File "main.py", line 436, in <module>
    main()
  File "main.py", line 179, in main
    model.load_state_dict(checkpoint['state_dict'])
  File "/home/wen/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
    Missing key(s) in state_dict: "module.representation.conv_dr_block.0.weight", "module.representation.conv_dr_block.1.weight", "module.representation.conv_dr_block.1.bias", "module.representation.conv_dr_block.1.running_mean", "module.representation.conv_dr_block.1.running_var".
    Unexpected key(s) in state_dict: "module.features.8.weight", "module.features.9.weight", "module.features.9.bias", "module.features.9.running_mean", "module.features.9.running_var", "module.features.9.num_batches_tracked".

two_stage_finetune.txt
this is my config.
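
Apart from their prefixes, the missing and unexpected keys line up one-to-one: a weight-only layer followed by a layer with weight, bias, running_mean and running_var. That would be consistent with the checkpoint storing the dimension-reduction convolution and batch norm under features, while the model being built expects them under representation. Whether that is what happened depends on the two-stage configuration, which is not visible here. The sketch below only shows how such a prefix remapping could be written on a plain dictionary. The mapping itself is a guess, and the tensor shapes would need to be verified before loading.

```python
# Key names copied from the error message above.
missing_in_checkpoint = [
    "module.representation.conv_dr_block.0.weight",
    "module.representation.conv_dr_block.1.weight",
    "module.representation.conv_dr_block.1.bias",
    "module.representation.conv_dr_block.1.running_mean",
    "module.representation.conv_dr_block.1.running_var",
]
unexpected_in_checkpoint = [
    "module.features.8.weight",
    "module.features.9.weight",
    "module.features.9.bias",
    "module.features.9.running_mean",
    "module.features.9.running_var",
    "module.features.9.num_batches_tracked",
]

# Hypothetical prefix mapping; it assumes both groups hold the same layers
# with identical tensor shapes, which must be checked before relying on it.
RENAME = {
    "module.features.8.": "module.representation.conv_dr_block.0.",
    "module.features.9.": "module.representation.conv_dr_block.1.",
}


def remap_keys(state_dict, rename):
    """Return a copy of state_dict with matching key prefixes rewritten."""
    remapped = {}
    for key, value in state_dict.items():
        for old, new in rename.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break
        remapped[key] = value
    return remapped


# Placeholder values stand in for tensors; only the key names matter here.
checkpoint = {key: None for key in unexpected_in_checkpoint}
remapped = remap_keys(checkpoint, RENAME)

still_missing = sorted(set(missing_in_checkpoint) - set(remapped))
still_unexpected = sorted(set(remapped) - set(missing_in_checkpoint))
print("still missing:", still_missing)
print("still unexpected:", still_unexpected)
```

After remapping, the only leftover key is num_batches_tracked, a BatchNorm buffer that exists only in some PyTorch versions. The checkpoint may therefore also have been written by a different PyTorch version than the one loading it.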

What is the meaning of "Implementations"?

Hello, I don't understand the three points listed under Implementations.

  1. Is the experiment implemented in three separate ways: in PyTorch, in TensorFlow, and in MatConvNet?
  2. Or is PyTorch used first, then TensorFlow, and finally MatConvNet?
  3. Or is MatConvNet used inside the PyTorch environment (or inside the TensorFlow environment)?

My question may be naive. Thanks very much if I can hear from you.

Downloading problem?

When I ran finetune.sh, it began to download from jtxie.com, but the download was too slow, so I decided to use BaiduYun instead. However, I got the following problem:

Start finetuning!
Namespace(arch='mpncovresnet101', batch_size=10, benchmark='CUB', classifier_factor=5, data='/path/to/the/data/CUB', dist_backend='gloo', dist_url='tcp://224.66.41.62:23456', epochs=100, evaluate=False, freezed_layer=0, gpu=1, lr=0.0012, lr_method='step', lr_params=[[100.0]], modeldir='Results/Finetune-CUB-mpncovresnet101-MPNCOV-reproduce-lr1.2e-3-bs10', momentum=0.9, num_classes=200, pretrained=True, print_freq=100, representation='MPNCOV', resume='Results/Finetune-CUB-mpncovresnet101-MPNCOV-reproduce-lr1.2e-3-bs10/mpncovresnet101-ade9737a.pth.tar', seed=None, start_epoch=0, store_model_everyepoch=False, weight_decay=0.0001, workers=8, world_size=1)
main.py:127: UserWarning: You have chosen a specific GPU. This will completely disable data parallelism.
  warnings.warn('You have chosen a specific GPU. This will completely '
=> loading checkpoint 'Results/Finetune-CUB-mpncovresnet101-MPNCOV-reproduce-lr1.2e-3-bs10/mpncovresnet101-ade9737a.pth.tar'
Traceback (most recent call last):
  File "main.py", line 503, in <module>
    main()
  File "main.py", line 206, in main
    checkpoint = torch.load(args.resume)
  File "/home/yzzc/.local/lib/python3.5/site-packages/torch/serialization.py", line 358, in load
    return _load(f, map_location, pickle_module)
  File "/home/yzzc/.local/lib/python3.5/site-packages/torch/serialization.py", line 527, in _load
    return legacy_load(f)
  File "/home/yzzc/.local/lib/python3.5/site-packages/torch/serialization.py", line 441, in legacy_load
    tar.extract('storages', path=tmpdir)
  File "/usr/lib/python3.5/tarfile.py", line 2027, in extract
    tarinfo = self.getmember(member)
  File "/usr/lib/python3.5/tarfile.py", line 1738, in getmember
    raise KeyError("filename %r not found" % name)
KeyError: "filename 'storages' not found"
How can I solve this problem?
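
The traceback shows torch.load opening the file as a tar archive and then failing to find a member named 'storages'. One possibility is that the BaiduYun download is an archive wrapping the checkpoint, or a damaged file, rather than the checkpoint itself. The standard-library sketch below inspects a downloaded file before handing it to torch.load. The function names are hypothetical. The hash check assumes that the hex suffix in a name such as mpncovresnet101-ade9737a.pth.tar is a SHA-256 prefix, as in the torchvision model-zoo naming convention; that is an assumption about this repository, not something stated on this page.

```python
import hashlib
import io
import re
import tarfile
import tempfile
import zipfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inspect_download(path):
    """Report container type, tar members, and whether a name-embedded hash matches."""
    path = Path(path)
    report = {
        "size_bytes": path.stat().st_size,
        "is_zip": zipfile.is_zipfile(str(path)),
        "is_tar": tarfile.is_tarfile(str(path)),
        "tar_members": None,
        "hash_prefix_ok": None,
    }
    if report["is_tar"]:
        with tarfile.open(str(path)) as archive:
            report["tar_members"] = archive.getnames()
    # Assumption: "name-<hex>.ext" embeds the leading hex digits of the SHA-256.
    match = re.search(r"-([0-9a-f]{8,})\.", path.name)
    if match:
        report["hash_prefix_ok"] = sha256_of(path).startswith(match.group(1))
    return report


# Demo on a throwaway tar archive wrapping a fake payload (not a real checkpoint).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo-00000000.pth.tar"
    payload = b"not a real checkpoint"
    with tarfile.open(str(demo_path), "w") as archive:
        info = tarfile.TarInfo(name="inner.pth")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    demo_report = inspect_download(demo_path)

print(demo_report)
```

If the report for the real file lists tar members, extracting the inner file and pointing the resume path at it would be the next thing to try. A failed hash comparison would instead suggest downloading the file again.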

About the post-compensation implementation using trace

Hi, I am trying to use the trace to implement pre-normalization and post-compensation, but I find that the trace of the covariance matrix can be negative, which breaks the sqrt operation.

So, can we guarantee that the trace of the covariance matrix is positive, or is there some operation that ensures it is positive?
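
In exact arithmetic the trace of a sample covariance matrix equals the sum of the per-channel variances, so it cannot be negative, whatever the sign of the features. A negative value in practice would point to something else, for example a matrix that is not actually the covariance, or a numerical problem. The NumPy sketch below assumes the covariance is computed as X Ī Xᵀ with Ī = (1/M)(I − (1/M)11ᵀ). The function name and shapes are chosen for illustration.

```python
import numpy as np


def cov_pool(x):
    """Sample covariance of x with shape (d, M): d channels, M spatial positions."""
    d, m = x.shape
    centering = (np.eye(m) - np.ones((m, m)) / m) / m   # (1/M)(I - (1/M) 1 1^T)
    return x @ centering @ x.T


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 49)) - 5.0     # deliberately shifted so most features are negative
sigma = cov_pool(x)

trace = np.trace(sigma)
sum_of_variances = x.var(axis=1).sum()  # population variance (ddof=0) of each channel

print(trace, sum_of_variances)          # the two values agree and are non-negative
```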

The result of combining mpncov & efficientnet did not meet expectations

Thanks for the great work.

I tried the pretrained mpncovresnet101 and mpncovresnet50, and the performance is impressive. But the performance is poor when I combine mpncov with efficient-net. I froze the backbone parameters and only updated the parameters of the reduce layer and fc. I replaced layer_reduce_relu with Swish and trained on imagenet2012 for 55 epochs; the top-1 accuracy is only about 0.76.

I wonder why mpncov performs poorly on a better backbone. Any advice would be appreciated!

About the implementation of the MPNCOV meta-layer in PyTorch 0.3.1

I am trying to add the mpncov layer to my network in PyTorch 0.3.1, but I get the following error:

Traceback (most recent call last):
  File "main.py", line 421, in <module>
    main()
  File "main.py", line 211, in main
    loss_temp, train_prec1_temp, train_prec5_temp = train(train_loader, model, criterion, optimizer, epoch)
  File "main.py", line 269, in train
    output = model(input)
  File "/home/zhangli/anaconda3/envs/pytorch-0.3.1/lib/python3.5/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhangli/anaconda3/envs/pytorch-0.3.1/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 71, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/zhangli/anaconda3/envs/pytorch-0.3.1/lib/python3.5/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhangli/wubanggu/darts/test_res/resnet.py", line 156, in forward
    x = self.representation(x)
  File "/home/zhangli/anaconda3/envs/pytorch-0.3.1/lib/python3.5/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhangli/wubanggu/darts/test_res/MPNCOV.py", line 69, in forward
    x = self._cov_pool(x)
  File "/home/zhangli/wubanggu/darts/test_res/MPNCOV.py", line 60, in _cov_pool
    return Covpool.apply(x)
RuntimeError: save_for_backward can only save input or output tensors, but argument 1 doesn't satisfy this condition

How can I fix it? Thank you very much.

Where should I put the pre-trained model?

Hello,
I tried to train the model myself. It shows amazing performance.
Hence, I am going to fine-tune the model.
After downloading the model from Google Drive, I think I should put it in a specific path.
Where should I put the pre-trained model? Or where should I set its path?
I didn't find a clear hint about this.
I look forward to your reply.

Can't download the pretrained pth

Thank you so much for the excellent work! When I run finetune.sh, it tells me that http://jtxie.com/models/mpncovresnet50-15991845.pth can't be downloaded. When I type the link directly into my browser, the website doesn't load either. Could you please check the link? I would appreciate it if you could provide the pth. That is really excellent work! Thank you so much!

About the comparison results?

Could you please tell me where you got the results of the CBP-Resnet50 model? Did you run the experiments yourself? If so, could you please release your code or training files?

Fatal IO error: client killed

When I run the code, the following error occurs and terminates the program: Fatal IO error: client killed. Is it a problem with my system environment? Do you know how to solve it? Thanks very much!

This error only occurs when I run the code a second time; the first run succeeds.

Fine-tune issue

When I ran finetune.sh in ./finetune/, I noticed that during fine-tuning the training process applied the forward function in base.py instead of the one in mpncovresnet.py. So, if I am right, it ignores the following operations in mpncovresnet.py:
    x = MPNCOV.CovpoolLayer(x)
    x = MPNCOV.SqrtmLayer(x, 5)
    x = MPNCOV.TriuvecLayer(x)
Why?
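
For readers unfamiliar with what the three calls named above compute, here is a single-sample NumPy sketch: covariance pooling, a matrix square root by coupled Newton-Schulz iteration with trace pre-normalization and post-compensation (5 iterations, matching the second argument of SqrtmLayer), and extraction of the upper triangle as a vector. It is not the repository's implementation, which works on batched tensors with a hand-written backward pass. All function names here are hypothetical.

```python
import numpy as np


def cov_pool(x):
    """Covariance of x with shape (d, M): d channels, M spatial positions."""
    d, m = x.shape
    centering = (np.eye(m) - np.ones((m, m)) / m) / m
    return x @ centering @ x.T


def sqrtm_newton_schulz(a, num_iters=5):
    """Approximate square root of an SPD matrix by coupled Newton-Schulz iteration."""
    eye = np.eye(a.shape[0])
    tr = np.trace(a)
    y = a / tr              # pre-normalization by the trace so the iteration converges
    z = eye.copy()
    for _ in range(num_iters):
        t = 0.5 * (3.0 * eye - z @ y)
        y, z = y @ t, t @ z  # both updates use the previous y and z
    return np.sqrt(tr) * y   # post-compensation restores the original scale


def triu_vec(a):
    """Stack the upper-triangular entries (diagonal included) into a vector."""
    rows, cols = np.triu_indices(a.shape[0])
    return a[rows, cols]


# Accuracy check on a small, well-conditioned SPD matrix.
spd = np.array([[4.0, 1.0], [1.0, 3.0]])
root = sqrtm_newton_schulz(spd, num_iters=5)
print(np.abs(root @ root - spd).max())   # small residual

# Full pipeline on hypothetical features: d=4 channels, M=16 positions.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 16))
vec = triu_vec(sqrtm_newton_schulz(cov_pool(features), num_iters=5))
print(vec.shape)   # (10,) because 4 * 5 / 2 = 10
```

The output length is d(d+1)/2. For d = 256 that is 32896, which matches the in_features=32896 of the classifier printed in the "Model parameters" issue on this page.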

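Experiments on the CUB-200-2011 dataset

Hello, I read your paper and saw that you ran experiments on the CUB dataset. Could you release the code for this dataset? Thank you!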
