zeiss-microscopy / bsconv


Reference implementation for Blueprint Separable Convolutions (CVPR 2020)

License: BSD 3-Clause Clear License

cvpr2020 pytorch depthwise-separable-convolutions resnet mobilenet image-classification deep-learning efficient-neural-networks zeiss cifar10


bsconv's Issues

How is BSConv being utilized in MobileNet V2 and V3?

Great paper!
Just one small question:
It seems that you have not altered the structure of MobileNet V2 and V3, because they sort of already have BSConv built in?
Does this imply that the accuracy gain (especially on CIFAR) comes purely from the proposed orthonormal regularization loss?

About the PCA in section 3.1 of the paper.

Hi, thank you for releasing the code. I have a question and look forward to your answer:
This PCA code, in my opinion, reduces the dimensionality of the features (K*K) and shows the redundancy of the features within each kernel. How are the intra-kernel correlations derived from this?

step 1: split 3D kernel F into 2D kernels (assuming F is of size CxHxW)

import numpy as np

xs = [F[nChannel, :, :].flatten() for nChannel in range(F.shape[0])]
X = np.array(xs)

step 2: perform PCA

import sklearn.decomposition
pca = sklearn.decomposition.PCA(n_components=None)
pca.fit(X)

step 3: this is the variance of F which is explained by the first principal component (PC1)

v = pca.explained_variance_ratio_[0]
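Putting the three steps together, a minimal self-contained sketch (with a randomly generated kernel F standing in for a trained filter bank, so the numbers are illustrative only):

```python
import numpy as np
import sklearn.decomposition

# Hypothetical 3D kernel: C depth slices of size K x K (here C=64, K=3)
rng = np.random.default_rng(0)
F = rng.normal(size=(64, 3, 3))

# Step 1: flatten each 2D slice into a row -> data matrix X of shape (C, K*K)
X = np.array([F[c, :, :].flatten() for c in range(F.shape[0])])

# Step 2: PCA across the C slices
pca = sklearn.decomposition.PCA(n_components=None)
pca.fit(X)

# Step 3: fraction of the variance explained by the first principal component
v = pca.explained_variance_ratio_[0]
print(v)
```

For a trained kernel, a value of v close to 1 would indicate that the depth slices are highly correlated (i.e. well approximated by scaled versions of a single 2D blueprint).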

5.3. Fine-grained Recognition

batch size: 128, momentum: 0.9, weight decay: 10^-4, epochs: 100, learning rate: 0.1, linearly decayed at every epoch.
The results of the paper cannot be reproduced. Is there a problem with my hyperparameter settings?

scheduling the learning rate for sub_imagenet datasets.

In Section 5.3 of the paper:
For Stanford Dogs, Stanford Cars, and Oxford 102 Flowers, the learning rate is adjusted as below:
"The initial learning rate is set to 0.1
and linearly decayed at every epoch such that it approaches
zero after a total of 100 epochs."

How is the learning rate adjusted? Is it written in the script (bsconv_pytorch_train.py)?
Is the learning rate adjusted as below?
epoch 0, lr_rate = 0.1
epoch 1, lr_rate = 0.1-(0.1/100)*1
..
epoch n, lr_rate = 0.1-(0.1/100)*n
..
epoch 99, lr_rate = 0.1-(0.1/100)*99
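The schedule sketched above can be expressed as a small helper. This is an assumption about the rule quoted from the paper, not a confirmed excerpt from bsconv_pytorch_train.py:

```python
def linear_lr(epoch, base_lr=0.1, total_epochs=100):
    """Linear decay: starts at base_lr and reaches 0 after total_epochs.

    Matches the rule lr(epoch) = base_lr - (base_lr / total_epochs) * epoch.
    """
    return base_lr - (base_lr / total_epochs) * epoch

print(linear_lr(0))    # base_lr = 0.1
print(linear_lr(1))    # ~0.099
print(linear_lr(99))   # ~0.001
```

In PyTorch this rule could be plugged into `torch.optim.lr_scheduler.LambdaLR` via `lr_lambda=lambda e: 1.0 - e / total_epochs`, since LambdaLR multiplies the base learning rate by the lambda's value.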

About activation layer and inference:

Hello, thanks for releasing the code. I have a couple of questions:

  • I want to confirm the structure of BSConv-S. In my understanding, the module is:
    bsconvS = [(Conv1x1 + BN) --> (Conv1x1 + BN) --> (dw-Conv3x3)]
    and BN and ReLU are only applied after bsconvS; there is no activation in the middle of BSConv-S.
    Does this also hold for the BSConv residual inverted bottleneck of MobileNetV2? So that the transformed block has only one activation, at the end (while the original one has two ReLUs). Specifically:
    Inverted-Residual Block:
    x --> [conv1x1-BN-Act --> conv3x3-BN-Act --> conv1x1-BN] + x
    while BS-Inverted Bottleneck:
    x --> [conv1x1-BN --> conv1x1-BN --> conv3x3-BN-Act] + x

  • If there is no ReLU in the middle of BSConv-S, then during inference, can we merge the first two Conv1x1 layers into a single Conv1x1 (which reduces to BSConv-U) to save computation?

  • Did you compare inference speed between BSConv-S and a regular Conv?

Thank you.
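Regarding the merging question: two consecutive 1x1 convolutions with no activation in between are both linear maps over the channel dimension, so they can indeed be fused into a single 1x1 convolution at inference time. A minimal numpy sketch (assuming BN has already been folded into the weights, and treating each 1x1 conv as a channel-mixing matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
C_in, C_mid, C_out, H, W = 8, 4, 8, 5, 5

W1 = rng.normal(size=(C_mid, C_in))   # first 1x1 conv as a matrix
W2 = rng.normal(size=(C_out, C_mid))  # second 1x1 conv as a matrix
x = rng.normal(size=(C_in, H * W))    # input feature map, flattened spatially

y_two_step = W2 @ (W1 @ x)            # apply both convs in sequence
W_merged = W2 @ W1                    # fuse into a single 1x1 conv
y_merged = W_merged @ x

print(np.allclose(y_two_step, y_merged))  # True
```

The equality follows from matrix associativity; whether fusing actually saves time in practice also depends on C_mid relative to C_in and C_out.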

about Figure 2 in paper

How did you get the histogram of the variance along the depth axis of the filter kernels shown in Figure 2 of the paper? Can you share your code? Thanks!!
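One plausible way to produce such a histogram — purely an assumption about how Figure 2 might have been made, not the authors' code: for each spatial position of each filter, compute the variance along the depth (channel) axis, then histogram those values over all filters.

```python
import numpy as np

# Random stand-in for a trained filter bank of shape (num_filters, C, K, K)
rng = np.random.default_rng(0)
filters = rng.normal(size=(128, 16, 3, 3))

# Variance along the depth axis -> one K x K variance map per filter
depth_var = filters.var(axis=1)             # shape (128, 3, 3)

# Histogram of all per-position variances across all filters
hist, bin_edges = np.histogram(depth_var.ravel(), bins=20)
print(hist.sum())  # 128 * 3 * 3 = 1152 values binned
```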

Ask about adjusting learning rate

Hello!

I read your paper very well and got amazed by your work!

While googling about Wide-ResNet, I found this repo explaining how to train the best Wide-ResNet. What do you think about applying these training details in your bin/bsconv_pytorch_train.py, if and only if they improve results?

If you think this is a good idea, let me know and I will try it.

Thank you.

About BSConv-S

Hello, while reading this paper I ran into a question about BSConv-S: there is a choice of whether to use a BN and activation layer. So when should the BN and activation layer be used in BSConv-S?

MobileNetv3-large baseline accuracy

Hello Manuel! I have read your CVPR2020 paper, and your method is effective on ConvNets.

While following your work, I have problems reproducing the MobileNetV3-large CIFAR-100 baseline, which has 75% accuracy. However, with the following setting:
epochs = 200; SGD with momentum 0.9; weight decay of 10^-4; lr = 0.1, decayed by a factor of 0.1 at epochs 100, 150, and 180,
I can only get an accuracy of around 70%.
I also changed the first two stride=2 layers to stride=1 for MobileNetV3.

Can you share your parameter settings? Or is there anything wrong? Thanks for helping me.
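For reference, the step schedule quoted in this setting can be written as a small helper. This is a sketch of the quoted hyperparameters, not code from the bsconv repository:

```python
def step_lr(epoch, base_lr=0.1, milestones=(100, 150, 180), gamma=0.1):
    """Multiply the learning rate by gamma at each milestone epoch."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

print(step_lr(0))    # 0.1
print(step_lr(120))  # ~0.01
print(step_lr(181))  # ~0.0001
```

The equivalent in PyTorch would be `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100, 150, 180], gamma=0.1)`.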

Models 'mobilenetv2_w1_bsconvs' and 'mobilenetv2_w1' are identical

I printed out the two models and don't see any difference. To reproduce:

import bsconv.pytorch
model1 = bsconv.pytorch.get_model('mobilenetv2_w1_bsconvs', num_classes=100)
print(model1)

and

import bsconv.pytorch
model2 = bsconv.pytorch.get_model('mobilenetv2_w1', num_classes=100)
print(model2)
