zeiss-microscopy / bsconv
Reference implementation for Blueprint Separable Convolutions (CVPR 2020)
License: BSD 3-Clause Clear License
Great paper!
Just one small question:
It seems that you have not altered the structure of MobileNet v2 and v3, because they sort of already have BSConv built in?
Does this imply that the accuracy gain (especially on CIFAR) comes purely from the proposed orthonormal regularization loss?
Hi, thank you for releasing the code. I have a question and am looking forward to your answer:
This PCA code, in my opinion, reduces the dimensionality of the features (K*K) and shows the redundancy of the features within each kernel; how are the intra-kernel correlations derived from this?
step 1: split 3D kernel F into 2D kernels (assuming F is of size CxHxW)
import numpy as np
xs = [F[nChannel, :, :].flatten() for nChannel in range(F.shape[0])]
X = np.array(xs)
step 2: perform PCA
import sklearn.decomposition
pca = sklearn.decomposition.PCA(n_components=None)
pca.fit(X)
step 3: this is the variance of F which is explained by the first principal component (PC1)
v = pca.explained_variance_ratio_[0]
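For completeness, the three steps above can be combined into one self-contained snippet. The random kernel here is just a placeholder for a real trained filter of size CxHxW; any actual weight tensor could be substituted for F:

```python
import numpy as np
import sklearn.decomposition

# Placeholder 3D kernel of size C x H x W (here 32 x 3 x 3);
# in practice F would be a trained convolution filter.
rng = np.random.default_rng(0)
F = rng.standard_normal((32, 3, 3))

# step 1: flatten each 2D slice along the depth axis into a row vector
X = np.array([F[c, :, :].flatten() for c in range(F.shape[0])])

# step 2: PCA over the rows
pca = sklearn.decomposition.PCA(n_components=None)
pca.fit(X)

# step 3: fraction of the total variance explained by the first principal component
v = pca.explained_variance_ratio_[0]
print(v)  # a high value indicates strong intra-kernel correlation
```

A value of v close to 1 means the 2D slices of the kernel are nearly scalar multiples of a single 2D "blueprint", which is exactly the redundancy BSConv exploits.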
batch_size: 128, momentum: 0.9, weight decay: 1e-4, epochs: 100, learning rate: 0.1, linearly decayed at every epoch.
The results of the paper cannot be reproduced. Is there a problem with my hyperparameter settings?
In paper 5.3:
For Stanford Dogs, Stanford Cars, and Oxford 102 Flowers, the learning rate is adjusted as below:
"The initial learning rate is set to 0.1
and linearly decayed at every epoch such that it approaches
zero after a total of 100 epochs."
How is the learning rate adjusted? Is it written in the script (bsconv_pytorch_train.py)?
Is the learning rate adjusted as below?
epoch 0, lr_rate = 0.1
epoch 1, lr_rate = 0.1-(0.1/100)*1
..
epoch n, lr_rate = 0.1-(0.1/100)*n
..
epoch 99, lr_rate = 0.1-(0.1/100)*99
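That reading of the quoted schedule can be written as a single function. This is only a sketch of the linear rule as I understand it, not something confirmed against the training script:

```python
def linear_lr(epoch, lr0=0.1, total_epochs=100):
    # lr0 at epoch 0, decaying linearly so that it reaches 0 at epoch == total_epochs;
    # equivalent to lr0 - (lr0 / total_epochs) * epoch
    return lr0 * (1.0 - epoch / total_epochs)

print(linear_lr(0))   # 0.1
print(linear_lr(99))  # ~0.001 (last epoch before reaching zero)
```

If this is indeed the schedule, it could be plugged into torch.optim.lr_scheduler.LambdaLR as the multiplicative factor lambda epoch: 1 - epoch / 100.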
Hello, thanks for releasing the code. I have a couple of questions:
I want to confirm the structure of BSConv-S. In my understanding, the module is:
bsconvS = [(Conv1x1 + BN) --> (Conv1x1 + BN) --> (dw-Conv3x3)]
and BN and ReLU are only applied after bsconvS; there is no activation in the middle of BSConv-S.
Does this also hold for the BS residual inverted bottleneck of MobileNetV2? If so, the transformed block has only one activation, at the end (while the original one has two ReLUs). Specifically:
Inverted-Residual Block:
x--> [conv1x1-BN-Act --> conv3x3-BN-Act --> conv1x1-BN] + x
while BS-Inverted Bottleneck:
x--> [conv1x1-BN --> conv1x1-BN--> conv3x3-BN-Act] + x
If there is no ReLU in the middle of BSConv-S, then during inference, can we merge the first two Conv1x1 layers into a single Conv1x1 (which reduces to BSConv-U) to save computation?
Did you compare inference speed between BSConv-S and regular Conv?
Thank you.
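The merge asked about above can be checked numerically: two pointwise convolutions with no nonlinearity between them compose into a single pointwise convolution whose weight is the matrix product of the two weights (inference-time BN is a fixed per-channel affine map, so it can be folded into the adjacent conv first). A minimal NumPy sketch, with all channel counts and spatial sizes chosen arbitrarily for illustration:

```python
import numpy as np

# Hypothetical shapes: cin input channels, mid bottleneck channels, cout output channels.
cin, mid, cout, H, W = 8, 4, 16, 5, 5
rng = np.random.default_rng(0)
W1 = rng.standard_normal((mid, cin))   # first 1x1 conv, viewed as a (mid x cin) matrix
W2 = rng.standard_normal((cout, mid))  # second 1x1 conv, viewed as a (cout x mid) matrix
x = rng.standard_normal((cin, H, W))   # input feature map

# Apply the two pointwise convs sequentially (no nonlinearity in between).
y_seq = np.einsum('om,mhw->ohw', W2, np.einsum('mc,chw->mhw', W1, x))

# Merge: the composition of two linear maps is the single linear map W2 @ W1.
W_merged = W2 @ W1                     # shape (cout, cin)
y_merged = np.einsum('oc,chw->ohw', W_merged, x)

print(np.allclose(y_seq, y_merged))    # True: the two are numerically equivalent
```

So at inference time the merged variant does one (cout x cin) pointwise conv instead of two, at the cost of losing the low-rank factorization that makes BSConv-S cheap when mid is small.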
How do I get the histogram of the variance along the depth axis of the filter kernels, as shown in Figure 2 of the paper? Can you share your code? Thanks!
Hello!
I read your paper thoroughly and was amazed by your work!
While googling about Wide-ResNet, I found this repo explaining how to train the best Wide-ResNet. What do you think about applying these training details in your bin/bsconv_pytorch_train.py, if and only if they improve results?
If you think this is a good idea, let me know and I will try it.
Thank you.
Hello, when I read this paper I had a question about BSConv-S: there is a choice of whether to use a BN and activation layer, so when should the BN and activation layer be used in BSConv-S?
Hello Manuel! I have read your CVPR2020 paper, and your method is effective on ConvNets.
While following your work, I have problems reproducing the MobileNetV3-large CIFAR-100 baseline, which has 75% accuracy. However, with the following setting:
epochs = 200; SGD with momentum 0.9; weight decay of 1e-4; lr = 0.1, decayed by a factor of 0.1 at epochs 100, 150, and 180,
I can only get an accuracy of around 70%.
I also changed the first two stride=2 layers to stride=1 for MobileNetV3.
Can you share your parameter setting? Or is there anything wrong? Thanks for helping me.
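For reference, the step schedule described in the question can be expressed compactly. This only restates the questioner's setting, not a schedule confirmed by the authors:

```python
import bisect

def step_lr(epoch, lr0=0.1, milestones=(100, 150, 180), gamma=0.1):
    # Multiply lr0 by gamma once for every milestone that the epoch has reached.
    return lr0 * gamma ** bisect.bisect_right(milestones, epoch)

print(step_lr(50))   # 0.1 (before the first milestone)
print(step_lr(120))  # ~0.01
print(step_lr(185))  # ~1e-4
```

This matches what torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100, 150, 180], gamma=0.1) would produce.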
I printed out the two models and don't see any difference. To reproduce:
import bsconv.pytorch
model1 = bsconv.pytorch.get_model('mobilenetv2_w1_bsconvs', num_classes=100)
print(model1)
and
import bsconv.pytorch
model2 = bsconv.pytorch.get_model('mobilenetv2_w1', num_classes=100)
print(model2)