kaiyuyue / cgnl-network.pytorch
Compact Generalized Non-local Network (NeurIPS 2018)
Home Page: https://arxiv.org/abs/1810.13125
License: MIT License
How do we use CGNL in image segmentation tasks?
Hi! Thanks for your code and paper. I have several questions about this work.
(a) The results in your paper are about 1% lower than those in this repo. Why?
(b) In your paper, you also insert 5 NL blocks into ResNet. What are the specific positions of these blocks?
(c) Did you insert 5 NL/CGNL blocks when training on ImageNet?
Many thanks!
Hi,
I find your work very interesting. However, I have two questions regarding normalization:
In the original non-local neural networks work, the product of phi and theta is normalized BEFORE it is multiplied by g to produce the output (in their work this is done with a softmax layer). I do not see any such normalization in your work. Why?
Your Taylor expansion is based on the assumption that both theta and phi have unit L2 norm. I do not see this enforced in your code. What have I missed?
Thanks,
Thank you for your work.
I have a question about the dot-product kernel in SpatialCGNL.
In your code, the dot-product kernel is computed as p = p.view(b, 1, c * h * w), g = g.view(b, c * h * w, 1), att = torch.bmm(p, g), so the shape of att is (b, 1, 1). What does this shape mean?
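For readers following the shape question, here is a minimal sketch of the two reshapes and the batched product. The tensor sizes are made up for illustration, and random tensors stand in for the phi and g feature maps; this is not the repository's code, only the three quoted operations in isolation.

```python
import torch

# Hypothetical sizes chosen only for illustration.
b, c, h, w = 2, 4, 3, 3

# Random stand-ins for the phi (p) and g feature maps.
p = torch.randn(b, c, h, w, dtype=torch.float64)
g = torch.randn(b, c, h, w, dtype=torch.float64)

p_row = p.view(b, 1, c * h * w)   # (b, 1, chw): one row vector per sample
g_col = g.view(b, c * h * w, 1)   # (b, chw, 1): one column vector per sample

# Batched (1 x chw) @ (chw x 1) product: one inner product per sample.
att = torch.bmm(p_row, g_col)
print(att.shape)  # torch.Size([2, 1, 1])

# Each entry equals the plain dot product of the two flattened tensors.
expected = (p.reshape(b, -1) * g.reshape(b, -1)).sum(dim=1)
print(torch.allclose(att.view(b), expected))
```

Under these assumptions, att holds a single scalar per sample: the inner product of the fully flattened p and g.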
I implemented I3D and Non-local (NL) based on the code released by Wang et al. (2018); however, CGNL got slightly lower accuracy than NL on mini-Kinetics using 8-frame and 32-frame 5-block ResNet-50 models.
Since I was able to reproduce NL, I wonder whether CGNL needs any extra detail to reach the scores in the paper. Does CGNL use any different hyperparameters?
Hi! Am I doing something wrong? I tried to instantiate the resnet.SpatialCGNL() module, passing only the required parameters inplanes and planes; however, it gives me an error.
Working Directory: cgnl-network.pytorch/model
Please find the minimal reproducing code:
from resnet import SpatialCGNL
SpatialCGNL(20, 10)
Error Traceback:
File "/cgnl-network.pytorch/model/resnet.py", line 138, in __init__
self.z = nn.Conv2d(planes, inplanes, kernel_size=1, stride=1,
File "python3.8/site-packages/torch/nn/modules/conv.py", line 340, in __init__
super(Conv2d, self).__init__(
File "python3.8/site-packages/torch/nn/modules/conv.py", line 24, in __init__
if in_channels % groups != 0:
TypeError: unsupported operand type(s) for %: 'int' and 'NoneType'
I believe this is because the parameter groups in resnet.SpatialCGNL is set to None.
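The diagnosis can be checked without the repository by calling nn.Conv2d directly, as in the sketch below. The value groups=2 is a hypothetical choice that divides both channel counts; the value the authors intend is not stated in this thread. Passing an integer groups to SpatialCGNL would presumably avoid the error in the same way, but that is an assumption about the constructor, not something confirmed here.

```python
import torch.nn as nn

inplanes, planes = 20, 10

# Reproduce the failure with plain nn.Conv2d: None is not a valid group count.
try:
    nn.Conv2d(planes, inplanes, kernel_size=1, stride=1, groups=None)
    failed = False
except TypeError as exc:
    failed = True
    print("TypeError:", exc)

# With an integer that divides both channel counts, the layer builds.
groups = 2  # hypothetical value, chosen only so that 10 % 2 == 0 and 20 % 2 == 0
conv = nn.Conv2d(planes, inplanes, kernel_size=1, stride=1,
                 groups=groups, bias=False)
print(conv.weight.shape)  # torch.Size([20, 5, 1, 1])
```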
Thanks for the code. Can it be used for video classification?
Hello, I have a question about this part of the SpatialCGNL code:
t = t.view(b, 1, c * h * w)
p = p.view(b, 1, c * h * w)
g = g.view(b, c * h * w, 1)
att = torch.bmm(p, g)
x = torch.bmm(att, t)
x = x.view(b, c, h, w)
The att computed here is a single value, which is then multiplied by t. What is the meaning of this?
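One way to read the snippet above is through associativity of matrix multiplication: scaling t by the scalar p·g gives the same result as building the full pairwise matrix from t and p and then applying it to g. The sketch below checks this numerically with made-up sizes and random tensors; it illustrates the algebra only and does not claim to be the authors' explanation.

```python
import torch

torch.manual_seed(0)
b, c, h, w = 2, 3, 2, 2   # hypothetical sizes
n = c * h * w

# Random stand-ins for the theta (t), phi (p) and g feature maps.
t = torch.randn(b, c, h, w, dtype=torch.float64)
p = torch.randn(b, c, h, w, dtype=torch.float64)
g = torch.randn(b, c, h, w, dtype=torch.float64)

t_row = t.view(b, 1, n)
p_row = p.view(b, 1, n)
g_col = g.view(b, n, 1)

# Compact order, as in the quoted code: scalar first, then scale t.
att = torch.bmm(p_row, g_col)        # (b, 1, 1)
x_compact = torch.bmm(att, t_row)    # (b, 1, n)

# Explicit order: full n x n pairwise matrix first, then apply it to g.
pairwise = torch.bmm(t_row.transpose(1, 2), p_row)   # (b, n, n)
x_full = torch.bmm(pairwise, g_col)                  # (b, n, 1)

same = torch.allclose(x_compact.view(b, n), x_full.view(b, n))
print(same)
```

The compact order never materializes the n x n matrix, which is why att collapses to one value per sample.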
How can I generate the heat map visualizations for video frames presented in Figure 6 of the paper?
Thank you for your support.
Hi, thanks for the interesting paper :)
I have some questions.
The paper does not say whether the reported results are the mean, the median, or the best run.
If they are means, how many runs were averaged?
And are the results on GitHub included in that mean?
Thanks for presenting such interesting work.
I wonder if you use β = exp(−γ(∥θ∥^2 +∥φ∥^2)) to approximate β = exp(−γ(∥θi∥^2 +∥φj∥^2)) in Eq. (10)?
In which block did you add the nln-block? There are some magic numbers in the function _make_layer:
for i in range(1, blocks):
    if (i == 5 and blocks == 6) or \
       (i == 22 and blocks == 23) or \
       (i == 35 and blocks == 36):...
And did you add the nln-block only to layer3?
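The numbers 6, 23 and 36 match the layer3 block counts of the standard ResNet-50, ResNet-101 and ResNet-152 configurations. The sketch below evaluates the quoted condition over those standard configurations to show where it is true. The helper nl_positions is hypothetical and written only for this illustration; it says nothing about what the repository does at that position or how many blocks it inserts there.

```python
# Standard per-stage block counts for ResNet-50/101/152.
RESNET_LAYERS = {
    "resnet50": [3, 4, 6, 3],
    "resnet101": [3, 4, 23, 3],
    "resnet152": [3, 8, 36, 3],
}


def nl_positions(layers):
    """Return (stage, block_index) pairs where the quoted condition is true."""
    hits = []
    for stage, blocks in enumerate(layers, start=1):
        # Block 0 of each stage is assumed to be built before this loop.
        for i in range(1, blocks):
            if (i == 5 and blocks == 6) or \
               (i == 22 and blocks == 23) or \
               (i == 35 and blocks == 36):
                hits.append((stage, i))
    return hits


for name, layers in RESNET_LAYERS.items():
    print(name, nl_positions(layers))
```

For all three configurations the condition is true only in stage 3 (layer3), at the last block index of that stage.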
Thanks for the great work. I followed the training strategy in the paper to train an I3DResNet50 on UCF101, starting from the ImageNet-pretrained model. I sample 64 consecutive frames and drop frames evenly to form the training input, and I sample 30x32 frames as the testing input. The I3DResNet is converted from the C2D model described in the Non-local network paper.
However, I can only reach about 70% accuracy. Could you provide the script for the video classification task or give some suggestions? Thank you.