kkahatapitiya / x3d-multigrid
PyTorch implementation of X3D models with Multigrid training.
License: MIT License
Hi, I appreciate your beautiful work! @kkahatapitiya Could you tell me how you measured the reported 71.48% Top-1 accuracy (3-view) on Kinetics-400? Have you open-sourced your video-level accuracy test code? When I test the pretrained model, I get lower performance than reported (my approach computes video-level accuracy by averaging over all clips of a test video).
Hi,
When I run train_x3d_charades.py, I get the following error:
raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
I'm using the same dataset in the code (Charades_v1_rgb). Do you have any suggestions?
Thank you.
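For what it's worth, num_samples=0 almost always means the dataloader found zero videos, usually because the dataset root path is wrong or the split names don't match the folders on disk. A quick sanity check before training (the one-folder-of-JPEG-frames-per-video layout below is an assumption about the standard Charades_v1_rgb release):

```python
import os

def count_videos(root):
    """Count video folders that actually contain extracted .jpg frames."""
    if not os.path.isdir(root):
        return 0
    return sum(
        1 for d in os.listdir(root)
        if os.path.isdir(os.path.join(root, d))
        and any(f.endswith('.jpg') for f in os.listdir(os.path.join(root, d)))
    )

n = count_videos('Charades_v1_rgb')
print(f'found {n} videos')  # if this prints 0, fix the path before training
```

If this prints 0 for the path you pass to the dataset class, the sampler will see an empty dataset and raise exactly the num_samples error above.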
Hi, thanks a lot for sharing your implementation! I want to use your pretrained model for validation. If I only have one GPU, how should I modify the hyperparameters, especially base_bn_splits used in generate_model? I would also like to know whether the model named "x3d_multigrid_kinetics_fb_pretrained.pt" was converted from the model provided by Facebook. Looking forward to your reply.
How to generate the following files?
KINETICS_TRAIN_ANNO
KINETICS_VAL_ANNO
KINETICS_CLASS_LABELS
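The exact annotation format the repo expects isn't documented in this thread; a common convention (and an assumption of this sketch) is a CSV with one line per video, `<video_path>,<label_id>`, plus a text file listing class names. The directory layout assumed below is the usual Kinetics one, `<root>/<class_name>/<video_file>`:

```python
import csv
import os

def write_kinetics_anno(root, anno_csv, labels_txt):
    """Generate annotation files from a <root>/<class>/<video> layout.

    This is a hypothetical helper, not the repo's actual script: it writes
    class names (one per line) to labels_txt and <path>,<label_id> rows
    to anno_csv.
    """
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    with open(labels_txt, 'w') as f:            # e.g. KINETICS_CLASS_LABELS
        f.write('\n'.join(classes) + '\n')
    with open(anno_csv, 'w', newline='') as f:  # e.g. KINETICS_TRAIN_ANNO
        writer = csv.writer(f)
        for label_id, cls in enumerate(classes):
            for vid in sorted(os.listdir(os.path.join(root, cls))):
                writer.writerow([os.path.join(root, cls, vid), label_id])
```

Run it once on the train directory and once on the val directory to get both annotation files; check the repo's dataset class to confirm the expected column order before using this.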
I am planning to take your implementation in x3d.py and use it in my own training environment to train X3D with a constant batch size. I don't want to use any multigrid features; I will be using my own dataloaders, datasets, and so on.
In the model instantiation snippet below, I am unsure about one parameter:
x3d = resnet_x3d.generate_model(x3d_version=X3D_VERSION, n_classes=400, n_input_channels=3,
dropout=0.5, base_bn_splits=BASE_BS_PER_GPU//CONST_BN_SIZE)
What is base_bn_splits? If I use a single GPU and a constant batch size, what value should I give this parameter? Thanks a lot! @kkahatapitiya
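For reference, sub-batch normalization splits each batch of n samples into num_splits groups and normalizes each group independently, so n must be divisible by the split count. Based on that, and on the BASE_BS_PER_GPU//CONST_BN_SIZE expression in the snippet above, base_bn_splits=1 should reduce sub-batch BN to ordinary batch norm for single-GPU, constant-batch training — treat this as an informed guess, not a confirmed answer. A sketch of the reshape arithmetic (mirroring the x.view(n // num_splits, c * num_splits, t, h, w) line in x3d.py):

```python
def sub_bn_shape(n, c, t, h, w, num_splits):
    """Shape produced by the sub-batch BN view in x3d.py (sketch)."""
    if n % num_splits != 0:
        raise ValueError(
            f'batch size {n} not divisible by num_splits={num_splits}'
        )
    return (n // num_splits, c * num_splits, t, h, w)

# With num_splits=1 the tensor shape is unchanged, i.e. plain batch norm:
print(sub_bn_shape(8, 24, 16, 56, 56, num_splits=1))  # (8, 24, 16, 56, 56)
```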
Good day!
I'm having trouble finding where to specify the input clip length when defining an X3D model. Currently I'm aiming to change the number of input frames (the temporal duration parameter) to 20 for X3D-M training, so that the input clip (gamma_tau) is sampled at 10 FPS.
Please provide some insight on how that can be achieved.
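From the X3D paper's notation (not this repo's code specifically): a clip of `frames` input frames sampled at stride gamma_tau covers frames × gamma_tau raw frames, and the effective sampling rate is source FPS divided by gamma_tau. A sketch of the arithmetic, where the 30 FPS source rate is an assumption about the raw videos:

```python
def clip_params(frames, source_fps, target_fps):
    """Temporal stride and raw-frame span for a clip (paper notation)."""
    gamma_tau = source_fps // target_fps  # sample every gamma_tau-th frame
    span = frames * gamma_tau             # raw frames covered by one clip
    return gamma_tau, span

gt, span = clip_params(frames=20, source_fps=30, target_fps=10)
print(gt, span)  # 3 60: a 20-frame clip at stride 3 spans 60 raw frames (2 s)
```

So for 20 frames at an effective 10 FPS from 30 FPS video, gamma_tau would be 3; where those two numbers are set in the training script is a question for the author.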
Thanks for your clean implementation! @kkahatapitiya
I have two problems to consult you about:
I added this code to the end of x3d.py:
if __name__ == '__main__':
    import torch
    from torchsummary import summary

    net = generate_model('S').cuda()
    # print(net)
    inputs = torch.rand(8, 3, 10, 112, 112).cuda()
    output = net(inputs)
    print(output.shape)
    summary(net, input_size=(3, 10, 112, 112), batch_size=8, device='cuda')
The code runs successfully except for the summary call. The error report was:
File "x3d.py", line 382, in <module>
summary(net,input_size=(3,10,112,112),batch_size=8,device='cuda')
File "D:\software\program\Anaconda3\envs\pytorch1\lib\site-packages\torchsummary\torchsummary.py", line 72, in summary
model(*x)
File "D:\software\program\Anaconda3\envs\pytorch1\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "x3d.py", line 324, in forward
x = self.bn1(x)
File "D:\software\program\Anaconda3\envs\pytorch1\lib\site-packages\torch\nn\modules\module.py", line 1128, in _call_impl
result = forward_call(*input, **kwargs)
File "x3d.py", line 52, in forward
x = x.view(n // self.num_splits, c * self.num_splits, t, h, w)
RuntimeError: shape '[0, 192, 10, 56, 56]' is invalid for input of size 1505280
I found that the shape of x in forward was (2, 3, 10, 112, 112) rather than (8, 3, 10, 112, 112), and I don't know why.
Do you know why?
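If I recall torchsummary's behavior correctly, it always forwards a dummy batch of 2 regardless of the batch_size argument (that argument only changes the printed numbers), which would explain the (2, 3, 10, 112, 112) input. With sub-batch BN configured for more splits than there are samples, n // num_splits becomes 0 and the view fails. A reproduction of the failing arithmetic, assuming the default split count here is 8 (which matches the 192 = 24 × 8 channels in the error):

```python
# Reproduce the failing view from x3d.py line 52:
#   x.view(n // num_splits, c * num_splits, t, h, w)
n, c, t, h, w = 2, 24, 10, 56, 56  # torchsummary's dummy batch of 2
num_splits = 8                     # assumed split count at this layer

numel = n * c * t * h * w                             # 1505280, as in the error
target = (n // num_splits, c * num_splits, t, h, w)   # (0, 192, 10, 56, 56)
print(numel, target)
```

Both numbers match the error message exactly, so a batch of 2 hitting a split count of 8 is the likely culprit; calling the model with a batch of 8 works because 8 divides evenly.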
Hi @kkahatapitiya, thanks for your clear reproduction.
I have two questions after testing your code:
What is the specific performance on Kinetics-400? You said it achieves 62.62% Top-1 accuracy (3-view) on Kinetics-400 when trained for ~200k iterations from scratch, but I don't know which version of X3D got this result. How many epochs did you train to get it?
According to the figure in the original paper, X3D-M has 4.73 GFLOPs, but when I measured the X3D-M in this code I got 3.76 GFLOPs. Could you please explain this?
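One possible (unconfirmed) source of such a gap: conv FLOPs scale linearly with the number of input frames, so measuring with a shorter clip than the paper's 16-frame X3D-M setting shrinks the total proportionally. A quick scaling check, where the 13-frame alternative is purely hypothetical:

```python
# FLOPs scale linearly with clip length; 4.73 G is X3D-M's paper value
# at 16 frames x 224x224.
paper_gflops, paper_frames = 4.73, 16
for frames in (16, 13):
    print(frames, round(paper_gflops * frames / paper_frames, 2))
```

Other common causes are counting multiply-accumulates instead of FLOPs, or a FLOPs counter that skips some ops; comparing input shapes and counter settings would narrow it down.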
Hello, which configurations of X3D have you trained and included in the repo? (X3D-M, X3D-L, X3D-XL, ...)
Thank you a lot for sharing your implementation. It is really helpful for applying the X3D network to custom deep learning problems.
The original repo only provides Caffe2 pretrained models. How did you convert them to PyTorch format? (I also want to try other versions of X3D.)
Is it possible to train from scratch? And what dataset format do I have to provide?
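I don't know the author's exact conversion script, but the usual pattern for Caffe2 → PyTorch weight conversion is: load the pickled blob dict, rename each Caffe2 blob name to the matching PyTorch state-dict key, wrap the arrays as tensors, and save. A sketch of the key-remapping step (the substitution rules below are illustrative examples of typical Caffe2 naming, NOT the exact mapping used for the X3D checkpoint):

```python
import re

# Illustrative Caffe2 -> PyTorch key rename rules (hypothetical mapping).
RULES = [
    (r'_w$', '.weight'),               # conv1_w  -> conv1.weight
    (r'_b$', '.bias'),                 # conv1_b  -> conv1.bias
    (r'_bn_s$', '_bn.weight'),         # BN scale -> weight
    (r'_bn_rm$', '_bn.running_mean'),  # BN running mean
    (r'_bn_riv$', '_bn.running_var'),  # BN running (inverse) variance
]

def remap_key(caffe2_name):
    """Rename one Caffe2 blob name to a PyTorch-style state-dict key."""
    for pattern, repl in RULES:
        if re.search(pattern, caffe2_name):
            return re.sub(pattern, repl, caffe2_name)
    return caffe2_name

print(remap_key('conv1_w'))       # conv1.weight
print(remap_key('res2_0_bn_rm'))  # res2_0_bn.running_mean
```

In a real conversion you would then build `{remap_key(k): torch.from_numpy(v) for k, v in blobs.items()}` and `load_state_dict` it, fixing any remaining key mismatches by hand.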