Comments (9)
@michaelku1 I really appreciate your help. I've also just learned about gradient accumulation.
from adaptive_teacher.
Follow Section 4.2 of the paper and use exactly the same parameters (batch size 16), and you should get the results reported in the paper. In fact, setting the unsupervised weight to 0.5 or 0.25 can give even better results than those we reported, which means the model's performance can be improved further by sweeping more parameters.
A batch size of 16 does not fit on a single GPU. Does that mean I can't reproduce the AP50 results using a single GPU?
You may try gradient accumulation. I trained the model with gradient accumulation and got close results.
How can I do that? Could you elaborate, please? Should I update the config file?
Thank you, I really appreciate your detailed explanation. I took a look at the PyTorch examples, but since I'm not familiar with this, it isn't easy for me to make the update. I'd appreciate it if you could provide the updated trainer file, or the code with the specific lines where it should be replaced.
I'm not 100% sure, but this implementation allowed me to train well on a single GPU with an effective batch size of 16.
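For readers looking for the general pattern, here is a minimal sketch of gradient accumulation in plain PyTorch. It is not the adaptive_teacher trainer code: the toy `nn.Linear` model, the random data, and the names `micro_batch_size` and `accum_steps` are all assumptions made up for illustration. It only shows that four micro-batches of 4, with each loss divided by the number of accumulation steps, give the same gradient as one batch of 16 when the loss is a mean over samples.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins (assumptions, not the adaptive_teacher model or data).
model = nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()  # averages over the samples in the batch

inputs = torch.randn(16, 8)
targets = torch.randn(16, 1)

micro_batch_size = 4
accum_steps = 4  # 4 micro-batches x 4 samples = effective batch size 16

# Reference: gradient from a single full batch of 16.
model.zero_grad()
loss_fn(model(inputs), targets).backward()
full_grad = model.weight.grad.clone()

# Gradient accumulation: backward() adds into .grad on every call,
# so the optimizer steps only after all micro-batches have been seen.
optimizer.zero_grad()
for step in range(accum_steps):
    start = step * micro_batch_size
    x = inputs[start:start + micro_batch_size]
    y = targets[start:start + micro_batch_size]
    # Divide by accum_steps so the summed gradients equal the full-batch mean.
    loss = loss_fn(model(x), y) / accum_steps
    loss.backward()
accum_grad = model.weight.grad.clone()

optimizer.step()       # one parameter update per effective batch
optimizer.zero_grad()  # reset before the next accumulation cycle

grads_match = torch.allclose(full_grad, accum_grad, atol=1e-6)
```

In a real trainer the same idea would probably mean calling the optimizer step only every `accum_steps` iterations and scaling the total loss accordingly, but where exactly that goes in the adaptive_teacher trainer is not shown in this thread.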
However, even though gradients are accumulated, batchnorm statistics are not, and this may lead to a slight discrepancy in performance between a model trained using gradient accumulation and one trained with the full batch size.
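The batchnorm point can be checked with a small, self-contained sketch. This again uses a toy setup (a bare `nn.BatchNorm1d` layer and random data, both assumptions for illustration, not the detector from this repository). In training mode the layer normalizes each forward pass with the statistics of whatever batch it receives, so 16 samples passed at once are normalized differently from the same samples passed as four chunks of 4, and the running statistics end up different too.

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.randn(16, 3)

# Two identical layers: one sees the full batch, one sees micro-batches.
bn_full = nn.BatchNorm1d(3)
bn_micro = nn.BatchNorm1d(3)
bn_full.train()
bn_micro.train()

# Full batch: normalized with the mean/variance of all 16 samples.
out_full = bn_full(x)

# Micro-batches: each chunk of 4 is normalized with its own mean/variance.
out_micro = torch.cat([bn_micro(chunk) for chunk in x.split(4)])

outputs_differ = not torch.allclose(out_full, out_micro, atol=1e-4)
running_stats_differ = not torch.allclose(
    bn_full.running_mean, bn_micro.running_mean, atol=1e-6
)
```

Both flags come out `True` here, which is why gradient accumulation reproduces the gradients of a larger batch but not the batchnorm behaviour of one.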