hyz-xmaster / swa_object_detection
SWA Object Detection
License: Apache License 2.0
Hi, thanks for your nice work.
I am wondering how SWA performs with other optimizers, such as Adam or AdamW.
Does it achieve a consistent performance gain?
If the original network was trained with Adam or AdamW, can SWA (with SGD) still improve its performance?
Thanks very much.
Hi, @hyz-xmaster. Thanks for releasing the code.
I ran into a problem when using your utility get_swa_model.py to get the averaged model. For example, to average the checkpoints of epochs 13-24, I would intuitively pass starting=13 and ending=24 as arguments. However, the code actually gave me the average of epochs 13-23, because of this line:
swa_object_detection/swa/get_swa_model.py
Line 28 in 2feb867
I think it would be better to use ending_id + 1 instead of ending_id, for easier understanding.
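The off-by-one described above comes from Python's `range()`, which excludes its stop value. The sketch below is only an illustration of an inclusive range, not the repository's code: the helper `checkpoint_paths`, the `epoch_` prefix default, and the `work_dirs/example` directory are hypothetical names chosen for this example.

```python
from pathlib import Path


def checkpoint_paths(work_dir, starting_id, ending_id, prefix="epoch_"):
    """Return checkpoint paths for epochs starting_id..ending_id, both inclusive.

    Hypothetical helper for illustration; not taken from get_swa_model.py.
    """
    # range() excludes its stop value, so add 1 to include ending_id itself.
    return [Path(work_dir) / f"{prefix}{i}.pth"
            for i in range(starting_id, ending_id + 1)]


# With an exclusive stop, epochs 13..23 are selected (11 checkpoints).
exclusive = list(range(13, 24))

# With the inclusive helper, epochs 13..24 are selected (12 checkpoints).
inclusive = checkpoint_paths("work_dirs/example", 13, 24)

print(len(exclusive), len(inclusive))
print(inclusive[-1].name)
```

With an inclusive end, passing 13 and 24 selects exactly the checkpoints a user would expect from the argument names.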
Nice work making SWA work in object detection!
I have one question about the comparison at the same number of epochs.
The results look like Faster R-CNN R50 1x plus 1x of extra SWA training gets the same result as Faster R-CNN R50 2x?
I think there may be a problem.
I can't find the circycle loss
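An error is raised during the validation stage when training on the VOC dataset.
How do I apply this method to my own training? Is it just training for 12 extra epochs?
I just wonder whether it would be better to have some iterations during which the learning rate goes up.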
Describe the feature
Such as mAP.5, mAP.75, etc.
Traceback (most recent call last):
  File "tools/train.py", line 188, in <module>
    main()
  File "tools/train.py", line 184, in main
    meta=meta)
  File "/home/hxz/anaconda3/envs/mmdnew/lib/python3.7/site-packages/mmdet-2.12.0-py3.7.egg/mmdet/apis/train.py", line 175, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/hxz/anaconda3/envs/mmdnew/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/hxz/anaconda3/envs/mmdnew/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/home/hxz/anaconda3/envs/mmdnew/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/hxz/anaconda3/envs/mmdnew/lib/python3.7/site-packages/mmdet-2.12.0-py3.7.egg/mmdet/core/evaluation/eval_hooks.py", line 149, in after_train_epoch
    self.save_best_checkpoint(runner, key_score)
  File "/home/hxz/anaconda3/envs/mmdnew/lib/python3.7/site-packages/mmdet-2.12.0-py3.7.egg/mmdet/core/evaluation/eval_hooks.py", line 166, in save_best_checkpoint
    last_ckpt = runner.meta['hook_msgs']['last_ckpt']
KeyError: 'last_ckpt'
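The last frame of this traceback fails on a plain dictionary lookup. The sketch below uses an ordinary dict as a stand-in for `runner.meta` (an assumption for illustration; it is not mmdet code) to show how the same `KeyError` arises when the `last_ckpt` entry has not been written yet, and how a defensive lookup behaves instead. Why the entry is missing in the reported run is not established here.

```python
# Stand-in for runner.meta (hypothetical): no checkpoint path has been
# recorded under 'hook_msgs' yet.
meta = {"hook_msgs": {}}

missing_key = None
try:
    # Same style of lookup as the failing line in the traceback.
    last_ckpt = meta["hook_msgs"]["last_ckpt"]
except KeyError as err:
    missing_key = err.args[0]

# Defensive variant: .get() returns None instead of raising when the key
# is absent, so the caller can decide what to do.
last_ckpt = meta["hook_msgs"].get("last_ckpt")

print(missing_key, last_ckpt)
```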
Hi,
I am really glad that the save-the-best-model feature has been updated. But I would like to ask: how can I save best_segm_mAP.pth instead of best_bbox_mAP.pth?
Thanks :)
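One possible approach, sketched as a config fragment: upstream mmcv documents a `save_best` option on its evaluation hook that accepts a metric key such as `bbox_mAP` or `segm_mAP`. Whether the mmdet/mmcv versions bundled with this repository accept the same key is an assumption here, and the `interval` and `metric` values are illustrative only.

```python
# Hypothetical config fragment; assumes the installed evaluation hook
# supports `save_best` with a metric key, as upstream mmcv documents.
evaluation = dict(
    interval=1,                 # evaluate after every epoch (illustrative)
    metric=["bbox", "segm"],    # compute both box and mask metrics
    save_best="segm_mAP",       # track the best checkpoint by mask mAP
)
```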
I think I should train the original model first, and then train the extra checkpoints.
But the scripts in README.md, shown below, do not include one for training the original model.
I think the first script only trains the extra 12/24 epochs of checkpoints, and the second script averages these 12/24 checkpoints into the final detection model.
(1) ./tools/dist_train.sh configs/swa/swa_mask_rcnn_r101_fpn_2x_coco.py
(2) ./swa/get_swa_model.py work_dirs/swa_mask_rcnn_r101_fpn_2x_coco 1 12 --save_dir work_dirs/swa_mask_rcnn
Maybe I can use the following script to train the original model:
./tools/dist_train.sh configs/mask_rcnn/mask_rcnn_r101_fpn_2x_coco.py work_dir=mymodel
Then I can use the following script to train the extra checkpoints, setting work_dir to the path where the original model is saved:
./tools/dist_train.sh configs/swa/swa_mask_rcnn_r101_fpn_2x_coco.py work_dir=mymodel
Am I right?
Thanks very much!
I have downloaded the master branch of swa_object_detection, and I also have mmdetection on my PC.
I tried to train Mask R-CNN R50-FPN with the normal mmdetection code, but this error always occurs. I would appreciate any help!
Hello sir! I appreciate your wonderful work, which helps a lot. But there is a question I can't figure out.
When I run Two-phase mode, after the 12 traditional checkpoints, I get 2 checkpoints after each epoch. What is the difference between them, and which one should I use?
epoch_1.pth to epoch_12.pth are the traditional checkpoints.
swa_epoch_xx.pth and swa_model_xx.pth are the checkpoints saved during SWA training.
Do I still need to use get_swa_model() to average swa_model_1.pth through swa_model_12.pth?
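Whether a second averaging pass is needed depends on what each file stores, which this thread does not confirm. As background only: if a file such as swa_model_k.pth already held a running average of the first k SWA checkpoints (an assumption, not a statement about this repository), then the last running average would equal the plain mean of all checkpoints. The sketch below checks that identity on toy scalar "weights"; `running_average` is a hypothetical helper.

```python
def running_average(prev_avg, new_value, n_averaged):
    """Fold one more value into an average of n_averaged earlier values."""
    return (prev_avg * n_averaged + new_value) / (n_averaged + 1)


# Toy data: one scalar stands in for the model weights after each epoch.
weights = [0.9, 1.1, 1.0, 1.2]

avg = 0.0
for n, w in enumerate(weights):
    avg = running_average(avg, w, n)   # updated once per epoch

# Averaging all checkpoints in one pass at the end.
offline = sum(weights) / len(weights)

print(avg, offline)
```

Under that assumption, averaging the running averages again would weight early epochs more heavily than late ones, which is a different quantity from the plain mean.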
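Hello, I looked at the code. In the config files, for example the YOLOv3 one, total_epoch and cyclic_times are both 24. Does this mean that cosine annealing is performed once within every epoch, stepping per iteration, and that the 24 epochs are then averaged?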
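The schedule described in this question can be sketched as a cosine curve that restarts at the beginning of every epoch. This is a minimal illustration of per-iteration cyclic cosine annealing, not the repository's scheduler: the function `cyclic_cosine_lr`, the 100 iterations per epoch, and the learning-rate bounds 0.02 and 0.0002 are all hypothetical.

```python
import math


def cyclic_cosine_lr(iteration, iters_per_cycle, lr_max, lr_min):
    """Cosine-annealed learning rate that restarts every iters_per_cycle steps."""
    # Position inside the current cycle, in [0, 1).
    pos = (iteration % iters_per_cycle) / iters_per_cycle
    # Starts at lr_max when pos == 0 and decays toward lr_min as pos -> 1.
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * pos))


iters_per_epoch = 100          # hypothetical: one cycle per epoch
lr_max, lr_min = 0.02, 0.0002  # hypothetical bounds

# Two epochs of learning rates, one value per iteration.
lrs = [cyclic_cosine_lr(i, iters_per_epoch, lr_max, lr_min)
       for i in range(2 * iters_per_epoch)]

print(lrs[0], lrs[iters_per_epoch - 1], lrs[iters_per_epoch])
```

In this sketch the learning rate is lowest at the end of each epoch, which is where a checkpoint would be taken for averaging, and it jumps back to `lr_max` at the first iteration of the next epoch.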
@hyz-xmaster Hi, thank you for your great work. I have read your paper, and it says that the batch normalization layers in the backbones are frozen. My question is: did you freeze only the BN layers in the backbone, or did you freeze all the layers in the backbone?
Environment
Python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) [GCC 9.3.0]
CUDA available: True
GPU 0,1: Tesla V100-SXM2-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.168
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.6.0+cu101
PyTorch compiling details: PyTorch built with:
TorchVision: 0.7.0+cu101
OpenCV: 4.5.3
MMCV: 1.3.8
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.1
MMDetection: 2.14.0+7b5a58b
Error traceback
'SWAHook' object has no attribute 'save_checkpoint'
Hello, sir. This is nice work! I want to use this repo to improve my detection performance, but I have some questions about using it.
I trained my detection model in mmdetection and got epoch_12.pth as usual, then wrote the config following the Only-SWA mode.
After training, there are 24 models in my work_dir:
I used swa/get_swa_model.py to average swa_epoch_1.pth through swa_epoch_12.pth into swa_1-12.pth and tested this model, but the final score decreased a little. Is there a problem with how I am using SWA? I am very confused and would very much appreciate your reply!