
hkzhang95 / dynamicrcnn


Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training, ECCV 2020

Home Page: https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123600256.pdf

License: MIT License

Python 76.14% C++ 4.64% Cuda 19.22%

dynamicrcnn's Introduction

Hi there!

"Stay Hungry, Stay Foolish." -- Stewart Brand

dynamicrcnn's People

Contributors

hkzhang95


dynamicrcnn's Issues

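Synchronization between multiple GPUs

In the code, rcnn_iou_new and rcnn_error_new are computed only from the samples on a single GPU, which I feel may make the statistics somewhat unstable. Has the author considered synchronizing across GPUs? That is, rcnn_iou_new and rcnn_error_new would be updated from the samples of the whole batch rather than from only the IMS_PER_GPU samples.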
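The difference between per-GPU and whole-batch statistics can be illustrated with a small pure-Python simulation. This is only a sketch under stated assumptions: no real GPU communication happens, the per-GPU error lists are random stand-ins, and the helper names (`kth_smallest`, `per_gpu_statistic`, `synced_statistic`) are hypothetical and do not come from the repository. In a real multi-GPU job the pooling step would require a collective operation such as an all-gather.

```python
import random


def kth_smallest(values, k):
    """Return the k-th smallest value (1-indexed), clamping k to the list size."""
    k = max(1, min(k, len(values)))
    return sorted(values)[k - 1]


def per_gpu_statistic(errors_per_gpu, ke, ims_per_gpu):
    """Each GPU computes its statistic from its local samples only."""
    return [kth_smallest(errors, ke * ims_per_gpu) for errors in errors_per_gpu]


def synced_statistic(errors_per_gpu, ke, ims_per_gpu):
    """Hypothetical synchronized variant: pool the samples of all GPUs first
    (what an all-gather would provide) and scale k by the number of GPUs."""
    pooled = [e for errors in errors_per_gpu for e in errors]
    return kth_smallest(pooled, ke * ims_per_gpu * len(errors_per_gpu))


random.seed(0)
# Stand-in regression errors for 4 simulated GPUs, 256 proposals each.
errors_per_gpu = [[random.random() for _ in range(256)] for _ in range(4)]

local = per_gpu_statistic(errors_per_gpu, ke=10, ims_per_gpu=2)
shared = synced_statistic(errors_per_gpu, ke=10, ims_per_gpu=2)
print("per-GPU values:", local)
print("pooled value:  ", shared)
```

The pooled value always lies between the smallest and largest per-GPU value, so synchronizing would remove the spread between GPUs at the cost of one extra communication step per iteration.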

Some doubts about the details

Thanks for the work on Dynamic R-CNN. I have read the paper "Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training". For the Dynamic SmoothL1 Loss, the source code that calculates the regression errors is:

    raw_regression_targets = cat(
        [proposal.get_field("regression_targets") for proposal in
         raw_proposals], dim=0
    ).abs()[:, :2].mean(dim=1)

    rcnn_error_new = torch.kthvalue(raw_regression_targets.cpu(), min(
        cfg.MODEL.DYNAMIC_RCNN.KE * cfg.SOLVER.IMS_PER_GPU,
        raw_regression_targets.size(0)))[0].item()

But I have some doubts:
First, why does it use the mean of the x and y offsets only, instead of the mean of the x, y, w, and h offsets, to calculate the regression errors? Moreover, the paper says the regression error is used to update beta, but the source code uses only the ground-truth offsets (the regression targets), which seems unreasonable.

Second, the hyper-parameter MODEL.DYNAMIC_RCNN.KE is set to 8, 10, or 15. Is there a reason for these values? Are they related to the number of anchors or to some other parameter?
@hkzhang95
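To make explicit what the quoted snippet computes, here is a minimal pure-Python sketch of the same arithmetic: take the absolute value of each regression target, average only the first two components (dx, dy), and pick the k-th smallest value with k = KE * IMS_PER_GPU, capped at the number of proposals. The function name `regression_error_statistic` and the sample targets are hypothetical; the sketch does not answer why w and h are left out.

```python
def regression_error_statistic(regression_targets, ke, ims_per_gpu):
    """Mirror of the quoted snippet in plain Python.

    regression_targets: list of (dx, dy, dw, dh) tuples, one per proposal.
    Returns the k-th smallest per-proposal mean of |dx| and |dy|.
    """
    # .abs()[:, :2].mean(dim=1): mean of |dx| and |dy| for each proposal.
    per_proposal = [(abs(dx) + abs(dy)) / 2.0
                    for dx, dy, _dw, _dh in regression_targets]
    # min(KE * IMS_PER_GPU, number of proposals), as in the snippet.
    k = min(ke * ims_per_gpu, len(per_proposal))
    # torch.kthvalue returns the k-th smallest element (1-indexed).
    return sorted(per_proposal)[k - 1]


# Made-up targets for four proposals.
targets = [
    (0.5, -0.25, 0.1, 0.0),
    (0.125, 0.125, 0.5, 0.5),
    (-0.75, 0.25, 0.0, 0.0),
    (0.25, -0.25, 0.3, 0.2),
]
print(regression_error_statistic(targets, ke=1, ims_per_gpu=2))
print(regression_error_statistic(targets, ke=10, ims_per_gpu=2))
```

With these inputs the per-proposal values are 0.375, 0.125, 0.5, and 0.25, so a larger KE selects a larger error and therefore a larger beta candidate.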

mmdetection2.1 Error

_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
model = dict(
    roi_head=dict(
        type='DynamicRoIHead',
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=1,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
train_cfg = dict(
    rpn_proposal=dict(nms_thr=0.85),
    rcnn=dict(
        dynamic_rcnn=dict(
            iou_topk=75,
            beta_topk=10,
            update_iter_interval=100,
            initial_iou=0.4,
            initial_beta=1.0)))
test_cfg = dict(rpn=dict(nms_thr=0.85))

Traceback (most recent call last):
  File "train.py", line 153, in <module>
    main()
  File "train.py", line 149, in main
    meta=meta)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/apis/train.py", line 128, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/esec/.local/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 122, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/esec/.local/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 32, in train
    **kwargs)
  File "/home/esec/.local/lib/python3.6/site-packages/mmcv/parallel/data_parallel.py", line 31, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/detectors/base.py", line 236, in train_step
    losses = self(**data)
  File "/home/esec/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/core/fp16/decorators.py", line 51, in new_func
    return old_func(*args, **kwargs)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/detectors/base.py", line 171, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/detectors/two_stage.py", line 164, in forward_train
    **kwargs)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/roi_heads/dynamic_roi_head.py", line 89, in forward_train
    img_metas)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/roi_heads/dynamic_roi_head.py", line 128, in _bbox_forward_train
    *bbox_targets)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/core/fp16/decorators.py", line 131, in new_func
    return old_func(*args, **kwargs)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/roi_heads/bbox_heads/bbox_head.py", line 182, in loss
    reduction_override=reduction_override)
  File "/home/esec/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/losses/smooth_l1_loss.py", line 93, in forward
    **kwargs)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/losses/utils.py", line 94, in wrapper
    loss = loss_func(pred, target, **kwargs)
  File "/home/esec/hewei/kaggle/mmdetection/mmdet/models/losses/smooth_l1_loss.py", line 21, in smooth_l1_loss
    assert beta > 0
AssertionError
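The traceback only shows that `assert beta > 0` failed inside the smooth L1 loss; it does not say why beta was non-positive. One plausible cause, stated here as an assumption rather than a confirmed diagnosis, is that the dynamically updated beta reached 0. The scalar sketch below shows why the loss needs a positive beta (it divides by beta) and one hypothetical guard, `safe_beta`, that falls back to a previous value. Neither function is the mmdetection implementation.

```python
def smooth_l1(pred, target, beta=1.0):
    """Scalar smooth L1 loss: quadratic below beta, linear above it."""
    assert beta > 0  # beta is a divisor in the quadratic branch
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def safe_beta(new_beta, fallback, eps=1e-15):
    """Hypothetical guard: keep the fallback when the new beta is not
    safely positive, so the assertion above cannot fire."""
    return new_beta if new_beta > eps else fallback


print(smooth_l1(0.5, 0.0, beta=1.0))    # quadratic branch
print(smooth_l1(2.0, 0.0, beta=1.0))    # linear branch
print(safe_beta(0.0, fallback=1.0))     # degenerate update is rejected
print(safe_beta(0.2, fallback=1.0))     # normal update is accepted
```

A guard like this would be applied where beta is updated, before the new value is handed to the loss.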

Confusion with the training setting

I notice that when you use multi-scale training you extend the training time by 1.5x. Why not name these configs directly as 3x configs? Also, since you extend the training time, I think it would be better to clarify this in the paper for a fair comparison.

loss is nan

I am training on my own data, and the loss becomes NaN after only two minutes of training. What could be causing this?

Questions about Fig. 5

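Hello, I implemented the Dynamic R-CNN training strategy on top of mmdetection. During training I tracked how the IoU threshold and beta were updated and observed the following:
1. The IoU threshold rises to about 0.6 within the first few iterations, and in later iterations it keeps fluctuating around 0.58.
2. After a few iterations, beta drops quickly from 1 to a value below 0.1, and for the rest of training it fluctuates slightly around that value.
PS: The experiments above used both DLA and DSL, with hyperparameters matching the paper.

In my experiments, the behavior of the IoU threshold and beta differs considerably from Fig. 5 in the paper. Could you explain how exactly the IoU and beta values in the paper were computed, along with the relevant implementation details?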
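For readers comparing their own curves, the update scheme under discussion can be sketched as follows. This is a sketch under assumptions, not the author's implementation: it assumes that each iteration records the `iou_topk`-th largest IoU and the `beta_topk`-th smallest regression error, and that every `update_iter_interval` iterations the IoU threshold becomes the mean of the recorded IoUs and beta the median of the recorded errors, clamped by `initial_iou` and `initial_beta`. The parameter names follow the mmdetection config quoted in the earlier issue; the class `DynamicStats` is hypothetical.

```python
from statistics import mean, median


class DynamicStats:
    """Tracks per-iteration statistics and updates the IoU threshold and beta."""

    def __init__(self, initial_iou=0.4, initial_beta=1.0, update_iter_interval=100):
        self.initial_iou = initial_iou
        self.initial_beta = initial_beta
        self.interval = update_iter_interval
        self.iou_thr = initial_iou
        self.beta = initial_beta
        self.iou_history = []
        self.beta_history = []

    def record(self, ious, errors, iou_topk, beta_topk):
        """Record one iteration, then update once the interval is reached."""
        k_iou = min(iou_topk, len(ious))
        self.iou_history.append(sorted(ious, reverse=True)[k_iou - 1])
        k_beta = min(beta_topk, len(errors))
        self.beta_history.append(sorted(errors)[k_beta - 1])
        if len(self.iou_history) >= self.interval:
            self.update()

    def update(self):
        # Assumed clamps: the threshold never drops below its initial value
        # and beta never rises above its initial value.
        self.iou_thr = max(self.initial_iou, mean(self.iou_history))
        self.beta = min(self.initial_beta, median(self.beta_history))
        self.iou_history.clear()
        self.beta_history.clear()


stats = DynamicStats(update_iter_interval=2)
stats.record(ious=[0.9, 0.7, 0.5, 0.3], errors=[0.5, 0.25, 0.125],
             iou_topk=2, beta_topk=2)
stats.record(ious=[0.8, 0.5, 0.2], errors=[0.75, 0.5, 0.25],
             iou_topk=2, beta_topk=2)
print(stats.iou_thr, stats.beta)
```

Under this scheme both values depend on the top-k settings relative to the number of proposals per iteration, so a mismatch in batch size or proposal count is one thing worth checking when curves differ from the paper.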

Some questions about IoU and beta

Hello! I am interested in your paper. Dynamic R-CNN obtains the IoU threshold and beta from the current detection results. Have you tried training with the IoU threshold and beta fixed directly to the values that Dynamic R-CNN computes, such as 0.6 and 0.2?
