
The implementation of “Gradient Harmonized Single-stage Detector” published on AAAI 2019.

License: MIT License

Python 97.91% Shell 2.09%

ghm_detection's Introduction

GHM_Detection

The implementation of Gradient Harmonized Single-stage Detector published on AAAI 2019 (Oral).

Updates

(May 24, 2019)

Make mmdetection a submodule to keep it up-to-date.

Installation

This project is based on mmdetection.

Requirements

Python 3.5+
PyTorch 1.0+ (Based on the current version of mmdetection)
CUDA 9.0+

Setup the Environment and Packages

i. Create a new environment

We recommend Anaconda as the package and environment manager. Here is an example:

conda create -n ghm
conda activate ghm

ii. Install PyTorch

Follow the official instructions to install PyTorch. Here is an example using conda:

conda install pytorch torchvision -c pytorch

iii. Install Cython

conda install cython 
# or "pip install cython"

Install GHM

i. Clone the repository

git clone --recursive https://github.com/libuyu/GHM_Detection.git

ii. Compile extensions

cd GHM_Detection/mmdetection

./compile.sh

iii. Setup mmdetection

pip install -e . 
# editable mode is convenient when debugging
# if your code in mmdetection is fixed, use "pip install ." directly

Prepare Data

It is recommended to symlink the datasets root to mmdetection/data.

ln -s $YOUR_DATA_ROOT data

The directories should be arranged like this:

GHM_Detection
├──	mmdetection
|	├── mmdet
|	├── tools
|	├── configs
|	├── data
|	│   ├── coco
|	│   │   ├── annotations
|	│   │   ├── train2017
|	│   │   ├── val2017
|	│   │   ├── test2017
|	│   ├── VOCdevkit
|	│   │   ├── VOC2007
|	│   │   ├── VOC2012

Running

Script

We provide training and testing scripts and configuration files for both GHM and the baseline (focal loss and smooth L1 loss) in the experiments directory. You need to specify the path of your own pre-trained model in the config files.

Configuration

The configuration parameters are mainly in the cfg_*.py files. The parameters you will most likely need to change are as follows:

work_dir: the directory for the current experiment
datatype: dataset name (coco, voc, etc.)
data_root: root directory of the dataset
model.pretrained: the path to the ImageNet pretrained backbone model
resume_from: path of the checkpoint file to resume from, if resuming
train_cfg.ghmc: params for GHM-C loss
- bins: number of unit regions
- momentum: moving average parameter \alpha
train_cfg.ghmr: params for GHM-R loss
- mu: the \mu for ASL1 loss
- bins, momentum: similar to ghmc
total_epochs, lr_config.step: set the learning rate decay strategy
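As a rough illustration, the GHM-related part of a cfg_*.py file might look like the fragment below. The key names follow the list above. The paths and numeric values are placeholders: bins=30 and 12 epochs are the values mentioned in the issues further down, and everything else is an assumption, so check the shipped config files before relying on any of it. The model dict (including model.pretrained) is omitted for brevity.

```python
# Illustrative config fragment; all paths and most numbers are placeholders.
work_dir = './work_dirs/retinanet_ghm_r50_fpn_1x'  # hypothetical output directory
datatype = 'coco'
data_root = 'data/coco/'
resume_from = None  # set to a checkpoint path to resume training

train_cfg = dict(
    ghmc=dict(
        bins=30,        # number of unit regions for GHM-C
        momentum=0.75,  # moving-average parameter alpha (assumed value)
    ),
    ghmr=dict(
        mu=0.02,        # mu of the ASL1 loss (assumed value)
        bins=10,        # assumed value
        momentum=0.7,   # assumed value
    ),
)

total_epochs = 12
lr_config = dict(step=[8, 11])  # epochs at which the learning rate decays (assumed)
```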

Loss Functions

The GHM-C and GHM-R loss functions are available in ghm_loss.py.
The code works with PyTorch 1.0.1 and later versions.
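The weighting idea behind GHM-C can be sketched without any dependencies. The block below is a plain-Python illustration of the momentum-free case, not the code in ghm_loss.py: the names `sigmoid`, `ghmc_weights` and `ghmc_loss` are hypothetical, and the real implementation operates on PyTorch tensors and also supports a mask and momentum.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def ghmc_weights(logits, targets, bins=10):
    """Return one GHM-C weight per sample (momentum-free variant)."""
    # Gradient norm of sigmoid cross-entropy w.r.t. the logit: g = |p - y|.
    g = [abs(sigmoid(x) - y) for x, y in zip(logits, targets)]
    tot = len(g)

    # Assign each sample to a unit region [i / bins, (i + 1) / bins);
    # g == 1.0 is folded into the last region.
    bin_index = [min(int(v * bins), bins - 1) for v in g]
    counts = [0] * bins
    for b in bin_index:
        counts[b] += 1

    n_nonempty = sum(1 for c in counts if c > 0)
    # weight = tot / (samples in the same region) / (number of non-empty regions)
    return [tot / counts[b] / n_nonempty for b in bin_index]


def ghmc_loss(logits, targets, bins=10):
    """Weighted binary cross-entropy, summed and divided by the sample count."""
    weights = ghmc_weights(logits, targets, bins)
    total = 0.0
    for x, y, w in zip(logits, targets, weights):
        # Stable BCE-with-logits: max(x, 0) - x * y + log(1 + exp(-|x|)).
        bce = max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
        total += w * bce
    return total / max(len(logits), 1)
```

For example, with three easy negatives and one ambiguous positive, `ghmc_weights([-4.0, -4.0, -4.0, 0.0], [0, 0, 0, 1])` gives the three crowded samples a weight of 2/3 each and the lone sample a weight of 2.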

Result

Training using the Res50-FPN backbone and testing on COCO minival.

Method	AP
FL + SL1	35.6%
GHM-C + SL1	35.8%
GHM-C + GHM-R	37.0%

License

This project is released under the MIT license.

Citation

@inproceedings{li2019gradient,
  title={Gradient Harmonized Single-stage Detector},
  author={Li, Buyu and Liu, Yu and Wang, Xiaogang},
  booktitle={AAAI Conference on Artificial Intelligence},
  year={2019}
}

If the code helps you in your research, please also cite:

@article{mmdetection,
  title   = {{MMDetection}: Open MMLab Detection Toolbox and Benchmark},
  author  = {Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li,
             Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng,
             Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu,
             Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin},
  journal = {arXiv preprint arXiv:1906.07155},
  year    = {2019}
}


ghm_detection's Issues

questions about batch size in GHMC_loss

Hi, thanks for your nice work. In your paper, you mention that the best number of bins is 30, which is a balanced value. What batch size did you use in the experiments with 30 bins?

Is there an error in function `_expand_onehot_labels`?

Some code in function _expand_onehot_labels of https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/losses/ghm_loss.py is

    if inds.numel() > 0:
        bin_labels[inds, labels[inds]] = 1

but in your repo's function, it is:

    if inds.numel() > 0:
        bin_labels[inds, labels[inds] - 1] = 1

Is there something wrong?
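Whether the `- 1` is needed depends on the label convention, which the two snippets alone do not show. The sketch below assumes two possible conventions (both are assumptions for illustration, not something confirmed in this thread): one where 0 is background and foreground classes are 1..K, and one where foreground classes are 0..K-1 and K is background. The function name `expand_onehot` is hypothetical.

```python
def expand_onehot(labels, num_classes, background_is_zero):
    """Expand integer labels into one-hot rows of length num_classes."""
    rows = [[0] * num_classes for _ in labels]
    for row, label in zip(rows, labels):
        if background_is_zero:
            # Convention A: 0 = background, 1..num_classes = foreground,
            # so class k lands in column k - 1.
            if label >= 1:
                row[label - 1] = 1
        else:
            # Convention B: 0..num_classes-1 = foreground,
            # num_classes = background, so class k lands in column k.
            if 0 <= label < num_classes:
                row[label] = 1
    return rows
```

Under these assumptions both snippets produce the same one-hot rows for the same objects; they only disagree if the indexing and the label convention are mixed.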

TypeError: forward_train() got an unexpected keyword argument 'gt_bboxes_ignore'

Hi,

Firstly, thank you very much for this work, great job!
Secondly, when I tried to run this detector on the VOC dataset following the instructions, I got an error: TypeError: forward_train() got an unexpected keyword argument 'gt_bboxes_ignore'
I modified cfg_retinanet_ghm_r50_fpn_1x.py, replacing the data config part with the one in GHM_Detection/mmdetection/configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py. What did I miss?

Thirdly, do you have any plan to release your pretrained model? Thanks again!

add ghm loss to other network

Thank you for your good paper! I encountered some problems when I added the GHM loss to other one-stage networks. When adding ghmc_loss alone, the loss does not decrease during training and the network cannot learn anything from it. When adding ghmr_loss alone, the loss is NaN. Which parameters should I adjust, or which step did I miss?

result is much worse than ohem

Hi,

I have trained a face detector based on ohem loss (https://github.com/biubug6/Pytorch_Retinaface/blob/master/layers/modules/multibox_loss.py#L103)

and now I have replaced the regression loss and classification loss with ghm-c and ghm-r.

However, it seems that ohem is much better.

Here is an example,

from ohem,

the probabilities of the foreground are

the foreground probabilities are in [0.5, 0.6], which is very low, and the background probabilities do not seem to decrease.

the training loss seems fine.

By the way, I have trained for 30 epochs; however, the foreground probabilities get lower with more epochs of training, even though the training loss in the last epoch is much lower than in the early epochs. Could this be overfitting?

here is the early epoch (epoch-9) from ghm.

the foreground probabilities are much higher than in the last epoch, about [0.6, 0.7], but there is still a lot of background.

I use a confidence threshold of 0.5, and I applied the same initialization trick as focal loss.

I am not sure how to look into this problem and tune the parameters (I use the same GHM parameters as yours).

thank you.

What does the mask parameter refer to?

GHM_Detection/mmdetection/mmdet/core/loss/ghm_loss.py

Line 24 in ff58c3e

def calc(self, input, target, mask):

how to train only one class

Hi, I want to train with only one class. I set num_classes=2, but I get this error: RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #3 'other'

consulting some questions

Hi,
Thanks for your great work and sharing code! I have some questions:

1. In line 47 of ghm_loss.py, you update it with the following code:
self.acc_sum[i] = mmt * self.acc_sum[i] + (1 - mmt) * num_in_bin
According to #14 (comment), you explained that 'self.acc_sum would consider not only samples in the current batch, but also its previous value'. However, according to the update of self.acc_sum in the code, or update equation (12) in the paper, I think that at iteration t the self.acc_sum of the i-th bin only depends on the previous self.acc_sum. Am I missing something?
2. In #4 (comment), you explained 'sum[i+1] = mmt * sum[i] + (1 - mmt) * num[i]'. It seems that at each iteration t, the acc_sum of the (i+1)-th bin depends on the acc_sum of the i-th bin. Is that wrong?
3. When I use only the GHM-C loss for pixel-level classification in a segmentation task, the loss does not decrease. Could you give me some suggestions?

Thank you again, and sorry for bothering you.
Looking forward to your reply.
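One way to read the quoted update line is that the index i runs over bins, while the moving average runs over training iterations. The sketch below follows that reading; it is an illustration with a hypothetical helper `update_acc_sum` and made-up counts, and unlike the line quoted above it updates every bin, including empty ones.

```python
def update_acc_sum(acc_sum, counts, mmt):
    """One iteration of the per-bin moving average.

    acc_sum[i] after the update depends on acc_sum[i] before the update and
    on the count of bin i in the current batch, never on a neighbouring bin.
    """
    return [mmt * s + (1.0 - mmt) * c for s, c in zip(acc_sum, counts)]


acc_sum = [0.0, 0.0, 0.0]
batches = [[8, 1, 1], [6, 2, 2]]  # hypothetical per-bin counts for two batches
for counts in batches:
    acc_sum = update_acc_sum(acc_sum, counts, mmt=0.5)
```

After the two batches, bin 0 holds 0.5 * 4 + 0.5 * 6 = 5, which mixes its own previous value with its own current count.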

@DHPO You are right. In the paper, we define the M as the number of all bins. And in the latest version of our code, we choose the number of valid (non-empty) bins.

How to get the distribution of the gradient norm?

Hi,
Thanks for your great work! I would like to know how to obtain the distribution of the gradient norm shown in Fig. 2. The paper says the distribution is from a converged one-stage detection model. Was the converged model trained with the gradient harmonizing mechanism? Which dataset was used for the statistics: the COCO training set or the COCO test set?

Sorry for bothering you!

Have you ever tried it on YOLO?

The YOLOv3 paper says that focal loss did not help in YOLOv3. So how about your work?
Thank you!

Why use nonempty bins rather than all bins?

Why do you divide the weights by the number of non-empty bins (n) rather than the number of all bins (self.bins)?

GHM_Detection/mmdetection/mmdet/core/loss/ghm_loss.py

Line 54 in 3647287

weights = weights / n

I think M is the number of all bins in the paper. Am I missing something?
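The practical difference between the two divisors can be checked numerically. The sketch below uses a hypothetical helper `bin_weights` and made-up bin counts; it only compares the total weight over a batch under each normalization and does not claim to reproduce the repository code.

```python
def bin_weights(counts, divisor):
    """Per-sample weight for each bin: tot / count / divisor (0 for empty bins)."""
    tot = sum(counts)
    return [tot / c / divisor if c > 0 else 0.0 for c in counts]


counts = [90, 0, 0, 8, 0, 0, 0, 0, 0, 2]  # hypothetical batch: 10 bins, 3 non-empty
n_nonempty = sum(1 for c in counts if c > 0)

w_nonempty = bin_weights(counts, n_nonempty)  # dividing by n, the non-empty bins
w_all = bin_weights(counts, len(counts))      # dividing by M, all bins

# Total weight over all samples under each normalization.
total_nonempty = sum(w * c for w, c in zip(w_nonempty, counts))
total_all = sum(w * c for w, c in zip(w_all, counts))
```

With n as the divisor the weights sum to the number of samples (100 here), so the loss scale does not depend on how many bins happen to be occupied. With M the sum shrinks to 100 * 3 / 10 = 30 and changes whenever the number of occupied bins changes.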

the output of _expand_binary_labels

Hi,

To the best of my knowledge, _expand_binary_labels (https://github.com/libuyu/mmdetection/blob/master/mmdet/models/losses/ghm_loss.py#L8) outputs a one-hot embedding of the target.

I don't understand why you subtract 1 from the target (https://github.com/libuyu/mmdetection/blob/master/mmdet/models/losses/ghm_loss.py#L12).

can you elaborate?

thank you

questions about tot in GHMR_Loss

Hi,
Thanks for your excellent work!
As you mentioned in #2, mask is used to decide whether a sample is included in training or not. What I don't understand is that in retina_head.py, ghmr_loss is called as:

    loss_reg = self.ghmr_loss.calc(
        bbox_pred,
        bbox_targets,
        bbox_weights)

So bbox_weights is passed as the mask. However, as far as I understand, in anchor_target.py bbox_weights is a weight matrix that is 1 for positive samples and 0 for negative samples:

    bbox_weights = torch.zeros_like(anchors)
    if len(pos_inds) > 0:
        bbox_weights[pos_inds, :] = 1.0

@libuyu Could you please explain this part with an example? Thanks!
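If bbox_weights is 1 for positive anchors and 0 otherwise, as the snippet above suggests, then passing it as the mask would restrict both the bin statistics and the loss to positive samples. The plain-Python sketch below illustrates that reading for a momentum-free GHM-R. It is not the repository implementation: the function name `ghmr_loss`, the flat-list inputs and the default values are assumptions made for the example.

```python
import math


def ghmr_loss(pred, target, mask, mu=0.02, bins=10):
    """GHM-R sketch (no momentum); entries with mask == 0 are ignored."""
    diffs = [p - t for p, t in zip(pred, target)]
    # ASL1 loss and its gradient norm g = |d| / sqrt(d^2 + mu^2), in [0, 1).
    losses = [math.sqrt(d * d + mu * mu) - mu for d in diffs]
    g = [abs(d) / math.sqrt(d * d + mu * mu) for d in diffs]

    valid = [m > 0 for m in mask]
    tot = max(sum(valid), 1)

    # Count only valid samples in each unit region.
    counts = [0] * bins
    bin_index = [min(int(v * bins), bins - 1) for v in g]
    for b, ok in zip(bin_index, valid):
        if ok:
            counts[b] += 1
    n_nonempty = max(sum(1 for c in counts if c > 0), 1)

    total = 0.0
    for loss, b, ok in zip(losses, bin_index, valid):
        if ok:  # masked-out entries contribute nothing
            total += loss * (tot / counts[b] / n_nonempty)
    return total / tot
```

In this sketch, changing the prediction of a masked-out entry leaves the result unchanged, which is the behaviour one would expect if negative anchors carry no regression target.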

multi-gpu train error

Non-distributed multi-GPU training fails at this line of the code:
"inds = (g >= edges[i]) & (g < edges[i+1]) & valid"

The error is:
"arguments are located on different gpus at /pytorch/aten/THC/generic/THCTensorMathCompareT.cu:31"

a question about complexity analysis in this paper

The unit region approximation in this paper is essentially a bucket sort, which has time complexity O(N), while the paper states that this process costs O(MN). Is there anything I misread or missed?
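The two costs correspond to two ways of counting samples per bin. The sketch below contrasts them in plain Python with hypothetical function names: scanning all samples once per bin, and computing each sample's bin index directly. Which of the two the paper's O(MN) refers to is not settled in this thread.

```python
def counts_by_scanning_bins(g, bins):
    """O(M * N): for every bin, scan all samples and test the bin's bounds."""
    edges = [i / bins for i in range(bins + 1)]
    edges[-1] += 1e-6  # so that g == 1.0 falls into the last bin
    return [sum(1 for v in g if edges[i] <= v < edges[i + 1]) for i in range(bins)]


def counts_by_bucket_index(g, bins):
    """O(N): compute each sample's bin index directly."""
    counts = [0] * bins
    for v in g:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts
```

Both functions return the same histogram for values away from bin edges; values exactly on an edge may land in different bins because of floating-point rounding.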

loss is too small

The code
weights[inds] = tot / num_in_bin
loss = F.binary_cross_entropy_with_logits(input, target, weights, reduction='sum') / tot
is the same as weights[inds] = 1 / num_in_bin. Combined with weights = weights / n, the weighted loss may be one percent or one thousandth of the original loss if there are many samples in one bin.
If there is something wrong with my understanding, please tell me.
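The overall scale of the weights can be checked directly from the two quoted lines. Writing R_j for the number of valid samples in bin j, n for the number of non-empty bins, N for tot and \ell_k for the per-sample cross-entropy (notation introduced here for illustration), and ignoring momentum:

```latex
\sum_{k=1}^{N} w_k
  = \sum_{j:\,R_j>0} R_j \cdot \frac{N}{R_j\, n}
  = \sum_{j:\,R_j>0} \frac{N}{n}
  = N,
\qquad
L = \frac{1}{N}\sum_{k=1}^{N} w_k\,\ell_k .
```

So the weights average to 1 over the batch. Samples in crowded bins get weights well below 1 and samples in sparse bins get weights above 1, which makes the loss a reweighted mean of the per-sample losses rather than a uniformly scaled-down one.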

What is the label_weight in GHM loss?

Thanks for the work. I am planning to reuse GHM loss in other models, and I find that the label_weight matrix is required to compute ghmc and ghmr. In the code it mentions that if the sample is ignored then the value should be zero. How to compute the label_weight? In what situation should we ignore the sample?

The loss does not work in ssd

Hi
I rewrote your code in TensorFlow and used these two losses in an SSD object detection framework. However, the training does not converge. The loss value is very small at the beginning and does not decrease after many iterations.
I just add GHMC_LOSS and GHMR_LOSS and use their sum as the final training loss. Should I do something else with GHMC_LOSS and GHMR_LOSS? Which step have I missed?

May I ask whether GHM can be used in YOLO series algorithm

Hello, I have read your article on GHM recently, and I want to know whether it can be used in YOLO algorithm

GHMC

I used GHMC and GHMR on a new task. GHMR is very effective, but I found that the classification loss does not converge if I use GHMC. What usually causes this?

14 epochs or 12 epochs

I notice that it is 14 epochs in the paper while 12 epochs in the code.

multi-classification

Dear Sir,
Regarding the classification branch in your Gradient Harmonized Single-stage Detector paper:
if y_true.shape = (B, N, class_num) with class_num = 80 classes (I am using the MS COCO dataset),
can the code of this paper be applied? As far as I can see, the paper discusses a binary classification problem.

A question about pred and target in ghm_loss

The method is very good! However, I am new to this area. I see that the GHM code takes two parameters: pred [batch_num, class_num] and target [batch_num, class_num]. Does pred only need to be a [batch, class] matrix, without the image's C x H x W dimensions? I am confused because the input of focal_loss is pred [batch, c, h, w].
Thanks for your reply.

One question for acc_sum

Hello, for acc_sum, if we use a moving average, does acc_sum[i] keep growing for each i? Thank you.
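A quick simulation suggests the answer for an exponential moving average of the form acc = mmt * acc + (1 - mmt) * count. The sketch below tracks a single bin with a constant count per iteration; the helper name `run_moving_average` and the numbers are made up for the example.

```python
def run_moving_average(counts_per_iter, mmt):
    """Track one bin's acc_sum over iterations, starting from zero."""
    acc = 0.0
    history = []
    for c in counts_per_iter:
        acc = mmt * acc + (1.0 - mmt) * c
        history.append(acc)
    return history


# Hypothetical bin that receives 100 samples in every iteration.
history = run_moving_average([100] * 50, mmt=0.75)
```

Starting from zero, the first value is (1 - mmt) * 100 = 25, and the sequence then rises towards 100 without exceeding it. Under this update rule acc_sum grows only during the warm-up and afterwards tracks the recent per-bin counts.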

ImportError: /home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/ops/roi_align/roi_align_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at15UndefinedTensor10_singletonE

Hi, I'm facing a problem with training:
(py36) [madongliang@compute-0-5 experiments]$ sh train.sh
Traceback (most recent call last):
  File "../mmdetection/tools/train.py", line 8, in <module>
    from mmdet.apis import (train_detector, init_dist, get_root_logger,
  File "/home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/apis/__init__.py", line 2, in <module>
    from .train import train_detector
  File "/home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/apis/train.py", line 9, in <module>
    from mmdet.core import (DistOptimizerHook, DistEvalmAPHook,
  File "/home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/core/__init__.py", line 6, in <module>
    from .post_processing import * # noqa: F401, F403
  File "/home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/core/post_processing/__init__.py", line 1, in <module>
    from .bbox_nms import multiclass_nms
  File "/home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/core/post_processing/bbox_nms.py", line 3, in <module>
    from mmdet.ops.nms import nms_wrapper
  File "/home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/ops/__init__.py", line 2, in <module>
    from .roi_align import RoIAlign, roi_align
  File "/home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/ops/roi_align/__init__.py", line 1, in <module>
    from .functions.roi_align import roi_align
  File "/home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/ops/roi_align/functions/roi_align.py", line 3, in <module>
    from .. import roi_align_cuda
ImportError: /home/madongliang/.conda/envs/py36/lib/python3.6/site-packages/mmdet/ops/roi_align/roi_align_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN2at15UndefinedTensor10_singletonE

I'm using CUDA 9.0, PyTorch 1.0.1.post2, Python 3.6.
Everything compiled without errors during installation.

Thanks!

Hi, backward

i'm sorry, i can't find the backward code for the ghm_loss....

what is the purpose of weights[inds] = tot / num_in_bin and weights = weights / n if momentum=0？

https://github.com/libuyu/mmdetection/blob/be06992564cc6b995b1ae86a258568e9d7b7a599/mmdet/models/losses/ghm_loss.py#L70
weights[inds] = tot / num_in_bin
https://github.com/libuyu/mmdetection/blob/be06992564cc6b995b1ae86a258568e9d7b7a599/mmdet/models/losses/ghm_loss.py#L73
weights = weights / n

Experimental effect on voc dataset

Hi, your work performs well on the COCO dataset. Have you run experiments on the VOC dataset? I used the same parameters on VOC, but got lower AP.5, AP.75 and AP.8 than the original RetinaNet. Why is that?
Thank you!

Really nice work! I have few questions~~~

I wonder whether you have any guideline for choosing the two parameters bins and momentum.
I guess bins might be related to the number of target classes, so you chose bins=30 for COCO, which has 80 classes.
Also, I am curious about the performance with different momentum values.
Thank you for sharing.

For two-stage training, should I add the GHM loss to both rpn_head and two_stage_head?

Or just rpn_head?
And should I use the same momentum, mu and bins as for the one-stage detector? If I use cascade_mask_rcnn, what about these hyperparameters?

Thanks a lot.

About "target" in ghm_loss.py for GHMC

Many thanks for your GREAT work!
I have a question about the "target" in ghm_loss.py:

def forward(self, pred, target, label_weight, *args, **kwargs):

Is the "target" in one-hot format?
And where does the "target" come from?

Thanks a lot~~

Very low AP value?

@libuyu
Thanks for your great job!
I want to reproduce the GHM model. I use two GPUs with batch size 2*4=8, and the other settings in your cfg file (experiments/cfg_retinanet_ghm_r50_fpn_1x.py) are unchanged. I trained on the COCO dataset; after 12 epochs, the loss stays at about 3.6.
However, the test result on COCO val is very low: 9.3% AP, 27.3% AP0.5 and 2.4% AP0.75.
I wonder why it performs so badly on the COCO val set. What was the loss value after 12 epochs in your training?
Do you have any ideas? Thanks a lot!

A question about variable self.acc_sum in GHMC_loss

Hi, Buyu,
Thanks for your great work !
In line 22 of ghm_loss.py, the variable self.acc_sum in the GHMC_loss class is initialized as a list filled with zeros. In line 47, you update it with the following code:
self.acc_sum[i] = mmt * self.acc_sum[i] + (1 - mmt) * num_in_bin
As self.acc_sum[i] is initialized to zero, mmt * self.acc_sum[i] contributes nothing in the first update. Could you explain your idea here a little?

Best Regards
Qin

Question on momentum

Hi Buyu,

Thanks for your good paper! I am rewriting your code in TensorFlow and adapting it to RetinaNet. However, I found that the vanilla GHM-R loss (no momentum) converges much more slowly than smooth L1 loss in my experiments. Have you experimented with the momentum parameter, and does it affect the results a lot?

Shangxuan

tensor([  3.82812,   3.82812, 512.00000,  47.28125,   3.82812,   3.82812,  47.28125,   3.82812,   3.82812,   3.82812,  47.28125,  47.28125,   3.82812,   3.82812,  47.28125, 165.00000,   3.82812,   3.82812,  47.28125, 165.00000,   3.82812,   3.82812,   3.82812,   3.82812,   3.82812, 928.00000,  ...,   ], device='cuda:0', dtype=torch.float16)

A question about Mask

what is the mask?

Can it be applied to multi-classification problems?

For multi-class classification, when p = softmax(x), does GHMC_loss work?
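This thread does not say whether GHMC_loss supports softmax outputs; the loss quoted in the issues above is built on binary_cross_entropy_with_logits, i.e. a sigmoid per class. One possible adaptation, shown here purely as an assumption, is to use the magnitude of the softmax cross-entropy gradient with respect to the true-class logit, which is 1 - p[label], as the per-sample gradient norm and then bin it as usual. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def softmax_gradient_norm(logits, label):
    """Scalar g = 1 - p[label], i.e. |dL/dz_label| for softmax cross-entropy."""
    return 1.0 - softmax(logits)[label]
```

A confidently correct sample gives g close to 0 and a confidently wrong one gives g close to 1, so the value lies in [0, 1] and could be assigned to unit regions in the same way as the sigmoid case. Whether this reweighting helps in practice would need to be tested.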