shuliu1993 / panet

PANet for Instance Segmentation and Object Detection

License: MIT License

Python 84.05% MATLAB 0.26% Shell 0.36% Cuda 8.18% C 6.80% C++ 0.36%
instance-segmentation object-detection

panet's Introduction

Path Aggregation Network for Instance Segmentation

by Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, Jiaya Jia.

Introduction

This repository is for the CVPR 2018 Spotlight paper 'Path Aggregation Network for Instance Segmentation', which won 1st place in the COCO Instance Segmentation Challenge 2017, 2nd place in the COCO Detection Challenge 2017 (Team Name: UCenter), and 1st place in the 2018 Scene Understanding Challenge for Autonomous Navigation in Unstructured Environments (Team Name: TUTU).

Citation

If PANet is useful for your research, please consider citing:

@inproceedings{liu2018path,
  author = {Shu Liu and
            Lu Qi and
            Haifang Qin and
            Jianping Shi and
            Jiaya Jia},
  title = {Path Aggregation Network for Instance Segmentation},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2018}
}

Disclaimer

  • The original code was implemented on a modified version of Caffe maintained by SenseTime Research. For several reasons, we cannot release that original code.
  • In this repository, we provide our re-implementation of PANet based on PyTorch. Note that our code is heavily based on Detectron.pytorch. Thanks to Roy for his great work!
  • Several details in Detectron, e.g., weight initialization and RPN joint training, are fairly different from our original implementation. In this repository, we simply follow Detectron because it achieves a better baseline than the codebase used in our paper.
  • In this repository, we test our code with the BN layers in the backbone fixed and GN used elsewhere. We expect better performance with Synchronized Batch Normalization and with all parameter layers trained, as in our paper. With those differences and a much stronger baseline, the improvement is not the same as the one we reported, but we achieve better performance than our original implementation.
  • We trained with an image batch size of 16 on 8 P40 GPUs. Performance should be similar with batch size 8.

Installation

For environment requirements, data preparation and compilation, please refer to Detectron.pytorch.

WARNING: PyTorch 0.4.1 is broken (see pytorch/pytorch#8483); use PyTorch 0.4.0.
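A minimal environment sketch under that constraint (assuming a pip-based setup; only the PyTorch pin comes from this repository):

pip install torch==0.4.0 torchvision==0.2.1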

Usage

For training and testing, we follow the same procedure as Detectron.pytorch. To train and test PANet, simply use the corresponding config files. For example, to train PANet on COCO:

python tools/train_net_step.py --dataset coco2017 --cfg configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml

To evaluate the model, simply use:

python tools/test_net.py --dataset coco2017 --cfg configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml --load_ckpt {path/to/your/checkpoint}

Main Results

Backbone               | Type   | Batch Size | LR Schedule | Box AP | Mask AP | Download
-----------------------|--------|------------|-------------|--------|---------|---------
R-50-PANet (paper)     | Faster | 16         | 1x          | 39.2   | -       | -
R-50-PANet             | Faster | 16         | 1x          | 39.8   | -       | model
R-50-PANet-2fc (paper) | Faster | 16         | 1x          | 39.0   | -       | -
R-50-PANet-2fc         | Faster | 16         | 1x          | 39.6   | -       | model
R-50-PANet (paper)     | Mask   | 16         | 2x          | 42.1   | 37.8    | -
R-50-PANet             | Mask   | 16         | 2x          | 43.1   | 38.3    | model

Results on the COCO 2017 val subset produced by this repository. In our paper, we used Synchronized Batch Normalization following all parameter layers, while in this repository we fix the BN layers in the backbone and use GN layers elsewhere. With the same set of hyper-parameters (e.g., multi-scale settings), this repository produces better performance than our original paper. We expect still better performance with Synchronized Batch Normalization.

Questions

Please contact '[email protected]'

panet's People

Contributors

shuliu1993


panet's Issues

mynn.DataParallel module does not support multiple gpus

In infer_simple.py the mask RCNN model is created as follows:

maskRCNN = mynn.DataParallel(maskRCNN, cpu_keywords=['im_info', 'roidb'],
                                 minibatch=True, device_ids=[0])  # only support single GPU

I've noticed that the DataParallel module only supports a single GPU. I actually wanted to speed up my inference time, i.e., the time taken to visualise the detections on an image, since I have upwards of 1000 images. Is there a way to visualise the detections on multiple images using multiple GPUs? Also, why can't the model support multiple GPUs?
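For reference, one workaround is to shard the image list across one process per GPU; a rough sketch (this wrapper is hypothetical and not part of the repository, and it relies on infer_simple.py's existing --images argument, visible in the Namespace dumps elsewhere on this page):

# shard_infer.py -- hypothetical wrapper, not part of this repository
import os
import subprocess
import sys

image_dir = sys.argv[1]
images = sorted(os.path.join(image_dir, f) for f in os.listdir(image_dir))
num_gpus = 4
procs = []
for gpu in range(num_gpus):
    shard = images[gpu::num_gpus]                           # round-robin split of the image list
    cmd = ['python', 'tools/infer_simple.py',
           '--dataset', 'coco2017',
           '--cfg', 'configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml',
           '--load_ckpt', 'panet_mask_step179999.pth',
           '--output_dir', 'infer_outputs',
           '--images'] + shard
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))   # pin one GPU per process
    procs.append(subprocess.Popen(cmd, env=env))
for p in procs:
    p.wait()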

Backbone ResNeXt101 CUDA OUT OF MEMORY

Hi all,
First, I managed to train PANet with ResNet-50 with batch_size = 8 on 8 GTX 1080 GPUs.
But when I tried to change the backbone from R-50 to ResNeXt-101, I hit an out-of-memory problem. It can train for several steps, but is always at the edge of OOM and eventually runs out of memory.
Could I use 4 GPUs for this? I tried to change the code to make it work, but ran into other problems. The code uses all GPUs by default, so I commented that out.

maskRCNN = mynn.DataParallel(maskRCNN, cpu_keywords=['im_info', 'roidb'],device_ids=[0, 1, 2, 3],
                                 minibatch=True)

runs into an index-out-of-range problem.
Has anyone succeeded with this?
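For reference, a common alternative to editing device_ids is to hide the other cards with the standard CUDA_VISIBLE_DEVICES environment variable, so the code's default "use all GPUs" path only sees four:

CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/train_net_step.py --dataset coco2017 --cfg {path/to/your/config}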

About the memory usage

Hi, can you provide the memory usage for a batch size of one or two on a single GPU card? I assume the consumption is beyond my imagination. Thanks!
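There are no official numbers in this thread, but PyTorch 0.4 can report peak usage directly, so it is easy to measure on your own card (values vary with backbone, image scale, and batch size):

import torch

# run one training or inference step first, then:
print('peak allocated: %.2f GB' % (torch.cuda.max_memory_allocated(0) / 1024 ** 3))
print('peak cached:    %.2f GB' % (torch.cuda.max_memory_cached(0) / 1024 ** 3))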

Multi-Scale Testing Support

Thank you for your amazing work. I notice that only one test scale can be given in the config. Is it possible to provide multi-scale testing support, or steps to follow to implement it? Also, can you please explain what you mean by bounding-box augmentation at test time? Thank you very much.
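For reference, test-time bounding-box augmentation usually means merging detections from rescaled (and flipped) copies of the image, as Detectron's TEST.BBOX_AUG options do. A rough sketch of that pattern (im_detect_at_scale is a hypothetical stand-in, not a function in this repository):

import numpy as np
import utils.boxes as box_utils       # utils/boxes.py ships with this repository

all_dets = []
for scale in (400, 500, 600, 700, 800):
    dets = im_detect_at_scale(model, im, scale)   # hypothetical: rows of [x1, y1, x2, y2, score]
    all_dets.append(dets)
dets = np.vstack(all_dets)
keep = box_utils.nms(dets, 0.5)                   # merge duplicates across scales
dets = dets[keep]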

Nan values for certain classes during testing

I tried out the PANet framework and observed pretty good results. But when I calculate the mAP score via the test_net.py script, most of my classes output nan. Would you happen to know why this is the case? The detections the model produces on the test images look pretty good, hence my confusion regarding this error.
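For what it's worth, nan per-category AP is expected whenever a category has no ground-truth instances in the evaluation split: the per-category average is then a mean over an empty slice, which is exactly the numpy warning in the log below. A minimal sketch of summarising while skipping those categories:

import numpy as np

# example values taken from the log below; nan marks categories absent from the eval set
per_category_ap = np.array([44.6, np.nan, 64.1, np.nan, 36.2, 0.0])
print(np.nanmean(per_category_ap))   # mean over categories that actually appear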

INFO json_dataset_evaluator.py: 222: ~~~~ Mean and per-category AP @ IoU=[0.50,0.95] ~~~~
INFO json_dataset_evaluator.py: 223: 12.0
INFO json_dataset_evaluator.py: 231: 44.6
/usr/local/onnx/numpy/core/fromnumeric.py:2957: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/usr/local/onnx/numpy/core/_methods.py:80: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 64.1
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 36.2
INFO json_dataset_evaluator.py: 231: 0.0
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 6.0
INFO json_dataset_evaluator.py: 231: 1.4
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 19.8
INFO json_dataset_evaluator.py: 231: 3.2
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 5.9
INFO json_dataset_evaluator.py: 231: 4.6
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 5.1
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 28.6
INFO json_dataset_evaluator.py: 231: 0.0
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: nan
INFO json_dataset_evaluator.py: 231: 32.0
....................................................................

INFO json_dataset_evaluator.py: 232: ~~~~ Summary metrics ~~~~
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.120
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.180
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.129
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.068
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.232
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.207
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.059
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.135
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.166
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.082
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.318
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.246

System information

  • Operating system: Ubuntu 16.04
  • CUDA version: 10.0
  • Python version: 3.6.5
  • PyTorch version: 0.4.0
  • numpy version: 1.14.0

reproduce result

I trained with batch size 8 on 8 GTX 1080 Ti GPUs:
python tools/train_net_step.py --dataset coco2017 --cfg configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml
changing only IMS_PER_BATCH in config.py to 1 to get batch size 8.
In the end I get box AP 0.3507 and mask AP 0.3148, which is much lower than the reported numbers. I also downloaded the pretrained model and got the same result as reported for this repo, so I think the problem is with my trained model.
But I'm confused about how to fix it. I use a smaller batch size, but the lr is still 0.02 in the config; should I make it smaller? (I will test this over the weekend.) Does anyone have other advice?
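On the learning-rate question: by the linear scaling rule (Goyal et al., 2017), halving the total batch size suggests halving the base LR and doubling the schedule. Hedged arithmetic, not settings documented by this repository:

# linear scaling rule: lr is proportional to total batch size
base_lr = 0.02                # the config value, tuned for batch size 16
scaled_lr = base_lr * 8 / 16  # = 0.01 for batch size 8
# the iteration count and LR-decay steps would correspondingly double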

train result error

I am using a dataset I made myself:

"annotations": [{"segmentation": [[251, 198, 286, 198, 322, 198, 358, 198, 394, 198, 430, 198, 466, 198, 466, 252, 430, 252, 394, 252, 358, 252, 322, 252, 286, 252, 251, 252]], "area": 11610, "iscrowd": 0, "image_id": 1, "bbox": [251, 198, 215, 54], "category_id": 1, "id": 1}, {"segmentation": [[196, 139, 251, 119, 306, 114, 361, 107, 416, 108, 471, 121, 527, 139, 507, 201, 459, 178, 411, 165, 362, 159, 314, 163, 266, 174, 218, 198]], "area": 31114, "iscrowd": 0, "image_id": 1, "bbox": [196, 107, 331, 94], "category_id": 1, "id": 2},

Detectron (Caffe2) works on it,
but the PANet test results are not good: 1200 pictures, MAX_ITER: 15000.
Is it a dataset problem?
1434.pdf
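One way to rule the dataset in or out is to load the annotation file with pycocotools and spot-check a few entries; a minimal sketch (the file path is an assumption):

from pycocotools.coco import COCO

coco = COCO('annotations/instances_train.json')   # assumed path to your annotation file
print(len(coco.getImgIds()), 'images,', len(coco.getAnnIds()), 'annotations')
ann = coco.loadAnns(coco.getAnnIds(imgIds=coco.getImgIds()[:1]))[0]
x, y, w, h = ann['bbox']                          # COCO boxes are [x, y, width, height]
assert w > 0 and h > 0 and ann['area'] > 0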

How to train Cityscapes dataset

Hi! How do I load and train the Cityscapes dataset? Cityscapes has no bounding boxes, so how do you train it for detection? Thanks.
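For detection, boxes can be derived from the instance polygons themselves; Detectron's Cityscapes-to-COCO converter does essentially this. A minimal sketch:

# derive an axis-aligned COCO box [x, y, w, h] from one flat polygon [x1, y1, x2, y2, ...]
def poly_to_bbox(poly):
    xs, ys = poly[0::2], poly[1::2]
    x0, y0 = min(xs), min(ys)
    return [x0, y0, max(xs) - x0, max(ys) - y0]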

Questions on benchmark

I find that you use 'roi_Xconv1fc_gn_head_panet' as the head in your model when training R50-PANet and get 39.8 box mAP. I tried to train a PANet using the '2fc_mlp_head' with the PA structure unchanged and got 37.1 box mAP, which is only 0.4 above the FPN baseline. Is there something wrong with this implementation, or is it possible that PANet has to be combined with a heavy head to perform well?

infer_simple.py error ?

  • I ran the script using your main-results model, but got an error:
/home/rjw/anaconda3/envs/tf1.8_torch0.4_py3.5/bin/python /home/rjw/desktop/maskrcnn/PANet-master/tools/infer_simple.py --dataset coco --cfg ./configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml --load_detectron /home/rjw/desktop/maskrcnn/PANet-master/panet_mask_step179999.pth --image_dir ./demo/sample_images --output_dir ./demo/
Called with args:
Namespace(cfg_file='./configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml', cuda=True, dataset='coco', image_dir='./demo/sample_images', images=None, load_ckpt=None, load_detectron='/home/rjw/desktop/maskrcnn/PANet-master/panet_mask_step179999.pth', merge_pdfs=True, output_dir='./demo/', set_cfgs=[])
load cfg from file: ./configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml
loading detectron weights /home/rjw/desktop/maskrcnn/PANet-master/panet_mask_step179999.pth
Traceback (most recent call last):
  File "/home/rjw/desktop/maskrcnn/PANet-master/tools/infer_simple.py", line 177, in <module>
    main()
  File "/home/rjw/desktop/maskrcnn/PANet-master/tools/infer_simple.py", line 129, in main
    load_detectron_weight(maskRCNN, args.load_detectron)
  File "/home/rjw/desktop/maskrcnn/PANet-master/lib/utils/detectron_weight_helper.py", line 14, in load_detectron_weight
    if 'blobs' in src_blobs:
TypeError: argument of type 'int' is not iterable
  • I downloaded your pretrained model, but reading the file gives this error, and I don't know why. Can you help me? Thanks!

All rois are predicted to backgrounds ! ?

Need help!
I trained PANet on the COCO 2017 detection dataset and added some prints to the fast_rcnn_losses function in faster_rcnn_heads.py:

    print("cls_preds中非背景个数:",(cls_preds!=0).sum())
    print("rois_label中非背景个数:",(rois_label!=0).sum())
    print("非背景的正确预测个数: ",((cls_preds==rois_label)*(cls_preds!=0)).sum())
    print("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~")

The running log is as follows:
at the first iteration, all rois are predicted as foreground classes (batch = 2, 2x512 = 1024 rois),
but after several iterations, all rois are predicted as the background class!
Strangest of all are the ground-truth classification labels:
in one batch, only about 20 rois are foreground, and the remaining roughly 1000 rois are background!
So after several iterations, all rois are predicted as background, whose index is 0.
(screenshots attached)

Shouldn't (the number of positive rois / the number of negative rois) equal 1/3?
Why are rois_label almost all background?

I added three print lines to the loss computation in the fast_rcnn_losses function of faster_rcnn_heads.py.
At the first iteration, cls_score predicts every roi as non-background, yet among the 1024 rois only a few ground-truth labels are foreground. So within less than a hundred iterations the network tends to classify every roi as background. At that point cls_accuracy reads 0.97+, because most of the mostly-background 1024 rois are classified correctly, and the loss hovers around 0.2, but the network has not actually learned the foreground classes at all; training is stuck at the starting line. :(

Hoping someone can explain.
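For context, the 1:3 ratio is only an upper bound on the foreground share: Detectron-style sampling caps foreground RoIs at TRAIN.FG_FRACTION (0.25 by default) of TRAIN.BATCH_SIZE_PER_IM and fills the rest with background, so when few proposals overlap a ground truth the batch is dominated by background. A hedged sketch of that sampling logic (not this repository's exact code):

import numpy as np

max_overlaps = np.random.rand(2000)             # stand-in for each proposal's IoU with its nearest gt
rois_per_im = 512                               # TRAIN.BATCH_SIZE_PER_IM
fg_rois_per_im = int(0.25 * rois_per_im)        # TRAIN.FG_FRACTION caps foreground at 128
fg_inds = np.where(max_overlaps >= 0.5)[0]      # foreground IoU threshold
fg_this_im = min(fg_rois_per_im, fg_inds.size)  # early in training few proposals reach 0.5 IoU
bg_this_im = rois_per_im - fg_this_im           # the remainder is filled with background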

4 X P40 is not enough?

I have tried to reproduce your results with 4 P100 GPUs but hit the following errors.

I am wondering whether anyone could give me some hints.

loss_rpn_bbox_fpn6: 0.014951
INFO train_net_step.py: 443: Save ckpt on exception ...
INFO train_net_step.py: 135: save model: Outputs/e2e_panet_R-50-FPN_2x_mask/ckpt/model_step1506.pth
INFO train_net_step.py: 445: Save ckpt done.
Traceback (most recent call last):
  File "tools/train_net_step.py", line 425, in main
    net_outputs = maskRCNN(**input_data)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/teamscratch/msravcshare/yuyua/code/segmentation/PANet/lib/nn/parallel/data_parallel.py", line 111, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/teamscratch/msravcshare/yuyua/code/segmentation/PANet/lib/nn/parallel/data_parallel.py", line 139, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/teamscratch/msravcshare/yuyua/code/segmentation/PANet/lib/nn/parallel/parallel_apply.py", line 67, in parallel_apply
    raise output
  File "/teamscratch/msravcshare/yuyua/code/segmentation/PANet/lib/nn/parallel/parallel_apply.py", line 42, in _worker
    output = module(*input, **kwargs)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/teamscratch/msravcshare/yuyua/code/segmentation/PANet/lib/modeling/model_builder.py", line 144, in forward
    return self._forward(data, im_info, roidb, **rpn_kwargs)
  File "/teamscratch/msravcshare/yuyua/code/segmentation/PANet/lib/modeling/model_builder.py", line 215, in _forward
    loss_mask = mask_rcnn_heads.mask_rcnn_losses(mask_pred, rpn_ret['masks_int32'])
  File "/teamscratch/msravcshare/yuyua/code/segmentation/PANet/lib/modeling/mask_rcnn_heads.py", line 101, in mask_rcnn_losses
    masks_pred.view(n_rois, -1), masks_gt, weight, size_average=False)
  File "/root/miniconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 1651, in binary_cross_entropy_with_logits
    loss = input - input * target + max_val + ((-max_val).exp() + (-input - max_val).exp()).log()
RuntimeError: CUDA error: out of memory

> Yes

Yes

Can you tell me how to train my own dataset? I don't know how to preprocess the dataset.

Originally posted by @YongLD in #20 (comment)
Do you know how to do that? Could you help me? Thank you so much!

how to test a picture?

Could you please add some information about the test procedure?
I ran infer_simple.py to visualise the outcome,
but it raises an error:

Traceback (most recent call last):
  File "tools/infer_simple.py", line 176, in <module>
    main()
  File "tools/infer_simple.py", line 163, in main
    kp_thresh=2
  File "/home/cm/PANet/lib/utils/vis.py", line 183, in vis_one_image
    e.copy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)
ValueError: not enough values to unpack (expected 3, got 2)

The command I used is:
python tools/infer_simple.py --dataset coco2017 --cfg configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml --load_ckpt panet_mask_step179999.pth --image_dir demo/sample_images --output_dir infer_outputs
Can you help me? Thank you so much!
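This unpacking error is the classic OpenCV version mismatch: cv2.findContours returns three values under OpenCV 3 but only two under OpenCV 4. A version-agnostic patch for the call in vis.py:

# works under OpenCV 3 (image, contours, hierarchy) and OpenCV 4 (contours, hierarchy)
contours, hierarchy = cv2.findContours(
    e.copy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_NONE)[-2:]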

No module named cython_bbox

I didn't encounter any errors running sh make.sh, but when I run test_net.py I still get the error below:

'VIS': False,
'VIS_TH': 0.9}
loading annotations into memory...
Done (t=0.66s)
creating index...
index created!
INFO subprocess.py: 87: detection range command 0: python /home/zhongligeng/projects/PANet/tools/test_net.py --range 0 834 --cfg /home/zhongligeng/projects/PANet/test/detection_range_config.yaml --set TEST.DATASETS '("coco_2017_val",)' --output_dir /home/zhongligeng/projects/PANet/test --load_ckpt /home/zhongligeng/projects/PANet/Outputs/panet_mask_step179999.pth
INFO subprocess.py: 87: detection range command 1: python /home/zhongligeng/projects/PANet/tools/test_net.py --range 834 1668 --cfg /home/zhongligeng/projects/PANet/test/detection_range_config.yaml --set TEST.DATASETS '("coco_2017_val",)' --output_dir /home/zhongligeng/projects/PANet/test --load_ckpt /home/zhongligeng/projects/PANet/Outputs/panet_mask_step179999.pth
INFO subprocess.py: 87: detection range command 2: python /home/zhongligeng/projects/PANet/tools/test_net.py --range 1668 2501 --cfg /home/zhongligeng/projects/PANet/test/detection_range_config.yaml --set TEST.DATASETS '("coco_2017_val",)' --output_dir /home/zhongligeng/projects/PANet/test --load_ckpt /home/zhongligeng/projects/PANet/Outputs/panet_mask_step179999.pth
INFO subprocess.py: 87: detection range command 3: python /home/zhongligeng/projects/PANet/tools/test_net.py --range 2501 3334 --cfg /home/zhongligeng/projects/PANet/test/detection_range_config.yaml --set TEST.DATASETS '("coco_2017_val",)' --output_dir /home/zhongligeng/projects/PANet/test --load_ckpt /home/zhongligeng/projects/PANet/Outputs/panet_mask_step179999.pth
INFO subprocess.py: 87: detection range command 4: python /home/zhongligeng/projects/PANet/tools/test_net.py --range 3334 4167 --cfg /home/zhongligeng/projects/PANet/test/detection_range_config.yaml --set TEST.DATASETS '("coco_2017_val",)' --output_dir /home/zhongligeng/projects/PANet/test --load_ckpt /home/zhongligeng/projects/PANet/Outputs/panet_mask_step179999.pth
INFO subprocess.py: 87: detection range command 5: python /home/zhongligeng/projects/PANet/tools/test_net.py --range 4167 5000 --cfg /home/zhongligeng/projects/PANet/test/detection_range_config.yaml --set TEST.DATASETS '("coco_2017_val",)' --output_dir /home/zhongligeng/projects/PANet/test --load_ckpt /home/zhongligeng/projects/PANet/Outputs/panet_mask_step179999.pth
INFO subprocess.py: 127: # ---------------------------------------------------------------------------- #
INFO subprocess.py: 129: stdout of subprocess 0 with range [1, 834]
INFO subprocess.py: 131: # ---------------------------------------------------------------------------- #
Traceback (most recent call last):
  File "/home/zhongligeng/projects/PANet/tools/test_net.py", line 14, in <module>
    from core.test_engine import run_inference
  File "/home/zhongligeng/projects/PANet/lib/core/test_engine.py", line 36, in <module>
    from core.test import im_detect_all
  File "/home/zhongligeng/projects/PANet/lib/core/test.py", line 43, in <module>
    import utils.boxes as box_utils
  File "/home/zhongligeng/projects/PANet/lib/utils/boxes.py", line 52, in <module>
    import utils.cython_bbox as cython_bbox
ImportError: No module named cython_bbox
Traceback (most recent call last):
  File "tools/test_net.py", line 112, in <module>
    check_expected_results=True)
  File "/home/zhongligeng/projects/PANet/lib/core/test_engine.py", line 128, in run_inference
    all_results = result_getter()
  File "/home/zhongligeng/projects/PANet/lib/core/test_engine.py", line 108, in result_getter
    multi_gpu=multi_gpu_testing
  File "/home/zhongligeng/projects/PANet/lib/core/test_engine.py", line 154, in test_net_on_dataset
    args, dataset_name, proposal_file, num_images, output_dir
  File "/home/zhongligeng/projects/PANet/lib/core/test_engine.py", line 186, in multi_gpu_test_net_on_dataset
    args.load_ckpt, args.load_detectron, opts
  File "/home/zhongligeng/projects/PANet/lib/utils/subprocess.py", line 107, in process_in_parallel
    log_subprocess_output(i, p, output_dir, tag, start, end)
  File "/home/zhongligeng/projects/PANet/lib/utils/subprocess.py", line 145, in log_subprocess_output
    assert ret == 0, 'Range subprocess failed (exit code: {})'.format(ret)
AssertionError: Range subprocess failed (exit code: 1)

Any idea how to solve this?
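Two usual causes, assuming compilation itself succeeded: the Cython extensions were built with a different Python than the one running test_net.py, or lib/ is not on the import path. A hedged checklist:

cd lib && sh make.sh                     # rebuild with the same interpreter you use to run the tools
export PYTHONPATH=$(pwd):$PYTHONPATH     # only needed if the repo's own path setup is bypassed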

License

Hi Dear,
Could you tell me what the license of this software is, please?
According to GitHub's policy, all repositories without an explicit license are considered copyrighted material. Do the authors intend to make this software free?
Thank you!

reproduced result

I get almost the same result as the one in the paper using e2e_panet_R-50-FPN_1x_det.yaml on val2017. But when I test on the 2017 test-dev data, I get a worse result: 39.2% AP, while the paper reports 42.5. I do not know why I get almost the same result on val2017 but a worse result on test-dev 2017.

sh make.sh

(screenshot attached)

gcc:5.4.0
pytorch:0.4.0
torchvision:0.2.1
cuda:9.0

when I do "sh make.sh",it comes a problem:

cffi.VerificationError:LInkError:command 'gcc' failed with exit status 1

What can I do?

OOM during inference

Hi

Does anyone know how much GPU memory is needed to run the checkpoint panet_mask_step179999.pth? I've got 8 GB, but python tools/infer_simple.py --dataset coco2017 --cfg configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml --load_ckpt panet_mask_step179999.pth gives OOM:

  File "/Volumes/Data/Projects/PANet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/Volumes/Data/Projects/PANet/src/PANet/lib/modeling/FPN.py", line 439, in forward
    fpn_rpn_conv = F.relu(self.FPN_RPN_conv(bl_in), inplace=True)
  File "/Volumes/Data/Projects/PANet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/Volumes/Data/Projects/PANet/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (2) : out of memory at /Volumes/Data/Projects/pytorch/src/pytorch/aten/src/THC/generic/THCStorage.cu:58

repetition result on cityscapes dataset

Has anybody trained PANet on the Cityscapes dataset? When I train PANet on Cityscapes with the hyper-parameters listed in the paper, I only get 30.26 mask AP on the Cityscapes validation set, which is much lower than the 36.5 mask AP in the paper. Maybe I got something wrong in the configuration file. Would anyone like to share their Cityscapes configuration file? Thanks a lot.

ModuleNotFoundError: No module named 'utils.cython_bbox'

Traceback (most recent call last):
  File "infer_simple.py", line 28, in <module>
    from core.test import im_detect_all
  File "/content/PANet/lib/core/test.py", line 43, in <module>
    import utils.boxes as box_utils
  File "/content/PANet/lib/utils/boxes.py", line 52, in <module>
    import utils.cython_bbox as cython_bbox
ModuleNotFoundError: No module named 'utils.cython_bbox'

I am using Google Colab with CUDA version 9.0 and PyTorch version 0.4.0.

Evaluation Precision x Recall

Hi everyone,

I am doing a comparison with other object detection networks and I have a question: how can I properly vary the threshold to obtain different recall values? I would like to just run infer_simple to get outputs at different thresholds; I'm not sure if I should touch the NMS or not.

Thanks a lot!
Güinther.
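One approach that leaves NMS alone: run inference once with a very low score threshold, then sweep the threshold offline over the saved detections; each rank in the sorted list is one operating point on the PR curve. A minimal numpy sketch (the tp flags are assumed to be precomputed by matching detections to ground truth):

import numpy as np

scores = np.array([0.95, 0.9, 0.8, 0.6, 0.3])   # detection confidences
tp = np.array([1, 1, 0, 1, 0])                  # 1 if the detection matched a gt box
num_gt = 4                                      # total ground-truth objects

order = np.argsort(-scores)
tp_cum = np.cumsum(tp[order])
precision = tp_cum / (np.arange(len(scores)) + 1)
recall = tp_cum / num_gt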

training slowly

Wonderful work!
Thank you for sharing your code!

It took about 30 seconds to train 20 steps with four Titan X GPUs. Can you give me some suggestions to speed up the training process? Thank you in advance.

Training on Cityscapes

Are these the only lines (in train_net_step.py) that need to be changed to train on Cityscapes?

    if args.dataset == "coco2017":
        cfg.TRAIN.DATASETS = ('coco_2017_train',)
        cfg.MODEL.NUM_CLASSES = 81
    elif args.dataset == "keypoints_coco2017":
        cfg.TRAIN.DATASETS = ('keypoints_coco_2017_train',)
        cfg.MODEL.NUM_CLASSES = 2
    else:
        raise ValueError("Unexpected args.dataset: {}".format(args.dataset))
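For reference, mostly yes, plus a matching dataset-catalog entry. A hedged sketch of an added branch (the dataset key must match an entry in lib/datasets/dataset_catalog.py, and the names below are assumptions):

    elif args.dataset == "cityscapes":
        cfg.TRAIN.DATASETS = ('cityscapes_fine_instanceonly_seg_train',)
        cfg.MODEL.NUM_CLASSES = 9   # 8 instance classes + background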

About the box head

Thanks for sharing your code. I want to ask: before fusing the features, why do you use 4 separate fc layers? Would it be OK for the 4 features to share a single fc layer?

IndexError: list index out of range

After a few iterations, i.e., after the first epoch, I get a list-index-out-of-range error on this line:

mini_kwargs = dict([(k, v[i]) for k, v in kwargs.items()])

Poor test results

(py36pandet) [root....... PANet-master]# python tools/infer_simple.py --dataset coco --cfg configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml --load_ckpt data/pretrained_model/panet_mask_step179999.pth --images 001.png --output_dir infer_outputs
Called with args:
Namespace(cfg_file='configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml', cuda=True, dataset='coco', image_dir=None, images=['001.png'], load_ckpt='data/pretrained_model/panet_mask_step179999.pth', load_detectron=None, merge_pdfs=True, output_dir='infer_outputs', set_cfgs=[])
load cfg from file: configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml
loading checkpoint data/pretrained_model/panet_mask_step179999.pth
.....(it runs to this point, printing the class name and confidence)
The generated test results, however, are poor and barely usable; I might need some guidance.
001.pdf

Inference on CPU

Is it possible to run inference on the CPU? The forward function of roi_Xconv1fc_gn_head_panet in fast_rcnn_heads.py relies on the GPU version of RoIAlign. How can this be solved?
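One hedged option: newer torchvision (0.3+, which itself needs a newer PyTorch than the 0.4.0 pinned here) ships an RoIAlign with a CPU kernel that could be swapped in for the CUDA op. A self-contained sketch of the call, not something this repository does:

import torch
from torchvision.ops import roi_align   # torchvision >= 0.3

feats = torch.randn(1, 256, 64, 64)                   # CPU feature map
rois = torch.tensor([[0, 10.0, 10.0, 50.0, 50.0]])    # (batch_idx, x1, y1, x2, y2)
pooled = roi_align(feats, rois, output_size=(7, 7),
                   spatial_scale=1.0, sampling_ratio=2)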

Check_Point panet_mask_step179999.pth has one issue

@ShuLiu1993 I have downloaded this checkpoint and am now using the code for testing, for detection only. I used the COCO person_keypoints 2017 annotations and set the ImageNet-pretrained flag to false. Now I am getting this error:
RuntimeError: while copying the parameter named Box_Outs.cls_score.weight, whose dimensions in the model are torch.Size([2, 1024]) and whose dimensions in the checkpoint are torch.Size([81, 1024]).
Why is the dimension an issue in testing? Why does the checkpoint have 81 classes, while both you and I use only 2 classes when training for COCO person keypoints?
(screenshot attached)

Multiple batch images inference

Hi Shu, thanks for sharing the code!

I want to ask how to implement multi-GPU/batch inference with your code.
Could you give me some advice on how to implement it easily?

Question about the demo?

Hi, how can I run a demo on a folder of images with no annotations? I find that the function

run_inference()

in test_engine.py needs a json file for the dataset, but in my situation I only want to show the segmentation results of a few local images.
Hope you can help! Thank you!

AttributeError: module 'modeling.FPN' has no attribute 'fpn_ResNet50_conv5_body_bup'

load cfg from file: configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml
Failed to find function: FPN.fpn_ResNet50_conv5_body_bup
Traceback (most recent call last):
  File "tools/infer_simple.py", line 176, in <module>
    main()
  File "tools/infer_simple.py", line 115, in main
    maskRCNN = Generalized_RCNN()
  File "/home/PANet-master/PANet-master/lib/modeling/model_builder.py", line 80, in __init__
    self.Conv_Body = get_func(cfg.MODEL.CONV_BODY)()
  File "/home/PANet-master/PANet-master/lib/modeling/model_builder.py", line 40, in get_func
    return getattr(module, parts[-1])
AttributeError: module 'modeling.FPN' has no attribute 'fpn_ResNet50_conv5_body_bup'

My environment is CUDA 9.0 and torch 0.4.1. I have run sh make.sh. When I run infer_simple.py there is this error;
I might need some guidance.
My command is: python tools/infer_simple.py --cfg configs/panet/e2e_panet_R-50-FPN_2x_mask.yaml --load_ckpt data/pretrained_model/panet_mask_step179999.pth --image_dir demo/006.png --output_dir infer_outputs

Freeze a detector's pretrained weights in training with mask

Hi,

I originally did not have ground-truth masks for my dataset, so I have a PANet detector trained using only the e2e_panet_R-50_FPN_1x_det.yaml config file. Now that I have masks, I'd like to reuse the detector's weights and train just the mask branch using the e2e_panet_R_50_FPN_2x_mask.yaml config.

Is that possible? How do I do that? Thanks.
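There is no documented flag for this, but in plain PyTorch the usual pattern is to load the detector checkpoint and freeze everything except the mask branch before building the optimizer. A hedged sketch (the Mask_Head/Mask_Outs attribute names follow Detectron.pytorch's Generalized_RCNN and are assumptions here):

# freeze all parameters, then re-enable the mask branch only
for p in maskRCNN.parameters():
    p.requires_grad = False
for name in ('Mask_Head', 'Mask_Outs'):              # assumed module attribute names
    module = getattr(maskRCNN, name, None)
    if module is not None:
        for p in module.parameters():
            p.requires_grad = True
params = [p for p in maskRCNN.parameters() if p.requires_grad]
# build the optimizer over `params` only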

A question about the network structure differing from the paper

Hello,
I noticed that in the mask branch the code is:

x = self.conv_fcn(x)
batch_size = x.size(0)
x_fcn = F.relu(self.upconv(self.mask_conv4(x)), inplace=True)
x_ff = self.mask_fc(self.mask_conv5_fc(self.mask_conv4_fc(x)).view(batch_size, -1))

where:

        module_list = []
        for i in range(2):
            module_list.extend([
                nn.Conv2d(dim_in, dim_inner, 3, 1, padding=1*dilation, dilation=dilation, bias=False),
                nn.GroupNorm(net_utils.get_group_gn(dim_inner), dim_inner, eps=cfg.GROUP_NORM.EPSILON),
                nn.ReLU(inplace=True)
            ])
            dim_in = dim_inner
        self.conv_fcn = nn.Sequential(*module_list)

In other words, when you do the fusion, unlike in the paper, you drop the original conv3, right?
Is that because it works better, or for some other reason?
Looking forward to your reply. Thank you, and best wishes.

garbage characters about segmentation results

[{'image_id': 0, 'category_id': 1, 'segmentation': {'size': [2688, 2208], 'counts': 'clZU1Y1eb29H7H7J3M3M3N2M2N3M2O2M3L5L6J6I9D;F9G9H8G5N3M2N2N2N1O2O1N1O2N2N2M3N2M4L3L5M3L4M2N ...'}}]
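Those characters are not garbage: 'counts' holds COCO's compressed RLE encoding of the binary mask. A minimal decode sketch with pycocotools (result stands for one entry of the list above):

from pycocotools import mask as mask_util

rle = result['segmentation']                  # {'size': [h, w], 'counts': '...'}
if isinstance(rle['counts'], str):            # pycocotools expects bytes under Python 3
    rle = dict(rle, counts=rle['counts'].encode('ascii'))
binary_mask = mask_util.decode(rle)           # (h, w) uint8 array of 0s and 1s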
