
deformable-convnets's Introduction

Deformable Convolutional Networks


[04/15/2019] PyTorch versions of the deformable convolution operators are available in the mmdetection codebase. They are very efficient!

[12/01/2018] We updated the deformable convolution operator to match the one used in the Deformable ConvNets v2 paper. This fixes a possible issue that occurs when a sampling location falls outside the image boundary and that may cause degraded performance on ImageNet classification. Note that the current deformable conv layers in both the official MXNet and the PyTorch codebases still have the issue, so if you want to reproduce the results in Deformable ConvNets v2, please use the updated layer provided here. Efficiency at large image batch sizes is also improved. See more details in DCNv2_op/

  • The full codebase of Deformable ConvNets v2 will be made available later, but it should be easy to reproduce the results with the updated operator.

[10/2017] We released the training/testing code and pre-trained models of Deformable FPN, which is the foundation of our COCO detection 2017 entry. Slides at COCO 2017 workshop.

A third-party improvement of Deformable R-FCN + Soft NMS


Deformable ConvNets was initially described in an ICCV 2017 oral paper. (Slides at ICCV 2017 Oral)

R-FCN was initially described in a NIPS 2016 paper.


This is an official implementation of Deformable Convolutional Networks (Deformable ConvNets) based on MXNet. It is worth noting that:

  • The original implementation is based on our internal Caffe version on Windows. There are slight differences in the final accuracy and running time due to the many details involved in switching platforms.
  • The code is tested on official MXNet@(commit 62ecb60) with the extra operators for Deformable ConvNets.
  • We trained our model based on the ImageNet pre-trained ResNet-v1-101 using a model converter. The converted model produces slightly lower accuracy (Top-1 Error on ImageNet val: 24.0% vs. 23.6%).
  • This repository uses code from the MXNet rcnn example and mx-rfcn.


© Microsoft, 2017. Licensed under an MIT license.

Citing Deformable ConvNets

If you find Deformable ConvNets useful in your research, please consider citing:

    Author = {Jifeng Dai, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, Yichen Wei},
    Title = {Deformable Convolutional Networks},
    Journal = {arXiv preprint arXiv:1703.06211},
    Year = {2017}

    Author = {Jifeng Dai, Yi Li, Kaiming He, Jian Sun},
    Title = {{R-FCN}: Object Detection via Region-based Fully Convolutional Networks},
    Conference = {NIPS},
    Year = {2016}

Main Results

| method | training data | testing data | mAP@0.5 | mAP@0.7 | time |
|---|---|---|---|---|---|
| R-FCN, ResNet-v1-101 | VOC 07+12 trainval | VOC 07 test | 79.6 | 63.1 | 0.16s |
| Deformable R-FCN, ResNet-v1-101 | VOC 07+12 trainval | VOC 07 test | 82.3 | 67.8 | 0.19s |

| method | training data | testing data | mAP | mAP@0.5 | mAP@0.75 | mAP@S | mAP@M | mAP@L |
|---|---|---|---|---|---|---|---|---|
| R-FCN, ResNet-v1-101 | coco trainval | coco test-dev | 32.1 | 54.3 | 33.8 | 12.8 | 34.9 | 46.1 |
| Deformable R-FCN, ResNet-v1-101 | coco trainval | coco test-dev | 35.7 | 56.8 | 38.3 | 15.2 | 38.8 | 51.5 |
| Faster R-CNN (2fc), ResNet-v1-101 | coco trainval | coco test-dev | 30.3 | 52.1 | 31.4 | 9.9 | 32.2 | 47.4 |
| Deformable Faster R-CNN (2fc) | coco trainval | coco test-dev | 35.0 | 55.0 | 38.3 | 14.3 | 37.7 | 52.0 |

| method | training data | testing data | mAP | mAP@0.5 | mAP@0.75 | mAP@S | mAP@M | mAP@L |
|---|---|---|---|---|---|---|---|---|
| FPN + OHEM, ResNet-v1-101 | coco trainval35k | coco minival | 37.8 | 60.8 | 41.0 | 22.0 | 41.5 | 49.8 |
| Deformable FPN + OHEM, ResNet-v1-101 | coco trainval35k | coco minival | 41.2 | 63.5 | 45.5 | 24.3 | 44.9 | 54.4 |
| FPN + OHEM + Soft NMS + multi-scale testing | coco trainval35k | coco minival | 40.9 | 62.5 | 46.0 | 27.1 | 44.1 | 52.2 |
| Deformable FPN + OHEM + Soft NMS + multi-scale testing, ResNet-v1-101 | coco trainval35k | coco minival | 44.4 | 65.5 | 50.2 | 30.8 | 47.3 | 56.4 |

| method | training data | testing data | mIoU | time |
|---|---|---|---|---|
| DeepLab, ResNet-v1-101 | Cityscapes train | Cityscapes val | 70.3 | 0.51s |
| Deformable DeepLab, ResNet-v1-101 | Cityscapes train | Cityscapes val | 75.2 | 0.52s |
| DeepLab, ResNet-v1-101 | VOC 12 train (augmented) | VOC 12 val | 70.7 | 0.08s |
| Deformable DeepLab, ResNet-v1-101 | VOC 12 train (augmented) | VOC 12 val | 75.9 | 0.08s |

Running time is counted on a single Maxwell Titan X GPU (mini-batch size is 1 in inference).
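As a quick arithmetic check on the results above, the following sketch computes the absolute gain of each deformable variant over its baseline. The numbers are copied from the mAP and mIoU columns; the dictionary layout and the name `absolute_gain` are illustrative choices and not part of the repository.

```python
# (baseline, deformable) score pairs copied from the result tables above.
RESULTS = {
    "R-FCN, COCO test-dev mAP": (32.1, 35.7),
    "Faster R-CNN (2fc), COCO test-dev mAP": (30.3, 35.0),
    "FPN + OHEM, COCO minival mAP": (37.8, 41.2),
    "DeepLab, Cityscapes val mIoU": (70.3, 75.2),
    "DeepLab, VOC 12 val mIoU": (70.7, 75.9),
}


def absolute_gain(baseline, deformable):
    """Return the improvement in points, rounded to one decimal place."""
    return round(deformable - baseline, 1)


# Gain in points for every (baseline, deformable) pair.
GAINS = {name: absolute_gain(b, d) for name, (b, d) in RESULTS.items()}

for name in sorted(GAINS):
    print("%-40s +%.1f" % (name, GAINS[name]))
```

Rounding to one decimal avoids floating-point noise such as 3.6000000000000014 in the printed differences.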

Requirements: Software

  1. MXNet from the official repository. We tested our code on MXNet@(commit 62ecb60). Because MXNet is developing rapidly, we recommend checking out this version if you encounter any issues. We may update this repository periodically if MXNet adds important features in future releases.

  2. Python 2.7. We recommend using Anaconda2, as it already includes many common packages. Python 3 is not supported yet; if you want to use Python 3, you will need to modify the code to make it work.

  3. Python packages that might be missing: cython, opencv-python >= 3.2.0, easydict. If pip is set up on your system, these packages can be fetched and installed by running

    pip install -r requirements.txt
  4. For Windows users, Visual Studio 2015 is needed to compile the Cython module.
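The minimum version in item 3 (opencv-python >= 3.2.0) can be checked by comparing version strings as integer tuples. This is a minimal sketch: `version_tuple` and `meets_minimum` are hypothetical helpers, and in practice the installed version string would come from something like `cv2.__version__`.

```python
import re


def version_tuple(version):
    """Turn '3.2.0' (or '3.4.1-dev') into (3, 2, 0) using the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that '3.2' compares equal to '3.2.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum="3.2.0"):
    """Return True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)
```

Comparing tuples rather than raw strings matters because, as strings, "3.10.1" would sort before "3.2.0".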

Requirements: Hardware

Any NVIDIA GPU with at least 4GB of memory should be OK.


  1. Clone the Deformable ConvNets repository. We'll refer to the directory into which you cloned Deformable-ConvNets as ${DCN_ROOT}.
git clone
  2. For Windows users, run cmd .\init.bat. For Linux users, run sh ./ The scripts will build the Cython module automatically and create some folders.

  3. Install MXNet:

    Note: MXNet's Custom Op cannot run in parallel on multiple GPUs after this PR. We strongly suggest rolling back to MXNet@(commit 998378a) for training (following Sections 3.2 - 3.5).

    Quick start

    3.1 Install MXNet and all dependencies by

    pip install -r requirements.txt

    If no error message appears, MXNet should be installed successfully.

    Build from source (alternative way)

    3.2 Clone MXNet and check out MXNet@(commit 998378a) by

    git clone --recursive
    git checkout 998378a
    git submodule update
    # if it's the first time to checkout, just use: git submodule update --init --recursive

    3.3 Compile MXNet

    cd ${MXNET_ROOT}
    make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1

    3.4 Install the MXNet Python binding by

    Note: If you will actively switch between different versions of MXNet, please follow 3.5 instead of 3.4

    cd python
    sudo python install

    3.5 For advanced users, you may put your Python package into ./external/mxnet/$(YOUR_MXNET_PACKAGE), and set MXNET_VERSION in ./experiments/rfcn/cfgs/*.yaml to $(YOUR_MXNET_PACKAGE). This lets you switch among different versions of MXNet quickly.

  4. For DeepLab, we use the augmented VOC 2012 dataset. The augmented annotations are provided by the SBD dataset. For convenience, we provide the converted PNG annotations and the lists of train/val images; please download them from OneDrive.
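The package-switching scheme in step 3.5 can be pictured as putting the chosen package directory at the front of Python's module search path before `mxnet` is imported. The sketch below works under that assumption; `mxnet_package_dir` and `prepend_mxnet_path` are hypothetical names, and the repository's own scripts may do this differently.

```python
import os
import sys


def mxnet_package_dir(repo_root, mxnet_version):
    """Build the path ./external/mxnet/<MXNET_VERSION> under the repository root."""
    return os.path.join(repo_root, "external", "mxnet", mxnet_version)


def prepend_mxnet_path(repo_root, mxnet_version, search_path=None):
    """Put the selected MXNet package directory first on the module search path.

    search_path defaults to sys.path; passing a plain list makes the helper
    easy to try out without touching the running interpreter.
    """
    if search_path is None:
        search_path = sys.path
    package_dir = mxnet_package_dir(repo_root, mxnet_version)
    if package_dir not in search_path:
        search_path.insert(0, package_dir)
    return package_dir
```

For the switch to take effect, a call such as `prepend_mxnet_path(".", "my_mxnet_build")` (with a made-up package name) would have to run before the first `import mxnet`.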

Demo & Deformable Model

We provide trained deformable convnet models, including the deformable R-FCN & Faster R-CNN models trained on COCO trainval, and the deformable DeepLab model trained on Cityscapes train.

  1. To use the demo with our pre-trained deformable models, please download them manually from OneDrive or BaiduYun, and put them under the folder model/.

    Make sure it looks like this:

  2. To run the R-FCN demo, run

    python ./rfcn/

    By default it runs Deformable R-FCN and produces several prediction results. To run R-FCN instead, use

    python ./rfcn/ --rfcn_only
  3. To run the DeepLab demo, run

    python ./deeplab/

    By default it runs Deformable DeepLab and produces several prediction results. To run DeepLab instead, use

    python ./deeplab/ --deeplab_only
  4. To visualize the offset of deformable convolution and deformable psroipooling, run

    python ./rfcn/
    python ./rfcn/

Preparation for Training & Testing

For R-FCN/Faster R-CNN:

  1. Please download COCO and VOC 2007+2012 datasets, and make sure it looks like this:

  2. Please download ImageNet-pretrained ResNet-v1-101 model manually from OneDrive, and put it under folder ./model. Make sure it looks like this:


For DeepLab:

  1. Please download Cityscapes and VOC 2012 datasets and make sure it looks like this:

  2. Please download the augmented VOC 2012 annotations/image lists, and put the augmented annotations and the augmented train/val lists into:


    , respectively.

  3. Please download ImageNet-pretrained ResNet-v1-101 model manually from OneDrive, and put it under folder ./model. Make sure it looks like this:



  1. All of our experiment settings (GPU #, dataset, etc.) are kept in yaml config files in the folders ./experiments/rfcn/cfgs, ./experiments/faster_rcnn/cfgs and ./experiments/deeplab/cfgs/.

  2. Eight config files have been provided so far, namely, R-FCN for COCO/VOC, Deformable R-FCN for COCO/VOC, Faster R-CNN(2fc) for COCO/VOC, Deformable Faster R-CNN(2fc) for COCO/VOC, DeepLab for Cityscapes/VOC and Deformable DeepLab for Cityscapes/VOC, respectively. For R-FCN, we use 8 GPUs to train models on COCO and 4 GPUs on VOC. For DeepLab, we use 4 GPUs for all experiments.

  3. To perform experiments, run the Python scripts with the corresponding config file as input. For example, to train and test deformable convnets on COCO with ResNet-v1-101, use the following command:

    python experiments\rfcn\ --cfg experiments\rfcn\cfgs\resnet_v1_101_coco_trainval_rfcn_dcn_end2end_ohem.yaml

    A cache folder will be created automatically to save the model and the log under output/rfcn_dcn_coco/.

  4. Please find more details in config files and in our code.
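The GPU setting mentioned in items 1 and 2 appears in config dumps quoted further down this page as `'gpus': '0'`. Assuming that multi-GPU runs use a comma-separated string such as `'0,1,2,3'` (an assumption, since only the single-GPU value is shown here), a small sketch of turning that string into device ids could look like this; `parse_gpu_ids` is a hypothetical helper.

```python
def parse_gpu_ids(gpus):
    """Parse a string like '0,1,2,3' into a list of integer device ids."""
    ids = [int(part) for part in gpus.split(",") if part.strip()]
    if not ids:
        raise ValueError("no GPU id given in %r" % gpus)
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate GPU id in %r" % gpus)
    return ids


print(parse_gpu_ids("0,1,2,3"))
```

Rejecting empty or duplicated ids early gives a clearer error than a failure later during device setup.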


Code has been tested under:

  • Ubuntu 14.04 with a Maxwell Titan X GPU and Intel Xeon CPU E5-2620 v2 @ 2.10GHz
  • Windows Server 2012 R2 with 8 K40 GPUs and Intel Xeon CPU E5-2650 v2 @ 2.60GHz
  • Windows Server 2012 R2 with 4 Pascal Titan X GPUs and Intel Xeon CPU E5-2650 v4 @ 2.30GHz


Q: It says AttributeError: 'module' object has no attribute 'DeformableConvolution'.

A: This is because either

  • you forgot to copy the operators to your MXNet folder

  • or you copied them to the wrong path

  • or you forgot to re-compile

  • or you installed the wrong MXNet

    Please print mxnet.__path__ to make sure you are using the correct MXNet.
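The check suggested in this answer can be wrapped in a small helper that reports where a package was loaded from and whether a given attribute exists on it. This is only a sketch: `module_locations` and `has_attribute` are hypothetical names, and the module to inspect should be whichever one your traceback names.

```python
import importlib


def module_locations(name):
    """Import `name` and return the directories (or the file) it was loaded from."""
    module = importlib.import_module(name)
    if hasattr(module, "__path__"):
        return list(module.__path__)
    return [getattr(module, "__file__", None)]


def has_attribute(name, attribute):
    """Return True if the imported module exposes `attribute`."""
    module = importlib.import_module(name)
    return hasattr(module, attribute)


# Demonstrated on a standard-library package so the snippet runs anywhere;
# substitute "mxnet" to inspect your own installation.
print(module_locations("json"))
print(has_attribute("json", "dumps"))
```

If the printed location is not the MXNet build you compiled with the extra operators, Python is picking up a different installation.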

Q: I encounter a segmentation fault at the beginning.

A: A compatibility issue has been identified between MXNet and opencv-python 3.0+. We suggest that you always import cv2 before importing mxnet in the entry script.
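To check an entry script against this advice without running it, one option is to scan its source for import statements and compare where cv2 and mxnet first appear. The sketch below uses Python's `ast` module; `first_import_positions` and `cv2_before_mxnet` are hypothetical names. It only sees imports written in that one file, compares their order in the text rather than at run time, and parses with the grammar of the interpreter running the check.

```python
import ast


def first_import_positions(source):
    """Map each top-level package name to the position of its first import."""
    first = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for index, name in enumerate(names):
            top = name.split(".")[0]
            # (line, column, order within the statement) gives a total order.
            position = (node.lineno, node.col_offset, index)
            if top not in first or position < first[top]:
                first[top] = position
    return first


def cv2_before_mxnet(source):
    """Return True if the script never imports mxnet, or imports cv2 before it."""
    positions = first_import_positions(source)
    if "mxnet" not in positions:
        return True
    return "cv2" in positions and positions["cv2"] < positions["mxnet"]
```

A call such as `cv2_before_mxnet(open(path).read())` on the entry script then gives a quick yes/no answer.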

Q: I find the training speed becomes slower when training for a long time.

A: MXNet on Windows is known to have this problem, so we recommend running this program on Linux. If you encounter the problem, you can also stop the training process and resume it to regain the training speed.

Q: Can you share your caffe implementation?

A: For several reasons (the code is based on an old, internal Caffe; porting to public Caffe needs extra work; time limits; etc.), we do not plan to release our Caffe code. Since the current MXNet convolution implementation is very similar to Caffe's (almost the same), it is easy to port to Caffe yourself, and the core CUDA code can be kept unchanged. Anyone who wishes to do so is welcome to make a pull request.

deformable-convnets's People


ancientmooner, bl0, daijifeng001, gd-zhang, haozhiqi, liyi14, stupidzz, taokong, terrychenism, yuwenxiong



deformable-convnets's Issues

The question of import

When I run the R-FCN demo, it reports an error like this. Can someone give me some help?

install python exits problem

When I get to this step:
cd python
sudo python install
a TypeError is raised:
File "", line 83, in <module>
**kwargs)
File "/usr/lib64/python2.7/distutils/", line 152, in setup
dist.run_commands()
File "/usr/lib64/python2.7/distutils/", line 953, in run_commands
self.run_command(cmd)
File "/usr/lib64/python2.7/distutils/", line 972, in run_command
File "/usr/lib/python2.7/site-packages/setuptools/command/", line 73, in run
self.do_egg_install()
File "/usr/lib/python2.7/site-packages/setuptools/command/", line 101, in do_egg_install
File "/usr/lib/python2.7/site-packages/setuptools/command/", line 380, in run
self.easy_install(spec, not self.no_deps)
File "/usr/lib/python2.7/site-packages/setuptools/command/", line 604, in easy_install
return self.install_item(None, spec, tmpdir, deps, True)
File "/usr/lib/python2.7/site-packages/setuptools/command/", line 655, in install_item
self.process_distribution(spec, dist, deps)
File "/usr/lib/python2.7/site-packages/setuptools/command/", line 701, in process_distribution
distreq.project_name, distreq.specs, requirement.extras
TypeError: __init__() takes exactly 2 arguments (4 given)
Please tell me how to solve it. Thanks!

questions about "kernel_dim_ = conv_in_channels_ / group_ * param_.kernel.Size();" in rfcn/operator_cxx/deformable_convolution-inl.h

in caffe,
//kernel_dim_ = C * H * W
kernel_dim_ = this->blobs_[0]->count(1);
weight_offset_ = conv_out_channels_ * kernel_dim_ / group_;
but here,
kernel_dim_ = conv_in_channels_ / group_ * param_.kernel.Size();
weight_offset_ = conv_out_channels_ * kernel_dim_ / group_;

Why not kernel_dim_ = conv_in_channels_ * param_.kernel.Size()? Does it need to be divided by group_?
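One way to read the two snippets is that they compute the same quantity. In a grouped convolution each filter only sees conv_in_channels_ / group_ input channels, and, assuming Caffe's weight blob is shaped (out_channels, in_channels / group, kernel_h, kernel_w), count(1) already contains that division. The sketch below checks the arithmetic; the function names and the example sizes are illustrative only.

```python
def kernel_dim(in_channels, group, kernel_h, kernel_w):
    """Number of weights one filter holds: it only sees in_channels / group inputs."""
    assert in_channels % group == 0, "in_channels must be divisible by group"
    return (in_channels // group) * kernel_h * kernel_w


def caffe_style_kernel_dim(out_channels, in_channels, group, kernel_h, kernel_w):
    """Product of all weight dimensions except the first, like blobs_[0]->count(1)."""
    weight_shape = (out_channels, in_channels // group, kernel_h, kernel_w)
    count = 1
    for dim in weight_shape[1:]:
        count *= dim
    return count


def weight_offset(out_channels, in_channels, group, kernel_h, kernel_w):
    """Number of weights belonging to one group."""
    return out_channels * kernel_dim(in_channels, group, kernel_h, kernel_w) // group


# Example sizes chosen for illustration: 64 input channels, 128 filters, 4 groups, 3x3 kernel.
print(kernel_dim(64, 4, 3, 3))
print(caffe_style_kernel_dim(128, 64, 4, 3, 3))
print(weight_offset(128, 64, 4, 3, 3))
```

With these sizes both formulas give 144, and group_ * weight_offset_ equals the total number of weights (128 * 16 * 9), which is what stepping through the weight buffer one group at a time requires.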

Any built-in data augmentations?

Thanks for your great work!

My training data set is pretty small, and I wonder if there are any built-in data augmentations in your code. If so, how do I configure them?


Does the Faster R-CNN implementation support class-agnostic and OHEM?

In faster_rcnn/cfgs/resnet_v1_101_v712_rcnn_end2end.yaml, I see that the two options are set to false, but I think the implementation does support class-agnostic and OHEM. So I set both options to true and ran the training process, but the detection results are very poor; that is, only a few objects are detected.

The Result for Faster RCNN

Thanks for sharing your wonderful work. I noticed that you only submitted the train & test scripts for R-FCN, but in your paper you also conducted other experiments using Faster R-CNN. Would you mind sharing the results for the Faster R-CNN detector in this MXNet framework?

Training Error

When I train the DCN model on the PASCAL VOC 2012 dataset, I encounter the following problem. Can anyone please help explain where the error comes from and how to eliminate it?
libpng error: Read Error
Exception in thread Thread-7:
Traceback (most recent call last):
File "/usr/lib/python2.7/", line 810, in __bootstrap_inner
File "/usr/lib/python2.7/", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "experiments/deeplab/../../deeplab/../lib/utils/", line 60, in prefetch_func
self.next_batch[i] = self.iters[i].next()
File "experiments/deeplab/../../deeplab/core/", line 185, in next
File "experiments/deeplab/../../deeplab/core/", line 234, in get_batch_parallel
rst = [multiprocess_result.get() for multiprocess_result in multiprocess_results]
File "/usr/lib/python2.7/multiprocessing/", line 558, in get
raise self._value
ValueError: zero-size array to reduction operation minimum which has no identity
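The `libpng error: Read Error` line suggests that at least one PNG file could not be read completely, although this is only a guess at the cause of the failure above. A heuristic way to look for damaged files is to check each one for the 8-byte PNG signature at the start and the IEND chunk at the end. `is_complete_png` and `find_suspect_pngs` are hypothetical helpers, not part of the repository.

```python
import os

# Fixed byte sequences defined by the PNG format.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
PNG_IEND = b"\x00\x00\x00\x00IEND\xae\x42\x60\x82"


def is_complete_png(data):
    """Heuristic: a healthy PNG starts with the signature and ends with the IEND chunk."""
    return data.startswith(PNG_SIGNATURE) and data.endswith(PNG_IEND)


def find_suspect_pngs(folder):
    """Walk `folder` and return the paths of .png files that fail the heuristic."""
    suspects = []
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            if not name.lower().endswith(".png"):
                continue
            path = os.path.join(root, name)
            with open(path, "rb") as handle:
                if not is_complete_png(handle.read()):
                    suspects.append(path)
    return suspects
```

Running `find_suspect_pngs` on the annotation folder would list files worth re-downloading or re-converting; a file that passes can still be invalid in other ways.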

Question about the implementation of function deformable_im2col() in deformable_im2col.h

How should I understand LOG(FATAL) << "not implemented" in the following code?

template <typename DType>
inline void deformable_im2col(mshadow::Stream<cpu>* s,
  const DType* data_im, const DType* data_offset, 
  const TShape& im_shape, const TShape& col_shape, const TShape& kernel_shape,
  const TShape& pad, const TShape& stride, const TShape& dilation, 
  const uint32_t deformable_group, DType* data_col) {
  if (2 == kernel_shape.ndim()) {
    LOG(FATAL) << "not implemented";
  } else {
    LOG(FATAL) << "not implemented";
  }
}

The question of offset

Hi, in the deformable convolution, the offsets come from the output of a separately configured convolution layer. I am curious why you compute them that way rather than adding them as learnable parameters of the deformable convolution itself, similar to its weight parameter?
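One way to see the difference is that offsets produced by a convolution over the input can vary with position and with the image, whereas a fixed learned parameter would be the same everywhere. The sketch below is a 1-D, pure-Python illustration and not the repository's CUDA operator: `linear_sample` and `deformable_conv1d` are hypothetical names, the offsets are passed in directly (in the real layer they would come from the extra convolution), and samples outside the signal are assumed to be zero.

```python
import math


def linear_sample(signal, pos):
    """Linearly interpolate `signal` at fractional index `pos`; zero outside the signal."""
    lo = int(math.floor(pos))
    frac = pos - lo

    def at(index):
        return signal[index] if 0 <= index < len(signal) else 0.0

    return (1.0 - frac) * at(lo) + frac * at(lo + 1)


def deformable_conv1d(signal, weights, offsets):
    """1-D convolution whose taps are shifted by per-position offsets.

    offsets[p][k] is the fractional shift applied to tap k at output position p.
    """
    radius = len(weights) // 2
    output = []
    for p in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(weights):
            # Regular grid location (p + k - radius) plus the offset for this tap.
            total += weight * linear_sample(signal, p + (k - radius) + offsets[p][k])
        output.append(total)
    return output


signal = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 1.0, 1.0]
zero_offsets = [[0.0] * len(weights) for _ in signal]
print(deformable_conv1d(signal, weights, zero_offsets))
```

With all offsets set to zero the function reduces to an ordinary zero-padded convolution, which is a convenient sanity check; giving each position its own offsets is what lets the sampling grid adapt to the content.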

running time

With an image size of 800x1200, I get around 7.5 samples per second when training on 8 P6000 GPUs. For R-FCN, I used to get 16 samples per second using a Caffe implementation for the same image size. Is this speed in line with your observations, or is something wrong with my runtime environment?

Got error when running demo. (Operator _zeros cannot be run)

Thanks for the great work.

I followed the installation steps but got an error when running 'python ./rfcn/'


Ubuntu16.04, GCC 5.4
MXNet Installation validated successfully.


[15:00:14] src/c_api/ Operator _zeros cannot be run; requires at least one of FCompute<xpu>, NDArrayFunction, FCreateOperator be registered


(mxnet) xx@xx-xx:~/PycharmProjects/Deformable-ConvNets-master$ python ./rfcn/ 
 'MXNET_VERSION': 'mxnet',
 'SCALES': [(600, 1000)],
          'CXX_PROPOSAL': False,
          'HAS_RPN': True,
          'NMS': 0.3,
          'PROPOSAL_MIN_SIZE': 0,
          'PROPOSAL_NMS_THRESH': 0.7,
          'PROPOSAL_POST_NMS_TOP_N': 2000,
          'PROPOSAL_PRE_NMS_TOP_N': 20000,
          'RPN_MIN_SIZE': 0,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 300,
          'RPN_PRE_NMS_TOP_N': 6000,
          'max_per_image': 100,
          'test_epoch': 8},
                         'RPN_BATCH_IMAGES': 0,
                         'rfcn1_epoch': 0,
                         'rfcn1_lr': 0,
                         'rfcn1_lr_step': '',
                         'rfcn2_epoch': 0,
                         'rfcn2_lr': 0,
                         'rfcn2_lr_step': '',
                         'rpn1_epoch': 0,
                         'rpn1_lr': 0,
                         'rpn1_lr_step': '',
                         'rpn2_epoch': 0,
                         'rpn2_lr': 0,
                         'rpn2_lr_step': '',
                         'rpn3_epoch': 0,
                         'rpn3_lr': 0,
                         'rpn3_lr_step': ''},
           'ASPECT_GROUPING': True,
           'BATCH_IMAGES': 1,
           'BATCH_ROIS': -1,
           'BATCH_ROIS_OHEM': 128,
           'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],
           'BBOX_REGRESSION_THRESH': 0.5,
           'BBOX_STDS': [0.1, 0.1, 0.2, 0.2],
           'BBOX_WEIGHTS': array([ 1.,  1.,  1.,  1.]),
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0.0,
           'CXX_PROPOSAL': False,
           'ENABLE_OHEM': True,
           'END2END': True,
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'FLIP': True,
           'RESUME': True,
           'RPN_BATCH_SIZE': 256,
           'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'RPN_CLOBBER_POSITIVES': False,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 0,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POSITIVE_WEIGHT': -1.0,
           'RPN_POST_NMS_TOP_N': 300,
           'RPN_PRE_NMS_TOP_N': 6000,
           'SHUFFLE': True,
           'begin_epoch': 5,
           'end_epoch': 8,
           'lr': 0.0005,
           'lr_factor': 0.1,
           'lr_step': '5.333',
           'model_prefix': 'e2e',
           'momentum': 0.9,
           'warmup': False,
           'warmup_lr': 5e-05,
           'warmup_step': 1000,
           'wd': 0.0005},
 'dataset': {'NUM_CLASSES': 81,
             'dataset': 'coco',
             'dataset_path': './data/coco',
             'image_set': 'train2014+val2014',
             'proposal': 'rpn',
             'root_path': './data',
             'test_image_set': 'test-dev2015'},
 'default': {'frequent': 20, 'kvstore': 'device'},
 'gpus': '0',
 'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
             'ANCHOR_SCALES': [4, 8, 16, 32],
             'FIXED_PARAMS': ['conv1',
             'FIXED_PARAMS_SHARED': ['conv1',
             'IMAGE_STRIDE': 0,
             'NUM_ANCHORS': 12,
             'PIXEL_MEANS': array([ 103.06,  115.9 ,  123.15]),
             'RCNN_FEAT_STRIDE': 16,
             'RPN_FEAT_STRIDE': 16,
             'pretrained': './model/pretrained_model/resnet_v1_101',
             'pretrained_epoch': 0},
 'output_path': './output/rfcn',
 'symbol': 'resnet_v1_101_rfcn'}
[15:00:14] /home/zehao/mxnet/dmlc-core/include/dmlc/./logging.h:304: [15:00:14] src/c_api/ Operator _zeros cannot be run; requires at least one of FCompute<xpu>, NDArrayFunction, FCreateOperator be registered

Stack trace returned 10 entries:
[bt] (0) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fd8c20081bc]
[bt] (1) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fd8c2e09c39]
[bt] (2) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/lib-dynload/ [0x7fd8b930c57c]
[bt] (3) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/lib-dynload/ [0x7fd8b930bcd5]
[bt] (4) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/lib-dynload/ [0x7fd8b9303376]
[bt] (5) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/lib-dynload/ [0x7fd8b92fadb3]
[bt] (6) /home/zehao/anaconda2/envs/mxnet/bin/../lib/ [0x7fd8ca996e93]
[bt] (7) /home/zehao/anaconda2/envs/mxnet/bin/../lib/ [0x7fd8caa4980d]
[bt] (8) /home/zehao/anaconda2/envs/mxnet/bin/../lib/ [0x7fd8caa4bc3e]
[bt] (9) /home/zehao/anaconda2/envs/mxnet/bin/../lib/ [0x7fd8caa4b1f7]

Traceback (most recent call last):
  File "./rfcn/", line 129, in <module>
  File "./rfcn/", line 89, in main
    arg_params=arg_params, aux_params=aux_params)
  File "/home/zehao/PycharmProjects/Deformable-ConvNets-master/rfcn/core/", line 29, in __init__
    self._mod.bind(provide_data, provide_label, for_training=False)
  File "/home/zehao/PycharmProjects/Deformable-ConvNets-master/rfcn/core/", line 839, in bind
    for_training, inputs_need_grad, force_rebind=False, shared_module=None)
  File "/home/zehao/PycharmProjects/Deformable-ConvNets-master/rfcn/core/", line 396, in bind
  File "/home/zehao/PycharmProjects/Deformable-ConvNets-master/rfcn/core/", line 186, in __init__
    self.bind_exec(data_shapes, label_shapes, shared_group)
  File "/home/zehao/PycharmProjects/Deformable-ConvNets-master/rfcn/core/", line 272, in bind_exec
  File "/home/zehao/PycharmProjects/Deformable-ConvNets-master/rfcn/core/", line 545, in _bind_ith_exec
    context, self.logger)
  File "/home/zehao/PycharmProjects/Deformable-ConvNets-master/rfcn/core/", line 523, in _get_or_reshape
    arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
  File "/home/zehao/anaconda2/envs/mxnet/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/", line 980, in zeros
    return _internal._zeros(shape=shape, ctx=ctx, dtype=dtype)
  File "<string>", line 36, in _zeros
  File "/home/zehao/anaconda2/envs/mxnet/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/", line 84, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:00:14] src/c_api/ Operator _zeros cannot be run; requires at least one of FCompute<xpu>, NDArrayFunction, FCreateOperator be registered

Stack trace returned 10 entries:
[bt] (0) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fd8c20081bc]
[bt] (1) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fd8c2e09c39]
[bt] (2) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/lib-dynload/ [0x7fd8b930c57c]
[bt] (3) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/lib-dynload/ [0x7fd8b930bcd5]
[bt] (4) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/lib-dynload/ [0x7fd8b9303376]
[bt] (5) /home/zehao/anaconda2/envs/mxnet/lib/python2.7/lib-dynload/ [0x7fd8b92fadb3]
[bt] (6) /home/zehao/anaconda2/envs/mxnet/bin/../lib/ [0x7fd8ca996e93]
[bt] (7) /home/zehao/anaconda2/envs/mxnet/bin/../lib/ [0x7fd8caa4980d]
[bt] (8) /home/zehao/anaconda2/envs/mxnet/bin/../lib/ [0x7fd8caa4bc3e]
[bt] (9) /home/zehao/anaconda2/envs/mxnet/bin/../lib/ [0x7fd8caa4b1f7]

How to get features from ROIs?


Thank you for sharing this amazing repo. I would like to use your proposed method to get the features (maps or vectors) from ROIs obtained from RPN.

Is there any way to do that? If so, can you please kindly guide me through it? For example, by pruning the network or directly getting activations from the middle of the network.

Thank you!

train error

Thank you for sharing the wonderful work, it is really helpful.
I encounter a problem when training:
AttributeError: 'module' object has no attribute 'DeformableConvolution'. Why?

MXNet version

How can I download this specific version, MXNet@(commit 62ecb60)? It does not seem to be an available version of MXNet.

NameError: name 'resnet_v2_101_rfcn' is not defined

I want to change the symbol 'resnet_v1_101_rfcn' to another symbol, 'resnet_v2_101_rfcn', so I copied the file and renamed it to resnet_v2_101_rfcn, and I also changed the class name to resnet_v2_101_rfcn. Then I changed the file name and the symbol name in the corresponding .yaml file and ran python experiments/rfcn/ --cfg experiments/rfcn/cfgs/resnet_v2_101_coco_trainval_rfcn_dcn_end2end_ohem.yaml. The following error occurs: NameError: name 'resnet_v2_101_rfcn' is not defined
What should I do?

Training Error

I got the following error when I tried to train on VOC data.
I use Python 3.6 and the newest MXNet.

Error in proposal_target.infer_shape: Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/mxnet-0.10.0-py3.6.egg/mxnet/", line 621, in infer_shape_entry
ret = op_prop.infer_shape(shapes)
File "/media/Deformable-ConvNets/rfcn/operator_py/", line 102, in infer_shape
rois = rpn_rois_shape[0] + gt_boxes_shape[0] if self._batch_rois == -1 else self._batch_rois
IndexError: list index out of range

I also found that in_shape[1] is None in the infer_shape function (line 100) of rfcn/operator_py/ and in_shape[0] is [300 5]

I want to train the deeplab-dcn version. How do I make the image_set list?

I am trying to train the deeplab-dcn version. I'm new to MXNet.

  1. In Caffe, I wrote the dataset list to a text file and put its path in the prototxt.
    Each line of the dataset list had the form "image gt_image \n".
    In MXNet, how do I set up the image_set path and list?
    Is it a format other than a text file?

  2. What was modified from the original MXNet?
    Was only the deformable_convolution layer added?

Deeplab implementation is different

@orpine I just noticed that in your DeepLab implementation there is no ASPP or atrous convolution after the res5c layer. In the actual DeepLab implementation, the shared conv layer (res5c) is followed by 4 atrous convolution layers with varying dilation parameters (namely 4, 8, 16, 24). Is there any specific reason you chose to follow it with fc6 and a score layer instead?

conv_feat = self.get_resnet_conv(data)
fc6_bias = mx.symbol.Variable('fc6_bias', lr_mult=2.0)
fc6_weight = mx.symbol.Variable('fc6_weight', lr_mult=1.0)
fc6 = mx.symbol.Convolution(data=conv_feat, kernel=(1, 1), pad=(0, 0), num_filter=1024, name="fc6", bias=fc6_bias, weight=fc6_weight,workspace=self.workspace)
relu_fc6 = mx.sym.Activation(data=fc6, act_type='relu', name='relu_fc6')
score_bias = mx.symbol.Variable('score_bias', lr_mult=2.0)
score_weight = mx.symbol.Variable('score_weight', lr_mult=1.0)
score = mx.symbol.Convolution(data=relu_fc6, kernel=(1, 1), pad=(0, 0), num_filter=num_classes, name="score", bias=score_bias,weight=score_weight, workspace=self.workspace)

MXNetError when I tried to run python ./rfcn/

zhangboshen@smart-gpu-server1:~/src/mxnet/Deformable-ConvNets$ python ./rfcn/
'MXNET_VERSION': 'mxnet',
'SCALES': [(600, 1000)],
'HAS_RPN': True,
'NMS': 0.3,
'RPN_PRE_NMS_TOP_N': 6000,
'max_per_image': 100,
'test_epoch': 8},
'rfcn1_epoch': 0,
'rfcn1_lr': 0,
'rfcn1_lr_step': '',
'rfcn2_epoch': 0,
'rfcn2_lr': 0,
'rfcn2_lr_step': '',
'rpn1_epoch': 0,
'rpn1_lr': 0,
'rpn1_lr_step': '',
'rpn2_epoch': 0,
'rpn2_lr': 0,
'rpn2_lr_step': '',
'rpn3_epoch': 0,
'rpn3_lr': 0,
'rpn3_lr_step': ''},
'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_STDS': [0.1, 0.1, 0.2, 0.2],
'BBOX_WEIGHTS': array([ 1., 1., 1., 1.]),
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'END2END': True,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'FLIP': True,
'RESUME': True,
'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_PRE_NMS_TOP_N': 6000,
'SHUFFLE': True,
'begin_epoch': 5,
'end_epoch': 8,
'lr': 0.0005,
'lr_factor': 0.1,
'lr_step': '5.333',
'model_prefix': 'e2e',
'momentum': 0.9,
'warmup': False,
'warmup_lr': 5e-05,
'warmup_step': 1000,
'wd': 0.0005},
'dataset': {'NUM_CLASSES': 81,
'dataset': 'coco',
'dataset_path': './data/coco',
'image_set': 'train2014+val2014',
'proposal': 'rpn',
'root_path': './data',
'test_image_set': 'test-dev2015'},
'default': {'frequent': 20, 'kvstore': 'device'},
'gpus': '0',
'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [4, 8, 16, 32],
'FIXED_PARAMS': ['conv1',
'PIXEL_MEANS': array([ 103.06, 115.9 , 123.15]),
'pretrained': './model/pretrained_model/resnet_v1_101',
'pretrained_epoch': 0},
'output_path': './output/rfcn',
'symbol': 'resnet_v1_101_rfcn'}
[16:21:10] /home/zhangboshen/mxnet/dmlc-core/include/dmlc/logging.h:304: [16:21:10] src/c_api/ Operator _zeros cannot be run; requires at least one of FCompute<xpu>, NDArrayFunction, FCreateOperator be registered

Stack trace returned 10 entries:
[bt] (0) /home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/ [0x7f0e1ff9981c]
[bt] (1) /home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/ [0x7f0e209c35da]
[bt] (2) /home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/ [0x7f0e209c3d52]
[bt] (3) /home/zhangboshen/anaconda2/lib/python2.7/lib-dynload/ [0x7f0e10a9531c]
[bt] (4) /home/zhangboshen/anaconda2/lib/python2.7/lib-dynload/ [0x7f0e10a94a75]
[bt] (5) /home/zhangboshen/anaconda2/lib/python2.7/lib-dynload/ [0x7f0e10a8c126]
[bt] (6) /home/zhangboshen/anaconda2/lib/python2.7/lib-dynload/ [0x7f0e10a83ce3]
[bt] (7) /home/zhangboshen/anaconda2/bin/../lib/ [0x7f0e2e401dc3]
[bt] (8) /home/zhangboshen/anaconda2/bin/../lib/ [0x7f0e2e4b36c7]
[bt] (9) /home/zhangboshen/anaconda2/bin/../lib/ [0x7f0e2e4b61ce]

Traceback (most recent call last):
File "./rfcn/", line 130, in <module>
File "./rfcn/", line 90, in main
arg_params=arg_params, aux_params=aux_params)
File "/home/zhangboshen/src/mxnet/Deformable-ConvNets/rfcn/core/", line 29, in __init__
self._mod.bind(provide_data, provide_label, for_training=False)
File "/home/zhangboshen/src/mxnet/Deformable-ConvNets/rfcn/core/", line 839, in bind
for_training, inputs_need_grad, force_rebind=False, shared_module=None)
File "/home/zhangboshen/src/mxnet/Deformable-ConvNets/rfcn/core/", line 396, in bind
File "/home/zhangboshen/src/mxnet/Deformable-ConvNets/rfcn/core/", line 186, in init
self.bind_exec(data_shapes, label_shapes, shared_group)
File "/home/zhangboshen/src/mxnet/Deformable-ConvNets/rfcn/core/", line 272, in bind_exec
File "/home/zhangboshen/src/mxnet/Deformable-ConvNets/rfcn/core/", line 545, in _bind_ith_exec
context, self.logger)
File "/home/zhangboshen/src/mxnet/Deformable-ConvNets/rfcn/core/", line 523, in _get_or_reshape
arg_arr = nd.zeros(arg_shape, context, dtype=arg_type)
File "/home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/", line 1028, in zeros
return _internal._zeros(shape=shape, ctx=ctx, dtype=dtype, **kwargs)
File "", line 15, in _zeros
File "/home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/_ctypes/", line 73, in _imperative_invoke
c_array(ctypes.c_char_p, [c_str(str(val)) for val in vals])))
File "/home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/", line 85, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
_mxnet.base.MXNetError: [16:21:10] src/c_api/ Operator zeros cannot be run; requires at least one of FCompute, NDArrayFunction, FCreateOperator be registered

Stack trace returned 10 entries:
[bt] (0) /home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/ [0x7f0e1ff9981c]
[bt] (1) /home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/ [0x7f0e209c35da]
[bt] (2) /home/zhangboshen/anaconda2/lib/python2.7/site-packages/mxnet-0.10.1-py2.7.egg/mxnet/ [0x7f0e209c3d52]
[bt] (3) /home/zhangboshen/anaconda2/lib/python2.7/lib-dynload/ [0x7f0e10a9531c]
[bt] (4) /home/zhangboshen/anaconda2/lib/python2.7/lib-dynload/ [0x7f0e10a94a75]
[bt] (5) /home/zhangboshen/anaconda2/lib/python2.7/lib-dynload/ [0x7f0e10a8c126]
[bt] (6) /home/zhangboshen/anaconda2/lib/python2.7/lib-dynload/ [0x7f0e10a83ce3]
[bt] (7) /home/zhangboshen/anaconda2/bin/../lib/ [0x7f0e2e401dc3]
[bt] (8) /home/zhangboshen/anaconda2/bin/../lib/ [0x7f0e2e4b36c7]
[bt] (9) /home/zhangboshen/anaconda2/bin/../lib/ [0x7f0e2e4b61ce]

TypeError: _update_params_on_kvstore()

I ran into trouble while running this script:
python experiments/rfcn/ --cfg experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_dcn_end2end_ohem.yaml

At the first epoch, I got this error:
Traceback (most recent call last):
File "experiments/rfcn/", line 19, in
File "experiments/rfcn/../../rfcn/", line 164, in main
config.TRAIN.begin_epoch, config.TRAIN.end_epoch,, config.TRAIN.lr_step)
File "experiments/rfcn/../../rfcn/", line 157, in train_net
arg_params=arg_params, aux_params=aux_params, begin_epoch=begin_epoch, num_epoch=end_epoch)
File "experiments/rfcn/../../rfcn/core/", line 969, in fit
File "experiments/rfcn/../../rfcn/core/", line 1051, in update
File "experiments/rfcn/../../rfcn/core/", line 572, in update
TypeError: _update_params_on_kvstore() takes exactly 4 arguments (3 given)

I guessed that the error might be caused by mismatched Python and MXNet versions, so I removed the versions installed on my computer and reinstalled them this way:

cd $(DCN_ROOT)/
git clone --recursive
cd mxnet/
git checkout 62ecb60
git submodule update
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1

cd python
sudo python install

I also checked the locations of Python and MXNet as follows:
which python
import mxnet
mxnet.__path__

With this configuration, the error is still present.

Could you please give me some advice on how to overcome this issue?
I appreciate your help.
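
A check like the one described above can be scripted. The following is a minimal sketch, not part of the repository: the helper name describe_module is hypothetical, and it only reports which interpreter is running and which copy of a package that interpreter imports. If the reported location is an older site-packages egg rather than the freshly built checkout, a signature mismatch such as the TypeError above would be one plausible symptom.

```python
import importlib
import sys


def describe_module(name):
    """Return the version and file location of an importable module.

    Returns None when the module cannot be imported at all.
    (Hypothetical helper, for illustration only.)
    """
    try:
        module = importlib.import_module(name)
    except (ImportError, OSError):
        return None
    return {
        "version": getattr(module, "__version__", "unknown"),
        "location": getattr(module, "__file__", None),
    }


# Which interpreter is running, and which mxnet does it pick up?
print(sys.executable)
print(describe_module("mxnet"))
```

Running it with the same python command used to launch training shows whether that interpreter sees the intended build.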

Segmentation Fault during Deformable faster r-cnn training


.....................................................error log ................................................................................
'pretrained_epoch': 0},
'output_path': './output/rcnn/imagenet_vid',
'symbol': 'resnet_v1_101_rcnn_dcn'}
num_images 53639
ImageNetVID_DET_train_30classes gt roidb loaded from ./data/cache/ImageNetVID_DET_train_30classes_gt_roidb.pkl
append flipped images to roidb
num_images 57834
ImageNetVID_VID_train_15frames gt roidb loaded from ./data/cache/ImageNetVID_VID_train_15frames_gt_roidb.pkl
append flipped images to roidb
filtered 3316 roidb entries: 222946 -> 219630
Segmentation fault (core dumped)

......................................................error log..........................................................................................

Whenever I try to run Deformable Faster R-CNN training, it fails with a segmentation fault, whether I use VOC or COCO. I have tried two servers with 8 GPUs, and both show the same fault. Could you please give me a hint about what the problem may be?

The channel number of DeformableConvolution layer and corresponding offset layer?

In the provided symbols file, the deformable convolution layer has 512 channels (filters) and the corresponding offset layer has 72 filters. How should these two numbers be set? For example, if I want to build a deformable network for CIFAR-10, the deformable convolution layer should have fewer than 512 filters; should I choose 256 or 128?
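
For what it is worth, the value 72 is consistent with the usual sizing rule for the offset branch: two values (a vertical and a horizontal shift) per kernel sampling point, per deformable group. The sketch below assumes a 3x3 kernel and num_deformable_group=4; the helper name offset_channels is hypothetical.

```python
def offset_channels(kernel_h, kernel_w, num_deformable_group=1):
    """Number of output channels the offset-predicting convolution needs.

    Two values (dy, dx) per kernel sampling point, per deformable group.
    """
    return 2 * kernel_h * kernel_w * num_deformable_group


print(offset_channels(3, 3, num_deformable_group=4))  # 72
print(offset_channels(3, 3, num_deformable_group=1))  # 18
```

Under this reading, the filter count of the deformable convolution itself (512, 256, 128, ...) is an ordinary capacity choice and does not change the offset layer's size; only the kernel size and the number of deformable groups do.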

cannot run demo


I would like to test your demo, but I got an error.

My environment: Ubuntu 14.04, Tesla K80, CUDA 8.0

I installed MXNet at commit 62ecb60 and copied your additional operators to $(YOUR_MXNET_FOLDER)/src/operator/contrib. MXNet compiled successfully. After this, I started testing your code. However, when I ran the demo with python ./rfcn/, I got the following error.

kelin@vision-kevin-gpu-exp:~/code/Deformable-ConvNets$ python ./rfcn/ 
libdc1394 error: Failed to initialize libdc1394
 'MXNET_VERSION': 'mxnet',
 'SCALES': [(600, 1000)],
          'CXX_PROPOSAL': False,
          'HAS_RPN': True,
          'NMS': 0.3,
          'PROPOSAL_MIN_SIZE': 0,
          'PROPOSAL_NMS_THRESH': 0.7,
          'PROPOSAL_POST_NMS_TOP_N': 2000,
          'PROPOSAL_PRE_NMS_TOP_N': 20000,
          'RPN_MIN_SIZE': 0,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 300,
          'RPN_PRE_NMS_TOP_N': 6000,
          'max_per_image': 100,
          'test_epoch': 8},
                         'RPN_BATCH_IMAGES': 0,
                         'rfcn1_epoch': 0,
                         'rfcn1_lr': 0,
                         'rfcn1_lr_step': '',
                         'rfcn2_epoch': 0,
                         'rfcn2_lr': 0,
                         'rfcn2_lr_step': '',
                         'rpn1_epoch': 0,
                         'rpn1_lr': 0,
                         'rpn1_lr_step': '',
                         'rpn2_epoch': 0,
                         'rpn2_lr': 0,
                         'rpn2_lr_step': '',
                         'rpn3_epoch': 0,
                         'rpn3_lr': 0,
                         'rpn3_lr_step': ''},
           'ASPECT_GROUPING': True,
           'BATCH_IMAGES': 1,
           'BATCH_ROIS': -1,
           'BATCH_ROIS_OHEM': 128,
           'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],
           'BBOX_REGRESSION_THRESH': 0.5,
           'BBOX_STDS': [0.1, 0.1, 0.2, 0.2],
           'BBOX_WEIGHTS': array([ 1.,  1.,  1.,  1.]),
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0.0,
           'CXX_PROPOSAL': False,
           'ENABLE_OHEM': True,
           'END2END': True,
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'FLIP': True,
           'RESUME': True,
           'RPN_BATCH_SIZE': 256,
           'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'RPN_CLOBBER_POSITIVES': False,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 0,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POSITIVE_WEIGHT': -1.0,
           'RPN_POST_NMS_TOP_N': 300,
           'RPN_PRE_NMS_TOP_N': 6000,
           'SHUFFLE': True,
           'begin_epoch': 5,
           'end_epoch': 8,
           'lr': 0.0005,
           'lr_factor': 0.1,
           'lr_step': '5.333',
           'model_prefix': 'e2e',
           'momentum': 0.9,
           'warmup': False,
           'warmup_lr': 5e-05,
           'warmup_step': 1000,
           'wd': 0.0005},
 'dataset': {'NUM_CLASSES': 81,
             'dataset': 'coco',
             'dataset_path': './data/coco',
             'image_set': 'train2014+val2014',
             'proposal': 'rpn',
             'root_path': './data',
             'test_image_set': 'test-dev2015'},
 'default': {'frequent': 20, 'kvstore': 'device'},
 'gpus': '0',
 'network': {'ANCHOR_RATIOS': [0.5, 1, 2],
             'ANCHOR_SCALES': [4, 8, 16, 32],
             'FIXED_PARAMS': ['conv1',
             'FIXED_PARAMS_SHARED': ['conv1',
             'IMAGE_STRIDE': 0,
             'NUM_ANCHORS': 12,
             'PIXEL_MEANS': array([ 103.06,  115.9 ,  123.15]),
             'RCNN_FEAT_STRIDE': 16,
             'RPN_FEAT_STRIDE': 16,
             'pretrained': './model/pretrained_model/resnet_v1_101',
             'pretrained_epoch': 0},
 'output_path': './output/rfcn',
 'symbol': 'resnet_v1_101_rfcn'}
[05:41:04] /home/kelin/code/origin_mxnet/mxnet/dmlc-core/include/dmlc/logging.h:300: [05:41:04] /home/kelin/code/origin_mxnet/mxnet/mshadow/mshadow/./stream_gpu-inl.h:45: Check failed: e == cudaSuccess CUDA: an illegal memory access was encountered

Stack trace returned 8 entries:
[bt] (0) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096eeef06c]
[bt] (1) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f760088]
[bt] (2) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f7aaea8]
[bt] (3) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f795b8c]
[bt] (4) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f798f00]
[bt] (5) /usr/lib/x86_64-linux-gnu/ [0x7f0985ef0a60]
[bt] (6) /lib/x86_64-linux-gnu/ [0x7f09887ce184]
[bt] (7) /lib/x86_64-linux-gnu/ [0x7f09884faffd]

[05:41:04] /home/kelin/code/origin_mxnet/mxnet/dmlc-core/include/dmlc/logging.h:300: [05:41:04] src/engine/./threaded_engine.h:329: [05:41:04] /home/kelin/code/origin_mxnet/mxnet/mshadow/mshadow/./stream_gpu-inl.h:45: Check failed: e == cudaSuccess CUDA: an illegal memory access was encountered

Stack trace returned 8 entries:
[bt] (0) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096eeef06c]
[bt] (1) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f760088]
[bt] (2) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f7aaea8]
[bt] (3) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f795b8c]
[bt] (4) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f798f00]
[bt] (5) /usr/lib/x86_64-linux-gnu/ [0x7f0985ef0a60]
[bt] (6) /lib/x86_64-linux-gnu/ [0x7f09887ce184]
[bt] (7) /lib/x86_64-linux-gnu/ [0x7f09884faffd]

An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.

Stack trace returned 6 entries:
[bt] (0) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096eeef06c]
[bt] (1) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f795e71]
[bt] (2) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f798f00]
[bt] (3) /usr/lib/x86_64-linux-gnu/ [0x7f0985ef0a60]
[bt] (4) /lib/x86_64-linux-gnu/ [0x7f09887ce184]
[bt] (5) /lib/x86_64-linux-gnu/ [0x7f09884faffd]

terminate called after throwing an instance of 'dmlc::Error'
  what():  [05:41:04] src/engine/./threaded_engine.h:329: [05:41:04] /home/kelin/code/origin_mxnet/mxnet/mshadow/mshadow/./stream_gpu-inl.h:45: Check failed: e == cudaSuccess CUDA: an illegal memory access was encountered

Stack trace returned 8 entries:
[bt] (0) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096eeef06c]
[bt] (1) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f760088]
[bt] (2) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f7aaea8]
[bt] (3) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f795b8c]
[bt] (4) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f798f00]
[bt] (5) /usr/lib/x86_64-linux-gnu/ [0x7f0985ef0a60]
[bt] (6) /lib/x86_64-linux-gnu/ [0x7f09887ce184]
[bt] (7) /lib/x86_64-linux-gnu/ [0x7f09884faffd]

An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.

Stack trace returned 6 entries:
[bt] (0) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096eeef06c]
[bt] (1) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f795e71]
[bt] (2) /home/kelin/code/origin_mxnet/mxnet/python/mxnet/../../lib/ [0x7f096f798f00]
[bt] (3) /usr/lib/x86_64-linux-gnu/ [0x7f0985ef0a60]
[bt] (4) /lib/x86_64-linux-gnu/ [0x7f09887ce184]
[bt] (5) /lib/x86_64-linux-gnu/ [0x7f09884faffd]

Aborted (core dumped)

If I understand the code correctly, we can simply ignore Failed to initialize libdc1394. The main problem should be

mxnet/mshadow/mshadow/./stream_gpu-inl.h:45: Check failed: e == cudaSuccess CUDA: an illegal memory access was encountered

Since I am able to run MXNet's own demos (such as image classification on CIFAR-10), my hardware setup should be fine. I have no idea what causes this error. Could you please help me with this? Thanks!

TypeError: init_params() got an unexpected keyword argument 'allow_extra'

Hi, I ran into trouble while running this script:
python experiments/rfcn/ --cfg experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_dcn_end2end_ohem.yaml

After the first epoch, I got:

CNNLogLoss=0.776314, RCNNL1Loss=0.329874,
Epoch[0] Batch [9900] Speed: 4.22 samples/sec Train-RPNAcc=0.942865, RPNLogLoss=0.154231, RPNL1Loss=0.071810, RCNNAcc=0.809554, RCNNLogLoss=0.773895, RCNNL1Loss=0.330007,
Epoch[0] Batch [10000] Speed: 4.22 samples/sec Train-RPNAcc=0.943149, RPNLogLoss=0.153486, RPNL1Loss=0.071483, RCNNAcc=0.809500, RCNNLogLoss=0.771593, RCNNL1Loss=0.329987,
Traceback (most recent call last):
File "experiments/rfcn/", line 19, in
File "experiments/rfcn/../../rfcn/", line 164, in main
config.TRAIN.begin_epoch, config.TRAIN.end_epoch,, config.TRAIN.lr_step)
File "experiments/rfcn/../../rfcn/", line 157, in train_net
arg_params=arg_params, aux_params=aux_params, begin_epoch=begin_epoch, num_epoch=end_epoch)
File "experiments/rfcn/../../rfcn/core/", line 990, in fit
self.set_params(arg_params, aux_params)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.10.1-py2.7.egg/mxnet/module/", line 651, in set_params
TypeError: init_params() got an unexpected keyword argument 'allow_extra'

Would you mind giving me a hint?
My computer has a single GTX 1080 Ti.

could you provide pretrained model in BaiduYun?

I cannot download your pretrained model from OneDrive.

Could you provide the pretrained model on BaiduYun?


What does this mean? As far as I know, this is not correct.
In Epoch[0], RPNL1Loss = 0.403792 at first. After that it is always nan.

Epoch[0] Batch [300] Speed: 1.15 samples/sec Train-RPNAcc=0.812539, RPNLogLoss=0.570043, RPNL1Loss=nan, RCNNAcc=0.767390, RCNNLogLoss=3.596807, RCNNL1Loss=0.010698,
Epoch[0] Batch [400] Speed: 1.16 samples/sec Train-RPNAcc=0.821910, RPNLogLoss=0.599778, RPNL1Loss=nan, RCNNAcc=0.818267, RCNNLogLoss=3.577709, RCNNL1Loss=0.008031,
Epoch[0] Batch [500] Speed: 1.14 samples/sec Train-RPNAcc=0.828243, RPNLogLoss=0.617132, RPNL1Loss=nan, RCNNAcc=0.848381, RCNNLogLoss=3.396140, RCNNL1Loss=0.006434,
Epoch[0] Batch [600] Speed: 1.15 samples/sec Train-RPNAcc=0.832391, RPNLogLoss=0.628403, RPNL1Loss=nan, RCNNAcc=0.869345, RCNNLogLoss=2.870057, RCNNL1Loss=0.005387,
Epoch[0] Batch [700] Speed: 1.16 samples/sec Train-RPNAcc=0.834384, RPNLogLoss=0.636100, RPNL1Loss=nan, RCNNAcc=0.883437, RCNNLogLoss=2.499374, RCNNL1Loss=0.004622,
Epoch[0] Batch [800] Speed: 1.16 samples/sec Train-RPNAcc=0.836050, RPNLogLoss=0.641726, RPNL1Loss=nan, RCNNAcc=0.894673, RCNNLogLoss=2.215910, RCNNL1Loss=0.004046,
Epoch[0] Batch [900] Speed: 1.14 samples/sec Train-RPNAcc=0.837489, RPNLogLoss=0.645880, RPNL1Loss=nan, RCNNAcc=0.903519, RCNNLogLoss=1.994231, RCNNL1Loss=0.003597,
Epoch[0] Batch [1000] Speed: 1.16 samples/sec Train-RPNAcc=0.838438, RPNLogLoss=0.649027, RPNL1Loss=nan, RCNNAcc=0.910550, RCNNLogLoss=1.816863, RCNNL1Loss=0.003238,
Epoch[0] Batch [1100] Speed: 1.16 samples/sec Train-RPNAcc=0.839179, RPNLogLoss=0.650772, RPNL1Loss=nan, RCNNAcc=0.915993, RCNNLogLoss=1.673953, RCNNL1Loss=0.003987,
Epoch[0] Batch [1200] Speed: 1.15 samples/sec Train-RPNAcc=0.839684, RPNLogLoss=0.650882, RPNL1Loss=nan, RCNNAcc=0.920535, RCNNLogLoss=1.553478, RCNNL1Loss=0.003658,
Epoch[0] Batch [1300] Speed: 1.15 samples/sec Train-RPNAcc=0.840592, RPNLogLoss=0.649718, RPNL1Loss=nan, RCNNAcc=0.924115, RCNNLogLoss=1.452725, RCNNL1Loss=0.003903,
Epoch[0] Batch [1400] Speed: 1.14 samples/sec Train-RPNAcc=0.841459, RPNLogLoss=0.647637, RPNL1Loss=nan, RCNNAcc=0.927468, RCNNLogLoss=1.363495, RCNNL1Loss=0.003740,
Epoch[0] Batch [1500] Speed: 1.16 samples/sec Train-RPNAcc=0.841013, RPNLogLoss=0.645282, RPNL1Loss=nan, RCNNAcc=0.930463, RCNNLogLoss=1.284670, RCNNL1Loss=0.003492,
Epoch[0] Batch [1600] Speed: 1.16 samples/sec Train-RPNAcc=0.841707, RPNLogLoss=0.642184, RPNL1Loss=nan, RCNNAcc=0.932781, RCNNLogLoss=1.216500, RCNNL1Loss=0.003344,
Epoch[0] Batch [1700] Speed: 1.16 samples/sec Train-RPNAcc=0.841856, RPNLogLoss=0.638847, RPNL1Loss=nan, RCNNAcc=0.935341, RCNNLogLoss=1.151333, RCNNL1Loss=0.003149,
Epoch[0] Batch [1800] Speed: 1.16 samples/sec Train-RPNAcc=0.842007, RPNLogLoss=0.635248, RPNL1Loss=nan, RCNNAcc=0.937574, RCNNLogLoss=1.095160, RCNNL1Loss=0.004580,
Epoch[0] Batch [1900] Speed: 1.17 samples/sec Train-RPNAcc=0.842244, RPNLogLoss=0.631594, RPNL1Loss=nan, RCNNAcc=0.939325, RCNNLogLoss=1.043886, RCNNL1Loss=0.004343,
Epoch[0] Batch [2000] Speed: 1.17 samples/sec Train-RPNAcc=0.842616, RPNLogLoss=0.627715, RPNL1Loss=nan, RCNNAcc=0.941225, RCNNLogLoss=0.996324, RCNNL1Loss=0.004129,
Epoch[0] Batch [2100] Speed: 1.18 samples/sec Train-RPNAcc=0.843182, RPNLogLoss=0.623727, RPNL1Loss=nan, RCNNAcc=0.942862, RCNNLogLoss=0.953685, RCNNL1Loss=0.003934,
Epoch[0] Batch [2200] Speed: 1.18 samples/sec Train-RPNAcc=0.843663, RPNLogLoss=0.619788, RPNL1Loss=nan, RCNNAcc=0.944095, RCNNLogLoss=0.915521, RCNNL1Loss=0.003757,
Epoch[0] Batch [2300] Speed: 1.17 samples/sec Train-RPNAcc=0.844234, RPNLogLoss=0.615760, RPNL1Loss=nan, RCNNAcc=0.945496, RCNNLogLoss=0.879742, RCNNL1Loss=0.003595,
Epoch[0] Batch [2400] Speed: 1.18 samples/sec Train-RPNAcc=0.844505, RPNLogLoss=0.611821, RPNL1Loss=nan, RCNNAcc=0.947011, RCNNLogLoss=0.846207, RCNNL1Loss=0.003446,
Epoch[0] Batch [2500] Speed: 1.16 samples/sec Train-RPNAcc=0.844367, RPNLogLoss=0.608176, RPNL1Loss=nan, RCNNAcc=0.947858, RCNNLogLoss=0.819818, RCNNL1Loss=0.004606,
Epoch[0] Batch [2600] Speed: 1.18 samples/sec Train-RPNAcc=0.844443, RPNLogLoss=0.604457, RPNL1Loss=nan, RCNNAcc=0.948941, RCNNLogLoss=0.791787, RCNNL1Loss=0.004434,

Demo issue

Traceback (most recent call last):
  File "./rfcn/", line 129, in <module>
    main()
  File "./rfcn/", line 50, in main
    sym = sym_instance.get_symbol(config, is_train=False)
  File "/mnt/Deformable-ConvNets/rfcn/symbols/", line 725, in get_symbol
    relu1 = self.get_resnet_v1_conv5(conv_feat)
  File "/mnt/Deformable-ConvNets/rfcn/symbols/", line 633, in get_resnet_v1_conv5
    res5a_branch2b = mx.contrib.symbol.DeformableConvolution(name='res5a_branch2b', data=res5a_branch2a_relu, offset=res5a_branch2b_offset,
AttributeError: 'module' object has no attribute 'DeformableConvolution'
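
An AttributeError like this suggests that the MXNet build being imported does not contain the extra Deformable ConvNets operators. A quick way to confirm that before running the demo is sketched below. The helper name has_attribute is hypothetical; the module and attribute names are taken from the traceback.

```python
import importlib


def has_attribute(module_name, attribute):
    """Return True if the module imports and exposes the named attribute.

    (Hypothetical helper, for illustration only.)
    """
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError):
        return False
    return hasattr(module, attribute)


# False here means this interpreter's MXNet lacks the operator.
print(has_attribute("mxnet.contrib.symbol", "DeformableConvolution"))
```

If this prints False, the likely next step is to check that the operators were copied into the MXNet source tree before compiling and that Python is importing that build.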

Pretrained model for Deformable Faster R-CNN on MSCOCO


Thank you again for sharing this amazing repository.

I noticed that, although the training scripts are provided for Deformable Faster R-CNN, the pre-trained models for Faster R-CNN and Deformable Faster R-CNN are missing. I can also see that you already have the results of Deformable Faster R-CNN on MS COCO.

Could you please kindly share the pre-trained model for Deformable Faster R-CNN (2fc), ResNet-v1-101 with us?

Thank you!

Error when Train My Own DataSets

I've read the Deformable ConvNets paper, and it's amazing! Now I have a face dataset to train on, so I changed the and from 21 classes to 2 classes. I ran this:
python ./experiments/rfcn/ --cfg ./experiments/rfcn/cfgs/resnet_v1_101_voc0712_rfcn_dcn_end2end_ohem.yaml

but it fails with this error:

[14:53:44] /mnt/data1/daniel/mxnet0/dmlc-core/include/dmlc/./logging.h:304[14:53:44] /mnt/data1/daniel/mxnet0/dmlc-core/include/dmlc/./logging.h:304: : [14:53:44] /mnt/data1/daniel/mxnet0/mshadow/mshadow/./tensor_gpu-inl.h:35: Check failed: e == cudaSuccess CUDA: invalid device ordinal

Stack trace returned 6 entries:
[bt] (0) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe9694c9ac9]
[bt] (1) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe96a166c18]
[bt] (2) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe96a169460]
[bt] (3) /lib64/ [0x7fe98ce1d220]
[bt] (4) /lib64/ [0x7fe995a29dc5]
[bt] (5) /lib64/ [0x7fe99504f73d]

[14:53:44] /mnt/data1/daniel/mxnet0/mshadow/mshadow/./tensor_gpu-inl.h:35: Check failed: e == cudaSuccess CUDA: invalid device ordinal

Stack trace returned 6 entries:
[bt] (0) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe9694c9ac9]
[bt] (1) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe96a166c18]
[bt] (2) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe96a169460]
[bt] (3) /lib64/ [0x7fe98ce1d220]
[bt] (4) /lib64/ [0x7fe995a29dc5]
[bt] (5) /lib64/ [0x7fe99504f73d]

terminate called after throwing an instance of 'dmlc::Error'
  what():  [14:53:44] /mnt/data1/daniel/mxnet0/mshadow/mshadow/./tensor_gpu-inl.h:35: Check failed: e == cudaSuccess CUDA: invalid device ordinal

Stack trace returned 6 entries:
[bt] (0) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe9694c9ac9]
[bt] (1) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe96a166c18]
[bt] (2) /mnt/data1/daniel/Python-2.7.13/build/lib/python2.7/site-packages/mxnet-0.9.5-py2.7.egg/mxnet/ [0x7fe96a169460]
[bt] (3) /lib64/ [0x7fe98ce1d220]
[bt] (4) /lib64/ [0x7fe995a29dc5]
[bt] (5) /lib64/ [0x7fe99504f73d]

terminate called recursively
Segmentation fault(core dumped)

I want to know why this happened and how to solve it.

Error when testing on the Cityscapes dataset

I tried to test my trained model on the Cityscapes dataset with the following command:

python experiments/deeplab/ --cfg experiments/deeplab/cfgs/deeplab_resnet_v1_101_cityscapes_segmentation_dcn.yaml

However, it gave me this error:

Traceback (most recent call last):
File "experiments/deeplab/", line 20, in
File "experiments/deeplab/../../deeplab/", line 99, in main
File "experiments/deeplab/../../deeplab/", line 95, in test_deeplab
pred_eval(predictor, test_data, imdb, vis=args.vis, ignore_cache=args.ignore_cache, logger=logger)
File "experiments/deeplab/../../deeplab/core/", line 102, in pred_eval
evaluation_results = imdb.evaluate_segmentations(all_segmentation_result)
File "experiments/deeplab/../../deeplab/../lib/dataset/", line 182, in evaluate_segmentations
info = self._py_evaluate_segmentation()
File "experiments/deeplab/../../deeplab/../lib/dataset/", line 241, in _py_evaluate_segmentation
seg_pred = np.array('float32')
File "/home/haowang/software/miniconda2/lib/python2.7/site-packages/PIL/", line 2410, in open
fp =, "rb")
IOError: [Errno 2] No such file or directory: './output/cityscape/deeplab_resnet_v1_101_cityscapes_segmentation_dcn/leftImg8bit_val/results/frankfurt/frankfurt_000001_059642.png'

I guess there is a bug in ./lib/dataset/, line 179, where

if not pred_segmentations:

should be

if pred_segmentations:

ImportError: cannot import name bbox_overlaps_cython

I feel like this is a stupid question, but when I finished the installation and ran python ./rfcn/, I got:

Traceback (most recent call last):
  File "./rfcn/", line 17, in <module>
    from utils.image import resize, transform
  File "/net/mlfs01/export/users/cyma/codes/Deformable-ConvNets/rfcn/../lib/utils/", line 6, in <module>
    from bbox.bbox_transform import clip_boxes
  File "/net/mlfs01/export/users/cyma/codes/Deformable-ConvNets/rfcn/../lib/bbox/", line 6, in <module>
    from bbox import bbox_overlaps_cython
ImportError: cannot import name bbox_overlaps_cython

Obviously, Python can't import from the bbox.pyx file.
Adding the following before from bbox import bbox_overlaps_cython in will force it to import from the .pyx file.

import pyximport
pyximport.install()
from bbox import bbox_overlaps_cython

But I feel like there is something wrong with my setting or installation (no error reported during installation for MXNet).

Has anyone faced the same issue before?

pip list:

Cython (0.25.2)
Django (1.11.1)
easydict (1.6)
image (1.5.5)
mxnet (0.9.5)
numpy (1.13.0rc2)
olefile (0.44)
opencv-python (
Pillow (4.1.1)
pip (9.0.1)
pytz (2017.2)
PyYAML (3.12)
setuptools (27.2.0)
wheel (0.29.0)

Questions about the

In, I have two questions about the details:
1. Why does the offset of the bounding box (red box) come from the output of the rfcn_cls_offset layer rather than the output of the rfcn_bbox_offset layer? I think the latter is more closely related to the sub-bbox location. Or is it because rfcn_cls_offset is related to the foreground object?
2. Why is trans_std set to 0.1 in the function show_dpsroi_offset? Thank you!

mxnet compile failed

System configuration: Ubuntu 14.04, CUDA 8, cuDNN 5
Error info when compiling MXNet:
/usr/include/c++/4.8/bits/stl_vector.h:919:7: note: no known conversion for argument 1 from ‘nnvm::dim_t* {aka long int*}’ to ‘unsigned int*&&’
make: *** [build/src/operator/custom/custom.o] Error 1
Does anyone know how to fix it?

How to get the sampling points

I noticed that you plotted the sampling locations (in red) for different activation units, which is very helpful for understanding deformable ConvNets, but I wonder how to get these sampling points. How can I find the locations that correspond to a deformable filter?
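
One way to approach this for a single layer is to take the regular convolution grid and add the predicted offsets to it. The sketch below is not the authors' plotting code. It assumes the offsets for one output position are stored as interleaved pairs [dy_0, dx_0, dy_1, dx_1, ...], with kernel points enumerated row by row, and the function name sampling_points is hypothetical.

```python
def sampling_points(offsets, out_y, out_x, kernel=(3, 3), stride=(1, 1),
                    pad=(1, 1), dilate=(1, 1)):
    """Return the (y, x) input locations sampled for one output position.

    offsets: flat list of 2 * kh * kw floats for this output position,
             assumed to be laid out as [dy_0, dx_0, dy_1, dx_1, ...].
    """
    kh, kw = kernel
    points = []
    for i in range(kh):
        for j in range(kw):
            k = i * kw + j
            dy, dx = offsets[2 * k], offsets[2 * k + 1]
            # Regular grid location plus the learned fractional shift.
            y = out_y * stride[0] - pad[0] + i * dilate[0] + dy
            x = out_x * stride[1] - pad[1] + j * dilate[1] + dx
            points.append((y, x))
    return points


# With all-zero offsets this reduces to the ordinary 3x3 grid around (5, 5).
print(sampling_points([0.0] * 18, out_y=5, out_x=5))
```

The offsets themselves would be read from the output of the offset convolution at the chosen output position. The resulting coordinates are fractional, so they refer to bilinearly interpolated locations rather than single pixels.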

How 'num_deformable_group' works?

In, num_deformable_group is set to 1, but in, num_deformable_group is set to 4. I know how deformable convolution works when num_deformable_group=1, but how does it work when num_deformable_group=4? Is it the same as group convolution in a regular convolution layer? Thanks!
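
As far as I understand it, deformable groups partition the offsets, not the weights: the input channels are split into num_deformable_group equal blocks, and each block is sampled with its own set of 2 * kh * kw offsets. The sketch below shows that bookkeeping under this assumption; the function name group_layout and the 512-channel input are illustrative only.

```python
def group_layout(in_channels, kernel=(3, 3), num_deformable_group=1):
    """Map each deformable group to its input channels and offset channels.

    Ranges are half-open (start, stop) pairs.
    """
    if in_channels % num_deformable_group != 0:
        raise ValueError("in_channels must be divisible by num_deformable_group")
    per_group = in_channels // num_deformable_group
    block = 2 * kernel[0] * kernel[1]  # offset channels owned by one group
    layout = []
    for g in range(num_deformable_group):
        layout.append({
            "input_channels": (g * per_group, (g + 1) * per_group),
            "offset_channels": (g * block, (g + 1) * block),
        })
    return layout


for entry in group_layout(512, kernel=(3, 3), num_deformable_group=4):
    print(entry)
```

With num_deformable_group=1 every input channel shares one sampling pattern. With 4 groups, four patterns are learned, which is why the offset layer grows from 18 to 72 channels for a 3x3 kernel.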

RFCN-DCN with Soft-NMS by bharatsingh430

Hi all,
You mentioned the third-party implementation by bharatsingh430 in your README.
I cannot ask my question there (issues disabled?), so I am trying here.
I think the model of bharatsingh430 is not compatible with your demo. When I replace the model with 'rfcn_dcn_coco-0008.params' downloaded from here, I get the following error:

Check failed: from.shape() == to->shape() operands shape mismatchfrom.shape = (60,) to.shape=(48,)

Any ideas?
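
One possible reading of the numbers, offered as a guess and not a confirmed diagnosis: if the mismatched array belongs to the RPN box-regression layer, its length would be 4 values per anchor. The demo configuration shown earlier on this page uses 4 anchor scales and 3 ratios (NUM_ANCHORS 12), which gives 48. A length of 60 would correspond to 15 anchors. The helper name and the five-scale list below are hypothetical; the actual settings of the third-party model are not stated here.

```python
def rpn_bbox_pred_size(anchor_scales, anchor_ratios):
    """Length of an RPN box-regression output: 4 coordinates per anchor."""
    return 4 * len(anchor_scales) * len(anchor_ratios)


# Demo configuration from the dump above: 12 anchors.
print(rpn_bbox_pred_size([4, 8, 16, 32], [0.5, 1, 2]))  # 48

# Hypothetical configuration with 15 anchors (illustrative values only).
print(rpn_bbox_pred_size([2, 4, 8, 16, 32], [0.5, 1, 2]))  # 60
```

If that is the cause, the demo's anchor settings would need to match the ones the downloaded parameters were trained with.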
