
ron's Introduction

RON: Reverse Connection with Objectness Prior Networks for Object Detection

RON is a state-of-the-art, efficient framework for visual object detection. The code is modified from py-faster-rcnn. You can use it to train and evaluate networks for object detection tasks. For more details, please refer to our CVPR paper.

Note: there is also a TensorFlow re-implementation of RON at RON_Tensorflow, thanks to @HiKapok!

Citing RON

If you find RON useful in your research, please consider citing:

@inproceedings{KongtCVPR2017,
    Author = {Tao Kong and Fuchun Sun and Anbang Yao and Huaping Liu and Ming Lu and Yurong Chen},
    Title = {RON: Reverse Connection with Objectness Prior Networks for Object Detection},
    Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
    Year = {2017}
}

PASCAL VOC detection results

Method        VOC 2007 mAP  VOC 2012 mAP  Input resolution
Fast R-CNN    70.0%         68.4%         1000*600
Faster R-CNN  73.2%         70.4%         1000*600
SSD300        72.1%         70.3%         300*300
SSD500        75.1%         73.1%         500*500
RON320        74.2%         71.7%         320*320
RON384        75.4%         73.0%         384*384

MS COCO detection results

Method        Training data  AP(0.50-0.95)  Input resolution
Faster R-CNN  trainval       21.9%          1000*600
SSD500        trainval35k    24.4%          500*500
RON320        trainval       23.6%          320*320
RON384        trainval       25.4%          384*384

Note: SSD300 and SSD500 are the original models from the SSD project.

RON Installation

  1. Clone the RON repository

    git clone https://github.com/taokong/RON.git
    
    
  2. Build Caffe and pycaffe

    cd $RON_ROOT/
    git clone https://github.com/taokong/caffe-ron.git
    cd caffe-ron
    make -j8 && make pycaffe
    # Note: this version uses cuDNN for efficiency, so make sure that "USE_CUDNN := 1" is set in Makefile.config.
    
  3. Build the Cython modules

    cd $RON_ROOT/lib
    make
    
  4. Set up the PASCAL VOC dataset for training and testing

    4.0 The PASCAL VOC dataset has the basic structure:

     $VOCdevkit/                           # development kit
     $VOCdevkit/VOCcode/                   # VOC utility code
     $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    

    4.1 Create symlinks for the PASCAL VOC dataset

     cd $RON_ROOT/data
     ln -s $VOCdevkit VOCdevkit2007
     ln -s $VOCdevkit VOCdevkit2012
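
Before running the test or training scripts, it can help to confirm that the symlinks resolve to the layout described above. The following is a minimal sketch in Python, not part of the RON repository: the helper name `missing_voc_paths` is hypothetical, and it only assumes that a `VOC<year>` folder sits inside the linked dev kit under `data/`.

```python
import os


def missing_voc_paths(ron_root, year="2007"):
    """Return the expected VOC directories under ron_root/data that are absent.

    Hypothetical helper: it assumes the layout data/VOCdevkit<year>/VOC<year>.
    os.path.isdir follows symlinks, so a dangling link is reported as missing.
    """
    devkit = os.path.join(ron_root, "data", "VOCdevkit" + year)
    expected = [devkit, os.path.join(devkit, "VOC" + year)]
    return [path for path in expected if not os.path.isdir(path)]


if __name__ == "__main__":
    # RON_ROOT is read from the environment here; adjust to your checkout.
    root = os.environ.get("RON_ROOT", ".")
    for path in missing_voc_paths(root):
        print("missing:", path)
```

An empty result means both directories were found; anything printed points at a link that is absent or broken.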
    
  5. Test with PASCAL VOC dataset

    We provide two models for testing on the PASCAL VOC 2007 test set. To run them, download the pretrained RON models manually from BaiduYun (or Google Drive) and put them under $data/RON_models.

    5.0 The original model as introduced in the RON paper:

     ./test_voc07.sh
     # The final result of the model should be 74.2% mAP.
    

    5.1 A lite model with some optimizations over the original one:

     ./test_voc07_reduced.sh
     # The final result of the model should be 74.1% mAP.
    
  6. Train with PASCAL VOC dataset

   Please download the ImageNet-pre-trained VGG models manually from BaiduYun (or Google Drive) and put them into $data/ImageNet_models. After that, you can train your own model.

6.0 The original model as introduced in the RON paper:

    ./train_voc.sh
    
6.1 A lite model with some optimizations over the original one:

    ./train_voc_reduced.sh


ron's Issues

Some questions about paper

Hey, many thanks for your great work. I have some questions about the paper, if you don't mind.

  1. For each scale of feature maps, there is a separate classifier and regressor to get class-specific scores and bounding box regression. So for four scales, there are four classifiers and regressors. This might lead to repeated computation. I wonder if these operations on different scales can be merged in some way.
  2. I find that the objectness prior is much like an RPN (region proposal network). The only difference is that the objectness prior only produces a score, without the bbox regression that RPN includes. I wonder if I am wrong. Please give me some tips about the differences.
  3. For the last classifier and regressor, one uses two convs while the other uses two inceptions. I wonder why you chose them.
    Thanks again, and sorry for the disturbance.

CUDNN error

Thanks for your helpful work! When I tried to run the code following your guide, I got the following error:
F1119 15:00:44.842713 16912 cudnn.hpp:96] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM.

Could you please tell me how to deal with this? Thank you in advance.

Training crashed

When I train the model, it crashes.

F0718 15:20:34.923629 16833 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
./train_voc_reduced.sh: line 7: 16833 Aborted (core dumped) python tools/train_net.py --gpu 0 --solver models/pascalvoc/VGG16-REDUCED/solver.prototxt --imdb voc_2007_trainval --weights data/ImageNet_models/VGG_ILSVRC_16_layers_fc_reduced.caffemodel --batchsize 64 --iters 4000

root$ nvidia-smi
Tue Jul 18 15:29:09 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66 Driver Version: 375.66 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 0000:06:00.0 Off | 0 |
| N/A 69C P0 57W / 149W | 0MiB / 11439MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K80 Off | 0000:07:00.0 Off | 0 |
| N/A 51C P0 71W / 149W | 0MiB / 11439MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla K80 Off | 0000:83:00.0 Off | 0 |
| N/A 62C P0 58W / 149W | 0MiB / 11439MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla K80 Off | 0000:84:00.0 Off | 0 |
| N/A 48C P0 72W / 149W | 0MiB / 11439MiB | 99% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+

Training is too slow

Hello,
when I train on my data, the process is too slow. It takes too much time.

Could you give me some advice?

Thank you very much.

Set the batch size

Due to hardware limitations, I need to use a smaller batch size. How does RON set this parameter? I cannot find it anywhere. Thanks.

What is the normal train loss

Hi @taokong, great work!
I want to reproduce your experiment, and I have three small questions:

  1. What is the normal training loss when the model converges well?
    Note: when I train, the loss is always 4.0+ before 80k iterations, and testing at 80k iterations only gives mAP = 0.54 (training data is 2007trainval + 2013trainval).
  2. I also notice that I can run the test on a Pascal Titan X, but get a CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED error on a Titan X.
  3. Training takes a lot of time. Can you give me some advice?
    Thanks in advance!

demo

I tested the demo and found that many boxes are regressed for the same object...

Changing alpha and beta

Hi,
thank you so much for the code. Can you tell me how I can change alpha and beta from equation (2)? This is to change the loss function.
Thanks

Two questions about the code

@taokong
@taokongcn

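  1. In test.py and in the paper, the following is stated:

  scores = np.tile(scores[:, 0], (imdb.num_classes, 1)).transpose() * scores

  This amounts to multiplying the classification scores (21 classes) by the probability that the box contains an object. Why is this form of score used? Why is this processing needed?

  2. In anchor_target_layer.py and det_target_layer.py:

        if len(fg_inds) > 0:
            num_bg = len(fg_inds) * (1.0 - cfg.TRAIN.FG_FRACTION) / (cfg.TRAIN.FG_FRACTION)
        else:
            num_bg = self._batch

        bg_inds = np.where(all_labels == 0)[0]
        if len(bg_inds) > num_bg:
            disable_inds = npr.choice(bg_inds, size=int(len(bg_inds) - num_bg), replace=False)
            all_labels[disable_inds] = -1

  It seems that as long as positive samples exist, the batch_size parameter has no effect. This only guarantees a positive-to-negative ratio of 1:3 and does not consider the total number of positive and negative samples (it may exceed 256/512). Is this understanding correct?

  Thanks for your trouble!!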
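
The two snippets quoted in this issue can be tried in isolation. The following is a minimal NumPy sketch, not the repository's code: `weight_scores` and `subsample_background` are hypothetical names, foreground is assumed to mean `label > 0`, and any capping of foreground samples that the real layers may do elsewhere is left out.

```python
import numpy as np


def weight_scores(scores):
    """Multiply every column of `scores` by its column 0, as the quoted line does.

    The tile/transpose form and plain broadcasting give the same result.
    """
    num_classes = scores.shape[1]
    tiled = np.tile(scores[:, 0], (num_classes, 1)).transpose() * scores
    broadcast = scores[:, [0]] * scores
    assert np.allclose(tiled, broadcast)
    return broadcast


def subsample_background(labels, fg_fraction, batch, rng):
    """Disable surplus background labels (set to -1), mirroring the quoted logic.

    Assumption: foreground means label > 0; `batch` is only used when
    there is no foreground sample at all.
    """
    labels = labels.copy()
    fg_inds = np.where(labels > 0)[0]
    if len(fg_inds) > 0:
        num_bg = len(fg_inds) * (1.0 - fg_fraction) / fg_fraction
    else:
        num_bg = batch
    bg_inds = np.where(labels == 0)[0]
    if len(bg_inds) > num_bg:
        disable = rng.choice(bg_inds, size=int(len(bg_inds) - num_bg), replace=False)
        labels[disable] = -1
    return labels


# 100 foreground and 1000 background samples, FG_FRACTION = 0.25, batch = 256.
labels = np.concatenate([np.ones(100, dtype=int), np.zeros(1000, dtype=int)])
kept = subsample_background(labels, 0.25, 256, np.random.RandomState(0))
print((kept > 0).sum(), (kept == 0).sum())
```

Under these assumptions the example keeps 100 foreground and 300 background samples, 400 in total, which is above the nominal batch of 256. That is consistent with the reading in the question as far as the quoted lines alone are concerned.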

The demo result is not good

@taokong

I have tried the model (RON320_VOC0712_VOC07.caffemodel) you provided.

But the localization of the bounding boxes is not good.

For example, here are the Faster R-CNN demo images.

test

test

Any idea?

Train RON on KITTI Blocked in pythonlayer::forward

@taokong Hi, Professor Kong. I recently adapted your RON project to train a detector on KITTI. When I start training, the output in the PyCharm run window stops after 100 iterations, and GPU-Util drops to nearly 0% and stays there.
I attached gdb to the PID of train_net.py and got the following backtrace:
#0 0x000000000052b3bf in ?? ()
#1 0x00000000004c8061 in PyEval_EvalFrameEx ()
#2 0x00000000004cfedc in PyEval_EvalCodeEx ()
#3 0x00000000004c8314 in PyEval_EvalFrameEx ()
#4 0x00000000004c8762 in PyEval_EvalFrameEx ()
#5 0x00000000004c8762 in PyEval_EvalFrameEx ()
#6 0x00000000004c8762 in PyEval_EvalFrameEx ()
#7 0x00000000004c8762 in PyEval_EvalFrameEx ()
#8 0x00000000004704ea in ?? ()
#9 0x00000000004d8194 in ?? ()
#10 0x00000000004d40fb in PyEval_CallObjectWithKeywords ()
#11 0x0000000000467c68 in PyEval_CallFunction ()
#12 0x00007fd32612a485 in caffe::PythonLayer::Forward_cpu(std::vector<caffe::Blob, std::allocator<caffe::Blob> > const&, std::vector<caffe::Blob, std::allocator<caffe::Blob> > const&) ()
What should I do now?

What's the meaning of the numbers in the rpn-data "param_str"

Hi,
I have recently been reading your RON source code, and I have a question about the file "traincudnn.prototxt". What is the meaning of param_str: "'stride_scale_border_batchsize': 64,7,32,256" in rpn-data_7, and of the corresponding param_str in the other layers at the same position? I want to know the meaning of each number in the param.

Thanks, and sorry for the disturbance.
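
One literal reading of the quoted string is that the key names the fields in order, so each number pairs with one underscore-separated word. The sketch below only illustrates that reading. It is an assumption based on the key name, not a confirmed description of how RON's Python layers interpret the values, and `parse_param_str` is a hypothetical helper.

```python
def parse_param_str(param_str):
    """Pair each underscore-separated name in the key with one integer value.

    Hypothetical helper: it assumes the key lists the field names in the
    same order as the comma-separated numbers that follow the colon.
    """
    key, _, values = param_str.partition(":")
    names = key.strip().strip("'\"").split("_")
    numbers = [int(v) for v in values.strip().split(",")]
    if len(names) != len(numbers):
        raise ValueError("field names and values do not line up")
    return dict(zip(names, numbers))


fields = parse_param_str("'stride_scale_border_batchsize': 64,7,32,256")
print(fields)
```

Under this reading, the layer quoted in the question would use stride 64, scale 7, border 32, and batch size 256.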

train.prototxt

@taokong HI

I have the following question about train.prototxt:

layer {
  name: "rpn_lrn7"
  type: "BatchNorm"
  bottom: "rpn_conv7"
  top: "rpn_lrn7"
}

For BatchNorm layers, many examples online and in other papers are configured as follows:
layer {
  bottom: "res5c_branch2b"
  top: "res5c_branch2b"
  name: "bn5c_branch2b"
  type: "BatchNorm"
  batch_norm_param {
    use_global_stats: true
  }
  param {
    lr_mult: 0.0
    decay_mult: 0.0
  }
  param {
    lr_mult: 0.0
    decay_mult: 0.0
  }
  param {
    lr_mult: 0.0
    decay_mult: 0.0
  }
}
 1) Why are these 3 default params missing here?
 2) Isn't use_global_stats supposed to be false during training? Why is it true in so many examples?

(Originally written in Chinese for precision.) Thanks for your trouble!

cuda/cudnn version

Hi,
What CUDA and cuDNN versions did you use? I tried compiling the code with CUDA 9 / cuDNN 8 and I am getting the following errors:

src/caffe/net.cpp:8:18: fatal error: hdf5.h: No such file or directory
compilation terminated.
Makefile:575: recipe for target '.build_release/src/caffe/net.o' failed
make: *** [.build_release/src/caffe/net.o] Error 1

I have also tried older versions (CUDA 6.5 and cuDNN 5), but I still can't get this to build.

I would appreciate any advice.
thanks

How to save the log

Hi taokong,
when I use your code, I want to save the loss values. Can you tell me how to save the log?

How to use 384 or other size for testing?

When I use test320cudnn.prototxt, everything works.
But when I change the input_shape to 384 in test320cudnn.prototxt, I get this error:
File "/opt/yushan/RON/tools/../lib/ron_layer/det_layer.py", line 72, in forward
all_scores_det = bottom[1].data[:, 1:, :, :].reshape(self._ndim, self._numclasses - 1, self._num_anchors, self._height, self._width)
ValueError: cannot reshape array of size 5000 into shape (1,20,10,6,6)

I think the size of det_cls_prob_7 is 5000 when the input size is 320, but it should be 7500 when the input size is 384.

This is the log of det_cls_prob_7 in testing.
8690 net.cpp:434] det_cls_prob_7 <- det_cls_score_reshape_7
I1010 07:39:07.711601 8690 net.cpp:408] det_cls_prob_7 -> det_cls_prob_7
I1010 07:39:07.711802 8690 net.cpp:150] Setting up det_cls_prob_7
I1010 07:39:07.711809 8690 net.cpp:157] Top shape: 1 21 60 6 (7560)

It seems the size of det_cls_prob_7 is correct,
so I'm puzzled about where the problem is.
I only modified the input_shape. Is that right? How can I use 384 or another size for testing?
Could you help me? Thanks!
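
The numbers in the traceback and the log can be checked with a little arithmetic. The sketch below is in Python and rests on assumptions that are not confirmed by the repository: a stride of 64 for the layer-7 feature map (taken from the `stride_scale_border_batchsize` value quoted in another issue), 10 anchors per location, and 21 classes, both read off the target shape and the log above. `det_layer_sizes` is a hypothetical helper.

```python
def det_layer_sizes(input_size, stride=64, num_classes=21, num_anchors=10):
    """Return (map side, full blob size, foreground-only size) for one scale.

    All defaults are assumptions inferred from the issue text, not
    values read from the RON source.
    """
    side = input_size // stride
    full_blob = num_classes * num_anchors * side * side
    fg_only = (num_classes - 1) * num_anchors * side * side
    return side, full_blob, fg_only


for size in (320, 384):
    print(size, det_layer_sizes(size))
```

Under these assumptions, a 320 input gives a 5x5 map with 5000 foreground scores, and a 384 input gives a 6x6 map with 7560 entries in the full blob (matching the logged top shape) and 7200 foreground scores (matching the target shape 1x20x10x6x6). The error message therefore looks like a 5x5 blob being reshaped with 6x6 dimensions, which suggests that the image fed at test time may still have been 320 pixels wide even though the prototxt declared 384.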

Train and Test for input size 384

Thanks for the model. I was able to train and test the RON model for input size 320 (on a different dataset). I would like to train and test for input size 384. I was able to train for input size 384 by adding 384 to C.TRAIN.SCALES:
__C.TRAIN.SCALES = (384,320,256)

Now I want to test. Can you please let me know how I need to change "test320cudnn.prototxt" for a 384 input?

Multiple objects detection isn't so accurate

Hi,
I tried the RON VOC VGG-16 model on a test image. Though there is a 'bird' class in the PASCAL VOC dataset, the detection isn't as accurate as it should be.
Original image
The detection result
@taokong Could you also try RON on this image? Thanks!

How can I use RON when the objects in my dataset are all small?

 First of all, thanks for your great work on deep-learning-based object detection. I want to apply the RON framework to my own dataset, in which all objects are small. I kept only the rpn_lrn4 layer in order to detect small objects. I also changed num_output to my own number of classes. My dataset has only two classes, background and car. What's wrong with my changes? What else should I do?
 Looking forward to your reply! Thanks!

training time?

Thanks for your impressive work. On my workstation with a Titan X, each iteration takes about 3 seconds, so the total training time is about 100 hours (3 * 120000 seconds). Is that normal?
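
The estimate in the question can be reproduced with a one-line calculation. This is a small Python sketch with a hypothetical helper name, using only the two figures from the post (3 seconds per iteration, 120000 iterations).

```python
def training_hours(seconds_per_iter, num_iters):
    """Convert a per-iteration time and an iteration count into hours."""
    return seconds_per_iter * num_iters / 3600.0


print(training_hours(3, 120000))
```

The same helper makes it easy to see how the total scales if the per-iteration time changes, for example after reducing the batch size.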

Save Model every 20 iters

Hello. When I used your method and framework to train on PASCAL VOC, I found that it saves a caffemodel every 20 iterations, and I do not want it to save the model that frequently. I modified your solver.prototxt to use snapshot: 10000 instead of the original setting, but it did not work. Could you tell me how to modify your code to make this work?

weight_decay: 0.0005
#We disable standard caffe solver snapshotting and implement our own snapshot
#function
snapshot: 0
#We still use the snapshot prefix, though
snapshot_prefix: "RON-REDUCED"
#debug_info: true

Why is training very slow on my dataset

When I use a GTX 1080 to train on VOC2007, it is very fast: about 100 iterations in 30 minutes. But when I use the same GTX 1080 to train on my own dataset, it only completes 30 iterations over a whole night. Something seems wrong. Do you know the reason? Thank you!

How to reduce the batch size when testing?

Thanks for your great contribution, @taokong!
I have some issues when I train and test RON with another dataset:
(1) How can I use multiple GPUs for training and testing? I find that in 'train_net.py' and 'test_net.py', the GPUS argument only accepts a single int32 value.
(2) If I use only 1 GPU when testing, Caffe always seems to run out of memory with the error: status == CUDNN_STATUS_SUCCESS (8 vs 0) CUDNN_STATUS_EXECUTION_FAILED. Where can I reduce the batch size for testing?
Thank you!

BaiduYun

Hi ,
Thanks for the great code. I was wondering if you could host the VGG16_layers_fully_conv.caffemode for training (https://pan.baidu.com/s/1c2xm2U8#list/path=%2F) somewhere other than Baidu? Downloading it requires a confirmation code sent to a phone number with a Chinese area code. If this is the generic pre-trained VGG, can I download it from the model zoo?

Thanks
