This code accompanies the CVPR 2016 paper "CRAFT Objects from Images".

In short, we extend the conventional two-stage object detection framework (first locating object proposals, then classifying object categories) to a four-stage pipeline. The proposal localization task is solved with a cascade of a Region Proposal Network (RPN) and a Fast R-CNN network to improve proposal quality. The object classification task is handled by a cascade of two Fast R-CNN networks with different objective functions (one-hot classification and one-vs-rest classification) to eliminate false positives.
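
To make the two objective functions concrete, here is a sketch in generic textbook form. It is not taken from the paper, and the exact formulation used there may differ. The symbols are assumptions introduced for illustration: $K$ object categories plus background (index 0), network scores $z_k$, ground-truth label $y$, sigmoid $\sigma$, and binary targets $t_k$ with $t_k = 1$ if $y = k$ and $t_k = 0$ otherwise.

```latex
% One-hot (softmax) classification: a single (K+1)-way decision per proposal.
L_{\text{one-hot}} = -\log \frac{e^{z_y}}{\sum_{k=0}^{K} e^{z_k}}

% One-vs-rest classification: K independent binary decisions per proposal.
L_{\text{one-vs-rest}} = -\sum_{k=1}^{K} \Big[ t_k \log \sigma(z_k) + (1 - t_k) \log\big(1 - \sigma(z_k)\big) \Big]
```

In the first form the categories compete for probability mass. In the second, each category has its own binary classifier, so a detection can be rejected by the classifier for its own category without reference to the other categories.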

We name our approach "CRAFT" (short for "Cascade Rpn And FasT-rcnn") and show considerable improvement over the Fast R-CNN and Faster R-CNN baselines on the PASCAL VOC 2007/2012 and ILSVRC datasets. For more details, please refer to our CVPR 2016 paper.

The code is built on RPN (Stage 1) and Fast R-CNN (Stages 2, 3, and 4). It is easier to use if you are familiar with these two projects.

The code was tested on Ubuntu 14.04 with 256 GB of memory, a Titan X GPU, and MATLAB R2015a.

Preparation

  1. Follow the instructions in Faster R-CNN to build the code in 1_RPN, using the Caffe version provided by Shaoqing Ren.
  2. Follow the instructions in Fast R-CNN to build the code in 2_CasRPN, 3_FRCN, and 4_CasFRCN, using our slightly modified Caffe.
  3. Download the VGG16 pre-trained model and the PASCAL VOC 2012 dataset, and create links pointing to them.
  4. For convenience, you can create soft links to the caffe-fast-rcnn and data folders inside 2_CasRPN, 3_FRCN, and 4_CasFRCN so that the three stages share them.
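
Step 4 could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the function name `link_shared` and the idea of a single shared directory holding both `caffe-fast-rcnn` and `data` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: link one shared caffe-fast-rcnn build and one data folder
# into each Fast R-CNN based stage directory.
# Assumption: SHARED_DIR contains the folders caffe-fast-rcnn and data.

# Usage: link_shared SHARED_DIR CRAFT_ROOT
link_shared() {
  local shared="$1" root="$2" stage dir
  for stage in 2_CasRPN 3_FRCN 4_CasFRCN; do
    for dir in caffe-fast-rcnn data; do
      # -s: symbolic link, -f: replace an existing link,
      # -n: treat an existing link to a directory as a file, not a destination.
      # If a real directory with this name already exists, move it away first,
      # otherwise the link is created inside that directory.
      ln -sfn "$shared/$dir" "$root/$stage/$dir" || return 1
    done
  done
}
```

A call might look like `link_shared /path/to/shared "$PWD"` from the repository root, where `/path/to/shared` is a placeholder for wherever the shared folders actually live.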

Training and testing

The whole pipeline is stage-wise: each stage is trained and tested before the next one starts. Below we show how to train an object detector with the CRAFT approach on the PASCAL VOC 2012 train+val set and test it on the PASCAL VOC 2012 test set. For simplicity, we do not use joint training between the RPN and Fast R-CNN networks.

Stage 1. RPN

cd 1_RPN
matlab ./experiments/script_faster_rcnn_VOC2012_VGG16.m
matlab saveProposals.m

Stage 2. CasRPN

cd 2_CasRPN
bash train.sh
bash test.sh
matlab saveProposals.m

Stage 3. FRCN

cd 3_FRCN
bash train.sh
bash test.sh
matlab saveDetections.m

Stage 4. CasFRCN

cd 4_CasFRCN
bash train.sh
bash test.sh
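
The four stages above can be chained in one driver script, sketched below under stated assumptions. The helper names (`run_cmd`, `run_matlab`, `craft_pipeline`) and the `DRY_RUN` variable are hypothetical and not part of the repository. How the `.m` scripts are launched is also an assumption: the sketch uses MATLAB's `-nodisplay`, `-nosplash`, and `-r` startup options with `run(...)`, which may need adjusting if a script depends on the current folder.

```shell
#!/usr/bin/env bash
# Sketch: run the four CRAFT stages in order from the repository root.
# Assumption: `matlab` is on PATH. Set DRY_RUN=1 to print commands only.

run_cmd() {
  # Echo the command, then execute it unless DRY_RUN is set to 1.
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then "$@"; fi
}

run_matlab() {
  # Start MATLAB without a desktop, run one script, then quit.
  run_cmd matlab -nodisplay -nosplash -r "run('$1'); exit"
}

craft_pipeline() {
  # Each stage runs in a subshell so `cd` does not leak into the next stage;
  # `&&` stops the pipeline at the first failing command.
  ( cd 1_RPN && run_matlab ./experiments/script_faster_rcnn_VOC2012_VGG16.m \
      && run_matlab saveProposals.m ) &&
  ( cd 2_CasRPN && run_cmd bash train.sh && run_cmd bash test.sh \
      && run_matlab saveProposals.m ) &&
  ( cd 3_FRCN && run_cmd bash train.sh && run_cmd bash test.sh \
      && run_matlab saveDetections.m ) &&
  ( cd 4_CasFRCN && run_cmd bash train.sh && run_cmd bash test.sh )
}
```

Running `DRY_RUN=1 craft_pipeline` first lists the commands without executing them, which is a cheap way to check the stage order before committing GPU time.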

Results

| method        | training data                     | test data     | mAP   |
| ------------- |:---------------------------------:|:-------------:|:-----:|
| CRAFT, VGG-16 | VOC 2007 trainval + 2012 trainval | VOC 2007 test | 75.7% |
| CRAFT, VGG-16 | VOC 2012 trainval                 | VOC 2012 test | 71.3% |

Note: The mAP you obtain may differ slightly from the results above, which are those reported in the paper. Joint training between RPN and Fast R-CNN is currently not used.

Reference

If you use our code in your research, please cite the paper:

@inproceedings{binyang16craft,
  title={{CRAFT} Objects from Images},
  author={Yang, Bin and Yan, Junjie and Lei, Zhen and Li, Stan},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2016}
}

Acknowledgement

We sincerely thank the following people, groups, and institutions:

  • Anonymous reviewers
  • Ross Girshick for the Fast R-CNN project
  • Shaoqing Ren for the Faster R-CNN project
  • Caffe team
  • VGG team
  • SenseTime Group Limited
  • NVIDIA Corporation
