This project forked from xvjiarui/gcnet

GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

License: Apache License 2.0

GCNet for Object Detection

By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu.

This repo is the official implementation of "GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond" for COCO object detection, based on open-mmlab's mmdetection. The core operator, the GC block, can be found here. Many thanks to mmdetection for their simple and clean framework.

Update on 2020/12/07

The extension of GCNet was accepted by TPAMI (PDF).

Update on 2019/10/28

GCNet won the Best Paper Award at ICCV 2019 Neural Architects Workshop!

Update on 2019/07/01

The code has been refactored. More results are provided, and all configs can be found in configs/gcnet.

Notes: Both the official PyTorch SyncBN and Apex SyncBN have some stability issues. During training, mAP may drop to zero and then return to normal during the last few epochs.

Update on 2019/06/03

GCNet is supported by the official mmdetection repo here. Thanks again to open-mmlab for their work on open source projects.

Introduction

GCNet was initially described on arXiv. By absorbing the advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast and effective approach for global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.
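
As a rough illustration of the global context modeling described above, the NumPy sketch below follows the three-step structure from the paper: attention-pooled context modeling, a bottleneck transform, and broadcast addition. It is not the code from this repo. The function name `gc_block`, the weight names, the single-image (C, H, W) layout, and the omission of biases and LayerNorm affine parameters are all assumptions made to keep the example short.

```python
import numpy as np


def gc_block(x, w_k, w_v1, w_v2, eps=1e-5):
    """Sketch of a GC block on one feature map.

    x:    (C, H, W) input features
    w_k:  (C,)      weights of the 1x1 conv that scores each position
    w_v1: (C//r, C) first 1x1 conv of the bottleneck transform
    w_v2: (C, C//r) second 1x1 conv of the bottleneck transform
    """
    c, h, w = x.shape
    feats = x.reshape(c, h * w)              # (C, HW)

    # 1) Context modeling: softmax attention over all positions,
    #    then a weighted sum of features (one global context vector).
    logits = w_k @ feats                     # (HW,)
    logits = logits - logits.max()           # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum()
    context = feats @ attn                   # (C,)

    # 2) Transform: bottleneck with LayerNorm (no affine here) and ReLU.
    hidden = w_v1 @ context                  # (C//r,)
    hidden = (hidden - hidden.mean()) / np.sqrt(hidden.var() + eps)
    hidden = np.maximum(hidden, 0.0)
    delta = w_v2 @ hidden                    # (C,)

    # 3) Fusion: add the same context term to every position.
    return x + delta[:, None, None]


rng = np.random.default_rng(0)
channels, ratio = 16, 4                      # illustrative sizes only
x = rng.standard_normal((channels, 5, 7))
w_k = rng.standard_normal(channels)
w_v1 = rng.standard_normal((channels // ratio, channels))
w_v2 = rng.standard_normal((channels, channels // ratio))
out = gc_block(x, w_k, w_v1, w_v2)
print(out.shape)
```

Because the context vector is shared by all positions, the block adds one per-channel offset to the whole feature map, which keeps the cost close to that of an SE block while keeping the global attention pooling of a non-local block.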

Citing GCNet

@article{cao2019GCNet,
  title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
  author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
  journal={arXiv preprint arXiv:1904.11492},
  year={2019}
}

Main Results

Results on R50-FPN with backbone (fixBN)

| Backbone | Model | Backbone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | - | 1x | 3.9 | 0.453 | 10.6 | 37.3 | 34.2 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 4.5 | 0.533 | 10.1 | 38.5 | 35.1 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 4.6 | 0.533 | 9.9 | 38.9 | 35.5 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | - | 2x | - | - | - | 38.2 | 34.9 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 2x | - | - | - | 39.7 | 36.1 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 2x | - | - | - | 40.0 | 36.2 | model |

Results on R50-FPN with backbone (syncBN)

| Backbone | Model | Backbone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 3.9 | 0.543 | 10.2 | 37.2 | 33.8 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 4.5 | 0.547 | 9.9 | 39.4 | 35.7 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 4.6 | 0.603 | 9.4 | 39.9 | 36.2 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 2x | 3.9 | 0.543 | 10.2 | 37.7 | 34.3 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 2x | 4.5 | 0.547 | 9.9 | 39.7 | 36.0 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 2x | 4.6 | 0.603 | 9.4 | 40.2 | 36.3 | model |
| R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | - | 1x | - | - | - | 38.8 | 34.6 | model |
| R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | GC(c3-c5, r16) | 1x | - | - | - | 41.0 | 36.5 | model |
| R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | GC(c3-c5, r4) | 1x | - | - | - | 41.4 | 37.0 | model |

Results on stronger backbones

| Backbone | Model | Backbone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R101-FPN | Mask | fixBN | 2fc (w/o BN) | - | 1x | 5.8 | 0.571 | 9.5 | 39.4 | 35.9 | model |
| R101-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 7.0 | 0.731 | 8.6 | 40.8 | 37.0 | model |
| R101-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 7.1 | 0.747 | 8.6 | 40.8 | 36.9 | model |
| R101-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 5.8 | 0.665 | 9.2 | 39.8 | 36.0 | model |
| R101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 7.0 | 0.778 | 9.0 | 41.1 | 37.4 | model |
| R101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 7.1 | 0.786 | 8.9 | 41.7 | 37.6 | model |
| X101-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 7.1 | 0.912 | 8.5 | 41.2 | 37.3 | model |
| X101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 8.2 | 1.055 | 7.7 | 42.4 | 38.0 | model |
| X101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 8.3 | 1.037 | 7.6 | 42.9 | 38.5 | model |
| X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | - | 1x | - | - | - | 44.7 | 38.3 | model |
| X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | - | - | - | 45.9 | 39.3 | model |
| X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | - | - | - | 46.5 | 39.7 | model |
| X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | - | 1x | - | - | - | 47.1 | 40.4 | model |
| X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | - | - | - | 47.9 | 40.9 | model |
| X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | - | - | - | 47.9 | 40.8 | model |
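
To make the tables easier to compare, the short script below computes the box AP and mask AP gains of the GC variants over the matching baseline. The numbers are copied from the 1x, Mask, 2fc (w/o BN) rows above; the dictionary layout and the helper name `ap_gains` are only illustrative choices.

```python
# (box AP, mask AP) pairs copied from the 1x, Mask, 2fc (w/o BN) rows above.
results = {
    "R50-FPN fixBN": {"baseline": (37.3, 34.2), "GC r16": (38.5, 35.1), "GC r4": (38.9, 35.5)},
    "R50-FPN SyncBN": {"baseline": (37.2, 33.8), "GC r16": (39.4, 35.7), "GC r4": (39.9, 36.2)},
    "R101-FPN SyncBN": {"baseline": (39.8, 36.0), "GC r16": (41.1, 37.4), "GC r4": (41.7, 37.6)},
    "X101-FPN SyncBN": {"baseline": (41.2, 37.3), "GC r16": (42.4, 38.0), "GC r4": (42.9, 38.5)},
}


def ap_gains(rows):
    """Return {variant: (box AP gain, mask AP gain)} relative to the baseline."""
    base_box, base_mask = rows["baseline"]
    return {
        name: (round(box - base_box, 1), round(mask - base_mask, 1))
        for name, (box, mask) in rows.items()
        if name != "baseline"
    }


gains = {setting: ap_gains(rows) for setting, rows in results.items()}
for setting, per_variant in gains.items():
    for variant, (d_box, d_mask) in per_variant.items():
        print(f"{setting:16s} {variant:7s} box AP +{d_box:.1f}  mask AP +{d_mask:.1f}")
```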

Notes

  • GC denotes that a Global Context (GC) block is inserted after the 1x1 conv of the backbone.
  • DCN denotes replacing the 3x3 conv with a 3x3 Deformable Convolution in the c3-c5 stages of the backbone.
  • r4 and r16 denote ratio 4 and ratio 16 in the GC block, respectively.
  • Some of the models are trained on 4 GPUs with 4 images on each GPU.
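
To give a feel for what the ratio in the notes above controls, here is a small back-of-the-envelope calculation. It assumes the block layout from the paper: one 1x1 conv (C to 1) for attention, plus a bottleneck of two 1x1 convs with hidden width C / ratio. Biases and normalization parameters are ignored, and the channel width of 512 is an illustrative value, not one read from this repo's configs.

```python
def gc_block_weight_count(channels, ratio):
    """Approximate conv weight count of one GC block (assumed layout, no biases)."""
    hidden = channels // ratio
    context_conv = channels                             # 1x1 conv: C -> 1
    transform = channels * hidden + hidden * channels   # C -> C/r -> C
    return context_conv + transform


for ratio in (4, 16):
    count = gc_block_weight_count(512, ratio)
    print(f"ratio {ratio:2d}: {count} weights")
```

Under these assumptions a smaller ratio means a wider bottleneck and therefore more weights, which is consistent with the slightly higher memory use reported for the r4 rows.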

Requirements

  • Linux (tested on Ubuntu 16.04)
  • Python 3.6+
  • PyTorch 1.1.0
  • Cython
  • apex (Sync BN)
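
A quick way to see which of the requirements above are already present is a small check like the following. This is only a convenience sketch, not part of the repo: the helper name `check_requirements` is hypothetical, it only tests whether the modules can be found, and it does not verify that the installed PyTorch is version 1.1.0 or that apex was built with its CUDA extensions.

```python
import importlib.util
import sys

# Import names assumed for the listed requirements.
REQUIRED_MODULES = ("torch", "Cython", "apex")


def check_requirements(modules=REQUIRED_MODULES, min_python=(3, 6)):
    """Return {name: bool} saying whether each requirement looks available."""
    report = {"python": sys.version_info[:2] >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```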

Install

a. Install PyTorch 1.1 and torchvision following the official instructions.

b. Install the latest apex with CUDA and C++ extensions following these instructions. The Sync BN implemented by apex is required.

c. Clone the GCNet repository.

git clone https://github.com/xvjiarui/GCNet.git

d. Compile cuda extensions.

cd GCNet
pip install cython  # or "conda install cython" if you prefer conda
./compile.sh  # or "PYTHON=python3 ./compile.sh" if you use system python3 without virtual environments

e. Install the GCNet version of mmdetection (other dependencies will be installed automatically).

python(3) setup.py install  # add --user if you want to install it locally
# or "pip install ."

Note: You need to run the last step each time you pull updates from GitHub. Alternatively, run python(3) setup.py develop or pip install -e . to install mmdetection if you want to modify it frequently.

Please refer to the mmdetection install instructions for more details.

Environment

Hardware

  • 8 NVIDIA Tesla V100 GPUs
  • Intel Xeon 4114 CPU @ 2.20GHz

Software environment

  • Python 3.6.7
  • PyTorch 1.1.0
  • CUDA 9.0
  • CUDNN 7.0
  • NCCL 2.3.5

Usage

Train

As in the original mmdetection, distributed training is recommended for both single-machine and multi-machine setups.

./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]

Supported arguments are:

  • --validate: perform evaluation every k (default=1) epochs during training.
  • --work_dir <WORK_DIR>: if specified, overrides the path in the config file.
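
If training is launched from a Python script rather than typed by hand, the command and the two arguments listed above can be assembled as in the sketch below. The helper name `build_train_command` and the work directory `work_dirs/gcnet_example` are hypothetical; `<CONFIG_FILE>` is left as the same placeholder used above and must be replaced with a real config path.

```python
import shlex


def build_train_command(config_file, gpu_num, validate=False, work_dir=None):
    """Assemble the dist_train.sh argument list from the options described above."""
    cmd = ["./tools/dist_train.sh", config_file, str(gpu_num)]
    if validate:
        cmd.append("--validate")
    if work_dir is not None:
        cmd += ["--work_dir", work_dir]
    return cmd


cmd = build_train_command(
    "<CONFIG_FILE>",                     # placeholder, as in the usage line above
    8,
    validate=True,
    work_dir="work_dirs/gcnet_example",  # hypothetical output directory
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the placeholder is filled in.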

Evaluation

An output file is required to evaluate trained models.

python tools/test.py <CONFIG_FILE> <MODEL_PATH> [optional arguments]

Supported arguments are:

  • --gpus: number of GPUs used for evaluation
  • --out: output file name, usually ending with .pkl
  • --eval: type of evaluation needed; for Mask R-CNN, bbox segm evaluates both bounding box and mask AP.
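
Since the file written by --out is a pickle, it can be inspected with the standard library, as sketched below. The exact layout of the file is an assumption here (a list with one entry per test image), so the example only reports the length and the type of the first entry. The helper name `summarize_results` is hypothetical, and the data written in the demo is a synthetic stand-in, not real detector output. Only unpickle files you trust.

```python
import pickle
import tempfile
from pathlib import Path


def summarize_results(pkl_path):
    """Load a results pickle and report basic facts without assuming its schema."""
    with open(pkl_path, "rb") as f:
        results = pickle.load(f)
    first = results[0] if results else None
    return {"num_entries": len(results), "first_entry_type": type(first).__name__}


# Synthetic stand-in for a real output file: two "images", each with an
# empty (bbox, segm) pair. Replace the path with your own --out file.
fake_results = [([], []), ([], [])]
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "results.pkl"
    with open(path, "wb") as f:
        pickle.dump(fake_results, f)
    summary = summarize_results(path)

print(summary)
```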
