This project forked from pengboxiangshang/siamrpn_plus_plus_pytorch

SiamRPN, SiamRPN++, unofficial implementation of "SiamRPN++" (CVPR2019), multi-GPUs, LMDB.

SiamRPN++_PyTorch

This is an unofficial PyTorch implementation of SiamRPN++ (CVPR2019), implemented by Peng Xu and Jin Feng. Training can run on multiple GPUs and uses the LMDB data format to speed up data loading.

This project is designed with these goals:

  • Training on ILSVRC2015_VID dataset.
  • Training on GOT-10k dataset.
  • Training on YouTube-BoundingBoxes dataset.
  • Evaluating performance on tracking benchmarks.

Details of SiamRPN++ Network

As stated in the original paper, the SiamRPN++ network has three parts: backbone networks, SiamRPN blocks, and weighted fusion layers.

1. Backbone Network (modified ResNet-50)

As stated in the original paper, SiamRPN++ uses ResNet-50 as its backbone, modifying the strides and adding dilated convolutions in the conv4 and conv5 blocks. The following table compares the original ResNet-50 with the SiamRPN++ ResNet-50 backbone in detail.

                                    bottleneck in conv4          bottleneck in conv5
                                    conv1x1  conv3x3  conv1x1    conv1x1  conv3x3  conv1x1
original ResNet-50      stride      1        2        1          1        2        1
                        padding     0        1        0          0        1        0
                        dilation    1        1        1          1        1        1
ResNet-50 in SiamRPN++  stride      1        1        1          1        1        1
                        padding     0        2        0          0        4        0
                        dilation    1        2        1          1        4        1
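
To see why these settings matter, the short sketch below applies the standard convolution output-size formula to the 3x3 convolutions in the table. It is only an illustration, not code from this repository: the function names and the 32-pixel input width are hypothetical, and it models one strided 3x3 convolution per stage, as the table does.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output size of a convolution along one spatial axis."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


# (stride, padding, dilation) of the 3x3 conv in each stage, copied from the table.
ORIGINAL = {"conv4": (2, 1, 1), "conv5": (2, 1, 1)}
SIAMRPN_PP = {"conv4": (1, 2, 2), "conv5": (1, 4, 4)}


def trace(size, config):
    """Feature-map width after the 3x3 conv of conv4, then of conv5."""
    sizes = {}
    for stage in ("conv4", "conv5"):
        stride, padding, dilation = config[stage]
        size = conv_out_size(size, 3, stride, padding, dilation)
        sizes[stage] = size
    return sizes


INPUT_SIZE = 32  # hypothetical width of the feature map entering conv4
original_sizes = trace(INPUT_SIZE, ORIGINAL)
siamrpn_sizes = trace(INPUT_SIZE, SIAMRPN_PP)
print("original ResNet-50: {}".format(original_sizes))
print("SiamRPN++ ResNet-50: {}".format(siamrpn_sizes))
```

With stride 1 and padding equal to the dilation, the 3x3 convolution keeps the spatial size unchanged, so conv4 and conv5 stay at the resolution of their input, while the original stride-2 settings halve it at each stage.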

2. SiamRPN Block

Based on our understanding of the original paper, we illustrate the architecture of the Siamese RPN block in the following figure.

The following table lists the detailed configuration of each layer in the RPN block. See ./network/RPN.py for more details.

component                          configuration
adj_1 / adj_2 / adj_3 / adj_4      conv2d(256, 256, ksize=3, pad=1, stride=1), BN2d(256)
fusion_module_1 / fusion_module_2  conv2d(256, 256, ksize=1, pad=0, stride=1), BN2d(256), ReLU
box head                           conv2d(256, 4*5, ksize=1, pad=0, stride=1)
cls head                           conv2d(256, 2*5, ksize=1, pad=0, stride=1)
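
As a quick sanity check on the table, the sketch below recomputes the output channels and convolution weight counts from the listed configurations. It is not taken from ./network/RPN.py. The names are hypothetical, biases and BN parameters are left out, and reading the "5" in 4*5 and 2*5 as the number of anchors per location is an assumption.

```python
# Assumption: the "5" in 4*5 and 2*5 is the number of anchors per location.
NUM_ANCHORS = 5

# (in_channels, out_channels, kernel_size), copied from the table.
LAYERS = {
    "adj": (256, 256, 3),
    "fusion_module": (256, 256, 1),
    "box head": (256, 4 * NUM_ANCHORS, 1),
    "cls head": (256, 2 * NUM_ANCHORS, 1),
}


def conv2d_weight_count(in_channels, out_channels, kernel_size):
    """Number of weights in a square 2-D convolution kernel (bias excluded)."""
    return in_channels * out_channels * kernel_size * kernel_size


weight_counts = {name: conv2d_weight_count(*spec) for name, spec in LAYERS.items()}
out_channels = {name: spec[1] for name, spec in LAYERS.items()}

for name in sorted(LAYERS):
    print("{:<14} out_channels={:<4} weights={}".format(
        name, out_channels[name], weight_counts[name]))
```

Under that reading, the box head emits 4 regression values per anchor (20 channels) and the cls head emits 2 scores per anchor (10 channels).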

3. Weighted Fusion Layer

We implement the weighted fusion layer via group convolution operations. See ./network/SiamRPN.py for details.
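
Conceptually, weighted fusion combines the outputs of several RPN branches into one output by a weighted sum. The plain-Python sketch below shows only that idea. It does not reproduce the group-convolution implementation in ./network/SiamRPN.py, the function names are hypothetical, and normalizing the weights with a softmax is an assumption rather than something this README states.

```python
import math


def softmax(values):
    """Normalize raw weights so they are positive and sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(branch_outputs, raw_weights):
    """Element-wise weighted sum of equally sized branch outputs."""
    weights = softmax(raw_weights)  # assumption: softmax-normalized weights
    length = len(branch_outputs[0])
    return [
        sum(w * branch[i] for w, branch in zip(weights, branch_outputs))
        for i in range(length)
    ]


# Three hypothetical branch outputs, flattened to short vectors.
branches = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fused = weighted_fusion(branches, [0.0, 0.0, 0.0])
print(fused)
```

With equal raw weights the fusion reduces to the element-wise mean of the branches. In a trained network the raw weights would be learnable parameters.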

Requirements

Ubuntu 14.04

Python 2.7

PyTorch 0.4.0

Other main requirements can be installed by:

# 1. Install cv2 package.
conda install opencv

# 2. Install LMDB package.
conda install lmdb

# 3. Install fire package.
pip install fire

Training Instructions

# 1. Clone this repository to your disk.
git clone https://github.com/PengBoXiangShang/SiamRPN_plus_plus_PyTorch.git

# 2. Change working directory.
cd SiamRPN_plus_plus_PyTorch

# 3. Download the training data. This project provides download and preprocessing scripts for the ILSVRC2015_VID dataset (86GB). Scripts for other tracking datasets are coming soon.
cd data
wget -c http://bvisionweb1.cs.unc.edu/ilsvrc2015/ILSVRC2015_VID.tar.gz
tar -xvf ILSVRC2015_VID.tar.gz
rm ILSVRC2015_VID.tar.gz
cd ..

# 4. Preprocess data.
chmod u+x ./preprocessing/create_dataset.sh
./preprocessing/create_dataset.sh

# 5. Pack the preprocessed data into LMDB format to accelerate data loading.
chmod u+x ./preprocessing/create_lmdb.sh
./preprocessing/create_lmdb.sh

# 6. Start the training.
chmod u+x ./train.sh
./train.sh

Acknowledgement

Many thanks to Sisi, who helped us download the huge ILSVRC2015_VID dataset.