Deformable-ConvNets(v1&v2)-caffe

Experiment Results:

Model: Faster R-CNN (ResNet-50 backbone) without OHEM and Deformable RoI Pooling
Dataset: trained on VOC 07+12, tested on VOC 07

Deformable-V1 (with dcn in stage5):

mAP@0.5    aeroplane  bicycle    bird       boat       bottle     bus        car        cat        chair      cow
0.7836     0.8004     0.8071     0.7909     0.7092     0.6297     0.8582     0.8697     0.8951     0.6366     0.8516

diningtable  dog          horse        motorbike    person       pottedplant  sheep        sofa         train        tvmonitor
0.7121       0.8822       0.8837       0.8162       0.7965       0.5449       0.7787       0.7764       0.8725       0.7613

Deformable-V2 (with mdcn in stage5):

mAP@0.5    aeroplane  bicycle    bird       boat       bottle     bus        car        cat        chair      cow
0.7872     0.8025     0.8378     0.7808     0.7019     0.6241     0.8600     0.8650     0.8937     0.6351     0.8645

diningtable  dog          horse        motorbike    person       pottedplant  sheep        sofa         train        tvmonitor
0.7366       0.8848       0.8853       0.8268       0.7977       0.5161       0.7823       0.7799       0.8785       0.7911
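
As a quick sanity check on the two tables, the overall score should be the unweighted mean of the 20 per-class AP values. The sketch below recomputes it in Python from the numbers listed above; the helper name `mean_ap` is made up for this example and is not part of the repository.

```python
# Per-class AP values copied from the two result tables (VOC 07 test set),
# in the order aeroplane ... tvmonitor.
V1_AP = [0.8004, 0.8071, 0.7909, 0.7092, 0.6297, 0.8582, 0.8697, 0.8951, 0.6366, 0.8516,
         0.7121, 0.8822, 0.8837, 0.8162, 0.7965, 0.5449, 0.7787, 0.7764, 0.8725, 0.7613]
V2_AP = [0.8025, 0.8378, 0.7808, 0.7019, 0.6241, 0.8600, 0.8650, 0.8937, 0.6351, 0.8645,
         0.7366, 0.8848, 0.8853, 0.8268, 0.7977, 0.5161, 0.7823, 0.7799, 0.8785, 0.7911]


def mean_ap(per_class_ap):
    """Return the unweighted mean of a list of per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


if __name__ == "__main__":
    print("Deformable-V1 mean AP: %.4f" % mean_ap(V1_AP))
    print("Deformable-V2 mean AP: %.4f" % mean_ap(V2_AP))
    print("V2 - V1 difference:    %.4f" % (mean_ap(V2_AP) - mean_ap(V1_AP)))
```

Both recomputed means agree with the reported overall scores to within rounding, and the V2 model comes out slightly ahead of V1 in this setting.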

Add the code to your Caffe source tree:

Move deformable_conv_layer.cpp and deformable_conv_layer.cu to yourcaffepath/src/caffe/layers/
Move modulated_deformable_conv_layer.cpp and modulated_deformable_conv_layer.cu to yourcaffepath/src/caffe/layers/
Move deformable_conv_layer.hpp and modulated_deformable_conv_layer.hpp to yourcaffepath/include/caffe/layers/
Move deformable_im2col.hpp and modulated_deformable_im2col.hpp to yourcaffepath/include/caffe/util/
Move deformable_im2col.cu and modulated_deformable_im2col.cu to yourcaffepath/src/caffe/util/
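
The file placement above can also be scripted. The following is a minimal Python sketch, not something shipped with this repository: the helper `install_sources` and the `DESTINATIONS` table are hypothetical names, the script copies instead of moving so the checkout stays intact, and it searches the checkout recursively because the README does not say where the files live.

```python
import shutil
from pathlib import Path

# Destination directory (relative to the Caffe root) for each file
# named in the steps above.
DESTINATIONS = {
    "deformable_conv_layer.cpp": "src/caffe/layers",
    "deformable_conv_layer.cu": "src/caffe/layers",
    "modulated_deformable_conv_layer.cpp": "src/caffe/layers",
    "modulated_deformable_conv_layer.cu": "src/caffe/layers",
    "deformable_conv_layer.hpp": "include/caffe/layers",
    "modulated_deformable_conv_layer.hpp": "include/caffe/layers",
    "deformable_im2col.hpp": "include/caffe/util",
    "modulated_deformable_im2col.hpp": "include/caffe/util",
    "deformable_im2col.cu": "src/caffe/util",
    "modulated_deformable_im2col.cu": "src/caffe/util",
}


def install_sources(repo_dir, caffe_root):
    """Copy every listed file from repo_dir into its place under caffe_root.

    Hypothetical helper: raises FileNotFoundError if a file is missing and
    returns the list of destination paths that were written.
    """
    repo_dir, caffe_root = Path(repo_dir), Path(caffe_root)
    copied = []
    for name, rel_dir in DESTINATIONS.items():
        matches = sorted(repo_dir.rglob(name))
        if not matches:
            raise FileNotFoundError(name)
        target_dir = caffe_root / rel_dir
        target_dir.mkdir(parents=True, exist_ok=True)
        copied.append(Path(shutil.copy2(str(matches[0]), str(target_dir / name))))
    return copied
```

A call such as `install_sources("path/to/this/repo", "yourcaffepath")` would then place all ten files; Caffe still has to be rebuilt afterwards so the new layers are compiled in.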

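Edit caffe.proto: add the two fields below to message LayerParameter, and add the two new messages that follow at the top level of the file:

optional DeformableConvolutionParameter deformable_convolution_param = 999999;
optional ModulatedDeformableConvolutionParameter modulated_deformable_convolution_param = 9999999;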
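
Field numbers must be unique within a protobuf message, must lie between 1 and 536,870,911, and must avoid the reserved range 19000 to 19999. If your caffe.proto has already been extended by other forks, it may be worth checking that 999999 and 9999999 are still free. The sketch below is a rough text scan in Python, not a real protobuf parser; both function names are invented for this example, and it assumes string defaults in the file contain no braces or semicolons.

```python
import re


def is_valid_field_number(number):
    """Check the protobuf range rules for a field number."""
    return 1 <= number <= 2 ** 29 - 1 and not 19000 <= number <= 19999


def top_level_field_numbers(proto_text, message_name):
    """Return the field numbers declared directly inside one message."""
    # Drop // and /* */ comments so braces inside them cannot confuse the scan.
    text = re.sub(r"//[^\n]*|/\*.*?\*/", "", proto_text, flags=re.DOTALL)
    start = re.search(r"\bmessage\s+%s\s*\{" % re.escape(message_name), text)
    if start is None:
        raise ValueError("message %s not found" % message_name)
    numbers, depth, statement = set(), 1, ""
    for ch in text[start.end():]:
        if ch == "{":
            depth += 1  # entering a nested enum or message
        elif ch == "}":
            depth -= 1
            statement = ""
            if depth == 0:
                break  # end of the message we were asked about
        elif depth == 1:
            statement += ch
            if ch == ";":
                # Remove [default = ...] options, then read the "= N;" tail.
                bare = re.sub(r"\[[^\]]*\]", "", statement)
                match = re.search(r"=\s*(\d+)\s*;\s*$", bare)
                if match:
                    numbers.add(int(match.group(1)))
                statement = ""
    return numbers
```

For example, reading caffe.proto into a string and calling `top_level_field_numbers(text, "LayerParameter")` gives the set of numbers already taken; if either of the two numbers above is in that set, pick another unused value.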


message DeformableConvolutionParameter {
  optional uint32 num_output = 1; 
  optional bool bias_term = 2 [default = true]; 
  repeated uint32 pad = 3; // The padding size; defaults to 0
  repeated uint32 kernel_size = 4; // The kernel size
  repeated uint32 stride = 6; // The stride; defaults to 1
  repeated uint32 dilation = 18; // The dilation; defaults to 1
  optional uint32 pad_h = 9 [default = 0]; // The padding height (2D only)
  optional uint32 pad_w = 10 [default = 0]; // The padding width (2D only)
  optional uint32 kernel_h = 11; // The kernel height (2D only)
  optional uint32 kernel_w = 12; // The kernel width (2D only)
  optional uint32 stride_h = 13; // The stride height (2D only)
  optional uint32 stride_w = 14; // The stride width (2D only)
  optional uint32 group = 5 [default = 1]; 
  optional uint32 deformable_group = 25 [default = 1]; 
  optional FillerParameter weight_filler = 7; // The filler for the weight
  optional FillerParameter bias_filler = 8; // The filler for the bias
  enum Engine {
    DEFAULT = 0;
    CAFFE = 1;
    CUDNN = 2;
  }
  optional Engine engine = 15 [default = DEFAULT];
  optional int32 axis = 16 [default = 1];
  optional bool force_nd_im2col = 17 [default = false];
}


message ModulatedDeformableConvolutionParameter {
  optional uint32 num_output = 1; 
  optional bool bias_term = 2 [default = true]; 
  repeated uint32 pad = 3; // The padding size; defaults to 0
  repeated uint32 kernel_size = 4; // The kernel size
  repeated uint32 stride = 6; // The stride; defaults to 1
  repeated uint32 dilation = 18; // The dilation; defaults to 1
  optional uint32 pad_h = 9 [default = 0]; // The padding height (2D only)
  optional uint32 pad_w = 10 [default = 0]; // The padding width (2D only)
  optional uint32 kernel_h = 11; // The kernel height (2D only)
  optional uint32 kernel_w = 12; // The kernel width (2D only)
  optional uint32 stride_h = 13; // The stride height (2D only)
  optional uint32 stride_w = 14; // The stride width (2D only)
  optional uint32 group = 5 [default = 1]; 
  optional uint32 deformable_group = 25 [default = 1]; 
  optional FillerParameter weight_filler = 7; // The filler for the weight
  optional FillerParameter bias_filler = 8; // The filler for the bias
  enum Engine {
    DEFAULT = 0;
    CAFFE = 1;
    CUDNN = 2;
  }
  optional Engine engine = 15 [default = DEFAULT];
  optional int32 axis = 16 [default = 1];
  optional bool force_nd_im2col = 17 [default = false];
}
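
Both messages expose kernel_size and deformable_group, which together determine how many channels the offset branch (and, for V2, the modulation mask branch) has to produce. In the deformable convolution formulation each sampling point gets one (y, x) offset pair, and V2 adds one modulation scalar per sampling point. The Python sketch below only does that arithmetic; the function names are invented for this example, and how this Caffe port expects the offset and mask blobs to be wired is not stated in this README, so check the layer sources before relying on it.

```python
def offset_channels(kernel_h, kernel_w, deformable_group=1):
    """Channels of the offset map: one (y, x) pair per kernel sampling point,
    repeated for every deformable group."""
    return 2 * deformable_group * kernel_h * kernel_w


def mask_channels(kernel_h, kernel_w, deformable_group=1):
    """Channels of the V2 modulation mask: one scalar per kernel sampling
    point, repeated for every deformable group."""
    return deformable_group * kernel_h * kernel_w


if __name__ == "__main__":
    # 3x3 kernel with a single deformable group
    print(offset_channels(3, 3))      # 18
    print(mask_channels(3, 3))        # 9
    # 3x3 kernel with deformable_group = 4
    print(offset_channels(3, 3, 4))   # 72
```

For a 3x3 kernel with deformable_group = 1, a V2 layer therefore needs 18 + 9 = 27 predicted channels in total, compared with 18 for V1.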

Model structure:

Deformable_ConvNet_V1 in ResNet:
[Figure: Deformable_ConvNet_V1]

Deformable_ConvNet_V2 in ResNet:
[Figure: Deformable_ConvNet_V2]

Acknowledgement:

Thanks to the official MXNet code
Thanks to unsky

Contributors:

zhanglonghao1992
