modified-cbamnet.mxnet's Introduction

We plan to release a modified architecture, implemented in MXNet, for image classification.

CBAMnet.mxnet

An MXNet implementation of the modified CBAMnet.

In this part, we implement a modified CBAMnet (CBAM ResNet 100) architecture in MXNet. The original architecture is described in the paper "CBAM: Convolutional Block Attention Module" by Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon [1], which was accepted at ECCV 2018.

Original architecture

This is an overview of a convolutional block attention module (CBAM).

Each attention sub-module is illustrated in the following diagram:

The residual building block integrated with CBAM is shown in the following figure:

We implement the modified CBAMnet based on the original CBAMnet 100 (ResNet 100 + CBAM).
In our implementation, we use 1x1 convolution layers in place of the fully connected layers in the MLP of the channel attention sub-module.
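
The substitution works because the MLP operates on globally pooled descriptors of shape C x 1 x 1, where a 1x1 convolution computes exactly the same weighted sum over channels as a fully connected layer. The following is a minimal pure-Python sketch of that equivalence and of a CBAM-style channel attention built from it, not the repository's MXNet code. The function names, the toy weights, and the ReLU hidden activation are assumptions made for illustration (the modified network described here may use PReLU instead).

```python
import math

def fully_connected(x, weight):
    """x: list of C_in floats; weight: C_out rows of C_in floats."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def conv1x1(fmap, weight):
    """fmap: nested list [C_in][H][W]; weight: [C_out][C_in] (the 1x1 kernel taps)."""
    c_in, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[[sum(weight[o][c] * fmap[c][i][j] for c in range(c_in))
              for j in range(w)]
             for i in range(h)]
            for o in range(len(weight))]

def global_pool(fmap, reduce_fn):
    """Collapse each channel's H x W values to a single number."""
    return [reduce_fn([v for row in ch for v in row]) for ch in fmap]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def channel_attention(fmap, w1, w2):
    """Sketch: sigmoid(MLP(avg_pool(F)) + MLP(max_pool(F))), MLP built from 1x1 convs."""
    def mlp(descriptor):
        t = [[[v]] for v in descriptor]                  # reshape to C x 1 x 1
        hidden = conv1x1(t, w1)                          # channel reduction
        hidden = [[[max(0.0, ch[0][0])]] for ch in hidden]  # ReLU (assumed)
        return [ch[0][0] for ch in conv1x1(hidden, w2)]  # channel expansion
    avg = mlp(global_pool(fmap, lambda vals: sum(vals) / len(vals)))
    mx = mlp(global_pool(fmap, max))
    return [sigmoid(a + m) for a, m in zip(avg, mx)]

# Toy input: 4 channels of 2 x 2, reduced to 2 hidden channels (hypothetical weights).
fmap = [[[0.1, 0.2], [0.3, 0.4]],
        [[1.0, -1.0], [0.5, 0.0]],
        [[-0.2, -0.4], [-0.6, -0.8]],
        [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, -0.25, 0.1, 0.2], [-0.3, 0.4, 0.6, -0.1]]
w2 = [[0.2, -0.5], [0.7, 0.1], [-0.4, 0.3], [0.05, 0.9]]

descriptor = global_pool(fmap, max)
fc_out = fully_connected(descriptor, w1)
conv_out_1x1 = [ch[0][0] for ch in conv1x1([[[v]] for v in descriptor], w1)]
attention = channel_attention(fmap, w1, w2)
```

Here `fc_out` and `conv_out_1x1` hold the same values, and `attention` contains one weight in (0, 1) per input channel. Expressing the MLP as convolutions lets it consume the pooled tensor directly, without a flatten or reshape step in between.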

What is the difference between the modified version and the original version?

1. The input size is 112x112 instead of 224x224. To preserve a higher feature map resolution, we follow the input setting in [2]. Specifically, the first convolution layer, with a 7x7 kernel and stride 2, is replaced by one with a 3x3 kernel and stride 1. Moreover, we remove the subsequent max pooling layer, which has a 3x3 kernel and stride 2.

2. We adopt the improved residual unit described in [2]. Specifically, the improved residual unit has the structure BN-Conv-BN-PReLU-Conv-BN, where BN denotes a batch normalization layer, PReLU is a Parametric Rectified Linear Unit activation layer, and Conv denotes a convolution layer.

3. We replace all ReLU activation layers with PReLU activation layers throughout the architecture.

4. We follow the output setting described in [2]. Specifically, we choose Option-E, with the structure BN-Dropout-FC-BN after the last convolutional layer, where Dropout denotes a dropout layer and FC denotes a fully connected layer.
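
The effect of the stem change in item 1 can be checked with the standard output-size formula for convolution and pooling layers. The sketch below is illustrative only. The padding values (3 for the 7x7 convolution, 1 for the 3x3 convolution and the 3x3 pooling) are assumptions based on common ResNet practice and are not stated above, and the helper name `out_size` is hypothetical.

```python
def out_size(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * pad - kernel) // stride + 1

# Original stem on a 224x224 input: 7x7 conv, stride 2, then 3x3 max pool, stride 2.
orig_224 = out_size(out_size(224, 7, 2, 3), 3, 2, 1)

# The same original stem applied to a 112x112 input.
orig_112 = out_size(out_size(112, 7, 2, 3), 3, 2, 1)

# Modified stem on a 112x112 input: 3x3 conv, stride 1, no max pooling.
modified_112 = out_size(112, 3, 1, 1)

print(orig_224, orig_112, modified_112)
```

Under these assumptions the original stem reduces a 112x112 input to 28x28 before the first residual stage, while the modified stem keeps it at 112x112, which is the resolution gain the change is meant to provide.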

This modified Convolutional Block Attention Module based residual network architecture can be integrated directly into the insightface library.

References

[1] Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon. "CBAM: Convolutional Block Attention Module", ECCV 2018.

[2] Jiankang Deng, Jia Guo, Stefanos Zafeiriou. "ArcFace: Additive Angular Margin Loss for Deep Face Recognition"

[3] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. "Identity Mappings in Deep Residual Networks"

PyTorch implementation

[4] Jie Hu, Li Shen and Gang Sun. "Squeeze-and-Excitation Networks"

modified-cbamnet.mxnet's People

Contributors

bruinxiong
