SKNet: Selective Kernel Networks (paper)

By Xiang Li[1,2], Wenhai Wang[3,2], Xiaolin Hu[4] and Jian Yang[1]

PCALab, Nanjing University of Science and Technology[1], Momenta[2], Nanjing University[3], Tsinghua University[4].

Approach

Figure 1: The Diagram of a Selective Kernel Convolution module.
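As a rough companion to Figure 1, the sketch below illustrates the fuse and select steps of a Selective Kernel unit as described in the paper: branch outputs are summed, globally average-pooled per channel, passed through a small fully connected layer, and then re-weighted by a per-channel softmax across branches. It is plain Python for illustration, not the Caffe implementation in this repository. The function names, the `[C][N]` list layout, and the toy weights are assumptions, and the split convolutions and batch normalization are omitted.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sk_select(branches, w_reduce, w_branch):
    """Fuse and select over M branch outputs (hypothetical helper).

    branches: list of M feature maps, each [C][N] (channels x flattened positions)
    w_reduce: [d][C] weights of the shared reduction FC layer
    w_branch: list of M matrices, each [C][d], one per branch
    Returns (fused output [C][N], attention weights [C][M]).
    """
    num_channels = len(branches[0])
    num_positions = len(branches[0][0])

    # Fuse: element-wise sum of the branches, then global average pooling.
    s = []
    for c in range(num_channels):
        summed = [sum(vals) for vals in zip(*(b[c] for b in branches))]
        s.append(sum(summed) / num_positions)

    # Compact descriptor z = ReLU(W s); batch normalization is left out here.
    z = [max(0.0, sum(w * v for w, v in zip(row, s))) for row in w_reduce]

    # Select: softmax across branches for every channel, then weighted sum.
    output, attention = [], []
    for c in range(num_channels):
        logits = [sum(w * v for w, v in zip(wb[c], z)) for wb in w_branch]
        attn = softmax(logits)
        attention.append(attn)
        output.append([
            sum(a * b[c][i] for a, b in zip(attn, branches))
            for i in range(num_positions)
        ])
    return output, attention


# Toy example: two branches (standing in for 3x3 and 5x5), two channels.
u3 = [[1.0, 2.0], [0.0, 0.0]]
u5 = [[3.0, 4.0], [2.0, 2.0]]
w_reduce = [[0.5, 0.5]]                       # d = 1
w_branch = [[[0.0], [1.0]], [[0.0], [-1.0]]]  # one [C][d] matrix per branch
out, attn = sk_select([u3, u5], w_reduce, w_branch)
```

In the toy example, channel 0 has equal logits for both branches, so its output is the plain average of the two branch outputs. Channel 1 favours the first branch.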

Implementation

In this repository, all models are implemented in Caffe.

We use the same data augmentation strategies as SENet.

Two new layers are introduced for efficient training and inference: the Axpy and CuDNNBatchNorm layers.

  • The Axpy layer is already implemented in SENet.
  • The CuDNNBatchNorm layer is mainly borrowed from GENet.
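For readers unfamiliar with the Axpy layer, it fuses a channel-wise scale with an element-wise addition, F = a * X + Y, where `a` holds one scalar per channel. The sketch below is a plain Python reference of that arithmetic only. The function name and the `[C][N]` list layout are assumptions for illustration, and it does not reflect how the Caffe layer is implemented or optimized.

```python
def axpy(a, x, y):
    """Reference semantics of an Axpy-style op (hypothetical helper).

    a: [C] one scale per channel
    x, y: [C][N] feature maps (channels x flattened positions)
    Returns a[c] * x[c][i] + y[c][i] for every channel c and position i.
    """
    if not (len(a) == len(x) == len(y)):
        raise ValueError("a, x and y must have the same number of channels")
    return [
        [scale * xv + yv for xv, yv in zip(x_row, y_row)]
        for scale, x_row, y_row in zip(a, x, y)
    ]


# Two channels, two positions each.
result = axpy([2.0, 0.5],
              [[1.0, 2.0], [4.0, 8.0]],
              [[1.0, 1.0], [0.0, 1.0]])
```

Fusing the two steps in one layer avoids materializing the intermediate scaled tensor, which is presumably where the efficiency gain comes from.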

Trained Models

Table 2. Single-crop top-1 validation error (%) on ImageNet-1k (center 224x224/320x320 crop from an image resized so that its shorter side is 256).

| Model | Top-1 224x | Top-1 320x | #P | GFLOPs |
|---|---|---|---|---|
| ResNeXt-50 | 22.23 | 21.05 | 25.0M | 4.24 |
| AttentionNeXt-56 | 21.76 | - | 31.9M | 6.32 |
| InceptionV3 | - | 21.20 | 27.1M | 5.73 |
| ResNeXt-50 + BAM | 21.70 | 20.15 | 25.4M | 4.31 |
| ResNeXt-50 + CBAM | 21.40 | 20.38 | 27.7M | 4.25 |
| SENet-50 | 21.12 | 19.71 | 27.7M | 4.25 |
| SKNet-50 | 20.79 | 19.32 | 27.5M | 4.47 |
| ResNeXt-101 | 21.11 | 19.86 | 44.3M | 7.99 |
| Attention-92 | - | 19.50 | 51.3M | 10.43 |
| DPN-92 | 20.70 | 19.30 | 37.7M | 6.50 |
| DPN-98 | 20.20 | 18.90 | 61.6M | 11.70 |
| InceptionV4 | - | 20.00 | 42.0M | 12.31 |
| Inception-ResNetV2 | - | 19.90 | 55.0M | 13.22 |
| ResNeXt-101 + BAM | 20.67 | 19.15 | 44.6M | 8.05 |
| ResNeXt-101 + CBAM | 20.60 | 19.42 | 49.2M | 8.00 |
| SENet-101 | 20.58 | 18.61 | 49.2M | 8.00 |
| SKNet-101 | 20.19 | 18.40 | 48.9M | 8.46 |
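The single-crop protocol in the Table 2 caption can be sketched as two geometry steps: resize so the shorter side is 256, then take a center crop. The helpers below are an illustrative sketch with assumed names, not the evaluation script used for these results. They cover only the 224x224 case, and the rounding convention is also an assumption.

```python
def resize_shorter_side(width, height, target=256):
    """Return (new_width, new_height) with the shorter side scaled to `target`."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centered crop x crop box."""
    if crop > width or crop > height:
        raise ValueError("crop size exceeds the resized image")
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


# Example: a 500x375 image is resized, then center-cropped to 224x224.
resized = resize_shorter_side(500, 375)
box = center_crop_box(*resized)
```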

Download:

| Model | Caffe model |
|---|---|
| SKNet-50 | GoogleDrive |
| SKNet-101 | GoogleDrive |

20190323_Update: The SKNet-101 model was deleted by mistake. We are retraining it, and it should be available in 2-3 days.

20190326_Update: The SKNet-101 model is ready.

Attention weights correspond to object scales in low/middle layers

We examine the selection distributions per class on the SK_2_3 (low), SK_3_4 (middle), and SK_5_3 (high) layers:

Figure 2: Average mean attention difference (mean attention value of kernel 5x5 minus that of kernel 3x3) on SK units of SKNet-50, for each of 1,000 categories using all validation samples on ImageNet. On low- or middle-level SK units (e.g., SK_2_3, SK_3_4), 5x5 kernels clearly receive more emphasis as the target object becomes larger (1.0x -> 1.5x).
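The statistic plotted in Figure 2 can be sketched as follows: for each image, average the attention weights over channels for each kernel, subtract the 3x3 mean from the 5x5 mean, and then average those differences per class. The function name and the input layout are assumptions for illustration. Extracting the attention weights from the network is outside the scope of this sketch.

```python
from collections import defaultdict


def mean_attention_difference(samples):
    """Per-class average of (mean 5x5 attention - mean 3x3 attention).

    samples: iterable of (class_id, attn_3x3, attn_5x5), where each attn_*
    is a list of per-channel attention weights for one image at one SK unit.
    Returns {class_id: average difference}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for class_id, attn_3x3, attn_5x5 in samples:
        mean_3x3 = sum(attn_3x3) / len(attn_3x3)
        mean_5x5 = sum(attn_5x5) / len(attn_5x5)
        sums[class_id] += mean_5x5 - mean_3x3
        counts[class_id] += 1
    return {class_id: sums[class_id] / counts[class_id] for class_id in sums}


# Toy data: two images of class 0 and one image of class 1.
toy_samples = [
    (0, [0.5, 0.5], [0.5, 0.5]),
    (0, [0.25, 0.25], [0.75, 0.75]),
    (1, [0.75], [0.25]),
]
diffs = mean_attention_difference(toy_samples)
```

A positive value means the 5x5 kernel receives more attention on average for that class. Running the same computation on inputs enlarged by 1.5x would give the second series to compare against.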

More details of attention distributions on specific images are as follows:

Citation

If you use Selective Kernel Convolution in your research, please cite the paper:

@inproceedings{li2019selective,
  title={Selective Kernel Networks},
  author={Li, Xiang and Wang, Wenhai and Hu, Xiaolin and Yang, Jian},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}
