Deep Residual Learning for Image Recognition

Implementation of "Deep Residual Learning for Image Recognition", Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun in PyFunt (a simple Python + Numpy DL framework).

Also inspired by this implementation in Lua + Torch.

The network operates on minibatches of data that have shape (N, C, H, W) consisting of N images, each with height H and width W and with C input channels. It has, like in the reference paper, (6*n)+2 layers, composed as below:

		                                        (image_dim: 3, 32, 32; F=16)
		                                        (input_dim: N, *image_dim)
		 INPUT
		    |
		    v
		+-------------------+
		|conv[F, *image_dim]|                    (out_shape: N, 16, 32, 32)
		+-------------------+
		    |
		    v
		+-------------------------+
		|n * res_block[F, F, 3, 3]|              (out_shape: N, 16, 32, 32)
		+-------------------------+
		    |
		    v
		+-------------------------+
		|res_block[2*F, F, 3, 3]  |              (out_shape: N, 32, 16, 16)
		+-------------------------+
		    |
		    v
		+---------------------------------+
		|(n-1) * res_block[2*F, 2*F, 3, 3]|      (out_shape: N, 32, 16, 16)
		+---------------------------------+
		    |
		    v
		+-------------------------+
		|res_block[4*F, 2*F, 3, 3]|              (out_shape: N, 64, 8, 8)
		+-------------------------+
		    |
		    v
		+---------------------------------+
		|(n-1) * res_block[4*F, 4*F, 3, 3]|      (out_shape: N, 64, 8, 8)
		+---------------------------------+
		    |
		    v
		+-------------+
		|pool[1, 8, 8]|                          (out_shape: N, 64, 1, 1)
		+-------------+
		    |
		    v
		+-------+
		|softmax|                                (out_shape: N, num_classes)
		+-------+
		    |
		    v
		 OUTPUT

Every convolution layer has a pad=1 and stride=1, except for the dimension enhancning layers which has a stride of 2 to mantain the computational complexity. Optionally, there is the possibility of setting m affine layers immediatley before the softmax layer by setting the hidden_dims parameter, which should be a list of integers representing the numbe of neurons for each affine layer.

Each residual block is composed as below:

          Input
             |
     ,-------+-----.
Downsampling      3x3 convolution+dimensionality reduction
    |               |
    v               v
Zero-padding      3x3 convolution
    |               |
    `-----( Add )---'
             |
          Output

After every layer, a batch normalization with momentum .1 is applied.

Requirements

After you get Python, you can get pip and install all requirements by running:

pip install -r requirements.txt

Usage

If you want to train the network on the CIFAR-10 dataset, simply run:

python train.py --help

Otherwise, you have to get the right train.py for MNIST or SFDDD datasets, they are respectively on the mnist and sfddd git branches:

train.py for MNIST: https://github.com/dnlcrl/PyResNet/blob/mnist/train.py
train.py for SFDDD: https://github.com/dnlcrl/PyResNet/blob/sfddd/train.py

Experiments Results

You can view all the experiments results in the ./docs directory. Main results are shown below:

CIFAR-10

best error: 9.59 % (accuracy: 0.9041) with a 20 layers residual network (n=3):

CIFAR-10 Results - iPython notebook

MNIST

best error: 0.36 % (accuracy: 0.9964) with a 32 layers residual network (n=5):

MNIST Results - iPython notebook

SFDDD

best error: 0.25 % (accuracy: 0.9975 %) on a subset (1000 samples) of the train data (~21k images) with a 44 layers residual network (n=7), resizing the images to 64x48, randomly cropping 32x32 images for training and cropping a 32x32 image from the center of the original images for testing. Unfortunately I got more than 2% error on Kaggle's results (composed of ~80k images).

SFDDD Results - iPython notebook

dnlcrl / deep-residual-networks-pyfunt Goto Github PK

deep-residual-networks-pyfunt's Introduction

Deep Residual Learning for Image Recognition

Requirements

Usage

Experiments Results

CIFAR-10

MNIST

SFDDD

deep-residual-networks-pyfunt's People

Contributors

Stargazers

Watchers

Forkers

deep-residual-networks-pyfunt's Issues

About nnet.utils.vis_utils

ImportError: cannot import name ResNet

两个问题

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent