
STN

Title

Spatial Transformer Networks

Abstract

Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally and parameter efficient manner. In this work we introduce a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network. This differentiable module can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps, conditional on the feature map itself, without any extra training supervision or modification to the optimisation process. We show that the use of spatial transformers results in models which learn invariance to translation, scale, rotation and more generic warping, resulting in state-of-the-art performance on several benchmarks, and for a number of classes of transformations.

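At its core, the spatial transformer consists of a small localisation network that regresses the parameters of a 2x3 affine matrix, a grid generator, and a differentiable sampler. The following is a minimal PyTorch sketch of such a module, following the structure of the official PyTorch STN tutorial; the class name and layer sizes (which assume 28x28 single-channel MNIST inputs) are illustrative and not taken from this repository.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformer(nn.Module):
    """Predicts a per-sample 2x3 affine matrix and resamples the input with it."""

    def __init__(self, in_channels=1):
        super().__init__()
        # Localisation network: regresses the 6 affine parameters from the input.
        self.localization = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(True),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(True),
        )
        self.fc_loc = nn.Sequential(
            nn.Linear(10 * 3 * 3, 32), nn.ReLU(True), nn.Linear(32, 6),
        )
        # Initialise to the identity transform so the module starts as a no-op.
        self.fc_loc[2].weight.data.zero_()
        self.fc_loc[2].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.fc_loc(self.localization(x).flatten(1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)   # grid generator
        return F.grid_sample(x, grid, align_corners=False)           # differentiable sampler

# Usage: the output has the same shape as the input and gradients flow through theta.
# y = SpatialTransformer()(torch.randn(4, 1, 28, 28))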

Train

$ python main.py --mode train \
                 --scope [scope name] \
                 --name_data [data name] \
                 --dir_data [data directory] \
                 --dir_log [log directory] \
                 --dir_checkpoint [checkpoint directory] \
                 --gpu_ids [gpu id; '-1': no gpu, '0, 1, ..., N-1': gpus]

$ python main.py --mode train \
                 --scope stn \
                 --name_data mnist \
                 --dir_data ./datasets \
                 --dir_log ./log \
                 --dir_checkpoint ./checkpoints \
                 --gpu_ids 0
  • If --scope is defined as "stn", a classification network with STN is used.

$ python main.py --mode train \
                 --scope cls \
                 --name_data mnist \
                 --dir_data ./datasets \
                 --dir_log ./log \
                 --dir_checkpoint ./checkpoints \
                 --gpu_ids 0
  • If --scope is defined as "cls", a classification network without STN is used.

  • Set [scope name] to a unique value for each experiment.
  • To understand the directory hierarchy created from these arguments, see the Directories structure section below.
  • Hyperparameters are written to arg.txt under the [log directory].
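The following is a minimal sketch of how the --scope flag might switch between the two networks; the argument parsing and class names here are illustrative, not the repository's own code, and SpatialTransformer refers to the sketch in the Abstract section above.

import argparse
import torch.nn as nn

parser = argparse.ArgumentParser()
parser.add_argument('--scope', default='stn', choices=['stn', 'cls'])
args, _ = parser.parse_known_args()

# Plain MNIST classifier: 28x28 grayscale input, 10 output classes.
classifier = nn.Sequential(
    nn.Conv2d(1, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(True),
    nn.Conv2d(10, 20, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(True),
    nn.Flatten(), nn.Linear(20 * 4 * 4, 10),
)

if args.scope == 'stn':
    # Resample the input with a spatial transformer before classifying.
    net = nn.Sequential(SpatialTransformer(in_channels=1), classifier)
else:
    net = classifier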

Test

$ python main.py --mode test \
                 --scope [scope name] \
                 --name_data [data name] \
                 --dir_data [data directory] \
                 --dir_log [log directory] \
                 --dir_checkpoint [checkpoint directory] \
                 --dir_result [result directory] \
                 --gpu_ids [gpu id; '-1': no gpu, '0, 1, ..., N-1': gpus]

$ python main.py --mode test \
                 --scope stn \
                 --name_data mnist \
                 --dir_data ./datasets \
                 --dir_log ./log \
                 --dir_checkpoint ./checkpoints \
                 --dir_result ./results \
                 --gpu_ids 0

$ python main.py --mode test \
                 --scope cls \
                 --name_data mnist \
                 --dir_data ./datasets \
                 --dir_log ./log \
                 --dir_checkpoint ./checkpoints \
                 --dir_result ./results \
                 --gpu_ids 0
  • To test with a trained network, set [scope name] to the value used in the train phase.
  • Generated images are saved in the images subfolder under the [result directory] folder.
  • index.html is also generated to browse the generated images.

Tensorboard

$ tensorboard --logdir [log directory]/[scope name]/[data name] \
              --port [(optional) 4 digit port number]

$ tensorboard --logdir ./log/stn/mnist \
              --port 6006

After the above command executes, go to http://localhost:6006.

  • You can change the [(optional) 4 digit port number].
  • The default port number is 6006.

Results


  • The results were generated by a network trained on the MNIST dataset for 10 epochs.

  • After finishing the Test phase, execute display_result.py to display the figure (a minimal sketch of such a script follows the table below).

  • The left figure shows the MNIST input and the right figure shows the STN output.

  • The table below shows quantitative metrics: cross-entropy loss and accuracy.

Metrics            | w/o STN | w/ STN
-------------------|---------|--------
Cross Entropy Loss | 0.0457  | 0.0350
Accuracy (%)       | 98.5868 | 98.8654
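The repository's display_result.py is not reproduced here; the following is a minimal sketch, assuming the result layout from the example test command above (./results/stn/mnist/images), that tiles a few of the saved output images with matplotlib.

import os
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Assumed result directory; adjust to match the --dir_result / --scope / --name_data used at test time.
dir_images = './results/stn/mnist/images'
files = sorted(f for f in os.listdir(dir_images) if f.endswith('-output.png'))[:8]

fig, axes = plt.subplots(1, len(files), figsize=(2 * len(files), 2))
for ax, name in zip(np.atleast_1d(axes), files):
    ax.imshow(Image.open(os.path.join(dir_images, name)), cmap='gray')
    ax.set_title(name, fontsize=6)
    ax.axis('off')
plt.show()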

Directories structure

pytorch-STN
+---[dir_checkpoint]
|   \---[scope]
|       \---[name_data]
|           +---model_epoch00000.pth
|           |   ...
|           \---model_epoch12345.pth
+---[dir_log]
|   \---[scope]
|       \---[name_data]
|           +---arg.txt
|           \---events.out.tfevents
\---[dir_result]
    \---[scope]
        \---[name_data]
            +---images
            |   +---00000-output.png
            |   |   ...
            |   +---12345-output.png
            \---index.html

pytorch-STN
+---checkpoints
|   +---cls
|   |   \---mnist
|   |       +---model_epoch00001.pth
|   |       |   ...
|   |       \---model_epoch00010.pth
|   \---stn
|       \---mnist
|           +---model_epoch00001.pth
|           |   ...
|           \---model_epoch00010.pth
+---log
|   +---cls
|   |   \---mnist
|   |       +---arg.txt
|   |       \---events.out.tfevents
|   \---stn
|       \---mnist
|           +---arg.txt
|           \---events.out.tfevents
\---results
    +---cls
    |   \---mnist
    |       +---images
    |       |   +---00000-output.png
    |       |   |   ...
    |       |   +---09999-output.png
    |       \---index.html
    \---stn
        \---mnist
            +---images
            |   +---00000-output.png
            |   |   ...
            |   +---09999-output.png
            \---index.html
  • The above directory structure is created from the arguments passed when main.py is executed.

