This project forked from switchablenorms/switchnorm_segmentation

Switchable Normalization for semantic image segmentation and scene parsing.

Switchable Normalization for Semantic Segmentation

This repository contains the code for using Switchable Normalization (SN) in semantic image segmentation, as proposed in the paper "Differentiable Learning-to-Normalize via Switchable Normalization".

It implements the experiments presented in that paper on top of the open-source semantic segmentation framework Scene Parsing on MIT ADE20K.

Update

  • 2018/9/26: The code and trained models for semantic segmentation on ADE20K using SN are released!
  • More results and models will be released soon.

Citation

You are encouraged to cite the following paper if you use SN in research or wish to refer to the baseline results.

@article{SwitchableNorm,
  title={Differentiable Learning-to-Normalize via Switchable Normalization},
  author={Ping Luo and Jiamin Ren and Zhanglin Peng},
  journal={arXiv:1806.10779},
  year={2018}
}

Getting Started

Use git to clone this repository:

git clone https://github.com/switchablenorms/SwitchNorm_Segmentation.git

Environment

The code is tested under the following configurations.

  • Hardware: 1-8 GPUs (with at least 12 GB of GPU memory)
  • Software: CUDA 9.0, Python 3.6, PyTorch 0.4.0, tensorboardX

Installation & Data Preparation

Please check the Environment, Training and Evaluation subsections of the Scene Parsing on MIT ADE20K repo for a quick start.

Pre-trained Models

Download the SN-based ImageNet pretrained models and put them into {repo_root}/pretrained_sn.

ImageNet pre-trained models

The backbone models with SN pretrained on ImageNet are available in the format used by the above segmentation framework and this repo.

For more pretrained models with SN, please refer to the switchablenorms/Switchable-Normalization repo. The following script converts a model trained with Switchable-Normalization into a format that the semantic segmentation codebase can load: ./pretrained_sn/convert_sn.py

usage: python -u convert_sn.py

NOTE: The parameter keys in the pretrained model checkpoint must match the keys in the backbone model EXACTLY. Make sure you load the pretrained model that corresponds to your segmentation architecture.
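A quick way to verify the exact-match requirement before training is to compare the two key sets. The sketch below is only an illustration: `compare_keys` and the key names are hypothetical, and it assumes both the checkpoint and the model state are flat mappings from parameter name to value.

```python
def compare_keys(checkpoint, model_state):
    """Return (missing, unexpected) parameter keys, each sorted."""
    ckpt_keys = set(checkpoint)
    model_keys = set(model_state)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(model_keys - ckpt_keys)
    # Keys the checkpoint provides but the model does not know.
    unexpected = sorted(ckpt_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_state = {"conv1.weight": None, "sn1.weight": None, "sn1.bias": None}
checkpoint = {"conv1.weight": None, "bn1.weight": None, "bn1.bias": None}

missing, unexpected = compare_keys(checkpoint, model_state)
print("missing:", missing)
print("unexpected:", unexpected)
```

With PyTorch, the two arguments would typically be the object returned by `torch.load(path)` and `model.state_dict()`, provided the checkpoint file stores a flat state dict. Both lists must be empty for the keys to match exactly.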

Training

  • The training strategies for the baseline models and the SN-based models on ADE20K are the same as in Scene Parsing on MIT ADE20K.
  • The training script for the ResNet-50-sn backbone can be found here: ./scripts/train.sh

NOTE: The default architecture of this repo is Encoder: resnet50_dilated8 (resnetXX_dilatedYY: customized resnetXX with dilated convolutions; the output feature map is 1/YY of the input size, see DeepLab for more details) and Decoder: c1_bilinear_deepsup (1 conv + bilinear upsample + deep supervision, see PSPNet for more details).
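To make the resnetXX_dilatedYY naming concrete, the sketch below parses an encoder name and estimates the spatial size of its output feature map. The helper names are hypothetical, and rounding up is an assumption based on the padding used in standard ResNet layers; the repo's own code may differ.

```python
import math
import re


def parse_encoder_name(name):
    """Split 'resnetXX_dilatedYY' into (depth XX, output stride YY)."""
    match = re.fullmatch(r"resnet(\d+)_dilated(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized encoder name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def feature_map_size(name, height, width):
    """Estimate the encoder output size as 1/YY of the input size."""
    _, stride = parse_encoder_name(name)
    # Assumption: sizes that are not multiples of the stride round up.
    return math.ceil(height / stride), math.ceil(width / stride)


print(parse_encoder_name("resnet50_dilated8"))
print(feature_map_size("resnet50_dilated8", 512, 512))
```

For a 512x512 input, resnet50_dilated8 therefore yields a 64x64 feature map, which the decoder then upsamples bilinearly.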

Optional arguments (see full input arguments via ./train.py):

  --arch_encoder         architecture of the encoder network
  --arch_decoder         architecture of the decoder network
  --weights_encoder      weights to finetune the encoder network
  --weights_decoder      weights to finetune the decoder network
  --list_train           the list used to load the training data
  --root_dataset         the path of the dataset
  --batch_size_per_gpu   input batch size per GPU
  --start_epoch          epoch to start training from (continues from a checkpoint loaded via weights_encoder & weights_decoder)

NOTE: In this repo, --start_epoch allows training to resume from the checkpoints loaded via --weights_encoder and --weights_decoder, which are generated automatically during training. To train from scratch, set --start_epoch to 1 and leave --weights_encoder and --weights_decoder empty.
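The rule in the note above can be sketched as a small argument check. This is not the logic of ./train.py: the flag names come from the list above, while the defaults, the `training_mode` helper and the checkpoint file names are assumptions made for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Flag names follow the README; the default values are assumptions.
    parser.add_argument("--start_epoch", type=int, default=1)
    parser.add_argument("--weights_encoder", default="")
    parser.add_argument("--weights_decoder", default="")
    return parser


def training_mode(args):
    """Classify a run as 'scratch' or 'resume', or reject the combination."""
    has_encoder = bool(args.weights_encoder)
    has_decoder = bool(args.weights_decoder)
    if not has_encoder and not has_decoder:
        if args.start_epoch != 1:
            raise ValueError("--start_epoch > 1 requires both checkpoints")
        return "scratch"
    if has_encoder and has_decoder:
        return "resume"
    raise ValueError("set both --weights_encoder and --weights_decoder, or neither")


parser = build_parser()
print(training_mode(parser.parse_args([])))
# Hypothetical checkpoint file names.
print(training_mode(parser.parse_args(
    ["--start_epoch", "5",
     "--weights_encoder", "encoder_epoch_4.pth",
     "--weights_decoder", "decoder_epoch_4.pth"])))
```

Rejecting a run that sets only one of the two weight flags is a design choice of this sketch, meant to catch a half-specified resume early.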

Evaluation

  • The evaluation script for the ResNet-50-sn backbone can be found here: ./scripts/evaluate.sh

Optional arguments (see full input arguments via ./eval.py):

  --arch_encoder         architecture of the encoder network
  --arch_decoder         architecture of the decoder network
  --suffix               which snapshot to load
  --list_val             the list used to load the validation data
  --root_dataset         the path of the dataset
  --imgSize              list of input image sizes

--imgSize enables single-scale or multi-scale inference. When --imgSize is a single int, single-scale inference is run. When --imgSize is a list of ints, the multi-scale test is applied.
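The single-scale versus multi-scale distinction can be sketched as follows. The helper names and the example sizes are hypothetical, and treating a one-element list as single-scale is an assumption of this sketch rather than documented behavior of ./eval.py.

```python
def normalize_img_sizes(img_size):
    """Return the requested input sizes as a non-empty list of ints."""
    if isinstance(img_size, int):
        return [img_size]
    sizes = list(img_size)
    if not sizes or not all(isinstance(size, int) for size in sizes):
        raise ValueError("imgSize must be an int or a non-empty list of ints")
    return sizes


def inference_mode(img_size):
    """Name the inference mode implied by the requested sizes."""
    sizes = normalize_img_sizes(img_size)
    return "single-scale" if len(sizes) == 1 else "multi-scale"


# Example sizes are arbitrary.
print(inference_mode(450))
print(inference_mode([300, 400, 500]))
```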

Main Results

Semantic Segmentation Results on ADE20K

The experimental results are reported on the ADE20K validation set. MS test is short for multi-scale test. sync BN denotes multi-GPU synchronized batch normalization. More results and models will be released soon.

Architecture                             Norm      MS test  Mean IoU  Pixel Acc.  Overall Score  Download
ResNet50_dilated8 + c1_bilinear_deepsup  sync BN   no       36.43     77.30       56.87          encoder decoder
ResNet50_dilated8 + c1_bilinear_deepsup  GN        no       35.66     77.24       56.45          encoder decoder
ResNet50_dilated8 + c1_bilinear_deepsup  SN-(8,2)  no       38.72     78.90       58.82          encoder decoder
ResNet50_dilated8 + c1_bilinear_deepsup  sync BN   yes      37.69     78.29       57.99          --
ResNet50_dilated8 + c1_bilinear_deepsup  GN        yes      36.32     77.77       57.05          --
ResNet50_dilated8 + c1_bilinear_deepsup  SN-(8,2)  yes      39.21     79.20       59.21          --
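
The Overall Score column appears to be the average of Mean IoU and Pixel Acc. This reading is inferred from the numbers and is not stated in this README. A minimal check against the reported values, allowing for rounding:

```python
# (norm, MS test, mean IoU, pixel accuracy, reported overall score)
rows = [
    ("sync BN", "no", 36.43, 77.30, 56.87),
    ("GN", "no", 35.66, 77.24, 56.45),
    ("SN-(8,2)", "no", 38.72, 78.90, 58.82),
    ("sync BN", "yes", 37.69, 78.29, 57.99),
    ("GN", "yes", 36.32, 77.77, 57.05),
    ("SN-(8,2)", "yes", 39.21, 79.20, 59.21),
]


def overall_score(mean_iou, pixel_acc):
    """Assumed definition: the plain average of the two metrics."""
    return (mean_iou + pixel_acc) / 2


# Absolute gap between the assumed formula and each reported score.
gaps = [abs(overall_score(miou, acc) - reported)
        for _, _, miou, acc, reported in rows]
for (norm, ms_test, *_), gap in zip(rows, gaps):
    print(f"{norm:9s} MS={ms_test:3s} gap={gap:.3f}")
```

Every row agrees to within about 0.01, which is consistent with the reported scores being averaged from unrounded metrics.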

NOTE: For all settings in this repo, we employ ResNet as the backbone network, using the original 7×7 kernel size in the first convolution layer. This differs from the MIT framework, which adopts 3 convolution layers with kernel size 3×3 at the bottom of the network. See ./models/resnet_v1_sn.py for the details.