License: MIT License


SSENet-pytorch

The PyTorch implementation of Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation (https://arxiv.org/abs/1909.03714).

Introduction

CAM visualizations (figure)

As is well known, conventional CAMs tend to be incomplete or over-activated due to weak supervision. Fortunately, we find that semantic segmentation has the property of spatial transformation equivariance, which can provide self-supervision signals to help weakly supervised learning. This work mainly explores the advantages of scale equivariance constraints for CAM generation, formulated as a self-supervised scale equivariant network (SSENet). Extensive experiments on the PASCAL VOC 2012 dataset demonstrate that our method achieves outstanding performance compared with other state-of-the-art methods.
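The scale equivariance idea can be illustrated with a toy sketch (plain Python, not the project's actual loss code, and the 2x pooling factor here is an assumption for illustration): a CAM predicted at full resolution, once average-pooled, should agree with the CAM predicted from a downsampled input, and the discrepancy between the two forms a self-supervised consistency loss.

```python
def avg_pool2(cam):
    """Average-pool a 2D score map by a factor of 2."""
    h, w = len(cam) // 2, len(cam[0]) // 2
    return [[(cam[2*i][2*j] + cam[2*i][2*j+1] +
              cam[2*i+1][2*j] + cam[2*i+1][2*j+1]) / 4.0
             for j in range(w)] for i in range(h)]

def scale_consistency_loss(cam_full, cam_small):
    """Mean absolute difference between the pooled full-scale CAM
    and the CAM produced from the downsampled input."""
    pooled = avg_pool2(cam_full)
    n = len(cam_small) * len(cam_small[0])
    return sum(abs(p - s)
               for rp, rs in zip(pooled, cam_small)
               for p, s in zip(rp, rs)) / n
```

A perfectly scale-equivariant pair yields zero loss; any disagreement produces a positive penalty that can be minimized alongside the classification loss.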

Thanks to jiwoon-ahn: the code in this repository borrows heavily from his AffinityNet project, and we follow the same pipeline to verify the effectiveness of our SSENet.

Dependency

  • This repo is tested on Ubuntu 16.04 with Python 3.6, PyTorch 0.4, torchvision 0.2.1, CUDA 9.0, and 4 GPUs (NVIDIA TITAN Xp, 12 GB).
  • Please install tensorboardX for training visualization.
  • The dataset used is PASCAL VOC 2012; please download the VOC development kit. It is suggested to make a soft link to the downloaded dataset:
ln -s $your_dataset_path/VOCdevkit/VOC2012 $your_voc12_root
  • (Optional) The image-level labels are already provided in voc12/cls_label.npy. If you want to regenerate them (which is unnecessary), download the annotations of the VOC 2012 SegmentationClassAug training set (containing 10582 images), which can be downloaded here, and place them all as $your_voc12_root/SegmentationClassAug/xxxxxx.png. Then run:
cd voc12
python make_cls_labels.py --voc12_root $your_voc12_root
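For intuition, deriving an image-level label from a segmentation mask amounts to recording which foreground classes appear in it. The sketch below is illustrative only (it is not the project's make_cls_labels.py; the 20-foreground-class layout with 0 as background and 255 as ignore follows the usual PASCAL VOC convention):

```python
def mask_to_multihot(mask, num_fg_classes=20):
    """Turn a 2D mask of class ids (0 = background, 255 = ignore)
    into a multi-hot image-level label over foreground classes."""
    label = [0] * num_fg_classes
    for row in mask:
        for v in row:
            if 0 < v < 255:          # skip background and ignore pixels
                label[v - 1] = 1
    return label
```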

Usage

CAM generalization step

  1. SSENet training.
python train_cls_ser.py --voc12_root $your_voc12_root --weights $your_weights_file --session_name $your_session_name
  2. SSENet inference. Note that the CRF results will be saved in $your_crf_dir_4.0 and $your_crf_dir_24.0, where these parameters can be modified in infer_cls_ser.py. These two folders will be used in the following AffinityNet training step.
python infer_cls_ser.py --weights $your_SSENet_checkpoint --infer_list [voc12/val.txt | voc12/train.txt | voc12/train_aug.txt] --out_cam $your_cam_dir --out_crf $your_crf_dir --out_cam_pred $your_pred_dir
  3. CAM step evaluation. We provide a Python mIoU evaluation script, evaluation.py; alternatively, you can use the official development kit.
python evaluation.py --list $your_voc12_root/ImageSets/Segmentation/[val.txt | train.txt] --predict_dir $your_pred_dir --gt_dir $your_voc12_root/SegmentationClass
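The mIoU metric that evaluation.py reports is computed from a confusion matrix over predicted and ground-truth labels. A self-contained sketch of that metric (not the script itself, which additionally handles file loading and ignore labels):

```python
def mean_iou(preds, gts, num_classes):
    """mIoU over flattened label arrays: IoU_c = TP / (TP + FP + FN),
    averaged over classes that appear in predictions or ground truth."""
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, g in zip(preds, gts):
        conf[g][p] += 1              # rows: ground truth, cols: prediction
    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(num_classes)) - tp
        fn = sum(conf[c]) - tp
        if tp + fp + fn > 0:
            ious.append(tp / (tp + fp + fn))
    return sum(ious) / len(ious)
```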

Random walk step

The random walk step is the same as in the AffinityNet project.

  1. Train AffinityNet.
python train_aff.py --weights $your_weights_file --voc12_root $your_voc12_root --la_crf_dir $your_crf_dir_4.0 --ha_crf_dir $your_crf_dir_24.0 --session_name $your_session_name
  2. Random walk propagation.
python infer_aff.py --weights $your_weights_file --infer_list [voc12/val.txt | voc12/train.txt] --cam_dir $your_cam_dir --voc12_root $your_voc12_root --out_rw $your_rw_dir
  3. Random walk step evaluation.
python evaluation.py --list $your_voc12_root/ImageSets/Segmentation/[val.txt | train.txt] --predict_dir $your_rw_dir --gt_dir $your_voc12_root/SegmentationClass
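Conceptually, the random walk propagation refines the CAM scores by repeatedly multiplying them with a row-normalized affinity (transition) matrix, so scores diffuse between semantically affine positions. A toy sketch of that operation (the actual infer_aff.py works on AffinityNet's learned affinities over image coordinates; the step count here is an arbitrary assumption):

```python
def random_walk(transition, scores, steps=4):
    """Propagate per-node scores through `steps` random-walk iterations.
    `transition` is a row-stochastic n x n matrix (list of lists)."""
    n = len(scores)
    for _ in range(steps):
        scores = [sum(transition[i][j] * scores[j] for j in range(n))
                  for i in range(n)]
    return scores
```

With an identity transition matrix the scores are unchanged, while a uniform transition matrix averages them across all nodes; the learned affinities interpolate between these extremes.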

Results

The generated pseudo labels are evaluated on PASCAL VOC 2012 train set.

| Model | CAM step (mIoU) | CAM+rw step (mIoU) | Note |
| --- | --- | --- | --- |
| ResNet38 | 48.0 | 58.1 | AffinityNet CVPR submission [1] |
| ResNet38 | 47.3 | 58.8 | reimplemented baseline |
| SSENet-ResNet38 | 49.8 | 62.1 | branch downsampling rate = 0.3 (weights) |

Citation

Please cite our paper if the code is helpful to your research.

@article{SSENet,
    author = {Yude Wang and Jie Zhang and Meina Kan and Shiguang Shan and Xilin Chen},
    title = {Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation},
    journal = {arXiv:1909.03714},
    year = {2019}
}

Reference

[1] J. Ahn and S. Kwak. Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
