This project forked from cvi-szu/clims

[CVPR 2022] CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation

License: MIT License

Shell 0.85% Python 99.15%

CLIMS

Code repository for our paper "CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation" in CVPR 2022.

Code for our paper "CCAM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation" in CVPR 2022 is also available here.

Please note that this repository is an improved version of our camera-ready code (the camera-ready version is kept in the previous_version/ directory). We recommend using the improved version of CLIMS rather than the camera-ready version.

Dataset

PASCAL VOC2012

Download the PASCAL VOC2012 images (JPEG format) here; the train_aug ground truth can be found here. Make sure your data/VOC2012 folder is structured as follows:

├── VOC2012/
|   ├── Annotations
|   ├── ImageSets
|   ├── SegmentationClass
|   ├── SegmentationClassAug
|   └── SegmentationObject
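Before training, it can help to confirm that the folder matches the layout above. The following is a minimal sketch and not part of the CLIMS code: the function name `missing_voc_dirs` is hypothetical, and it checks only the five subdirectories listed above.

```python
from pathlib import Path

# Subdirectories taken from the layout listed above.
EXPECTED_VOC_DIRS = (
    "Annotations",
    "ImageSets",
    "SegmentationClass",
    "SegmentationClassAug",
    "SegmentationObject",
)


def missing_voc_dirs(voc_root, expected=EXPECTED_VOC_DIRS):
    """Return the expected subdirectories that are absent under voc_root."""
    root = Path(voc_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # "data/VOC2012" is the location named in this README; adjust as needed.
    missing = missing_voc_dirs("data/VOC2012")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("VOC2012 layout looks complete.")
```

The expected names are passed as a parameter so the same check can be reused if your copy of the dataset includes further folders.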

MS-COCO 2014

Download the MS-COCO 2014 images (JPEG format) here; the ground-truth masks can be found here. Make sure your data/COCO folder is structured as follows:

├── COCO/
|   ├── train2014
|   ├── val2014
|   ├── annotations
|   |   ├── instances_train2014.json
|   |   ├── instances_val2014.json
|   ├── mask
|   |   ├── train2014
|   |   ├── val2014
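A quick way to confirm that an annotation file downloaded completely is to load it and count its entries. This is a minimal sketch, assuming the standard COCO instances format with top-level `images`, `annotations`, and `categories` lists; the helper name `summarize_coco_annotations` is hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def summarize_coco_annotations(json_path):
    """Count images, annotations, and categories in a COCO instances file."""
    with Path(json_path).open("r", encoding="utf-8") as f:
        data = json.load(f)
    # A missing key is reported as zero entries rather than raising.
    keys = ("images", "annotations", "categories")
    return {key: len(data.get(key, [])) for key in keys}


if __name__ == "__main__":
    # Path follows the data/COCO layout shown above.
    path = Path("data/COCO/annotations/instances_train2014.json")
    if path.is_file():
        print(summarize_coco_annotations(path))
    else:
        print(f"{path} not found")
```

A count of zero for any key suggests a truncated or mismatched file.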

Training on PASCAL VOC2012

  1. Install CLIP.
$ pip install ftfy regex tqdm
$ pip install git+https://github.com/openai/CLIP.git
  2. Download the pre-trained baseline CAM ('res50_cam.pth') here and put it in the cam-baseline-voc12/ directory.
  3. Train CLIMS on the PASCAL VOC2012 dataset to generate initial CAMs (replace the --voc12_root path with the location of your own copy of the dataset).
CUDA_VISIBLE_DEVICES=0 python run_sample.py --voc12_root /data1/xjheng/dataset/VOC2012/ --hyper 10,24,1,0.2 --clims_num_epoches 15 --cam_eval_thres 0.15 --work_space clims_voc12 --cam_network net.resnet50_clims --train_clims_pass True --make_clims_pass True --eval_cam_pass True
  4. Train IRNet and generate pseudo semantic masks.
CUDA_VISIBLE_DEVICES=0 python run_sample.py --voc12_root /data1/xjheng/dataset/VOC2012/ --cam_eval_thres 0.15 --work_space clims_voc12 --cam_network net.resnet50_clims --cam_to_ir_label_pass True --train_irn_pass True --make_sem_seg_pass True --eval_sem_seg_pass True
  5. Train DeepLabv2 using the pseudo semantic masks.
cd segmentation/
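The two run_sample.py commands above share most of their flags and differ only in which passes they enable. The sketch below assembles them from one place so the dataset path and GPU id can be changed once. The helper `build_voc_command` and the stage names "clims" and "irn" are hypothetical; every flag and value is copied from the commands above. The flag order differs from the original, which should not matter if the script parses its arguments with argparse (an assumption).

```python
import shlex


def build_voc_command(voc12_root, stage, gpu="0", work_space="clims_voc12"):
    """Assemble the run_sample.py command line for one VOC2012 stage.

    stage is "clims" (train CLIMS, make and evaluate CAMs) or
    "irn" (train IRNet, make and evaluate pseudo masks).
    """
    common = [
        "python", "run_sample.py",
        "--voc12_root", voc12_root,
        "--cam_eval_thres", "0.15",
        "--work_space", work_space,
        "--cam_network", "net.resnet50_clims",
    ]
    stage_flags = {
        "clims": [
            "--hyper", "10,24,1,0.2",
            "--clims_num_epoches", "15",
            "--train_clims_pass", "True",
            "--make_clims_pass", "True",
            "--eval_cam_pass", "True",
        ],
        "irn": [
            "--cam_to_ir_label_pass", "True",
            "--train_irn_pass", "True",
            "--make_sem_seg_pass", "True",
            "--eval_sem_seg_pass", "True",
        ],
    }
    if stage not in stage_flags:
        raise ValueError(f"unknown stage: {stage!r}")
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return f"CUDA_VISIBLE_DEVICES={gpu} " + shlex.join(common + stage_flags[stage])


if __name__ == "__main__":
    # "/path/to/VOC2012/" is a placeholder for your dataset location.
    for stage in ("clims", "irn"):
        print(build_voc_command("/path/to/VOC2012/", stage))
```

The printed lines can be pasted into a shell; the helper only builds strings and does not launch training itself.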

Evaluation Results

The quality of initial CAMs and pseudo masks on PASCAL VOC2012.

| Method | Backbone | CAMs | + RW | + IRNet |
| --- | --- | --- | --- | --- |
| CLIMS (camera-ready) | R50 | 56.6 | 70.5 | - |
| CLIMS (this repo) | R50 | 58.6 | ~73 | 74.1 |
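The README does not name the metric behind these numbers; the sketch below assumes they are mean intersection-over-union (mIoU) percentages and that `--cam_eval_thres` acts as the background score during CAM evaluation, as is common in IRNet-style pipelines. Both function names are hypothetical, and the code is a pure-Python illustration on flat pixel lists rather than the repository's evaluation script.

```python
def pixel_label(class_scores, threshold):
    """Pick a label for one pixel.

    class_scores maps a foreground class id (1, 2, ...) to its normalized
    CAM activation. Background (label 0) wins unless some class scores
    strictly above the threshold.
    """
    label, best = 0, threshold
    for cls, score in class_scores.items():
        if score > best:
            label, best = cls, score
    return label


def mean_iou(predictions, targets, num_classes):
    """Mean IoU over the classes that occur in predictions or targets."""
    ious = []
    for cls in range(num_classes):
        pairs = list(zip(predictions, targets))
        inter = sum(1 for p, t in pairs if p == cls and t == cls)
        union = sum(1 for p, t in pairs if p == cls or t == cls)
        if union > 0:  # skip classes that never appear
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Four toy pixels with activations for classes 1 and 2.
    scores = [{1: 0.9}, {1: 0.1}, {1: 0.4, 2: 0.6}, {2: 0.05}]
    predicted = [pixel_label(s, threshold=0.15) for s in scores]
    target = [1, 0, 2, 2]
    print(predicted, mean_iou(predicted, target, num_classes=3))
```

Raising the threshold turns more low-activation pixels into background, which is why the value is tuned per dataset.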

Evaluation results on PASCAL VOC2012 val and test sets.

Please cite the results of the camera-ready version.

| Method | Supervision | Network | Pretrained | val | test |
| --- | --- | --- | --- | --- | --- |
| AdvCAM | I | DeepLabV2 | ImageNet | 68.1 | 68.0 |
| EDAM | I+S | DeepLabV2 | COCO | 70.9 | 70.6 |
| CLIMS (camera-ready) | I | DeepLabV2 | ImageNet | 69.3 | 68.7 |
| CLIMS (camera-ready) | I | DeepLabV2 | COCO | 70.4 | 70.0 |
| CLIMS (this repo) | I | DeepLabV2 | ImageNet | 70.3 | 70.6 |
| CLIMS (this repo) | I | DeepLabV2 | COCO | 71.4 | 71.2 |
| CLIMS (this repo) | I | DeepLabV1-R38 | ImageNet | 73.3 | 73.4 |

Initial CAMs, pseudo semantic masks, and pre-trained models of the camera-ready version can be found on Google Drive.

Training on MSCOCO 2014

  1. Download the pre-trained baseline CAM ('res50_cam.pth') here and put it in the cam-baseline-coco/ directory.
  2. Train CLIMS on the MSCOCO 2014 dataset to generate initial CAMs.
CUDA_VISIBLE_DEVICES=6,7 python -m torch.distributed.launch --nproc_per_node=2 run_sample_coco.py --work_space clims_coco --clims_network net.resnet50_clims --train_clims_pass True --make_clims_pass True --eval_cam_pass True --clims_num_epoches 8 --cam_eval_thres 0.15 --hyper 2,14,1.25,0.2 --cam_batch_size 16 --clims_learning_rate 0.0005 --use_distributed_train True --cbs_loss_thresh 0.285
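The `--hyper` flag takes four comma-separated numbers (`10,24,1,0.2` for VOC2012, `2,14,1.25,0.2` for MSCOCO). The README does not explain them; the sketch below assumes they are the weights of four loss terms combined as a weighted sum. That reading, and both function names, are assumptions for illustration rather than a description of the repository's code.

```python
def parse_hyper(hyper):
    """Split a --hyper string such as '10,24,1,0.2' into four floats."""
    values = [float(v) for v in hyper.split(",")]
    if len(values) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(values)}")
    return values


def weighted_total(losses, weights):
    """Weighted sum of per-term losses (the assumed role of the values)."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    return sum(w * l for w, l in zip(weights, losses))


if __name__ == "__main__":
    voc = parse_hyper("10,24,1,0.2")      # value from the VOC2012 command
    coco = parse_hyper("2,14,1.25,0.2")   # value from the MSCOCO command
    dummy_losses = [1.0, 1.0, 1.0, 1.0]   # placeholder loss values
    print(weighted_total(dummy_losses, voc))
    print(weighted_total(dummy_losses, coco))
```

Under this reading, the MSCOCO setting puts much less weight on the first term than the VOC2012 setting does, while the fourth weight is unchanged.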

If you are using our code, please consider citing our paper.

@InProceedings{Xie_2022_CVPR,
    author    = {Xie, Jinheng and Hou, Xianxu and Ye, Kai and Shen, Linlin},
    title     = {CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {4483-4492}
}
@article{xie2022cross,
  title={Cross Language Image Matching for Weakly Supervised Semantic Segmentation},
  author={Xie, Jinheng and Hou, Xianxu and Ye, Kai and Shen, Linlin},
  journal={arXiv preprint arXiv:2203.02668},
  year={2022}
}

This repository is heavily based on IRNet; thanks to Jiwoon Ahn for the great code.
