
xing0047 / rewrite


[NeurIPS 2023] Rewrite Caption Semantics: Bridging Semantic Gaps for Language-Supervised Semantic Segmentation

License: Other



Rewrite Caption Semantics: Bridging Semantic Gaps for Language Supervised Semantic Segmentation

This is the official repository of the following paper:

Rewrite Caption Semantics: Bridging Semantic Gaps for Language Supervised Semantic Segmentation
NeurIPS 2023
Yun Xing, Jian Kang, Aoran Xiao, Jiahao Nie, Ling Shao, Shijian Lu

Updates

  • Code released.
  • Paper available.

Environment Setup

conda create -n rewrite python=3.7 -y
conda activate rewrite
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install mmcv-full==1.3.14 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html
pip install -r requirements.txt
git clone https://github.com/ptrblck/apex.git
cd apex && pip install -v --no-cache-dir ./
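
After running the commands above, it can help to confirm that the pinned versions were actually installed. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_versions` are hypothetical, and it assumes each package registers metadata under its pip distribution name (`torch`, `torchvision`, `mmcv-full`), which may not hold for every conda build.

```python
"""Compare installed package versions against the pins used in the setup commands."""

# Pins copied from the install commands above; keys are pip distribution names.
EXPECTED = {
    "torch": "1.8.0",
    "torchvision": "0.9.0",
    "mmcv-full": "1.3.14",
}


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # ships with setuptools; fallback for Python 3.7
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


def check_versions(expected, lookup=installed_version):
    """Return a list of mismatch messages; an empty list means every pin matches."""
    problems = []
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        # Builds may append a local tag such as "+cu111", so compare the part before it.
        elif found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_versions(EXPECTED) or ["all pinned packages match"]:
        print(line)
```

The `lookup` parameter exists only so the comparison logic can be exercised without the packages installed; in normal use, call `check_versions(EXPECTED)` inside the activated `rewrite` environment.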

Run

Curation

CONFIG='configs/train/cocu_clip-vit-b-16_8_c3_30e.yml'
DATA='c3'
MODEL='clip-vit-b-16'
Turn the set of image-caption pairs into CLIP embeddings.
bash scripts/inference.sh --data ${DATA} --model ${MODEL}
Build a search index from the CLIP embeddings.
bash scripts/index.sh --data ${DATA} --model ${MODEL}
Rewrite the semantics of the image captions.
python rewrite/curation.py --data ${DATA} --model ${MODEL}
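
To give a rough intuition for the indexing step, the sketch below shows what a search index over embeddings does: given a query vector, it returns the stored items with the highest cosine similarity. This is a toy illustration only and is not the repository's implementation. The class `BruteForceIndex`, the helper `normalize`, and the 3-dimensional vectors are all made up for the example; real CLIP embeddings are much higher-dimensional, and a practical index would typically use an approximate nearest-neighbour library rather than brute force.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so that dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class BruteForceIndex:
    """Toy stand-in for a nearest-neighbour index over embeddings (hypothetical)."""

    def __init__(self):
        self._keys = []
        self._vectors = []

    def add(self, key, vec):
        """Store a key together with its unit-normalized embedding."""
        self._keys.append(key)
        self._vectors.append(normalize(vec))

    def search(self, query, k=1):
        """Return the k (key, cosine similarity) pairs most similar to the query."""
        q = normalize(query)
        scored = [
            (key, sum(a * b for a, b in zip(q, vec)))
            for key, vec in zip(self._keys, self._vectors)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


# Made-up 3-d vectors standing in for caption embeddings.
index = BruteForceIndex()
index.add("a dog on grass", [0.9, 0.1, 0.0])
index.add("a cat on a sofa", [0.1, 0.9, 0.0])
index.add("a plane in the sky", [0.0, 0.1, 0.9])

nearest = index.search([0.8, 0.2, 0.1], k=2)
print([key for key, _ in nearest])  # ['a dog on grass', 'a cat on a sofa']
```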

Pre-train

./tools/dist_launch.sh main_group_vit.py ${CONFIG} 4
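
The curation and pre-training commands above can also be chained from a small Python driver, which makes it easier to stop at the first failing step. This is a sketch under stated assumptions rather than a script shipped with the repository: the function names are hypothetical, the commands are copied from this README, and the trailing `4` of the launch command is assumed (not confirmed here) to be the number of GPUs to launch on.

```python
import subprocess


def curation_commands(data, model):
    """Return the three curation commands, in order, as argv lists."""
    common = ["--data", data, "--model", model]
    return [
        ["bash", "scripts/inference.sh"] + common,   # image-caption pairs -> CLIP embeddings
        ["bash", "scripts/index.sh"] + common,       # embeddings -> search index
        ["python", "rewrite/curation.py"] + common,  # rewrite caption semantics
    ]


def pretrain_command(config, launch_arg="4"):
    """Return the pre-training launch command as an argv list."""
    # launch_arg mirrors the trailing "4" in the README command; it is assumed
    # (not confirmed by the README) to be the number of GPUs to launch on.
    return ["./tools/dist_launch.sh", "main_group_vit.py", config, str(launch_arg)]


def run_all(config, data, model, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    commands = curation_commands(data, model) + [pretrain_command(config)]
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return commands


if __name__ == "__main__":
    run_all(
        config="configs/train/cocu_clip-vit-b-16_8_c3_30e.yml",
        data="c3",
        model="clip-vit-b-16",
        dry_run=True,  # set to False to execute; run from the repository root
    )
```

With `dry_run=True` the driver only prints the four commands, which is a cheap way to check the arguments before committing to a long run.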

Citation

Please consider citing our paper if you find our work useful.

@inproceedings{xing2023rewrite,
    title={Rewrite Caption Semantics: Bridging Semantic Gaps for Language-Supervised Semantic Segmentation},
    author={Yun Xing and Jian Kang and Aoran Xiao and Jiahao Nie and Ling Shao and Shijian Lu},
    booktitle={Advances in Neural Information Processing Systems},
    year={2023},
}

Acknowledgement

The repo is built on GroupViT and clip-retrieval.
