
On the Powerfulness of Textual Outlier Exposure for Visual OoD Detection

This codebase provides a PyTorch implementation of the NeurIPS 2023 paper On the Powerfulness of Textual Outlier Exposure for Visual OoD Detection.

Overview

(Overview figure.)

Preparation

Word-level outlier

To train the text decoder for word-level outliers, run python preprocess/train_decoder.py. This requires the MS-COCO dataset, which you need to download and place under the data folder (TOE/data/MS-COCO). We also provide a model pre-trained for 100 epochs in this Google Drive link; place the checkpoint under preprocess/trained_model. The text decoder code is adapted from ZOC.

Word-level outliers are generated on the fly while running main.py. We also provide a pre-processed .npy file; for a quick start, run the code with the --debug option set to True to load it.
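If you want to sanity-check a pre-generated outlier file before training, a minimal sketch (assuming the .npy stores a 1-D array of outlier strings; the commented path follows the repo's npys/ layout):

```python
import numpy as np

# Hypothetical inspection of a pre-generated word-level outlier file.
# The array layout (1-D array of strings) is an assumption; point the
# path at wherever you placed the provided .npy files.
# path = "preprocess/npys/ImageNet/ImageNet_outlier_word.npy"
# outliers = np.load(path, allow_pickle=True)

# Self-contained demo with a dummy file so the snippet runs as-is:
dummy = np.array(["rusty anchor", "foggy harbor", "broken compass"])
np.save("demo_outlier_word.npy", dummy)
outliers = np.load("demo_outlier_word.npy", allow_pickle=True)
print(len(outliers), outliers[0])
```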

Description-level outlier

We adopted the method for generating descriptions of the in-distribution dataset from this paper. Before running the code, download the .json files for your target in-distribution data from this link and place them under the preprocess folder. To create the .npy file of description-level textual outliers, run

cd preprocess
python description.py

Caption-level outlier

Create the .npy files for caption-level outliers by running the commands below.

cd preprocess
# generate captions (create {in_dataset}_outlier_caption.npy)
python blip.py
# index for filtering generated captions (create {in_dataset}_outlier_caption_index.npy)
python caption_select.py

For a quick start, please refer to this Google Drive link, which contains all the .npy files.
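As a rough sketch of how the filtering index is typically applied (assuming the *_index.npy file holds integer positions of the captions to keep; the exact semantics inside caption_select.py may differ):

```python
import numpy as np

# Hypothetical use of the caption index file: keep only the captions
# whose positions survive filtering. The array semantics are an
# assumption; keep_index stands in for *_outlier_caption_index.npy.
captions = np.array(["a dog on grass", "blurry frame", "a red car"])
keep_index = np.array([0, 2])
filtered = captions[keep_index]
print(filtered.tolist())  # → ['a dog on grass', 'a red car']
```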

In-distribution Dataset

imagenet_class_clean.npy from MCM

Out-of-distribution Dataset

We use the large-scale OoD datasets iNaturalist, SUN, Places, and Textures curated by Huang et al. (2021). Please follow the instructions in this repository to download the subsampled versions, from which classes semantically overlapping with ImageNet-1K have been removed.

The overall file structure is as follows:

TOE
|--data
   |--imagenet_class_clean.npy
|--preprocess
   |--descriptors_imagenet.json
   |--npys
      |--ImageNet
         |--ImageNet_outlier_word.npy
         |--ImageNet_outlier_description.npy
         |--ImageNet_outlier_caption.npy
         |--ImageNet_outlier_caption_index.npy
   |--trained_model
      |--model_epoch100.pt
   |--data
      |--ImageNet
         |--ImageNet_classwise_mean_ImageNet_250_True.pt
         |--ImageNet_precision_ImageNet_250_True.pt
|--datasets
   |--Imagenet
   |--iNaturalist
      |--images
      |--class_list_old.txt
   |--SUN
      |--images
      |--class_list_old.txt
   |--Places
      |--images
      |--class_list_old.txt
   |--dtd
      |--images
      |--class_list.txt

Quick Start

  • --outlier: type of textual outlier (word, description, or caption)
  • --debug: load the pre-generated word-level textual outlier .npy file
  • --noise: add noise to the text embeddings to reduce the modality gap
  • --run: name for a single run
# word-level textual outlier
python main.py --in_dataset ImageNet --num_classes 1000 --outlier word --run test 
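The idea behind --noise can be sketched as perturbing the unit-normalized text embeddings with small Gaussian noise and re-normalizing, so the textual outliers sit closer to the image-embedding region. This is a minimal illustration, not the repo's exact implementation; sigma and the embedding size are hypothetical:

```python
import numpy as np

# Minimal sketch of noising text embeddings to reduce the modality gap.
# CLIP-style embeddings are unit-normalized, so we re-normalize after
# adding the noise. sigma is an illustrative value, not the repo's.
rng = np.random.default_rng(0)
text_emb = rng.standard_normal((4, 512)).astype(np.float32)
text_emb /= np.linalg.norm(text_emb, axis=1, keepdims=True)

sigma = 0.01
noisy = text_emb + sigma * rng.standard_normal((4, 512)).astype(np.float32)
noisy /= np.linalg.norm(noisy, axis=1, keepdims=True)
print(noisy.shape)
```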

Image vs Text

The code for this part will be released soon.

  • --mode: real or virtual (auxiliary dataset vs. synthesis in feature space)
  • --domain: image or text

We use ImageNet10 and ImageNet20 from MCM.
python run.py --in_dataset ImageNet10 --num_classes 10 --outlier dtd --domain text --mode virtual --run test 
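For the virtual mode, a rough sketch of feature-space outlier synthesis: fit a Gaussian to in-distribution features, sample candidates, and keep the lowest-likelihood ones. This follows the general recipe suggested by the classwise-mean/precision checkpoints in the file structure, but the shapes and constants here are purely illustrative:

```python
import numpy as np

# Hypothetical sketch of "virtual" outlier synthesis in feature space:
# fit a Gaussian to ID features, sample candidates, and keep the ones
# with the largest Mahalanobis distance (i.e., lowest likelihood).
rng = np.random.default_rng(0)
feats = rng.standard_normal((500, 16))              # stand-in ID features
mean = feats.mean(axis=0)
cov = np.cov(feats, rowvar=False) + 1e-3 * np.eye(16)
prec = np.linalg.inv(cov)

samples = rng.multivariate_normal(mean, cov, size=200)
d = samples - mean
maha = np.einsum("ij,jk,ik->i", d, prec, d)         # squared Mahalanobis
virtual_outliers = samples[np.argsort(maha)[-20:]]  # 20 least-likely samples
print(virtual_outliers.shape)
```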

Citation and Paper availability

You can find the arXiv version of the paper here: https://arxiv.org/abs/2310.16492

Please cite our paper with the following BibTeX:

@inproceedings{NEURIPS2023_a2374637,
 author = {Park, Sangha and Mok, Jisoo and Jung, Dahuin and Lee, Saehyung and Yoon, Sungroh},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Oh and T. Naumann and A. Globerson and K. Saenko and M. Hardt and S. Levine},
 pages = {51675--51687},
 publisher = {Curran Associates, Inc.},
 title = {On the Powerfulness of Textual Outlier Exposure for Visual OoD Detection},
 url = {https://proceedings.neurips.cc/paper_files/paper/2023/file/a2374637af47ac9471b43c99b68acf27-Paper-Conference.pdf},
 volume = {36},
 year = {2023}
}

