This project forked from gyhandy/hierarchy-clip

[CVPR 2023] Improving Zero-shot Generalization and Robustness of Multi-modal Models

Hierarchy-CLIP

Improving Zero-shot Generalization and Robustness of Multi-modal Models
Yunhao Ge*, Jie Ren*, Andrew Gallagher, Yuxiao Wang, Ming-Hsuan Yang, Hartwig Adam, Laurent Itti, Balaji Lakshminarayanan, Jiaping Zhao (* = equal contribution)
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Figure: Our zero-shot classification pipeline consists of 2 steps: confidence estimation via self-consistency (left block) and top-down and bottom-up label augmentation using the WordNet hierarchy (right block).
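The confidence-estimation step in the caption above can be pictured with a small sketch. It assumes that "self-consistency" means checking whether the top-1 label stays the same when the prompt template or the image transformation changes; the function names and the example labels are hypothetical and are not taken from the repository or the notebook.

```python
from collections import Counter
from typing import Sequence


def is_self_consistent(predictions: Sequence[str]) -> bool:
    """Return True when every prediction names the same class."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return len(set(predictions)) == 1


def majority_label(predictions: Sequence[str]) -> str:
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical top-1 labels for one image under different prompt templates
# and image transformations (values invented for illustration).
stable = ["tabby cat", "tabby cat", "tabby cat"]
unstable = ["horned viper", "sidewinder", "horned viper"]

print(is_self_consistent(stable))    # True  -> treated as high confidence
print(is_self_consistent(unstable))  # False -> treated as low confidence
print(majority_label(unstable))      # horned viper
```

In this reading, only images flagged as inconsistent would be passed on to the label-augmentation step.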

Figure: Typical failure modes in the cases where top-5 prediction was correct but top-1 was wrong.
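In cases like these, where the correct label is already among the top-5 candidates, the label-augmentation step from the pipeline figure can be thought of as re-describing each candidate with its WordNet parent (top-down) and children (bottom-up). The sketch below is only an illustration under stated assumptions: the toy hierarchy is written by hand rather than read from WordNet, and the prompt templates and the `augment_label` helper are hypothetical, not the ones used in the paper or the notebook.

```python
from typing import Dict, List

# Toy hierarchy written by hand for illustration; a real run would read
# parents and children from WordNet instead.
TOY_HIERARCHY: Dict[str, Dict[str, List[str]]] = {
    "horned viper": {"parents": ["viper"], "children": []},
    "viper": {"parents": ["snake"], "children": ["horned viper", "adder"]},
}


def augment_label(label: str, hierarchy: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Build the plain prompt plus top-down and bottom-up variants for a label."""
    node = hierarchy.get(label, {"parents": [], "children": []})
    prompts = [f"a photo of a {label}"]
    # Top-down: qualify the label with each of its parents.
    for parent in node["parents"]:
        prompts.append(f"a photo of a {label}, which is a kind of {parent}")
    # Bottom-up: describe the label through each of its children.
    for child in node["children"]:
        prompts.append(f"a photo of a {child}, which is a kind of {label}")
    return prompts


for prompt in augment_label("viper", TOY_HIERARCHY):
    print(prompt)
```

The augmented prompts for each top-5 candidate could then be scored against the image and the candidates re-ranked; that scoring step is left out here because it depends on the model code in the notebook.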

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/gyhandy/Hierarchy-CLIP.git
cd Hierarchy-CLIP
  • Install the required library:
git clone https://github.com/google-research/scenic.git
cd scenic
pip install .

Load dataset:

  • Most of the datasets used in the paper can be loaded through tensorflow_datasets with our provided function:
    dset = load_dataset('imagenet2012')
    Note: make sure you have a registered ImageNet account.
  • Alternatively, download ImageNet first, process it with tensorflow_datasets, and load it with:
    dset = load_dataset_from(data_dir='YOUR/LOCAL/PATH/imagenet2012', dataset='imagenet2012', split='validation')
  • To use the other datasets (Table 2 in the paper), e.g., caltech101, Food-101, Flower102, Cifar-100, use or adapt our function load_dataset_info():
    # caltech101
    caltech101_dset, caltech101_dset_info = load_dataset_info('caltech101', split='test')
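
If you need to adapt the loaders to another dataset, the sketch below shows one way such a helper could wrap `tfds.load`. It is not the repository's implementation of `load_dataset_info()`: the names `tfds_load_kwargs` and `load_split_with_info` are hypothetical, and passing `download=False` for a local directory is an assumption about how already-prepared data would be handled.

```python
from typing import Any, Dict, Optional


def tfds_load_kwargs(split: str, data_dir: Optional[str] = None) -> Dict[str, Any]:
    """Build the keyword arguments passed to tfds.load."""
    kwargs: Dict[str, Any] = {"split": split, "with_info": True}
    if data_dir is not None:
        # Assumption: data under data_dir is already prepared, so skip downloading.
        kwargs["data_dir"] = data_dir
        kwargs["download"] = False
    return kwargs


def load_split_with_info(name: str, split: str, data_dir: Optional[str] = None):
    """Return (dataset, dataset_info) for one split of a TFDS dataset."""
    # Imported lazily so this module can be imported without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(name, **tfds_load_kwargs(split, data_dir))
```

A call such as `dset, info = load_split_with_info('caltech101', 'test')` would then return the split together with its metadata, and `info.features['label'].names` lists the class names that the text prompts are built from.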

Download WordNet hierarchy information to build top-down and bottom-up prompt augmentation:

Code

We provide a Colab notebook that contains all the details:

Hierarcy_Clip.ipynb

Contact / Cite

Got Questions? We would love to answer them! Please reach out by email! You may cite us in your research as:

@inproceedings{ge2023improving,
  title={Improving Zero-shot Generalization and Robustness of Multi-modal Models},
  author={Ge, Yunhao and Ren, Jie and Gallagher, Andrew and Wang, Yuxiao and Yang, Ming-Hsuan and Adam, Hartwig and Itti, Laurent and Lakshminarayanan, Balaji and Zhao, Jiaping},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11093--11101},
  year={2023}
}
