
[ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset


CIRCO Dataset (ICCV 2023)


🔥🔥 [2024/05/07] Following the extended version of our paper, the evaluation server now also reports results broken down by semantic category.

This is the official repository of the Composed Image Retrieval on Common Objects in context (CIRCO) dataset.

For more details please see our ICCV 2023 paper "Zero-Shot Composed Image Retrieval with Textual Inversion" and its extended version "iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval".

You are currently viewing the dataset repository. If you are looking for more information about our method, SEARLE, see its repository.


Overview

CIRCO (Composed Image Retrieval on Common Objects in context) is an open-domain benchmarking dataset for Composed Image Retrieval (CIR) based on real-world images from the COCO 2017 unlabeled set. It is the first CIR dataset with multiple ground truths and aims to address the problem of false negatives in existing datasets. CIRCO comprises a total of 1020 queries, randomly divided into 220 for the validation set and 800 for the test set, with an average of 4.53 ground truths per query. We evaluate the performance on CIRCO using mAP@K.
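To make the metric concrete, the following is a minimal sketch of mAP@K for queries with multiple ground truths. It assumes one common definition, in which the sum of precisions at the ranks of relevant images is normalised by min(K, number of ground truths), and it assumes each ranked list contains no duplicate image ids. The function names are illustrative only; the evaluation.py script in this repository remains the authoritative implementation.

```python
from typing import Dict, Sequence


def average_precision_at_k(retrieved: Sequence[int],
                           ground_truths: Sequence[int],
                           k: int) -> float:
    """AP@K for one query (assumed definition, see the lead-in)."""
    gt = set(ground_truths)
    if not gt or k <= 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    # Walk the top-k list; add precision@rank at every relevant rank.
    for rank, img_id in enumerate(retrieved[:k], start=1):
        if img_id in gt:
            hits += 1
            precision_sum += hits / rank
    # Normalise by the best achievable number of hits within the top k.
    return precision_sum / min(k, len(gt))


def mean_average_precision_at_k(predictions: Dict[int, Sequence[int]],
                                ground_truths: Dict[int, Sequence[int]],
                                k: int) -> float:
    """Mean of AP@K over all queries that have ground truths."""
    scores = [
        average_precision_at_k(predictions.get(qid, []), gts, k)
        for qid, gts in ground_truths.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, with ground truths {1, 3} and the ranking [1, 2, 3, 4], the relevant images sit at ranks 1 and 3, so AP@4 = (1/1 + 2/3) / 2.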

Download

CIRCO is structured similarly to CIRR and FashionIQ, two popular datasets for CIR.

Start by cloning the repository:

git clone https://github.com/miccunifi/CIRCO.git

Annotations

The annotations are provided in the annotations folder. For each split, a JSON file contains the list of the corresponding annotations. Each annotation comprises the following fields:

  • reference_img_id: the id of the reference image;
  • target_img_id: the id of the target image (the one we used to write the relative caption);
  • relative_caption: the relative caption of the target image;
  • shared_concept: the shared concept between the reference and target images (useful to clarify ambiguities);
  • gt_img_ids: the list of ground truth images;
  • id: the id of the query;
  • semantic_aspects: the list of semantic aspects that characterize the query.
Annotation example:
{
    "reference_img_id": 85932,
    "target_img_id": 9761,
    "relative_caption": "is held by a little girl on a chair",
    "shared_concept": "a teddy bear",
    "gt_img_ids": [
        9761,
        489535,
        57541,
        375057,
        119881
    ],
    "id": 13,
    "semantic_aspects": [
        "spatial_relations_background",
        "direct_addressing"
    ]
}

Note that:

  • target_img_id, gt_img_ids and semantic_aspects are not available for the test set.
  • target_img_id always corresponds to the first element of gt_img_ids.
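The field list and the two notes above can be checked programmatically. The sketch below loads a split file with the standard json module and validates each entry; the helper names (load_annotations, validate_annotation) and the has_ground_truth flag are hypothetical and not part of the repository.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Fields present in every split, and fields present only when ground truth is released.
REQUIRED_FIELDS = ("reference_img_id", "relative_caption", "shared_concept", "id")
GROUND_TRUTH_FIELDS = ("target_img_id", "gt_img_ids", "semantic_aspects")


def load_annotations(path: str) -> List[Dict[str, Any]]:
    """Read one split file (a JSON list of annotation objects)."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def validate_annotation(ann: Dict[str, Any], has_ground_truth: bool) -> None:
    """Raise ValueError if an annotation lacks an expected field or breaks the
    'target is the first ground truth' rule stated above."""
    expected = list(REQUIRED_FIELDS)
    if has_ground_truth:
        expected += GROUND_TRUTH_FIELDS
    missing = [name for name in expected if name not in ann]
    if missing:
        raise ValueError(f"query {ann.get('id')}: missing fields {missing}")
    if has_ground_truth:
        gts = ann["gt_img_ids"]
        if not gts or gts[0] != ann["target_img_id"]:
            raise ValueError(f"query {ann['id']}: target is not the first ground truth")
```

A typical use would be to call load_annotations on annotations/val.json and run validate_annotation(ann, has_ground_truth=True) on each entry, switching the flag to False for the test split.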

Images

CIRCO is based on images taken from the COCO 2017 unlabeled set. Please see the COCO website to download both the images and the corresponding annotations.

Tip: sometimes when clicking on the download link from the COCO website, the download does not start. In this case, copy the download link and paste it into a new browser tab.

Create a folder named COCO2017_unlabeled in the CIRCO folder:

cd CIRCO
mkdir COCO2017_unlabeled

After downloading the images and the annotations, unzip and move them to the COCO2017_unlabeled folder.

Data Structure

After the download, the data structure should be as follows:

CIRCO
└─── annotations
        | test.json
        | val.json

└─── COCO2017_unlabeled
    └─── annotations
        | image_info_unlabeled2017.json

    └─── unlabeled2017
        | 000000243611.jpg
        | 000000535009.jpg
        | 000000097553.jpg
        | ...
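Given this layout, an image id from the annotations can be turned into a file path. The sketch below assumes the usual COCO convention that the file name is the image id zero-padded to 12 digits, which matches the file names in the tree above; the function name coco_image_path is hypothetical.

```python
from pathlib import Path


def coco_image_path(circo_root: str, img_id: int) -> Path:
    """Build the path of an unlabeled2017 image from its numeric id.

    Assumption: file names are the id zero-padded to 12 digits plus '.jpg'.
    """
    return (Path(circo_root) / "COCO2017_unlabeled" / "unlabeled2017"
            / f"{img_id:012d}.jpg")
```

For instance, coco_image_path("CIRCO", 243611) points at the first file listed in the tree.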

Test Evaluation Server

We do not release the ground truth labels for the CIRCO test split. Instead, we host an evaluation server to allow researchers to evaluate their models on the test split. The server is hosted independently, so please email us if the site is unreachable.

Once you have submitted your predictions, you will receive an email with the results.

Submission Format

The evaluation server accepts a JSON file where the keys are the query ids and the values are the lists of the top 50 retrieved images.

Note that:

  • the submission file must contain all the queries in the test set;
  • to limit the size of the submission file, you must submit only the top 50 retrieved images for each query.

The submission file should be formatted as the following example:

{
    "0": [
        9761,
        489535,
        57541,
        375057,
        119881,
        ...
    ],
    "1": [
        9761,
        489535,
        57541,
        375057,
        119881,
        ...
    ],
    ...
    "799": [
        9761,
        489535,
        57541,
        375057,
        119881,
        ...
    ]
}

Under the submission_examples/ directory, we provide two examples of submission files: one for the validation set and one for the test set. Since we release the GT labels for the validation set, you do not need to submit a submission file to evaluate your model on the validation set. We provide the submission file for the validation set only to test the evaluation server code through the evaluation.py script. See here for an example of how to generate a submission file.
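As a rough illustration of the submission format described above, the sketch below turns per-query rankings into a dictionary keyed by query id (as a string) with at most 50 image ids per query, and writes it as JSON. The helper names build_submission and save_submission and the shape of the rankings input are assumptions made for this example, not part of the repository.

```python
import json
from typing import Dict, Iterable, List


def build_submission(rankings: Dict[int, List[int]],
                     query_ids: Iterable[int],
                     top_k: int = 50) -> Dict[str, List[int]]:
    """Keep the top_k image ids for every query id in query_ids.

    Raises KeyError if a query has no predictions, since the submission
    file must contain all the queries of the split.
    """
    submission: Dict[str, List[int]] = {}
    for qid in query_ids:
        if qid not in rankings:
            raise KeyError(f"missing predictions for query {qid}")
        # JSON object keys are strings; ids are cast to plain ints so that
        # values such as NumPy integers serialise cleanly.
        submission[str(qid)] = [int(i) for i in rankings[qid][:top_k]]
    return submission


def save_submission(submission: Dict[str, List[int]], path: str) -> None:
    """Write the submission dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(submission, f)
```

Here query_ids would be the id values read from annotations/test.json, and each list in rankings is expected to be sorted from most to least similar.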

Utility Scripts

Under the src/ directory, we provide some utility scripts to help you use the dataset:

  • dataset.py: contains the CIRCODataset class, which is a PyTorch Dataset class that can be used to load the dataset;
  • evaluation.py: contains the code that runs on the server to evaluate the submitted predictions.

Authors

* Equal contribution. Author ordering was determined by coin flip.

Citation

@misc{baldrati2023zeroshot,
      title={Zero-Shot Composed Image Retrieval with Textual Inversion}, 
      author={Alberto Baldrati and Lorenzo Agnolucci and Marco Bertini and Alberto Del Bimbo},
      year={2023},
      eprint={2303.15247},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgements

This work was partially supported by the European Commission under the European Horizon 2020 Programme, grant number 101004545 - ReInHerit.

LICENSE

All material is made available under Creative Commons BY-NC 4.0. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicate any changes that you've made.


