[ACL 2024 Findings] Dataset and Code of "ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction"

ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction

Task

This repository contains the dataset and code of the paper:

ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction [Paper] [arXiv] [ACL Anthology] [OpenReview]
Accepted by ACL 2024 Findings
ACL ARR Feb Scores: Soundness - 4/4/4, Overall Assessment - 4/3.5/3.5, Meta - 4
Henry Peng Zou, Vinay Samuel, Yue Zhou, Weizhi Zhang, Liancheng Fang, Zihe Song, Philip S. Yu, Cornelia Caragea

Datasets

Our evaluation and training data are released in the data folder. To get the product images, download them from the links provided in the corresponding folder and unzip them into that same folder.
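The unzip step can be scripted. The following is a minimal sketch, assuming the downloaded image archives are `.zip` files saved somewhere under a folder named `data`. The helper name `unzip_in_place` is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def unzip_in_place(data_dir):
    """Extract every .zip archive found under data_dir into the folder that holds it.

    Assumption: the image downloads are .zip archives; other formats are ignored.
    Returns the list of archives that were extracted.
    """
    extracted = []
    for archive in sorted(Path(data_dir).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Unpack next to the archive, i.e. "into the same folder".
            zf.extractall(archive.parent)
        extracted.append(archive)
    return extracted
```

Calling `unzip_in_place("data")` from the repository root would unpack each archive next to where it was saved.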

Code

The inference code we used for GPT-4V, BLIP-2, InstructBLIP, LLaVA, Qwen-VL, and Qwen-VL-Chat is provided. When running the inference code for each MLLM, please refer to the instructions in the corresponding project for environment setup and package installation.

Here is an example of setting up the environment and running the inference and evaluation code for Qwen:

Setup

# Environment setup
conda create -n Qwen python=3.9 -y
conda activate Qwen

# install pytorch
conda install pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=11.8 -c pytorch -c nvidia

# install dependency
# cd code/Qwen-VL
pip install -r requirements.txt
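After the setup commands finish, a quick sanity check can confirm that the PyTorch packages are importable before opening the notebooks. This is only a sketch: the helper `check_environment` is hypothetical, and the package list mirrors the conda command above rather than the contents of requirements.txt.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "torchvision", "torchaudio")):
    """Return a dict mapping each package name to whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


status = check_environment()
print("python", sys.version.split()[0])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")

# Only query CUDA when torch is actually installed.
if status["torch"]:
    import torch
    print("CUDA available:", torch.cuda.is_available())
```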

Evaluation

To start inference and evaluation, run the Qwen_VL_7B.ipynb and Qwen_VL_Chat.ipynb notebooks.

You might need to change the paths to your own data paths and replace the model names with other variants you would like to use.
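One way to keep these edits manageable is to collect the data path and model name in a single place at the top of a notebook. The sketch below is an illustration, not the notebooks' actual code: `RunConfig`, `make_config`, and the `data` default are hypothetical, and the Hugging Face ids `Qwen/Qwen-VL` and `Qwen/Qwen-VL-Chat` are assumed to be the variants the two notebooks load.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumed mapping from notebook name to a public Hugging Face model id.
MODEL_NAMES = {
    "Qwen_VL_7B": "Qwen/Qwen-VL",
    "Qwen_VL_Chat": "Qwen/Qwen-VL-Chat",
}


@dataclass
class RunConfig:
    data_dir: Path   # hypothetical: folder with annotations and unzipped images
    model_name: str  # model id passed to the loader in the notebook

    def validate(self):
        """Fail early if the data folder does not exist."""
        if not self.data_dir.is_dir():
            raise FileNotFoundError(f"data folder not found: {self.data_dir}")
        return self


def make_config(notebook, data_dir="data"):
    """Build a config for one of the notebooks listed in MODEL_NAMES."""
    return RunConfig(Path(data_dir), MODEL_NAMES[notebook])
```

Swapping in another variant then means changing one entry in `MODEL_NAMES` instead of searching the notebook for hard-coded strings.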

Bugs or Questions

If you have any questions related to the dataset or the paper, feel free to email Henry Peng Zou ([email protected]) and Vinay Samuel ([email protected]). If you encounter any problems when using the code or want to report a bug, you can open an issue. Please describe the problem in detail so we can help you better and more quickly!

Citation

If you find this repository helpful, please consider citing our paper 💕:

@article{zou2024implicitave,
    title={ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction},
    author={Henry Peng Zou and Vinay Samuel and Yue Zhou and Weizhi Zhang and Liancheng Fang and Zihe Song and Philip S. Yu and Cornelia Caragea},
    journal={arXiv preprint arXiv:2404.15592},
    year={2024}
}

Acknowledgement

This repo borrows some data and code from MAVE, LaVIN and Llama, GPT-4V, BLIP-2, InstructBLIP, LLaVA, Qwen-VL, and Qwen-VL-Chat. We appreciate their great work!
