The mllm-dpo from findalexli

This repo contains the code and the data for the following paper:

@misc{li2024multimodal,
    title={Multi-modal preference alignment remedies regression of visual instruction tuning on language model},
    author={Shengzhi Li and Rongyu Lin and Shichao Pei},
    year={2024},
    eprint={2402.10884},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

[arXiv paper] [GitHub] [Data] [Model]

Developers: Shengzhi Li (TIFIN.AI), Rongyu Lin (KAUST), Shichao Pei (University of Massachusetts Boston)

Affiliations: TIFIN, KAUST, University of Massachusetts Boston

Contact Information: [email protected], [email protected], [email protected]

Introduction
Installation
Data Preparation
Training
Evaluation

Introduction

This guide provides step-by-step instructions for fine-tuning the LLaVA model with the alignment methods and for evaluating the result. It focuses on visual instruction tuning with the SciGraphQA and LRV-Instruct datasets.

Installation

Unzip the repository:

Set up the environment:

conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip
pip install -e .

Install packages for training:

pip install -e ".[train]"
pip install flash-attn --no-build-isolation
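
After installation, a quick import check can confirm the environment is usable. The following is a minimal sketch, not part of the repository. It assumes the editable install exposes a top-level package named `llava` (inferred from the upstream LLaVA project, not confirmed here) and that flash-attn is importable as `flash_attn`.

```python
import importlib.util
import sys


def check_environment(packages=("llava", "flash_attn"), min_python=(3, 10)):
    """Report whether the Python version and the given packages look usable.

    The default package names are assumptions; adjust them to your install.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```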

Data Preparation

Download datasets and images:
- SciGraphQA: Download Link
- LRV-Instruct: Download Link

The images for LRV-Instruct can be downloaded with: gdown https://drive.google.com/uc?id=1k9MNV-ImEV9BYEOeLEIb4uGEUZjd3QbM

The images for SciGraphQA can be downloaded from: https://huggingface.co/datasets/alexshengzhili/SciGraphQA-295K-train/resolve/main/img.zip?download=true

Organize the images in ./playground/data:

```
playground/
└── data/
    ├── scigraphqa/
    │   └── images/
    └── lrv_instruct/
        └── images/
```

For DPO, see playground/data/dpo_inference0104.with_logpllava-v1.5-13b_2024-02-03.json.

For the non-DPO alignment methods (SteerLM, rejection sampling, and standard SFT), the data for each method is provided in the data folder: playground/data/rejection_sampling.json, playground/data/standard_sft.json, and playground/data/steerlm.json.

Training

Use scripts/v1/finetune_dpo.sh for DPO experiments.
Use scripts/v1/finetune_steer.sh for non-DPO experiments.

Evaluation

Use the provided evaluation scripts under scripts/v1_5/eval/ to assess the performance of your fine-tuned model on various benchmarks. Follow the guidelines on greedy decoding so that results stay consistent with real-time outputs.
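
Greedy decoding means taking the highest-probability token at each step instead of sampling. The sketch below shows how that choice is commonly expressed as keyword arguments to a Hugging Face style `generate` call. The helper `generation_kwargs` and the `temperature` switch are illustrative assumptions, not the repository's evaluation code, so check the scripts for the options they actually expose.

```python
def generation_kwargs(temperature=0.0, max_new_tokens=512):
    """Build keyword arguments for a Hugging Face style `generate` call.

    A temperature of 0 selects greedy decoding; any positive value enables sampling.
    """
    if temperature > 0:
        return {
            "do_sample": True,
            "temperature": temperature,
            "max_new_tokens": max_new_tokens,
        }
    # Greedy decoding: no sampling and a single beam, so the top token is picked each step
    return {"do_sample": False, "num_beams": 1, "max_new_tokens": max_new_tokens}
```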

We thank the authors of LLaVA and Vicuna, on which the original state of this repo is based.

Loading Dataset returns error from huggingface

Running this:

from datasets import load_dataset
dataset = load_dataset("alexshengzhili/mllm-dpo",split='train[0:1]',trust_remote_code=True)

returns this error:

ArrowInvalid Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/datasets/packaged_modules/json/json.py in _generate_tables(self, files)
121 try:
--> 122 pa_table = paj.read_json(
123 io.BytesIO(batch), read_options=paj.ReadOptions(block_size=block_size)

17 frames
ArrowInvalid: JSON parse error: Column() changed from object to array in row 0

During handling of the above exception, another exception occurred:

ArrowTypeError Traceback (most recent call last)
ArrowTypeError: Expected bytes, got a 'int' object

The above exception was the direct cause of the following exception:

DatasetGenerationError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/datasets/builder.py in _prepare_split_single(self, gen_kwargs, fpath, file_format, max_shard_size, job_id)
2036 if isinstance(e, DatasetGenerationError):
2037 raise
-> 2038 raise DatasetGenerationError("An error occurred while generating the dataset") from e
2039
2040 yield job_id, True, (total_num_examples, total_num_bytes, writer._features, num_shards, shard_lengths)

DatasetGenerationError: An error occurred while generating the dataset
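
One plausible reading of this traceback is that some field holds a string in some records and an integer in others, which Arrow cannot fit into a single column type. This is an assumption, not something the repository authors have confirmed. The sketch below is a diagnostic for records already loaded with the standard `json` module. It lists the fields whose Python type varies and offers a simple normalization. Both helper names are hypothetical.

```python
from collections import defaultdict


def find_mixed_type_fields(records):
    """Return {field: sorted type names} for top-level fields seen with more than one type."""
    seen = defaultdict(set)
    for rec in records:
        for key, value in rec.items():
            # None is reported as its own type ("NoneType")
            seen[key].add(type(value).__name__)
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}


def stringify_fields(records, fields):
    """Return copies of the records with the named fields cast to str."""
    return [
        {k: (str(v) if k in fields else v) for k, v in rec.items()}
        for rec in records
    ]
```

If the diagnostic reports a mixed field, casting it to a single type before building the dataset is one possible workaround. Whether that matches the intended schema is for the dataset authors to confirm.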


