Comments (9)

Liuziyu77 commented on July 30, 2024

We will release our model soon. You can also train your own model with MMDU. The training code depends on which model you are using.

from mmdu.

Liuziyu77 commented on July 30, 2024

MMDU can be applied to various LVLMs.


daixiangzi commented on July 30, 2024

> We will release our model soon. You can also train your own model with MMDU. The training code depends on which model you are using.

Haha, we are preparing to do this.


daixiangzi commented on July 30, 2024

> MMDU can be applied to various LVLMs.

The maximum number of images per sample in MMDU is 20. In fact, if I use llava3-clip-l14-336 (max token length is 8k), I think I need to use token compression. Have you done any research in this area?


Liuziyu77 commented on July 30, 2024

> > MMDU can be applied to various LVLMs.
>
> The maximum number of images per sample in MMDU is 20. In fact, if I use llava3-clip-l14-336 (max token length is 8k), I think I need to use token compression. Have you done any research in this area?

One of the purposes of MMDU-45k is to enhance the dialogue capabilities of LVLMs in long multi-modal contexts involving text and images. The maximum token length for MMDU-45k is 17k. When fine-tuning, we generally train with a context length of 16k or 32k, and we have not considered token compression.
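To put the 8k question in numbers, here is a rough token budget. It is only a sketch under assumptions not stated in this thread: a CLIP ViT-L/14 encoder at 336 px, one visual token per patch passed to the language model, and no compression. The function name `visual_tokens` is made up for illustration.

```python
def visual_tokens(num_images: int, image_size: int = 336, patch_size: int = 14) -> int:
    """Visual tokens for num_images, assuming one token per ViT patch and no compression."""
    patches_per_side = image_size // patch_size  # 336 // 14 = 24
    return num_images * patches_per_side ** 2    # 24 * 24 = 576 tokens per image


context_limit = 8 * 1024  # an "8k" context window
max_images = 20           # maximum images per sample mentioned above

used = visual_tokens(max_images)
print(f"{max_images} images -> {used} visual tokens (limit {context_limit})")
print("fits in context:", used <= context_limit)
print("images that fit before any text:", context_limit // visual_tokens(1))
```

Under these assumptions a 20-image sample needs 11,520 visual tokens before any text is counted. That is why an 8k model would need shorter samples or some form of token reduction, while a 16k or 32k context avoids the problem.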


Liuziyu77 commented on July 30, 2024

Most samples in MMDU-45k and the MMDU benchmark are around 8k tokens long, so using MMDU-45k to fine-tune an LVLM with an 8k context is also feasible.


daixiangzi commented on July 30, 2024

I tried fine-tuning clip_l14_336-llama3-8b on MMDU, and even with a batch size of 1 it still runs out of memory on an 80 GB A100.


Liuziyu77 commented on July 30, 2024

> I tried fine-tuning clip_l14_336-llama3-8b on MMDU, and even with a batch size of 1 it still runs out of memory on an 80 GB A100.

[image]
MMDU has long contexts, so use zero3.json.
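For readers unfamiliar with the file name: zero3.json most likely refers to a DeepSpeed ZeRO stage 3 config, as shipped with LLaVA-style training scripts. The attached image is not recoverable, so the sketch below is not the authors' file. It is an assumed minimal config, with CPU offload added as one option that is sometimes tried when stage 3 alone is not enough.

```python
import json

# Hypothetical ZeRO stage 3 config; the values are illustrative, not taken from MMDU.
zero3_config = {
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 1,
    "zero_optimization": {
        "stage": 3,  # shard parameters, gradients, and optimizer states across GPUs
        "overlap_comm": True,
        "contiguous_gradients": True,
        # Optional: move optimizer states and parameters to CPU memory.
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}

# Serialize to JSON text that could be saved as a DeepSpeed config file.
config_text = json.dumps(zero3_config, indent=2)
print(config_text)
```

Offloading trades GPU memory for slower steps, and it does not reduce activation memory, which grows with sequence length.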


daixiangzi commented on July 30, 2024

> > I tried fine-tuning clip_l14_336-llama3-8b on MMDU, and even with a batch size of 1 it still runs out of memory on an 80 GB A100.
>
> [image]
> MMDU has long contexts, so use zero3.json.

I am in fact already using zero3, but it still runs out of memory.
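One possible reason ZeRO-3 may not be enough is the size of the model states alone. The back-of-envelope estimate below assumes full fine-tuning with mixed-precision Adam, which takes about 16 bytes per parameter (2 for 16-bit weights, 2 for 16-bit gradients, 12 for fp32 weights and the two Adam moments), and roughly 8e9 trainable parameters. The number of GPUs is not stated in this thread, so several values are shown. Activations are ignored, and they grow with the long MMDU sequences.

```python
def model_state_gb(num_params: float, num_gpus: int, bytes_per_param: int = 16) -> float:
    """Per-GPU memory (GB) for weights, gradients, and Adam states under ZeRO-3 sharding."""
    return num_params * bytes_per_param / num_gpus / 1e9


num_params = 8e9  # assumed: roughly the size of an 8B language model
for num_gpus in (1, 2, 4, 8):
    per_gpu = model_state_gb(num_params, num_gpus)
    print(f"{num_gpus} GPU(s): ~{per_gpu:.0f} GB of model states per GPU")
```

On a single 80 GB GPU there is nothing to shard across, so this estimate alone exceeds the available memory. With more GPUs the per-device share drops, leaving headroom for activations.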
