Comments (4)

LukeForeverYoung commented on June 2, 2024

You can pass a list of images and place the same number of "<|image|>" in your prompt.
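As a minimal sketch of that advice, the prompt can be built so the number of "<|image|>" placeholders follows the number of images. The helper name `build_multi_image_prompt` is hypothetical and not part of mplug-owl; the `USER: ... ASSISTANT:` layout is taken from the prompt quoted later in this thread.

```python
IMAGE_TOKEN = "<|image|>"  # placeholder string quoted in this thread


def build_multi_image_prompt(question: str, num_images: int) -> str:
    """Return a prompt with one image placeholder per image.

    Hypothetical helper for illustration only; it does not call mplug-owl.
    """
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    # Repeat the placeholder once per image, in the order the images are passed.
    placeholders = IMAGE_TOKEN * num_images
    return f"USER: {placeholders}{question} ASSISTANT:"


prompt = build_multi_image_prompt("Are the two dogs the same color?", 2)
print(prompt)
```

The string is still tokenized and passed to the model the same way as a single-image prompt; only the placeholder count changes.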

from mplug-owl.

fansticOne commented on June 2, 2024

I passed a list of images (say, 2 images) and modified the prompt. After preprocessing, the image_tensor has a batch size of 2, while input_ids has a batch size of 1. When I run model.generate(), I do get a result, but it is wrong. Am I misunderstanding something?
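One possible reading of these shapes, stated as an assumption rather than confirmed behavior: if the image tensor stacks images along its first dimension, a leading size of 2 next to a single prompt is consistent as long as that prompt contains two "<|image|>" placeholders. The check below is a hypothetical sanity test written in plain Python; `check_image_prompt_alignment` is not an mplug-owl function.

```python
IMAGE_TOKEN = "<|image|>"  # placeholder string quoted in this thread


def check_image_prompt_alignment(num_image_rows: int, prompts: list[str]) -> int:
    """Verify that the placeholder count across prompts matches the image count.

    num_image_rows: size of the first dimension of the stacked image tensor
    (assumed here to be the number of images, not the text batch size).
    Returns the placeholder count, or raises ValueError on a mismatch.
    """
    total_placeholders = sum(p.count(IMAGE_TOKEN) for p in prompts)
    if total_placeholders != num_image_rows:
        raise ValueError(
            f"{num_image_rows} images but {total_placeholders} placeholders"
        )
    return total_placeholders


# Two images, one prompt with two placeholders: the counts line up.
prompts = ["USER: <|image|><|image|>Describe both images. ASSISTANT:"]
print(check_image_prompt_alignment(2, prompts))
```

If this check passes and the answer is still wrong, the mismatch in leading dimensions is probably not the cause.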

LukeForeverYoung commented on June 2, 2024

> I passed a list of images (say, 2 images) and modified the prompt. After preprocessing, the image_tensor has a batch size of 2, while input_ids has a batch size of 1. When I run model.generate(), I do get a result, but it is wrong. Am I misunderstanding something?

Could you provide an example and the incorrect response generated by the owl? By the way, the owl has not been trained on SFT data that includes multiple images, so it is reasonable to expect that it might fail in some cases.

fansticOne commented on June 2, 2024

Here are the two images I passed:
1664356777209_m_11
1664356777209_m_17
The prompt is:
'USER: <|image|><|image|>{}\nAnswer the question using a single word or phrase. ASSISTANT:'.format('Does the dog in the first picture have same color with the dog in the second picture?')
The response generated by the owl is 'Yes'.
