Comments (8)

blaze7451 commented on July 30, 2024

@alvarobartt hi, thanks for your detailed explanation in the other post. It looks like I misused device_map="auto" for training; I will follow your instructions and try again later. Big thanks again.

from trl.

alvarobartt commented on July 30, 2024

Hi here too @blaze7451, can you share a Gist with the full fine-tuning script, so that I can test it myself and then provide you with the FSDP configuration and the command to run? But as mentioned in the other issue, if you use accelerate with FSDP and remove device_map="auto", you shouldn't encounter this issue. Thanks in advance!
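
A minimal sketch of the "remove device_map="auto" under FSDP" advice, for readers who want one script for both cases. It assumes the distributed launcher exports the WORLD_SIZE environment variable (torchrun does); the helper name model_load_kwargs is hypothetical and not part of trl or transformers.

```python
import os


def model_load_kwargs(env=None):
    """Return extra keyword arguments for `from_pretrained` (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: the launcher exports WORLD_SIZE for multi-process runs.
    world_size = int(env.get("WORLD_SIZE", "1"))
    if world_size > 1:
        # Distributed run (e.g. FSDP): do not pass device_map="auto",
        # so that the launcher decides where the model is placed.
        return {}
    # Single process: splitting layers across visible GPUs remains an option.
    return {"device_map": "auto"}


# Usage (assumes transformers is installed; the model id is a placeholder):
# model = AutoModelForCausalLM.from_pretrained("<model-id>", **model_load_kwargs())
```

With this, the same script can be run directly in a single process or through a distributed launcher without editing the model-loading call.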

huangxinping commented on July 30, 2024

@alvarobartt Hi buddy, thx. I will try again later.

huangxinping commented on July 30, 2024

@blaze7451 Hey, I'm facing the same issue, do you know how to fix it?

Aisuko commented on July 30, 2024

Hi @alvarobartt, thanks for your response. I hit this issue too, from the same article. I have some experience using device_map="auto" to fine-tune models like phi-2 and Gemma on multiple GPUs. It is really useful when a single GPU does not have enough memory to load the model and the training data. Do you have any suggestions for how we can use ORPO on multiple GPUs in a notebook environment (like Kaggle)? Thank you.

alvarobartt commented on July 30, 2024

But the issue for you @Aisuko is that you're not able to fine-tune Llama 3B with device_map="auto" right? Or is it that you're just unable to use trl.ORPOTrainer + device_map="auto"?

Aisuko commented on July 30, 2024

> But the issue for you @Aisuko is that you're not able to fine-tune Llama 3B with device_map="auto" right? Or is it that you're just unable to use trl.ORPOTrainer + device_map="auto"?

Hi @alvarobartt, thanks for your quick reply. To be clear: can we use trl.ORPOTrainer to fine-tune a model on multiple GPUs with the device_map="auto" parameter?

alvarobartt commented on July 30, 2024

> Hi @alvarobartt, thanks for your quick reply. To be clear: can we use trl.ORPOTrainer to fine-tune a model on multiple GPUs with the device_map="auto" parameter?

Hi @Aisuko, as mentioned before, I don't have much experience relying on device_map="auto" to fine-tune models. Indeed, I thought it was not even supported, but it seems it is, with some tweaks. After reading some issues, I realised that @younesbelkada answered a similar question for the KTOTrainer, which apparently can be solved easily; please read #1560 (comment).
