Comments (8)
@alvarobartt hi, thanks for your detailed explanation in the other post; it looks like I misused device_map="auto" for training. I will follow your instructions and try later. Big thanks again.
from trl.
Hi here too @blaze7451, can you share a Gist with the full fine-tuning script so that I can test it myself and then provide you with the FSDP configuration and launch command? As mentioned in the other issue, though, if you use accelerate with FSDP and remove device_map="auto", you shouldn't encounter this issue. Thanks in advance!
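A minimal sketch of that advice (assuming a standard transformers `from_pretrained` call; `build_model_kwargs` is a hypothetical helper, not part of trl or transformers): under `accelerate launch` with FSDP, the model should be loaded without any `device_map`, because Accelerate handles placement and sharding itself; `device_map="auto"` only makes sense outside such a launcher.

```python
def build_model_kwargs(use_fsdp: bool) -> dict:
    """Hypothetical helper: pick `from_pretrained` kwargs per launch mode."""
    kwargs = {"low_cpu_mem_usage": True}
    if not use_fsdp:
        # Outside `accelerate launch` + FSDP, let HF shard the model
        # across all visible GPUs.
        kwargs["device_map"] = "auto"
    # With FSDP, deliberately pass no `device_map` at all:
    # Accelerate shards the weights across ranks on its own.
    return kwargs

# Under `accelerate launch --config_file fsdp_config.yaml train.py`:
# model = AutoModelForCausalLM.from_pretrained(model_id, **build_model_kwargs(True))
```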
@alvarobartt Hi buddy, thanks. I will try again later.
@blaze7451 Hey, I'm facing the same issue, do you know how to fix it?
Hi @alvarobartt, thanks for your response. I hit this issue too, from the same article. I have some experience using device_map="auto" to fine-tune models like Phi-2 and Gemma on multiple GPUs; it is really useful when a single GPU does not have enough memory for the model and the training data. Do you have any suggestions for running ORPO on multiple GPUs in a notebook environment (like Kaggle)? Thank you.
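For multi-GPU training from inside a notebook, one commonly used option (an assumption on my part, not something confirmed in this thread) is Accelerate's `notebook_launcher`, which spawns one training process per GPU; `train_fn` below is only an illustrative stub standing in for the real ORPO training function.

```python
def train_fn():
    # In a real notebook this function would build the model, the
    # trl.ORPOTrainer, and call .train(); each spawned process is
    # pinned to a single GPU, so no device_map is needed.
    return "done"

# from accelerate import notebook_launcher
# notebook_launcher(train_fn, num_processes=2)  # e.g. 2 GPUs on a Kaggle T4 x2
```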
But the issue for you @Aisuko is that you're not able to fine-tune Llama 3B with device_map="auto" at all, right? Or is it just that you're unable to use trl.ORPOTrainer together with device_map="auto"?
Hi @alvarobartt, thanks for your quick reply. To make it clear: can we use trl.ORPOTrainer to fine-tune a model on multiple GPUs with the device_map="auto" parameter?
Hi @Aisuko, as mentioned before I don't have much experience with relying on device_map="auto" to fine-tune models; indeed, I thought it was not even supported, but it seems it is with some tweaks. After reading some issues I realised that @younesbelkada answered a similar question for the KTOTrainer, which can apparently be solved easily; please read #1560 (comment)
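One pattern often suggested for this kind of multi-GPU trainer problem (a hedged sketch, check the linked comment for the actual fix) is to give every data-parallel process its own full copy of the model by mapping the whole model onto that process's GPU rather than using device_map="auto":

```python
def per_process_device_map(process_index: int) -> dict:
    """Map the entire model ("" = all modules) onto one process's GPU."""
    return {"": process_index}

# Under `accelerate launch` with plain DDP (illustrative, not from the thread):
# from accelerate import PartialState
# model = AutoModelForCausalLM.from_pretrained(
#     model_id,
#     device_map=per_process_device_map(PartialState().process_index),
# )
```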