Comments (7)
Hi! We had a proprietary setup. Are you using Adam, and have you made sure not to pass the non-trainable parameters to the optimizer?
from lora.
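A minimal sketch of the point above, using a toy model rather than LoRA itself: hand AdamW only the parameters that still require gradients. Adam-style optimizers keep two extra fp32 state buffers per tracked parameter, so excluding frozen weights is where the optimizer-memory saving comes from.

```python
import torch
from torch import nn

# Toy stand-in: freeze the first layer to simulate frozen base weights.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
for p in model[0].parameters():
    p.requires_grad = False

# Pass only the still-trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

print(sum(p.numel() for p in trainable))  # only the second layer's parameters
```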
I used AdamW with the Hugging Face transformers Trainer class. It printed a trainable-parameter count, and the number was much smaller with LoRA.
The paper says it only needs 350 GB of VRAM to train 175B GPT-3 with rank = 4. Can you elaborate on how this is done? For example, do you use Megatron-DeepSpeed?
In my experiments with bloom-3b, fine-tuning all parameters needs 29 GB. With LoRA under different experiment settings, the trainable parameter count ranges from 10M down to 0.8M, yet they all need around 20 GB of VRAM. I find this a little weird.
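A likely explanation, as a rough back-of-envelope (numbers approximate, activations and framework overhead ignored): LoRA removes the gradient and Adam-state cost of the base weights, but the frozen fp16 weights stay resident, and so do the activations, which scale with batch size and sequence length rather than with trainable-parameter count. That is why shrinking the adapters from 10M to 0.8M barely moves total VRAM.

```python
def gib(num_bytes):
    return num_bytes / 2**30

base_params = 3e9   # bloom-3b, approximate
lora_params = 10e6  # trainable LoRA parameters from the experiment above

# Full fine-tuning per parameter: fp16 weight (2 B) + fp16 grad (2 B)
# + fp32 Adam m and v states (8 B).
full_ft = gib(base_params * (2 + 2 + 8))

# LoRA: frozen fp16 weights stay resident; only the adapters carry
# gradients and Adam state.
lora_ft = gib(base_params * 2 + lora_params * (2 + 2 + 8))

print(f"full: ~{full_ft:.0f} GiB, lora: ~{lora_ft:.0f} GiB")
```

The gap between this static estimate and the observed ~20 GB is mostly activation memory, which LoRA does not reduce; gradient checkpointing or a smaller batch would shrink it.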
Hello, can I check with you how to use LoRA to fine-tune Bloom-3B? I encountered the issue of Bloom-3B having no v_proj and q_proj in the base model. Thanks a lot!
having no v_proj and q_proj in the base model
Per https://huggingface.co/smangrul/twitter_complaints_bigscience_bloomz-7b1_LORA_CAUSAL_LM/blob/main/adapter_config.json , you need to change the target modules to query_key_value for bloom models. Let me know if that solves your problem.
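In concrete terms, a minimal sketch using the peft library (argument names follow common peft versions and may drift): bloom fuses the Q, K, and V projections into a single query_key_value module, so that is the name LoRA must target.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-3b")

# bloom has no q_proj/v_proj; its attention uses a fused projection
# named query_key_value, so target that module instead.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```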
Hey @zsc , many thanks! I tried it and it worked! Do you mind sharing where I can find more detailed documentation for LoRA online, especially regarding configurations for various types of GPTs?
This may be useful: https://github.com/huggingface/peft/blob/main/src/peft/mapping.py
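The file linked above maps model types to the module names LoRA should wrap by default. An illustrative excerpt of that kind of table (the entries here are copied by hand and may drift from the current source, so check the link for the authoritative version):

```python
# model type -> default LoRA target modules, in the style of peft's mapping
DEFAULT_LORA_TARGET_MODULES = {
    "bloom": ["query_key_value"],   # fused QKV projection
    "gpt2": ["c_attn"],             # fused QKV in GPT-2's Conv1D attention
    "llama": ["q_proj", "v_proj"],  # separate query/value projections
    "roberta": ["query", "value"],
}

print(DEFAULT_LORA_TARGET_MODULES["bloom"])
```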
Thank you! That helps!
Related Issues (20)
- Question about seed numbers.
- Question about the test set of the GLUE benchmark
- [Minor] Possible typos in weight initialization
- How to compute that GPT-2 M (FTTop2) trainable parameters number is 25.19M?
- Can't reproduce the results for GLUE and hyperparameter misalignment
- Layers.py not being executed
- Can not reproduce the result of Roberta-Base
- how to improve the memory ability of lora fine tuning?
- models are the same after loading lora parameters using peft library
- Is it necessary to add `model = model.merge_and_unload()` when training a new LoRA adapter?
- How to adjust LoRA into nn.ConvTranspose2d?
- Cannot implement LoRA on a custom model containing transformer encoder from pytorch
- _conv_forward() error
- Dynamic Lora Selection In Runtime❓
- Reproduce Lora results is close but not accurate
- Guidance Needed on Continuing Training with a New Dataset via LoRA
- After joining Lora, the first few layers show a gradient of 0
- lora-dim == lora-r ?
- LORA on T5 model
- [Question about multi-gpu training]