Comments (6)
H2O LLM Studio needs quite a bit of GPU memory to fine-tune large language models. We employ state-of-the-art techniques to reduce the amount of VRAM required for training.
Make sure to use LoRA, gradient checkpointing, a low batch size (1), and int4 quantization.
May I ask which model you were trying to fine-tune and which GPU you have? With 10 GB of VRAM and the settings above, it may be possible to fine-tune a 7B model, but not a larger one.
from h2o-llmstudio.
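As a rough back-of-envelope check (my own estimate, not a measurement from LLM Studio), here is why int4 quantization matters so much on a 10 GB card, counting only the weight footprint (activations, optimizer state, and CUDA overhead come on top):

```python
# Approximate weight memory for a 7B-parameter model at different precisions.
params = 7e9

int4_gb = params * 0.5 / 1e9   # int4 = 4 bits = 0.5 bytes per parameter
fp16_gb = params * 2.0 / 1e9   # fp16/bf16 = 2 bytes per parameter

print(f"int4 weights: ~{int4_gb:.1f} GB")   # ~3.5 GB -> fits in 10 GB with headroom
print(f"fp16 weights: ~{fp16_gb:.1f} GB")   # ~14 GB  -> already exceeds 10 GB
```

So in fp16 the weights alone overflow a 10 GB GPU, while int4 leaves room for LoRA adapters, activations, and gradients, which is why the settings above are required.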
Thank you for your response. The model trained successfully; it is only at inference time, when I open the 'Chat' tab, that it fails. I see this error on Ubuntu: torch.cuda.OutOfMemoryError. On the 'Config' tab it shows Lora = True, Gradient Checkpointing = True, Batch Size = 2, and Backbone Dtype = int4.
Maybe I need to change the batch size to 1?
I'm running dual Nvidia GeForce RTX 3080s.
I am trying to reproduce what could be different in chat compared to fine-tuning. Both use the same logic, and locally I see memory peaking at around 15 GB in both cases when loading the checkpoints.
The batch_size is not used during inference/chat, so it seems the model exceeds the available VRAM while loading the checkpoint.
There is a small memory leak after training, so if memory is already very tight, it could OOM.
Did you try restarting the whole application and then navigating to the Chat tab?
Yes, I did restart the application and restarted the machine just to be sure.
Maybe we can initialize the model with empty weights before loading the checkpoint; I haven't tested that workflow yet.
https://huggingface.co/docs/accelerate/v0.11.0/en/big_modeling
Related Issues (20)
- [FEATURE] Option for not saving checkpoint
- [FEATURE] Use local LLM deployment as Judge
- Compare Zero-Epoch Prediction with Fine-Tuned Prediction as well as Validation Score Comparison
- [FEATURE] Option to plot train/eval plots with epoch instead of step on x-axis
- Data Format section has a broken link
- [BUG] HuggingFace export does not preserve bfloat16 weights but converts to float16 silently when using CPU for upload
- [FEATURE] add support for multigpu, splitting model across gpus without using deepspeed/fsdp
- [CODE IMPROVEMENT] Custom HF model for classification
- [BUG] Code rendering in the validation prediction insights replaces characters
- Can't access meta-llama/Llama-2-7b using llmstudio cli tool
- [FEATURE] Freezing layers
- [UX] Screen hangs when you click Download Model
- [FEATURE] Implement SimPO
- [FEATURE] Select multiple training dataframes
- [DOCS] Duplicate Questions in the FAQ's
- [FEATURE] Connection with LLM DataStudio
- [BUG] Memory allocation left resident in GPU(s) after model upload to HuggingFace
- [BUG] Default model is added twice to the LLM Backbone list
- [BUG] Deepseek tokenizer error on main (worked before)
- [FEATURE] Add (experimental) FP8 support