Comments (1)
Hi @JuiceLemonLemon, thanks for opening this issue!
Without knowing which GPUs you're running on, it's hard to say what a reasonable level of CPU offloading would be. Have you inspected memory usage with tools like `nvidia-smi` and `top` to check that the model is loading as expected?
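If it helps, here's a rough sketch of how you could log GPU memory from Python while the model loads. It only wraps `nvidia-smi`'s CSV query mode; the function names `parse_gpu_memory` and `gpu_memory` are made up for illustration and aren't part of any library, and it assumes the driver reports numeric memory values for every GPU.

```python
import shutil
import subprocess

# nvidia-smi query mode: one CSV row per GPU, values in MiB, no header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV rows into (index, used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        rows.append((int(index), int(used), int(total)))
    return rows


def gpu_memory():
    """Return per-GPU memory usage, or an empty list if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling `gpu_memory()` before and after the model is loaded should show whether the weights actually land on the GPUs or are being offloaded to CPU, in which case host memory in `top` would grow instead.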
As the command comes from the https://github.com/tatsu-lab/stanford_alpaca repo, I'd suggest opening an issue there, as they'll have more knowledge and experience with the expected behaviour and possible gotchas.
from transformers.