Comments (6)
Hi @goelayu, this is expected, since with torch.device('meta') also puts the buffers on the meta device. However, non-persistent buffers are not saved in the state_dict. So, in the case of a Llama model, which does have non-persistent buffers, you get an error after loading the weights, because those buffers are still on the meta device. With init_empty_weights, by default, we don't put the buffers on the meta device. This is why it works. Hope this is clearer!
from transformers.
cc @muellerzr for the accelerate related stuff rather than Sylvain!
To add to the above, if I use init_empty_weights from accelerate, I can skip the initialization without any errors.
What is the difference between the two? Also, is it possible to achieve the same using the torch.device('meta') context manager?
Mmmm, could you make sure that the map_location is correct?
This might be expected, cc @SunMarc WDYT?
So this issue seems to be documented in the code itself (big_modeling.py): it turns out you can't run model.to when using the meta device. I was hoping for some explanation of why that is the case (hence tagging @sgugger, since the big_modeling.py file seems to be often modified by them).
Also, as noted in my comment above, replacing torch.device('meta') with init_empty_weights from the accelerate package seems to resolve the issue.
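For anyone wanting to reproduce the model.to limitation in isolation, here is a small sketch with plain PyTorch. The single nn.Linear layer is an assumed stand-in for the real model. Meta tensors carry shape and dtype but no data, so there is nothing for .to to copy.

```python
import torch
import torch.nn as nn

# Parameters created here have shapes and dtypes but no storage.
with torch.device("meta"):
    layer = nn.Linear(4, 4)

try:
    layer.to("cpu")
    moved = True
except NotImplementedError as err:
    # PyTorch refuses to copy out of a meta tensor.
    moved = False
    print("model.to failed:", err)

# to_empty allocates real (uninitialized) storage on the target device.
layer = layer.to_empty(device="cpu")
print(layer.weight.device)
```

Note that to_empty leaves the values uninitialized, so the weights still have to be loaded afterwards, and any non-persistent buffers have to be recomputed.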
@SunMarc thanks for the response, that answers my question.