Comments (2)
I think we need to add a codepath in Trainer to enable gradient checkpointing for torch_xla using this API from transformers.
I think we can close this issue as discussed here: pytorch/xla#6611 (comment)
Native gradient checkpointing won't work in all cases: the XLA compiler needs some special treatment to make gradient checkpointing work. That treatment is implemented here: https://github.com/pytorch/xla/blob/master/torch_xla/utils/checkpoint.py
However, I do think we should be able to integrate our gradient checkpointing into HF so that users don't need to specify xla_fsdp_grad_ckpt in the FSDP config anymore.
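Today that flag is set roughly like this in the Trainer's FSDP configuration (a sketch of the current workaround the comment above hopes to retire; the key names follow the `fsdp_config` convention in transformers):

```python
# The fsdp_config dict passed to transformers.TrainingArguments today;
# the proposal above would make xla_fsdp_grad_ckpt unnecessary.
fsdp_config = {
    "xla": True,                 # route FSDP through torch_xla
    "xla_fsdp_grad_ckpt": True,  # wrap each sharded layer with the XLA checkpoint fn
}
```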
Related Issues (20)
- Cannot convert llama 3 model to hf
- error when using PPO in Gemma
- Llama3 models causing `TypeError: not a string` error in LlamaTokenizer
- Some functional problems in the implementation of Speculative Decoding
- Error During Training with PatchTSMixerForTimeSeriesClassification for Time Series Classification
- Whisper assistant decoding not working with pipeline
- TypeError: WhisperForConditionalGeneration.forward() got an unexpected keyword argument 'model'
- FutureWarning about resume_download is raised after huggingface-hub 0.23.0 release
- Remove pipelines, chatformatters, templates etc --> Replace with simple generator function / manual string interpolation ---> Just have one standardized way for building datasets and running inference
- HTML Files Keep on Loading
- Wav2Vec2ForCTC weight mismatch
- More memory consumption than litgpt
- Setting compute_metrics in Trainer with Idefics2ForConditionalGeneration leads to AttributeError: 'DynamicCache' object has no attribute 'detach'
- DPT implementation contains unused parameters
- Urdu Encoding Issue in Hugging Face Tokenizer
- Add Prismatic VLMs to Transformers
- Error converting from PyTorch to HuggingFace - Mistral / Mixtral
- model_max_length default parameters are missing in transformers>=4.40.0
- (Have PR) Speed up `BeamScorer` to make GPT-2 generation 2-3x faster