Hi @zewenli98, thanks for opening this issue!
I don't think this is a bug in either PyTorch or Hugging Face. When you call torch.export, it traces the model/function/callable to produce a graph that can then be compiled or serialized. Not all code is traceable: in particular, variable tensor shapes, unknown or changing input/output types, and data-dependent control flow cannot be captured. In the error, we can see that tracing breaks at this line, which involves both a variable tensor size and a branch on tensor values.
Since the modeling code lives on the Hub, only the repo's authors can update it. If you'd like the model to be exportable, I suggest opening a discussion on the checkpoint's Community tab requesting this change.
from transformers.