Comments (4)
I think it may be related to #31679
from transformers.
Hi @MARD1NO, thanks for opening a PR!
So that we can best help you, could you:
- Share the full running env: run `transformers-cli env` in the terminal and copy-paste the output
- Share a minimal code snippet to reproduce the error
It does look like the error is similar to the one in #31679. As the code in the description looks like it's custom, rather than from the transformers library, that code might need to be updated to handle this
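If this is the same Cache-refactor incompatibility that #31679 covers (an assumption; on newer transformers versions, `generate` passes `past_key_values` as a `Cache` object rather than the legacy tuple of per-layer key/value pairs), custom remote modeling code can often be bridged with a small duck-typed shim. `to_legacy_cache()` is a real method on transformers `Cache` classes such as `DynamicCache`, but the helper name `normalize_past` here is hypothetical, a sketch rather than the actual fix:

```python
def normalize_past(past_key_values):
    """Return past_key_values in the legacy tuple-of-(key, value) format.

    Recent transformers releases hand custom forward() implementations a
    Cache object; older remote code that indexes past_key_values[layer_idx]
    as a tuple then fails. Duck-typing on to_legacy_cache() keeps the code
    working on both old and new versions.
    """
    if past_key_values is None:
        return None
    if hasattr(past_key_values, "to_legacy_cache"):
        # New-style Cache object (e.g. DynamicCache): convert to legacy tuples
        return past_key_values.to_legacy_cache()
    # Already in the legacy tuple format: pass through unchanged
    return past_key_values
```

Inside a custom `modeling_chatglm.py` forward pass, calling `past_key_values = normalize_past(past_key_values)` before any tuple indexing would be the one-line change; whether that is the actual root cause here would need to be confirmed against the traceback.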
Hi @amyeroberts, thanks for your quick reply :D
env is:
- `transformers` version: 4.42.1
- Platform: Linux-5.4.0-176-generic-x86_64-with-glibc2.31
- Python version: 3.11.5
- Huggingface_hub version: 0.23.4
- Safetensors version: 0.4.1
- Accelerate version: 0.25.0
- Accelerate config: not found
- PyTorch version (GPU?): 2.1.2+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: <fill in>
- Using GPU in script?: <fill in>
- GPU type: NVIDIA GeForce RTX 3090
The minimal code snippet is batched generation with chatglm3, like this:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", padding_side="left", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("THUDM/chatglm3-6b", device_map="auto", trust_remote_code=True)
model = model.eval()
prompts = ["hello, how are you?", "Who are you?"]
inputs = tokenizer(prompts, padding=True, return_tensors='pt')
inputs = inputs.to(model.device)
pred = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False,
    repetition_penalty=1.0,
)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
This runs successfully on transformers==4.40.1, so I think there is a bug somewhere.
Hi @MARD1NO, thanks for sharing!
As the modeling code is defined in https://huggingface.co/THUDM/chatglm3-6b/blob/main/modeling_chatglm.py, I'd suggest opening a discussion on the THUDM/chatglm3-6b repo to report this error.