Comments (12)
On Win 10, CUDA 12.1, Python 3.10, and two RTX 3090 GPUs, I also ran into KeyError: '<|endoftext|>'. The problem seems to come from AutoTokenizer.from_pretrained(...).
There are no clear recommendations for a baseline environment, unfortunately.
I just fixed it. The required transformers version appears to be 4.40.0, as listed in requirements.txt in basic_demo. Once I pinned that version, it worked. Hopefully this helps you as well.
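The fix above can be sketched as a small pre-flight check. This is a hypothetical helper, not part of the repo; it only assumes the 4.40.0 pin that the comment reports from basic_demo/requirements.txt:

```python
# Hypothetical pre-flight check: verify the installed transformers version
# matches the 4.40.0 pin from basic_demo/requirements.txt before loading
# the tokenizer; a version mismatch is a plausible cause of the
# KeyError: '<|endoftext|>' reported above.
from importlib.metadata import PackageNotFoundError, version

REQUIRED = "4.40.0"

def parse_version(v: str) -> tuple:
    """Turn '4.40.0' into (4, 40, 0) for a simple exact comparison."""
    return tuple(int(part) for part in v.split(".")[:3])

def transformers_matches_pin(required: str = REQUIRED) -> bool:
    """True iff the installed transformers version equals the pin."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return False
    return parse_version(installed) == parse_version(required)

if __name__ == "__main__":
    if not transformers_matches_pin():
        print(f"Version mismatch; try: pip install transformers=={REQUIRED}")
```

Run it once before launching the demo; if it prints the pip command, reinstall and retry.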
from glm-4.
Same situation here: the tokenizer fails to load with this error.
Upgrading transformers to 4.40.0 did fix it.
Please install the dependencies strictly according to the requirements file. On Windows, vLLM cannot be installed; use the transformers backend instead.
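That advice can be expressed as a tiny guard. This is illustrative only; the backend names are placeholders for whatever your launch script expects, not a real flag in the repo:

```python
# Illustrative backend selector: vLLM does not run on Windows, so fall back
# to the transformers backend there, as the comment above advises.
import platform

def pick_backend() -> str:
    """Return a placeholder backend name based on the host OS."""
    return "transformers" if platform.system() == "Windows" else "vllm"

print(pick_backend())
```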
I hit the same error when running the multimodal demo code from the Hugging Face model page:
return self.mergeable_ranks[token]
KeyError: '<|endoftext|>'
Would you please share your full environment details (CUDA, PyTorch, etc.)? I had other issues too.
Thanks!
Sure, my environment is:
Win 10,
CUDA 12.1,
Python 3.10,
GPU: 2× RTX 3090,
transformers==4.40.0,
torch==2.1.0
That's all for inference. By the way, I noticed the README in basic_demo says "GPUs above A100, V100, 20 and older GPU architectures are not supported". I hope this helps.
Thanks! One correction: it should be an Ampere GPU, not a Turing GPU.
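A minimal sketch of that architecture requirement, assuming the usual CUDA compute-capability numbers (Ampere is 8.x; Turing, i.e. the 20-series, is 7.5):

```python
# Hypothetical capability gate: bfloat16 inference needs Ampere or newer
# (CUDA compute capability >= 8.0); Turing 20-series cards report 7.5.
def is_supported_arch(major: int, minor: int) -> bool:
    """True iff the (major, minor) compute capability is Ampere or newer."""
    return (major, minor) >= (8, 0)

# With PyTorch you would feed it the device's capability, e.g.:
#   is_supported_arch(*torch.cuda.get_device_capability())
```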
Thank you very much.
My environment was inherited from GLM-3, and the global CUDA version is different as well.
I still hope the team will publish minimum requirements, or a known-compatible configuration, so that one local machine can run both GLM3-6B and GLM4-9B.
Looking forward to continued improvements, and awaiting your updates.
You could try our trans_cli_demo. The environment still needs to be reinstalled; the current dependencies no longer install vLLM by default. Note, however, that with the transformers backend the usable context is very short: around 8K tokens already hits the 24 GB VRAM ceiling of consumer cards.
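A back-of-envelope way to see why long contexts hit the 24 GB ceiling. Every architecture number below is a hypothetical stand-in chosen for illustration, not the actual GLM-4 configuration:

```python
# Rough VRAM estimate for decoder inference: fp16 weights plus the KV cache.
# All model-shape numbers used here are hypothetical placeholders.
def weights_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """GiB needed to hold the weights (fp16 by default)."""
    return n_params * bytes_per_param / 2**30

def kv_cache_gib(seq_len: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_value: int = 2) -> float:
    """GiB for the KV cache: K and V tensors per layer, one slot per token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * seq_len / 2**30

# A 9B-parameter model in fp16 already needs ~16.8 GiB for weights alone,
# so a 24 GiB card has only a few GiB left for KV cache and activations.
total = weights_gib(9e9) + kv_cache_gib(8192, n_layers=40, n_kv_heads=32, head_dim=128)
```

Under these assumed shapes the 8K-token KV cache adds about 5 GiB on top of the weights, which is roughly consistent with the ceiling described above.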
May I ask which motherboard and CPU you use? Can both GPUs run at maximum performance, given the CPU's PCIe lane limits with two 3090s? Thanks!