THUDM / ChatGLM3
ChatGLM3 series: Open Bilingual Chat LLMs | 开源双语对话语言模型
License: Apache License 2.0
As the title says.
Does ChatGLM3-6B-32K show any performance loss on non-long-text benchmarks relative to ChatGLM3-6B? Can you share how large that loss is, specifically?
After deployment the model handled several hundred calls without problems, but the next call raised this error:
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 428, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in call
return await self.app(scope, receive, send)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/fastapi/applications.py", line 276, in call
await super().call(scope, receive, send)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/starlette/applications.py", line 122, in call
await self.middleware_stack(scope, receive, send)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in call
raise exc
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in call
await self.app(scope, receive, _send)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 79, in call
raise exc
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 68, in call
await self.app(scope, receive, sender)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in call
raise e
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in call
await self.app(scope, receive, send)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/starlette/routing.py", line 718, in call
await route.handle(scope, receive, send)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/starlette/routing.py", line 276, in handle
await self.app(scope, receive, send)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/starlette/routing.py", line 66, in app
response = await func(request)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/fastapi/routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/fastapi/routing.py", line 163, in run_endpoint_function
return await dependant.call(**values)
File "get_api_cuda1.py", line 66, in create_item
response, history = model.chat(tokenizer,
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1032, in chat
inputs = inputs.to(self.device)
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 758, in to
self.data = {k: v.to(device=device) for k, v in self.data.items()}
File "/usr/local/anaconda3/envs/chatglm/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 758, in
self.data = {k: v.to(device=device) for k, v in self.data.items()}
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
I want to deploy this on a platform such as vLLM. How can the tool definitions be fed to the chat model through the prompt?
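A minimal sketch of one way to do this, assuming the ChatGLM3 chat format from this repo's prompt documentation (tool definitions serialized into the `<|system|>` turn); the get_weather tool and the exact system wording here are illustrative assumptions, not an official contract:

```python
import json

# Hypothetical tool definition in the style of the repo's tool demo.
tools = [{
    "name": "get_weather",
    "description": "Query the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

system_content = (
    "Answer the following questions as best as you can. "
    "You have access to the following tools:\n"
    + json.dumps(tools, indent=4, ensure_ascii=False)
)

# Flatten the conversation into the raw ChatGLM3 chat format so it can
# be sent as plain text to a completion endpoint such as vLLM's.
prompt = (
    f"<|system|>\n{system_content}\n"
    f"<|user|>\nWhat's the weather in Paris?\n"
    f"<|assistant|>"
)
```

The serving layer would then parse the model's tool-call turn and append an `<|observation|>` turn containing the tool result before asking for the next completion.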
Will the 1.5B and 3B model weights be released?
The port is open and whitelisted, but the browser still cannot reach the service.
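One common cause is the demo server binding only to 127.0.0.1; a hedged sketch of binding to all interfaces instead (the app, port, and flags are illustrative):

```python
import uvicorn
from fastapi import FastAPI

app = FastAPI()  # placeholder app standing in for the repo's API demo

# Bind to 0.0.0.0 instead of the default 127.0.0.1 so remote browsers
# can reach the server at all (firewall rules permitting).
uvicorn.run(app, host="0.0.0.0", port=8000)

# Gradio web demo equivalent:  demo.launch(server_name="0.0.0.0")
# Streamlit demo equivalent:   streamlit run main.py --server.address 0.0.0.0
```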
How can online and offline quantization be supported?
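For reference, a minimal sketch of both routes, assuming the `quantize()` helper exposed by the model's remote code (the same pattern as earlier ChatGLM releases); whether quantized weights round-trip through `save_pretrained` is an assumption to verify:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)

# "Online" quantization: load the fp16 weights, quantize in memory.
model = (
    AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
    .quantize(4)   # 4-bit; quantize() comes from the model's remote code
    .cuda()
    .eval()
)

# "Offline" quantization (assumption): persist the quantized weights once
# so later startups can load them directly and skip the quantize step.
model.save_pretrained("./chatglm3-6b-int4")
```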
Installed the dependencies: pip install -r requirements.txt
Then ran: streamlit run main.py
It then failed with the following error:
2023-10-29 00:31:32.581 Uncaught app exception
Traceback (most recent call last):
File "/home/xxx/.local/share/virtualenvs/ChatGLM3-uTamXjui/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "/home/xxx/code/github/ChatGLM3/composite_demo/main.py", line 11, in <module>
import demo_chat, demo_ci, demo_tool
File "/home/xxx/code/github/ChatGLM3/composite_demo/demo_ci.py", line 9, in <module>
import jupyter_client
ModuleNotFoundError: No module named 'jupyter_client'
After pip install jupyter_client, everything works normally.
Is a visualGLM3 being considered?
The previous OpenAI API script presumably no longer works. Has anyone written a new one?
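Until an official script lands, here is a bare-bones sketch of an OpenAI-style wrapper around `model.chat()`; the endpoint shape is trimmed to a minimum and is an assumption, not the repo's API:

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModel, AutoTokenizer

app = FastAPI()
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).cuda().eval()

class ChatRequest(BaseModel):
    messages: list[dict]        # [{"role": "user", "content": "..."}, ...]
    temperature: float = 0.8

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    # ChatGLM3's chat() takes the current query plus a history of
    # role/content dicts, so split the last message off as the query.
    *history, last = req.messages
    response, _ = model.chat(
        tokenizer, last["content"], history=history, temperature=req.temperature
    )
    return {
        "object": "chat.completion",
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": response},
            "finish_reason": "stop",
        }],
    }
```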
I couldn't find the concrete generate-method code, so I started by analyzing prepare_inputs_for_generation.
As in the screenshot above, Llama's prepare_inputs_for_generation supports embedding input, but ChatGLM's does not.
Does that mean ChatGLM's generate method does not support embedding input? (See the sketch below for the Llama-style behavior.)
Apologies if I have misunderstood.
@xunkai55 @davidlvxin @duzx16
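For comparison, a sketch of the Llama-style path being described, where `generate()` accepts precomputed embeddings; whether ChatGLM3's remote code accepts `inputs_embeds` is exactly the open question here, so this only illustrates the behavior being asked about:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

ids = tok("Hello", return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(ids)        # (1, seq_len, hidden)

# Llama's prepare_inputs_for_generation routes inputs_embeds through
# the first decoding step instead of input_ids.
out = model.generate(inputs_embeds=embeds, max_new_tokens=8)
```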
Can it run on Python 3.8?
After updating to ChatGLM3, inference feels slower.
How do I download chatglm3-6b-32k?
I noticed the prompt documentation shows how to produce an image, as below. Is 【image】 just an arbitrary placeholder?
...
plt.axis('equal')
plt.axis('off')
plt.show()
<|observation|>
```result
【image】
```
<|assistant|>
This is a heart shape. I described the shape with a parametric equation and plotted it with matplotlib. If you have any other needs or questions, feel free to ask.
<|user|> # End
Many thanks for open-sourcing this; great work.
ChatGLM3's tokenizer does not allow injection of special tokens such as <|user|>. How should template-aligned data be constructed for fine-tuning? Concretely, encode cannot map special tokens like <|user|> to their corresponding ids; it just treats them as ordinary text. For domain fine-tuning, how should the data be built and processed so it stays consistent with the template? (A workaround sketch follows below.)
FYI: Qwen also shipped with injection protection at first; after strong community feedback, it was later relaxed (see Qwen's handling).
Also, the tokenizer class name (ChatGLMTokenizer) is identical to ChatGLM2's tokenizer class, yet the details are completely different, which may trip up downstream repos during adaptation. Would you consider giving ChatGLM3's tokenizer a new class name?
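One workaround that fits how the shipped tokenizer is built: assemble the token ids yourself via its `get_command` helper (which is what the model's own `build_chat_input` uses to resolve `<|user|>` and friends), rather than expecting `encode` to inject them. A sketch, assuming your fine-tuning pipeline accepts raw input_ids:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)

def build_turn(role, text):
    # get_command maps special tokens such as <|user|> to their ids;
    # plain encode would treat the literal string as ordinary text.
    return (
        [tokenizer.get_command(f"<|{role}|>")]
        + tokenizer.encode(f"\n{text}", add_special_tokens=False)
    )

input_ids = build_turn("user", "Hello") + [tokenizer.get_command("<|assistant|>")]
```

For whole conversations, `tokenizer.build_chat_input(query, history=..., role=...)` should produce template-consistent ids directly.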
How do I set a personalized system role so the model plays that role by default in subsequent conversations, without the user having to ask for it each time, similar to ChatGPT's Customize feature? Where in the code should this be changed?
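A minimal sketch: `model.chat()` takes history as a list of role/content dicts, so a persistent persona can be seeded as a `system` entry before the first user turn (the persona text is illustrative):

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).cuda().eval()

# Seed the conversation with a system turn; every later chat() call is
# given the accumulated history, so the persona persists by itself.
history = [{"role": "system", "content": "You are a patient math tutor."}]
response, history = model.chat(tokenizer, "What is a derivative?", history=history)
```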
Newbie here, asking the experts: the docs say torch 2.0 or above is needed for best inference performance. Is that purely about speed, or could it also affect the quality of the model's outputs?
Thanks, everyone!
Sample code is needed so we can use and call ChatGLM3 more easily (a minimal sketch follows below):
I care about this question too, thanks.
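A minimal usage sketch along the lines of the README quick start:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).cuda().eval()

response, history = model.chat(tokenizer, "Hello", history=[])
print(response)

# Pass the returned history back in to continue the conversation.
response, history = model.chat(tokenizer, "What can you do?", history=history)
print(response)
```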
As the title says.
I see that function calling uses the OpenAI API format, which may not be very friendly for some plain POST requests. Is there any solution to this?
Will directly downloadable int4-quantized weights be provided for ChatGLM3-6B?
File "G:\ai\ChatGLM\ChatGLM3\web_demo2.py", line 23, in
tokenizer, model = get_model()
File "G:\ai\ChatGLM\ChatGLM3\venv\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 212, in wrapper
return cached_func(*args, **kwargs)
File "G:\ai\ChatGLM\ChatGLM3\venv\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 241, in call
return self._get_or_create_cached_value(args, kwargs)
File "G:\ai\ChatGLM\ChatGLM3\venv\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 267, in _get_or_create_cached_value
return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
File "G:\ai\ChatGLM\ChatGLM3\venv\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 321, in _handle_cache_miss
computed_value = self.info.func(*func_args, **func_kwargs)
File "G:\ai\ChatGLM\ChatGLM3\web_demo2.py", line 14, in get_model
tokenizer = AutoTokenizer.from_pretrained("./models/chatglm3-6b", trust_remote_code=True)
File "G:\ai\ChatGLM\ChatGLM3\venv\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 676, in from_pretrained
tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
File "G:\ai\ChatGLM\ChatGLM3\venv\lib\site-packages\transformers\dynamic_module_utils.py", line 443, in get_class_from_dynamic_module
return get_class_in_module(class_name, final_module.replace(".py", ""))
File "G:\ai\ChatGLM\ChatGLM3\venv\lib\site-packages\transformers\dynamic_module_utils.py", line 164, in get_class_in_module
module = importlib.import_module(module_path)
File "D:\Program Files\python\lib\importlib_init.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1050, in _gcd_import
File "", line 1027, in _find_and_load
File "", line 992, in _find_and_load_unlocked
File "", line 241, in _call_with_frames_removed
File "", line 1050, in _gcd_import
File "", line 1027, in _find_and_load
File "", line 992, in _find_and_load_unlocked
File "", line 241, in _call_with_frames_removed
File "", line 1050, in _gcd_import
File "", line 1027, in _find_and_load
File "", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'transformers_modules.'
But transformers is in fact installed, version 4.30.2.
💬 Chat
🛠️ Tool
🧑‍💻 Code Interpreter
Tools
Query the weather in Paris
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
Traceback:
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "/root/ChatGLM3-main/composite_demo/main.py", line 52, in <module>
demo_tool.main(top_p, temperature, prompt_text)
File "/root/ChatGLM3-main/composite_demo/demo_tool.py", line 111, in main
for response in client.generate_stream(
File "/root/ChatGLM3-main/composite_demo/client.py", line 119, in generate_stream
for new_text, _ in stream_chat(self.model,
File "/root/ChatGLM3-main/composite_demo/client.py", line 69, in stream_chat
for outputs in self.stream_generate(**inputs, past_key_values=past_key_values,
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "/root/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 1156, in stream_generate
outputs = self(
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 937, in forward
transformer_outputs = self.transformer(
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 830, in forward
hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 640, in forward
layer_ret = layer(
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 544, in forward
attention_output, kv_cache = self.self_attention(
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/modeling_chatglm.py", line 376, in forward
mixed_x_layer = self.query_key_value(hidden_states)
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/chatglm3-demo/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
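For context, "addmm_impl_cpu_" not implemented for 'Half' usually means fp16 weights are being executed on the CPU, where half-precision matmuls are unavailable; a hedged sketch of the two usual fixes:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)

# Either keep half precision but run on the GPU...
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).cuda().eval()

# ...or, on CPU-only machines, cast the weights to float32:
# model = AutoModel.from_pretrained(
#     "THUDM/chatglm3-6b", trust_remote_code=True
# ).float().eval()
```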
When will OpenAI-compatible code be developed?
A sentence in my prompt reads 有一年被评为A ("one year it was rated A"); when glm3 outputs that sentence verbatim, it becomes 有一量被评为A (年 "year" replaced by 量 "measure").
In both web_demo.py and web_demo2.py, the paths to the ChatGLM3 model point to a local path, "/mnt/vepfs/workspace/zxdu/chatglm3-6b", e.g.
tokenizer = AutoTokenizer.from_pretrained("/mnt/vepfs/workspace/zxdu/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("/mnt/vepfs/workspace/zxdu/chatglm3-6b", trust_remote_code=True).cuda()
They should probably be corrected to the Hugging Face Hub format, i.e. "THUDM/chatglm3-6b".
I found that chatglm3-6b can produce JSON, but chatglm3-6b-32k cannot.
When will the API sample code be released?
Thanks for releasing such a strong model. How can the benchmark results in the README be reproduced? I tested with greedy decoding, using setups similar to the Hugging Face leaderboard or to the original papers/repos. After some debugging I can reproduce most of the scores reported in the Llama 2 paper. But for ChatGLM3, apart from AGIEval, which comes close, most scores are off by 10-20 points or more; in particular GSM8K exact match is only 47.8, and even a "contains" check only reaches 51. Could you share how to reproduce the benchmark results? Thanks.
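For what it's worth, a sketch of the greedy-decoding setup described above; the prompt template and the answer-extraction step are the usual sources of large score gaps, and both are assumptions here:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).cuda().eval()

prompt = "Question: ...\nAnswer:"   # illustrative eval template, not the official one
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, do_sample=False, max_new_tokens=256)  # greedy
completion = tokenizer.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
```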
How can it be deployed on mobile devices?
=== History:
[Conversation(role=<Role.USER: 2>, content='1', tool=None, image=None)]
2023-10-28 23:14:33.424 Uncaught app exception
Traceback (most recent call last):
File "E:\ChatGLM3\venv\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "E:\ChatGLM3\composite_demo\main.py", line 50, in <module>
demo_chat.main(top_p, temperature, system_prompt, prompt_text)
File "E:\ChatGLM3\composite_demo\demo_chat.py", line 50, in main
for response in client.generate_stream(
File "E:\ChatGLM3\composite_demo\client.py", line 119, in generate_stream
for new_text, _ in stream_chat(self.model,
File "E:\ChatGLM3\composite_demo\client.py", line 69, in stream_chat
for outputs in self.stream_generate(**inputs, past_key_values=past_key_values,
File "E:\ChatGLM3\venv\lib\site-packages\torch\utils_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "C:\Users\zhuya/.cache\huggingface\modules\transformers_modules\THUDM\chatglm3-6b\fc3235f807ef5527af598c05f04f2ffd17f48bab\modeling_chatglm.py", line 1156, in stream_generate
outputs = self(
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\zhuya/.cache\huggingface\modules\transformers_modules\THUDM\chatglm3-6b\fc3235f807ef5527af598c05f04f2ffd17f48bab\modeling_chatglm.py", line 937, in forward
transformer_outputs = self.transformer(
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\zhuya/.cache\huggingface\modules\transformers_modules\THUDM\chatglm3-6b\fc3235f807ef5527af598c05f04f2ffd17f48bab\modeling_chatglm.py", line 830, in forward
hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\zhuya/.cache\huggingface\modules\transformers_modules\THUDM\chatglm3-6b\fc3235f807ef5527af598c05f04f2ffd17f48bab\modeling_chatglm.py", line 640, in forward
layer_ret = layer(
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\zhuya/.cache\huggingface\modules\transformers_modules\THUDM\chatglm3-6b\fc3235f807ef5527af598c05f04f2ffd17f48bab\modeling_chatglm.py", line 544, in forward
attention_output, kv_cache = self.self_attention(
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\zhuya/.cache\huggingface\modules\transformers_modules\THUDM\chatglm3-6b\fc3235f807ef5527af598c05f04f2ffd17f48bab\modeling_chatglm.py", line 376, in forward
mixed_x_layer = self.query_key_value(hidden_states)
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\module.py", line 1527, in call_impl
return forward_call(*args, **kwargs)
File "E:\ChatGLM3\venv\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: "addmm_impl_cpu" not implemented for 'Half'
I didn't see it in the demo.
Hello, I noticed the prompt format changed quite a bit in this release. For long texts (e.g. document QA), is the prompt built the same way as for ordinary QA? Are there any examples?
What is an appropriate prompt construction for long-document QA?
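A minimal sketch of one common construction, placing the whole document and the question in a single user turn, under the assumption that the 32K model needs nothing beyond the standard chat format for long inputs:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True).cuda().eval()

with open("report.txt", encoding="utf-8") as f:
    document = f.read()

# One user turn: instructions, then the document, then the question.
query = (
    "Answer the question based on the document below.\n\n"
    f"Document:\n{document}\n\n"
    "Question: What are the document's main conclusions?"
)
response, _ = model.chat(tokenizer, query, history=[])
```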
{'role': 'assistant', 'metadata': '', 'content': '你好世界'}
Please add official support for ChatGPT-Next-Web, much appreciated!