x-d-lab / langchain-chatglm-webui

Automated question answering over local knowledge bases, built on LangChain and the ChatGLM-6B family of LLMs

License: Apache License 2.0

Python 99.49% Dockerfile 0.51%
belle bilibili chatglm-6b chatglm-webui jina langchain langchain-serve llama llm minimax modelscope

langchain-chatglm-webui's People

Contributors

123456adwae2, aliscacl, barryyin, d-mahony-x, damon-ldl, godlockin, notandor, online2311, thomas-yanxin


langchain-chatglm-webui's Issues

Error on the first run

It reports a missing module, but pip cannot install that module:

from duckduckgo_search.utils import Session
ModuleNotFoundError: No module named 'duckduckgo_search.utils'
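
A likely cause, offered as a guess rather than a confirmed diagnosis: newer duckduckgo_search releases removed the duckduckgo_search.utils module that app.py imports, so pinning an older release (for example pip install "duckduckgo_search<3") may restore it. Alternatively, a small shim can provide the old name; the requests.Session fallback below is an assumption about how app.py uses SESSION:

import requests

try:
    from duckduckgo_search.utils import SESSION  # old duckduckgo_search API
except ImportError:
    # assumption: app.py only needs a requests.Session-like object here
    SESSION = requests.Session()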

Problems on both the AI Studio and OpenI (启智) platforms

On AI Studio there are conflicts between dependencies, and following the notebook does not install them correctly.

On OpenI, using the Docker image with gradio downgraded, the embedding model cannot be loaded no matter how it is selected or where it is placed (I tried putting it in the dataset and adjusting the path in config.py accordingly, and also leaving the dataset empty so it would download by itself).
[screenshot]

The new version doesn't match the video tutorial

The whole pipeline completed and the app starts normally, but the terminal log only shows 127.0.0.1 with no public URL, and inference runs on the CPU. Which step did I get wrong?
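
A sketch of two things worth checking, not a confirmed fix: Gradio only prints a public URL when launch() is called with share=True (or when server_name exposes the port beyond localhost), and CPU inference usually just means torch cannot see a GPU. The demo object below is a placeholder standing in for the gr.Blocks built in app.py:

import gradio as gr
import torch

print(torch.cuda.is_available())  # False here would explain why inference runs on the CPU

with gr.Blocks() as demo:  # placeholder UI standing in for app.py's interface
    gr.Markdown("placeholder")

# share=True requests a public *.gradio.live link; server_name binds beyond 127.0.0.1
demo.launch(server_name="0.0.0.0", share=True)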

Keeps saying the model cannot be found

[screenshot]

Where exactly are the model files? I can't find them, even though the log at startup shows a file of almost 7 GB finishing its download. I'm using Hugging Face.
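
For what it's worth, transformers downloads Hugging Face weights into a cache directory rather than the repository folder, so the ~7 GB file most likely lives under ~/.cache/huggingface. A quick way to confirm (the paths are library defaults, not project configuration):

import os
from pathlib import Path

cache = Path(os.environ.get("HF_HOME", Path.home() / ".cache" / "huggingface"))
if cache.exists():
    for p in cache.rglob("*"):
        if p.is_file() and p.stat().st_size > 1e9:  # files larger than ~1 GB
            print(p, f"{p.stat().st_size / 1e9:.1f} GB")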

Various problems with model loading and knowledge-base upload

The code was pulled today on a dual-RTX-3090 machine. My network downloads from HF much faster than from OpenI (启智), which barely downloads at all, so I did not use the non-ChatGLM models the author provides:
1. This is the error (warning) after loading the vicuna-13b-1.1 downloaded from HF; the page reports that loading failed:

  • This IS expected if you are initializing LlamaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing LlamaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

2. The BELLE-LLaMA-13B-2M-enc downloaded from HF: the page reports loading failed, and the backend prints nothing.
Using my own fine-tuned vicuna-13b, loading succeeds but warns: You are probably using the old Vicuna-v0 model, which will generate unexpected results with the current fschat.
Inference then exhausts VRAM: torch.cuda.OutOfMemoryError: CUDA out of memory.

3. Selecting chatglm_6b again (the auto-download variant): loading and inference both succeed, but uploading a .doc file raises docx.opc.exceptions.PackageNotFoundError: Package not found at xxx.doc.
I then tried uploading a UTF-8 .txt, which raises UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position (see the sketch below the list).

4. Also, changing init_llm and init_embedding_model in config.py only seems to affect the UI defaults; the model is not actually loaded until it is selected manually in the interface.
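
On point 3, a guess at what is happening: python-docx can only open .docx files (zip-based OPC packages), so a legacy binary .doc raises PackageNotFoundError, and a text loader that falls back to the ascii/locale codec will fail on UTF-8 Chinese text. A workaround sketch; the helper name is illustrative, not project code, and .doc files would first be converted to .docx (e.g. in Word or LibreOffice):

def read_text(path: str) -> str:
    # try utf-8 first, then gb18030, which covers GBK-encoded Chinese files
    for enc in ("utf-8", "gb18030"):
        try:
            with open(path, encoding=enc) as f:
                return f.read()
        except UnicodeDecodeError:
            continue
    raise ValueError(f"could not decode {path} as utf-8 or gb18030")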

Thanks for this great project. When you have a moment, could you check whether I have misconfigured something? Thank you!

No response after running python3 app.py

Running python3 app.py produces no response at all; running python app.py instead reports:

Traceback (most recent call last):
File "E:\langchain-chatGLM-webui\LangChain-ChatGLM-Webui\app.py", line 8, in
from duckduckgo_search.utils import SESSION
ModuleNotFoundError: No module named 'duckduckgo_search.utils'

Then I tried running py app.py, which fails with:
Traceback (most recent call last):
File "E:\langchain-chatGLM-webui\LangChain-ChatGLM-Webui\app.py", line 4, in
import gradio as gr
ModuleNotFoundError: No module named 'gradio'
But I have already installed gradio, and importing it in an interactive python session succeeds. I don't know what's going on.
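
One possibility, stated as a guess: on Windows, python, python3 and py can resolve to different interpreters, so pip may have installed gradio into a different interpreter than the one running app.py. Printing the interpreter path from each command makes any mismatch visible:

import sys

# run this with `python`, `python3` and `py` in turn;
# if the printed paths differ, the environments differ too
print(sys.executable)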

One more note: while installing the dependencies from requirements.txt, detectron2 would not install, so I installed it following this guide:
https://zhuanlan.zhihu.com/p/425631249
and only then ran LangChain-ChatGLM-Webui. I'm not sure whether that is related.

Can safetensors models be loaded?

I am compressing a model with GPTQ-for-LLaMa:

CUDA_VISIBLE_DEVICES=0 python llama.py ${MODEL_DIR} c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors llama7b-4bit-128g.safetensors

Can the project load safetensors models?
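
Not an answer from the maintainers, but for reference: a checkpoint written with --save_safetensors can be read back with the safetensors package and handed to whatever quantized-weight loader is in use. A minimal sketch, assuming the file from the command above:

from safetensors.torch import load_file

state_dict = load_file("llama7b-4bit-128g.safetensors")
print(list(state_dict)[:5])  # inspect a few tensor names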

Cannot run on a machine without a GPU

Setting CPU quantization kernel threads to 20
Using quantization cache
Applying quantization to glm layers
Traceback (most recent call last):
File "/home/wz/.local/lib/python3.9/site-packages/gradio/routes.py", line 401, in run_predict
output = await app.get_blocks().process_api(
File "/home/wz/.local/lib/python3.9/site-packages/gradio/blocks.py", line 1302, in process_api
result = await self.call_function(
File "/home/wz/.local/lib/python3.9/site-packages/gradio/blocks.py", line 1025, in call_function
prediction = await anyio.to_thread.run_sync(
File "/home/wz/.local/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/wz/.local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/home/wz/.local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/home/wz/LangChain-ChatGLM-Webui/app.py", line 150, in predict
resp = get_knowledge_based_answer(
File "/home/wz/LangChain-ChatGLM-Webui/app.py", line 108, in get_knowledge_based_answer
chatLLM.load_model(model_name_or_path=llm_model_dict[large_language_model])
File "/home/wz/LangChain-ChatGLM-Webui/chatglm_llm.py", line 114, in load_model
self.model = (AutoModel.from_pretrained(
File "/home/wz/.local/lib/python3.9/site-packages/transformers/modeling_utils.py", line 1811, in to
return super().to(*args, **kwargs)
File "/home/wz/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1145, in to
return self._apply(convert)
File "/home/wz/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/wz/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/wz/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 820, in _apply
param_applied = fn(param)
File "/home/wz/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1143, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
File "/home/wz/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
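
The trace shows the model being moved to CUDA inside load_model, which cannot work without a GPU. For comparison, the upstream ChatGLM-6B documentation loads on CPU with .float() instead of .half().cuda(); the project's loading path would need something along these lines (a sketch, not the project's actual code):

import torch
from transformers import AutoModel, AutoTokenizer

name = "THUDM/chatglm-6b-int4"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
if torch.cuda.is_available():
    model = AutoModel.from_pretrained(name, trust_remote_code=True).half().cuda()
else:
    # CPU-only loading per the ChatGLM-6B README
    model = AutoModel.from_pretrained(name, trust_remote_code=True).float()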

Is multi-GPU supported?

A single card's VRAM is not enough; with multi-GPU support the model could be sharded across the memory of several GPUs.
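
One standard way to do this with transformers, offered as a sketch rather than a description of how this project dispatches models (device_map="auto" requires the accelerate package):

from transformers import AutoModel

# shards the layers across all visible GPUs
model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b", trust_remote_code=True, device_map="auto"
)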

'NoneType' object has no attribute 'write'

Traceback (most recent call last):
File "C:\GLM\LangChain-ChatGLM-Webui-master\app.py", line 15, in
from chatllm import ChatLLM
File "C:\GLM\LangChain-ChatGLM-Webui-master\chatllm.py", line 7, in
from fastchat.serve.inference import load_model as load_fastchat_model
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\serve\inference.py", line 9, in
from transformers import (
File "", line 1075, in _handle_fromlist
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1137, in getattr
value = getattr(module, name)
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1136, in getattr
module = self._get_module(self._class_to_module[name])
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\utils\import_utils.py", line 1148, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
'NoneType' object has no attribute 'write'

I hit this error right after installing the libraries, and I can't tell what went wrong.
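
A guess at the cause: something writes to sys.stdout during import, and sys.stdout is None in processes without console streams (for example when launched via pythonw or from some IDEs). A defensive guard placed before the imports in app.py would confirm or work around it:

import io
import sys

# give the process writable stdout/stderr when it has no console streams
if sys.stdout is None:
    sys.stdout = io.StringIO()
if sys.stderr is None:
    sys.stderr = io.StringIO()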

Support for multiple files and folders

Suggestion: support uploading either a single file or an entire folder.
It would also be good to have per-user knowledge bases: each user could own several knowledge bases, and each knowledge base could consist of multiple files or folders (see the sketch below).
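
A minimal sketch of the folder half of this request, using LangChain's DirectoryLoader; the function name and layout are illustrative, not project API:

from langchain.document_loaders import DirectoryLoader

def load_knowledge_base(folder: str):
    # loads every file found under the folder into LangChain documents
    return DirectoryLoader(folder).load()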

Installing detectron2 on Win10 keeps failing

Errors while installing the requirements; the key output follows:
...
Building wheels for collected packages: unstructured-inference, sentence-transformers, detectron2, langchain-serve, jina, docarray, fvcore, antlr4-python3-runtime, jcloud, promise, pycocotools, future, python-docx, python-pptx, olefile
Building wheel for unstructured-inference (setup.py) ... done
Created wheel for unstructured-inference: filename=unstructured_inference-0.4.4-py3-none-any.whl size=36816 sha256=fd37dd8b1c4723d1206d7c4757dbe285f02cf4119396d0c72f43c935a0ea3e1b
Stored in directory: e:\pipcache\wheels\7f\0c\91\360ebd8b96f0acd20be6cf329c372911a4c01c05c16a8846d3
Building wheel for sentence-transformers (setup.py) ... done
Created wheel for sentence-transformers: filename=sentence_transformers-2.2.2-py3-none-any.whl size=125960 sha256=98195ec8c6a418f5085171098d4fa5bd7b2fb0c3e06e0105138b21f09a6aaeca
Stored in directory: e:\pipcache\wheels\71\67\06\162a3760c40d74dd40bc855d527008d26341c2b0ecf3e8e11f
Building wheel for detectron2 (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [1042 lines of output]
running bdist_wheel
E:\minconda_extra_env_folder\langchain-chatGLM-webui\lib\site-packages\torch\utils\cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-39
creating build\lib.win-amd64-cpython-39\detectron2
copying detectron2\__init__.py -> build\lib.win-amd64-cpython-39\detectron2
creating build\lib.win-amd64-cpython-39\tools
copying tools\analyze_model.py -> build\lib.win-amd64-cpython-39\tools
copying tools\benchmark.py -> build\lib.win-amd64-cpython-39\tools
copying tools\convert-torchvision-to-d2.py -> build\lib.win-amd64-cpython-39\tools
copying tools\lazyconfig_train_net.py -> build\lib.win-amd64-cpython-39\tools
copying tools\lightning_train_net.py -> build\lib.win-amd64-cpython-39\tools
copying tools\plain_train_net.py -> build\lib.win-amd64-cpython-39\tools
copying tools\train_net.py -> build\lib.win-amd64-cpython-39\tools
copying tools\visualize_data.py -> build\lib.win-amd64-cpython-39\tools
copying tools\visualize_json_results.py -> build\lib.win-amd64-cpython-39\tools
copying tools\__init__.py -> build\lib.win-amd64-cpython-39\tools
...
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\bin\HostX86\x64\cl.exe" /c logo /O2 /W3 /GL /DNDEBUG /MD -DWITH_CUDA -IC:\Users\cc\AppData\Local\Temp\pip-install-2wjkw3vm\detectron2_1187d9a69c984854be83cbda608f18ff\detectron2\layers\csrc -IE:\minconda_extra_env_folder\langchain-chatGLM-webui\lib\site-packages\torch\include -IE:\minconda_extra_env_folder\langchain-chatGLM-webui\lib\site-packages\torch\include\torch\csrc\api\include -IE:\minconda_extra_env_folder\langchain-chatGLM-webui\lib\site-packages\torch\include\TH -IE:\minconda_extra_env_folder\langchain-chatGLM-webui\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\include" -IE:\minconda_extra_env_folder\langchain-chatGLM-webui\include -IE:\minconda_extra_env_folder\langchain-chatGLM-webui\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\cppwinrt" /EHsc /TpC:\Users\cc\AppData\Local\Temp\pip-install-2wjkw3vm\detectron2_1187d9a69c984854be83cbda608f18ff\detectron2\layers\csrc\ROIAlignRotated\ROIAlignRotated_cpu.cpp /Fobuild\temp.win-amd64-cpython-39\Release\Users\cc\AppData\Local\Temp\pip-install-2wjkw3vm\detectron2_1187d9a69c984854be83cbda608f18ff\detectron2\layers\csrc\ROIAlignRotated\ROIAlignRotated_cpu.obj /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0
ROIAlignRotated_cpu.cpp
E:\minconda_extra_env_folder\langchain-chatGLM-webui\lib\site-packages\torch\include\c10/macros/Macros.h(138): warning C4067: unexpected tokens following preprocessor directive - expected a newline
E:\minconda_extra_env_folder\langchain-chatGLM-webui\lib\site-packages\torch\include\c10/util/Optional.h(212): warning C4624: 'c10::constexpr_storage_t': destructor was implicitly defined as deleted
with
[
T=c10::SymInt
]
E:\minconda_extra_env_folder\langchain-chatGLM-webui\lib\site-packages\torch\include\c10/util/Optional.h(411): note: see reference to class template instantiation 'c10::constexpr_storage_t' being compiled
with
[
T=c10::SymInt
]
...

C:\Users\cc\AppData\Local\Temp\pip-install-2wjkw3vm\detectron2_1187d9a69c984854be83cbda608f18ff\detectron2\layers\csrc\ROIAlignRotated\ROIAlignRotated_cpu.cpp : fatal error C1083: Cannot open compiler generated file: '': Invalid argument
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\bin\HostX86\x64\cl.exe' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> detectron2

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

ROIAlignRotated_cpu.cpp : fatal error C1083: Cannot open compiler generated file

Running pip install -r requirements.txt

fails with:
\AppData\Local\Temp\pip-install-pzphnpyf\detectron2_342601ed2f124809b3a4ec0ad331962a\detectron2\layers\csrc\ROIAlignRotated\ROIAlignRotated_cpu.cpp : fatal error C1083: Cannot open compiler generated file: '': Invalid argument
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for detectron2
Running setup.py clean for detectron2
Failed to build detectron2
ERROR: Could not build wheels for detectron2, which is required to install pyproject.toml-based projects

Error: model was not reloaded successfully; please click 'reload model'

embedding_model_dict = {
    "ernie-base": "D:/AIL/workspace/LangChain-ChatGLM-Webui/models/ernie-3.0-base-zh",
    "simbert-base-chinese": "D:/AIL/workspace/LangChain-ChatGLM-Webui/models/simbert-base-chinese",
    "text2vec-base": "D:/AIL/workspace/LangChain-ChatGLM-Webui/models/text2vec-large-chinese"
}

llm_model_dict = {
    "ChatGLM-6B-int4": "D:/AIL/workspace/LangChain-ChatGLM-Webui/models/chatglm-6b-int4",
    "BELLE-LLaMA-7B-2M": "D:/AIL/workspace/LangChain-ChatGLM-Webui/models/BELLE-LLaMA-7B-2M"
}

These are the absolute paths configured in config.py.
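
A quick sanity check, as a sketch: confirm that every configured path exists and actually contains model files before suspecting the reload logic (this reuses the two dicts above):

from pathlib import Path

for name, path in {**embedding_model_dict, **llm_model_dict}.items():
    p = Path(path)
    files = sorted(f.name for f in p.iterdir()) if p.is_dir() else []
    print(name, "OK" if files else "MISSING", files[:3])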

Error when running locally

Traceback (most recent call last):
File "/Users/terry/Downloads/test/lib/python3.8/site-packages/gradio/routes.py", line 401, in run_predict
output = await app.get_blocks().process_api(
File "/Users/terry/Downloads/test/lib/python3.8/site-packages/gradio/blocks.py", line 1302, in process_api
result = await self.call_function(
File "/Users/terry/Downloads/test/lib/python3.8/site-packages/gradio/blocks.py", line 1025, in call_function
prediction = await anyio.to_thread.run_sync(
File "/Users/terry/Downloads/test/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/Users/terry/Downloads/test/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/Users/terry/Downloads/test/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "app.py", line 143, in predict
print(file_obj.name)
AttributeError: 'NoneType' object has no attribute 'name'
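
What the trace suggests, as a guess: predict() reached print(file_obj.name) without any file having been uploaded, so Gradio passed None for the file component. A guard of roughly this shape avoids the crash (illustrative; not the project's actual predict signature):

def predict(file_obj, *args):
    # bail out early when no knowledge-base file has been uploaded yet
    if file_obj is None:
        return "Please upload a knowledge-base file first."
    print(file_obj.name)
    ...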

Why won't the service start properly when the host is offline?

Below is the output when starting with internet access: the webui starts, the service is reachable, models load, and chat works normally.

(chatGLM) a@b:~/chatGLM/LangChain-ChatGLM-Webui$ python app.py 
No sentence-transformers model found with name /home/a/.cache/torch/sentence_transformers/GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
No sentence-transformers model found with name /home/a/chatGLM/LangChain-ChatGLM-Webui/model_cache/GanymedeNil/text2vec-base-chinese/GanymedeNil_text2vec-base-chinese. Creating a new one with MEAN pooling.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
No compiled kernel found.
Compiling kernels : /home/a/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int8/3218e92c957a036d2716fc2eaf86454841bcef18/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /home/a/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int8/3218e92c957a036d2716fc2eaf86454841bcef18/quantization_kernels_parallel.c -shared -o /home/a/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int8/3218e92c957a036d2716fc2eaf86454841bcef18/quantization_kernels_parallel.so
Load kernel : /home/a/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int8/3218e92c957a036d2716fc2eaf86454841bcef18/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 4
Using quantization cache
Applying quantization to glm layers
The dtype of attention mask (torch.int64) is not bool

Thanks for being a Gradio user! If you have questions or feedback, please join our Discord server and chat with us: https://discord.gg/feTf9x3ZSB
Running on local URL:  http://0.0.0.0:6006

To create a public link, set `share=True` in `launch()`.

Below is the output without internet access: the webui starts and can be opened in the browser, but clicking "load model" always reports that the model was not loaded successfully.

(chatGLM) a@b:~/chatGLM/LangChain-ChatGLM-Webui$ python app.py 
Running on local URL:  http://0.0.0.0:6006

To create a public link, set `share=True` in `launch()`.

The final deployment environment has no internet access, so I'd like to know how to start the service properly offline. Many thanks!
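
A sketch of the usual offline recipe with transformers, assuming the weights are already on disk: set the standard offline switches and point the loaders at local directories (the path below is hypothetical; config.py would also need to reference local paths):

import os

os.environ["TRANSFORMERS_OFFLINE"] = "1"  # standard transformers switch
os.environ["HF_HUB_OFFLINE"] = "1"        # standard huggingface_hub switch

from transformers import AutoModel, AutoTokenizer

local_dir = "/home/a/models/chatglm-6b-int8"  # hypothetical local copy of the weights
tokenizer = AutoTokenizer.from_pretrained(local_dir, trust_remote_code=True)
model = AutoModel.from_pretrained(local_dir, trust_remote_code=True).float()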

Keeps reporting: model was not reloaded successfully; please click 'reload model'

I deployed and started app.py following the documented steps, but it keeps reporting "模型未成功重新加载,请点击重新加载模型" (model was not reloaded successfully; please click reload). What could be the cause?
The relevant log is below:

Traceback (most recent call last) ──────────────────────╮
│ /content/drive/MyDrive/LangChain-ChatGLM-Webui/app.py:215 in │
│ │
│ 212 │ return '', history, history │
│ 213 │
│ 214 │
│ ❱ 215 model_status = init_model() │
│ 216 │
│ 217 if __name__ == "__main__": │
│ 218 │ block = gr.Blocks() │
│ │
│ /content/drive/MyDrive/LangChain-ChatGLM-Webui/app.py:160 in init_model │
│ │
│ 157 │
│ 158 def init_model(): │
│ 159 │ # try: │
│ ❱ 160 │ │ knowladge_based_chat_llm.init_model_config() │
│ 161 │ │ print(knowladge_based_chat_llm.llm.call("你好")) │
│ 162 │ │ return """初始模型已成功加载,可以开始对话""" │
│ 163 │ # except Exception as e: │
│ │
│ /content/drive/MyDrive/LangChain-ChatGLM-Webui/app.py:79 in │
│ init_model_config │
│ │
│ 76 │ │ │ model_name=embedding_model_dict[embedding_model], ) │
│ 77 │ │ self.embeddings.client = sentence_transformers.SentenceTransfo │
│ 78 │ │ │ self.embeddings.model_name, device=EMBEDDING_DEVICE) │
│ ❱ 79 │ │ self.llm.load_llm(llm_device=LLM_DEVICE, num_gpus=num_gpus) │
│ 80 │ │
│ 81 │ def init_knowledge_vector_store(self, filepath): │
│ 82 │
│ │
│ /content/drive/MyDrive/LangChain-ChatGLM-Webui/chatllm.py:126 in load_llm │
│ │
│ 123 │ │ │ │ device_map: Optional[Dict[str, int]] = None, │
│ 124 │ │ │ │ **kwargs): │
│ 125 │ │ if 'chatglm' in self.model_name_or_path.lower(): │
│ ❱ 126 │ │ │ self.tokenizer = AutoTokenizer.from_pretrained(self.model

│ 127 │ │ │ │ │ │ │ │ │ │ │ │ │ trust_remote_co │
│ 128 │ │ │ if torch.cuda.is_available() and llm_device.lower().starts │
│ 129 │ │ │ │ # 根据当前设备GPU数量决定是否进行多卡部署 │
│ │
│ /usr/local/lib/python3.9/dist-packages/transformers/models/auto/tokenization │
│ _auto.py:692 in from_pretrained │
│ │
│ 689 │ │ │ │ raise ValueError( │
│ 690 │ │ │ │ │ f"Tokenizer class {tokenizer_class_candidate} does │
│ 691 │ │ │ │ ) │
│ ❱ 692 │ │ │ return tokenizer_class.from_pretrained(pretrained_model_na │
│ 693 │ │ │
│ 694 │ │ # Otherwise we have to be creative. │
│ 695 │ │ # if model is an encoder decoder, the encoder tokenizer class │
│ │
│ /usr/local/lib/python3.9/dist-packages/transformers/tokenization_utils_base. │
│ py:1812 in from_pretrained │
│ │
│ 1809 │ │ │ else: │
│ 1810 │ │ │ │ logger.info(f"loading file {file_path} from cache at │
│ 1811 │ │ │
│ ❱ 1812 │ │ return cls._from_pretrained( │
│ 1813 │ │ │ resolved_vocab_files, │
│ 1814 │ │ │ pretrained_model_name_or_path, │
│ 1815 │ │ │ init_configuration, │
│ │
│ /usr/local/lib/python3.9/dist-packages/transformers/tokenization_utils_base. │
│ py:1878 in _from_pretrained │
│ │
│ 1875 │ │ │ # For backward compatibility with odl format. │
│ 1876 │ │ │ if isinstance(init_kwargs["auto_map"], (tuple, list)): │
│ 1877 │ │ │ │ init_kwargs["auto_map"] = {"AutoTokenizer": init_kwar │
│ ❱ 1878 │ │ │ init_kwargs["auto_map"] = add_model_info_to_auto_map( │
│ 1879 │ │ │ │ init_kwargs["auto_map"], pretrained_model_name_or_pat │
│ 1880 │ │ │ ) │
│ 1881 │
│ │
│ /usr/local/lib/python3.9/dist-packages/transformers/utils/generic.py:563 in │
│ add_model_info_to_auto_map │
│ │
│ 560 │ """ │
│ 561 │ for key, value in auto_map.items(): │
│ 562 │ │ if isinstance(value, (tuple, list)): │
│ ❱ 563 │ │ │ auto_map[key] = [f"{repo_id}--{v}" if "--" not in v else v │
│ 564 │ │ else: │
│ 565 │ │ │ auto_map[key] = f"{repo_id}--{value}" if "--" not in value │
│ 566 │
│ │
│ /usr/local/lib/python3.9/dist-packages/transformers/utils/generic.py:563 in │
│ │
│ │
│ 560 │ """ │
│ 561 │ for key, value in auto_map.items(): │
│ 562 │ │ if isinstance(value, (tuple, list)): │
│ ❱ 563 │ │ │ auto_map[key] = [f"{repo_id}--{v}" if "--" not in v else v │
│ 564 │ │ else: │
│ 565 │ │ │ auto_map[key] = f"{repo_id}--{value}" if "--" not in value │
│ 566 │
╰──────────────────────────────────────────────────────────────────────────────╯
TypeError: argument of type 'NoneType' is not iterable

Request: add a similarity-threshold setting for vector retrieval

Local documents are recognized, cleaned, segmented, and stored in the vector store; I convert the question into a vector and query the store for the nearest entries. When the store contains nothing relevant, it still returns unrelated junk. Could a similarity-threshold setting be added?
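
A sketch of what such a threshold could look like on top of LangChain's FAISS wrapper; the cutoff is arbitrary and depends on the embedding model, and FAISS returns L2 distances here, so smaller means more similar:

def filter_relevant(vector_store, query: str, k: int = 3, max_distance: float = 300.0):
    # similarity_search_with_score returns (Document, distance) pairs
    docs_and_scores = vector_store.similarity_search_with_score(query, k=k)
    relevant = [doc for doc, score in docs_and_scores if score <= max_distance]
    # an empty list lets the caller answer "nothing relevant found" instead of guessing
    return relevant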

About detectron2

Is the detectron2 entry in the requirements really necessary? (It appears to be used for image detection and segmentation.) It doesn't support CUDA 12 yet; can it be left out?

API service: after loading multiple files, information from the earlier file appears to be lost

Model: BELLE-LLaMA-13B-2M
Embedding model: text2vec-base
Referenced file: https://github.com/LawRefBook/Laws/raw/master/%E5%88%91%E6%B3%95/%E5%88%91%E6%B3%95.md

Request:

{
    "input": "在战场上故意遗弃伤病军人会受到什么样的判决?", 
    "use_web": true, 
    "top_k": 3,  
    "history_len": 1, 
    "temperature": 0.01, 
    "top_p": 0.1, 
    "history": []
  }
Response:

{
    "result": "根据已知信息,在战场上故意遗弃伤病军人的行为属于徇私枉法、徇情枉法,对明知是无罪的人而使他受追诉、对明知是有罪的人而故意包庇不使他受追诉,或者在刑事审判活动中故意违背事实和法律作枉法裁判的,处五年以下有期徒刑或者拘役;情节严重的,处五年以上十年以下有期徒刑;情节特别严重的,处十年以上有期徒刑。因此,在战场上故意遗弃伤病军人的行为将受到相应的刑事惩罚。",
    "error": "",
    "stdout": "根据已知信息,在战场上故意遗弃伤病军人的行为属于徇私枉法、徇情枉法,对明知是无罪的人而使他受追诉、对明知是有罪的人而故意包庇不使他受追诉,或者在刑事审判活动中故意违背事实和法律作枉法裁判的,处五年以下有期徒刑或者拘役;情节严重的,处五年以上十年以下有期徒刑;情节特别严重的,处十年以上有期徒刑。因此,在战场上故意遗弃伤病军人的行为将受到相应的刑事惩罚。"
}

langchain-serve integration

Hey, I'm a dev from langchain-serve!

Does the LangChain part of this project have any hosted/online scenarios?

If so, you might consider our product, which makes it easy to deploy langchain in the cloud:

  • Exposes APIs from function definitions locally as well as on the cloud.
  • Very few lines of code change; development stays as easy as it is locally.
  • Supports both REST & WebSocket endpoints.
  • Serverless/autoscaling endpoints with automatic TLS certs.
  • Real-time streaming and human-in-the-loop support.

Thanks!

The model reports loading succeeded, but sending a question returns an error

Setting CPU quantization kernel threads to 6
Using quantization cache
Applying quantization to glm layers
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.8/dist-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.8/dist-packages/gradio/blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.8/dist-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.8/dist-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.8/dist-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "app.py", line 198, in predict
print(file_obj.name)
AttributeError: 'NoneType' object has no attribute 'name'
^CKeyboard interruption in main thread... closing server.
