kedreamix / linly-talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬

Home Page: https://kedreamix.github.io/

License: MIT License

Python 88.12% Shell 0.04% Jupyter Notebook 4.02% C++ 0.16% Cuda 7.19% C 0.46%

linly-talker's Introduction

Hi, nice to meet you 👋

🧡 Focused on computer vision (CV)
👯 Dream of traveling around the world
🤔 Hope to keep thinking and do some interesting things
💬 Never stop living, never stop learning. Fighting!!!
🛰️ WeChat: pikachu2biubiu
📫 Email: [email protected]
🚀 Personal blog (Github Blog): https://kedreamix.github.io/
📚 CSDN blog: https://redamancy.blog.csdn.net/
📯 Bilibili space (Bilibili Video): https://space.bilibili.com/241286257

linly-talker's Issues

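When I run python webui.py and click submit to generate a video, I get "Connection errored out"

The terminal output is shown in the screenshot below

Does anyone know what the cause is?

How should this proxy URL be set?

Do I need to set up a separate proxy server?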

API

When will an API be made available? (Waiting patiently...)

Startup problem: running python app.py produces the following error

Traceback (most recent call last):
File "D:\ai3\Linly-Talker\app.py", line 180, in <module>
talker = SadTalker(lazy_load=True)
File "D:\ai3\Linly-Talker\TFG\SadTalker.py", line 38, in __init__
self.animate_from_coeff = AnimateFromCoeff(self.sadtalker_paths, self.device)
File "D:\ai3\Linly-Talker\src\facerender\animate.py", line 82, in __init__
self.load_cpk_mapping(sadtalker_path['mappingnet_checkpoint'], mapping=mapping)
File "D:\ai3\Linly-Talker\src\facerender\animate.py", line 157, in load_cpk_mapping
checkpoint = torch.load(checkpoint_path, map_location=torch.device(device))
File "D:\ai3\Linly-Talker\venv\lib\site-packages\torch\serialization.py", line 1028, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "D:\ai3\Linly-Talker\venv\lib\site-packages\torch\serialization.py", line 1246, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
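
An `invalid load key, 'v'` from `torch.load` often means the file on disk is not a real checkpoint but a small text stub, for example a Git LFS pointer, whose first line starts with the word `version`. The sketch below is one way to check for that case. The function name is illustrative and not part of Linly-Talker, and the checkpoint path is only an example location.

```python
from pathlib import Path

# First bytes of a Git LFS pointer file: a text stub left in place of the
# binary when the real weights were never downloaded.
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file at `path` starts like a Git LFS pointer."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_MAGIC))
    return head == LFS_MAGIC


if __name__ == "__main__":
    # Example location only; adjust to wherever the checkpoints were placed.
    ckpt = Path("checkpoints") / "mapping_00229-model.pth.tar"
    if ckpt.exists() and looks_like_lfs_pointer(ckpt):
        print(f"{ckpt} is an LFS pointer, not the real weights; download it again.")
```

If the check returns True, downloading the weight file again (rather than cloning it without LFS) is the likely fix. A file that is only a few hundred bytes long is another hint of the same problem.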

gr.Error("无克隆环境或者无克隆模型权重，无法克隆声音", e)

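I trained GPT-SoVITS beforehand in a separate workspace and placed the trained weights in GPT_weights and SoVITS_weights. Running voice cloning then produces the following error (the message string means "no cloning environment or no cloned model weights; cannot clone the voice"):

/Linly-Talker/webui.py", line 114, in LLM_response
gr.Error("无克隆环境或者无克隆模型权重，无法克隆声音", e)
TypeError: Error.__init__

Does the voice cloning module in webui.py perhaps still need code changes?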
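
The traceback is cut off, but a `TypeError` raised from `Error.__init__` is consistent with `gr.Error` receiving the exception `e` as an extra positional argument. This assumes a Gradio 3.x release in which `gr.Error` accepts only a message string. One possible workaround is to fold the exception into the message before raising. `format_clone_error` is a hypothetical helper, not code from webui.py, and the file name in the demo is made up.

```python
def format_clone_error(prefix, exc):
    """Combine a user-facing prefix and an exception into one message string."""
    return f"{prefix}: {type(exc).__name__}: {exc}"


if __name__ == "__main__":
    try:
        # Made-up failure standing in for the real voice cloning call.
        raise FileNotFoundError("GPT_weights/model.ckpt")
    except Exception as e:
        message = format_clone_error("Voice cloning unavailable", e)
        print(message)
        # In the web UI the string would be passed as the single argument:
        #     raise gr.Error(message)
```

Note that `gr.Error` is an exception class, so it only shows up in the UI when it is raised, not when it is merely constructed.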

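Is real-time conversation with the digital human possible?

Would it be possible to push a stream with rtmp_streaming, so that the digital human moves in real time and can converse in real time?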

Startup problem

It reports an error after starting, and I have not been able to resolve it. What is the cause, and how can it be fixed?

(linly) D:\Linly-Talker>python app.py
Traceback (most recent call last):
File "D:\Linly-Talker\app.py", line 5, in <module>
from LLM import LLM
File "D:\Linly-Talker\LLM\__init__.py", line 1, in <module>
from .Linly import Linly
File "D:\Linly-Talker\LLM\Linly.py", line 2, in <module>
import torch
File "C:\ProgramData\Anaconda3\envs\linly\lib\site-packages\torch\__init__.py", line 130, in <module>
raise err
OSError: [WinError 127] The specified procedure could not be found. Error loading "C:\ProgramData\Anaconda3\envs\linly\lib\site-packages\torch\lib\c10_cuda.dll" or one of its dependencies.

Thanks, it runs normally now

The code ran successfully.

run app_img.py error!

config.py unchanged.
import gradio as gr
ValueError: Unknown scheme for proxy URL URL('socks://127.0.0.1:7890/')
Looking forward to your help in resolving this issue.
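
The `Unknown scheme for proxy URL` message appears to come from the HTTP client that Gradio uses (httpx, as far as I can tell), which does not recognise a bare `socks://` scheme read from the proxy environment variables. A minimal sketch of one workaround follows, assuming the local proxy really speaks SOCKS5 and that SOCKS support for the client is installed. `normalize_proxy_env` is a hypothetical helper. If either assumption does not hold, unsetting the variable is the simpler option.

```python
import os

# Environment variables commonly consulted for proxy settings.
PROXY_VARS = ("ALL_PROXY", "all_proxy", "HTTP_PROXY", "http_proxy",
              "HTTPS_PROXY", "https_proxy")


def normalize_proxy_env(env):
    """Rewrite bare 'socks://' proxy URLs to 'socks5://' in a mapping, in place.

    Returns the names of the variables that were changed.
    """
    changed = []
    for name in PROXY_VARS:
        value = env.get(name)
        if value and value.startswith("socks://"):
            env[name] = "socks5://" + value[len("socks://"):]
            changed.append(name)
    return changed


if __name__ == "__main__":
    # Run this before `import gradio`, so the client sees the fixed values.
    print(normalize_proxy_env(os.environ))
```

The function takes the mapping as a parameter so it can be tried on a plain dict first, before touching `os.environ`.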

Wrong path when saving the video

After Face Renderer:: reaches 100%, a path error is reported. Is this a configuration problem?

./results/b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\temp_b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\first_frame_dir\image_b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\input\answer.mp4: No such file or directory
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\envs\linly\lib\shutil.py", line 791, in move
os.rename(src, real_dst)
FileNotFoundError: [WinError 2] The system cannot find the file specified.: '89cf9dcd-0120-4368-8106-ef56ecd5ed86.mp4' -> './results/b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\first_frame_dir\image_b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\input\answer.mp4'
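
The `os.rename` failure above can mean either that the source file was never written or that the destination folder does not exist; the log alone does not say which. The sketch below separates the two cases. `move_into_place` is a hypothetical helper, not a function in the project.

```python
import os
import shutil


def move_into_place(src, dst):
    """Move `src` to `dst`, normalising separators and creating parent folders.

    Raises FileNotFoundError early, naming the source, if it was never written.
    """
    src = os.path.normpath(src)
    dst = os.path.normpath(dst)
    if not os.path.isfile(src):
        raise FileNotFoundError(f"source video was not produced: {src}")
    # Create the destination folder if needed, so only a missing source can fail.
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    return shutil.move(src, dst)
```

If this raises for the source file, the problem is upstream of the move, in whatever step was supposed to write the temporary video.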

Error at the LLM conversation step: "Sorry, your request has encountered an error. Please try again."

Hello, when using the webui I uploaded a voice message for the conversation. After recognition completed, the following problem occurred when submitting the video.
The GPU is a 4090.

The failing part is as follows:
extern "C"
__launch_bounds__(512, 4)
__global__ void reduction_prod_kernel(ReduceJitOp r){
r.run();
}
nvrtc: error: invalid value for --gpu-architecture (-arch)

对不起，你的请求出错了，请再次尝试。
Sorry, your request has encountered an error. Please try again.

Function predict running time: 3.0960586071014404 seconds
Function LLM_response running time: 3.160871982574463 seconds
audio2exp:: 100%|████████████████████| 19/19 [00:00<00:00, 212.45it/s]
Face Renderer:: 100%|████████████████████| 92/92 [00:18<00:00, 5.00it/s]
fps: 20 183
./results/temp_girl_answer.mp4
Function Talker_response running time: 22.409300565719604 seconds

My Qwen folder structure is as follows:

I hope you can help. Thanks!
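
The `nvrtc: error: invalid value for --gpu-architecture` line usually indicates that the CUDA runtime bundled with PyTorch is older than the GPU architecture. An RTX 4090 reports compute capability 8.9, which to my knowledge is accepted from CUDA 11.8 onward. The sketch below encodes that check for a few architectures. The table and the function name are illustrative assumptions, not part of PyTorch or of this project.

```python
# Minimum CUDA toolkit release whose NVRTC accepts each compute capability.
# Only a few recent architectures are listed; extend as needed.
MIN_CUDA_FOR_CAPABILITY = {
    (8, 0): (11, 0),  # A100
    (8, 6): (11, 1),  # RTX 30 series
    (8, 9): (11, 8),  # RTX 40 series, including the 4090
    (9, 0): (11, 8),  # H100
}


def nvrtc_supports(cuda_version, capability):
    """Return True or False if known, or None when the capability is not listed."""
    required = MIN_CUDA_FOR_CAPABILITY.get(tuple(capability))
    if required is None:
        return None
    major, minor = (int(part) for part in cuda_version.split(".")[:2])
    return (major, minor) >= required


# With PyTorch installed, the inputs could come from:
#   torch.version.cuda                   (a string such as "11.7")
#   torch.cuda.get_device_capability(0)  (a tuple such as (8, 9))
```

If the check returns False, installing a PyTorch build compiled against a newer CUDA release is the usual remedy.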

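Could the author add the RVC project (a simplified SoVITS) to the voice cloning module? ㅠㅠ

SoVITS keeps throwing errors when I run it... maybe RVC would be easier to get working..?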

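The README is too brief about GPT-SoVITS and XTTS

The configuration instructions for GPT-SoVITS and XTTS are too brief.
GPT-SoVITS also needs a number of additional packages to be downloaded, and nltk has to be downloaded and configured.
XTTS also reports errors such as missing examples/female.wav and tts_models--multilingual--multi-dataset--xtts_v2/config.json.

Could the README be more detailed or, as with SadTalker, list the models that are used and where to put them?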

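After testing, I found several problems and would appreciate help resolving them.

First, I think this project is quite good, which is why I deployed it locally to test it. That deserves recognition!

Next, my system configuration:

Lenovo P52 laptop, 64 GB RAM, P3200 6 GB + external P40 24 GB dual GPUs
Windows 11 x64, Python 3.10.13, CUDA 11.8, Torch 2.0.1 environment
Linly-AI-7B as the conversation model
First, based on the dependencies listed in requirements_app.txt, I added the ones missing from my environment:
gradio==3.38.0
edge-tts>=6.1.9
openai-whisper
zhconv
google-generativeai
transformers==4.32.0
For the rest, which the environment already had, I followed pip install -r requirements_app.txt.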

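1. Running python app.py directly succeeds, but with a warning:
Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
Traceback (most recent call last):
File "C:\Python\Python310\lib\asyncio\events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "C:\Python\Python310\lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
From what I found online, this is a common network-connection issue that does not affect usage. As far as I can tell, the cause is that asyncio does not check at run time whether the platform is Windows, Linux, or something else, and calls asyncio.set_event_loop_policy() directly in every case. It can be worked around by adding a platform check:
import asyncio
import platform

# Use the selector event loop on Windows only; other platforms keep the default.
if platform.system() == 'Windows':
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
After that, the error no longer appears.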

2. While the models are loading at run time, the following messages appear:
bin C:\Python\Python310\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118_nocublaslt.dll
[2024-01-25 17:08:13,225] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
NOTE: Redirects are currently not supported in Windows or MacOs.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 2/2 [00:37<00:00, 18.90s/it]
using safetensor as default
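However, I actually installed the Windows build of bitsandbytes. This may be related to some model-acceleration library being called, and it does not affect usage.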

3. When testing app_img.py, the final stage of video synthesis reports the following error:
{'checkpoint': 'checkpoints\SadTalker_V0.0.2_256.safetensors', 'dir_of_BFM_fitting': 'src/config', 'audio2pose_yaml_path': 'src/config\auido2pose.yaml', 'audio2exp_yaml_path': 'src/config\auido2exp.yaml', 'pirender_yaml_path': 'src/config\facerender_pirender.yaml', 'pirender_checkpoint': 'checkpoints\epoch_00190_iteration_000400000_checkpoint.pt', 'use_safetensor': True, 'mappingnet_checkpoint': 'checkpoints\mapping_00229-model.pth.tar', 'facerender_yaml': 'src/config\facerender.yaml'}
temp\1822631dac470091cee138bad413911fac97da9e\image.png
landmark Det:: 100%|███████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 5.03it/s]
3DMM Extraction In Video:: 100%|███████████████████████████████████████████████████████| 1/1 [00:00<00:00, 14.77it/s]
audio2exp:: 100%|███████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 110.95it/s]
Face Renderer:: 100%|██████████████████████████████████████████████████████████████| 123/123 [00:34<00:00, 3.54it/s]
fps: 25 123
ffmpeg error
Traceback (most recent call last):
File "C:\Python\Python310\lib\site-packages\gradio\routes.py", line 442, in run_predict
output = await app.get_blocks().process_api(
File "C:\Python\Python310\lib\site-packages\gradio\blocks.py", line 1389, in process_api
result = await self.call_function(
File "C:\Python\Python310\lib\site-packages\gradio\blocks.py", line 1094, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Python\Python310\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\Python\Python310\lib\site-packages\gradio\utils.py", line 703, in wrapper
response = f(*args, **kwargs)
File "D:\AITest\LinlyTalker\my_app_img.py", line 84, in text_response
video = sad_talker.test2(source_image,
File "D:\AITest\LinlyTalker\src\SadTalker.py", line 279, in test2
return_path = self.animate_from_coeff.generate(data, save_dir, pic_path, crop_info, enhancer='gfpgan' if use_enhancer else None, preprocess=preprocess, img_size=size)
File "D:\AITest\LinlyTalker\src\facerender\animate.py", line 272, in generate
os.remove(path)
FileNotFoundError: [WinError 3] The system cannot find the path specified.: './results/85200a0a-e6c9-4143-980f-a82b4a8dd3b5\temp_85200a0a-e6c9-4143-980f-a82b4a8dd3b5\first_frame_dir\image_85200a0a-e6c9-4143-980f-a82b4a8dd3b5\input\answer.mp4'

This may be related to the system path variable being passed in, but I could not find a fix. Please help analyze and resolve it.
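
The `ffmpeg error` line printed just before the traceback suggests the final video may never have been written, in which case `os.remove` fails on a path that does not exist. One possible cause, which is an assumption here and not confirmed by the log, is that the `ffmpeg` executable is not on PATH. A small check, with `find_tool` as a hypothetical helper name:

```python
import shutil


def find_tool(name="ffmpeg"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path is None:
        print("ffmpeg was not found on PATH; install it or add its folder to PATH.")
    else:
        print(f"ffmpeg found at {path}")
```

If ffmpeg is found, the next thing to inspect would be the ffmpeg command's own error output, which this log does not show.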

4. When using app.py and app_multi.py, I wanted to change the default avatar example.png to another one, but changing the image path in the script had no effect. In the end I deleted the entire first_frame_dir directory under inputs, and running then gave the following error:
Traceback (most recent call last):
File "C:\Python\Python310\lib\site-packages\scipy\io\matlab\_mio.py", line 39, in _open_file
return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: './inputs/first_frame_dir/example.mat'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Python\Python310\lib\site-packages\gradio\routes.py", line 442, in run_predict
output = await app.get_blocks().process_api(
File "C:\Python\Python310\lib\site-packages\gradio\blocks.py", line 1389, in process_api
result = await self.call_function(
File "C:\Python\Python310\lib\site-packages\gradio\blocks.py", line 1094, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Python\Python310\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\Python\Python310\lib\site-packages\gradio\utils.py", line 703, in wrapper
response = f(*args, **kwargs)
File "D:\AITest\LinlyTalker\my_app_multi.py", line 148, in human_respone
video_path = sad_talker.test(source_image,
File "D:\AITest\LinlyTalker\src\SadTalker.py", line 153, in test
batch = get_data(first_coeff_path, audio_path, self.device, ref_eyeblink_coeff_path=ref_eyeblink_coeff_path, still=still_mode,
File "D:\AITest\LinlyTalker\src\generate_batch.py", line 82, in get_data
source_semantics_dict = scio.loadmat(source_semantics_path)
File "C:\Python\Python310\lib\site-packages\scipy\io\matlab\_mio.py", line 225, in loadmat
with _open_file_context(file_name, appendmat) as f:
File "C:\Python\Python310\lib\contextlib.py", line 135, in __enter__
return next(self.gen)
File "C:\Python\Python310\lib\site-packages\scipy\io\matlab\_mio.py", line 17, in _open_file_context
f, opened = _open_file(file_like, appendmat, mode)
File "C:\Python\Python310\lib\site-packages\scipy\io\matlab\_mio.py", line 45, in _open_file
return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: './inputs/first_frame_dir/example.mat'

It seems something is hard-coded in the script. Please point out what to change so that the default avatar can be replaced. Thanks!
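
Judging only from the traceback, the app seems to load precomputed coefficients from a `.mat` file named after the image (`example.mat` for `example.png`) inside `first_frame_dir`. If that reading is right, a new avatar needs its own coefficient file before it can be used. The sketch below shows how such a path could be derived and checked. Both function names and the naming rule are assumptions inferred from the error message, not confirmed project behaviour.

```python
from pathlib import Path


def coeff_path_for(image_path, first_frame_dir="./inputs/first_frame_dir"):
    """Return where the cached coefficients for `image_path` would be expected.

    Assumes the cache file is named '<image stem>.mat', as the traceback suggests.
    """
    return Path(first_frame_dir) / (Path(image_path).stem + ".mat")


def needs_preprocessing(image_path, first_frame_dir="./inputs/first_frame_dir"):
    """Return True when no cached coefficient file exists for this image yet."""
    return not coeff_path_for(image_path, first_frame_dir).is_file()
```

A check like this before calling the generator would at least turn the `FileNotFoundError` into a clear "run preprocessing for this image first" message.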

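Error when starting webui.py: SadTalker Error: invalid load key, 'v'.

I have downloaded the SadTalker weights and kept them consistent with the layout below:

But I still get an error when running the program:

Were the model weight files downloaded incorrectly? Where can the correct files be downloaded?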

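The page just shows "connect to network"; what is going on?

After opening the web page, a prompt appears saying that connecting to the network requires login, and then it redirects to http://edge-http.microsoft.com/captiveportal/generate_204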
http://www.gstatic.com/generate_204

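The Colab link cannot find the startup file

The group chat QR code has expired; please add me. Also a question about the project

I want to pass in the text myself, so that the local digital human skips the large language model and runs TTS and wav2lip directly. Is that possible?

Error when running python app.py

Hello, I have a question: this error is reported when running from the project root directory.

A question about the digital human

I want to pass in a passage of text for the digital human to read aloud. How can I do that?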

Image

Could you provide a more convenient prebuilt image?

support qwen models?

Error when installing with pip install -r VITS/requirements_gptsovits.txt

Following your instructions, a dependency problem occurred while installing this:
pip install -r VITS/requirements_gptsovits.txt

Installing build dependencies ... error
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 2
╰─> [63 lines of output]
Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Ignoring oldest-supported-numpy: markers 'python_version < "3.9"' don't match your environment
ERROR: Exception:
Traceback (most recent call last):

Installing build dependencies ... error
error: subprocess-exited-with-error

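Running python app_img.py to generate a video reports an error every time

Running python app_img.py to generate a video reports this kind of error every time; app.py and app_multi.py work fine after starting.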