kedreamix / linly-talker

Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬

Home Page: https://kedreamix.github.io/

License: MIT License

Python 88.12% Shell 0.04% Jupyter Notebook 4.02% C++ 0.16% Cuda 7.19% C 0.46%

linly-talker's Introduction

Hi, nice to meet you 👋

linly-talker's People

Contributors

kaixindelele, kedreamix, yarkable

linly-talker's Issues

API

When will an API be made available? (Waiting eagerly...)

Startup problem: running python app.py produces the following error

Traceback (most recent call last):
File "D:\ai3\Linly-Talker\app.py", line 180, in <module>
talker = SadTalker(lazy_load=True)
File "D:\ai3\Linly-Talker\TFG\SadTalker.py", line 38, in __init__
self.animate_from_coeff = AnimateFromCoeff(self.sadtalker_paths, self.device)
File "D:\ai3\Linly-Talker\src\facerender\animate.py", line 82, in __init__
self.load_cpk_mapping(sadtalker_path['mappingnet_checkpoint'], mapping=mapping)
File "D:\ai3\Linly-Talker\src\facerender\animate.py", line 157, in load_cpk_mapping
checkpoint = torch.load(checkpoint_path, map_location=torch.device(device))
File "D:\ai3\Linly-Talker\venv\lib\site-packages\torch\serialization.py", line 1028, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "D:\ai3\Linly-Talker\venv\lib\site-packages\torch\serialization.py", line 1246, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
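
One common cause of `invalid load key, 'v'` is that the checkpoint on disk is a Git LFS pointer (a small text file beginning with `version https://git-lfs...`) rather than the real weights. Whether that applies here is an assumption; the sketch below is only a quick diagnostic. The helper name is hypothetical, and the path in the usage comment is a guess at where the mapping checkpoint lives.

```python
# Header that every Git LFS pointer file starts with.
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file begins with the Git LFS pointer header."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_PREFIX))
    return head == LFS_PREFIX


# Usage (path is an assumption, adjust to your checkout):
#     looks_like_lfs_pointer("checkpoints/mapping_00229-model.pth.tar")
```

If this returns True, the file has to be downloaded again as the actual binary before `torch.load` can read it.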

gr.Error("无克隆环境或者无克隆模型权重,无法克隆声音", e)

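I trained GPT-SoVITS beforehand in a separate workspace and then placed the trained weights in GPT_weights and SoVITS_weights. Running voice cloning then produced the following error (the message in the code means "No cloning environment or no cloned model weights; unable to clone the voice"):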

/Linly-Talker/webui.py", line 114, in LLM_response
gr.Error("无克隆环境或者无克隆模型权重,无法克隆声音", e)
TypeError: Error.__init__

Does the voice-cloning module in webui.py perhaps still need code changes?
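
The TypeError suggests that, in the installed gradio version, `gr.Error` does not accept the exception as a second positional argument. A minimal sketch of one workaround, assuming that is the cause: fold the exception into a single message string and raise the error. The helper name is hypothetical, and the English message is a translation of the original string.

```python
def clone_error_message(exc):
    """Build one message string that includes the underlying exception text."""
    return (
        "No cloning environment or no cloned model weights; "
        f"unable to clone the voice: {exc}"
    )


# Inside the Gradio handler (requires `import gradio as gr`):
#     except Exception as e:
#         raise gr.Error(clone_error_message(e))
```

`gr.Error` is an exception class, so it only reaches the UI when it is raised, not when it is merely constructed.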

Startup problem

An error occurs on startup and I have not been able to resolve it. What causes this, and how can it be fixed?

(linly) D:\Linly-Talker>python app.py
Traceback (most recent call last):
File "D:\Linly-Talker\app.py", line 5, in <module>
from LLM import LLM
File "D:\Linly-Talker\LLM\__init__.py", line 1, in <module>
from .Linly import Linly
File "D:\Linly-Talker\LLM\Linly.py", line 2, in <module>
import torch
File "C:\ProgramData\Anaconda3\envs\linly\lib\site-packages\torch\__init__.py", line 130, in <module>
raise err
OSError: [WinError 127] The specified procedure could not be found. Error loading "C:\ProgramData\Anaconda3\envs\linly\lib\site-packages\torch\lib\c10_cuda.dll" or one of its dependencies.

run app_img.py error!

config.py unchanged.
import gradio as gr
ValueError: Unknown scheme for proxy URL URL('socks://127.0.0.1:7890/')
Looking forward to your help in resolving this issue.
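
The `socks://` URL most likely comes from a proxy environment variable such as `ALL_PROXY`, which the HTTP client used by gradio reads at import time and rejects because the scheme is not recognized. A minimal sketch of one workaround, assuming that is the source: rewrite bare `socks://` values to `socks5://` before importing gradio. The function name and the list of variables are assumptions, and `socks5://` support may itself require an extra package (for httpx, the `socks` extra).

```python
import os

# Environment variables that commonly carry proxy settings.
PROXY_VARS = (
    "ALL_PROXY", "all_proxy",
    "HTTP_PROXY", "http_proxy",
    "HTTPS_PROXY", "https_proxy",
)


def normalize_proxy_env(env):
    """Rewrite bare socks:// proxy URLs to socks5:// in place.

    Returns the names of the variables that were changed.
    """
    changed = []
    for name in PROXY_VARS:
        value = env.get(name, "")
        if value.startswith("socks://"):
            env[name] = "socks5://" + value[len("socks://"):]
            changed.append(name)
    return changed


# Call this before `import gradio as gr`:
#     normalize_proxy_env(os.environ)
```

Unsetting the variable for the session is an alternative if no proxy is needed to reach the local server.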

Video save path error

After Face Renderer reaches 100%, a path error is reported. Is this a configuration problem?

./results/b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\temp_b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\first_frame_dir\image_b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\input\answer.mp4: No such file or directory
Traceback (most recent call last):
File "C:\ProgramData\anaconda3\envs\linly\lib\shutil.py", line 791, in move
os.rename(src, real_dst)
FileNotFoundError: [WinError 2] The system cannot find the file specified: '89cf9dcd-0120-4368-8106-ef56ecd5ed86.mp4' -> './results/b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\first_frame_dir\image_b71f4ace-a29e-47fe-a1ad-edeb3ba99e28\input\answer.mp4'
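
A rename fails with this error when either the source file was never written or the destination's parent directory does not exist. The sketch below, which is not the project's code, checks the first case and creates the directory for the second; the function name is hypothetical.

```python
import os
import shutil


def move_into_place(src, dst):
    """Move src to dst, creating the destination directory if needed."""
    if not os.path.exists(src):
        # The renderer (or ffmpeg) never produced the file.
        raise FileNotFoundError(f"source video was never written: {src}")
    dst = os.path.normpath(dst)  # unify mixed '/' and '\' separators
    parent = os.path.dirname(dst)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return shutil.move(src, dst)
```

If the explicit check fires, the problem lies upstream of the move, for example in the video encoding step.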

Error in the LLM dialogue step: "Sorry, your request has encountered an error. Please try again."

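Hello. When I upload a voice recording in the webui, recognition completes, but the following problem occurs when I submit it for video generation.
The GPU is a 4090.

The error output is as follows: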
extern "C"
__launch_bounds__(512, 4)
__global__ void reduction_prod_kernel(ReduceJitOp r){
r.run();
}
nvrtc: error: invalid value for --gpu-architecture (-arch)

对不起,你的请求出错了,请再次尝试。
Sorry, your request has encountered an error. Please try again.

Function predict run time: 3.0960586071014404 seconds
Function LLM_response run time: 3.160871982574463 seconds
audio2exp:: 100%|██████████| 19/19 [00:00<00:00, 212.45it/s]
Face Renderer:: 100%|██████████| 92/92 [00:18<00:00, 5.00it/s]
fps: 20 183
./results/temp_girl_answer.mp4
Function Talker_response run time: 22.409300565719604 seconds

My Qwen folder structure is as follows:
(image)

I hope you can help with this. Thank you!
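
The `nvrtc: error: invalid value for --gpu-architecture` message typically appears when the CUDA toolkit bundled with PyTorch predates the GPU. The RTX 4090 has compute capability 8.9, which CUDA supports from version 11.8 onward. Whether this is the cause here is an assumption; the sketch below only checks a version string, and the function name is hypothetical.

```python
def cuda_build_supports_sm89(cuda_version):
    """Return True if a CUDA version string such as '11.8' is 11.8 or newer."""
    if not cuda_version:
        return False  # CPU-only builds report no CUDA version
    major, minor = (int(part) for part in cuda_version.split(".")[:2])
    return (major, minor) >= (11, 8)


# With PyTorch installed:
#     import torch
#     print(torch.version.cuda, torch.cuda.get_device_capability(0))
#     print(cuda_build_supports_sm89(torch.version.cuda))
```

If the check returns False, installing a PyTorch build compiled against CUDA 11.8 or later would be the first thing to try.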

The README is too brief about GPT-SoVITS and XTTS

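The configuration instructions for GPT-SoVITS and XTTS are too brief.
GPT-SoVITS needs a number of additional packages to be downloaded, and nltk also has to be downloaded and configured.
XTTS also reports errors such as missing examples/female.wav and tts_models--multilingual--multi-dataset--xtts_v2/config.json.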

Could the README be more detailed, or, as with SadTalker, list the models that are called and where each one should be placed?
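
Until the README lists these files, a small pre-flight check can show which ones are missing before the app starts. This is a sketch only: the two relative paths come from the error messages above, while the root directory they belong under is an assumption that has to be adjusted per setup.

```python
from pathlib import Path

# Relative paths taken from the XTTS error messages quoted in this issue.
XTTS_FILES = [
    "examples/female.wav",
    "tts_models--multilingual--multi-dataset--xtts_v2/config.json",
]


def missing_files(root, relative_paths):
    """Return the relative paths that do not exist under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).exists()]


# Usage (root "." is a placeholder, not the project's documented layout):
#     print(missing_files(".", XTTS_FILES))
```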

Found several problems during testing; guidance on resolving them would be appreciated.

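First, I think this is a good project, which is why I deployed it locally for testing. It deserves credit!

Next, my system setup:

  1. Lenovo P52 laptop, 64 GB RAM, P3200 6 GB plus an external P40 24 GB (dual GPUs)
  2. Windows 11 x64, Python 3.10.13, CUDA 11.8, Torch 2.0.1
  3. Linly-AI-7B as the dialogue model

Based on the dependencies listed in requirements_app.txt, I first added the ones missing from my environment:
    gradio==3.38.0
    edge-tts>=6.1.9
    openai-whisper
    zhconv
    google-generativeai
    transformers==4.32.0
The others, which the environment already had, were handled by pip install -r requirements_app.txt.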

I. Running python app.py directly succeeds, but with a warning:
Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
Traceback (most recent call last):
File "C:\Python\Python310\lib\asyncio\events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "C:\Python\Python310\lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
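Searching online shows that this network-connection issue is very common and does not affect usage. The cause appears to be that the asyncio event loop policy is applied without checking whether the platform is Windows, Linux, or something else. It can be fixed by adding a platform check:
import asyncio
import platform

if platform.system() == 'Windows':
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
After that, the error no longer appears.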

II. While models load at runtime, the following messages appear:
bin C:\Python\Python310\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118_nocublaslt.dll
[2024-01-25 17:08:13,225] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
NOTE: Redirects are currently not supported in Windows or MacOs.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 2/2 [00:37<00:00, 18.90s/it]
using safetensor as default
In fact, I installed a Windows build of bitsandbytes. The messages are probably related to a call into some model-acceleration library and do not affect usage.

III. Testing app_img.py: at the final stage of video synthesis, the following error occurs:
{'checkpoint': 'checkpoints\SadTalker_V0.0.2_256.safetensors', 'dir_of_BFM_fitting': 'src/config', 'audio2pose_yaml_path': 'src/config\auido2pose.yaml', 'audio2exp_yaml_path': 'src/config\auido2exp.yaml', 'pirender_yaml_path': 'src/config\facerender_pirender.yaml', 'pirender_checkpoint': 'checkpoints\epoch_00190_iteration_000400000_checkpoint.pt', 'use_safetensor': True, 'mappingnet_checkpoint': 'checkpoints\mapping_00229-model.pth.tar', 'facerender_yaml': 'src/config\facerender.yaml'}
temp\1822631dac470091cee138bad413911fac97da9e\image.png
landmark Det:: 100%|███████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 5.03it/s]
3DMM Extraction In Video:: 100%|███████████████████████████████████████████████████████| 1/1 [00:00<00:00, 14.77it/s]
audio2exp:: 100%|███████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 110.95it/s]
Face Renderer:: 100%|██████████████████████████████████████████████████████████████| 123/123 [00:34<00:00, 3.54it/s]
fps: 25 123
ffmpeg error
Traceback (most recent call last):
File "C:\Python\Python310\lib\site-packages\gradio\routes.py", line 442, in run_predict
output = await app.get_blocks().process_api(
File "C:\Python\Python310\lib\site-packages\gradio\blocks.py", line 1389, in process_api
result = await self.call_function(
File "C:\Python\Python310\lib\site-packages\gradio\blocks.py", line 1094, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Python\Python310\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\Python\Python310\lib\site-packages\gradio\utils.py", line 703, in wrapper
response = f(*args, **kwargs)
File "D:\AITest\LinlyTalker\my_app_img.py", line 84, in text_response
video = sad_talker.test2(source_image,
File "D:\AITest\LinlyTalker\src\SadTalker.py", line 279, in test2
return_path = self.animate_from_coeff.generate(data, save_dir, pic_path, crop_info, enhancer='gfpgan' if use_enhancer else None, preprocess=preprocess, img_size=size)
File "D:\AITest\LinlyTalker\src\facerender\animate.py", line 272, in generate
os.remove(path)
FileNotFoundError: [WinError 3] The system cannot find the path specified: './results/85200a0a-e6c9-4143-980f-a82b4a8dd3b5\temp_85200a0a-e6c9-4143-980f-a82b4a8dd3b5\first_frame_dir\image_85200a0a-e6c9-4143-980f-a82b4a8dd3b5\input\answer.mp4'

This may be related to the path variable that the code passes in, but I could not find a fix. Please help analyze and resolve it.
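
The log prints "ffmpeg error" just before the traceback, so one possibility is that the intermediate video was never written because ffmpeg could not be run. A quick check, assuming the project invokes an `ffmpeg` executable found on PATH (the function name is hypothetical):

```python
import shutil


def ffmpeg_path():
    """Return the full path of the ffmpeg executable on PATH, or None."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = ffmpeg_path()
    print(path if path else "ffmpeg was not found on PATH")
```

If this reports that ffmpeg is missing, installing it and adding its directory to PATH would be worth trying before looking at the path handling itself.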

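IV. When using app.py and app_multi.py, I wanted to replace the default avatar example.png with a different one, but changing the image path in the script had no effect. I finally deleted the entire first_frame_dir directory under inputs, and running then produced the following error: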
Traceback (most recent call last):
File "C:\Python\Python310\lib\site-packages\scipy\io\matlab\_mio.py", line 39, in _open_file
return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: './inputs/first_frame_dir/example.mat'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Python\Python310\lib\site-packages\gradio\routes.py", line 442, in run_predict
output = await app.get_blocks().process_api(
File "C:\Python\Python310\lib\site-packages\gradio\blocks.py", line 1389, in process_api
result = await self.call_function(
File "C:\Python\Python310\lib\site-packages\gradio\blocks.py", line 1094, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Python\Python310\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\Python\Python310\lib\site-packages\gradio\utils.py", line 703, in wrapper
response = f(*args, **kwargs)
File "D:\AITest\LinlyTalker\my_app_multi.py", line 148, in human_respone
video_path = sad_talker.test(source_image,
File "D:\AITest\LinlyTalker\src\SadTalker.py", line 153, in test
batch = get_data(first_coeff_path, audio_path, self.device, ref_eyeblink_coeff_path=ref_eyeblink_coeff_path, still=still_mode,
File "D:\AITest\LinlyTalker\src\generate_batch.py", line 82, in get_data
source_semantics_dict = scio.loadmat(source_semantics_path)
File "C:\Python\Python310\lib\site-packages\scipy\io\matlab\_mio.py", line 225, in loadmat
with _open_file_context(file_name, appendmat) as f:
File "C:\Python\Python310\lib\contextlib.py", line 135, in __enter__
return next(self.gen)
File "C:\Python\Python310\lib\site-packages\scipy\io\matlab\_mio.py", line 17, in _open_file_context
f, opened = _open_file(file_like, appendmat, mode)
File "C:\Python\Python310\lib\site-packages\scipy\io\matlab\_mio.py", line 45, in _open_file
return open(file_like, mode), True
FileNotFoundError: [Errno 2] No such file or directory: './inputs/first_frame_dir/example.mat'

It seems something is hard-coded in this script. Please point out what to change so that a different default avatar can be used. Thank you!

Image

Could a more convenient prebuilt image be provided?

pip install -r VITS/requirements_gptsovits.txt fails during installation

Following your instructions, a dependency problem occurred while installing this:
pip install -r VITS/requirements_gptsovits.txt

Installing build dependencies ... error
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 2
╰─> [63 lines of output]
Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Ignoring oldest-supported-numpy: markers 'python_version < "3.9"' don't match your environment
ERROR: Exception:
Traceback (most recent call last):

pip install -r VITS/requirements_gptsovits.txt reports an error

Following your instructions, a dependency problem occurred while installing this:

Installing build dependencies ... error
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 2
╰─> [63 lines of output]
Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Ignoring oldest-supported-numpy: markers 'python_version < "3.9"' don't match your environment
ERROR: Exception:
Traceback (most recent call last):
