zz-ww / sadtalker-video-lip-sync
This project implements Wav2lip-style video lip synthesis on top of SadTalker. Lip shapes are generated by driving a video file with speech, and a configurable enhancement mode for the face region sharpens the synthesized lip (face) area to improve the clarity of the generated lips. The DAIN deep-learning frame-interpolation algorithm is used to add frames to the generated video, filling in the motion transitions between synthesized lip frames so that the result looks smoother, more realistic, and more natural.
`(SadTalker-Wav2lip) H:\SadTalker-Video-Lip-Sync>
(SadTalker-Wav2lip) H:\SadTalker-Video-Lip-Sync>python inference.py --driven_audio 1.wav --source_video 1.mp4 --use_DAIN
./checkpoints\epoch_20.pth
./checkpoints\auido2pose_00140-model.pth
./checkpoints\auido2exp_00300-model.pth
./checkpoints\facevid2vid_00189-model.pth.tar
./checkpoints\mapping_00109-model.pth.tar
3DMM Extraction for source image
landmark Det:: 100%|███████████████████████████████████████████████████████████████████████████████| 1020/1020 [01:11<00:00, 14.18it/s]
3DMM Extraction In Video:: 100%|███████████████████████████████████████████████████████████████████| 1020/1020 [00:16<00:00, 63.24it/s]
mel:: 100%|███████████████████████████████████████████████████████████████████████████████████████| 423/423 [00:00<00:00, 40052.16it/s]
audio2exp:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 43/43 [00:00<00:00, 124.36it/s]
Face Renderer:: 100%|████████████████████████████████████████████████████████████████████████████████| 423/423 [00:35<00:00, 11.85it/s]
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 76, in main
tmp_path, new_audio_path, return_path = animate_from_coeff.generate(data, save_dir, pic_path, crop_info,
File "H:\SadTalker-Video-Lip-Sync\src\facerender\animate.py", line 154, in generate
sound = AudioSegment.from_mp3(audio_path)
File "H:\anaconda3\envs\SadTalker-Wav2lip\lib\site-packages\pydub\audio_segment.py", line 796, in from_mp3
return cls.from_file(file, 'mp3', parameters=parameters)
File "H:\anaconda3\envs\SadTalker-Wav2lip\lib\site-packages\pydub\audio_segment.py", line 773, in from_file
raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1
Output from ffmpeg/avlib:
ffmpeg version 2022-07-06-git-03d81a044a-full_build-www.gyan.dev Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 12.1.0 (Rev2, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 57. 27.100 / 57. 27.100
libavcodec 59. 36.100 / 59. 36.100
libavformat 59. 26.100 / 59. 26.100
libavdevice 59. 6.100 / 59. 6.100
libavfilter 8. 41.100 / 8. 41.100
libswscale 6. 6.100 / 6. 6.100
libswresample 4. 6.100 / 4. 6.100
libpostproc 56. 5.100 / 56. 5.100
[mp3float @ 000001c4461b57c0] Header missing
Last message repeated 127 times
[mp3 @ 000001c44619f200] Could not find codec parameters for stream 0 (Audio: mp3 (mp3float), 0 channels, fltp): unspecified frame size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, mp3, from '1.wav':
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0: Audio: mp3, 0 channels, fltp
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s32le (native))
Press [q] to stop, [?] for help
[mp3float @ 000001c446629b80] Header missing
Error while decoding stream #0:0: Invalid data found when processing input
[the two lines above repeat 127 more times]
[abuffer @ 000001c4461a1440] Value inf for parameter 'time_base' out of range [0 - 2.14748e+09]
Last message repeated 1 times
[abuffer @ 000001c4461a1440] Error setting option time_base to value 1/0.
[graph_0_in_0_0 @ 000001c446625e40] Error applying options to the filter.
Error reinitializing filters!
Error while filtering: Result too large
Finishing stream 0:0 without any data written to it.
[abuffer @ 000001c4461da2c0] Value inf for parameter 'time_base' out of range [0 - 2.14748e+09]
Last message repeated 1 times
[abuffer @ 000001c4461da2c0] Error setting option time_base to value 1/0.
[graph_0_in_0_0 @ 000001c4461da1c0] Error applying options to the filter.
Error configuring filter graph
Conversion failed!`
Not sure what else to do. I've tried different files and uninstalled and re-installed ffmpeg, but I keep getting this. Any ideas?
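The log above shows `AudioSegment.from_mp3(audio_path)` being called and ffmpeg reporting `Input #0, mp3, from '1.wav'`, so a `.wav` file is being decoded as MP3. A minimal sketch of one way to avoid that is to detect the container from the file header before decoding. `sniff_audio_format` is a hypothetical helper, not part of the repository; the returned string could then be passed to pydub as `AudioSegment.from_file(path, format=fmt)`.

```python
def sniff_audio_format(path):
    """Guess the audio container from the first bytes of the file.

    Hypothetical helper: returns "wav", "mp3", or None if unrecognised.
    """
    with open(path, "rb") as f:
        head = f.read(12)
    # RIFF/WAVE files start with "RIFF", a 4-byte size, then "WAVE".
    if len(head) >= 12 and head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    # MP3 files often start with an ID3v2 tag.
    if head[:3] == b"ID3":
        return "mp3"
    # Otherwise look for an MPEG audio frame sync (11 leading set bits).
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mp3"
    return None
```

This only inspects the header, so it says nothing about whether the rest of the file is valid.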
Hi,
I'm trying the first run with no success. After cloning the repository and installing the requirements, I run the following in the main project's directory:
python inference.py --driven-audio examples/driven_audio/chinese_poem1.wav --source-video examples/driven_video/1.mp4
The output:
Traceback (most recent call last):
File "M:\AI\SadTalker-Video-Lip-Sync\inference.py", line 5, in <module>
from src.utils.preprocess import CropAndExtract
ModuleNotFoundError: No module named 'src.utils'
I took a look at the project's structure and it's alright (src/utils are there). Am I missing something here? Thanks in advance.
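When `src/utils` exists on disk but Python still reports `No module named 'src.utils'`, one possibility (an assumption, not confirmed for this report) is that a different `src` package earlier on `sys.path` is being picked up. A small diagnostic sketch using only the standard library; `describe_module` is a hypothetical name:

```python
import importlib.util


def describe_module(name):
    """Return the file or directories Python would import `name` from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name cannot be imported.
        return None
    if spec is None:
        return None
    if spec.submodule_search_locations is not None:
        # A package: report the directories it spans.
        return list(spec.submodule_search_locations)
    return [spec.origin]
```

Running `print(describe_module("src"))` from the project directory shows which `src` wins; if the path printed is not inside the cloned repository, that would explain the error.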
Could the author share how SadTalker-Video-Lip-Sync differs from VideoReTalking and wav2lip?
W0630 09:16:06.445015 25868 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.2, Runtime API Version: 10.2
W0630 09:16:06.447014 25868 dynamic_loader.cc:276] Note: [Recommend] copy cudnn into CUDA installation directory.
For instance, download cudnn-10.0-windows10-x64-v7.6.5.32.zip from NVIDIA's official website,
then, unzip it and copy it into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0
You should do this according to your CUDA installation directory and CUDNN version.
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 83, in main
predictor_dian = dain_predictor.DAINPredictor(args.dian_output, weight_path=args.DAIN_weight,
File "D:\SadTalker-Video-Lip-Sync\src\dain_model\dain_predictor.py", line 64, in __init__
self.build_inference_model()
File "D:\SadTalker-Video-Lip-Sync\src\dain_model\base_predictor.py", line 25, in build_inference_model
self.program, self.feed_names, self.fetch_targets = paddle.static.load_inference_model(
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in impl
return wrapped_func(*args, **kwargs)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\framework.py", line 443, in impl
return func(*args, **kwargs)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\static\io.py", line 809, in load_inference_model
deserialize_persistables(program, params_bytes, executor)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in impl
return wrapped_func(*args, **kwargs)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\framework.py", line 443, in impl
return func(*args, **kwargs)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\static\io.py", line 650, in deserialize_persistables
executor.run(load_program)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1299, in run
six.reraise(*sys.exc_info())
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\six.py", line 719, in reraise
raise value
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1285, in run
res = self._run_impl(
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1464, in _run_impl
return new_exe.run(list(feed.keys()), fetch_list, return_numpy)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 547, in run
tensors = self._new_exe.run(feed_names, fetch_list)._move_to_list()
RuntimeError: (PreconditionNotMet) The third-party dynamic library (cudnn64_7.dll) that Paddle depends on is not configured correctly. (error code is 126)
Suggestions:
export LD_LIBRARY_PATH=...
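The `export LD_LIBRARY_PATH=...` suggestion in the message is Linux-oriented; on Windows, DLLs are normally located through `PATH`. A minimal sketch for checking whether `cudnn64_7.dll` is visible on the search path at all. `find_library_on_path` is a hypothetical helper, and Windows may consult other locations as well, so an empty result is a hint rather than proof.

```python
import os
from pathlib import Path


def find_library_on_path(filename, search_path=None):
    """List every directory entry on the search path that contains `filename`."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for entry in search_path.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing separator
        candidate = Path(entry) / filename
        if candidate.is_file():
            hits.append(str(candidate))
    return hits
```

For example, `find_library_on_path("cudnn64_7.dll")` returning an empty list would be consistent with the error above.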
Error configuring filter graph
Conversion failed!
How can I solve this problem?
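I would like the virtual human to change emotion during a conversation, for example speaking sadly, happily, or excitedly. Would it be better to use different source videos that already carry each emotion, or is there another way to achieve this?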
3DMM Extraction In Video:: 0%| | 0/135 [00:00<?, ?it/s]
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 67, in main
first_coeff_path, crop_pic_path, crop_info = preprocess_model.generate(pic_path, first_frame_dir)
File "D:\3090\SadTalker-Video-Lip-Sync-master\src\utils\preprocess.py", line 124, in generate
trans_params, im1, lm1, _ = align_img(frame, lm1, self.lm3d_std)
File "D:\3090\SadTalker-Video-Lip-Sync-master\src\face3d\util\preprocess.py", line 101, in align_img
trans_params = np.array([w0, h0, s, t[0], t[1]])
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (5,) + inhomogeneous part.
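A minimal sketch of one possible workaround for the `ValueError` above. It assumes (the traceback does not show this) that `t` reaches `align_img` as a `(2, 1)` array, so `t[0]` and `t[1]` are one-element arrays rather than scalars; NumPy 1.24 and later refuse to build an array from such a ragged list. `pack_trans_params` is a hypothetical helper, not a function in the repository.

```python
import numpy as np


def pack_trans_params(w0, h0, s, t):
    """Build the 5-element parameter vector from scalars and a 2-vector `t`.

    Assumption: `t` may be shaped (2,) or (2, 1), and `s` may be a scalar or
    a one-element array.
    """
    t = np.asarray(t, dtype=np.float64).reshape(-1)  # flatten to shape (2,)
    s = float(np.asarray(s).reshape(-1)[0])          # force a plain float
    return np.array([w0, h0, s, t[0], t[1]], dtype=np.float64)
```

If the assumption holds, replacing the `np.array([w0, h0, s, t[0], t[1]])` line with a call like this would give a homogeneous float array on both old and new NumPy versions.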
The lip shapes do not match at all! This still needs tuning.
Even when the audio input is completely silent, the output video still shows a talking mouth. I suspect that some of the networks are not running on my device, although no error messages appear in the command line at any point. Has anyone run into a similar problem, or does anyone have an idea what causes it?
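Before suspecting the networks, it may help to confirm that the test file really is silent. A minimal sketch, assuming a 16-bit PCM WAV input; `wav_rms` is a hypothetical helper that returns the root-mean-square sample value, which is exactly 0.0 for digital silence.

```python
import array
import math
import sys
import wave


def wav_rms(path):
    """Return the RMS amplitude of a 16-bit PCM WAV file (0.0 means silence)."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

A small but non-zero value would mean the file contains low-level noise rather than true silence.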
As the title says.
Could you share the commit hash, or the approximate date of the fork?
Please give me the steps, or a starting file, to run this from Colab like SadTalker.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\soundmaker\SadTalker-Video-Lip-Sync\inference.py:124 in │
│ │
│ 121 │ │ args.device = "cuda" │
│ 122 │ else: │
│ 123 │ │ args.device = "cpu" │
│ ❱ 124 │ main(args) │
│ 125 │
│ │
│ D:\soundmaker\SadTalker-Video-Lip-Sync\inference.py:87 in main │
│ │
│ 84 │ │ predictor_dian = dain_predictor.DAINPredictor(args.dian_output, weight_path=args │
│ 85 │ │ │ │ │ │ │ │ │ │ │ │ │ time_step=args.time_step, │
│ 86 │ │ │ │ │ │ │ │ │ │ │ │ │ remove_duplicates=args.remove_dupl │
│ ❱ 87 │ │ frames_path, temp_video_path = predictor_dian.run(tmp_path) │
│ 88 │ │ paddle.disable_static() │
│ 89 │ │ save_path = return_path[:-4] + 'dain.mp4' │
│ 90 │ │ command = r'ffmpeg -y -i "%s" -i "%s" -vcodec copy "%s"' % (temp_video_path, new │
│ │
│ D:\soundmaker\SadTalker-Video-Lip-Sync/src/..\src\dain_model\dain_predictor.py:166 in run │
│ │
│ 163 │ │ │ │
│ 164 │ │ │ X = np.concatenate((X0, X1), axis=0) │
│ 165 │ │ │ │
│ ❱ 166 │ │ │ o = self.base_forward(X) │
│ 167 │ │ │ │
│ 168 │ │ │ y = o[0] │
│ 169 │
│ │
│ D:\soundmaker\SadTalker-Video-Lip-Sync/src/..\src\dain_model\base_predictor.py:44 in │
│ base_forward │
│ │
│ 41 │ │ │ else: │
│ 42 │ │ │ │ feed_dict[self.feed_names[0]] = inputs │
│ 43 │ │ │ │
│ ❱ 44 │ │ │ out = self.exe.run(self.program, │
│ 45 │ │ │ │ │ │ │ fetch_list=self.fetch_targets, │
│ 46 │ │ │ │ │ │ │ feed=feed_dict) │
│ 47 │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\paddle\fluid\executor.py:1463 in run │
│ │
│ 1460 │ │ │ core.update_autotune_status() │
│ 1461 │ │ │ return res │
│ 1462 │ │ except Exception as e: │
│ ❱ 1463 │ │ │ six.reraise(*sys.exc_info()) │
│ 1464 │ │
│ 1465 │ def _run_impl(self, program, feed, fetch_list, feed_var_name, │
│ 1466 │ │ │ │ fetch_var_name, scope, return_numpy, use_program_cache, │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\six.py:719 in reraise │
│ │
│ 716 │ │ │ │ value = tp() │
│ 717 │ │ │ if value.traceback is not tb: │
│ 718 │ │ │ │ raise value.with_traceback(tb) │
│ ❱ 719 │ │ │ raise value │
│ 720 │ │ finally: │
│ 721 │ │ │ value = None │
│ 722 │ │ │ tb = None │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\paddle\fluid\executor.py:1450 in run │
│ │
│ 1447 │ │ │ self._log_force_set_program_cache(use_program_cache) │
│ 1448 │ │ │
│ 1449 │ │ try: │
│ ❱ 1450 │ │ │ res = self._run_impl(program=program, │
│ 1451 │ │ │ │ │ │ │ │ feed=feed, │
│ 1452 │ │ │ │ │ │ │ │ fetch_list=fetch_list, │
│ 1453 │ │ │ │ │ │ │ │ feed_var_name=feed_var_name, │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\paddle\fluid\executor.py:1661 in _run_impl │
│ │
│ 1658 │ │ │ │ else: │
│ 1659 │ │ │ │ │ tensor._copy_from(cpu_tensor, self.place) │
│ 1660 │ │ │ │
│ ❱ 1661 │ │ │ return new_exe.run(scope, list(feed.keys()), fetch_list, │
│ 1662 │ │ │ │ │ │ │ return_numpy) │
│ 1663 │ │ │
│ 1664 │ │ compiled = isinstance(program, compiler.CompiledProgram) │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\paddle\fluid\executor.py:631 in run │
│ │
│ 628 │ │ """ │
│ 629 │ │ fetch_list = self._check_fetch(fetch_list) │
│ 630 │ │ │
│ ❱ 631 │ │ tensors = self._new_exe.run(scope, feed_names, │
│ 632 │ │ │ │ │ │ │ │ │ fetch_list)._move_to_list() │
│ 633 │ │ if return_numpy: │
│ 634 │ │ │ return as_numpy(tensors, copy=True) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: In user code:
File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2798, in append_op
attrs=kwargs.get("attrs", None))
File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/layers/tensor.py", line 1412, in range
outputs={'Out': out})
File "/paddle/work/github/DAIN/DAIN-paddle-release/PWCNet/PWCNet.py", line 125, in warp
bb = fluid.layers.range(0, B, 1, 'float32')
File "/paddle/work/github/DAIN/DAIN-paddle-release/PWCNet/PWCNet.py", line 292, in forward
warp5 = self.warp(c25, up_flow6 * 0.625)
File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 600, in __call__
outputs = self.forward(*inputs, **kwargs)
File "/paddle/work/github/DAIN/DAIN-paddle-release/networks/DAIN_slowmotion.py", line 136, in forward_flownets
temp = model(input)
File "/paddle/work/github/DAIN/DAIN-paddle-release/networks/DAIN_slowmotion.py", line 72, in forward
self.forward_flownets(self.flownets, cur_offset_input, time_offsets=time_offsets),
File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 600, in __call__
outputs = self.forward(*inputs, **kwargs)
File "parse_weights.py", line 25, in convert_model
out = DAIN(image)
File "parse_weights.py", line 179, in <module>
convert_model(pkl_path, model_path)
InvalidArgumentError: The step should be greater than 0 while start < end.
[Hint: Expected step > 0, but received step:-0.0331662 <= 0:0.] (at
..\paddle/phi/kernels/funcs/range_function.h:33)
[operator < range > error]
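After the video was generated, DAIN processing seemed to start, and then the error above appeared. The step is negative, and I cannot tell what is causing it.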
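The "In user code" part of the traceback refers to a Python 3.7 Paddle install (`/root/miniconda3/lib/python3.7/...`), while the failing run uses a `py310` environment, so the exported model and the local Paddle build may come from different versions. Whether that is the cause is an assumption. A minimal sketch for collecting the installed versions to include in a report; `installed_versions` is a hypothetical helper, and `paddlepaddle` / `paddlepaddle-gpu` are the PyPI distribution names.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found
```

For example: `print(installed_versions(["paddlepaddle", "paddlepaddle-gpu", "numpy"]))`.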
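Hello, and thank you for your work!
In my tests, the output mouth shapes for Chinese audio do not seem to be correct. Is this a bug in the project?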
How can I solve this? RuntimeError: Unable to open C:\Python-Project\数字人lip\SadTalker-Video-Lip-Sync\checkpoints\shape_predictor_68_face_landmarks.dat
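Two things are worth checking for an "Unable to open" error on this file: whether the file exists at that path, and whether the path contains non-ASCII characters (here `数字人`), which is a commonly reported cause on Windows but is only an assumption for this case. A minimal sketch; `inspect_model_path` is a hypothetical helper.

```python
from pathlib import Path


def inspect_model_path(path):
    """Report whether a model file exists and which path characters are non-ASCII."""
    p = Path(path)
    exists = p.is_file()
    return {
        "exists": exists,
        "size_bytes": p.stat().st_size if exists else None,
        "non_ascii": [ch for ch in str(p) if ord(ch) > 127],
    }
```

If `exists` is true and `non_ascii` is not empty, moving the project to an ASCII-only directory is a cheap experiment.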
I apologize if this topic has already been addressed, and I understand that you may be receiving numerous messages. I am having difficulty installing the program on my Windows computer. I would greatly appreciate any assistance you can provide.
Thank you.
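Hello, it looks like the SadTalker result in the comparison is static and the head does not move. Was it generated from a static image?
My understanding is that LipSync only improves the lip region, so it should not affect head motion as well, should it?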
0%| | 0/787 [00:02<?, ?it/s]
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 86, in main
frames_path, temp_video_path = predictor_dian.run(tmp_path)
File "D:\SadTalker-Video-Lip-Sync\src\dain_model\dain_predictor.py", line 166, in run
o = self.base_forward(X)
File "D:\SadTalker-Video-Lip-Sync\src\dain_model\base_predictor.py", line 44, in base_forward
out = self.exe.run(self.program,
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1299, in run
six.reraise(*sys.exc_info())
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\six.py", line 719, in reraise
raise value
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1285, in run
res = self._run_impl(
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1464, in _run_impl
return new_exe.run(list(feed.keys()), fetch_list, return_numpy)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 547, in run
tensors = self._new_exe.run(feed_names, fetch_list)._move_to_list()
ValueError: (InvalidArgument) The step should be greater than 0 while start < end.
[Hint: Expected step > 0, but received step:-0.000413004 <= 0:0.] (at ..\paddle/phi/kernels/funcs/range_function.h:33)
Thanks for your work on this - sorry this is in English.
The results are very intriguing using this repository but on my tests, the lips are not moving correctly.
I wonder if it is due to the audio only being checked every 10 frames.
Also, it would be nice to have the frame rate parsed from the input video and used instead of assuming 25 fps.
Looking forward to your response (you can respond in Chinese; I will use a translation app).
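As a rough sketch of the frame-rate suggestion above, the rate can be read with `ffprobe` instead of being hard-coded. This assumes `ffprobe` is on the PATH; `parse_frame_rate` and `probe_fps` are hypothetical helper names, and nothing here reflects how the repository currently handles fps.

```python
import subprocess
from fractions import Fraction


def parse_frame_rate(text):
    """Convert an ffprobe rate string such as '30000/1001' or '25' to a float."""
    return float(Fraction(text.strip()))


def probe_fps(video_path):
    """Ask ffprobe for the frame rate of the first video stream."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=r_frame_rate",
        "-of", "default=noprint_wrappers=1:nokey=1",
        video_path,
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    # ffprobe prints one rate per selected stream; take the first line.
    return parse_frame_rate(out.splitlines()[0])
```

The value returned by `probe_fps("1.mp4")` could then be passed wherever 25 is currently assumed.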
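The lip sync looks good for the first few seconds, but it gets worse the further into the video you go.
SadTalker just released a 512x512 model. Is it possible for you to implement this? I am not sure whether you are even using their 256 model, but it is worth asking, as I haven't seen an update from you in a few months.
How can I train on my own character? Is the training procedure the same as for wav2lip?
The mouth region looks very poor.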
Please advise how to avoid this?
The generated video is named ./results\2023_07_19_11.11.52/video##audio_full.mp4
Error: Can not import paddle core while this file exists: C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\libpaddle.pyd
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 80, in main
import paddle
File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\__init__.py", line 31, in <module>
from .framework import monkey_patch_variable
File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\framework\__init__.py", line 17, in <module>
from . import random # noqa: F401
File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\framework\random.py", line 17, in <module>
from paddle import fluid
File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\__init__.py", line 36, in <module>
from . import framework
File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\framework.py", line 35, in <module>
from . import core
File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\core.py", line 356, in <module>
raise e
File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\core.py", line 269, in <module>
from . import libpaddle
ImportError: generic_type: type "_gpuDeviceProperties" is already registered!
All checkpoints were downloaded correctly, and the Python 3.8 env was installed without any problems.
Only the video frame captures in the img folder were generated; video_landmarks.txt contains no annotations for the images. The txt file is empty.
Collecting scipy==1.5.3 (from -r requirements.txt (line 9))
Using cached scipy-1.5.3.tar.gz (25.2 MB)
error: subprocess-exited-with-error
× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Installing build dependencies ... error
error: subprocess-exited-with-error
× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Traceback (most recent call last):
File "/content/SadTalker-Video-Lip-Sync/inference.py", line 123, in <module>
main(args)
File "/content/SadTalker-Video-Lip-Sync/inference.py", line 47, in main
preprocess_model = CropAndExtract(path_of_lm_croper, path_of_net_recon_model, dir_of_BFM_fitting, device)
File "/content/SadTalker-Video-Lip-Sync/src/utils/preprocess.py", line 44, in __init__
self.kp_extractor = KeypointExtractor(device)
File "/content/SadTalker-Video-Lip-Sync/src/face3d/extract_kp_videos.py", line 16, in __init__
self.detector = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device=device)
File "/usr/lib/python3.10/enum.py", line 437, in __getattr__
raise AttributeError(name) from None
AttributeError: _2D
I wrote a Jupyter notebook for Colab and ran into this problem. Notebook: https://colab.research.google.com/drive/1zeqfsNl7vZxPUztQzNDT2xRQthaz3Zws
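A likely explanation for the `AttributeError: _2D` above is that newer `face_alignment` releases renamed the enum member from `_2D` to `TWO_D`, so a fresh Colab install no longer has the old name. A minimal compatibility sketch, where `first_member` is a hypothetical helper and not part of either library:

```python
from enum import Enum


def first_member(enum_cls, *names):
    """Return the first enum member whose name exists on enum_cls.

    Hypothetical helper: lets the same code run against library
    versions that named a member differently.
    """
    for name in names:
        if name in enum_cls.__members__:
            return enum_cls[name]
    raise AttributeError(f"{enum_cls.__name__} has none of: {', '.join(names)}")


# Intended use (assumes face_alignment is installed):
#   import face_alignment
#   landmarks_2d = first_member(face_alignment.LandmarksType, "TWO_D", "_2D")
#   detector = face_alignment.FaceAlignment(landmarks_2d, device=device)
```

Pinning `face_alignment` to the version the project was written against is the alternative, if that version is known.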
none
Here is the error output when running. Is some DLL failing to load?
(sadwav2lip) D:\pythonProject\SadTalker-Video-Lip-Sync>python inference.py --driven_audio sound.wav --source_video driving.mp4 --enhancer lip
Traceback (most recent call last):
File "inference.py", line 10, in <module>
from third_part.GFPGAN.gfpgan import GFPGANer
File "D:\pythonProject\SadTalker-Video-Lip-Sync\third_part\GFPGAN\gfpgan\__init__.py", line 5, in <module>
from .archs import *
File "D:\pythonProject\SadTalker-Video-Lip-Sync\third_part\GFPGAN\gfpgan\archs\__init__.py", line 2, in <module>
from basicsr.utils import scandir
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\basicsr\__init__.py", line 3, in <module>
from .archs import *
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\basicsr\archs\__init__.py", line 5, in <module>
from basicsr.utils import get_root_logger, scandir
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\basicsr\utils\__init__.py", line 5, in <module>
from .img_util import crop_border, imfrombytes, img2tensor, imwrite, tensor2img
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\basicsr\utils\img_util.py", line 6, in <module>
from torchvision.utils import make_grid
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\__init__.py", line 1, in <module>
from torchvision import models
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\models\__init__.py", line 11, in <module>
from . import detection
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\models\detection\__init__.py", line 1, in <module>
from .faster_rcnn import *
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\models\detection\faster_rcnn.py", line 7, in <module>
from torchvision.ops import misc as misc_nn_ops
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\ops\__init__.py", line 1, in <module>
from .boxes import nms, box_iou
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\ops\boxes.py", line 2, in <module>
from torchvision import _C
ImportError: DLL load failed: The specified module could not be found.
The Quark drive link is no longer valid.
The generated video is named ./results\2023_07_10_06.45.09/6##1_full.mp4
Mon Jul 10 06:49:00-WARNING: The old way to load inference model is deprecated. model path: C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\checkpoints\DAIN_weight\model, params path: C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\checkpoints\DAIN_weight\params
W0710 06:49:01.032488 6856 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 11.2
W0710 06:49:01.032488 6856 gpu_resources.cc:91] device: 0, cuDNN Version: 8.3.
'rm' is not recognized as an internal or external command,
operable program or batch file.
Old fps (frame rate): 25.0
New fps (frame rate): 50
[image2 muxer @ 0000000002f30620] Value 0.000000 for parameter 'start_number' out of range [1 - 2.14748e+009]
[image2 muxer @ 0000000002f30620] Error setting option start_number to value 0.
Could not write header for output file #0 (incorrect codec parameters ?): Error number -34 occurred
Traceback (most recent call last):
File "C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\inference.py", line 123, in <module>
main(args)
File "C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\inference.py", line 86, in main
frames_path, temp_video_path = predictor_dian.run(tmp_path)
File "C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\src\dain_model\dain_predictor.py", line 97, in run
out_path = video2frames(video_path, frame_path_input)
File "C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\src\dain_model\dain_predictor.py", line 36, in video2frames
raise RuntimeError('ffmpeg process video: {} error'.format(vid_name))
RuntimeError: ffmpeg process video: ead5dc4a-3a48-423d-9996-aab5540abf06 error
How can I solve this problem?
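The `'rm'` message in the log above shows that a Unix shell command is being invoked on Windows, where it does not exist. Where in the project that call is made is not visible in this log, so the following is only a sketch of a portable replacement, with `reset_dir` as a hypothetical helper name:

```python
import shutil
from pathlib import Path


def reset_dir(path):
    """Delete a directory tree if present, then recreate it empty.

    Portable stand-in for shelling out to `rm -rf <dir>` followed by
    `mkdir`; behaves the same on Windows, Linux, and macOS.
    """
    p = Path(path)
    if p.exists():
        shutil.rmtree(p)
    p.mkdir(parents=True)
    return p
```

This addresses only the `rm` message; the ffmpeg `start_number` error in the same log is a separate failure.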
Traceback (most recent call last):
File "G:\dev\SadTalker-Video-Lip-Sync-master\inference.py", line 123, in <module>
main(args)
File "G:\dev\SadTalker-Video-Lip-Sync-master\inference.py", line 58, in main
restorer_model = GFPGANer(model_path='checkpoints/GFPGANv1.3.pth', upscale=1, arch='clean',
File "G:\dev\SadTalker-Video-Lip-Sync-master\third_part\GFPGAN\gfpgan\utils.py", line 76, in __init__
self.face_helper = FaceRestoreHelper(
File "E:\anaconda\envs\sadtalker\lib\site-packages\facexlib\utils\face_restoration_helper.py", line 99, in __init__
self.face_det = init_detection_model(det_model, half=False, device=self.device, model_rootpath=model_rootpath)
File "E:\anaconda\envs\sadtalker\lib\site-packages\facexlib\detection\__init__.py", line 22, in init_detection_model
load_net = torch.load(model_path, map_location=lambda storage, loc: storage)
File "E:\anaconda\envs\sadtalker\lib\site-packages\torch\serialization.py", line 713, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "E:\anaconda\envs\sadtalker\lib\site-packages\torch\serialization.py", line 938, in _legacy_load
typed_storage._storage._set_from_file(
RuntimeError: unexpected EOF, expected 333577 more bytes. The file might be corrupted.
https://drive.google.com/file/d/1lW4mf5YNtS4MAD7ZkAauDDWp2N3_Qzs7/view?usp=sharing
like
!gdown https://drive.google.com/uc?id=1fQtBSYEyuai9MjBOF8j7zZ4oQ9W2N64q --output {wav2lipPath}'/checkpoints/'
I'm new to this. How do I run the program on the GPU?
Is there any way for the output FPS to match the source video's FPS without using DAIN (PaddleGAN)?
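For the GPU question, a common PyTorch pattern is to pick the device at startup and pass it to the models. Whether `inference.py` selects its device this way is an assumption; `pick_device` is a hypothetical helper.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to run on the GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(pick_device())
```

If this prints `cpu` on a machine that has an NVIDIA GPU, the installed PyTorch build is probably CPU-only or does not match the driver.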
6 GB of VRAM is plenty for inference, but once DAIN is enabled it runs out of memory. Switching to a machine with 10 GB still runs out of memory.
How much VRAM did the author's setup have for inference?
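# Paddle must be installed if you want to use the DAIN model for frame interpolation
After repeated testing, we found that the source video works best when the mouth stays still. If the mouth keeps moving in the original video, the mouth in the synthesized video looks strange.
Attached is a test video we generated. The result looks decent, and in this source video the mouth stays closed throughout.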