ffmpeg error - anaconda python 3.8 -

`(SadTalker-Wav2lip) H:\SadTalker-Video-Lip-Sync>
(SadTalker-Wav2lip) H:\SadTalker-Video-Lip-Sync>python inference.py --driven_audio 1.wav --source_video 1.mp4 --use_DAIN
./checkpoints\epoch_20.pth
./checkpoints\auido2pose_00140-model.pth
./checkpoints\auido2exp_00300-model.pth
./checkpoints\facevid2vid_00189-model.pth.tar
./checkpoints\mapping_00109-model.pth.tar
3DMM Extraction for source image
landmark Det:: 100%|███████████████████████████████████████████████████████████████████████████████| 1020/1020 [01:11<00:00, 14.18it/s]
3DMM Extraction In Video:: 100%|███████████████████████████████████████████████████████████████████| 1020/1020 [00:16<00:00, 63.24it/s]
mel:: 100%|███████████████████████████████████████████████████████████████████████████████████████| 423/423 [00:00<00:00, 40052.16it/s]
audio2exp:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 43/43 [00:00<00:00, 124.36it/s]
Face Renderer:: 100%|████████████████████████████████████████████████████████████████████████████████| 423/423 [00:35<00:00, 11.85it/s]
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 76, in main
tmp_path, new_audio_path, return_path = animate_from_coeff.generate(data, save_dir, pic_path, crop_info,
File "H:\SadTalker-Video-Lip-Sync\src\facerender\animate.py", line 154, in generate
sound = AudioSegment.from_mp3(audio_path)
File "H:\anaconda3\envs\SadTalker-Wav2lip\lib\site-packages\pydub\audio_segment.py", line 796, in from_mp3
return cls.from_file(file, 'mp3', parameters=parameters)
File "H:\anaconda3\envs\SadTalker-Wav2lip\lib\site-packages\pydub\audio_segment.py", line 773, in from_file
raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

ffmpeg version 2022-07-06-git-03d81a044a-full_build-www.gyan.dev Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 12.1.0 (Rev2, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 57. 27.100 / 57. 27.100
libavcodec 59. 36.100 / 59. 36.100
libavformat 59. 26.100 / 59. 26.100
libavdevice 59. 6.100 / 59. 6.100
libavfilter 8. 41.100 / 8. 41.100
libswscale 6. 6.100 / 6. 6.100
libswresample 4. 6.100 / 4. 6.100
libpostproc 56. 5.100 / 56. 5.100
[mp3float @ 000001c4461b57c0] Header missing
Last message repeated 127 times
[mp3 @ 000001c44619f200] Could not find codec parameters for stream 0 (Audio: mp3 (mp3float), 0 channels, fltp): unspecified frame size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, mp3, from '1.wav':
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0: Audio: mp3, 0 channels, fltp
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s32le (native))
Press [q] to stop, [?] for help
[mp3float @ 000001c446629b80] Header missing
Error while decoding stream #0:0: Invalid data found when processing input
[... the two lines above repeat 127 more times ...]
[abuffer @ 000001c4461a1440] Value inf for parameter 'time_base' out of range [0 - 2.14748e+09]
Last message repeated 1 times
[abuffer @ 000001c4461a1440] Error setting option time_base to value 1/0.
[graph_0_in_0_0 @ 000001c446625e40] Error applying options to the filter.
Error reinitializing filters!
Error while filtering: Result too large
Finishing stream 0:0 without any data written to it.
[abuffer @ 000001c4461da2c0] Value inf for parameter 'time_base' out of range [0 - 2.14748e+09]
Last message repeated 1 times
[abuffer @ 000001c4461da2c0] Error setting option time_base to value 1/0.
[graph_0_in_0_0 @ 000001c4461da1c0] Error applying options to the filter.
Error configuring filter graph
Conversion failed!`

Not sure what else to do. I've tried different files and uninstalling and reinstalling ffmpeg, but I keep getting this error. Any ideas?
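
One reading of the log above: ffmpeg reports `Input #0, mp3, from '1.wav'`, and the traceback goes through `AudioSegment.from_mp3(audio_path)`, so a WAV file appears to be decoded as MP3, which would explain the repeated "Header missing" messages. A minimal sketch of a workaround, assuming this is the cause: detect the container from the file's first bytes and pass it to pydub's `AudioSegment.from_file`. The helper names `sniff_audio_format` and `load_audio` are hypothetical and are not part of the repository.

```python
import os


def sniff_audio_format(path):
    """Guess 'wav' or 'mp3' from the first bytes, else fall back to the extension."""
    with open(path, "rb") as f:
        head = f.read(12)
    # RIFF/WAVE container: bytes 0-3 are b"RIFF", bytes 8-11 are b"WAVE".
    if len(head) >= 12 and head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    # MP3 data starts with an ID3v2 tag or an MPEG frame sync (11 set bits).
    if head[:3] == b"ID3":
        return "mp3"
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mp3"
    ext = os.path.splitext(path)[1].lower().lstrip(".")
    return ext or None


def load_audio(path):
    """Hypothetical stand-in for AudioSegment.from_mp3(path)."""
    # Imported lazily so the sniffing helper works without pydub installed.
    from pydub import AudioSegment
    return AudioSegment.from_file(path, format=sniff_audio_format(path))
```

If this guess is right, converting the input to a real MP3 before running inference, or changing the `from_mp3` call in `src\facerender\animate.py` to a format-aware load like the one above, might avoid the failure.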

First run issue

Hi,

I'm trying the first run with no success. After cloning the repository and installing the requirements, I run the following in the main project's directory:

python inference.py --driven-audio examples/driven_audio/chinese_poem1.wav --source-video examples/driven_video/1.mp4

The output:

Traceback (most recent call last):
  File "M:\AI\SadTalker-Video-Lip-Sync\inference.py", line 5, in <module>
    from src.utils.preprocess import CropAndExtract
ModuleNotFoundError: No module named 'src.utils'

I took a look at the project's structure and it's alright (src/utils are there). Am I missing something here? Thanks in advance.
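
A quick way to narrow this down is to ask Python where it would load `src` and `src.utils` from. If `src` resolves to something outside the project folder (for example another installed package that happens to be called `src`), the project's own `src/utils` is never reached. The sketch below uses only the standard library; `describe_module` is a hypothetical helper name, and the shadowing explanation is an assumption, not a confirmed diagnosis.

```python
import importlib.util


def describe_module(name):
    """Return where Python would load `name` from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `src`) is missing or is not a package.
        return None
    if spec is None:
        return None
    # Packages report their directories; plain modules report a file path.
    if spec.submodule_search_locations is not None:
        return list(spec.submodule_search_locations)
    return spec.origin


if __name__ == "__main__":
    # Run this from the project's main directory.
    for name in ("src", "src.utils", "src.utils.preprocess"):
        print(name, "->", describe_module(name))
```

If `src` points into `site-packages` rather than the cloned repository, removing or renaming the conflicting package is one thing to try.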

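Comparison with VideoReTalking and wav2lip

Could the author explain how SadTalker-Video-Lip-Sync differs from VideoReTalking and wav2lip?

After face cloning finishes, the following error always appears. I have changed the CUDA and cuDNN versions several times, but the problem stays the same.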

W0630 09:16:06.445015 25868 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.2, Runtime API Version: 10.2
W0630 09:16:06.447014 25868 dynamic_loader.cc:276] Note: [Recommend] copy cudnn into CUDA installation directory.
For instance, download cudnn-10.0-windows10-x64-v7.6.5.32.zip from NVIDIA's official website,
then, unzip it and copy it into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0
You should do this according to your CUDA installation directory and CUDNN version.
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 83, in main
predictor_dian = dain_predictor.DAINPredictor(args.dian_output, weight_path=args.DAIN_weight,
File "D:\SadTalker-Video-Lip-Sync\src\dain_model\dain_predictor.py", line 64, in __init__
self.build_inference_model()
File "D:\SadTalker-Video-Lip-Sync\src\dain_model\base_predictor.py", line 25, in build_inference_model
self.program, self.feed_names, self.fetch_targets = paddle.static.load_inference_model(
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in impl
return wrapped_func(*args, **kwargs)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\framework.py", line 443, in impl
return func(*args, **kwargs)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\static\io.py", line 809, in load_inference_model
deserialize_persistables(program, params_bytes, executor)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in impl
return wrapped_func(*args, **kwargs)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\framework.py", line 443, in impl
return func(*args, **kwargs)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\static\io.py", line 650, in deserialize_persistables
executor.run(load_program)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1299, in run
six.reraise(*sys.exc_info())
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\six.py", line 719, in reraise
raise value
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1285, in run
res = self._run_impl(
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1464, in _run_impl
return new_exe.run(list(feed.keys()), fetch_list, return_numpy)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 547, in run
tensors = self._new_exe.run(feed_names, fetch_list)._move_to_list()
RuntimeError: (PreconditionNotMet) The third-party dynamic library (cudnn64_7.dll) that Paddle depends on is not configured correctly. (error code is 126)
Suggestions:

Check if the third-party dynamic library (e.g. CUDA, CUDNN) is installed correctly and its version is matched with paddlepaddle you installed.
Configure third-party dynamic library environment variables as follows:

Linux: set LD_LIBRARY_PATH by export LD_LIBRARY_PATH=...
Windows: set PATH by `set PATH=XXX; (at ..\paddle\phi\backends\dynload\dynamic_loader.cc:303)
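
Error code 126 on Windows means the requested module could not be found, so a first check is whether `cudnn64_7.dll` is visible in any directory on `PATH`. The sketch below does that with the standard library only; `find_on_path` is a hypothetical helper name. Separately, the log reports Compute Capability 8.9 together with Runtime API Version 10.2, so it may be worth confirming that the installed Paddle build and its CUDA version support that GPU.

```python
import os


def find_on_path(filename, path_value=None):
    """Return every directory in PATH (or `path_value`) that contains `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    found = find_on_path("cudnn64_7.dll")
    print(found if found else "cudnn64_7.dll is not in any PATH directory")
```

An empty result suggests the cuDNN `bin` directory still needs to be added to `PATH`, as the Paddle message recommends.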

How do I fix the "AttributeError: _2D" error when running inference?

Conversion failed!

Error configuring filter graph
Conversion failed!

How can I solve this problem?

How to resolve the "no module cv2" problem

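Best practices for virtual-human expressions and emotions

I would like the virtual human to change emotion during a conversation, for example speaking sadly, happily, or excitedly. Is it better to use different source videos that each carry the emotion, or is there another way to implement this?

Can the "setting an array element with a sequence" error be fixed?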

3DMM Extraction In Video:: 0%| | 0/135 [00:00<?, ?it/s]
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 67, in main
first_coeff_path, crop_pic_path, crop_info = preprocess_model.generate(pic_path, first_frame_dir)
File "D:\3090\SadTalker-Video-Lip-Sync-master\src\utils\preprocess.py", line 124, in generate
trans_params, im1, lm1, _ = align_img(frame, lm1, self.lm3d_std)
File "D:\3090\SadTalker-Video-Lip-Sync-master\src\face3d\util\preprocess.py", line 101, in align_img
trans_params = np.array([w0, h0, s, t[0], t[1]])
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (5,) + inhomogeneous part.
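
This message typically appears when the list passed to `np.array` mixes plain numbers with one-element arrays; recent NumPy releases raise an error for that instead of building an object array. A minimal sketch of a workaround, assuming `t[0]` and `t[1]` are one-element arrays and `w0`, `h0`, `s` are scalars (an assumption about `align_img`, not something the traceback confirms): unwrap every entry to a plain float before building the array. `as_scalar` and `build_trans_params` are hypothetical helper names.

```python
def as_scalar(value):
    """Unwrap a one-element array or sequence into a plain float."""
    # NumPy arrays and NumPy scalars expose .item(), which fails if size != 1.
    if hasattr(value, "item"):
        return float(value.item())
    if isinstance(value, (list, tuple)):
        if len(value) != 1:
            raise ValueError("expected exactly one element, got %d" % len(value))
        return as_scalar(value[0])
    return float(value)


def build_trans_params(w0, h0, s, t):
    """Return a flat list of five floats that np.array() accepts without error."""
    return [as_scalar(w0), as_scalar(h0), as_scalar(s), as_scalar(t[0]), as_scalar(t[1])]


# In align_img this could replace the failing line, for example:
#     trans_params = np.array(build_trans_params(w0, h0, s, t))
```

Installing an older NumPy release that still accepts ragged input may also work, but that is untested here.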

Lip-sync quality still needs tuning

The lip shapes do not match at all! This still needs adjustment.

Lip matching failed

Even when the input audio is completely silent, the output video still shows talking mouth movements. I suspect that some of the networks are not actually running, but no error appears on the command line at any point. Has anyone run into a similar problem, or does anyone have an idea what causes it?

Which SadTalker commit is this project built on?

As the title says.

Please share the commit hash, or the approximate fork date.

How to run from Colab

Please give me the steps,

or a starting file to run from Colab, like SadTalker has.

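It suddenly stopped working

On Colab it worked fine a few days ago. I tried again today and it fails with an error. I don't know where the problem is. Is it a problem with the model?

Poor results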

temp_face1.tts_tmp.mp4

Error when using the --use_DAIN parameter

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\soundmaker\SadTalker-Video-Lip-Sync\inference.py:124 in <module> │
│ │
│ 121 │ │ args.device = "cuda" │
│ 122 │ else: │
│ 123 │ │ args.device = "cpu" │
│ ❱ 124 │ main(args) │
│ 125 │
│ │
│ D:\soundmaker\SadTalker-Video-Lip-Sync\inference.py:87 in main │
│ │
│ 84 │ │ predictor_dian = dain_predictor.DAINPredictor(args.dian_output, weight_path=args │
│ 85 │ │ │ │ │ │ │ │ │ │ │ │ │ time_step=args.time_step, │
│ 86 │ │ │ │ │ │ │ │ │ │ │ │ │ remove_duplicates=args.remove_dupl │
│ ❱ 87 │ │ frames_path, temp_video_path = predictor_dian.run(tmp_path) │
│ 88 │ │ paddle.disable_static() │
│ 89 │ │ save_path = return_path[:-4] + 'dain.mp4' │
│ 90 │ │ command = r'ffmpeg -y -i "%s" -i "%s" -vcodec copy "%s"' % (temp_video_path, new │
│ │
│ D:\soundmaker\SadTalker-Video-Lip-Sync/src/..\src\dain_model\dain_predictor.py:166 in run │
│ │
│ 163 │ │ │ │
│ 164 │ │ │ X = np.concatenate((X0, X1), axis=0) │
│ 165 │ │ │ │
│ ❱ 166 │ │ │ o = self.base_forward(X) │
│ 167 │ │ │ │
│ 168 │ │ │ y = o[0] │
│ 169 │
│ │
│ D:\soundmaker\SadTalker-Video-Lip-Sync/src/..\src\dain_model\base_predictor.py:44 in │
│ base_forward │
│ │
│ 41 │ │ │ else: │
│ 42 │ │ │ │ feed_dict[self.feed_names[0]] = inputs │
│ 43 │ │ │ │
│ ❱ 44 │ │ │ out = self.exe.run(self.program, │
│ 45 │ │ │ │ │ │ │ fetch_list=self.fetch_targets, │
│ 46 │ │ │ │ │ │ │ feed=feed_dict) │
│ 47 │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\paddle\fluid\executor.py:1463 in run │
│ │
│ 1460 │ │ │ core.update_autotune_status() │
│ 1461 │ │ │ return res │
│ 1462 │ │ except Exception as e: │
│ ❱ 1463 │ │ │ six.reraise(*sys.exc_info()) │
│ 1464 │ │
│ 1465 │ def _run_impl(self, program, feed, fetch_list, feed_var_name, │
│ 1466 │ │ │ │ fetch_var_name, scope, return_numpy, use_program_cache, │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\six.py:719 in reraise │
│ │
│ 716 │ │ │ │ value = tp() │
│ 717 │ │ │ if value.__traceback__ is not tb: │
│ 718 │ │ │ │ raise value.with_traceback(tb) │
│ ❱ 719 │ │ │ raise value │
│ 720 │ │ finally: │
│ 721 │ │ │ value = None │
│ 722 │ │ │ tb = None │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\paddle\fluid\executor.py:1450 in run │
│ │
│ 1447 │ │ │ self._log_force_set_program_cache(use_program_cache) │
│ 1448 │ │ │
│ 1449 │ │ try: │
│ ❱ 1450 │ │ │ res = self._run_impl(program=program, │
│ 1451 │ │ │ │ │ │ │ │ feed=feed, │
│ 1452 │ │ │ │ │ │ │ │ fetch_list=fetch_list, │
│ 1453 │ │ │ │ │ │ │ │ feed_var_name=feed_var_name, │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\paddle\fluid\executor.py:1661 in _run_impl │
│ │
│ 1658 │ │ │ │ else: │
│ 1659 │ │ │ │ │ tensor._copy_from(cpu_tensor, self.place) │
│ 1660 │ │ │ │
│ ❱ 1661 │ │ │ return new_exe.run(scope, list(feed.keys()), fetch_list, │
│ 1662 │ │ │ │ │ │ │ return_numpy) │
│ 1663 │ │ │
│ 1664 │ │ compiled = isinstance(program, compiler.CompiledProgram) │
│ │
│ D:\sd-webui-aki-v4\py310\lib\site-packages\paddle\fluid\executor.py:631 in run │
│ │
│ 628 │ │ """ │
│ 629 │ │ fetch_list = self._check_fetch(fetch_list) │
│ 630 │ │ │
│ ❱ 631 │ │ tensors = self._new_exe.run(scope, feed_names, │
│ 632 │ │ │ │ │ │ │ │ │ fetch_list)._move_to_list() │
│ 633 │ │ if return_numpy: │
│ 634 │ │ │ return as_numpy(tensors, copy=True) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: In user code:

File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2798, in append_op
attrs=kwargs.get("attrs", None))

File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)

File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/layers/tensor.py", line 1412, in range
outputs={'Out': out})

File "/paddle/work/github/DAIN/DAIN-paddle-release/PWCNet/PWCNet.py", line 125, in warp
bb = fluid.layers.range(0, B, 1, 'float32')

File "/paddle/work/github/DAIN/DAIN-paddle-release/PWCNet/PWCNet.py", line 292, in forward
warp5 = self.warp(c25, up_flow6 * 0.625)

File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 600, in __call__
outputs = self.forward(*inputs, **kwargs)

File "/paddle/work/github/DAIN/DAIN-paddle-release/networks/DAIN_slowmotion.py", line 136, in forward_flownets
temp = model(input)

File "/paddle/work/github/DAIN/DAIN-paddle-release/networks/DAIN_slowmotion.py", line 72, in forward
self.forward_flownets(self.flownets, cur_offset_input, time_offsets=time_offsets),

File "/root/miniconda3/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 600, in __call__
outputs = self.forward(*inputs, **kwargs)

File "parse_weights.py", line 25, in convert_model
out = DAIN(image)

File "parse_weights.py", line 179, in <module>
convert_model(pkl_path, model_path)


InvalidArgumentError: The step should be greater than 0 while start < end.
  [Hint: Expected step > 0, but received step:-0.0331662 <= 0:0.] (at

..\paddle/phi/kernels/funcs/range_function.h:33)
[operator < range > error]

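After the video is generated, DAIN processing seems to start, and then the error above appears. The step is negative, and I cannot tell what causes it.

Mouth shapes are wrong for the Chinese audio in the examples

Hello, and thank you for your work!
In my tests, the output mouth shapes for Chinese audio do not seem to be correct. Is this a bug in the project?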

RuntimeError

How do I solve this: RuntimeError: Unable to open C:\Python-Project\数字人lip\SadTalker-Video-Lip-Sync\checkpoints\shape_predictor_68_face_landmarks.dat
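
Two things are worth ruling out for an "Unable to open" error on a model file: the file may be missing or empty in `checkpoints`, or the path may contain non-ASCII characters (this one does), which some native libraries on Windows reportedly cannot open. The sketch below checks both with the standard library; `diagnose_model_path` is a hypothetical helper name, and the non-ASCII explanation is an assumption rather than a confirmed cause.

```python
import os


def diagnose_model_path(path):
    """Return a list of likely reasons a native library cannot open `path`."""
    problems = []
    if not os.path.isfile(path):
        problems.append("file does not exist")
    elif os.path.getsize(path) == 0:
        problems.append("file is empty")
    if not path.isascii():
        problems.append("path contains non-ASCII characters")
    return problems


if __name__ == "__main__":
    print(diagnose_model_path(r"checkpoints\shape_predictor_68_face_landmarks.dat"))
```

If only the non-ASCII warning shows up, moving the project to a folder with a plain ASCII path is a cheap experiment.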

Installation files for Windows

I apologize if this topic has already been addressed, and I understand that you may be receiving numerous messages. I am having difficulty installing the program on my Windows computer. I would greatly appreciate any assistance you can provide.
Thank you.

Side-profile faces have some problems

How to resolve the "FPS" problem

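Is the comparison with SadTalker's static-image results really appropriate?

Hello, the SadTalker output used in the comparison looks static and the head does not move. Was it generated from a still image?
My understanding is that LipSync only improves the lip region, so it should not affect head motion as well.

Error says the step is negative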

0%| | 0/787 [00:02<?, ?it/s]
Traceback (most recent call last):
File "inference.py", line 123, in <module>
main(args)
File "inference.py", line 86, in main
frames_path, temp_video_path = predictor_dian.run(tmp_path)
File "D:\SadTalker-Video-Lip-Sync\src\dain_model\dain_predictor.py", line 166, in run
o = self.base_forward(X)
File "D:\SadTalker-Video-Lip-Sync\src\dain_model\base_predictor.py", line 44, in base_forward
out = self.exe.run(self.program,
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1299, in run
six.reraise(*sys.exc_info())
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\six.py", line 719, in reraise
raise value
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1285, in run
res = self._run_impl(
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 1464, in _run_impl
return new_exe.run(list(feed.keys()), fetch_list, return_numpy)
File "C:\Users\menka.conda\envs\stvl\lib\site-packages\paddle\fluid\executor.py", line 547, in run
tensors = self._new_exe.run(feed_names, fetch_list)._move_to_list()
ValueError: (InvalidArgument) The step should be greater than 0 while start < end.
[Hint: Expected step > 0, but received step:-0.000413004 <= 0:0.] (at ..\paddle/phi/kernels/funcs/range_function.h:33)

Adjust 'audio2exp' to less than every 10 frames for people who talk fast

Thanks for your work on this, and sorry that this is in English.
The results from this repository are very intriguing, but in my tests the lips are not moving correctly.
I wonder if it is due to the audio only being checked every 10 frames.

Also, it would be nice to have the frame rate parsed from the input video and used instead of assuming 25 fps.

Looking forward to your response (you can respond in Chinese; I will use a translation app).
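
On the frame-rate point, here is a minimal sketch of reading the rate from the input instead of hard-coding 25 fps. It assumes `ffprobe` is on the PATH; `parse_frame_rate` and `probe_fps` are names made up for illustration and are not part of this repository.

```python
import subprocess
from fractions import Fraction


def parse_frame_rate(text, default=25.0):
    """Turn an ffprobe rate string such as '30000/1001' or '25' into a float."""
    try:
        rate = Fraction(text.strip())
    except (ValueError, ZeroDivisionError):
        # Covers empty output, 'N/A', and the degenerate '0/0'.
        return default
    return float(rate) if rate > 0 else default


def probe_fps(video_path, default=25.0):
    """Ask ffprobe for the first video stream's r_frame_rate (hypothetical helper)."""
    cmd = [
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-show_entries", "stream=r_frame_rate",
        "-of", "default=noprint_wrappers=1:nokey=1", video_path,
    ]
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        # ffprobe missing or the file could not be read: fall back to the default.
        return default
    return parse_frame_rate(out, default)
```

The fallback keeps the current 25 fps behaviour whenever probing fails, so a change along these lines would only affect inputs whose rate can actually be read.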

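Lip sync gets worse the further into the video

The lip sync looks good for the first few seconds, but it gets worse the further the video goes.

The synthesized output follows the video duration; what about adjusting it to follow the audio duration instead?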

new 512x512 model

SadTalker just released a 512x512 model. Is it possible for you to implement it? I'm not sure whether you are even using their 256 model, but it is worth asking, as I haven't seen an update from you in a few months.

blur

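There is a large square blur surrounding the head in 1920x1080 videos when there is no upscaling, but when you upscale the lips the blur is removed.

How do I train on my own character?

How can I train on my own character? Is it the same as the wav2lip training procedure?

Basically useless, move along

The mouth area looks really poor.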

Inferencing error

Please advise how to avoid this?

The generated video is named ./results\2023_07_19_11.11.52/video##audio_full.mp4
Error: Can not import paddle core while this file exists: C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\libpaddle.pyd
Traceback (most recent call last):
  File "inference.py", line 123, in <module>
    main(args)
  File "inference.py", line 80, in main
    import paddle
  File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\__init__.py", line 31, in <module>
    from .framework import monkey_patch_variable
  File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\framework\__init__.py", line 17, in <module>
    from . import random  # noqa: F401
  File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\framework\random.py", line 17, in <module>
    from paddle import fluid
  File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\__init__.py", line 36, in <module>
    from . import framework
  File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\framework.py", line 35, in <module>
    from . import core
  File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\core.py", line 356, in <module>
    raise e
  File "C:\Users\Max\anaconda3\envs\SadTalkerLipSync\lib\site-packages\paddle\fluid\core.py", line 269, in <module>
    from . import libpaddle
ImportError: generic_type: type "_gpuDeviceProperties" is already registered!

All checkpoints were downloaded correctly and Python 3.8 env installed without any obstacles.

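video_landmarks.txt in the results directory is empty

Only the video frame screenshots in the img folder were generated; video_landmarks.txt contains no annotations for the images. The txt file is empty.

Which Python version and conda version are you using?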

colab error: pip subprocess to install build dependencies did not run successfully.

Collecting scipy==1.5.3 (from -r requirements.txt (line 9))
Using cached scipy-1.5.3.tar.gz (25.2 MB)
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Installing build dependencies ... error
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Hello! I have a problem.

Traceback (most recent call last):
File "/content/SadTalker-Video-Lip-Sync/inference.py", line 123, in <module>
main(args)
File "/content/SadTalker-Video-Lip-Sync/inference.py", line 47, in main
preprocess_model = CropAndExtract(path_of_lm_croper, path_of_net_recon_model, dir_of_BFM_fitting, device)
File "/content/SadTalker-Video-Lip-Sync/src/utils/preprocess.py", line 44, in __init__
self.kp_extractor = KeypointExtractor(device)
File "/content/SadTalker-Video-Lip-Sync/src/face3d/extract_kp_videos.py", line 16, in __init__
self.detector = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device=device)
File "/usr/lib/python3.10/enum.py", line 437, in getattr
raise AttributeError(name) from None
AttributeError: _2D

I wrote a Jupyter notebook for Colab and ran into this problem. Notebook: https://colab.research.google.com/drive/1zeqfsNl7vZxPUztQzNDT2xRQthaz3Zws
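
This `AttributeError` is consistent with newer `face_alignment` releases, which renamed the `LandmarksType._2D` member to `LandmarksType.TWO_D`. Below is a minimal sketch of a lookup that tolerates both spellings. `pick_2d_landmarks_type` is a hypothetical helper, and the two stand-in enums exist only so the snippet runs without the library installed.

```python
from enum import Enum


def pick_2d_landmarks_type(landmarks_type):
    """Return the 2D member under either the new (TWO_D) or old (_2D) name."""
    for name in ("TWO_D", "_2D"):
        try:
            return landmarks_type[name]  # Enum classes support lookup by member name
        except KeyError:
            continue
    raise AttributeError("no 2D landmarks member found")


# Stand-ins for the two naming schemes (assumptions for illustration only).
class OldStyle(Enum):
    _2D = 1


class NewStyle(Enum):
    TWO_D = 1
```

In `extract_kp_videos.py` the call would then presumably become `face_alignment.FaceAlignment(pick_2d_landmarks_type(face_alignment.LandmarksType), device=device)`. Pinning `face_alignment` to an older release in the notebook is the other obvious option.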

none

DLL load failed

Here is the error output when running. Is some DLL failing to load?

(sadwav2lip) D:\pythonProject\SadTalker-Video-Lip-Sync>python inference.py --driven_audio sound.wav --source_video driving.mp4 --enhancer lip
Traceback (most recent call last):
File "inference.py", line 10, in <module>
from third_part.GFPGAN.gfpgan import GFPGANer
File "D:\pythonProject\SadTalker-Video-Lip-Sync\third_part\GFPGAN\gfpgan\__init__.py", line 5, in <module>
from .archs import *
File "D:\pythonProject\SadTalker-Video-Lip-Sync\third_part\GFPGAN\gfpgan\archs\__init__.py", line 2, in <module>
from basicsr.utils import scandir
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\basicsr\__init__.py", line 3, in <module>
from .archs import *
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\basicsr\archs\__init__.py", line 5, in <module>
from basicsr.utils import get_root_logger, scandir
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\basicsr\utils\__init__.py", line 5, in <module>
from .img_util import crop_border, imfrombytes, img2tensor, imwrite, tensor2img
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\basicsr\utils\img_util.py", line 6, in <module>
from torchvision.utils import make_grid
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\__init__.py", line 1, in <module>
from torchvision import models
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\models\__init__.py", line 11, in <module>
from . import detection
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\models\detection\__init__.py", line 1, in <module>
from .faster_rcnn import *
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\models\detection\faster_rcnn.py", line 7, in <module>
from torchvision.ops import misc as misc_nn_ops
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\ops\__init__.py", line 1, in <module>
from .boxes import nms, box_iou
File "D:\Anaconda\envs\sadwav2lip\lib\site-packages\torchvision\ops\boxes.py", line 2, in <module>
from torchvision import _C
ImportError: DLL load failed: The specified module could not be found.

The Quark Drive link is dead

Error when using use_DAIN

The generated video is named ./results\2023_07_10_06.45.09/6##1_full.mp4
Mon Jul 10 06:49:00-WARNING: The old way to load inference model is deprecated. model path: C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\checkpoints\DAIN_weight\model, params path: C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\checkpoints\DAIN_weight\params
W0710 06:49:01.032488 6856 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 11.2
W0710 06:49:01.032488 6856 gpu_resources.cc:91] device: 0, cuDNN Version: 8.3.
'rm' is not recognized as an internal or external command,
operable program or batch file.
Old fps (frame rate): 25.0
New fps (frame rate): 50
[image2 muxer @ 0000000002f30620] Value 0.000000 for parameter 'start_number' out of range [1 - 2.14748e+009]
[image2 muxer @ 0000000002f30620] Error setting option start_number to value 0.
Could not write header for output file #0 (incorrect codec parameters ?): Error number -34 occurred
Traceback (most recent call last):
File "C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\inference.py", line 123, in <module>
main(args)
File "C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\inference.py", line 86, in main
frames_path, temp_video_path = predictor_dian.run(tmp_path)
File "C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\src\dain_model\dain_predictor.py", line 97, in run
out_path = video2frames(video_path, frame_path_input)
File "C:\Users\Spencer\PycharmProjects\pythonProject\SadTalker-Video-Lip-Sync\src\dain_model\dain_predictor.py", line 36, in video2frames
raise RuntimeError('ffmpeg process video: {} error'.format(vid_name))
RuntimeError: ffmpeg process video: ead5dc4a-3a48-423d-9996-aab5540abf06 error
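
The `'rm'` line in the log suggests a directory is being cleared through a Unix shell command, which the Windows command prompt does not provide. It may be unrelated to the ffmpeg `start_number` failure further down, but it can be avoided with the standard library. The sketch below is an assumption about what that step is meant to do, and `reset_dir` is a hypothetical helper, not a function from this repository.

```python
import os
import shutil


def reset_dir(path):
    """Delete `path` if it exists, then recreate it empty (Windows and Unix)."""
    shutil.rmtree(path, ignore_errors=True)
    os.makedirs(path, exist_ok=True)
    return path
```

Calling `reset_dir(frame_path_input)` before extracting frames would replace a shell call such as `rm -rf`, if that is what the predictor runs at this point.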

ValueError: input_path must be a valid path to video/image file

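How can I solve this problem?

Chinese speech as input seems to produce mismatched lip shapes; I am not sure whether it is just my case

After the video is generated, what is cuDNN still processing? It is also very slow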

add video

lip_sync.mp4

Why does inference raise this error, and how can I resolve it?

Traceback (most recent call last):
File "G:\dev\SadTalker-Video-Lip-Sync-master\inference.py", line 123, in <module>
main(args)
File "G:\dev\SadTalker-Video-Lip-Sync-master\inference.py", line 58, in main
restorer_model = GFPGANer(model_path='checkpoints/GFPGANv1.3.pth', upscale=1, arch='clean',
File "G:\dev\SadTalker-Video-Lip-Sync-master\third_part\GFPGAN\gfpgan\utils.py", line 76, in __init__
self.face_helper = FaceRestoreHelper(
File "E:\anaconda\envs\sadtalker\lib\site-packages\facexlib\utils\face_restoration_helper.py", line 99, in __init__
self.face_det = init_detection_model(det_model, half=False, device=self.device, model_rootpath=model_rootpath)
File "E:\anaconda\envs\sadtalker\lib\site-packages\facexlib\detection\__init__.py", line 22, in init_detection_model
load_net = torch.load(model_path, map_location=lambda storage, loc: storage)
File "E:\anaconda\envs\sadtalker\lib\site-packages\torch\serialization.py", line 713, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "E:\anaconda\envs\sadtalker\lib\site-packages\torch\serialization.py", line 938, in _legacy_load
typed_storage._storage._set_from_file(
RuntimeError: unexpected EOF, expected 333577 more bytes. The file might be corrupted.
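
The `unexpected EOF` message usually means the weights file being loaded is truncated, for example after an interrupted download. Below is a minimal sketch for checking a file's size and SHA-256 so it can be compared with a known-good copy. `file_fingerprint` is a hypothetical helper, and no expected size or hash is assumed here.

```python
import hashlib
import os


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()
```

If the reported size is smaller than that of the published file, deleting the file and downloading it again is the usual fix.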

CUDA 11.2

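The results are actually quite good; keep this key point in mind when using it

After repeated testing, I found that the source video works best when the mouth simply stays still. If the mouth keeps moving in the original video, the mouth in the synthesized video looks strange.

61.mp4

Attached is a test video we generated. The result looks decent; in the original video the mouth stays closed the whole time.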

zz-ww / sadtalker-video-lip-sync
