Giter Site home page Giter Site logo

wav2lip_train's Introduction

WaveLip 训练模型代码

##Wav2Lip Wav2Lip 是一种基于对抗生成网络的由语音驱动的人脸说话视频生成模型。如下图所示,Wav2Lip的网络模型总体上分成三块:生成器、判别器和一个预训练好的Lip-Sync Expert组成。网络的输入有2个:任意的一段视频和一段语音,输出为一段唇音同步的视频。生成器是基于encoder-decoder的网络结构,分别利用2个encoder: speech encoder, identity encoder去对输入的语音和视频人脸进行编码,并将二者的编码结果进行拼接,送入到 face decoder 中进行解码得到输出的视频帧。判别器Visual Quality Discriminator对生成结果的质量进行规范,提高生成视频的清晰度。为了更好的保证生成结果的唇音同步性,Wav2Lip引入了一个预预训练的唇音同步判别模型 Pre-trained Lip-sync Expert,作为衡量生成结果的唇音同步性的额外损失。

本项目本质是在训练好的数据上做finetune,让你可以用自己的唇形数据 ###1. 环境的配置 1.建议准备一台有显卡的linux系统电脑

2.Python 3.6 或者更高版本

ffmpeg: sudo apt-get install ffmpeg

3.必要的python包的安装,所需要的库名称都已经包含在requirements.txt文件中,可以使用 pip install -r requirements.txt一次性安装.

4.在本实验中利用到了人脸检测的相关技术,需要下载人脸检测预训练模型:Face detection pre-trained model 并移动到 face_detection/detection/sfd/s3fd.pth文件夹下.

5.分别参考两个项目的代码来生成自己的代码: 一个是官方代码

git clone https://github.com/Rudrabha/Wav2Lip

一个是gitee上的项目,建议在开发时可以下载下来看看不同点

git clone https://gitee.com/sparkle__code__guy/wave2lip

###2. 训练前准备工作

  1. 下载权重文件到checkpoints目录下,可以自行下载
模型 描述 下载链接
Wav2Lip Highly accurate lip-sync wavlip

| | Wav2Lip + GAN |Slightly inferior lip-sync, but better visual quality |wavlip+GAN| | Expert Discriminator |Weights of the expert discriminator |Expert Discriminator|

  1. 下载人脸识别的pre-trained modelface_detection/detection/sfd/s3fd.pth下。如果不能下载试试这个链接

###3. 准备数据 准备自己的视频数据,至少要5个视频,视频中有明显的人的口型和声音。放入data/original_data目录下

###4.预处理数据

python preprocess.py --ngpu 1 --data_root E:/Projects/wav2lip_train/data/original_data --preprocessed_root E:/Projects/wav2lip_train/data/preprocessed_root --batch_size 8

data_root为原始视频地址,preprocessed_root为处理完的视频存放的位置 获取对应的文件列表并更新到filelists/train.txt和filelists/eval.txt。只保存对应的视频名称即可。

from glob import glob
import shutil,os
result = list(glob("/home/guo/wave2lip/wave2lip_torch/Wav2Lip/data/preprocessed_root/original_data/*"))
print(result)
result_list = []
for i,dirpath in enumerate(result):
    shutil.move(dirpath,"./data/preprocessed_root/original_data/".format(i))
    result_list.append("{}".format(i))
print("\n".join(result_list))

###5.训练 执行下面的命令进行训练

python wav2lip_train.py --data_root ./data/preprocessed_root/data --checkpoint_dir ./savedmodel --syncnet_checkpoint_path ./checkpoints/lipsync_expert.pth

###6.模型预测

python inference.py --checkpoint_path ./savedmodel/checkpoint_step000000001.pth --face ./input/test.mp4 --audio ./input/audio.wav --out-file ./output

———————————————— ####参考文档 CSDN博主「会发paper的学渣」的原创文章 ,遵循CC 4.0 BY-SA版权协议

wav2lip_train's People

Contributors

rogerle avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

wav2lip_train's Issues

预处理数据找不到s3fd.pth

(Wav2LipTrain) E:\wav2lip_train>python preprocess.py --ngpu 1 --data_root data/original_data --preprocessed_root data/preprocessed_root --batch_size 8 Traceback (most recent call last): File "preprocess.py", line 10, in <module> before running this script!') FileNotFoundError: Save the s3fd model to data/face_detection/detection/sfd/s3fd.pth before running this script!

preprocess.py 文件第8行改为if not path.isfile('face_detection/detection/sfd/s3fd.pth'):即可,亦或者将face_detection目录拷贝到data下

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.