Convert text and audio to facial expressions
Install the dependencies:

```sh
pip install -r requirements.txt
```
- Save the trained model to `artefact/best-lips.pt`
- [M1] Comment out all references to `librosa`
Launch the Streamlit app:

```sh
streamlit run app.py
```
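What `app.py` contains is not shown in this README; a minimal version only needs a text box, a call to the scoring entry point documented at the end of this page, and a video player. The sketch below assumes exactly that wiring (shelling out to `score.py` and reading `output/line_0.mp4`); the real app may be structured differently.

```python
# Hypothetical minimal app.py: collect text, run the documented scoring CLI,
# and display the clip it writes. Not necessarily how the real app works.
import subprocess

import streamlit as st

st.title("Text to facial expressions")
text = st.text_input("Text to animate", "Hello World!")

if st.button("Generate"):
    # `python score.py --text "..."` is the inference command shown below.
    subprocess.run(["python", "score.py", "--text", text], check=True)
    st.video("output/line_0.mp4")  # path the scoring script writes, per this README
```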
- Download the LRS3 dataset to the `lrs3_v0.4` directory
- Download TED talks from YouTube to `video/{id}.mp4`, where `id` is the `v` query parameter of each URL (see the download sketch after this list):
- https://www.youtube.com/watch?v=0C5UQbWzwg8
- https://www.youtube.com/watch?v=0FQXicAGy5U
- https://www.youtube.com/watch?v=0FkuRwU8HFc
- https://www.youtube.com/watch?v=0GL5r3HVAZ0
- https://www.youtube.com/watch?v=0JGarsZE1rk
- https://www.youtube.com/watch?v=0LxPAY9yis8
- https://www.youtube.com/watch?v=0akiEFwtkyA
- https://www.youtube.com/watch?v=0bop3D7SdDM
- https://www.youtube.com/watch?v=0d6iSvF1UmA
- https://www.youtube.com/watch?v=0hzSUUdTDUA
- https://www.youtube.com/watch?v=0iTehgSOZ8A
- https://www.youtube.com/watch?v=1BHOflzxPjI
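For the TED clips, any downloader that writes `video/{id}.mp4` will do. As one possible sketch, the `yt-dlp` Python package (an extra dependency, not necessarily in `requirements.txt`) can be pointed at the list above:

```python
# Sketch: fetch the TED talks listed above and save them as video/{id}.mp4.
from yt_dlp import YoutubeDL

URLS = [
    "https://www.youtube.com/watch?v=0C5UQbWzwg8",
    "https://www.youtube.com/watch?v=0FQXicAGy5U",
    # ... remaining URLs from the list above
]

opts = {
    "format": "mp4",                    # download a single mp4 file
    "outtmpl": "video/%(id)s.%(ext)s",  # i.e. video/{id}.mp4
}
with YoutubeDL(opts) as ydl:
    ydl.download(URLS)
```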
- Extract annotated frames from the downloaded videos to the `noisy` directory:

```sh
python preprocess.py
```
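The internals of `preprocess.py` are not described here; as a rough illustration, extracting frames into `noisy/` with OpenCV could look like the sketch below. The sampling rate and file-naming scheme are assumptions.

```python
# Illustrative frame extraction only; the real preprocess.py may differ.
from pathlib import Path

import cv2

VIDEO_DIR = Path("video")
OUT_DIR = Path("noisy")
OUT_DIR.mkdir(exist_ok=True)

for video_path in sorted(VIDEO_DIR.glob("*.mp4")):
    cap = cv2.VideoCapture(str(video_path))
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % 5 == 0:  # keep every 5th frame (sampling rate is a guess)
            out_name = OUT_DIR / f"{video_path.stem}_{frame_idx:06d}.jpg"
            cv2.imwrite(str(out_name), frame)
        frame_idx += 1
    cap.release()
```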
- Detect facial landmarks using OpenFace 2.0:

```sh
docker-compose up
```
- Copy high-confidence detections to the `clean` directory:

```sh
python postprocess.py
```
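OpenFace writes one CSV per video with a per-frame `confidence` column, so the filtering step can be as simple as the sketch below. The OpenFace output directory (`processed/`), the 0.9 threshold, and dropping low-confidence frames row by row are all assumptions about what `postprocess.py` actually does.

```python
# Illustrative confidence filter; the real postprocess.py may differ.
from pathlib import Path

import pandas as pd

OPENFACE_DIR = Path("processed")  # assumed OpenFace output location
CLEAN_DIR = Path("clean")
CLEAN_DIR.mkdir(exist_ok=True)
THRESHOLD = 0.9                   # assumed confidence cut-off

for csv_path in sorted(OPENFACE_DIR.glob("*.csv")):
    df = pd.read_csv(csv_path)
    df.columns = [c.strip() for c in df.columns]  # OpenFace pads its header names
    kept = df[df["confidence"] >= THRESHOLD]      # drop low-confidence detections
    if not kept.empty:
        kept.to_csv(CLEAN_DIR / csv_path.name, index=False)
```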
The `clean` directory contains sample data that has already been preprocessed. You may use it to reproduce our model:

```sh
python train.py
```
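The model architecture and features used by `train.py` are not documented in this README, so the loop below is only a generic placeholder showing the overall shape (load data from `clean/`, fit a model, keep the best checkpoint at `artefact/best-lips.pt`). Every file name and layer size here is hypothetical.

```python
# Generic placeholder training loop; the real train.py almost certainly differs.
from pathlib import Path

import numpy as np
import torch
from torch import nn

# Hypothetical preprocessed arrays; the actual feature format is not documented.
X = torch.tensor(np.load("clean/features.npy"), dtype=torch.float32)
Y = torch.tensor(np.load("clean/targets.npy"), dtype=torch.float32)

model = nn.Sequential(nn.Linear(X.shape[1], 256), nn.ReLU(), nn.Linear(256, Y.shape[1]))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

Path("artefact").mkdir(exist_ok=True)
best = float("inf")
for epoch in range(100):  # full-batch loop for brevity
    optimizer.zero_grad()
    loss = loss_fn(model(X), Y)
    loss.backward()
    optimizer.step()
    if loss.item() < best:  # keep the best checkpoint, as noted above
        best = loss.item()
        torch.save(model.state_dict(), "artefact/best-lips.pt")
```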
Call the trained model with text input:

```sh
python score.py --text "Hello World!"
```

The video will be saved as `output/line_0.mp4`.