Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)
Home Page: https://arxiv.org/abs/2306.00160
License: MIT License
avlit's Issues
How did you implement the loss function?
Can you please share the script used to generate the NTCD-TIMIT and LRS3+WHAM! datasets?
Thanks
Can you provide your pretrained model, please?
In the model configuration I see:
video_encoder_checkpoint = "path/to/ae.ckpt",
What is it? Where can I get this ae.ckpt file?
Thank you for the awesome work.
Is it possible to separate the input audio mixture only without video frames? (maybe by setting the video path to null).
Thank you
The 'tests/test_avlit.py' file mentioned in the README is missing.