leoauri / waveflow

This project forked from l0sg/waveflow

A PyTorch implementation of "WaveFlow: A Compact Flow-based Model for Raw Audio"

Home Page: https://arxiv.org/abs/1912.01219

License: BSD 3-Clause "New" or "Revised" License

waveflow's Introduction

This fork modifies the original repo so that it can be installed with pip and imported as a package.

WaveFlow: A Compact Flow-based Model for Raw Audio

Update: Pretrained weights are now available. See links below.

This is an unofficial PyTorch implementation of the WaveFlow model (Ping et al., ICML 2020).

The aim of this repo is to provide an easy-to-use PyTorch version of WaveFlow as a drop-in alternative to the various neural vocoder models used with NVIDIA's Tacotron2 audio processing backend.

Please refer to the official implementation written in PaddlePaddle for the official results.

Setup

  1. Clone this repo and install requirements

    git clone https://github.com/L0SG/WaveFlow.git
    cd WaveFlow
    pip install -r requirements.txt
  2. Install Apex for mixed-precision training

Train your model

  1. Download the LJ Speech dataset. In this example it is placed in data/

  2. Make a list of the file names to use for training/testing.

    ls data/*.wav | tail -n+11 > train_files.txt
    ls data/*.wav | head -n10 > test_files.txt

    -n+11 and -n10 mean that this example reserves the first 10 audio clips for model testing: head -n10 keeps lines 1 to 10, and tail -n+11 starts at line 11, so the two lists do not overlap.
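
The split described above can also be written in Python. This is a minimal sketch, not part of the repo: the function names split_file_list and write_file_lists are hypothetical, and it assumes the clips are .wav files directly under data/. Python's sorted() orders by code point, which may differ from the locale-dependent order of ls for unusual file names.

```python
from pathlib import Path


def split_file_list(wav_paths, num_test=10):
    """Return (train, test): the first `num_test` sorted paths go to test."""
    ordered = sorted(str(p) for p in wav_paths)
    return ordered[num_test:], ordered[:num_test]


def write_file_lists(data_dir="data", num_test=10):
    """Write train_files.txt and test_files.txt for the .wav files in `data_dir`."""
    train, test = split_file_list(Path(data_dir).glob("*.wav"), num_test)
    # One path per line, matching the output format of `ls data/*.wav`.
    Path("train_files.txt").write_text("".join(f"{p}\n" for p in train))
    Path("test_files.txt").write_text("".join(f"{p}\n" for p in test))
    return train, test
```

Calling write_file_lists() from the repo root would produce the same two list files that the shell commands create.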

  3. Edit the configuration file and train the model.

    Below are the example commands using waveflow-h16-r64-bipartize.json

    nano configs/waveflow-h16-r64-bipartize.json
    python train.py -c configs/waveflow-h16-r64-bipartize.json

    Single-node multi-GPU training is automatically enabled with DataParallel (instead of DistributedDataParallel for simplicity).

    For mixed-precision training, set "fp16_run": true in the configuration file.

    You can load trained weights from a saved checkpoint by setting the checkpoint_path variable in the config file to its path.

    checkpoint_path accepts either an explicit checkpoint path or, when resuming from weights averaged over multiple checkpoints, the parent directory.

    Examples

    Insert checkpoint_path: "experiments/waveflow-h16-r64-bipartize/waveflow_5000" in the config file, then run

    python train.py -c configs/waveflow-h16-r64-bipartize.json

    To load weights averaged over the 10 most recent checkpoints, insert checkpoint_path: "experiments/waveflow-h16-r64-bipartize" in the config file, then run

    python train.py -a 10 -c configs/waveflow-h16-r64-bipartize.json

    You can reset the optimizer and training scheduler (while keeping the weights) by passing --warm_start

    python train.py --warm_start -c configs/waveflow-h16-r64-bipartize.json
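
To illustrate what averaging weights over the most recent checkpoints (the -a option above) amounts to, here is a framework-agnostic sketch. It is not the code used by train.py. It assumes checkpoint names end in an underscore and a step number, as in waveflow_5000, and it uses plain floats in place of tensors. The helper names checkpoint_step, most_recent and average_weights are hypothetical.

```python
import re


def checkpoint_step(name):
    """Parse the trailing step number from a name such as 'waveflow_5000'."""
    match = re.search(r"_(\d+)$", name)
    return int(match.group(1)) if match else None


def most_recent(names, count):
    """Return the `count` checkpoint names with the highest step numbers."""
    if count <= 0:
        return []
    # Ignore entries that do not follow the assumed '<prefix>_<step>' pattern.
    numbered = [n for n in names if checkpoint_step(n) is not None]
    return sorted(numbered, key=checkpoint_step)[-count:]


def average_weights(state_dicts):
    """Element-wise mean of parameter dictionaries that share the same keys."""
    total = len(state_dicts)
    return {key: sum(sd[key] for sd in state_dicts) / total
            for key in state_dicts[0]}
```

With real checkpoints, each dictionary would be a loaded model state whose values are tensors rather than floats. The arithmetic is the same, although integer buffers would need separate handling.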
  4. Synthesize waveform from the trained model.

    Set checkpoint_path in the config file and pass --synthesize to train.py. The model generates waveforms by looping over test_files.txt.

    python train.py --synthesize -c configs/waveflow-h16-r64-bipartize.json

    If fp16_run is set to true, the model uses FP16 (half-precision) arithmetic for faster synthesis (on GPUs equipped with Tensor Cores).
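
The config edits mentioned in the steps above (fp16_run, checkpoint_path) can be scripted instead of made by hand in an editor. The sketch below is an assumption-laden helper, not part of the repo. Because the layout of the JSON config is not shown here, set_option updates the key wherever it already exists, including inside nested objects, and adds it at the top level only if it is not found. The names set_option and update_config_file are hypothetical.

```python
import json


def set_option(config, key, value):
    """Set `key` where it already exists (searching nested dicts), else at top level."""
    def update_existing(node):
        if key in node:
            node[key] = value
            return True
        # Recurse into nested objects; any() stops at the first match.
        return any(update_existing(child)
                   for child in node.values() if isinstance(child, dict))

    if not update_existing(config):
        config[key] = value
    return config


def update_config_file(path, **options):
    """Load a JSON config, apply the given options, and write it back."""
    with open(path) as f:
        config = json.load(f)
    for key, value in options.items():
        set_option(config, key, value)
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
```

For example, update_config_file("configs/waveflow-h16-r64-bipartize.json", fp16_run=True) would switch on mixed precision before running train.py.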

Pretrained Weights

We provide pretrained weights via Google Drive. The models were trained for 5M steps, after which we averaged the weights over the last 20 checkpoints with -a 20. Audio quality almost matches that of the original paper.

Models Download
waveflow-h16-r64-bipartize Link
waveflow-h16-r128-bipartize Link

Reference

NVIDIA Tacotron2: https://github.com/NVIDIA/tacotron2

NVIDIA WaveGlow: https://github.com/NVIDIA/waveglow

r9y9 wavenet-vocoder: https://github.com/r9y9/wavenet_vocoder

FloWaveNet: https://github.com/ksw0306/FloWaveNet

Parakeet: https://github.com/PaddlePaddle/Parakeet
