audio_style_transfer_pytorch

A PyTorch implementation of Dmitry Ulyanov's neural-style-audio-tf (audio style transfer).
The code is heavily inspired by:
Dmitry's code: https://github.com/DmitryUlyanov/neural-style-audio-tf
The PyTorch tutorial on image style transfer: https://pytorch.org/tutorials/advanced/neural_style_tutorial.html

Requirements

Python=3.8
pytorch==1.8.1
For other dependencies, see requirements.txt.

Environments

For a Linux environment, use the commands below. (For other operating systems, use the installation command given on pytorch.org.)
conda create -n {name of environment} python=3.8
pip install -r requirements.txt
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch

Training

You have two options:

If you just want to try one content/style pair at a time, open the Jupyter notebook ("Python3 Audio Style Transfer Run.ipynb") and follow the steps in it.
If you have multiple content and style files, set the working directories for the content files and the style files in configs.py and run the command below:


  python run_functions.py -p {which phase of spectrogram you are going to use} -s {tag when saving synthesized audio files}
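
As a rough illustration of what the batch mode does, the sketch below pairs every content file with every style file and builds an output file name that carries the phase choice and the tag. The helpers `list_wavs` and `plan_jobs`, the accepted phase values ("content" or "style"), and the output naming scheme are assumptions made for this example. They are not taken from configs.py or run_functions.py.

```python
from itertools import product
from pathlib import Path


def list_wavs(directory):
    """Collect .wav files from a directory (assumed layout, hypothetical helper)."""
    return sorted(str(p) for p in Path(directory).glob("*.wav"))


def plan_jobs(content_files, style_files, phase, tag):
    """Return one (content, style, output_name) tuple per content/style pair.

    Hypothetical helper: the real run_functions.py may organize this differently.
    """
    if phase not in ("content", "style"):
        raise ValueError("phase must be 'content' or 'style'")
    jobs = []
    # product() yields every combination of one content file and one style file.
    for content, style in product(sorted(content_files), sorted(style_files)):
        out_name = f"{Path(content).stem}_{Path(style).stem}_{phase}_{tag}.wav"
        jobs.append((content, style, out_name))
    return jobs
```

With two content files and three style files, `plan_jobs` would return six jobs, one per combination.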

Notes

My code differs from Dmitry's in the following ways:

Instead of TensorFlow v1.0, which Dmitry used, I used PyTorch v1.8.1 (the latest version at the time of writing).
I used Prem Seetharaman's STFT class, which is built with PyTorch. As a result, when synthesizing the sound, one can choose the phase of either the content or the style.
I customized the code so that one can generate every combination of multiple content sounds and multiple style sounds.
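
To make the phase choice concrete, here is a minimal sketch of combining a synthesized magnitude spectrogram with the phase of either the content or the style spectrogram. It uses plain Python complex numbers rather than the STFT class, and the function names `apply_phase` and `choose_phase` are hypothetical, so treat it as an illustration of the idea and not as the repository's code.

```python
import cmath


def apply_phase(magnitudes, reference_spectrogram):
    """Combine magnitudes with the phase of a reference complex spectrogram.

    Both arguments are lists of frames; each frame is a list of frequency bins.
    zip() stops at the shorter input, so extra frames or bins are dropped.
    """
    combined = []
    for mag_frame, ref_frame in zip(magnitudes, reference_spectrogram):
        # rect(r, phi) builds the complex number r * exp(i * phi).
        frame = [cmath.rect(m, cmath.phase(r)) for m, r in zip(mag_frame, ref_frame)]
        combined.append(frame)
    return combined


def choose_phase(magnitudes, content_spec, style_spec, phase="content"):
    """Pick the phase source by name (assumed values: 'content' or 'style')."""
    if phase not in ("content", "style"):
        raise ValueError("phase must be 'content' or 'style'")
    reference = content_spec if phase == "content" else style_spec
    return apply_phase(magnitudes, reference)
```

The resulting complex spectrogram would then go through an inverse STFT to produce the waveform.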

June 2021, Dabin Moon
