Hi there. Thank you very much for this TTS.
I have a question about training on my own data.
First, I noticed a mistake in step 0:
Clone this repository and build monotonic align
git clone https://github.com/alphacep/vosk-tts
cd vosk-tts/training
cd monotonic_align
python setup.py build_ext --inplace
cd ..
The instructions are missing this command:
mkdir monotonic_align
After that, the build command works:
python setup.py build_ext --inplace
Second, when I try to run python3 train_finetune.py,
I get the following error:
File "train_finetune.py", line 497, in <module>
main()
File "train_finetune.py", line 58, in main
mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
torch/multiprocessing/spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
torch/multiprocessing/spawn.py", line 198, in start_processes
while not context.join():
torch/multiprocessing/spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
vosk-tts/training/train_finetune.py", line 241, in run
train_and_evaluate(rank, epoch, hps, [net_g, net_d, net_dur_disc], [optim_g, optim_d, optim_dur_disc],
vosk-tts/training/train_finetune.py", line 358, in train_and_evaluate
loss_gen, losses_gen = generator_loss(y_d_hat_g)
TypeError: generator_loss() missing 1 required positional argument: 'disc_generated_outputs'
What did I do wrong?
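As far as I can tell, the message itself describes a signature mismatch: the call at line 358 passes one argument, while the generator_loss that gets imported expects at least one more, named disc_generated_outputs. Here is a minimal sketch of that situation. The two-parameter signature, the name first_arg, and the function body are hypothetical placeholders, not the repository's actual code; only the parameter name disc_generated_outputs comes from the traceback.

```python
import inspect


def generator_loss(first_arg, disc_generated_outputs):
    """Hypothetical stand-in; only the second parameter name comes from the traceback."""
    return 0.0, []


def required_positional(fn):
    """Return the names of positional parameters that have no default value."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return [
        p.name
        for p in inspect.signature(fn).parameters.values()
        if p.kind in positional_kinds and p.default is inspect.Parameter.empty
    ]


def call_with_one_argument(fn, value):
    """Call fn with a single argument and return the TypeError text, if any."""
    try:
        fn(value)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Shows which arguments the function requires, then the error from a one-argument call.
    print(required_positional(generator_loss))
    print(call_with_one_argument(generator_loss, object()))
```

Running required_positional on the real generator_loss imported by train_finetune.py would show which arguments that version of the function expects.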
My System:
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
Env: Conda
Python: 3.8.18
PyTorch: 1.13.1 (+cu117)
I took requirements.txt from here: https://github.com/FENRlR/MB-iSTFT-VITS2/blob/main/requirements.txt
The voice data was recorded using the same text as in the db-finetune folder. The WAV files are also 22050 Hz mono.
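For anyone who wants to double-check the audio format, here is a small sketch that reads the WAV headers with Python's standard wave module. It assumes the files are plain PCM WAVs sitting directly inside one folder; the helper names and the folder path in the example are my own placeholders.

```python
import wave
from pathlib import Path


def wav_format(path):
    """Return (sample_rate_hz, channel_count) read from a PCM WAV header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate(), wav_file.getnchannels()


def find_mismatches(folder, expected_rate=22050, expected_channels=1):
    """List (file name, rate, channels) for each .wav that differs from the expected format."""
    mismatches = []
    for path in sorted(Path(folder).glob("*.wav")):
        rate, channels = wav_format(path)
        if (rate, channels) != (expected_rate, expected_channels):
            mismatches.append((path.name, rate, channels))
    return mismatches


if __name__ == "__main__":
    # "db-finetune" is a placeholder path; point it at the folder with the recordings.
    for name, rate, channels in find_mismatches("db-finetune"):
        print(f"{name}: {rate} Hz, {channels} channel(s)")
```

An empty result means every file found matches 22050 Hz mono.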
I would be glad to hear your comments.