alphacep / vosk-tts

Text To Speech Synthesis with Vosk

License: Apache License 2.0

Python 99.83% Makefile 0.02% Shell 0.04% Cython 0.11%

vosk-tts's Introduction

Vosk TTS

Simple TTS based on VITS with some old ideas

Usage

Command line

pip3 install vosk-tts

vosk-tts -n vosk-model-tts-ru-0.6-multi -s 2 --input "Привет мир!" --output out.wav

API

from vosk_tts import Model, Synth
model = Model(model_name="vosk-model-tts-ru-0.6-multi")
synth = Synth(model)

synth.synth("Привет мир!", "out.wav", speaker_id=2)

Voices

For now we support five Russian voices: 3 female and 2 male. Get the model here:

vosk-model-tts-ru-0.6-multi

You can use speaker IDs from 0 to 4 inclusive.
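
To compare the five voices, one could loop over the speaker IDs. The sketch below relies only on the Synth.synth(text, path, speaker_id=...) call shown in the API section; the helper name synth_all_voices, the demo function, and the output file naming are illustrative assumptions, not part of vosk-tts.

```python
from pathlib import Path

SPEAKER_IDS = range(5)  # speaker IDs 0..4, as stated above


def synth_all_voices(synth, text, out_dir="voices"):
    """Render `text` once per speaker ID and return the output paths.

    `synth` is any object with a synth(text, path, speaker_id=...) method,
    such as vosk_tts.Synth. The file naming is an illustrative choice.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for speaker_id in SPEAKER_IDS:
        path = out_dir / f"speaker_{speaker_id}.wav"
        synth.synth(text, str(path), speaker_id=speaker_id)
        paths.append(path)
    return paths


def demo():
    """Not called automatically: needs vosk-tts installed and the model available."""
    from vosk_tts import Model, Synth

    model = Model(model_name="vosk-model-tts-ru-0.6-multi")
    return synth_all_voices(Synth(model), "Привет мир!")
```

Calling demo() should write one WAV file per voice into the voices directory, which makes it easy to listen to the voices side by side.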

We plan to add more voices and languages in the future.

vosk-tts's Issues

Try DurFlexEVC

https://github.com/hs-oh-prml/DurFlexEVC

Previous generation is not reset

Periodically, and I cannot work out how often or by what logic, the synthesizer hangs.
Sometimes a reboot helps, sometimes it does not.

Natasha is invoked like this:

[hi]
exten = s,1,Answer()
same = n,Wait(1)
same = n,Set(RHV_FILE=${CALLERID(num)}_${EPOCH})
same = n,Set(RHV_TEXT="Компания привествует Вас, ${CALLERID(Name)}")
same = n,System(/var/lib/asterisk/agi-bin/Robovoice.py ${RHV_FILE} ${RHV_TEXT})
same = n,Playback(/home/asterisk/sounds/${RHV_FILE}&silence/1)
same = n,System(rm -f /home/asterisk/sounds/${RHV_FILE}.wav)
same = n,System(rm -f /home/asterisk/sounds/${RHV_FILE}_t.wav)

I have just tried running it from the command line, simply copying the example from here.
The output was:

[root@freepbx asterisk]# ^C
[root@freepbx asterisk]# vosk-tts -n vosk-model-tts-ru-0.1-natasha --input "Привет мир!" --output ~/out.wav

vosk-tts -n vosk-model-tts-ru-0.1-natasha --input "Привет мир"/home/asterisk/speech/Robovoice.py 9999_1691413691 'Компания приветствует Вас, Иван Иванов'" --output ~/out.wav
> ^C

At that point it hangs, and I have to exit with Ctrl+C.

If I understand correctly, the previous generation from the dialplan has somehow hung and is not going away.
How can I forcibly clear the previous request before generating?
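
One observation on the transcript above: a `>` prompt usually means the shell is still waiting for a closing quote, so the pasted command may never have reached the synthesizer. A hedged sketch of one way to sidestep shell quoting and bound the run time is to call the CLI from Python with an argument list (no shell) and a timeout. The flags are the ones shown in this document; the helper names and the 30-second limit are assumptions.

```python
import subprocess


def build_tts_command(text, output, model="vosk-model-tts-ru-0.1-natasha"):
    """Build the argument list; the text is one argv entry, so quotes in it are harmless."""
    return ["vosk-tts", "-n", model, "--input", text, "--output", output]


def run_with_timeout(cmd, timeout_s=30):
    """Run `cmd` without a shell. Return True on success, False otherwise.

    subprocess.run kills the child process if the timeout expires,
    so a stuck run does not linger in the background.
    """
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s)
        return True
    except (subprocess.TimeoutExpired,
            subprocess.CalledProcessError,
            FileNotFoundError):
        return False
```

A script called from the dialplan could then use run_with_timeout(build_tts_command(text, wav_path)) and fall back to a prerecorded prompt when it returns False.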

Audio hours in the dataset

How many hours of audio were used to make the Russian language model in GPT-SoVITS?

ERROR

File "/.../training/data_utils.py", line 220, in _filter
for audiopath, sid, text, cleaned in self.audiopaths_sid_text:
ValueError: not enough values to unpack (expected 4, got 3)
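
This error means that at least one filelist entry has three fields where the loader unpacks four (audiopath, sid, text, cleaned). A minimal sketch for locating such rows, assuming the filelist is a text file with |-separated fields as in VITS-style recipes; the separator and the helper name find_bad_rows are assumptions, not confirmed by the traceback.

```python
def find_bad_rows(lines, expected_fields=4, sep="|"):
    """Return (line_number, field_count, line) for rows with the wrong field count.

    Assumes one entry per line with `sep`-separated fields; blank lines are skipped.
    """
    bad = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        count = len(line.split(sep))
        if count != expected_fields:
            bad.append((lineno, count, line))
    return bad
```

Passing an open file object works as well, for example find_bad_rows(open(path, encoding="utf-8")), where path is the training filelist.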

TypeError: generator_loss() missing 1 required positional argument: 'disc_generated_outputs'

Hi there. Thank you very much for this TTS.
I would like to ask about training on my own data.

First, I noticed a mistake in step 0:

Clone this repository and build monotonic align

git clone https://github.com/alphacep/vosk-tts
cd vosk-tts/training
cd monotonic_align
python setup.py build_ext --inplace
cd ..

The instructions are missing

mkdir monotonic_align

Only after that can you run

python setup.py build_ext --inplace

Second, when I try to run python3 train_finetune.py, I get the following errors:

File "train_finetune.py", line 497, in <module>
main()
File "train_finetune.py", line 58, in main
mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))

torch/multiprocessing/spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')

torch/multiprocessing/spawn.py", line 198, in start_processes
while not context.join():
torch/multiprocessing/spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):

torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
vosk-tts/training/train_finetune.py", line 241, in run
train_and_evaluate(rank, epoch, hps, [net_g, net_d, net_dur_disc], [optim_g, optim_d, optim_dur_disc],
vosk-tts/training/train_finetune.py", line 358, in train_and_evaluate
loss_gen, losses_gen = generator_loss(y_d_hat_g)
TypeError: generator_loss() missing 1 required positional argument: 'disc_generated_outputs'

What did I do wrong?
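
The traceback shows generator_loss being called with one argument while the imported definition requires at least two, which suggests that the training script and the loss module may come from different code versions. A small sketch for checking what an imported function actually requires, using the standard inspect module. The stand-in function and its first parameter name are hypothetical; only disc_generated_outputs appears in the traceback.

```python
import inspect


def required_positional(func):
    """Return the names of positional parameters that have no default value."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return [
        p.name
        for p in inspect.signature(func).parameters.values()
        if p.kind in positional_kinds and p.default is inspect.Parameter.empty
    ]


# Hypothetical stand-in: the first parameter name is invented for illustration.
def generator_loss(first_arg, disc_generated_outputs):
    raise NotImplementedError


print(required_positional(generator_loss))
```

Running the same check on the real generator_loss imported by train_finetune.py would show which arguments the call on line 358 needs to supply.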

My System:
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy

Env:
Conda

Python:
3.8.18

PyTorch:
1.13.1 (+cu117)

I took requirements.txt from here: https://github.com/FENRlR/MB-iSTFT-VITS2/blob/main/requirements.txt

The voice data was recorded using the same text as in the db-finetune folder. The WAVs are also 22050 Hz mono.

I would be glad to hear your comments.

Try consistency loss

https://github.com/ConsistencyVC/ConsistencyVC-voive-conversion

English support?

Do we have English support? Something like:

model = Model(model_name="vosk-model-tts-en-0.6-multi")

TIA for any help!

Try EfficientSpeech

https://flashspeech.github.io/

https://github.com/roatienza/efficientspeech

Stress marks

A great model, everything works wonderfully.
How can I explicitly specify the stress in certain words? I tried capitalizing the letter and putting a plus sign before the vowel, with no result.
Thanks

Got an error when opening the model directory

Hello!

I got this error

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 517: character maps to <undefined>

when trying to create a Model object. I was able to fix it by adding an encoding argument to line 46 of model.py:

for line in open(model_path / "dictionary", encoding='utf8'):
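
Without an explicit encoding, open() uses the platform's locale encoding, which on Windows is often a legacy code page (hence the charmap codec in the error). A minimal sketch of the fix in isolation; the helper name read_dictionary_lines is illustrative, and nothing is assumed about the dictionary's line format.

```python
from pathlib import Path


def read_dictionary_lines(model_path):
    """Yield lines of <model_path>/dictionary, always decoded as UTF-8."""
    # The explicit encoding makes the result independent of the OS locale.
    with open(Path(model_path) / "dictionary", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")
```

Using a with block also closes the file deterministically, which the bare open() in a for loop does not guarantee.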

Option to remove internal prints

Hello, Nikolay.
The code contains print statements that can clutter the logs of the service hosting your synthesis model. Would it be possible to remove them or make them optional?
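
Until the library offers such an option, a caller-side workaround is to redirect standard output around the noisy call. This is a generic Python sketch, not a vosk-tts feature; the helper name call_quietly is an assumption. It only captures Python-level writes to sys.stdout, not output written to stderr or directly by native code.

```python
import contextlib
import io


def call_quietly(func, *args, **kwargs):
    """Call func(*args, **kwargs) with print output captured instead of shown.

    Returns (result, captured_text) so the caller can log or drop the text.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()
```

For example, call_quietly(synth.synth, "Привет мир!", "out.wav", speaker_id=2) would run the synthesis call from the API section while keeping its prints out of the service log.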

Try seq2seq-vc

https://github.com/unilight/seq2seq-vc

How to add new words to the dictionary?

Hello, dear Nikolay.
I have a new dataset with new text (sentences) and WAV audio. As I understand it, if words in the new content are not in the dictionary, an error occurs.
Is there a script that converts new words to the format of your dictionary, or one that adds new words with transcriptions to the existing dictionary?
Thank you.

P.S. I hope to try your gpt-sovits soon.
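
A first step that does not depend on the transcription format is listing which dataset words are missing from the dictionary. The sketch below assumes that each dictionary line starts with the word followed by whitespace; that layout, like the helper names, is an assumption and should be checked against the actual file. Generating transcriptions for the missing words is not covered here.

```python
import re


def load_vocab(dictionary_lines):
    """Collect the first whitespace-separated token of each non-empty line."""
    return {line.split()[0].lower() for line in dictionary_lines if line.strip()}


def missing_words(sentences, vocab):
    """Return words from `sentences` absent from `vocab`, in first-seen order."""
    seen = set()
    missing = []
    for sentence in sentences:
        # [^\W\d_]+ matches runs of Unicode letters, including Cyrillic.
        for word in re.findall(r"[^\W\d_]+", sentence.lower()):
            if word not in vocab and word not in seen:
                seen.add(word)
                missing.append(word)
    return missing
```

The resulting list can then be transcribed by hand or with a separate tool before training on the new dataset.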