kyubyong / css10

CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages

License: Apache License 2.0

Jupyter Notebook 26.91% HTML 41.27% Python 31.82%

css10's Introduction

Hi there 👋

I'm writing to let you know that I can no longer introduce myself with "I work for Kakao Brain." I left Kakao Brain, where I put my heart for the last four years, and began a new journey on my own. Luckily, I'm not alone on that road not taken: I'm with five great co-founding members, all of whom were my teammates at Kakao Brain. We named our startup TUNiB, inspired by the popular animation character (https://octonauts.fandom.com/wiki/Tunip_the_Vegimal). We are still at a very early stage, preparing for IR. Please support us as well as Kakao Brain. You can reach us/me at my email: either [email protected] or [email protected].

Best,
Kyubyong


css10's Issues

Problem with Russian model

Hi, I have a problem with the Russian model:
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value embedding/lookup_table
[[Node: embedding/lookup_table/read = Identity[T=DT_FLOAT, _class=["loc:@embedding/lookup_table"], _device="/job:localhost/replica:0/task:0/cpu:0"](embedding/lookup_table)]]
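
TensorFlow raises this error when a variable is read before it has been initialized or restored from a checkpoint, so one thing worth ruling out is that the pretrained checkpoint was never found. The following is only a diagnostic sketch, not part of the repository: it assumes the usual TF 1.x checkpoint layout (a `checkpoint` state file plus `<prefix>.index` files), and the function name `describe_checkpoint_dir` is hypothetical.

```python
import os


def describe_checkpoint_dir(logdir):
    """Report what a TF 1.x-style checkpoint directory contains.

    Returns (has_state_file, prefixes), where has_state_file tells whether
    the 'checkpoint' state file exists and prefixes lists every checkpoint
    prefix that has an '.index' file.
    """
    if not os.path.isdir(logdir):
        return False, []
    names = os.listdir(logdir)
    has_state_file = "checkpoint" in names
    suffix = ".index"
    prefixes = sorted(n[:-len(suffix)] for n in names if n.endswith(suffix))
    return has_state_file, prefixes
```

If this returns `(False, [])` for the directory passed to the synthesis script, the restore step had nothing to load, which would be consistent with the uninitialized `embedding/lookup_table` variable.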

Is it impossible to use Gentle for forced alignment of Chinese audiobooks?

Manual alignment of audiobooks is a waste of time.

str vs bytes incompatible data [Error] [Solved]

An error occurred during training at this line:

css10/tacotron/data_load.py

Line 98 in de56b92

mel = "{}/mels/{}".format(hp.lang, fname.replace("wav", "npy"))

I solved it by modifying the previous line:

css10/tacotron/data_load.py

Line 97 in de56b92

fname = os.path.basename(fpath)

to:

fname = os.path.basename(fpath).decode('utf-8')

I'm reporting this in case the change is needed for better compatibility with up-to-date Python versions.
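
A version of the fix that works whether the path arrives as `bytes` or as `str` could look like the sketch below. It is an illustration under assumptions, not the repository's code: the helper name `mel_path` is hypothetical, and `lang` stands in for `hp.lang`. On Python 3, calling `.replace("wav", "npy")` on a `bytes` value raises a `TypeError`, which is presumably the error seen here.

```python
import os


def mel_path(fpath, lang):
    """Build the mel-spectrogram path for a wav file path (bytes or str)."""
    fname = os.path.basename(fpath)
    # Decode only when needed, so str paths pass through unchanged.
    if isinstance(fname, bytes):
        fname = fname.decode("utf-8")
    return "{}/mels/{}".format(lang, fname.replace("wav", "npy"))
```

Unlike an unconditional `.decode('utf-8')`, the `isinstance` check also keeps the function working when the path is already a `str`.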

Brazilian Portuguese Model

Could you make a database available in Brazilian Portuguese? If not, could you guide me on how to train one like the databases you made available?

Can't use GPU

Hi,

I was able to test the synthesize function (synthesize.py) on CPU with success.
But when I tried to use a GPU, I ran into several issues.
First, I tried tensorflow-gpu==1.3.0, but according to this chart: https://www.tensorflow.org/install/source#gpu, it requires CUDA 8, and according to this list: https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/supported-tags.md, the only NVIDIA Docker base image for CUDA 8.0 is based on Ubuntu 16.04, and I could not install the requirements on Ubuntu 16.04.
Second, I tried tensorflow-gpu==1.5.0 with CUDA 9, but the NVIDIA base image for Ubuntu 18.04 supports only CUDA 9.2, not 9.0, and those look incompatible...
Third, I tried tensorflow-gpu==1.13.1 with CUDA 10.0, using a CUDA 10.0 based Ubuntu 18.04 Docker base image.
With that setup, TensorFlow finally detects the GPU, but session initialization (sess.run()) takes forever and eats up all the GPU memory.
When I limited the memory usage, session initialization finished after more than 4 minutes, but the feed-forward pass got stuck at the very beginning, with no progress at all for several minutes.

Any ideas or suggestions? What am I doing wrong?

Thanks!
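
The version pairing described in this issue can be written down as a small check to run before building an image. This is a sketch only: the table holds just the three tensorflow-gpu/CUDA pairs named above (taken from the issue text, not re-verified here), and `cuda_matches` is a hypothetical helper name.

```python
# tensorflow-gpu version -> CUDA version, as stated in the issue above.
CUDA_FOR_TF_GPU = {
    "1.3.0": "8.0",
    "1.5.0": "9.0",
    "1.13.1": "10.0",
}


def cuda_matches(tf_gpu_version, image_cuda_version):
    """Return True if the base image's CUDA version is the one listed."""
    if tf_gpu_version not in CUDA_FOR_TF_GPU:
        raise KeyError("no CUDA version recorded for " + tf_gpu_version)
    return CUDA_FOR_TF_GPU[tf_gpu_version] == image_cuda_version
```

For example, `cuda_matches("1.5.0", "9.2")` is `False`, which mirrors the second attempt described in the issue.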

Vocab for Japanese

I can't find the vocab for the Japanese DCTTS model. Could you share it with me? Thanks.

Incorrect link to Tacotron Finnish audio samples

Dear @Kyubyong

I found another mixed up link.

The link for the Finnish audio samples for Tacotron points to the Dutch ones (https://soundcloud.com/kyubyong-park/sets/ms10_nl_t).

The correct link should be: https://soundcloud.com/kyubyong-park/sets/ms10_fi_t

Regards!

Numbers or ( ) are not considered by the models

Both can appear in automatically extracted sentences. It looks like numbers can be handled with NLTK, but "(" seems harder to handle, since parentheses are associated with an "inflection" in the intonation.
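
One way to see how often this happens is to scan the extracted sentences for these characters before synthesis. The sketch below uses only the standard library and is an assumption about a possible preprocessing step, not something the models or the repository provide; `find_unsupported` and `strip_parentheses` are hypothetical names.

```python
import re

# Runs of digits, or a single opening/closing parenthesis.
UNSUPPORTED = re.compile(r"\d+|[()]")


def find_unsupported(sentence):
    """List the digit runs and parentheses found in a sentence, in order."""
    return UNSUPPORTED.findall(sentence)


def strip_parentheses(sentence):
    """Remove the parenthesis characters but keep the enclosed words."""
    return re.sub(r"[()]", "", sentence)
```

Removing the parentheses keeps the text readable by the model but drops the intonation cue mentioned above, so this is a workaround rather than a fix. Numbers still need to be spelled out separately.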

Output node names

Hi,

I would like to freeze the pretrained Tacotron model (French) but I can't figure out what the output node names are. I tried various tools to visualize the model but none of them succeeded on my (old) machine because of the model size.

Would you mind sharing this information?

Thank you for your support and for publishing your work publicly.

Spanish Soundcloud/Dropbox links swapped

Very small issue, but the links to the models and samples are swapped in the README for Spanish.

Synthesis example

Hi there, I am new to this project.

Would you please give an example of using a pre-trained model to synthesize new audio?