Comments (7)

hayeong0 commented on August 22, 2024

What does 'unintelligible speech' mean? Can I see your training logs or Tensorboard?

We have used about 60 hours of Hindi data from LIMMITS and have experience using phonemizer. Have you checked if tokens were properly extracted during the training and inference stages?
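One way to run the token check suggested above is to compare the phonemized text against the symbol table before training or inference. The following is a minimal sketch, assuming the pipeline maps each phoneme character to an ID through a lookup table. The names `symbols`, `symbol_to_id`, `find_unknown_symbols`, and `encode` are hypothetical, and the phoneme strings are illustrative, not actual phonemizer output.

```python
def find_unknown_symbols(phonemes, symbol_to_id):
    """Return the sorted, de-duplicated symbols that have no ID in the table."""
    return sorted({s for s in phonemes if s not in symbol_to_id})


def encode(phonemes, symbol_to_id):
    """Map a phoneme string to token IDs, failing loudly on unknown symbols."""
    unknown = find_unknown_symbols(phonemes, symbol_to_id)
    if unknown:
        raise ValueError(f"symbols missing from vocabulary: {unknown}")
    return [symbol_to_id[s] for s in phonemes]


# Hypothetical toy vocabulary; a real one would come from the training code.
symbols = ["_", " ", "n", "ə", "m", "s", "t", "e", "ː"]
symbol_to_id = {s: i for i, s in enumerate(symbols)}

# Illustrative phoneme strings (assumed, not real phonemizer output).
covered = "nəməsteː"
not_covered = "nəməsʈeː"  # contains a retroflex symbol absent from the toy table

print(find_unknown_symbols(covered, symbol_to_id))
print(find_unknown_symbols(not_covered, symbol_to_id))
```

If the second call reports missing symbols for real Hindi text, the text front end is dropping or rejecting phonemes, which would be worth ruling out before looking at the loss curves.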

from hierspeechpp.

sh-lee-prml commented on August 22, 2024

Hi

I have attached TensorBoard loss curves for our TTV v1 model, which was trained on the LibriTTS-960 dataset.
We used 4 GPUs with a total batch size of 128 (32 per GPU).

[Image: TensorBoard loss curves for the TTV v1 model]

What does the CTC loss curve look like for the model you trained? Our checkpoint is from 930k steps.

Also, I do not know Hindi well, but I think Phonemizer may not be a good fit for Hindi. In that case, how about trying a different tokenizer?

Pranjalya commented on August 22, 2024

We have also used phonemizer, and from past experience it works decently for Hindi.
Here are my logs:
image

By "unintelligible" I mean that it sounded like clear speech, but the output was unrelated to the text and not in the target language. Then again, this was only with the 20k-step checkpoint.

rishikksh20 commented on August 22, 2024

@hayeong0 After how many steps should we start getting audible speech when training TTV from scratch?

Pranjalya commented on August 22, 2024

Just for reference, here is the audio from the 20k-step checkpoint.

hierspeech_mms.mp4

sh-lee-prml commented on August 22, 2024

Here are our results from the 10k, 20k, 50k, 100k, 200k, and 950k step checkpoints (with the HierSpeech synthesizer v1).

I have attached audio for some texts and speakers from LibriTTS test-clean.

Link

With the LibriTTS dataset, the model at 10k steps can already synthesize audible speech.

Thanks!

Pranjalya commented on August 22, 2024

Thank you very much, it helped.
