Comments (6)

fquirin commented on August 16, 2024

Docker image has been updated with the fixed MaryTTS API 👍 Let me know if there are any more issues.

The interface works like a charm now 😃

There are 2 things though that confuse me.
First one is that the qualities low and medium are basically the same speed 🤔 . That is something I noticed before in an earlier version.
I've been testing the aarch64 container on RPi4 and the results look like this (Voice: Mary-Ann en_us):

  • Low: loaded in 4.0s, playtime 6.4s
  • Medium: loaded in 3.8s, playtime 6.5s
  • High: loaded in 24.8s, playtime 6.5s

Load-times are measured inside the SEPIA client and Larynx test-page and are identical to Larynx console plus ~100ms network delay.
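To compare the three quality levels independently of text length, one option is to divide load time by playtime. The sketch below does that with the numbers listed above; the ratio (often called a real-time factor) and the helper name `real_time_factor` are illustrative additions, not something Larynx reports.

```python
# Measurements quoted above: (load seconds, playtime seconds) per quality level.
MEASUREMENTS = {
    "low": (4.0, 6.4),
    "medium": (3.8, 6.5),
    "high": (24.8, 6.5),
}


def real_time_factor(load_s: float, play_s: float) -> float:
    """Return load time divided by audio duration.

    Values below 1.0 mean the audio was ready faster than it takes to play.
    """
    if play_s <= 0:
        raise ValueError("playtime must be positive")
    return load_s / play_s


if __name__ == "__main__":
    for quality, (load_s, play_s) in MEASUREMENTS.items():
        print(f"{quality}: {real_time_factor(load_s, play_s):.2f}")
```

With these numbers, low and medium both stay well below 1.0 while high lands well above it, which matches the observation that only the high quality is noticeably slower.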

Second thing is that after almost exactly 30s of inactivity it takes another 2s to load the result every time, no matter how long the text is :-|.
It looks like something is unloading or powering down, because these 2s are not showing up in the Larynx console. This is reproducible inside SEPIA client and the Larynx test-page and I'm wondering if it has something to do with Docker itself because loading the Larynx test-page will reset the timer as well 😕 . I need to double-check if the MaryTTS container (or any container) has the same effect 🤔

It should be possible now with gruut to add models for predicting more features of German words (e.g., case), and use them when verbalizing dates, etc.

If you show me the right place and methods I can try to fill them with life for German ;-)

I'll check out the SEPIA STT-Server! I'd like to use it as a base for a project like OpenTTS for STT, where all of the available models for a given language are gathered together behind a web API.

I'd definitely support that. I was planning to add Coqui to the server but haven't had time yet. Everything is prepared to handle different "engines" though :-)

from larynx.

synesthesiam commented on August 16, 2024

Thanks! This sounds like a great idea -- I'll get it implemented tomorrow. I like the idea of having the quality come through too, so the voices will show up like "harvard;low", "harvard;high" since those are the values you can plug directly into /process
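On the client side, a combined value such as "harvard;low" can be split back into its two parts. This is a minimal sketch, assuming the voice name itself never contains a semicolon; `VoiceId` and `parse_voice_id` are hypothetical names, not part of Larynx.

```python
from typing import NamedTuple, Optional


class VoiceId(NamedTuple):
    """A voice name plus an optional quality suffix (hypothetical container)."""
    name: str
    quality: Optional[str]


def parse_voice_id(voice: str) -> VoiceId:
    """Split 'name;quality' at the first semicolon.

    If there is no semicolon, quality is None.
    """
    name, sep, quality = voice.partition(";")
    return VoiceId(name=name, quality=quality if sep else None)


if __name__ == "__main__":
    print(parse_voice_id("harvard;low"))
    print(parse_voice_id("harvard"))
```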

Also, I could use a few tips in gruut on how to properly parse more stuff in German. You can see all the good stuff I have for English. I'm using the dateparser library in Python, but oddly it didn't parse "YYYY.MM.DD" as a German date...
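When a general-purpose parser misses a dotted date, one fallback is to try explicit formats with the standard library. This is a minimal sketch that does not use dateparser at all; `parse_dotted_date` and the list of formats are assumptions made for illustration.

```python
from datetime import date, datetime
from typing import Optional

# Candidate formats, tried in order: year-first and the usual German day-first.
DOTTED_FORMATS = ("%Y.%m.%d", "%d.%m.%Y")


def parse_dotted_date(text: str) -> Optional[date]:
    """Return a date if text matches one of the dotted formats, else None."""
    for fmt in DOTTED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # try the next format
    return None


if __name__ == "__main__":
    print(parse_dotted_date("2021.08.16"))
    print(parse_dotted_date("16.08.2021"))
```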

Oh, and how is the STT server going? I'm planning to cycle back around to that soon and was curious where you were at.

fquirin commented on August 16, 2024

Thanks! This sounds like a great idea -- I'll get it implemented tomorrow.

Awesome 🤩

I'm using the dateparser library in Python, but oddly it didn't parse "YYYY.MM.DD" as a German date...

Ok that's really weird :-/. I'm not familiar with 'dateparser' but this seems like a trivial task 😅 , on the other hand nothing about time and dates is trivial when you want to support more than 1 language 😆 . I don't see 'dateparser' in the code you've linked above, where is it applied to the text?

I could certainly write some parsing methods for German, but keep in mind that methods like en_verbalize_time(time) won't work for German because pronunciation depends on the surrounding words 🙈 . It will require something like de_verbalize_time(full_text) 😶 . I was working on 'text2num' German support a few weeks ago (it's part of the new STT server :-)) and it was very complicated to implement properly because the old code structure didn't really take into account that German behaves fundamentally differently in some situations :-/.
I remember we've started to discuss the topic a while ago, maybe we can continue there.
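The same context dependence shows up with day ordinals in dates: "der 1. Mai" is read "der erste Mai", while "am 1. Mai" is read "am ersten Mai". The toy sketch below picks the ending from the preceding word only. It is not gruut's API; the function names, the tiny word list, and the three supported days are all assumptions, and a real solution would need the case prediction discussed in this thread.

```python
# Ordinal stems for a few days only; this is a toy table, not a full one.
ORDINAL_STEMS = {1: "erst", 2: "zweit", 3: "dritt"}

# Preceding words after which the day ordinal takes the "-en" ending.
EN_CONTEXT = {"am", "den", "dem", "des", "vom", "zum"}


def de_ordinal_ending(preceding_word: str) -> str:
    """Guess the ordinal ending from the word directly before the date."""
    word = preceding_word.lower()
    if word == "der":
        return "e"   # "der erste Mai"
    if word in EN_CONTEXT:
        return "en"  # "am ersten Mai"
    return "er"      # no article: "erster Mai"


def de_verbalize_day(preceding_word: str, day: int) -> str:
    """Verbalize a day ordinal given its left context (hypothetical helper)."""
    return ORDINAL_STEMS[day] + de_ordinal_ending(preceding_word)


if __name__ == "__main__":
    print("der", de_verbalize_day("der", 1), "Mai")
    print("am", de_verbalize_day("am", 1), "Mai")
```

A function that only receives the number, in the style of en_verbalize_time(time), has no way to choose between these endings, which is why the full text has to be passed in.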

Oh, and how is the STT server going? I'm planning to cycle back around to that soon and was curious where you were at.

It has been released and I'm very happy with the results so far: SEPIA STT-Server. You can use one of the Docker containers for testing 🙂 . I've written a JavaScript client for the server and was planning to document the API next. Let me know if you need any info.

synesthesiam commented on August 16, 2024

Docker image has been updated with the fixed MaryTTS API 👍 Let me know if there are any more issues.
I forgot to mention too that you can send SSML into the MaryTTS API; if the text begins with an angle bracket < it will be interpreted as SSML.
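A client might build the request like this. This is a minimal sketch: the base URL is an assumption for a local setup, `INPUT_TEXT` and `VOICE` are the standard MaryTTS query parameter names, and `looks_like_ssml` only mirrors the rule described above on the client side (whether the server ignores leading whitespace is not something this thread states).

```python
from urllib.parse import urlencode

# Assumption: a Larynx server reachable locally; adjust host and port as needed.
BASE_URL = "http://localhost:5002"


def looks_like_ssml(text: str) -> bool:
    """Mirror the rule above: text starting with '<' is treated as SSML."""
    return text.startswith("<")


def build_process_url(text: str, voice: str, base_url: str = BASE_URL) -> str:
    """Build a MaryTTS-style /process URL with URL-encoded text and voice."""
    query = urlencode({"INPUT_TEXT": text, "VOICE": voice})
    return f"{base_url}/process?{query}"


if __name__ == "__main__":
    plain = "Hello world"
    ssml = "<speak>Hello <break time='1s'/> world</speak>"
    for text in (plain, ssml):
        kind = "SSML" if looks_like_ssml(text) else "plain text"
        print(kind, build_process_url(text, "harvard;low"))
```

Encoding matters here because both the angle brackets in SSML and the semicolon in a value like "harvard;low" are not safe to place in a query string as-is.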

I'll continue the parsing discussion over on the gruut thread. It should be possible now with gruut to add models for predicting more features of German words (e.g., case), and use them when verbalizing dates, etc.

I'll check out the SEPIA STT-Server! I'd like to use it as a base for a project like OpenTTS for STT, where all of the available models for a given language are gathered together behind a web API.

fquirin commented on August 16, 2024

Second thing is that after almost exactly 30s of inactivity it takes another 2s to load the result every time, no matter how long the text is :-|.

I solved this one 😅 . It seems to be a problem with domain name resolution inside my network O_o. If I use the IP instead of the RPi hostname I never get the 2s timeout :-/
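For anyone who wants to check for the same symptom, timing the name lookup on its own separates it from the TTS request. This is a minimal sketch using the standard library; `time_resolution` is a hypothetical helper and "rpi.local" is a placeholder hostname, not the one used here.

```python
import socket
import time
from typing import Callable


def time_resolution(host: str, resolver: Callable[..., object] = socket.getaddrinfo) -> float:
    """Return the seconds spent resolving host.

    The resolver is injectable so the timing logic can be tried without a network.
    """
    start = time.perf_counter()
    resolver(host, None)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder hostname; replace it with the name of the machine running Larynx.
    host = "rpi.local"
    try:
        print(f"{host}: {time_resolution(host):.3f}s")
    except socket.gaierror as err:
        print(f"{host}: lookup failed ({err})")
```

If the lookup is slow after a pause but an IP address works instantly, the delay is in name resolution rather than in the server.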

synesthesiam commented on August 16, 2024

Awesome, thanks for letting me know! I couldn't think of a reason it would do that 🙂
