Hi Michael, congratulations on your Larynx v1.0 release 🥳. Great

MaryTTS API interface is not 100% compatible (larynx issue, closed)

fquirin commented on August 16, 2024

MaryTTS API interface is not 100% compatible

from larynx.

Comments (6)

fquirin commented on August 16, 2024

Docker image has been updated with the fixed MaryTTS API 👍 Let me know if there are any more issues.

The interface works like a charm now 😃

There are 2 things that confuse me, though.
The first is that the low and medium qualities run at basically the same speed 🤔. That is something I noticed before in an earlier version.
I've been testing the aarch64 container on an RPi4 and the results look like this (Voice: Mary-Ann en_us):

Low: loaded in 4.0s, playtime 6.4s
Medium: loaded in 3.8s, playtime 6.5s
High: loaded in 24.8s, playtime 6.5s

Load times are measured inside the SEPIA client and the Larynx test-page, and they match the times shown in the Larynx console plus ~100ms network delay.

The second thing is that after almost exactly 30s of inactivity it takes an additional 2s to load the result every time, no matter how long the text is :-|.
It looks like something is unloading or powering down, because these 2s do not show up in the Larynx console. This is reproducible inside the SEPIA client and the Larynx test-page, and I'm wondering if it has something to do with Docker itself, because loading the Larynx test-page resets the timer as well 😕. I need to double-check whether the MaryTTS container (or any container) has the same effect 🤔

It should be possible now with gruut to add models for predicting more features of German words (e.g., case), and use them when verbalizing dates, etc.

If you show me the right place and methods I can try to fill them with life for German ;-)

I'll check out the SEPIA STT-Server! I'd like to use it as a base for a project like OpenTTS for STT, where all of the available models for a given language are gathered together behind a web API.

I'd definitely support that. I was planning to add Coqui to the server but haven't had time yet. Everything is prepared to handle different "engines" though :-)

synesthesiam commented on August 16, 2024

Thanks! This sounds like a great idea -- I'll get it implemented tomorrow. I like the idea of having the quality come through too, so the voices will show up like "harvard;low", "harvard;high" since those are the values you can plug directly into /process
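To make the "plug directly into /process" part concrete, here is a minimal sketch that builds such a request URL. The host, the port, and the `INPUT_TEXT` / `VOICE` query parameter names are assumptions based on the usual MaryTTS HTTP conventions and are not confirmed in this thread; `build_process_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_process_url(base_url: str, text: str, voice: str, quality: str) -> str:
    """Build a MaryTTS-style /process URL using a "voice;quality" value."""
    params = {
        "INPUT_TEXT": text,               # assumed MaryTTS parameter name
        "VOICE": f"{voice};{quality}",    # e.g. "harvard;low", as described above
    }
    # urlencode percent-encodes the ";" so it survives as part of the value
    return f"{base_url.rstrip('/')}/process?{urlencode(params)}"


# Hypothetical local server address, for illustration only
url = build_process_url("http://localhost:5002", "Hello world", "harvard", "low")
print(url)
```

Fetching the resulting URL with any HTTP client should then return audio, provided the server really accepts these parameters.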

Also, I could use a few tips in gruut on how to properly parse more stuff in German. You can see all the good stuff I have for English. I'm using the dateparser library in Python, but oddly it didn't parse "YYYY.MM.DD" as a German date...
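For purely numeric dates, one possible workaround is to try a few explicit formats before falling back to a general parser. The sketch below uses only the standard library `datetime.strptime`; `parse_numeric_date` and the format list are assumptions for illustration, not how gruut handles dates.

```python
from datetime import date, datetime
from typing import Optional

# Candidate numeric formats, tried in order (assumed for illustration)
FORMATS = ("%d.%m.%Y", "%Y.%m.%d")


def parse_numeric_date(text: str) -> Optional[date]:
    """Return a date if the text matches one of FORMATS, else None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # try the next format
    return None
```

Anything that returns `None` here could still be handed to a more general parser afterwards.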

Oh, and how is the STT server going? I'm planning to cycle back around to that soon and was curious where you were at.

fquirin commented on August 16, 2024

Thanks! This sounds like a great idea -- I'll get it implemented tomorrow.

Awesome 🤩

I'm using the dateparser library in Python, but oddly it didn't parse "YYYY.MM.DD" as a German date...

Ok that's really weird :-/. I'm not familiar with 'dateparser', but this seems like a trivial task 😅. On the other hand, nothing about time and dates is trivial when you want to support more than 1 language 😆. I don't see 'dateparser' in the code you've linked above; where is it applied to the text?

I could certainly write some parsing methods for German, but keep in mind that methods like en_verbalize_time(time) won't work for German because pronunciation depends on the surrounding words 🙈. It will require something like de_verbalize_time(full_text) 😶. I worked on German support for 'text2num' a few weeks ago (it's part of the new STT server :-)), and it was very complicated to implement properly because the old code structure didn't really take into account that German behaves fundamentally differently in some situations :-/.
I remember we started discussing the topic a while ago; maybe we can continue there.
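A small example of why the full text is needed: the ordinal in "der 1. Mai" is read "erste", but in "am 1. Mai" it is read "ersten". The sketch below is a deliberately simplified illustration that only covers days 1 to 12 and a handful of preceding words; `de_verbalize_ordinal_days` and its lookup tables are hypothetical and not part of gruut.

```python
import re

# Ordinal stems for days 1-12 (illustrative subset only)
_STEMS = {
    1: "erst", 2: "zweit", 3: "dritt", 4: "viert", 5: "fünft", 6: "sechst",
    7: "siebt", 8: "acht", 9: "neunt", 10: "zehnt", 11: "elft", 12: "zwölft",
}
# Ending of the ordinal depends on the word in front of it
_ENDING_AFTER = {"der": "e", "am": "en", "den": "en", "dem": "en", "vom": "en", "zum": "en"}
_MONTHS = ("Januar|Februar|März|April|Mai|Juni|Juli|August|"
           "September|Oktober|November|Dezember")
_DAY = re.compile(
    rf"\b(der|am|den|dem|vom|zum)\s+(\d{{1,2}})\.(?=\s+(?:{_MONTHS}))",
    re.IGNORECASE,
)


def de_verbalize_ordinal_days(full_text: str) -> str:
    """Spell out "<word> <day>." before a month name, using the preceding word."""
    def replace(match: re.Match) -> str:
        word, day = match.group(1), int(match.group(2))
        stem = _STEMS.get(day)
        if stem is None:
            return match.group(0)  # day not covered by this sketch: leave as-is
        return f"{word} {stem}{_ENDING_AFTER[word.lower()]}"
    return _DAY.sub(replace, full_text)
```

A function that only receives the number, like en_verbalize_time(time), has no way to choose between the two endings.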

Oh, and how is the STT server going? I'm planning to cycle back around to that soon and was curious where you were at.

It has been released and I'm very happy with the results so far: SEPIA STT-Server. You can use one of the Docker containers for testing 🙂. I've written a JavaScript client for the server and was planning to document the API next. Let me know if you need any info.

synesthesiam commented on August 16, 2024

Docker image has been updated with the fixed MaryTTS API 👍 Let me know if there are any more issues.
I forgot to mention too that you can send SSML into the MaryTTS API; if the text begins with an angle bracket < it will be interpreted as SSML.
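The rule described here can be mirrored on the client side with a one-line check. This is only a sketch of the stated behaviour: `looks_like_ssml` is a hypothetical helper, and whether leading whitespace before the `<` is tolerated by the server is not stated, so the check below does not strip it.

```python
def looks_like_ssml(text: str) -> bool:
    """Mirror the stated rule: input starting with "<" is treated as SSML."""
    return text.startswith("<")


plain = "Hello world."
ssml = "<speak>Hello world.</speak>"  # minimal SSML document

for candidate in (plain, ssml):
    kind = "SSML" if looks_like_ssml(candidate) else "plain text"
    print(f"{candidate!r} would be interpreted as {kind}")
```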

I'll continue the parsing discussion over on the gruut thread. It should be possible now with gruut to add models for predicting more features of German words (e.g., case), and use them when verbalizing dates, etc.

I'll check out the SEPIA STT-Server! I'd like to use it as a base for a project like OpenTTS for STT, where all of the available models for a given language are gathered together behind a web API.

fquirin commented on August 16, 2024

Second thing is that after almost exactly 30s of inactivity it takes another 2s to load the result every time, no matter how long the text is :-|.

I solved this one 😅 . It seems to be a problem with domain name resolution inside my network O_o. If I use the IP instead of the RPi hostname I never get the 2s timeout :-/
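For anyone who wants to check for the same problem, a minimal sketch is to time the name lookup on its own with the standard library `socket.getaddrinfo`. The port number and the example hostname in the comment are assumptions; `time_name_resolution` is a hypothetical helper.

```python
import socket
import time


def time_name_resolution(host: str, port: int = 5002) -> float:
    """Return the seconds spent resolving host (raises socket.gaierror on failure)."""
    start = time.perf_counter()
    socket.getaddrinfo(host, port)
    return time.perf_counter() - start


# An IP literal needs no DNS lookup, so it serves as the baseline
print(f"IP literal: {time_name_resolution('127.0.0.1'):.4f}s")
# Compare against the hostname used in the client, e.g. (hypothetical name):
# print(f"hostname:   {time_name_resolution('my-rpi.local'):.4f}s")
```

If the hostname lookup takes around 2s after a period of inactivity while the IP lookup stays fast, the delay comes from name resolution rather than from the TTS server.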

synesthesiam commented on August 16, 2024

Awesome, thanks for letting me know! I couldn't think of a reason it would do that 🙂
