Comments (5)
What do you mean by "runtimeDuration"?
Do you mean how long the TTS engine will run to generate the given phrase? Do you mean the maximum time for the TTS engine to run to create the given phrase? Do you mean the actual duration of the utterance after it was generated?
I would guess you mean "the actual duration of the utterance after it was generated". If so, you can just get this from the SpeechSynthesisUtterance events.
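As a rough sketch of what that could look like: the helper below times the gap between the utterance's "start" and "end" events. The name measureSpokenDuration and its options object are hypothetical (they are not part of the Web Speech API), and the injectable synth and Utterance parameters exist only so the sketch can be exercised outside a browser. It uses performance.now() rather than the event's own timing fields, on the assumption that a wall-clock measurement is easier to compare across browsers.

```javascript
// Hypothetical helper: resolves with the spoken duration in milliseconds.
// In a browser the defaults pick up the real speechSynthesis objects.
function measureSpokenDuration(text, options = {}) {
  const synth = options.synth ?? globalThis.speechSynthesis;
  const Utterance = options.Utterance ?? globalThis.SpeechSynthesisUtterance;

  return new Promise((resolve, reject) => {
    const utterance = new Utterance(text);
    let startedAt = null;

    // "start" fires when the engine begins speaking this utterance.
    utterance.onstart = () => {
      startedAt = performance.now();
    };

    // "end" fires when speaking finishes; the difference is the duration.
    utterance.onend = () => {
      resolve(startedAt === null ? 0 : performance.now() - startedAt);
    };

    utterance.onerror = (event) => {
      reject(new Error(`speech synthesis failed: ${event.error}`));
    };

    synth.speak(utterance);
  });
}
```

In a page this would be called as measureSpokenDuration("Hello world").then((ms) => console.log(ms)). Note that the text is actually spoken, so the value is only known once playback has finished.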
from speech-api.
You are correct. I meant "the actual duration of the utterance after it was generated".
It seems like SpeechSynthesisUtterance events fire only while the utterance is being spoken via "speak".
Would it be possible to get the actual duration of the utterance at the time it is instantiated, rather than only while "speak" is running?
The use case I am thinking about is similar to YouTube showing a video's runtime duration. A user can see how long a video will last. Similarly, it would be useful to see how long an utterance will run for.
from speech-api.
I would guess that it is possible to get an estimate of the duration of an utterance before its audio is generated by, for example, introducing a duration prediction ML model. However, this would impose constraints on the implementation that would make this added feature onerous to implement.
For example, some browsers use the voices native to the OS and these voices, generally, do not have such an associated "duration prediction" ML model. So this would require that browser providers for each OS and for each voice create a "duration prediction" ML model and have these "duration prediction" ML models retrained for each new OS version and each new voice. While possible, I don't think the gained functionality justifies the effort.
I'd be interested in others' views on this too.
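Short of a trained prediction model, a page could fall back on a crude word-count heuristic. The sketch below is only that: estimateDurationMs is a hypothetical helper, the 150 words-per-minute default is an assumed ballpark for conversational English rather than anything the spec or a voice guarantees, and the result will be off for other languages, numbers, abbreviations, and voices with different pacing.

```javascript
// Hypothetical heuristic: estimate spoken duration from the word count.
// wordsPerMinute is an assumed average, not a property of any real voice.
function estimateDurationMs(text, { wordsPerMinute = 150, rate = 1 } = {}) {
  if (!(wordsPerMinute > 0) || !(rate > 0)) {
    throw new RangeError("wordsPerMinute and rate must be positive");
  }
  // Split on whitespace and drop empty pieces so "" counts as zero words.
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  // 60000 ms per minute; a higher rate shortens the estimate proportionally.
  return (words * 60000) / (wordsPerMinute * rate);
}
```

This gives a number before any audio exists, which is what the YouTube-style use case needs, but it should be presented to users as approximate.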
from speech-api.
Agree that this sounds like a "nice to have", but not a particularly critical thing to have in the spec.
from speech-api.
Hello there :)
Is there a way to compute a value for "rate" so that an utterance ("text") is read in n ms? I am hacking on a subtitles reader script and need to read sentences as fast as they are spoken in the video.
The spec is vague about "rate":
1 is the default rate supported by the speech synthesis engine or specific voice (which should correspond to a normal speaking rate)
How can I quantify "normal speaking rate"?
from speech-api.
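One possible workaround for the subtitle case, sketched under assumptions: since the spec describes a rate of 2 as twice as fast and 0.5 as half as fast, a duration observed at rate 1 (for example by timing the "start" and "end" events for a sample sentence) can be scaled to a target duration. The function name rateForTargetDuration is hypothetical, and real engines may not scale perfectly linearly or may clamp to a narrower range than the spec's 0.1 to 10.

```javascript
// Hypothetical helper: pick a rate so speech lasting baselineMs at rate 1
// takes roughly targetMs instead. Assumes duration scales as 1 / rate.
function rateForTargetDuration(baselineMs, targetMs) {
  if (!(baselineMs > 0) || !(targetMs > 0)) {
    throw new RangeError("durations must be positive numbers");
  }
  const rate = baselineMs / targetMs;
  // The spec allows rate values between 0.1 and 10.
  return Math.min(10, Math.max(0.1, rate));
}
```

For example, a sentence that took 4000 ms at rate 1 and must fit a 2000 ms subtitle slot would get a rate of 2. The baseline still has to come from somewhere, so this quantifies "normal speaking rate" only per voice and per measurement, not in general.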