Comments (9)
Outputting to the default speaker should be as fast as the demo on the trial page. Outputting to an audio file is slower.
from aspeak.
I'm outputting to speakers and it's slower.
Should I switch to stream mode?
How slow is it? I didn't experience significantly large delays compared with the demo.
third as slow
I just did a profile:
python -m cProfile -m aspeak -t

2860752 function calls (2854737 primitive calls) in 34.520 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
     1    0.000    0.000   34.021   34.021  __main__.py:1(<module>)
     2    0.000    0.000    0.399    0.199  auth.py:1(<module>)
     1    0.000    0.000    0.879    0.879  auth.py:10(_get_auth_token)
     1    0.000    0.000   32.147   32.147  functional.py:11(pure_text_to_speech)
     1    0.000    0.000   33.983   33.983  main.py:122(main)
     1    0.000    0.000    0.948    0.948  main.py:18(read_file)
     1    0.000    0.000    0.000    0.000  main.py:25(preprocess_text)
     1    0.000    0.000   32.147   32.147  main.py:46(speech_function_selector)
     1    0.000    0.000   33.095   33.095  main.py:69(main_text)
     1    0.290    0.290   32.146   32.146  provider.py:36(text_to_speech)
   2/1    0.000    0.000    0.498    0.498  runpy.py:103(_get_module_details)
     1    0.000    0.000   34.520   34.520  runpy.py:199(run_module)
     1    0.000    0.000   34.022   34.022  runpy.py:63(_run_code)
     1    0.000    0.000   31.820   31.820  speech.py:1565(speak_text)
     1    0.000    0.000   29.846   29.846  speech_py_impl.py:6148(speak_text)

- 34.520s is the total time.
- 34.021s is the time spent in __main__.py.
- 33.983s is spent in main() in main.py.
- 32.146s is spent in the text_to_speech function in provider.py.
- 0.948s is spent reading from stdin.
- 0.879s is spent getting the auth token.
- 31.820s is spent by Azure's speech package doing the actual speech synthesis, which is out of my control.
So the room for optimization is 33.983 - 31.820 - 0.948 - 0.879 = 0.336s.
Actually there is almost nothing to optimize, except:
aspeak/src/aspeak/api/provider.py, line 40 in e3b1b44
We could cache the synthesizer here if you are always using the same parameters for text_to_speech.
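For anyone repeating this measurement: the listing above is ordered by standard name, so the expensive calls are scattered through it. Below is a minimal sketch, using only the standard library's cProfile and pstats modules, of sorting by cumulative time instead. The functions slow_step and main are stand-ins for illustration, not aspeak code.

```python
import cProfile
import io
import pstats


def slow_step() -> int:
    # Stand-in for the expensive call (here just a CPU-bound loop).
    return sum(i * i for i in range(200_000))


def main() -> int:
    return slow_step()


def profile_report(func, limit: int = 5) -> str:
    """Profile func() and return the top `limit` rows sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "cumulative" puts the calls that account for the most wall time first.
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_report(main)
print(report)
```

On the command line, passing -s cumulative to cProfile (before -m aspeak and its usual arguments) should give the same ordering.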
I can provide an API with a cached SpeechSynthesizer in the next version, but I'm very busy at the moment, so don't expect it to arrive very soon.
You could do it yourself by building your own version of SpeechServiceProvider, if you are always calling text_to_speech/pure_text_to_speech with the same set of parameters:
- Create and store your SpeechConfig and AudioOutputConfig in it.
- Cache the SpeechSynthesizer, and recreate it using the same config in case of token expiration.
- Modify the text_to_speech and ssml_to_speech methods on your SpeechServiceProvider to use the cached SpeechSynthesizer, and remove the config parameters from the methods.
- Call SpeechServiceProvider.text_to_speech(text) or SpeechServiceProvider.ssml_to_speech(ssml) to do speech synthesis (you can create SSML using the create_ssml function in aspeak.ssml).
However, frankly speaking, I don't know by how much the performance will improve.
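The steps above can be sketched roughly as follows. This is not aspeak's implementation: CachedSynthesizerProvider, make_synthesizer, max_age and FakeSynthesizer are hypothetical names, and the 540-second default lifetime is an assumption rather than a documented token lifetime. The factory is injected so the sketch runs without the Azure SDK; in real use it would build a SpeechSynthesizer from the stored SpeechConfig and AudioOutputConfig.

```python
import time
from typing import Any, Callable, Optional


class CachedSynthesizerProvider:
    """Hypothetical custom provider that reuses one synthesizer across calls."""

    def __init__(
        self,
        make_synthesizer: Callable[[], Any],
        max_age: float = 540.0,  # assumed lifetime in seconds, not a documented value
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._make_synthesizer = make_synthesizer
        self._max_age = max_age
        self._clock = clock
        self._synthesizer: Optional[Any] = None
        self._created_at = 0.0

    def _get_synthesizer(self) -> Any:
        # Rebuild only when nothing is cached or the cached object is too old.
        now = self._clock()
        if self._synthesizer is None or now - self._created_at >= self._max_age:
            self._synthesizer = self._make_synthesizer()
            self._created_at = now
        return self._synthesizer

    def text_to_speech(self, text: str) -> Any:
        return self._get_synthesizer().speak_text(text)

    def ssml_to_speech(self, ssml: str) -> Any:
        return self._get_synthesizer().speak_ssml(ssml)


class FakeSynthesizer:
    """Test double standing in for the real synthesizer; counts constructions."""

    created = 0

    def __init__(self) -> None:
        FakeSynthesizer.created += 1

    def speak_text(self, text: str) -> str:
        return f"spoken:{text}"

    def speak_ssml(self, ssml: str) -> str:
        return f"spoken-ssml:{ssml}"


provider = CachedSynthesizerProvider(FakeSynthesizer)
provider.text_to_speech("hello")
provider.text_to_speech("world")  # reuses the synthesizer built by the first call
```

Whether this saves a noticeable amount of time depends on how expensive constructing the synthesizer is compared with the synthesis call itself, which the profile above suggests dominates.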
Actually, I don't think the 200ms delay is realistic.
I opened https://eastus.tts.speech.microsoft.com in a browser and got a 268ms delay.
Thanks