mediatechlab / tts-wrapper

TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface.

License: MIT License

tts-wrapper's Introduction

TTS-Wrapper

Contributions are welcome! Check our contribution guide.

TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface.

Currently the following services are supported:

  • AWS Polly
  • Google TTS
  • Microsoft TTS
  • IBM Watson
  • PicoTTS
  • SAPI (Microsoft Speech API)

Installation

Install using pip.

pip install TTS-Wrapper

Note: for each service you want to use, you have to install the required packages.

Example: to use Google and Watson:

pip install "TTS-Wrapper[google,watson]"

PicoTTS must be installed on your machine separately. On Debian-based systems (Ubuntu and others), install the libttspico-utils package; on Arch-based systems (Manjaro and others), use the aur/pico-tts package.

Usage

Simply instantiate an object from the desired service and call synth().

from tts_wrapper import PollyTTS, PollyClient

tts = PollyTTS(client=PollyClient())
tts.synth('<speak>Hello, world!</speak>', 'hello.wav')

Notice that you must create a client object to work with your service. Each service uses different authorization techniques. Check out the documentation to learn more.

Selecting a Voice

You can change the default voice and language like this:

PollyTTS(client=PollyClient(), voice='Camila', lang='pt-BR')

Check out the list of available voices for Polly, Google, Microsoft, and Watson.

SSML

You can also use SSML markup to control the output of compatible engines.

tts.synth('<speak>Hello, <break time="3s"/> world!</speak>', 'hello.wav')

It is recommended to use the ssml attribute that will create the correct boilerplate tags for each engine:

tts.synth(tts.ssml.add('Hello, <break time="3s"/> world!'), 'hello.wav')

Learn which tags are available for each service: Polly, Google, Microsoft, and Watson.

Authorization

To set up credentials for each engine, create the respective client.

Polly

If you don't explicitly define credentials, boto3 will try to find them in your system's credentials file or your environment variables. However, you can specify them with a tuple:

from tts_wrapper import PollyClient
client = PollyClient(credentials=(region, aws_key_id, aws_access_key))
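
If you would rather build that tuple yourself, the sketch below reads the standard AWS environment variables (AWS_DEFAULT_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) and returns them in the order shown above. The helper name polly_credentials_from_env is hypothetical and not part of TTS-Wrapper.

```python
import os
from typing import Mapping, Tuple


def polly_credentials_from_env(env: Mapping[str, str] = os.environ) -> Tuple[str, str, str]:
    """Return (region, aws_key_id, aws_access_key) read from environment variables.

    Hypothetical helper: it only assembles the tuple, it does not contact AWS.
    """
    names = ("AWS_DEFAULT_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    region, key_id, access_key = (env[name] for name in names)
    return region, key_id, access_key
```

The result could then be passed as PollyClient(credentials=polly_credentials_from_env()).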

Google

Point to your OAuth 2.0 credentials file path:

from tts_wrapper import GoogleClient
client = GoogleClient(credentials='path/to/creds.json')

Microsoft

Just provide your subscription key, like so:

from tts_wrapper import MicrosoftClient
client = MicrosoftClient(credentials='TOKEN')

If your region is not the default "useast", you can change it like so:

client = MicrosoftClient(credentials='TOKEN', region='brazilsouth')

Watson

Pass your API key and URL to the initializer:

from tts_wrapper import WatsonClient
client = WatsonClient(credentials=('API_KEY', 'API_URL'))

PicoTTS & SAPI

These clients don't require authorization since they run offline.

from tts_wrapper import PicoClient, SAPIClient
client = PicoClient()
# or
client = SAPIClient()

File Format

By default, all audio is saved as a WAV file, but you can change it to MP3 using the format option:

tts.synth('<speak>Hello, world!</speak>', 'hello.mp3', format='mp3')
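
To keep the file name and the format option in sync, one option is to derive the format from the file extension. This is a minimal sketch; format_from_filename and SUPPORTED_FORMATS are hypothetical names, and the supported set is assumed to be just the two formats mentioned above.

```python
from pathlib import Path

# Assumption: only the two formats mentioned in this README.
SUPPORTED_FORMATS = ("wav", "mp3")


def format_from_filename(filename: str) -> str:
    """Return the audio format implied by the file extension."""
    extension = Path(filename).suffix.lstrip(".").lower()
    if extension not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {extension!r}")
    return extension
```

It could be used as tts.synth(text, filename, format=format_from_filename(filename)).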

License

Licensed under the MIT License.

tts-wrapper's People

Contributors

arpadatscorp, dependabot[bot], gbottari, panoramix360

tts-wrapper's Issues

Add support for NSS (Mac OS built-in TTS)

Description

Since we are using pyttsx3 we could easily support that engine as well.

Suggested Steps

  1. Create the client and TTS classes
  2. Add tests following the base test classes
  3. Mark the tests so they only run on macOS
  4. Actually test on Mac OS
  5. Bonus: see if we can have it working on the CI Runner (probably not possible)
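
Step 3 above could be sketched with a platform check. The example uses the standard library's unittest to stay self-contained; the project's own test suite may use a different runner and marker mechanism, and the class name NSSTTSTest is hypothetical.

```python
import sys
import unittest

IS_MACOS = sys.platform == "darwin"


@unittest.skipUnless(IS_MACOS, "requires the macOS speech synthesizer")
class NSSTTSTest(unittest.TestCase):
    """Hypothetical test case; it is skipped on every platform except macOS."""

    def test_runs_only_on_macos(self):
        self.assertEqual(sys.platform, "darwin")


def run_suite() -> unittest.TestResult:
    """Run the test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NSSTTSTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

On Linux or Windows the run reports one skipped test instead of a failure.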

Class variables should be prefixed with single underscore

Description

According to PEP 8, we should use a single leading underscore for fields that are not part of the public API of a class.

This article does a good job of explaining why we should use this convention.

This is present in most classes, like PollyTTS:

class PollyTTS(AbstractTTS):
    def __init__(
        self,
        client: PollyClient,
        voice: Optional[str] = None,
    ) -> None:
        self.client = client  # should be changed to self._client
        self.voice = voice or "Joanna"   # should be changed to self._voice
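
If read access to a renamed field is still needed, a property can expose it without making it writable. This is a sketch with a hypothetical stand-in class (ExampleTTS), not the library's actual code.

```python
from typing import Optional


class ExampleTTS:
    """Hypothetical stand-in showing underscore-prefixed fields."""

    def __init__(self, client: object, voice: Optional[str] = None) -> None:
        self._client = client            # not part of the public API
        self._voice = voice or "Joanna"  # same default as in the snippet above

    @property
    def voice(self) -> str:
        """Read-only view of the selected voice."""
        return self._voice
```

Assigning to tts.voice then raises AttributeError, which makes accidental mutation from outside the class visible.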

Document base classes

Description

Currently, basic documentation is severely lacking. These classes require the most attention:

  • AbstractTTS
  • AbstractSSMLNode

Add support for other systems

Prerequisites

  • Did you perform a cursory search?

Description

I love this library. I think it would be great if it could replace somewhat redundant packages like pyttsx3. To do this, though, it would be useful to support more offline and smaller TTS systems.

For us personally, we'd love RHVoice and Ekho (supporting hard-to-reach languages is really key).

Playing sound immediately rather than saving to wav

Description

We can currently create a WAV or MP3 file with tts.synth, but I'd like to have a tts.speak command.

It would do the same thing, but instead of saving an audio file it would stream out the binary data. We can kind of do this with:

from playsound import playsound
playsound('mysound.wav')

but 1) that's another library, and 2) it still has to save the file and then read it back.

Thanks for the great project :)

Install - bad pattern

Prerequisites

  • Can you reproduce the problem?
  • Are you running the latest version?
  • Did you check the documentation?
  • Did you perform a cursory search?

For more information, see the contributing guide.

Description

I'm running pip install TTS-Wrapper[google, watson], but I get zsh: bad pattern: TTS-Wrapper[google,

Running under Python 3.10 on macOS.
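
A likely cause is that zsh treats the unquoted square brackets as a glob pattern, and the space after the comma splits the argument. Quoting the requirement and dropping the space avoids this in the shell. As an alternative sketch, the command can be built as an argument list in Python, so no shell parsing is involved; pip_install_command is a hypothetical helper.

```python
import sys
from typing import List, Sequence


def pip_install_command(package: str, extras: Sequence[str] = ()) -> List[str]:
    """Build a pip command as an argument list (no shell, so no globbing)."""
    requirement = f"{package}[{','.join(extras)}]" if extras else package
    return [sys.executable, "-m", "pip", "install", requirement]
```

The list can be passed to subprocess.run(command, check=True) to perform the install.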

Make sure extra module imports are optional

Prerequisites

  • Can you reproduce the problem?
  • Are you running the latest version?
  • Did you check the documentation?
  • Did you perform a cursory search?

For more information, see the contributing guide.

Description

All extra modules in pyproject.toml (section [tool.poetry.extras]) must be truly optional. For instance, PollyTTS requires boto3. If a user doesn't have boto3 installed, they should be able to use other modules (e.g. MicrosoftTTS) without needing to install boto3.

The project already works with this in mind. However, there are no tests for this yet. Maybe we can use importlib to remove an engine dependency and check if importing the TTS engine (without instantiating it) works.
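
One way to simulate a missing dependency is to place a None entry in sys.modules, which makes the import system raise ImportError for that name. The sketch below shows the idea with a stdlib module; blocked_module and can_import are hypothetical helpers, and a real test would block an engine dependency such as boto3 and then import the package under test.

```python
import contextlib
import importlib
import sys
from typing import Iterator


@contextlib.contextmanager
def blocked_module(name: str) -> Iterator[None]:
    """Temporarily make importing `name` raise ImportError."""
    missing = object()
    saved = sys.modules.get(name, missing)
    sys.modules[name] = None  # a None entry halts the import with ImportError
    try:
        yield
    finally:
        if saved is missing:
            del sys.modules[name]
        else:
            sys.modules[name] = saved


def can_import(name: str) -> bool:
    """Return True if `name` can be imported right now."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


with blocked_module("colorsys"):
    blocked_result = can_import("colorsys")   # import fails while blocked
restored_result = can_import("colorsys")      # import works again afterwards
```

Note that a package already cached in sys.modules is not re-imported, so a real test would also need to remove the package under test from sys.modules (or run in a fresh interpreter) before importing it inside the block.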

missing cred for Watson

Prerequisites

  • Can you reproduce the problem?
  • Are you running the latest version?
  • Did you check the documentation?
  • Did you perform a cursory search?

Description

client = WatsonClient(credentials=('SOMEKEY',
                      'https://gateway-lon.watsonplatform.net/text-to-speech/api'))
tts = WatsonTTS(client=WatsonClient())
tts.synth('<speak>some text</speak>', 'mysound.wav', format='wav')

This raises TypeError: WatsonClient.__init__() missing 1 required positional argument: 'credentials'

But looking at the docs, I can't see which credential I'm missing.
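
One plausible reading of the traceback is that it comes from the second, argument-less WatsonClient() call rather than from the first one. The sketch below reproduces that with stand-in classes (they are not the real tts_wrapper classes, and the URL is a placeholder) and shows that passing the already-created client avoids the error.

```python
class WatsonClient:
    """Stand-in with the same required argument as in the error message."""

    def __init__(self, credentials):
        self.credentials = credentials


class WatsonTTS:
    """Stand-in that just stores the client it is given."""

    def __init__(self, client):
        self.client = client


client = WatsonClient(credentials=("SOMEKEY", "https://example.invalid/text-to-speech/api"))

# Creating a second client without credentials triggers the TypeError.
try:
    WatsonTTS(client=WatsonClient())
except TypeError as exc:
    error_message = str(exc)

# Reusing the configured client does not.
tts = WatsonTTS(client=client)
```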

tts.get_voices('lang')

Prerequisites

  • Can you reproduce the problem?
  • Are you running the latest version?
  • Did you check the documentation?
  • Did you perform a cursory search?

For more information, see the contributing guide.

Description

Right now, to select a voice you have to know the provider's obscure voice name, region, etc. Google and Microsoft both have endpoints to get the voice list. Polly and Watson: not so sure.

Expected behavior:

voices = tts.get_voices('lang')
  • returns a dict with the name, language code, and sex of each voice

Snags

Because it looks like not all clients have an endpoint, this may be a bad idea. But then maybe we could ship cached dicts, built from the information available online, with each release?
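
The cached variant could look roughly like the sketch below. Everything here is an assumption: get_voices is the proposed function, not an existing one, and the cache holds only the two Polly voices mentioned in the README for illustration.

```python
from typing import Dict, List, Optional, Sequence

# Illustrative cache only; a real one would be generated per provider.
CACHED_VOICES: List[Dict[str, str]] = [
    {"name": "Joanna", "lang": "en-US", "gender": "Female"},
    {"name": "Camila", "lang": "pt-BR", "gender": "Female"},
]


def get_voices(
    lang: Optional[str] = None,
    voices: Sequence[Dict[str, str]] = CACHED_VOICES,
) -> List[Dict[str, str]]:
    """Return cached voices, optionally filtered by a language code prefix."""
    if lang is None:
        return list(voices)
    wanted = lang.lower()
    return [voice for voice in voices if voice["lang"].lower().startswith(wanted)]
```

Matching on a prefix lets a caller pass either a full code such as 'pt-BR' or just 'pt'.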
