pyht's Introduction

PlayHT API SDK

pyht is a Python SDK for PlayHT's Text-to-Speech API. With pyht, you can convert text into high-quality audio streams in humanlike voices.

Features

  • Stream text-to-speech in real-time.
  • Use prebuilt voices or custom voice clones.
  • Supports WAV, MP3, PCM, Mulaw, FLAC, and OGG audio formats.

Requirements

  • Python 3.8+
  • numpy
  • simpleaudio

Installation

You can install the pyht SDK using pip:

pip install pyht

Usage

You can use the pyht SDK by creating a Client instance and calling its tts method. Here's a simple example:

import os

from dotenv import load_dotenv  # provided by the python-dotenv package
from pyht import Client
from pyht.client import TTSOptions

# Load PLAY_HT_USER_ID and PLAY_HT_API_KEY from a local .env file, if present.
load_dotenv()

client = Client(
    user_id=os.getenv("PLAY_HT_USER_ID"),
    api_key=os.getenv("PLAY_HT_API_KEY"),
)
options = TTSOptions(voice="s3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json")
for chunk in client.tts("Can you tell me your account email or, ah your phone number?", options):
    # do something with the audio chunk
    print(type(chunk))

For a more detailed example with command-line arguments and interactive mode, refer to the provided demo.
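
As a minimal sketch of "doing something" with each chunk, the helper below writes any iterable of byte chunks to a file. The names `save_chunks` and `fake_tts` are hypothetical and not part of pyht. With the real SDK you would pass the result of `client.tts(text, options)` instead, assuming each chunk is a `bytes` object.

```python
from typing import Iterable, Iterator


def save_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write each audio chunk to `path` and return the total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            total += len(chunk)
    return total


def fake_tts() -> Iterator[bytes]:
    """Hypothetical stand-in for client.tts(...); yields three small chunks."""
    yield b"RIFF"
    yield b"\x00\x01"
    yield b"\x02\x03"
```

Calling `save_chunks(fake_tts(), "out.bin")` writes the three chunks back to back. The loop ends on its own once the iterator is exhausted, so no explicit end-of-stream check is needed.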

Command-Line Demo

You can run the provided demo from the command line.

Note: The demo depends on numpy and simpleaudio. Install them first:

pip install numpy simpleaudio

Then run the demo, passing the text to speak:

python demo/main.py --user YOUR_USER_ID --key YOUR_API_KEY --text "Hello from Play!"

Alternatively, you can run the demo in interactive mode:

python demo/main.py --user YOUR_USER_ID --key YOUR_API_KEY --interactive

In interactive mode, each line of text you enter is converted to audio and played immediately. Entering an empty line exits the session.
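
The read-until-empty-line behavior described above can be sketched as follows. This is not the demo's actual code: `interactive_loop` and the `speak` callback are hypothetical names, and `speak` stands in for whatever generates and plays the audio.

```python
from typing import Callable, TextIO


def interactive_loop(stream: TextIO, speak: Callable[[str], None]) -> int:
    """Pass each line from `stream` to `speak`; stop at the first empty line.

    Returns the number of lines spoken. Whitespace-only lines count as empty.
    """
    count = 0
    for line in stream:
        text = line.strip()
        if not text:  # an empty line ends the session
            break
        speak(text)
        count += 1
    return count
```

For a real session you would pass `sys.stdin` as `stream` and a function that calls the TTS client as `speak`.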

Get an API Key

To get started with the pyht SDK, you'll need your API Secret Key and User ID. Follow these steps to obtain them:

  1. Access the API Page: Navigate to the API Access page.

  2. Generate Your API Secret Key:

    • Click the "Generate Secret Key" button under the "Secret Key" section.
    • Your API Secret Key will be displayed. Copy it and store it securely.

  3. Locate Your User ID: Find and copy your User ID, which is on the same page under the "User ID" section.

Keep your API Secret Key confidential. Do not share it with anyone or commit it to a publicly accessible code repository.
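
One way to keep the key out of source code is to read it from environment variables and fail early when it is missing. The sketch below reuses the variable names from the usage example (`PLAY_HT_USER_ID`, `PLAY_HT_API_KEY`); the helper `load_credentials` is hypothetical and not part of pyht.

```python
import os
from typing import Mapping, Tuple


def load_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (user_id, api_key) from `env`, or raise if either is missing."""
    user_id = env.get("PLAY_HT_USER_ID")
    api_key = env.get("PLAY_HT_API_KEY")
    pairs = (("PLAY_HT_USER_ID", user_id), ("PLAY_HT_API_KEY", api_key))
    # Treat unset and empty values the same way.
    missing = [name for name, value in pairs if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return user_id, api_key
```

The returned pair can then be passed to `Client(user_id=..., api_key=...)` as in the usage example.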

pyht's People

Contributors

jnordberg, mahmoudfelfel, mtenpow, ncarrollplay, nkeenan38, rodms10

pyht's Issues

Streaming from the output of an LLM in Python.

Is there any documentation, or could someone provide a code snippet, showing how to do input and output streaming with ChatGPT (or any LLM) in Python?

I've gone through the official docs, but they seem to cover only Node.js, not Python.
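
One possible approach, sketched here without relying on any SDK feature, is to group the LLM's streamed text fragments into sentences and synthesize each sentence as soon as it completes. The function `sentences_from_tokens` is a hypothetical helper, and the boundary rule (`.`, `!` or `?` followed by whitespace) is a deliberately naive assumption that will mis-split abbreviations such as "e.g. this".

```python
import re
from typing import Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by at least one whitespace character.
_BOUNDARY = re.compile(r"[.!?]\s+")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from a stream of arbitrary text fragments."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            match = _BOUNDARY.search(buffer)
            if match is None:
                break
            # Keep the punctuation mark, drop the whitespace after it.
            sentence = buffer[: match.start() + 1].strip()
            buffer = buffer[match.end():]
            if sentence:
                yield sentence
    tail = buffer.strip()
    if tail:  # flush whatever is left when the stream ends
        yield tail
```

Each yielded sentence could then be passed to `client.tts(sentence, options)` and the resulting chunks played in order. The token source would be the streaming response of whichever LLM client you use.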

Support for `narrationStyle` in this SDK

Your API supports the narrationStyle parameter, but there is no way to provide it through the SDK. Are you planning to add this? Assuming the RPC endpoint already supports the parameter, it should be a simple enough change to the proto and the client code.

Specifying voice_engine

Does the API default to a voice_engine of "PlayHT2.0-Turbo"? Otherwise, is there a way in this SDK to specify which voice_engine to use?

Also curious about other parameters like text_guidance which do not seem to be exposed in this SDK.

Thanks!

Stop running after done generating

Hi,
I'm using this code:

with open(f'{x}.wav', 'wb') as f:
    for chunk in client.tts("Can you tell me your account email or, ah your phone number?", options):
        if not chunk:
            break
        print(type(chunk))
        f.write(chunk)

However, the script does not terminate after the audio has finished generating. How can I detect when generation is done and stop?
Thank you!

cc @NCarrollPlay @mahmoudfelfel
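
A `for` loop over a generator ends by itself once the generator is exhausted, so the `if not chunk: break` check should not be needed to detect the end of the audio. If the process still does not exit, one possible cause is a resource the client keeps open in the background. The sketch below assumes the client exposes a `close()` method, which should be verified against the installed pyht version. `FakeClient` and `synthesize_to_file` are hypothetical names used so the example runs offline.

```python
import contextlib


class FakeClient:
    """Hypothetical stand-in for pyht's Client; not the real class."""

    def __init__(self):
        self.closed = False

    def tts(self, text, options=None):
        # Yield a few byte chunks, then stop; the consuming loop ends on its own.
        for piece in (b"ab", b"cd", b"ef"):
            yield piece

    def close(self):
        # Assumption: the real client has an equivalent cleanup method.
        self.closed = True


def synthesize_to_file(client, text, path):
    """Write every chunk to `path`, then close the client even if an error occurs."""
    with contextlib.closing(client), open(path, "wb") as f:
        for chunk in client.tts(text):
            f.write(chunk)
```

`contextlib.closing` calls `client.close()` when the block exits, whether the loop finishes normally or raises.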
