playht / pyht

PlayHT Python SDK -- Text-to-Speech Audio Streaming

Home Page: https://play.ht/

License: Apache License 2.0

Python 97.82% Makefile 2.18%

generative-ai sdk text-to-speech voice-ai

pyht's Introduction

PlayHT API SDK

pyht is a Python SDK for PlayHT's Text-to-Speech API. With pyht, you can convert text into high-quality audio streams in humanlike voices.

Features

Stream text-to-speech in real time.
Use prebuilt voices or custom voice clones.
Supports WAV, MP3, PCM, Mulaw, FLAC, and OGG audio formats.

Requirements

Python 3.8+
numpy
simpleaudio

Installation

You can install the pyht SDK using pip:

pip install pyht

Usage

You can use the pyht SDK by creating a Client instance and calling its tts method. Here's a simple example:

import os

from dotenv import load_dotenv  # provided by the python-dotenv package
from pyht import Client
from pyht.client import TTSOptions

# Load PLAY_HT_USER_ID and PLAY_HT_API_KEY from a local .env file, if present.
load_dotenv()

client = Client(
    user_id=os.getenv("PLAY_HT_USER_ID"),
    api_key=os.getenv("PLAY_HT_API_KEY"),
)
options = TTSOptions(voice="s3://voice-cloning-zero-shot/d9ff78ba-d016-47f6-b0ef-dd630f59414e/female-cs/manifest.json")
# client.tts yields the audio in chunks as it is generated.
for chunk in client.tts("Can you tell me your account email or, ah your phone number?", options):
    # do something with the audio chunk
    print(type(chunk))
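
The loop above only prints the type of each chunk. As a minimal sketch of doing something with the audio, the helper below writes a stream of chunks to a file. It assumes each chunk is a bytes object (this README does not state the chunk type), and save_audio_stream is a hypothetical name, not part of pyht.

```python
import os
import tempfile
from typing import Iterable


def save_audio_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to `path` as they arrive; return total bytes written.

    Assumption: every chunk is a bytes object. Empty chunks are skipped.
    """
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if not chunk:
                continue  # nothing to write for an empty chunk
            f.write(chunk)
            total += len(chunk)
    return total


if __name__ == "__main__":
    # Stand-in data so the sketch runs without credentials or network access.
    fake_chunks = [b"RIFF", b"", b"\x00\x01\x02"]
    out_path = os.path.join(tempfile.gettempdir(), "pyht_sketch_output.bin")
    print(save_audio_stream(fake_chunks, out_path))
```

With a real client, the same helper could be called as save_audio_stream(client.tts(text, options), "output.wav"), provided the chunks are bytes in the format you requested.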

For a more detailed example with command-line arguments and interactive mode, refer to the provided demo.

Command-Line Demo

You can run the provided demo from the command line.

Note: This demo depends on the following packages:

pip install numpy simpleaudio

Then run the demo with your credentials:

python demo/main.py --user YOUR_USER_ID --key YOUR_API_KEY --text "Hello from Play!"

Alternatively, you can run the demo in interactive mode:

python demo/main.py --user YOUR_USER_ID --key YOUR_API_KEY --interactive

In interactive mode, you can enter lines of text to generate and play audio on the fly. An empty line exits the interactive session.
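
The behaviour described above (one utterance per line, stop on an empty line) can be sketched as a small loop. This is not the demo's actual code: run_interactive and the speak callback are hypothetical names, and speak stands in for whatever generates and plays the audio.

```python
from typing import Callable, Iterable


def run_interactive(lines: Iterable[str], speak: Callable[[str], None]) -> int:
    """Call `speak` on each line until an empty line; return how many were spoken."""
    count = 0
    for line in lines:
        text = line.strip()
        if not text:
            break  # an empty line ends the session
        speak(text)
        count += 1
    return count
```

Passing sys.stdin as lines gives a terminal session; passing a list of strings makes the loop easy to test without audio hardware.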

Get an API Key

To get started with the pyht SDK, you'll need your API Secret Key and User ID. Follow these steps to obtain them:

1. Access the API Page: Navigate to the API Access page.
2. Generate Your API Secret Key:
   - Click the "Generate Secret Key" button under the "Secret Key" section.
   - Your API Secret Key will be displayed. Ensure you copy it and store it securely.
3. Locate Your User ID: Find and copy your User ID, which can be found on the same page under the "User ID" section.

Keep your API Secret Key confidential. It's crucial not to share it with anyone or include it in publicly accessible code repositories.
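
One way to keep the key out of source code is to read both values from environment variables and fail early if either is missing. The sketch below reuses the PLAY_HT_USER_ID and PLAY_HT_API_KEY names from the usage example; load_credentials is a hypothetical helper, not part of pyht.

```python
import os
from typing import Mapping, Tuple

# Variable names taken from the usage example in this README.
REQUIRED_VARS = ("PLAY_HT_USER_ID", "PLAY_HT_API_KEY")


def load_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (user_id, api_key) from `env`, or raise if either is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Report only the variable names, never the secret values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["PLAY_HT_USER_ID"], env["PLAY_HT_API_KEY"]
```

The returned pair can then be passed to Client(user_id=..., api_key=...) as in the usage example.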

pyht's Issues

Streaming from the output of an LLM in Python.

Is there any documentation, or could someone provide a code snippet, showing how to do input and output streaming with ChatGPT (or any LLM) in Python?

I've gone through the official docs, but they seem to cover only Node.js, not Python.
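
One possible approach, offered as a sketch and not as the SDK's documented method: buffer the LLM's token stream into sentences, then pass each sentence to client.tts as in the usage example. The helper below handles only the buffering step. sentences_from_tokens is a hypothetical name, and the sentence splitting is deliberately naive (it will also split on abbreviations such as "e.g. ").

```python
import re
from typing import Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by whitespace (naive heuristic).
_SENTENCE_END = re.compile(r"[.!?]\s")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Group a stream of text fragments into whole sentences as they complete."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            match = _SENTENCE_END.search(buffer)
            if match is None:
                break
            end = match.end()
            sentence = buffer[:end].strip()
            if sentence:
                yield sentence
            buffer = buffer[end:]
    # Flush whatever is left once the stream ends.
    tail = buffer.strip()
    if tail:
        yield tail
```

Each yielded sentence could then be synthesized with a loop such as `for chunk in client.tts(sentence, options)`, so audio for the first sentence starts before the LLM has finished responding.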

Support for `narrationStyle` in this SDK

Your API supports the narrationStyle parameter, but there is no way to provide it through the SDK. Are you planning to add this? Assuming the RPC endpoint already supports this parameter, it should be a simple enough change to the proto and the client code.

Specifying voice_engine

Does the API default to a voice_engine of "PlayHT2.0-Turbo"? Otherwise, is there a way in this SDK to specify which voice_engine to use?

I'm also curious about other parameters, such as text_guidance, which do not seem to be exposed in this SDK.

Thanks!

Stop running after done generating

Hi,
I'm using this code:

with open(f'{x}.wav', 'wb') as f:
    for chunk in client.tts("Can you tell me your account email or, ah your phone number?", options):
        if not chunk:
            break
        print(type(chunk))
        f.write(chunk)

However, the script does not terminate after the audio has finished generating. How can I detect when generation is done and stop?
Thank you!

cc @NCarrollPlay @mahmoudfelfel
