ElevenLabs S4TS

Speech to Text, Text to Speech - STTTTS - S4TS

ElevenLabs S4TS is a PySide6 (Qt) application that performs speech-to-text and then text-to-speech using ElevenLabs. Speech is transcribed with OpenAI’s Whisper automatic speech recognition (ASR) model.

At startup, the application uses the whisper-base model for faster audio transcription. If your hardware supports CUDA, you can switch to whisper-medium by checking Use Medium Model. ElevenLabs S4TS automatically uses CUDA when your hardware supports it and your PyTorch installation was built with CUDA support.
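The selection logic described above could be sketched roughly as follows. This is a minimal illustration, not the application's actual code: the helper names `pick_whisper_config` and `load_whisper_model` are hypothetical, and the sketch assumes the Use Medium Model checkbox simply toggles between the `base` and `medium` Whisper checkpoints without being restricted to CUDA machines.

```python
def pick_whisper_config(cuda_available, use_medium):
    """Return (model_name, device) for Whisper.

    Hypothetical helper: `use_medium` stands in for the state of the
    "Use Medium Model" checkbox.
    """
    device = "cuda" if cuda_available else "cpu"
    model_name = "medium" if use_medium else "base"
    return model_name, device


def load_whisper_model(use_medium=False):
    """Load Whisper on the GPU when PyTorch reports CUDA support."""
    # Imported lazily so the helper above works without torch/whisper installed.
    import torch
    import whisper

    model_name, device = pick_whisper_config(torch.cuda.is_available(), use_medium)
    return whisper.load_model(model_name, device=device)
```

Keeping the decision in a small pure function makes it easy to check without a GPU; `load_whisper_model` is the only part that needs PyTorch and Whisper installed.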

How to Run ElevenLabs S4TS

Install Dependencies

  • Make sure Python 3.9 or later is installed

  • Create a conda or pip virtual environment

  • Activate the virtual environment

  • Install PyTorch by following the official PyTorch installation instructions

  • Install ElevenLabs S4TS dependencies

    # Pip
    pip install -U -r requirements.txt
    
    # Conda
    conda install pip
    pip install -U -r requirements.txt
    

Run the application

Once all of the dependencies are installed, run ui.py as follows (assuming the virtual environment is activated):

python3 ui.py

How to use ElevenLabs S4TS

  • First, you need an ElevenLabs plan. Any plan tier works. See the ElevenLabs website for the plans they offer.
  • Once you’re signed up, click your profile icon at the top left, click Profile, and copy your API key.
  • Paste your API key into the input field labeled API Key in the application window
  • Select your desired input and output devices
  • Select the desired ElevenLabs voice
  • Hold the Record button and speak
  • Once you release the button, the audio is transcribed using Whisper
  • After transcription, the text is sent to ElevenLabs through their API
  • The request returns audio data, which ElevenLabs S4TS plays through the selected output device
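The transcribe-then-synthesize steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not the application's code: the function names are hypothetical, and it assumes the public ElevenLabs REST text-to-speech endpoint is called directly with the standard library. The application may use a client library instead.

```python
import json
import urllib.request

# Public ElevenLabs REST API base URL.
API_BASE = "https://api.elevenlabs.io/v1"


def transcribe(model, audio_path):
    """Transcribe a recording with a loaded Whisper model.

    Whisper's `model.transcribe` returns a dict whose "text" key holds
    the full transcript.
    """
    return model.transcribe(audio_path)["text"].strip()


def build_tts_request(api_key, voice_id, text):
    """Build (but do not send) a text-to-speech request for one voice."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/text-to-speech/{voice_id}",
        data=payload,
        headers={
            "xi-api-key": api_key,  # the key copied from your profile
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def synthesize(api_key, voice_id, text):
    """Send the request and return the raw audio bytes from the response."""
    with urllib.request.urlopen(build_tts_request(api_key, voice_id, text)) as resp:
        return resp.read()
```

Playing the returned bytes through the selected output device depends on the audio backend in use, so that step is left out of the sketch.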

Future plans

  • Package application
  • Add ability to voice clone using mic
