able's Introduction

Shared Voice Interface

This project provides a system-wide, shared speech-to-text engine that lets you:

  1. Add voice commands to any desktop app, either through an extension, add-on, or plugin (e.g. Code) or by integrating the engine into your own app (e.g. TalkGPT).
  2. Run your automation scripts and command-line commands with personalized voice commands (a JSON file links each voice command to the program it executes).
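
To illustrate how a JSON file can link voice commands to programmatic execution, here is a minimal Python sketch. The JSON schema, the phrases, and the function names `load_commands` and `resolve` are hypothetical and chosen only for illustration; the actual file layout used by able may differ.

```python
import json
import shlex

# Hypothetical mapping of spoken phrases to shell commands.
# The real file name and schema used by able may differ.
EXAMPLE_COMMANDS_JSON = """
{
  "open browser": "firefox",
  "list files": "ls -la"
}
"""


def load_commands(text):
    """Parse JSON text into a dict keyed by normalized (lowercase) phrases."""
    raw = json.loads(text)
    return {phrase.strip().lower(): command for phrase, command in raw.items()}


def resolve(transcript, commands):
    """Return the argv list for a transcript, or None if no phrase matches."""
    # Normalize the transcript: trim, lowercase, drop trailing punctuation.
    key = transcript.strip().lower().rstrip(".!?")
    command = commands.get(key)
    return shlex.split(command) if command is not None else None


commands = load_commands(EXAMPLE_COMMANDS_JSON)
print(resolve("List files.", commands))  # ['ls', '-la']
```

The returned list can be passed to `subprocess.run` to execute the command. Splitting with `shlex` avoids invoking a shell, which keeps a misheard phrase from being interpreted as shell syntax.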

Built-In Features

Real-time transcription - recording starts when a speaker begins talking and stops once the speaker has been silent for 0.5 s.
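
The start/stop behaviour described above can be sketched as a simple energy-based segmenter. This is only an illustration under stated assumptions: the 100 ms frame size, the energy threshold, and the function name `segment_utterances` are hypothetical, and the project may detect speech differently. Only the 0.5 s silence limit comes from the description above.

```python
def segment_utterances(frame_energies, frame_ms=100, threshold=0.01,
                       max_silence_ms=500):
    """Return (start, end) frame index pairs, end exclusive, per utterance.

    An utterance starts at the first frame whose energy reaches the
    threshold and ends once silence has lasted max_silence_ms.
    """
    max_silent_frames = max_silence_ms // frame_ms
    segments = []
    start = None   # index of the first frame of the current utterance
    silent = 0     # consecutive silent frames seen inside the utterance
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i
            silent = 0
        elif start is not None:
            silent += 1
            if silent >= max_silent_frames:
                # Close the utterance just after its last voiced frame.
                segments.append((start, i - silent + 1))
                start = None
                silent = 0
    if start is not None:
        # Input ended while still recording; trim trailing silence.
        segments.append((start, len(frame_energies) - silent))
    return segments


energies = [0, 0, 0.5, 0.6, 0, 0, 0.4, 0, 0, 0, 0, 0, 0.3, 0]
print(segment_utterances(energies))  # [(2, 7), (12, 13)]
```

In the example, the two-frame pause at indices 4 and 5 is shorter than 0.5 s, so it stays inside the first utterance, while the five-frame pause that follows closes it.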

Google Anything - say "google" followed by your search query (e.g. "Google what's the weather outside?").
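
As a rough sketch of how such a command could be handled, the snippet below turns a transcript that starts with "google" into a search URL. The function name `google_url` and the parsing rules are assumptions made for illustration, not the project's actual implementation.

```python
from urllib.parse import quote_plus


def google_url(transcript):
    """Return a Google search URL if the transcript starts with "google".

    Returns None when the trigger word or the query is missing.
    """
    words = transcript.strip().split(maxsplit=1)
    if len(words) < 2 or words[0].lower().strip(",.") != "google":
        return None
    # URL-encode the query so spaces and punctuation are safe in a URL.
    return "https://www.google.com/search?q=" + quote_plus(words[1])


print(google_url("Google what's the weather outside?"))
# https://www.google.com/search?q=what%27s+the+weather+outside%3F
```

The resulting URL can be opened in the default browser with the standard-library call `webbrowser.open(url)`.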

Architecture

Software Environment

Python 3.9.9
Node.js 18.9
OS: Linux (it will very likely work on macOS without any tweaks; on Windows, the bash scripts in ./universal-commands/scripts and anywhere in src will have to be converted into batch scripts)

Hardware Config used during Development and Execution

System RAM: 8 GB (2x4 GB) [Recommended: > 16 GB]
Graphics card: NVIDIA GeForce MX350 (Pascal architecture, CUDA compute capability 6.1, 2 GB VRAM)
Microphone: external Bluetooth headset with a microphone arm (recommended). Avoid using your laptop's built-in microphone.

Installation

git clone git@github.com:UmangRajpara13/able.git
cd ./able/listen
python3 -m venv venv
source "venv/bin/activate"
pip install -r requirements.txt
deactivate
cd ..
npm install

The project uses Whisper by OpenAI, which requires the command-line tool ffmpeg to be installed on your system. It is available from most package managers:

# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
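
To confirm that ffmpeg is reachable before starting the engine, a small check like the following can help. It is a minimal sketch that uses only the Python standard library; the helper name `find_tool` is hypothetical and not part of the project.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable on PATH, or None if absent."""
    return shutil.which(name)


path = find_tool("ffmpeg")
if path is None:
    print("ffmpeg was not found on PATH; install it with your package manager.")
else:
    print(f"ffmpeg found at {path}")
```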

Run

In the first terminal window, run:

npm run engine

and in a second terminal window, run:

npm run listen

Note: avoid using your laptop's built-in microphone; an external headset with a microphone is recommended.

