janlunge / orbit


A modular platform to build voice based LLM Assistants


Orbit

A modular platform for building a voice-based LLM assistant

services

🎤 audio streaming via mqtt (audio_satelite)
❗️ Hotword detection with porcupine
🎧 whisper speech recognition
⚙️ Command service for custom executable commands
🧠 LLM AI integration with OpenAI or local inference (llama.cpp, kobold.cpp, mistral via ollama)
💬 TTS via ElevenLabs, pyttsx3 or the macOS say command

what is this?

Build your own Jarvis or Alexa/Siri/Google Assistant with this modular platform. A microphone streamer sends audio via MQTT. When you say the hotword/wakeword atlas, the hotword module triggers the hotword event and the audio is streamed to a whisper speech recognition module, which returns the text. The text is sent to an AI module, which returns a response. The text is then sent to a command module, which executes the command, and the response is sent to a TTS module, which speaks it.
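The flow described above can be sketched as a chain of stages. This is only an illustration, not the project's actual code: run_pipeline and the stand-in stage functions are hypothetical names, the MQTT transport is left out, and it assumes the command module receives the AI response (the description does not say which text it gets).

```python
from typing import Callable, List, Optional


def run_pipeline(
    audio: bytes,
    detect_hotword: Callable[[bytes], bool],
    transcribe: Callable[[bytes], str],
    ask_ai: Callable[[str], str],
    run_command: Callable[[str], None],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Pass one chunk of audio through the stages.

    Returns the AI response, or None if the hotword was not detected.
    """
    if not detect_hotword(audio):
        return None
    text = transcribe(audio)      # speech recognition (whisper in orbit)
    response = ask_ai(text)       # AI module
    run_command(response)         # command module (assumed to get the response)
    speak(response)               # TTS module
    return response


# Stand-in stages so the sketch runs without a microphone, broker, or API keys.
spoken: List[str] = []
executed: List[str] = []

result = run_pipeline(
    audio=b"atlas what time is it",
    detect_hotword=lambda a: a.startswith(b"atlas"),
    transcribe=lambda a: a.decode()[len("atlas"):].strip(),
    ask_ai=lambda text: f"You asked: {text}",
    run_command=executed.append,
    speak=spoken.append,
)
print(result)
```

Passing each stage in as a plain function mirrors the modular idea: any stage can be swapped (for example a different TTS backend) without touching the others.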

This tool is mainly built for use on macOS and Linux, but it should work on Windows too with a bit of configuration.

TODOs:

Current token limits make function calling not really feasible yet, but in the near future you will be able to control your computer or other API apps with just your voice. AI will be the interface between you and your computer. Get in now and be ready for the future!

requirements

an MQTT server

on macOS install one with brew install mosquitto, then manage it with brew services start mosquitto and brew services stop mosquitto

ffmpeg for the whisper speech recognition
a working pyaudio installation
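A quick way to see whether these requirements are present is a small check script. This is a sketch that is not part of the repository: check_requirements is a hypothetical helper, and it assumes the broker is a local mosquitto binary on your PATH (a remote MQTT server, or a Homebrew install whose sbin directory is not on PATH, will show up as missing even though it works).

```python
import importlib.util
import shutil
from typing import Dict


def check_requirements() -> Dict[str, bool]:
    """Report which of the listed requirements can be found locally."""
    return {
        # Local MQTT broker binary; an assumption, the broker may run elsewhere.
        "mosquitto": shutil.which("mosquitto") is not None,
        # ffmpeg is needed by the whisper speech recognition.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # True if the pyaudio module is importable in this interpreter.
        "pyaudio": importlib.util.find_spec("pyaudio") is not None,
    }


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Run it inside the same environment (for example the poetry shell) that will run main.py, since pyaudio is checked against the current interpreter.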

optional

an OpenAI API key in .env named OPENAI_API_KEY (if you use ChatGPT in the ai.py file)
a Porcupine hotword model and access key
an ElevenLabs API key for TTS

Setup

For the first run, use sh run.sh --setup; after that, just run sh run.sh -- poetry

get a poetry shell with poetry shell
install the dependencies with poetry install
then run python3 main.py to start the program

on macOS install pyaudio support with

xcode-select --install
brew remove portaudio
brew install portaudio
pip3 install pyaudio

Notes

Local models work best in simple mode, as most of them do not work well with the reasoning chains in LangChain and will produce nonsense. OpenAI works very well as an agent in advanced mode.

training an intent model:

in the ludwig folder do ludwig train --dataset sequence_tags.csv --config config.yaml

Project Structure

AI Providers

Ollama (recommended)

start the ollama app and set the AI_PROVIDER to 'ollama' in the .env file

Kobold.cpp

Compile kobold.cpp following its instructions, then run a model of your choice with the following command:

python3 koboldcpp.py ~/Downloads/wizard-vicuna-13b-uncensored-superhot-8k.ggmlv3.q4_K_M.bin 8888 --stream --contextsize 8192 --unbantokens --threads 8 --usemlock

Set AI_PROVIDER to 'kobold' in the .env file and AI_API_URL to the address of the kobold server including the port (in this case https://localhost:8888).
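The provider settings above can be sanity-checked before starting the assistant. The sketch below is an assumption about how such a check might look, not orbit's own loader: provider_settings is a hypothetical helper that takes any mapping of environment variables (for example os.environ after your .env has been loaded) and only knows the rule stated in this README, namely that 'kobold' needs AI_API_URL. Other provider values are passed through unchecked.

```python
from typing import Dict, Mapping, Optional


def provider_settings(env: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Validate the AI provider variables described in this README."""
    # Strip whitespace and surrounding quotes in case the .env value is quoted.
    provider = env.get("AI_PROVIDER", "").strip().strip("'\"")
    if not provider:
        raise ValueError("AI_PROVIDER is not set")

    api_url = env.get("AI_API_URL") or None
    if provider == "kobold" and api_url is None:
        raise ValueError("AI_PROVIDER is 'kobold' but AI_API_URL is not set")

    return {"provider": provider, "api_url": api_url}


# Example values only; the URL is the one used in the kobold.cpp section.
example = provider_settings(
    {"AI_PROVIDER": "kobold", "AI_API_URL": "https://localhost:8888"}
)
print(example)
```

Taking a mapping instead of reading os.environ directly keeps the check easy to try with example values before touching a real .env file.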

TODO:

pull
install pyaudio via these commands:

xcode-select --install
brew remove portaudio
brew install portaudio
pip3 install pyaudio

install mosquitto via brew and start it
set PORCUPINE_ACCESS_KEY in the .env file (see https://picovoice.ai/platform/porcupine/)
make sure to select your favorite wakeword in the wakewords folder or create one yourself at : link here
if you are using macOS system python instead of homebrew, refer to this: urllib3/urllib3#3020
make sure to install https://github.com/jmorganca/ollama
run sh run.sh --setup
