spandanm110 / ame

This project forked from expl0dingcat/ame

State-of-the-art, multi-modal virtual assistant framework powered by LLaMA. Ame is not complete and is under active development.

Home Page: https://discord.gg/S6h8XYsuZt

License: MIT License

Python 100.00%

ame's Introduction

Setting a new standard for local virtual assistants 💧

Meet Ame, the most powerful virtual assistant framework, powered by cutting-edge technology. Ame is a feature-rich, multi-modal, open-source virtual assistant framework (API) designed to run entirely locally. It leverages LLaMA to provide personalized and intuitive interaction. Ame's server is designed to run on enterprise-grade or high-end consumer-grade hardware (for example an RTX 3090, 24 GB VRAM or more). You can run Ame on lower-end consumer hardware by using more aggressive quantization, a smaller model, and/or by disabling TTS, STT, and/or vision. Split computing, which will allow the compute workload to be split across multiple devices, is planned for v2. See announcements for updates and more information.

Join the discord for frequent dev updates, discussion, community involvement and more.

Disclaimer ⚠️

Ame is incomplete and is being developed by me alone, so expect progress to be slow. Refer to the progress section of the readme for more information. Audio generation is functional, but the client and server cannot yet exchange audio files; this has not been implemented.

Announcements 📢

  • [2023-11-06] Seeking people willing to help debug and polish Ame and to help write documentation. Please contact me if you are interested!
  • [2023-11-01] Long time no see! Burnout sucks! I've resumed working on Ame, but at a slower pace. I'm currently finishing up features, then will spend a month or two polishing, after which a beta version can hopefully be released. No promises on a release date or feature timeline, though.

Overview 📖

Key features 🚀

Customizable Modules: Ame's modular design allows for easy customization and extensibility. Each module serves a specific function, such as managing calendars, providing updates, or assisting with personal tasks. Ame adapts to you. Developers can create their own modules or modify existing ones to tailor Ame's capabilities to their specific requirements.

Text-to-Speech (TTS) and Speech-to-Text (STT): Ame's TTS and STT capabilities enable natural and effortless communication. STT is powered by OpenAI's whisper and TTS is powered by Suno's bark.

Discord & Telegram Integration: Ame seamlessly integrates with both Discord and Telegram, allowing you to interact with it through text-based messaging and voice notes. Discord and Telegram provide a familiar and convenient way to interact with Ame, enabling efficient communication and access to its full range of functionalities.

Open-Source: Ame is entirely open-source. This allows for knowledge sharing and the continuous improvement of Ame while contributing to the open-source community, democratizing ML research in the process.

Locally Run and Privacy-Focused: Ame prioritizes user privacy and data control by operating entirely on the user's local machine or a user-controlled server.

Long-term Memory: Ame utilizes a vector database that optimizes memory storage and retrieval, enabling Ame to access data that goes beyond the context limit of its model.
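The retrieval idea behind long-term memory can be illustrated with a small sketch. This is not Ame's actual memory code: `MemoryStore`, `embed`, and `cosine` are hypothetical names, and the bag-of-words vector is a toy stand-in for a real sentence-embedding model and vector database. It only shows the general pattern of storing text as vectors and pulling back the closest matches so that just those snippets need to fit in the model's context.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """Hypothetical in-memory vector store; not Ame's real memory class."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        """Store a text together with its vector."""
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return up to k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = MemoryStore()
memory.add("The user's favourite tea is jasmine.")
memory.add("The user has a dentist appointment on Friday.")
memory.add("The user's cat is named Mochi.")

# Only the retrieved snippets are placed in the prompt, not the whole history.
question = "What tea does the user like?"
recalled = memory.search(question, k=1)
prompt = "Relevant memories:\n" + "\n".join(recalled) + "\n\nUser: " + question
```

A real setup would likely swap `embed` for a model from sentence-transformers (which is in the install requirements) and keep the vectors in a persistent store.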

Full feature list

* means the feature is yet to be implemented; see progress. This list does not include features that may be coming in v2.

  • Support for any LLaMA GGML/GGUF (via llama.cpp)
  • Developer-friendly module platform
  • Long-term memory
  • Full customizability
  • High-quality text-to-speech (via bark)
  • Accurate speech-to-text (via whisper)
  • Smart context limit management
  • Pre-built server and client
  • Remote server command
  • Client UI*
  • Telegram integrations*
  • Discord integrations*
  • Fully open-source
  • Easy-to-use API
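As a rough illustration of what context limit management can involve, the sketch below keeps only the most recent messages that fit a token budget. It is an assumption about the general technique, not a description of how Ame does it: `trim_history` and `count_tokens` are hypothetical helpers, and counting whitespace-separated words is a crude substitute for the model's real tokenizer.

```python
def count_tokens(text: str) -> int:
    """Crude estimate: whitespace-separated words (a real tokenizer would differ)."""
    return len(text.split())


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined estimate fits the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "User: hi there",
    "Ame: hello, how can I help?",
    "User: remind me to water the plants",
    "Ame: sure, when should I remind you?",
]

# With a budget of 14 estimated tokens, only the two newest messages fit.
window = trim_history(history, budget=14)
```

Messages that fall out of the window are the kind of material that long-term memory would be expected to cover.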

Usage ⚙️

Install requirements

pip3 install sentence-transformers
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
set CMAKE_ARGS="-DLLAMA_CUBLAS=on"
set FORCE_CMAKE=1
pip3 install llama-cpp-python --no-cache-dir
pip3 install openai-whisper
pip3 install pyaudio
pip3 install aiohttp
pip3 install keyboard
pip3 install transformers
pip3 install git+https://github.com/suno-ai/bark.git
  • You must have CUDA 11.8
  • You must use torch (and its associated packages) version 2.0.0 or newer; older versions will break
  • If you need to reinstall torch, purge it before doing so
  • Ame was designed on Python 3.10.11
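The requirements above can be checked with a short script before starting the server. This is a sketch, not part of Ame: `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical helpers. The torch attributes it reads (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`) are standard PyTorch, and torch is imported only if it is installed.

```python
import importlib.util
import re
import sys


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '2.0.1+cu118' into (2, 0, 1); the local build tag after '+' is ignored."""
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at non-numeric components such as 'dev2023'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version: str, minimum: str) -> bool:
    """True if version >= minimum, padding the shorter one with zeros."""
    a, b = version_tuple(version), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> list[str]:
    """Collect warnings; an empty list means nothing looked wrong."""
    warnings = []
    if sys.version_info[:2] != (3, 10):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        warnings.append(f"Python {found} in use; Ame was designed on 3.10.11")
    if importlib.util.find_spec("torch") is None:
        warnings.append("torch is not installed")
        return warnings
    import torch  # imported lazily so the check itself runs without torch

    if not meets_minimum(torch.__version__, "2.0.0"):
        warnings.append(f"torch {torch.__version__} is older than 2.0.0")
    if torch.version.cuda != "11.8":
        warnings.append(f"torch was built for CUDA {torch.version.cuda}, expected 11.8")
    if not torch.cuda.is_available():
        warnings.append("CUDA is not available to torch")
    return warnings
```

Calling `check_environment()` returns a list of warning strings, which can be printed before launching server.py.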

Server/client

Move server.py (from interfaces/server-client/) to the root folder, then run:

python server.py

To access the server locally, make sure Local = True in client.py (interfaces/server-client/). To access it externally, modify the base URL and set Local = False. Then run:

python client.py

API

Ame's API allows for programmatic use of Ame's entire system. Here is an example:

from controller import controller

# Initialize the controller, see documentation for more info.
# A separate variable name avoids shadowing the imported class.
ame = controller()

# Generate text based on the input "Hello, World!"
response = ame.generate_response("Hello, World!")

For a more advanced example, see server.py.

Progress (v1)

🔴 Planned 🟡 In progress 🟢 Finished

Core

Component Status
Speech-to-text 🟢
Text-to-speech 🟢
Long-term memory 🟢
Primary controller 🟢
Module handler 🟢
Server/client interface 🟢

Ext

Component Status
Client UI 🔴
Discord interface 🔴
Telegram interface 🔴
Documentation 🟡

Plans for v2 🔵

As v1 is still in development, this section is subject to change. It currently contains features I wanted to include in v1 but don't have time for, as well as brand-new concepts that may or may not be implemented. If you would like to suggest features for v2, please feel free to contact me.

  • Voice identification
  • Web UI
  • Multi-memory banks
  • Passive listening
  • Extreme redundancy
  • Vision system
  • Edge TPU support
  • RVC (singing and possibly TTS)
  • Vtuber integrations (weeb)
  • Home Assistant (this is already detectable out of the box by the module system, but it's a large task to integrate)

The meaning behind "Ame" 💧

The name "Ame" originates from the Japanese word "雨" (pronounced ah-meh), which translates to "rain" in English. Like rain, Ame represents a refreshing and nourishing presence in your digital life. Just as raindrops bring life to the earth, Ame breathes life into your digital environment, providing support and efficiency.

Contributing 🤝

If you would like to contribute to the Ame project, please contact me. Ame is being developed by me, and me only. Any help is greatly appreciated.

Acknowledgements 🙏

Ame relies on third-party open-source software to function. This project would not have been possible without:

License ⚖️

Ame is released under the MIT License, which allows you to use, modify, and distribute the software freely. Please refer to the license file for more details.

ame's People

Contributors

expl0dingcat
