Auto-Llama-cpp: An Autonomous Llama Experiment

This is a fork of Auto-GPT with added support for running llama models locally through llama.cpp. It is more of a proof of concept than a finished tool. It's slow, and most of the time you're fighting either the too-small context window or model answers that aren't valid JSON. But sometimes it works, and then it's really quite magical what even such a small model comes up with. Obviously, don't expect GPT-4 brilliance here.
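One common way to cope with a small context window is to drop the oldest messages until the prompt fits. The sketch below illustrates that idea only; it is not necessarily how this fork handles it. The helper names `estimate_tokens` and `trim_history` are hypothetical, and the "about 4 characters per token" rule is a rough assumption rather than the model's real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def trim_history(system_prompt: str, history: list[str], max_tokens: int) -> list[str]:
    """Keep the system prompt plus the most recent messages that fit the budget."""
    budget = max_tokens - estimate_tokens(system_prompt)
    kept = []
    # Walk backwards so the newest messages are kept first.
    for message in reversed(history):
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()  # restore chronological order
    return [system_prompt] + kept
```

In practice the estimate should be replaced by a count from the model's actual tokenizer, since a heuristic can undershoot and still overflow the window.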

Supported Models


Since this uses llama.cpp under the hood, it should work with all models that llama.cpp supports. At the time of writing, these are:

  • LLaMA
  • Alpaca
  • GPT4All
  • Chinese LLaMA / Alpaca
  • Vigogne (French)
  • Vicuna
  • Koala

Model Performance (the experience so far)


Response Quality

So far I have tried

  • Vicuna-13b-4BIT
  • LLama-13B-4BIT

Overall, the Vicuna model performed much better than the original LLaMA model, both in answering in the required JSON format and in how much sense the answers made. I just couldn't get it to stop starting every answer with ### ASSISTANT. I am very curious to hear how well other models perform. The 7B models seemed to have problems grasping what the prompt asks of them, but I tried very little in this direction since their inference speed didn't seem to be much faster for me.
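A possible workaround for the stray ### ASSISTANT prefix and for replies that wrap the JSON in extra text is to post-process the reply before parsing it. The following is a minimal sketch using only the Python standard library; `parse_model_reply` and the prefix pattern are hypothetical names chosen for illustration, not code from this repository.

```python
import json
import re

# Assumed prefix shape: one or more "#", the word "assistant", optional colon.
_ROLE_PREFIX = re.compile(r"^\s*#+\s*assistant\s*:?\s*", re.IGNORECASE)


def parse_model_reply(reply: str):
    """Return the first JSON object found in `reply`, or None if there is none."""
    text = _ROLE_PREFIX.sub("", reply, count=1)
    decoder = json.JSONDecoder()
    # Try each "{" as a possible start of a JSON object and keep the first that parses.
    for match in re.finditer(r"\{", text):
        try:
            obj, _end = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    return None
```

This only rescues replies that contain a well-formed JSON object somewhere in the text. If the model emits malformed JSON, the function returns None and the caller still has to retry or re-prompt.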

Inference Speed

The biggest problem at the moment is inference speed. Because the agent prompts itself a lot, a few seconds of inference that would be acceptable in a chatbot scenario add up to minutes or more. Testing things like different prompts is a pain under these conditions.
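To see how per-call latency adds up over an agent run, it can help to time every inference call and keep a running total. This is a small sketch under assumptions: `InferenceTimer` is a hypothetical helper, and the call being timed is whatever function the agent uses to query the model.

```python
import time
from contextlib import contextmanager


class InferenceTimer:
    """Accumulates wall-clock time across many inference calls."""

    def __init__(self):
        self.calls = 0
        self.total_seconds = 0.0

    @contextmanager
    def measure(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Count the call even if it raised, so failures show up in the totals.
            self.calls += 1
            self.total_seconds += time.perf_counter() - start

    @property
    def average_seconds(self) -> float:
        return self.total_seconds / self.calls if self.calls else 0.0
```

Wrapping each model call in `with timer.measure():` and printing `timer.total_seconds` at the end of a run gives a simple baseline for comparing models or prompt variants.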

Discussion

Feel free to add your thoughts and experiences in the discussion area. What models did you try? How well did they work out for you?

Future Plans


  1. Add GPU Support via GPTQ
  2. Improve Prompts
  3. Remove external API support (this is supposed to be a completely self-contained agent)
  4. Add support for Open Assistant models

auto-llama-cpp's People

Contributors

andrescdo, awmorgan, billschumacher, blankey1337, blankster, coditamar, coley-angel, dr33dm3, eltociear, fabricehong, hunteraraujo, jcp, keenborder786, kinance, malikmalna, monkee1337, onekum, pratiksinghchauhan, rhohndorf, richbeales, russellocean, slavakurilyak, sma-das, sweetlilmre, taytay, thomasfifer, tooktheredbean, torantulino, wladastic, yousefissa
