This project is a fork of dgdev91/auto-gpt.
License: MIT License


Auto-GPT: An Autonomous GPT-4 Experiment

This is a fork of Auto-GPT (https://github.com/Significant-Gravitas/Auto-GPT) with support for local LLaMA models. At present, it is an experimental proof of concept.

Setup Guide

Set up the local API server

gpt-llama.cpp is an API wrapper around llama.cpp. It runs a local API server that simulates OpenAI's API GPT endpoints but uses local llama-based models to process requests.

  1. Set up llama.cpp following the instructions in its README:
git clone https://github.com/ggerganov/llama.cpp

cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release

# install Python dependencies
python3 -m pip install -r requirements.txt
  2. Install and run gpt-llama.cpp locally:
git clone https://github.com/keldenl/gpt-llama.cpp.git
cd gpt-llama.cpp

# install the required dependencies
npm install

# start the server
npm start

For more details, refer to the gpt-llama.cpp README.
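Once the server is running, any OpenAI-compatible client can talk to it. As a minimal sketch (the port and model path below are assumptions matching the `.env` example later in this guide; gpt-llama.cpp's convention of passing the local model path in the API-key field is described in its README), a chat-completion request could be assembled like this:

```python
import json

# Assumed values; adjust to your setup
BASE_URL = "http://localhost:443/v1"
MODEL_PATH = "../llama.cpp/models/Vicuna-7B/ggml-model-q4_0.bin"

def build_chat_request(prompt: str) -> tuple[dict, dict]:
    """Build headers and an OpenAI-style chat-completion payload.

    gpt-llama.cpp reuses the Authorization header to receive the
    local model path instead of a real OpenAI key.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {MODEL_PATH}",
    }
    payload = {
        "model": "gpt-3.5-turbo",  # name is a placeholder; the local model answers
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_chat_request("Hello, local LLaMA!")
print(json.dumps(payload, indent=2))
# Sending it requires the server from the previous step to be up:
# requests.post(f"{BASE_URL}/chat/completions", headers=headers, json=payload)
```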

Download Models

  • LLaMA-7B-q4
  • Vicuna-7B-q4

I have tried the above models so far. You can download the original LLaMA weights following the instructions here, or obtain them through other channels: meta-llama/llama#149

For Vicuna, you can add its delta to the original LLaMA weights to obtain the Vicuna weights (instructions here).
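Applying the delta is element-wise addition: each Vicuna tensor equals the corresponding LLaMA tensor plus the published delta. A toy illustration, with plain Python lists standing in for the real checkpoint tensors:

```python
def apply_delta(base: list[float], delta: list[float]) -> list[float]:
    """Recover target weights element-wise: target = base + delta."""
    if len(base) != len(delta):
        raise ValueError("tensor shapes must match")
    return [b + d for b, d in zip(base, delta)]

llama_w = [0.10, -0.20, 0.05]       # toy "LLaMA" weights
vicuna_delta = [0.02, 0.01, -0.03]  # toy published delta
vicuna_w = apply_delta(llama_w, vicuna_delta)
print([round(w, 2) for w in vicuna_w])  # [0.12, -0.19, 0.02]
```

The real script does the same thing tensor-by-tensor over the sharded checkpoints, which is why it needs both the base weights and the delta on disk at once.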

  1. Convert your downloaded LLaMA-7B weights to the Hugging Face Transformers format using the following script (source):
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights --model_size 7B --output_dir /output/path
  2. Get the Vicuna-7B weights by applying the delta (detailed instructions):
python3 -m fastchat.model.apply_delta \
    --base /path/to/llama-7b \
    --target /output/path/to/vicuna-7b \
    --delta path/to/vicuna-7b-delta-v1.1
  3. Quantize the Vicuna-7B model to 4 bits using llama.cpp:
# obtain the original LLaMA model weights and place them in ./models
ls ./models
65B 30B 13B 7B Vicuna-7B tokenizer_checklist.chk tokenizer.model

# install Python dependencies
python3 -m pip install -r requirements.txt

# convert the 7B model to ggml FP16 format
python3 convert.py models/Vicuna-7B/

# quantize the model to 4-bits (using method 2 = q4_0)
./quantize ./models/Vicuna-7B/ggml-model-f16.bin ./models/Vicuna-7B/ggml-model-q4_0.bin 2

# run the inference
./main -m ./models/Vicuna-7B/ggml-model-q4_0.bin -n 128
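The trailing `2` selects the q4_0 method. Conceptually, q4_0 splits each tensor into small blocks and stores one float scale per block plus 4-bit signed integers. The sketch below illustrates the idea only; llama.cpp's actual format packs 32-element blocks and differs in detail:

```python
def q4_0_quantize(block: list[float]) -> tuple[float, list[int]]:
    """Quantize one block to signed 4-bit ints (-8..7) plus a float scale."""
    amax = max(abs(x) for x in block) or 1.0
    scale = amax / 7.0  # map the largest magnitude to +/-7
    q = [max(-8, min(7, round(x / scale))) for x in block]
    return scale, q

def q4_0_dequantize(scale: float, q: list[int]) -> list[float]:
    """Approximate reconstruction of the original block."""
    return [scale * v for v in q]

weights = [0.7, -0.35, 0.1, 0.0]
scale, q = q4_0_quantize(weights)
restored = q4_0_dequantize(scale, q)
```

Shrinking each weight from 16 bits to roughly 4 (plus the per-block scale) is what cuts the 7B model from about 13 GB in FP16 to around 4 GB, at a modest quality cost.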

Now, the local model is ready to go!

Install Auto-GPT (Guide)

  1. Clone this Auto-GPT fork, which is based on DGdev91's PR #2594:
git clone https://github.com/Neronjust2017/Auto-GPT-LOCAL
  2. Install the requirements and create a .env file:
    pip install -r requirements.txt
    cp .env.template .env
  3. Edit the .env file:
OPENAI_API_BASE_URL=http://localhost:443/v1

# you can find the proper value for different LLaMA models at https://huggingface.co/shalomma/llama-7b-embeddings#quantitative-analysis
EMBED_DIM=4096
# gpt-llama.cpp reads the local model path from the API key field
OPENAI_API_KEY=../llama.cpp/models/vicuna/13B/ggml-vicuna-unfiltered-13b-4bit.bin
  4. Run Auto-GPT:
# On Linux or Mac:
./run.sh
# On Windows:
.\run.bat

# or with python
python -m autogpt
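One note on the EMBED_DIM setting above: Auto-GPT's memory backends store fixed-length embedding vectors, so the configured dimension has to match the hidden size of the model producing the embeddings (4096 for the 7B LLaMA variants, per the Hugging Face page linked in the .env comment). A hypothetical sketch of the kind of check this implies (validate_embedding is illustrative, not an actual Auto-GPT function):

```python
EMBED_DIM = 4096  # hidden size of LLaMA-7B / Vicuna-7B

def validate_embedding(vec: list[float], dim: int = EMBED_DIM) -> list[float]:
    """Reject vectors whose length differs from the configured dimension."""
    if len(vec) != dim:
        raise ValueError(f"expected a {dim}-dim embedding, got {len(vec)}")
    return vec

validate_embedding([0.0] * EMBED_DIM)  # accepted
```

If EMBED_DIM disagrees with the model, every vector written to the memory store has the wrong shape and similarity search breaks, so set it before the first run.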
