This project is a fork of dgdev91/auto-gpt.
License: MIT License


Auto-GPT: An Autonomous GPT-4 Experiment

This is a fork of Auto-GPT (https://github.com/Significant-Gravitas/Auto-GPT) with support for local LLaMA models. At present, it is an experimental proof of concept.

Setup Guide

Set up the local API server

gpt-llama.cpp is an API wrapper around llama.cpp. It runs a local API server that simulates OpenAI's API GPT endpoints but uses local llama-based models to process requests.

  1. Set up llama.cpp following the instructions in its README:
git clone https://github.com/ggerganov/llama.cpp

cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release

# install Python dependencies
python3 -m pip install -r requirements.txt
  2. Install and run gpt-llama.cpp locally:
git clone https://github.com/keldenl/gpt-llama.cpp.git
cd gpt-llama.cpp

# install the required dependencies
npm install

# start the server
npm start

For more details, refer to the gpt-llama.cpp README.
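Once the server is running, any OpenAI-compatible client can talk to it. As a minimal sketch (the port and model path below are assumptions matching the `.env` example later in this guide; gpt-llama.cpp's convention of passing the local model path in the API-key field is described in its README), a chat-completion request could be assembled like this:

```python
import json

# Assumed values; adjust to your setup
BASE_URL = "http://localhost:443/v1"
MODEL_PATH = "../llama.cpp/models/Vicuna-7B/ggml-model-q4_0.bin"

def build_chat_request(prompt: str) -> tuple[dict, dict]:
    """Build headers and an OpenAI-style chat-completion payload.

    gpt-llama.cpp reuses the Authorization header to receive the
    local model path instead of a real OpenAI key.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {MODEL_PATH}",
    }
    payload = {
        "model": "gpt-3.5-turbo",  # name is a placeholder; the local model answers
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_chat_request("Hello, local LLaMA!")
print(json.dumps(payload, indent=2))
# Sending it requires the server from the previous step to be up:
# requests.post(f"{BASE_URL}/chat/completions", headers=headers, json=payload)
```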

Download Models

  • LLaMA-7B-q4
  • Vicuna-7B-q4

I have tried the above models so far. You can download the original LLaMA weights following the instructions here, or obtain them through other channels: meta-llama/llama#149

For Vicuna, you can add its delta to the original LLaMA weights to obtain the Vicuna weights (instructions here).
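Applying the delta is element-wise addition: each Vicuna tensor equals the corresponding LLaMA tensor plus the published delta. A toy illustration, with plain Python lists standing in for the real checkpoint tensors:

```python
def apply_delta(base: list[float], delta: list[float]) -> list[float]:
    """Recover target weights element-wise: target = base + delta."""
    if len(base) != len(delta):
        raise ValueError("tensor shapes must match")
    return [b + d for b, d in zip(base, delta)]

llama_w = [0.10, -0.20, 0.05]       # toy "LLaMA" weights
vicuna_delta = [0.02, 0.01, -0.03]  # toy published delta
vicuna_w = apply_delta(llama_w, vicuna_delta)
print([round(w, 2) for w in vicuna_w])  # [0.12, -0.19, 0.02]
```

The real script does the same thing tensor-by-tensor over the sharded checkpoints, which is why it needs both the base weights and the delta on disk at once.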

  1. Convert your downloaded LLaMA-7B weights to the Hugging Face Transformers format using the following script (source):
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights --model_size 7B --output_dir /output/path
  2. Get the Vicuna-7B weights by applying the delta (detailed instructions):
python3 -m fastchat.model.apply_delta \
    --base /path/to/llama-7b \
    --target /output/path/to/vicuna-7b \
    --delta path/to/vicuna-7b-delta-v1.1
  3. Quantize the Vicuna-7B model to 4 bits using llama.cpp:
# obtain the original LLaMA model weights and place them in ./models
ls ./models
65B 30B 13B 7B Vicuna-7B tokenizer_checklist.chk tokenizer.model

# install Python dependencies
python3 -m pip install -r requirements.txt

# convert the 7B model to ggml FP16 format
python3 convert.py models/Vicuna-7B/

# quantize the model to 4-bits (using method 2 = q4_0)
./quantize ./models/Vicuna-7B/ggml-model-f16.bin ./models/Vicuna-7B/ggml-model-q4_0.bin 2

# run the inference
./main -m ./models/Vicuna-7B/ggml-model-q4_0.bin -n 128
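The trailing `2` selects the q4_0 method. Conceptually, q4_0 splits each tensor into small blocks and stores one float scale per block plus 4-bit signed integers. The sketch below illustrates the idea only; llama.cpp's actual format packs 32-element blocks and differs in detail:

```python
def q4_0_quantize(block: list[float]) -> tuple[float, list[int]]:
    """Quantize one block to signed 4-bit ints (-8..7) plus a float scale."""
    amax = max(abs(x) for x in block) or 1.0
    scale = amax / 7.0  # map the largest magnitude to +/-7
    q = [max(-8, min(7, round(x / scale))) for x in block]
    return scale, q

def q4_0_dequantize(scale: float, q: list[int]) -> list[float]:
    """Approximate reconstruction of the original block."""
    return [scale * v for v in q]

weights = [0.7, -0.35, 0.1, 0.0]
scale, q = q4_0_quantize(weights)
restored = q4_0_dequantize(scale, q)
```

Shrinking each weight from 16 bits to roughly 4 (plus the per-block scale) is what cuts the 7B model from about 13 GB in FP16 to around 4 GB, at a modest quality cost.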

Now, the local model is ready to go!

Install Auto-GPT (Guide)

  1. Clone this Auto-GPT fork, which is based on DGdev91's PR #2594:
git clone https://github.com/Neronjust2017/Auto-GPT-LOCAL
  2. Install the requirements and create a .env file:
    pip install -r requirements.txt
    cp .env.template .env
  3. Edit the .env file:
OPENAI_API_BASE_URL=http://localhost:443/v1

# you can find the proper value for different LLaMA models at https://huggingface.co/shalomma/llama-7b-embeddings#quantitative-analysis
EMBED_DIM=4096
# gpt-llama.cpp reads the local model path from the API key field
OPENAI_API_KEY=../llama.cpp/models/vicuna/13B/ggml-vicuna-unfiltered-13b-4bit.bin
  4. Run Auto-GPT:
# On Linux or Mac:
./run.sh
# On Windows:
.\run.bat

# or with python
python -m autogpt
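One note on the EMBED_DIM setting above: Auto-GPT's memory backends store fixed-length embedding vectors, so the configured dimension has to match the hidden size of the model producing the embeddings (4096 for the 7B LLaMA variants, per the Hugging Face page linked in the .env comment). A hypothetical sketch of the kind of check this implies (validate_embedding is illustrative, not an actual Auto-GPT function):

```python
EMBED_DIM = 4096  # hidden size of LLaMA-7B / Vicuna-7B

def validate_embedding(vec: list[float], dim: int = EMBED_DIM) -> list[float]:
    """Reject vectors whose length differs from the configured dimension."""
    if len(vec) != dim:
        raise ValueError(f"expected a {dim}-dim embedding, got {len(vec)}")
    return vec

validate_embedding([0.0] * EMBED_DIM)  # accepted
```

If EMBED_DIM disagrees with the model, every vector written to the memory store has the wrong shape and similarity search breaks, so set it before the first run.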
