qqq-tech / adept-inference

This project forked from persimmon-ai-labs/adept-inference

Inference code for Persimmon-8B

Home Page: https://www.adept.ai/

License: Apache License 2.0

Shell 0.21% C++ 15.96% Python 76.26% C 0.66% Cuda 6.31% Dockerfile 0.61%

adept-inference's Introduction

Persimmon-8B User Guide

This repo contains inference code for Persimmon-8B, the new LLM from Adept.

Downloading the Checkpoint

The model checkpoints are stored on our public OCI bucket and can be downloaded using wget. The base model is not fine-tuned and is released under an Apache 2.0 license. The chat model is fine-tuned and is released under a CC-BY-NC 4.0 license.

Base:
https://axtkn4xl5cip.objectstorage.us-phoenix-1.oci.customer-oci.com/n/axtkn4xl5cip/b/adept-public-data/o/8b_base_model_release.tar
md5sum: cd0320cba9efad9ccd18e9ec4d16ae1b

Chat:
https://axtkn4xl5cip.objectstorage.us-phoenix-1.oci.customer-oci.com/n/axtkn4xl5cip/b/adept-public-data/o/8b_chat_model_release.tar
md5sum: 663aeace07269c44e90f4e8bcd07f32a

Untar the model into its own directory via tar -xvf 8b_base_model_release.tar or tar -xvf 8b_chat_model_release.tar

The scripts are set up to expect the model folder to be placed within the code directory, but you can place it elsewhere and modify the scripts accordingly.
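
The checksum and untar steps above can also be scripted. The following is a minimal sketch, not part of the repo: the function names md5_of_file and verify_and_extract are hypothetical, and it assumes the archive has already been downloaded under its original file name.

```python
import hashlib
import tarfile
from pathlib import Path

# Checksums copied from the download listing above.
EXPECTED_MD5 = {
    "8b_base_model_release.tar": "cd0320cba9efad9ccd18e9ec4d16ae1b",
    "8b_chat_model_release.tar": "663aeace07269c44e90f4e8bcd07f32a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive, dest="."):
    """Check an archive against its published MD5, then unpack it into dest.

    Hypothetical helper: raises ValueError on a checksum mismatch and
    KeyError if the file name is not one of the two released archives.
    """
    archive = Path(archive)
    expected = EXPECTED_MD5[archive.name]
    actual = md5_of_file(archive)
    if actual != expected:
        raise ValueError(
            f"MD5 mismatch for {archive.name}: got {actual}, expected {expected}"
        )
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
```

Calling verify_and_extract("8b_chat_model_release.tar") from the code directory would unpack the chat model where the scripts expect it by default.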

Building Docker

Build the Docker image, which includes all the necessary dependencies (and then some!), using the included Dockerfile:

docker build -f docker/Dockerfile -t 'adeptdocker' .

Running Docker

Ensure that the variable MODEL_DIR in run_text_generation_server.sh is set to the location of the model directory. By default it is set to MODEL_DIR=8b_chat_model_release, which is the default name for the chat model. (For the base model, change this line to MODEL_DIR=8b_base_model_release.)

Running sh docker_launch.sh will start a model server that you can query via:

curl '<address of server>/api' -X 'PUT' -H 'Content-Type: application/json; charset=UTF-8' -d '{"prompts": ["human: Hello, how are you?\n\nadept:"], "tokens_to_generate": 128, "top_p": 0.9, "random_seed": 1234, "logprobs": false}'
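
The same request can be sent from Python. This is a minimal sketch using only the standard library; the helper names build_generation_request and generate are hypothetical, the JSON fields mirror the curl command above, and the response is returned as raw text because its exact shape is not documented here.

```python
import json
import urllib.request


def build_generation_request(server_address, prompts, tokens_to_generate=128,
                             top_p=0.9, random_seed=1234, logprobs=False):
    """Build a PUT request for the /api endpoint, mirroring the curl example."""
    payload = {
        "prompts": list(prompts),
        "tokens_to_generate": tokens_to_generate,
        "top_p": top_p,
        "random_seed": random_seed,
        "logprobs": logprobs,
    }
    return urllib.request.Request(
        url=server_address.rstrip("/") + "/api",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="PUT",
    )


def generate(server_address, prompts, **options):
    """Send the request and return the raw response body as text."""
    request = build_generation_request(server_address, prompts, **options)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Replace server_address with the address of your running server, for example generate("<address of server>", ["human: Hello, how are you?\n\nadept:"]).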

Notes

  • The chat model is fine-tuned to expect inputs of the form human: {prompt}\n\nadept: (see footnote 1). For best performance, use this format; the curl command above shows an example. To wrap single-turn input prompts in this structure automatically, modify the definition of megatron/text_generation/api.py::generate_and_post_process so that the default value of the argument process_prompts_for_chat is True.
  • We are releasing the model with tensor parallelism of 1. In this configuration, the model requires an 80GB GPU to run naively. It should be possible to fit the model on a 40GB card by removing the unused embeddings and reducing the maximum sequence length (at the top of run_text_generation_server.py).
    Quantization to 8-bit or lower would also make it fit with plenty of room to spare.
  • We included the .vocab file so you can browse the vocabulary in plain text - this file is otherwise unused.
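
The chat prompt format described in the first note, together with the multi-turn form given in the footnote, can be produced on the client side. This is a minimal sketch; format_chat_prompt is a hypothetical helper and is not the repo's process_prompts_for_chat logic.

```python
def format_chat_prompt(prompt, history=()):
    """Wrap a prompt in the human/adept chat format.

    history is a sequence of (human_text, adept_output) pairs from earlier
    turns; leave it empty for a single-turn prompt.
    """
    parts = []
    for human_text, adept_output in history:
        # Earlier turns include the model's output after "adept: ".
        parts.append(f"human: {human_text}\n\nadept: {adept_output}\n\n")
    # The final turn ends with "adept:" so the model continues from there.
    parts.append(f"human: {prompt}\n\nadept:")
    return "".join(parts)
```

With an empty history this yields the same string used in the curl example; passing earlier turns as (human_text, adept_output) pairs yields the multi-turn form.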

Citation

If you use this model in your work, please use the following BibTeX citation:

@misc{persimmon-8b,
  author = {Elsen, Erich and Odena, Augustus and Nye, Maxwell and Ta\c{s}\i{}rlar, Sa\u{g}nak and Dao, Tri and Hawthorne, Curtis and Moparthi, Deepak and Somani, Arushi},
  title = {Releasing {Persimmon-8B}},
  url = {https://www.adept.ai/blog/persimmon-8b},
  year = {2023}
}

Footnotes

  1. Subsequent inputs should have the form human: {prompt}\n\nadept: {output}\n\nhuman: {follow_up}\n\nadept:
