elisebuehler2000 / candle-vllm

This project forked from ericlbuehler/candle-vllm

Efficient platform for inference and serving of local LLMs, including an OpenAI-compatible API server.

License: MIT License

Rust 100.00%

candle-vllm

Efficient, easy-to-use platform for inference and serving of local LLMs, including an OpenAI-compatible API server.

Features

  • OpenAI-compatible API server for serving LLMs.
  • Highly extensible trait-based system to allow rapid implementation of new module pipelines.
  • Streaming support in generation.
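
As a rough illustration of what the streaming feature means for a client, the sketch below collects text from a streamed chat completion. It assumes the server emits OpenAI-style server-sent events: lines prefixed with data:, each carrying a JSON chunk with choices[].delta.content, and a final data: [DONE] marker. Whether candle-vllm matches that shape field for field is an assumption, and iter_stream_text is a hypothetical helper name, not part of the project.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style streaming lines.

    `lines` is any iterable of decoded strings, for example the lines of
    an HTTP response body. The chunk layout is assumed, not confirmed.
    """
    for raw in lines:
        line = raw.strip()
        # Skip blank keep-alive lines and anything that is not a data event.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        # The OpenAI convention ends the stream with a literal [DONE].
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments with "".join(...) reconstructs the full reply; printing each fragment as it arrives gives incremental output.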

Pipelines

  • Llama
    • 7b
    • 13b
    • 70b
  • Mistral
    • 7b

Examples

See this folder for some examples.

Example with Llama 7b

In your terminal, install the openai Python package by running pip install openai.

Then, create a new Python file and write the following code:

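import openai

# The local server does not check the key, but the client requires one to be set.
openai.api_key = "EMPTY"

# Point the client at the local candle-vllm instance instead of api.openai.com.
openai.base_url = "http://localhost:2000/v1/"

completion = openai.chat.completions.create(
    model="llama7b",
    messages=[
        {
            "role": "user",
            "content": "Explain how to best learn Rust.",
        },
    ],
    max_tokens=64,
)
print(completion.choices[0].message.content)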

Next, launch a candle-vllm instance on port 2000 (the port used in the script's base_url) by running HF_TOKEN=... cargo run --release -- --hf-token HF_TOKEN --port 2000 llama7b --repeat-last-n 64.

After the candle-vllm instance is running, run the Python script and enjoy efficient inference with an OpenAI-compatible API server!
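
The same request can also be made without the openai package. The sketch below builds it with the Python standard library only. It assumes the server exposes the standard OpenAI path chat/completions under the base URL used above and returns the usual choices[0].message.content response shape; build_chat_request and chat are hypothetical helper names introduced for illustration.

```python
import json
import urllib.request

# Same address and port as in the example above.
BASE_URL = "http://localhost:2000/v1/"


def build_chat_request(prompt, model="llama7b", max_tokens=64):
    """Build a POST request for the (assumed) chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        BASE_URL + "chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Mirrors api_key = "EMPTY" from the openai-based script.
            "Authorization": "Bearer EMPTY",
        },
        method="POST",
    )


def chat(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, print(chat("Explain how to best learn Rust.")) should behave like the openai-based script. Keeping request construction separate from sending makes the payload easy to inspect without a live server.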

Contributing

The following features are planned, and contributions toward them are especially welcome:

  • Sampling methods:
  • Pipeline batching (#3)
  • PagedAttention (#3)
  • More pipelines (from candle-transformers)

Resources

Contributors

ericlbuehler, bm777
