ivangabriele / docker-llm

Pre-loaded LLMs served as an OpenAI-Compatible API via Docker images.

License: GNU Affero General Public License v3.0

Languages: Dockerfile 40.59%, Shell 29.21%, Makefile 16.61%, Python 13.59%
Topics: llm, llms, api, docker, openai, openai-api, openorca, orca, server, vicuna

docker-llm's Introduction

Docker Image: OpenAI API-Compatible Pre-loaded LLM Server

The Docker images are based on NVIDIA CUDA images. LLMs are pre-loaded and served via vLLM.

Environment Variables

  • TENSOR_PARALLEL_SIZE: Number of GPUs to use. Default: 1.
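As a rough sketch of how `TENSOR_PARALLEL_SIZE` might be passed when starting a container, the helper below assembles a `docker run` command line. The `--gpus`, `-e`, and `-p` flags are standard Docker CLI options; the function name `build_docker_run_command`, the choice to expose all GPUs, and the host port mapping are illustrative assumptions rather than anything the project documents.

```python
import shlex


def build_docker_run_command(
    tag: str,
    tensor_parallel_size: int = 1,
    host_port: int = 8000,
) -> list[str]:
    """Assemble a `docker run` argv for one image tag (hypothetical helper)."""
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be at least 1")
    return [
        "docker", "run", "--rm",
        # Assumption: expose every GPU and let TENSOR_PARALLEL_SIZE pick how many are used.
        "--gpus", "all",
        "-e", f"TENSOR_PARALLEL_SIZE={tensor_parallel_size}",
        # Map a host port to the container port that serves the API.
        "-p", f"{host_port}:8000",
        tag,
    ]


if __name__ == "__main__":
    command = build_docker_run_command(
        "ivangabriele/llm:lmsys__vicuna-13b-v1.5-16k",
        tensor_parallel_size=2,
    )
    # Print a shell-safe, copy-pasteable command line.
    print(shlex.join(command))
```

Returning a list rather than a string keeps the command safe to hand to `subprocess.run` without shell quoting issues.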

Port

The OpenAI API is exposed on port 8000.
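A minimal sketch of a client call against that port, using only the Python standard library. It assumes the container port is mapped to `localhost:8000` and that the server answers on the OpenAI-style `/v1/completions` route; the model identifier to pass is also an assumption and may differ from the name the image actually registers.

```python
import json
import urllib.request

# Assumption: container port 8000 is published on the same port of the local host.
BASE_URL = "http://localhost:8000"


def build_completion_request(model: str, prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style text completion request."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(model: str, prompt: str, max_tokens: int = 64) -> str:
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(model, prompt, max_tokens)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a container running, a call such as `complete("lmsys/vicuna-13b-v1.5-16k", "Docker is")` would return the generated continuation, provided that model name matches what the server reports.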

Tags & Deployment Links

Note

The VRAM column is the minimum amount of VRAM the model requires on a single GPU.

Tag                                            Model   RunPod  Vast.ai  VRAM
ivangabriele/llm:lmsys__vicuna-13b-v1.5-16k    (link)  (link)  (link)   26 GB
ivangabriele/llm:open-orca__llongorca-13b-16k  (link)  (link)  (link)   26 GB
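One plausible reading of the 26 GB figure is the size of the weights alone: 13 billion parameters stored at 16-bit precision take 2 bytes each. The sketch below reproduces that arithmetic. It is an assumption about where the number comes from, not something the README states, and it ignores the KV cache and activations, which need additional memory at run time. The function names are hypothetical.

```python
def estimate_weight_vram_gb(parameters_in_billions: float, bytes_per_parameter: int = 2) -> float:
    """Approximate VRAM needed to hold the weights only, in GB (10^9 bytes).

    Assumes 16-bit weights by default (2 bytes per parameter).
    """
    return float(parameters_in_billions * bytes_per_parameter)


def weights_fit(parameters_in_billions: float, gpu_vram_gb: float) -> bool:
    """Return True if the weights alone fit in the given amount of VRAM."""
    return estimate_weight_vram_gb(parameters_in_billions) <= gpu_vram_gb


if __name__ == "__main__":
    print(estimate_weight_vram_gb(13))  # 26.0
    print(weights_fit(13, 24))          # False
    print(weights_fit(13, 40))          # True
```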

Roadmap

  • Add more popular models.
  • Start the server in the background to allow SSH access.

docker-llm's People

Contributors

ivangabriele

Forkers

derkodex josephrp

docker-llm's Issues

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Awaiting Schedule

These updates are awaiting their schedule.

  • fix(deps): update all non-major dependencies (accelerate, nvidia-cublas-cu11, nvidia-cuda-cupti-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-runtime-cu11, nvidia-cudnn-cu11, nvidia-curand-cu11, nvidia-cusolver-cu11, nvidia-cusparse-cu11, nvidia-nccl-cu11, nvidia-nvtx-cu11, torch, vllm)
  • chore(deps): update nvidia/cuda docker tag to v12

Detected dependencies

docker-compose (docker-compose.yml)
dockerfile (Dockerfile)
  • nvidia/cuda 11.8.0-cudnn8-devel-ubuntu22.04
poetry (pyproject.toml)
  • nvidia-cublas-cu11 11.10.3.66
  • nvidia-cuda-cupti-cu11 11.7.101
  • nvidia-cuda-nvrtc-cu11 11.7.99
  • nvidia-cuda-runtime-cu11 11.7.99
  • nvidia-cudnn-cu11 8.5.0.96
  • nvidia-cufft-cu11 10.9.0.58
  • nvidia-curand-cu11 10.2.10.91
  • nvidia-cusolver-cu11 11.4.0.1
  • nvidia-cusparse-cu11 11.7.4.91
  • nvidia-nccl-cu11 2.14.3
  • nvidia-nvtx-cu11 11.7.91
  • python ^3.11.0
  • python-dotenv ^1.0.0
  • torch 2.0.1
  • vllm 0.2.0
  • accelerate 0.24.0
  • fschat 0.2.31
  • openai 0.28.1
