Giter Site home page Giter Site logo

llm_server's Introduction

You will need to use a conda environment

Install conda first, e.g. Download the latest miniconda installer from https://docs.conda.io/en/latest/miniconda.html and Run the installer

example / slexample.sh has an example of all the steps that are needed

source ./slexample.sh should run completely on a slurm cluster and setup everything - but beware that slurm_start.sh
includes parameters that are specific to the cluster it is running on

The conda environment and all other required folders will be created in the current directory
(This will be fairly big, you should use a volume with enough storage - such as a "volatile" folder)

When you run the script, it will create a local conda environment and install all the necessary packages
It will also install and build llama.cpp, which will be needed for conversion and quantisation

Then it will run the conversion and quantisation for an example model (A Llama variant from nvidia)
This can then be executed - slexample.sh calls sbatch to run the specified model on a slurm cluster
example.sh will run it without slurm

One installed and running you can use the following scripts to interact with the server

get_model.py <model_name> will download a model and create quantised versions of it
sbatch slurm_start.sh <model_name> will start the server with the model
prompt.sh will generate text using the prompt or promptb64.sh will take a base64 encoded prompt
(To handle special characters inside the prompt)
exit.sh will tell the server to shutdown


llm_server's People

Contributors

martinrepo avatar tonyatliv avatar

Watchers

 avatar  avatar Kostas Georgiou avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.