Llama-on-babel

Setup

  • Connect to babel by ssh [andrewid]@babel.lti.cs.cmu.edu
  • Install miniconda for managing Python virtual environments
  • Clone this repo and install the requirements with pip install -r requirements.txt
  • Log in to Hugging Face with huggingface-cli login. The Llama 2 models are gated, so you may need to request access if you haven't done so before.
  • (Optional) setup passwordless login
  • (Optional, but recommended) Hugging Face caches files in the home directory, which eats up disk space quickly. You can tell it to cache model files on /scratch, a large shared storage space available on compute nodes, by adding the following lines to your ~/.bashrc file:
if [ -d /scratch ]; then
    mkdir -p /scratch/$USER
    export TRANSFORMERS_CACHE="/scratch/$USER/hf_cache/models"
fi
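
For the optional passwordless login step, one common approach is an SSH key plus a host alias on your local machine. The alias name and key path below are assumptions, not something prescribed by this repo; a ~/.ssh/config entry might look like:

```
# ~/.ssh/config on your LOCAL machine (hypothetical alias; substitute your andrewid)
Host babel
    HostName babel.lti.cs.cmu.edu
    User andrewid
    IdentityFile ~/.ssh/id_ed25519
```

After generating a key with ssh-keygen and copying it to the cluster with ssh-copy-id babel, plain ssh babel should log in without a password.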

Start an interactive job and run llama inference

First, request an interactive session with GPU using the following command:

srun    --time 1:00:00 \
        --gres=gpu:1 \
        --mem=30GB \
        --exclude=babel-3-[3,11,32,36],babel-4-[7,11,13,18] \
        --pty \
        bash

The srun command starts a job in real time: this one requests a node with 1 GPU, 30GB of memory, and a 1-hour time limit. The --exclude flag is optional; it excludes nodes without A6000 GPUs, effectively restricting the job to A6000 nodes on babel.

Note: slurm documentation can be found here.

With the appropriate Python environment activated, python src/llama-pipeline.py should run a Llama generation pipeline.
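
The actual contents of src/llama-pipeline.py are not shown here; as a rough sketch of what such a script might do, the following builds a Llama 2 chat-format prompt and feeds it to a transformers text-generation pipeline. The function names are illustrative, and the prompt template is the standard Llama 2 chat format, which this repo may or may not use:

```python
def format_chat_prompt(user_message: str,
                       system_message: str = "You are a helpful assistant.") -> str:
    """Build a prompt string in the Llama 2 chat format."""
    return (
        f"<s>[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

def run_pipeline(prompt: str) -> str:
    """Generate a completion with a gated Llama 2 chat model (requires a GPU node)."""
    # Heavy imports live here so the prompt helper stays importable without a GPU.
    from transformers import pipeline
    generator = pipeline(
        "text-generation",
        model="meta-llama/Llama-2-7b-chat-hf",
        device_map="auto",
    )
    return generator(prompt, max_new_tokens=128)[0]["generated_text"]

# On a GPU node with model access, something like:
#   print(run_pipeline(format_chat_prompt("What does the LTI at CMU do?")))
```

Loading the model requires both a GPU allocation (via srun as above) and the Hugging Face access granted during setup.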

Submit a job and let it run in the background

You might want to submit a job that runs in the background. sbatch does exactly this: it takes a script file that describes how a node is set up and what commands to run, and submits that job for execution.

Note: the environment inside sbatch is by default inherited from the environment where sbatch is called. So you need to activate the appropriate conda environment before calling sbatch.

srun and sbatch accept largely the same arguments. sbatch scripts/submit.sh python src/llama-pipeline.py submits a job that runs python src/llama-pipeline.py in the background. Stdout and stderr are redirected to a file named slurm-[jobid].out.
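
The repository's scripts/submit.sh is not reproduced here. As a hypothetical sketch, a minimal wrapper mirroring the resource flags from the srun example above could look like:

```bash
#!/bin/bash
# Hypothetical scripts/submit.sh: same resources as the interactive srun example.
#SBATCH --time=1:00:00
#SBATCH --gres=gpu:1
#SBATCH --mem=30GB
#SBATCH --exclude=babel-3-[3,11,32,36],babel-4-[7,11,13,18]

# Run whatever command was passed after the script name, e.g.
#   sbatch scripts/submit.sh python src/llama-pipeline.py
"$@"
```

Because the sbatch environment is inherited as noted above, the conda environment active at submission time is the one the job runs in.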

Contributors

  • y0mingzhang
