whyrrrrun / slam-llm

This project forked from x-lance/slam-llm

Speech, Language, Audio, Music Processing with Large Language Model

License: MIT License

Shell 0.83% Python 99.02% Dockerfile 0.16%

SLAM-LLM

SLAM-LLM is a deep learning toolkit that allows researchers and developers to train custom multimodal large language models (MLLMs), with a focus on speech, language, audio, and music processing. We provide detailed recipes for training and high-performance checkpoints for inference.

Table of Contents

  1. News
  2. Installation
  3. Usage
  4. Features
  5. Acknowledge

News

Installation

git clone https://github.com/huggingface/transformers.git
cd transformers
git checkout tags/v4.35.2
pip install -e .
cd ..
git clone https://github.com/huggingface/peft.git
cd peft
git checkout tags/v0.6.0
pip install -e .
cd ..
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
git clone https://github.com/ddlBoJack/SLAM-LLM.git
cd SLAM-LLM
pip install -e .
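
After installation, it can be useful to confirm that the environment matches the versions pinned above. The following is a minimal sketch, not part of SLAM-LLM itself: the names `EXPECTED`, `installed_version`, and `check_pins` are hypothetical helpers, and the script relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Version pins copied from the installation commands above.
EXPECTED = {
    "transformers": "4.35.2",
    "peft": "0.6.0",
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "torchaudio": "2.0.2",
}


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(expected, lookup=installed_version):
    """Return {package: (expected, found)} for every package that does not match."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Ignore local build suffixes such as "+cu118" when comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(EXPECTED)
    for package, (wanted, found) in problems.items():
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

An empty result means every pinned package was found at the expected version. The `lookup` parameter exists only so the check can be exercised without the packages installed.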

Some examples also require fairseq, which can be installed as follows:

# you need to install fairseq before SLAM-LLM
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./

We also provide a docker image for convenience:

# build docker image
docker build -t slam-llm:latest .

# run docker image with gpu
docker run -it --gpus all --name slam --shm-size=256g slam-llm:latest /bin/bash

Usage

List of Recipes

We provide reference implementations of various LLM-based speech, audio, and music tasks:

Configuration Priority

Configuration values can come from three sources. When the same option is set in more than one place, the source with higher priority wins:

command-line (shell file) > Hydra configuration (yaml file) > dataclass configuration (Python file)
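
The precedence rule above can be illustrated with a small sketch. This is not SLAM-LLM's or Hydra's actual implementation: `TrainConfig`, its fields, and `resolve_config` are hypothetical, and plain dictionaries stand in for the YAML file and the command-line overrides.

```python
from dataclasses import asdict, dataclass


@dataclass
class TrainConfig:
    # Hypothetical options; the dataclass defaults have the lowest priority.
    batch_size: int = 4
    lr: float = 1e-4
    use_peft: bool = False


def resolve_config(yaml_values, cli_overrides):
    """Merge the three sources so that later (higher-priority) ones win."""
    merged = asdict(TrainConfig())   # 1. dataclass defaults (Python file)
    merged.update(yaml_values)       # 2. values from the YAML file
    merged.update(cli_overrides)     # 3. command-line overrides (shell file)
    return TrainConfig(**merged)


if __name__ == "__main__":
    yaml_values = {"batch_size": 8, "lr": 5e-5}  # stands in for a parsed YAML file
    cli_overrides = {"lr": 1e-5}                 # stands in for "lr=1e-5" on the command line
    print(resolve_config(yaml_values, cli_overrides))
```

In this sketch `lr` takes the command-line value, `batch_size` takes the YAML value, and `use_peft` falls back to the dataclass default.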

Features

  • Easy to extend to new models and tasks.
  • Detailed recipes for training and high-performance checkpoints for inference.
  • Mixed-precision training, which trains faster and uses less GPU memory on NVIDIA Tensor Cores.
  • Multi-GPU training with data and model parallelism, supporting DDP, FSDP, and DeepSpeed (support still needs improvement).
  • Flexible configuration based on Hydra and dataclasses, allowing a combination of code-based, command-line, and file-based configuration.

Acknowledge

  • We borrow code from Llama-Recipes for the training process.
  • We borrow code from Fairseq for deepspeed configuration.
  • We thank the contributors for providing diverse recipes.

slam-llm's People

Contributors

mreso, ddlbojack, hamidshojanazeri, chauhang, sekyondameta, yanghaha0908, lauragpt, zzasdf, lchu-ibm, jeffxtang, polym, zhikangniu, cwx-worst-one, wangtianrui, zszheng147, amitsangani, varunfb, luobots, cmiller01, anshikavermag, awgu, tim-a-davis, activescott, johnbwilliams, irajmoradi, thuwyh, shijie-wu, rohan-varma, philparzer, avi-cenna
