Giter Site home page Giter Site logo

llm-tutorial's Introduction

LLaMA 2 & Orca 2 Tutorial

Step 1. Register for LLaMA download at https://ai.meta.com/resources/models-and-libraries/llama-downloads/.

Step 2. Install LLaMA models.

$ git clone https://github.com/facebookresearch/llama
$ cd llama
$ ./download.sh   ## Use the email-received approval URL. 

CAUTION: Do not copy the email's link directly, but click the link and then copy the link on the address bar. Ensure to re-clone the repository and do this all at once without making any mistake.

Step 3. Install the latest NVidia driver.

$ sudo ubuntu-drivers autoinstall
$ sudo apt install nvidia-driver-545
$ sudo reboot now
$ nvidia-smi   # Check if Nvidia is reachable

CAUTION: If you get an error for sudo ubuntu-drivers autoinstall:

$ sudo vi /usr/lib/python3/dist-packages/UbuntuDrivers/detect.py

Fix 835 line's version = int(package_name.split('-')[-1]), change to version = int(package_name.split('-')[-2]) (or to [2])

Step 4. Install essential software.

$ sudo apt update
$ sudo apt upgrade
$ sudo apt install python3-pip  nvidia-cuda-toolkit
$ cd llama
$ pip3 install torch numpy scipy fire fairscale sentencepiece accelerate transformers
$ pip3 install -e .
$ python3
 > import torch
 > torch.cuda.is_available()
 > torch.cuda.device_count()
 > torch.cuda.current_device()
 > torch.cuda.device(0)
 > torch.cuda.get_device_name(0)

Step 5. Run the LLaMA example chatbot.

### 7B Chat Model (1 GPU needed)
$ python3 -m torch.distributed.run --nproc_per_node 1 example_chat_completion.py     --ckpt_dir llama-2-7b-chat/     --tokenizer_path tokenizer.model     --max_seq_len 512 --max_batch_size 6

### 13B Chat Model (2 GPUs needed)
$ python3 -m torch.distributed.run --nproc_per_node 2 example_chat_completion.py     --ckpt_dir llama-2-7b-chat/     --tokenizer_path tokenizer.model     --max_seq_len 512 --max_batch_size 6

### 70B Chat Model (8 GPUs needed)
$ python3 -m torch.distributed.run --nproc_per_node 8 example_chat_completion.py     --ckpt_dir llama-2-7b-chat/     --tokenizer_path tokenizer.model     --max_seq_len 512 --max_batch_size 6

### 7B Text Completion Model (1 GPU needed)
$ python3 -m torch.distributed.run --nproc_per_node 1 example_text_completion.py --ckpt_dir llama-2-7b/ --tokenizer_path tokenizer.model --max_seq_len 128 --max_batch_size 4

### 13B Text Completion Model (2 GPUs needed)
$ python3 -m torch.distributed.run --nproc_per_node 2 example_text_completion.py --ckpt_dir llama-2-13b/ --tokenizer_path tokenizer.model --max_seq_len 128 --max_batch_size 4

### 70B Text Completion Model (4 GPUs needed)
$ python3 -m torch.distributed.run --nproc_per_node 8 example_text_completion.py --ckpt_dir llama-2-70b/ --tokenizer_path tokenizer.model --max_seq_len 128 --max_batch_size 4

Running LLaMA2 with Hugging Face (Google Colab or Local Setup)

$ python3 llama2-hf-inference.py   # replace the Hugging Face token by yours at [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)

Running Orca2 with Hugging Face (Google Colab or Local Setup)

$ python3 orca2-hf-inference.py   # replace the Hugging Face token by yours at [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)

Resources

llm-tutorial's People

Contributors

gogo9th avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.