This project is a fork of vivekchauhan05/fine-tuning_codellama-7b_model.

We fine-tune the CodeLlama-7b model using PEFT together with 4-bit quantization.

License: Apache License 2.0

Fine-Tuning CodeLlama-7b-Instruct-hf Model

1. Model Used

In this project we use the CodeLlama-7b-Instruct-hf model, which is designed for code generation. We take the base model codellama/CodeLlama-7b-Instruct-hf from the Hugging Face Hub.
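As a minimal sketch, loading the base model with the Transformers library might look like this (the weights download from the Hub on first use; device_map="auto" assumes the accelerate package is installed):

```python
from transformers import AutoModelForCausalLM

# Download (on first use) and load the base model from the Hugging Face Hub.
base_model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf",
    device_map="auto",  # place layers on available device(s) automatically
)
```

In practice the model is loaded together with the quantization configuration described in section 2.2 below.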

2. Fine-Tuning Process

Fine-tuning is the process of taking a pretrained model and adapting it to perform specific tasks or solve particular problems. In this project, the fine-tuning process involves several critical steps:

2.1. Tokenization

We use AutoTokenizer from the Hugging Face Transformers library to load the tokenizer that matches the base model. It converts text data into token IDs, the format the model consumes during training.
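A minimal sketch of loading and using the tokenizer (the pad-token line is an assumption, a common convention for Llama-family fine-tuning rather than something this README specifies):

```python
from transformers import AutoTokenizer

# Load the tokenizer that matches the base checkpoint.
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")
tokenizer.pad_token = tokenizer.eos_token  # assumption: reuse EOS as the padding token

# Example: turn a prompt into token IDs the model can consume.
batch = tokenizer("def fibonacci(n):", return_tensors="pt")
print(batch["input_ids"].shape)  # -> torch.Size([1, <num_tokens>])
```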

2.2. Quantization

Quantization is applied to the base model using a custom configuration. This optimizes the model for efficient execution while minimizing memory usage. We employ the following quantization parameters (see the configuration sketch after this list):

  • load_in_4bit: Activates 4-bit precision for base model loading.
  • bnb_4bit_use_double_quant: Uses double quantization for 4-bit precision.
  • bnb_4bit_quant_type: Specifies the quantization type as "nf4" (4-bit NormalFloat).
  • bnb_4bit_compute_dtype: Sets the compute data type to torch.bfloat16.
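
A configuration sketch matching the parameters above, using BitsAndBytesConfig from Transformers (requires the bitsandbytes package; the model ID is the one named in section 1):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config mirroring the parameters listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit precision for base model loading
    bnb_4bit_use_double_quant=True,         # double quantization of the quantization constants
    bnb_4bit_quant_type="nf4",              # 4-bit NormalFloat data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls computed in bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
```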

2.3. LoRA (Low-Rank Adaptation) Configuration

We make fine-tuning parameter-efficient by configuring LoRA, which freezes the base model weights and trains small low-rank update matrices injected into the attention layers. Key parameters for LoRA include (see the configuration sketch after this list):

  • lora_r: LoRA attention dimension set to 8.
  • lora_alpha: Alpha parameter for LoRA scaling set to 16.
  • lora_dropout: Dropout probability for LoRA layers set to 0.05.
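
A sketch of the corresponding PEFT configuration (the target_modules list is an assumption for illustration; the notebook may target different layers):

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = prepare_model_for_kbit_training(model)  # standard prep step for 4-bit training

lora_config = LoraConfig(
    r=8,               # lora_r: rank of the low-rank update matrices
    lora_alpha=16,     # scaling factor applied to the update
    lora_dropout=0.05, # dropout on the adapter inputs
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumed target layers
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a fraction of a percent of the 7B weights
```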

2.4. Training Configuration

We configure various training parameters, including batch sizes, learning rates, and gradient accumulation steps. Some of the key training parameters are (see the illustrative sketch after this list):

  • Batch size per GPU for training and evaluation
  • Gradient accumulation steps
  • Maximum gradient norm (gradient clipping)
  • Initial learning rate (AdamW optimizer)
  • Weight decay
  • Optimizer type (e.g., paged_adamw_8bit)
  • Learning rate schedule (e.g., cosine)
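
An illustrative TrainingArguments sketch covering the parameters above; the numeric values below are placeholders, not the notebook's exact settings:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,  # batch size per GPU for training
    per_device_eval_batch_size=4,   # batch size per GPU for evaluation
    gradient_accumulation_steps=4,
    max_grad_norm=0.3,              # gradient clipping
    learning_rate=2e-4,             # initial learning rate
    weight_decay=0.001,
    optim="paged_adamw_8bit",       # paged 8-bit AdamW from bitsandbytes
    lr_scheduler_type="cosine",     # cosine learning rate schedule
    num_train_epochs=1,
    logging_steps=10,
)
```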

2.5. Supervised Fine-Tuning (SFT)

We employ a Supervised Fine-Tuning (SFT) approach to train the model on specific tasks. This involves providing labeled datasets related to the tasks the LLM should specialize in.
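A sketch of wiring the pieces together with trl's SFTTrainer (the keyword arguments shown match older trl releases, and newer versions move several of them into an SFTConfig object; train_dataset here is a hypothetical labeled dataset with a "text" column):

```python
from trl import SFTTrainer

# SFTTrainer wraps the standard Trainer with conveniences for text fine-tuning.
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,  # hypothetical dataset of labeled examples
    peft_config=lora_config,
    dataset_text_field="text",    # assumed name of the column holding training text
    max_seq_length=512,
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
```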

2.6. Model Saving

After training, the specialized models are saved for future use.
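
A minimal saving sketch (the output path is hypothetical):

```python
# Persist the trained weights and tokenizer for later use.
save_dir = "./fine_tuned_codellama"       # hypothetical output path
trainer.model.save_pretrained(save_dir)   # with a PEFT model this writes only the LoRA adapter
tokenizer.save_pretrained(save_dir)
```

If a single standalone checkpoint is needed, the LoRA adapter can later be merged into the base weights with PEFT's merge_and_unload().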

3. Summary of the Fine-Tuning Process

The fine-tuning process consists of several key steps:

  • Tokenization: Transforming text data into a format suitable for the model.
  • Quantization: Optimizing the model for efficiency and memory usage.
  • LoRA Configuration: Adding small trainable low-rank adapters for parameter-efficient training.
  • Training Configuration: Setting up training parameters and optimizations.
  • Supervised Fine-Tuning (SFT): Training the model on specific tasks using labeled data.
  • Model Saving: Saving the trained models for later use.

4. GPU Requirements

The fine-tuning process is computationally intensive and requires a GPU with sufficient capabilities to handle the workload effectively. While the specific GPU requirements may vary depending on the size of the model and the complexity of the tasks, it is recommended to use a high-performance GPU with CUDA support. Additionally, the availability of VRAM (Video RAM) is crucial, as large models like codellama/CodeLlama-7b-Instruct-hf can consume significant memory during training.

In this project the device is set to CUDA, and we use Google Colab's 15 GB T4 GPU for fine-tuning.

Please ensure that your GPU meets the necessary hardware and software requirements to successfully execute the fine-tuning process.
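
A quick sanity check before launching a run (a sketch using PyTorch, which the project already depends on):

```python
import torch

# Fail fast if no CUDA device is visible.
assert torch.cuda.is_available(), "Fine-tuning requires a CUDA-capable GPU"
print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4" on the Colab free tier
```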

Usage

1. Colab Notebook (recommended)

This is the easiest way to run this project.

  1. Locate the Fine-tuning-CodeLlama_demo.ipynb in this repo
  2. Click the "Open in Colab" button at the top of the file
  3. Change the runtime type to T4 GPU
  4. Run all the cells in the notebook

2. Run Locally

Running inference with this model locally requires a GPU with at least 16 GB of VRAM.

Instructions:

  1. Clone this repository to your local machine.
git clone https://github.com/VivekChauhan05/Fine-tuning_CodeLlama-7b.git
  2. Navigate to the project directory.
cd Fine-tuning_CodeLlama-7b
  3. Install the required dependencies.
pip install -r requirements.txt
  4. Run the app.py file.
python app.py
  5. Open the link provided in the terminal in your browser.

For more details on the code implementation and usage, refer to the code files in this repository.

License

This project is licensed under the Apache License 2.0.

Contributors

  • vivekchauhan05
