Text Generation with Quantization & Falcon-7B

This repository contains code for text generation with a quantized language model. It uses the Hugging Face Transformers library to build the text generation pipeline and the bitsandbytes library to quantize the model, which reduces memory requirements while keeping output quality close to that of the full-precision model.

Introduction

Quantization is a technique that reduces the numerical precision of a model's values by lowering the number of bits used to represent its parameters and activations. A quantized model has a much smaller memory footprint and, depending on the hardware and kernels, can run inference faster, usually at the cost of only a small loss in output quality.
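As a rough illustration of the idea, the sketch below rounds a few weights to signed 4-bit integers with simple absmax scaling and estimates the weight memory of a model at 16 and 4 bits. This is a simplified scheme chosen for clarity, not the block-wise 4-bit format that bitsandbytes uses. The function names and the round figure of 7e9 parameters are assumptions made for the example.

```python
def quantize_absmax(values, bits=4):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    # Fall back to 1.0 when every value is zero, to avoid dividing by zero.
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [round(v / scale) for v in values], scale


def dequantize(qvalues, scale):
    """Recover approximate floats from the quantized integers."""
    return [q * scale for q in qvalues]


def weight_memory_gb(num_params, bits):
    """Memory needed for the weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bits / 8 / 1e9


weights = [0.7, -0.3, 0.12, 0.0]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)

print(q)         # integers that fit in 4 bits
print(restored)  # close to the originals, but 0.12 has been rounded

# Assumption: a round 7e9 parameters for a "7B" model.
print(weight_memory_gb(7e9, 16), "GB at 16-bit")
print(weight_memory_gb(7e9, 4), "GB at 4-bit")
```

Real 4-bit checkpoints take somewhat more memory than this estimate, since the scaling constants are stored alongside the weights.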

Inference Model

The text generation pipeline in this repository uses the pre-trained language model "falcon-7b-instruct" from the Hugging Face model hub. The model is loaded through the bitsandbytes library with its weights stored in 4-bit format, which lowers memory use; the computation itself runs in a higher-precision data type.
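A minimal sketch of such a 4-bit load with Transformers and bitsandbytes might look like the following. It is not taken from the repository's script: the hub id "tiiuae/falcon-7b-instruct", the NF4 quantization type, the bfloat16 compute dtype, and the function name load_quantized_model are all assumptions. Running it requires a CUDA-capable GPU and downloads the model.

```python
# Assumption: the full hub id of the "falcon-7b-instruct" model.
MODEL_ID = "tiiuae/falcon-7b-instruct"


def load_quantized_model(model_id=MODEL_ID):
    """Load the tokenizer and the model with weights stored in 4-bit."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # store the weights in 4-bit
        bnb_4bit_quant_type="nf4",              # assumption: NF4 rather than FP4
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: compute in bfloat16
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place layers on the available devices
    )
    return tokenizer, model


# Usage (downloads several gigabytes on first run):
#   tokenizer, model = load_quantized_model()
#   print(f"{model.get_memory_footprint() / 1e9:.1f} GB")
```

Older Transformers releases may also need trust_remote_code=True in both from_pretrained calls to load Falcon.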

The text generation pipeline takes a query as input and generates coherent text based on the prompt. The generated text can be controlled using parameters like max_length, top_k, and num_return_sequences.
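The call pattern can be sketched as below. The helper name generate_text and its default values are hypothetical, not the repository's own function. A stand-in pipeline is used so the example runs without downloading a model; with the real model, pipe would be the object returned by transformers.pipeline("text-generation", model=model, tokenizer=tokenizer).

```python
def generate_text(pipe, query, max_length=200, top_k=10, num_return_sequences=1):
    """Run a text-generation pipeline and return the generated strings."""
    outputs = pipe(
        query,
        max_length=max_length,                      # cap on total tokens, prompt included
        do_sample=True,                             # sample instead of greedy decoding
        top_k=top_k,                                # sample only from the k most likely tokens
        num_return_sequences=num_return_sequences,  # number of completions to return
    )
    # A text-generation pipeline returns a list of dicts for a single prompt.
    return [out["generated_text"] for out in outputs]


def fake_pipe(query, **kwargs):
    """Stand-in for a real pipeline: echoes the query once per requested sequence."""
    n = kwargs["num_return_sequences"]
    return [{"generated_text": f"{query} ... (sample {i + 1})"} for i in range(n)]


texts = generate_text(fake_pipe, "Write a haiku about GPUs.", num_return_sequences=2)
for text in texts:
    print(text)
```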

Installation

To run the text generation pipeline with quantization, follow these steps:

  1. Install the required packages:
pip install transformers einops accelerate bitsandbytes
  2. Clone this repo:
git clone https://github.com/TheRealM4rtin/FalconInference.git
cd FalconInference
  3. Run
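Before running anything, it can help to confirm that the packages from the install step are importable. The check below is a small sketch and not part of the repository; the helper name missing_packages is hypothetical.

```python
import importlib.util

# Package names taken from the pip install step above.
REQUIRED = ("transformers", "einops", "accelerate", "bitsandbytes")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```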

Usage

To generate text, follow the example in the Python script. The prompt function takes your query, and the model generates text based on it.

The pipeline can be customized further: max_length caps the total length in tokens (prompt plus generated text), top_k restricts sampling to the k most likely next tokens, and num_return_sequences sets how many completions are returned.
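To see why top_k affects diversity, the toy sketch below keeps only the k most likely next tokens, renormalizes their probabilities, and samples from what remains. The vocabulary and probabilities are invented for illustration, and the helper top_k_filter is hypothetical; the library applies the same idea to the model's real token distribution.

```python
import random


def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize their probabilities."""
    top = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {token: p / total for token, p in top}


# Invented next-token distribution, for illustration only.
next_token_probs = {"cat": 0.5, "dog": 0.3, "car": 0.15, "sky": 0.05}

filtered = top_k_filter(next_token_probs, k=2)
print(filtered)  # only "cat" and "dog" remain as candidates

# Sample one token from the filtered distribution.
token = random.choices(list(filtered), weights=list(filtered.values()))[0]
print(token)
```

A small k keeps the output close to the most likely continuation, while a larger k admits less likely tokens and makes the samples more varied.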
