This is a course project for DSCI 6004 that deals with fine-tuning a pretrained LLM on a custom dataset.

License: GNU General Public License v3.0

Topics: finetuning, huggingface, llm, transformers, chatbot, fine-tuning, jupyter-notebook, kaggle, kaggle-notebook, large-language-models

E-Commerce FAQ Chatbot using Parameter Efficient Fine Tuning with LoRA Technique

Overview

This repository contains the code and resources for building an e-commerce FAQ chatbot with parameter-efficient fine-tuning. The project develops a chatbot for an e-commerce site by leveraging large language models (LLMs): the pretrained Falcon-7B model is fine-tuned with Parameter-Efficient Fine-Tuning (PEFT) using the LoRA (Low-Rank Adaptation) technique to enhance the model's performance on FAQ-style queries.

Authors

  • Bal Narendra Sapa - University of New Haven - Email
  • Ajay Kumar Jagu - University of New Haven - Email

Table of Contents

  1. Introduction
  2. Objectives
  3. Related Work
  4. Dataset
  5. Methodology
  6. Results
  7. Conclusion
  8. Acknowledgements
  9. Deployment
  10. Code and Resources
  11. References

Introduction

In the fast-paced world of e-commerce, handling customer queries efficiently is crucial. This project introduces a chatbot solution leveraging advanced language models to automate responses to frequently asked questions. The fine-tuned model, Falcon-7B, is trained on a custom dataset sourced from Kaggle that covers common user queries in the e-commerce domain.

Objectives

  1. Efficient Customer Support: Streamlining customer support by automating responses to frequently asked questions.
  2. Cost Savings through Automation: Reducing operational costs by automating responses to routine inquiries.
  3. Enhanced Resource Allocation: Optimizing human resources by automating responses to FAQs, allowing agents to focus on more complex issues.

Related Work

The project builds upon pre-trained models, including OpenAI's GPT models and Meta's LLaMA models. It also examines existing chatbots such as IBM Watson Assistant and the Ada healthcare chatbot, and discusses the trade-offs between RAG (Retrieval-Augmented Generation) and fine-tuned models.

Dataset

The dataset, sourced from Kaggle, comprises 79 samples with questions and corresponding answers. The split includes 67 samples for training (85%) and 12 samples for testing (15%).

Link to Dataset on Kaggle
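
For reference, below is a minimal sketch of loading the dataset and reproducing the 85/15 split with Hugging Face's `datasets` library. The file name and JSON field layout are assumptions based on the Kaggle dataset page, and the seed is arbitrary; adjust both to match your local copy.

```python
import json

from datasets import Dataset

# Assumed file name and layout for the Kaggle e-commerce FAQ dataset:
# {"questions": [{"question": "...", "answer": "..."}, ...]}
with open("Ecommerce_FAQ_Chatbot_dataset.json") as f:
    raw = json.load(f)

dataset = Dataset.from_list(raw["questions"])

# Reproduce the 85/15 split described above (~67 train / 12 test samples).
split = dataset.train_test_split(test_size=0.15, seed=42)
train_ds, test_ds = split["train"], split["test"]
print(len(train_ds), len(test_ds))
```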

Methodology

The methodology uses the Falcon-7B model, fine-tuned with PEFT and the LoRA technique via Hugging Face's Transformers and PEFT libraries. The process includes dataset preprocessing, LoRA adapter configuration, and model training; a configuration sketch follows below.
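
As a rough illustration of the LoRA adapter configuration step, the sketch below uses the `transformers` and `peft` libraries. The hyperparameters (`r`, `lora_alpha`, `lora_dropout`) are illustrative defaults rather than the exact values used in the notebooks; `query_key_value` is the fused attention projection in Falcon-style models.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tiiuae/falcon-7b"

# Load the base model in 4-bit so fine-tuning fits on a single GPU.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto", trust_remote_code=True
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; only these small matrices are trained.
lora_config = LoraConfig(
    r=16,                                # illustrative rank, not the notebook's exact value
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # Falcon's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # reports the small trainable fraction
```

Training then proceeds with a standard causal-language-modeling loop (for example, `transformers.Trainer`) over the tokenized question-answer pairs.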

Results

The model demonstrates decreasing loss values across epochs, indicating a healthy training trend. The BLEU score, used for evaluation, shows that generated responses align reasonably well with the reference answers.
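
For context, BLEU can be computed with Hugging Face's `evaluate` library as in the sketch below; the prediction and reference strings are placeholders, not actual model outputs.

```python
import evaluate

bleu = evaluate.load("bleu")

# Placeholder strings; in practice, predictions come from the fine-tuned model
# and references from the held-out test split.
predictions = ["You can track your order from the My Orders page."]
references = [["You can track your order in the My Orders section of your account."]]

print(bleu.compute(predictions=predictions, references=references))
```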

Conclusion

The project contributes to enhancing customer support in e-commerce through advanced language models. While the achieved results are promising, further experiments with larger datasets and continuous model refinement are recommended.

Acknowledgements

Acknowledgments are extended to the developers of the FALCON-7B model, the Hugging Face community, Kaggle for hosting the dataset, and the faculty at the University of New Haven.

Deployment

A Streamlit application has been developed for local use; it requires a GPU with at least 16 GB of video RAM (VRAM) for optimal performance. The app is included in this repository; see the streamlit-app directory.
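
A hedged sketch of what such a Streamlit front end could look like is shown below; the adapter path is a placeholder, and the actual app in `streamlit-app` may differ in structure and prompt handling.

```python
# Sketch of a Streamlit front end; run with `streamlit run app.py`.
import streamlit as st
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

@st.cache_resource  # load the large model only once per session
def load_model():
    base = AutoModelForCausalLM.from_pretrained(
        "tiiuae/falcon-7b", torch_dtype=torch.float16,
        device_map="auto", trust_remote_code=True,
    )
    model = PeftModel.from_pretrained(base, "path/to/lora-adapter")  # placeholder path
    tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
    return model, tokenizer

st.title("E-Commerce FAQ Chatbot")
question = st.text_input("Ask a question about your order:")
if question:
    model, tokenizer = load_model()
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    st.write(tokenizer.decode(output[0], skip_special_tokens=True))
```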

Code and Resources

Feel free to explore, experiment, and contribute to further improvements.

References

  1. Language Models are Few-Shot Learners - Tom B. Brown et al. (2020)
  2. Llama 2: Open Foundation and Fine-Tuned Chat Models - Hugo Touvron et al. (2023)
  3. Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning - Haokun Liu et al. (2022)
  4. LoRA: Low-Rank Adaptation of Large Language Models - Edward J. Hu et al. (2021)
  5. An overview of Bard: an early experiment with generative AI - James Manyika (2023)
  6. IBM Watson Assistant
  7. Hugging Face Blog on PEFT
