This project implements a chatbot using a GPT-2 language model. The chatbot is trained on a combined dataset of question-answer pairs and dialogue exchanges, enabling it to handle various conversational contexts. The backend is built using FastAPI, and the frontend is a simple ReactJS application.
This is a day-1 effort, committed as the initial commit. It does not yet generate correct responses and still has a long way to go.
## Table of Contents

- Requirements
- Installation
- Data Preparation
- Training the Model
- Running the Backend
- Running the Frontend
- Testing the Chatbot
- Acknowledgments
## Requirements

- Python 3.8 or higher
- Node.js and npm
- pip (Python package installer)
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/chatbot-llm.git
   cd chatbot-llm
   ```
2. Set up the Python environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   pip install -r requirements.txt
   ```
3. Set up the ReactJS frontend:

   ```bash
   cd llm_frontend
   npm install
   cd ..
   ```
## Data Preparation

1. Create your datasets:

   `qa_dataset.json` (example):

   ```json
   [
     {"question": "What is AI?", "answer": "AI stands for Artificial Intelligence."},
     {"question": "Who wrote '1984'?", "answer": "George Orwell wrote '1984'."}
   ]
   ```
   `dialogue_dataset.json` (example):

   ```json
   [
     {"context": "Hello! How can I help you today?", "response": "Hi! I'm looking for information on your services."},
     {"context": "Sure, what do you need help with?", "response": "Can you tell me more about your pricing plans?"}
   ]
   ```
2. Preprocess the datasets:

   ```bash
   python preprocess.py
   ```
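The repository's actual `preprocess.py` is not reproduced in this README. As a hedged sketch, it might merge both datasets into a single plain-text training file, one example per line; the `train.txt` file name, the `Q:`/`A:` formatting, and the `<|endoftext|>` separator are assumptions (the separator follows GPT-2 convention), not the repo's confirmed format:

```python
# preprocess_sketch.py -- illustrative only; the real preprocess.py may differ.
# Merges qa_dataset.json and dialogue_dataset.json into one train.txt file,
# separating examples with GPT-2's <|endoftext|> token (format is assumed).
import json


def to_training_text(qa_pairs, dialogues):
    """Flatten both datasets into newline-separated training examples."""
    lines = []
    for pair in qa_pairs:
        lines.append(f"Q: {pair['question']} A: {pair['answer']} <|endoftext|>")
    for turn in dialogues:
        lines.append(f"{turn['context']} {turn['response']} <|endoftext|>")
    return "\n".join(lines)


if __name__ == "__main__":
    with open("qa_dataset.json") as f:
        qa = json.load(f)
    with open("dialogue_dataset.json") as f:
        dlg = json.load(f)
    with open("train.txt", "w") as f:
        f.write(to_training_text(qa, dlg))
```

Keeping both dataset shapes in one flat text file lets a standard causal-language-modeling pipeline consume them without custom collation.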
## Training the Model

Train the GPT-2 model on the combined dataset:

```bash
python train.py
```
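The contents of `train.py` are not shown in this README. One plausible shape, using Hugging Face `transformers`, is sketched below; the hyperparameters, `train.txt` input, and `model/` output directory are assumptions, not the repo's actual values. The heavy imports are deferred into `main()` so the file can be loaded and inspected without `transformers` installed:

```python
# train_sketch.py -- illustrative GPT-2 fine-tune; the real train.py may differ.

def default_hparams():
    """Assumed hyperparameters for a small GPT-2 fine-tune (not the repo's values)."""
    return {
        "model_name": "gpt2",
        "train_file": "train.txt",
        "output_dir": "model",
        "num_train_epochs": 3,
        "per_device_train_batch_size": 2,
        "block_size": 128,
    }


def main():
    # Imports deferred so the module loads without transformers installed.
    from transformers import (
        DataCollatorForLanguageModeling, GPT2LMHeadModel, GPT2Tokenizer,
        TextDataset, Trainer, TrainingArguments,
    )

    hp = default_hparams()
    tokenizer = GPT2Tokenizer.from_pretrained(hp["model_name"])
    model = GPT2LMHeadModel.from_pretrained(hp["model_name"])
    dataset = TextDataset(tokenizer=tokenizer, file_path=hp["train_file"],
                          block_size=hp["block_size"])
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
    args = TrainingArguments(
        output_dir=hp["output_dir"],
        num_train_epochs=hp["num_train_epochs"],
        per_device_train_batch_size=hp["per_device_train_batch_size"],
    )
    Trainer(model=model, args=args, data_collator=collator,
            train_dataset=dataset).train()
    model.save_pretrained(hp["output_dir"])
    tokenizer.save_pretrained(hp["output_dir"])


if __name__ == "__main__":
    main()
```

`mlm=False` selects causal language modeling, which is how GPT-2 is trained; `TextDataset` is an older convenience class, so newer `transformers` versions may prefer the `datasets` library instead.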
## Running the Backend

1. Ensure the virtual environment is activated:

   ```bash
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   ```
2. Start the FastAPI server:

   ```bash
   uvicorn main:app --reload
   ```
## Running the Frontend

1. Navigate to the frontend directory:

   ```bash
   cd llm_frontend
   ```
2. Start the React application:

   ```bash
   npm start
   ```
## Testing the Chatbot

1. Open your browser and navigate to http://localhost:3000.
2. Enter a prompt in the input field and click "Submit".
3. The chatbot should respond based on the trained model.
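The backend can also be exercised without the browser. The stdlib-only sketch below POSTs a prompt to the running server; the `/chat` route, the `{"prompt": ...}` payload, and the `{"response": ...}` reply shape are assumptions about the backend's API, not confirmed by this README:

```python
# api_test_sketch.py -- query the running backend from Python (stdlib only).
# The /chat route and {"prompt": ...} payload shape are assumptions.
import json
import urllib.request


def build_request(prompt, url="http://127.0.0.1:8000/chat"):
    """Build a POST request carrying the prompt as JSON."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url, data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the prompt to the backend and return its reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(ask("What is AI?"))
```

Running this against a live server gives a quicker feedback loop than clicking through the React UI while iterating on the model.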