This project provides a RESTful API using Flask that allows users to interact with a LLaMA model for text generation. The API accepts user input and returns generated text based on the input. It is designed to be simple and extensible for future enhancements.
- Python 3.9 or higher
- Flask
- Transformers (Hugging Face)
- Flask-CORS (for cross-origin resource sharing)
- PyTorch (or TensorFlow, depending on your model implementation)
- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```
- Install the required packages. Create a `requirements.txt` file with the following content:

  ```text
  flask
  transformers
  flask-cors
  torch  # or tensorflow, depending on your model
  ```

  Then run:

  ```bash
  pip install -r requirements.txt
  ```
- Download the LLaMA model (if not using a pre-trained model from Hugging Face): follow the instructions on the Hugging Face model hub to download the model and tokenizer.
- Ensure you have set the correct path for the LLaMA model in `app.py`:

  ```python
  model_name = "path/to/llama/model"  # Replace with the actual model path or name
  ```
- Start the Flask application:

  ```bash
  python app.py
  ```

  The application will run on `http://localhost:5000`.
- Endpoint: `POST /generate`
- Description: Generates text based on the input provided by the user.
- Request body:

  ```json
  { "text": "Your input text here" }
  ```

- Response:
  - Success (200):

    ```json
    { "response": "Generated text based on the input." }
    ```

  - Error (400):

    ```json
    { "error": "Missing 'text' field in the request" }
    ```
The API includes basic error handling for missing fields in the request body. If the `text` field is not provided, a 400 error response is returned.

You can test the API using `curl` or tools like Postman.
```bash
curl -X POST -H "Content-Type: application/json" \
  -d '{"text": "What is the weather today?"}' \
  http://localhost:5000/generate
```
In Postman:

- Set the request type to POST.
- Enter the URL: `http://localhost:5000/generate`.
- In the body, select "raw" and set the type to JSON, then enter:

  ```json
  { "text": "What is the weather today?" }
  ```

- Send the request and view the response.
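The same request can also be scripted with Python's standard library (no extra dependencies; this sketch assumes the server from the steps above is running on `localhost:5000`):

```python
# Minimal client for the /generate endpoint using only the standard library.
import json
import urllib.request


def build_request(text, url="http://localhost:5000/generate"):
    """Build a POST request carrying {"text": ...} as a JSON body."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(text):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(text)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print(generate("What is the weather today?"))
```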
To containerize the application, you can use Docker. This allows you to run the application in a consistent environment across different systems.

Create a file named `Dockerfile` in the root of your project.
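A possible starting point for that `Dockerfile` (a sketch; adjust the Python version, and copy in local model files if your model is not pulled from the Hugging Face hub at runtime):

```dockerfile
# Sketch of a Dockerfile for the Flask app; adjust versions and paths as needed.
FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code (and model files, if stored locally)
COPY . .

EXPOSE 5000

CMD ["python", "app.py"]
```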
To build the Docker image, navigate to the directory containing the `Dockerfile` and run:

```bash
docker build -t image-name .
```

Then run the container, mapping port 5000 to your local host:

```bash
docker run -p 5000:5000 image-name
```