Full-Stack PDF NLP Application

Overview

This project is a full-stack application that lets users upload PDF documents and ask questions about their content. The backend processes the documents and uses natural language processing (NLP) to answer those questions.

Table of Contents

Prerequisites
Installation
Setup
Usage
API Documentation
Application Architecture
Demo

Prerequisites

Python 3.9+: Required for the backend.
Node.js 14+: Required for the frontend.
PostgreSQL or SQLite: Required if using a database for storing document metadata.
Docker: Optional, for containerized deployment.

Installation

Backend

Structure

Clone the repository:

```
git clone https://github.com/atulyadav745/aiplanet.git
cd aiplanet/backend
```

Create and activate a virtual environment:

```
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install the required packages:
```
pip install -r requirements.txt
```

Frontend

Navigate to the frontend directory:
```
cd ../frontend
```
Install the dependencies:
```
npm install
```

Setup

Backend Configuration

Environment Variables: Create a .env file in the backend directory with the following:

```
DATABASE_URL=sqlite:///./test.db  # or your PostgreSQL connection string
STORAGE_PATH=./uploads
```
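
As a rough illustration of how the backend might consume these two variables, here is a minimal sketch using only the standard library. The `Settings` class and `load_settings` function are hypothetical names, not part of this repository, and the sketch assumes the variables are already present in the process environment (for example, exported in the shell or loaded from `.env` by a dotenv loader).

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two variables defined in .env."""
    database_url: str
    storage_path: Path


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read DATABASE_URL and STORAGE_PATH, falling back to the README defaults."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get("DATABASE_URL", "sqlite:///./test.db"),
        storage_path=Path(env.get("STORAGE_PATH", "./uploads")),
    )
```

Calling `load_settings()` with no arguments reads from `os.environ`; passing a plain dictionary makes the function easy to exercise in tests.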

Initialize the Database:
```
python main.py db init
```
Run the Backend:
```
uvicorn main:app --reload
```

Frontend Configuration

Run the Frontend:
```
npm start
```

Usage

Uploading PDFs

Navigate to the home page.
Click on the "Upload PDF" button.
Select and upload a PDF document.

Asking Questions

After uploading a PDF, go to the question input section.
Enter your question in the input field and submit.
View the answer below the question field.

API Documentation

Endpoints

Upload PDF

URL: /upload
Method: POST
Description: Uploads a PDF document.
Request:
- Headers: Content-Type: multipart/form-data
- Body: file: the PDF file.
Response:
- 200: Success, returns document metadata.
- 400: Error, returns error details.
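
The upload request described above can be sketched with the Python standard library. The helper name `build_upload_request` is hypothetical, and the base URL is an assumption (uvicorn listens on port 8000 by default); only the `/upload` path, the `POST` method, the `multipart/form-data` content type, and the `file` field name come from the endpoint description.

```python
import urllib.request
import uuid


def build_upload_request(base_url: str, filename: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST for /upload with a single part named "file"."""
    boundary = uuid.uuid4().hex  # random delimiter that separates the form parts
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/upload",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the backend running, passing the returned object to `urllib.request.urlopen` sends the file. The shape of the returned document metadata is not specified here, so inspect the response body before relying on any field.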

Ask Question

URL: /ask
Method: POST
Description: Receives a question and returns an answer based on the uploaded PDF.
Request:
- Headers: Content-Type: application/json
- Body: { "doc_id": "<document_id>", "question": "<user_question>" }
Response:
- 200: Success, returns the answer.
- 400: Error, returns error details.
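
A matching sketch for the question endpoint, again using only the standard library. The helper names `build_ask_request` and `send` are hypothetical, and the base URL is an assumption; the JSON keys `doc_id` and `question` are taken from the request body shown above.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def build_ask_request(base_url: str, doc_id: str, question: str) -> urllib.request.Request:
    """Build the JSON POST for /ask."""
    payload = json.dumps({"doc_id": doc_id, "question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ask",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send(req: urllib.request.Request) -> Tuple[int, str]:
    """Send a request and return (status, body); error statuses such as 400 are returned, not raised."""
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")
```

For example, `send(build_ask_request("http://localhost:8000", doc_id, "What is this document about?"))` returns the status code and the raw response text, where `doc_id` is whatever identifier the upload step returned.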

Application Architecture

You can view the application architecture here.

Backend

Framework: FastAPI
NLP: LangChain for processing questions and generating answers.
PDF Processing: PyMuPDF for extracting text from PDFs.
Data Management: SQLite/PostgreSQL for metadata storage.
File Storage: Local filesystem or AWS S3 for PDF storage.

Frontend

Framework: React.js
State Management: Context API.
HTTP Client: Axios for API requests.
Styling: CSS Modules or styled-components.

Data Flow

PDF Upload: The frontend sends the PDF file to the backend.
PDF Processing: The backend extracts text and stores metadata.
Question Processing: The frontend sends a question to the backend, which processes it and returns an answer.
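
The three steps above can be sketched as a small in-memory service. This is an illustrative stand-in, not the repository's code: the `DocumentService` class, its method names, and the metadata fields are hypothetical, and text extraction and answering are passed in as plain callables rather than wired to PyMuPDF and LangChain.

```python
import uuid
from typing import Callable, Dict


class DocumentService:
    """In-memory stand-in for the backend's upload and question steps."""

    def __init__(
        self,
        extract_text: Callable[[bytes], str],
        answer: Callable[[str, str], str],
    ) -> None:
        self._extract_text = extract_text  # placeholder for PDF text extraction
        self._answer = answer              # placeholder for the NLP answering step
        self._texts: Dict[str, str] = {}   # doc_id -> extracted text

    def upload(self, filename: str, pdf_bytes: bytes) -> dict:
        """Steps 1 and 2: accept the file, extract its text, return metadata."""
        doc_id = uuid.uuid4().hex
        self._texts[doc_id] = self._extract_text(pdf_bytes)
        return {"doc_id": doc_id, "filename": filename}

    def ask(self, doc_id: str, question: str) -> str:
        """Step 3: answer a question against a previously uploaded document."""
        if doc_id not in self._texts:
            raise KeyError(f"unknown doc_id: {doc_id}")
        return self._answer(self._texts[doc_id], question)
```

Keeping extraction and answering behind callables means the flow can be tested with simple stubs before the real libraries are connected.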

Demo

You can view a live demo here.

Alternatively, watch a screencast here.

If you have any questions or need further assistance, feel free to open an issue.

Happy coding! 🎉
