ammarlodhi255 / image-captioning-system-to-assist-the-blind

An image captioning system that predicts and speaks aloud a caption for an image taken by a visually impaired user.

Jupyter Notebook 96.25% Python 1.35% HTML 1.24% JavaScript 0.64% CSS 0.52%

Image Captioning System to Assist The Blind

About
Getting Started
Screenshots

About

The goal of the project is to develop a deep learning system that helps visually impaired individuals obtain information by describing the images they take. It combines a CNN model and an NLP model into a single image captioning system: the CNN extracts image features, and the language model takes those features as input and generates a text sequence describing the image.
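The generation step described above can be illustrated with a minimal greedy decoding sketch. It is not the project's code: the `startseq`/`endseq` tokens, the `next_word_probs` callable, and the toy decoder are all hypothetical stand-ins for a trained model, and the source does not say which decoding strategy the project uses.

```python
from typing import Callable, Dict, List, Sequence

START, END = "startseq", "endseq"  # hypothetical caption boundary tokens


def greedy_caption(
    image_features: Sequence[float],
    next_word_probs: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_length: int = 20,
) -> str:
    """Build a caption word by word, always taking the most likely next word."""
    words = [START]
    for _ in range(max_length):
        probs = next_word_probs(image_features, words)
        best = max(probs, key=probs.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_decoder(features, words_so_far):
    """Stand-in for a trained decoder: ignores the features, follows a fixed script."""
    script = ["a", "dog", "runs", END]
    target = script[min(len(words_so_far) - 1, len(script) - 1)]
    return {w: (0.9 if w == target else 0.1 / 3) for w in set(script)}


print(greedy_caption([0.1, 0.2], toy_decoder))
```

In a real system `next_word_probs` would wrap the trained model's prediction over the vocabulary, given the extracted image features and the words generated so far.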

The project uses pre-trained models (ResNet50, VGG16, and VGG19) for image feature extraction, and LSTM and Bidirectional LSTM networks for text generation. Several model combinations were evaluated; the best-performing one reached a BLEU score of 0.61 and was deployed with Flask for the web app and pyttsx3 for text-to-speech.
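For readers unfamiliar with the BLEU score reported above, here is a minimal pure-Python sketch of the metric: clipped n-gram precision combined with a brevity penalty, without smoothing. The source does not state which BLEU variant or library produced the 0.61 figure, so the function below is an illustration under that assumption, not the project's evaluation code.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: List[str], references: List[List[str]], max_n: int = 4) -> float:
    """Unsmoothed BLEU with uniform weights over 1..max_n grams."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty against the reference length closest to the candidate's.
    c = len(candidate)
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    penalty = 1.0 if c > r else math.exp(1 - r / c)
    return penalty * math.exp(sum(log_precisions) / max_n)
```

A caption identical to its reference scores 1.0, and one sharing no words with it scores 0.0. Because this sketch applies no smoothing, short captions with no matching 4-gram also score 0.0 unless `max_n` is lowered.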

Getting Started

These instructions will get you a copy of the project up and running on your local machine.

Clone the project repository from GitHub:

git clone https://github.com/ammarlodhi255/image-captioning-system-to-assist-the-blind.git

Navigate to the project directory:

cd image-captioning-system-to-assist-the-blind

Create a virtual environment for the project:

python3 -m venv env

Activate the virtual environment:

source env/bin/activate

Set the FLASK_APP environment variable so Flask knows which file to run:

export FLASK_APP=app.py

Run the Flask app:

flask run
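Once the steps above have completed, a quick way to confirm the app is reachable is a small standard-library check. This is a sketch that assumes the server listens on Flask's default development address, http://127.0.0.1:5000; the helper name `server_is_up` is hypothetical, and the routes defined in app.py are not known here, so any HTTP response (even an error status) is treated as "up".

```python
import http.client
import urllib.error
import urllib.request


def server_is_up(url: str = "http://127.0.0.1:5000/", timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, just not with a 2xx status
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        return False  # connection refused, timed out, or malformed response


if __name__ == "__main__":
    print("up" if server_is_up() else "not reachable")
```

Run it from a second terminal while `flask run` is still active in the first.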

Screenshots

Dataset Split

Model Anatomy

Project Workflow

Results

Final Outcome

Additional Outputs
