
opticalcharacterrecognition's Introduction

Overview: The Optical Character Recognition (OCR) project converts images of text into machine-readable text. Users can take scanned documents or other images containing text and turn them into editable, searchable text documents.

Technologies Used:

Python: The primary programming language used for the project.
TensorFlow and Keras: Used for building and training the Convolutional Neural Network (CNN) for character recognition.
OpenCV: Utilized for image processing tasks such as converting images to grayscale, blurring, and edge detection.
PyTesseract: A Python wrapper for Google's Tesseract-OCR Engine, used for extracting text from images.
PyQt5: Used for building the graphical user interface (GUI) of the application.

Project Structure:

model.h5: Pre-trained model file for the CNN.
OCR.py: Main Python script containing the implementation for loading the model, preprocessing the images, and recognizing text.
README.md: Documentation file providing an overview and setup instructions.
sample_image: Directory containing sample images used for testing the OCR.

Implementation Details:

Model Training:
    Dataset: The Extended MNIST (EMNIST) dataset is used, which includes handwritten digits and letters.
    Preprocessing: The images are resized, normalized, and split into training and testing sets.
    CNN Architecture: A Sequential CNN model is built using Keras from convolutional, max-pooling, and dense layers.
    Training: The model is trained on the EMNIST dataset, and the training performance is visualized using matplotlib.
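The preprocessing step above can be sketched with NumPy alone. This is a minimal illustration, not the project's actual code: the function name `preprocess`, the `test_fraction` and `seed` defaults, and the random stand-in data are all assumptions, and the images are assumed to already be 28x28 grayscale (the EMNIST image size), so no resizing is shown.

```python
import numpy as np


def preprocess(images, labels, test_fraction=0.2, seed=0):
    """Normalize 28x28 grayscale images and split them into train/test sets.

    images: uint8 array of shape (N, 28, 28); labels: int array of shape (N,).
    test_fraction and seed are illustrative defaults, not values from the project.
    """
    # Scale pixel values from [0, 255] to [0.0, 1.0].
    x = images.astype("float32") / 255.0
    # Add a trailing channel axis so the shape becomes (N, 28, 28, 1),
    # the layout a Keras Conv2D layer expects for single-channel input.
    x = x[..., np.newaxis]

    # Shuffle the sample indices reproducibly, then cut off the test portion.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_test = round(len(x) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (x[train_idx], labels[train_idx]), (x[test_idx], labels[test_idx])


# Usage with random stand-in data (not real EMNIST samples).
fake_images = np.random.default_rng(1).integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_labels = np.arange(100) % 10
(x_train, y_train), (x_test, y_test) = preprocess(fake_images, fake_labels)
print(x_train.shape, x_test.shape)
```

The resulting arrays could then be passed to a Keras model's `fit` method. Shuffling before the split keeps the class balance of the two sets roughly equal when the source data is ordered by label.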

Image Recognition:
    Image Loading and Preprocessing: The input image is converted to grayscale and blurred to reduce noise.
    Edge Detection: Canny edge detection is used to highlight the edges of the characters in the image.
    Contour Detection: Contours of the characters are identified and filtered based on size.
    Character Extraction: Each contour is extracted, thresholded, and resized to match the input size expected by the CNN.
    Prediction: The pre-trained CNN model predicts the character for each contour, and the results are displayed on the image.
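The contour filtering and character extraction steps above can be sketched as follows. This is a hedged NumPy-only illustration rather than the code in OCR.py: the helper names, the size limits, and the threshold value are hypothetical, and the sketch assumes dark text on a light background. In the real pipeline the `(x, y, w, h)` boxes would typically come from OpenCV's `cv2.boundingRect` applied to each contour found by `cv2.findContours`.

```python
import numpy as np


def filter_and_sort_boxes(boxes, min_size=10, max_size=200):
    """Keep (x, y, w, h) boxes within a size range, ordered left to right.

    min_size and max_size are hypothetical limits chosen for illustration.
    """
    kept = [b for b in boxes
            if min_size <= b[2] <= max_size and min_size <= b[3] <= max_size]
    return sorted(kept, key=lambda b: b[0])


def extract_character(gray, box, threshold=128, size=28):
    """Crop one box, binarize it, and resize it to a (1, size, size, 1) batch."""
    x, y, w, h = box
    crop = gray[y:y + h, x:x + w]
    # Inverse binary threshold: dark ink becomes 255 and light paper becomes 0,
    # which matches the white-on-black glyphs of MNIST-style data.
    binary = np.where(crop < threshold, 255, 0).astype(np.uint8)

    # Pad to a square so the resize does not distort the aspect ratio.
    side = max(w, h)
    square = np.zeros((side, side), dtype=np.uint8)
    top, left = (side - h) // 2, (side - w) // 2
    square[top:top + h, left:left + w] = binary

    # Nearest-neighbour resize by sampling evenly spaced rows and columns.
    idx = np.arange(size) * side // size
    resized = square[np.ix_(idx, idx)]

    # Scale to [0, 1] and add batch and channel axes for the CNN input.
    return (resized.astype("float32") / 255.0).reshape(1, size, size, 1)


# Usage on a synthetic white page with two dark blobs and one noise speck.
page = np.full((60, 120), 255, dtype=np.uint8)
page[10:50, 70:90] = 0
page[20:40, 20:35] = 0
boxes = [(70, 10, 20, 40), (20, 20, 15, 20), (5, 5, 2, 2)]
kept = filter_and_sort_boxes(boxes)
batch = extract_character(page, kept[0])
print(kept, batch.shape)
```

Each batch produced this way has the shape a model trained on 28x28 single-channel images would accept, so it could be passed to the loaded model's `predict` method. Sorting the boxes by their x coordinate keeps the predicted characters in reading order for a single line of text.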

Setup Instructions:

    Clone the Repository:

        git clone https://github.com/Mudit2003/OpticalCharacterRecognition.git
        cd OpticalCharacterRecognition

    Install Dependencies:

        pip install tensorflow matplotlib opencv-python pytesseract pyqt5

    Run the Application:

        python OCR.py

Usage:

Modes: The application can operate in two modes: 'image' mode for recognizing text in static images and 'webcam' mode for live video capture and text recognition.
Pre-trained Model: The pre-trained model (model.h5) can be used to skip the training process and directly perform recognition.
