This project is forked from fabinhojorge/ocr_web

OCR web

This project is an OCR (Tesseract) web interface for uploading images. Its purpose is to study technologies such as Python, Django, Tesseract (OCR), continuous integration, and Celery.

How to Install and Run

After activating your Python virtual environment (venv), run the commands below:

pip install -r requirements.txt

python manage.py migrate

python manage.py runserver

You can then access the application at the local URL: localhost:8000

The requirements.txt file includes a package called pytesseract. It is the wrapper used to communicate with the Tesseract library (C/C++ code). So, the next step is to install Tesseract itself.
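To illustrate how the wrapper is typically called, here is a minimal sketch. It assumes pytesseract and Pillow are installed from requirements.txt and that the Tesseract binary is on the PATH. The function name extract_text is hypothetical and is not taken from this project's code.

```python
from pathlib import Path


def extract_text(image_path, lang="eng"):
    """Run Tesseract on an image file and return the recognized text.

    Hypothetical helper for illustration; not part of this project.
    """
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")

    # Imported here so the module can be loaded even before the
    # packages from requirements.txt are installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as image:
        # lang is the Tesseract language code, for example "eng" or "por".
        return pytesseract.image_to_string(image, lang=lang)
```

Calling extract_text("scan.png", lang="por") would return the recognized text as a string, provided the Portuguese language data is available to Tesseract.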

To do so, follow the Tesseract installation instructions for your operating system.

If an additional language is required, download its language data file and move it to $TESSERACT_PATH/tessdata/
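A quick way to confirm that a language file landed in the right place is sketched below. It assumes the usual Tesseract naming of <lang>.traineddata and reads TESSERACT_PATH from the environment as this README writes it. That variable is not set by Tesseract itself, and the helper names are hypothetical.

```python
import os
from pathlib import Path


def language_data_path(lang, tesseract_path=None):
    """Return the expected location of a language's data file."""
    # TESSERACT_PATH mirrors the variable used in this README; set it
    # to match your own installation (assumption, not a Tesseract default).
    base = tesseract_path or os.environ.get("TESSERACT_PATH", "")
    return Path(base) / "tessdata" / f"{lang}.traineddata"


def is_language_installed(lang, tesseract_path=None):
    """Check whether the data file for the given language code exists."""
    return language_data_path(lang, tesseract_path).is_file()
```

For example, is_language_installed("por") reports whether por.traineddata is present under $TESSERACT_PATH/tessdata/.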

How to use

  1. TBD

Libraries

  • Django
  • Pillow
  • Bootstrap
  • jQuery
  • Tesseract (pytesseract)
  • Celery

To Do

  • Create an initial project
  • Add the continuous integration build and test (CircleCI)
  • Create the upload media system: models, forms, templates, media url, etc...
  • Call the OCR to process the image
  • Add an image link on the home page. Clicking the image opens a modal to view the image.
  • Add support for different languages. The OCR has a parameter to select the language of the text. The user specifies it while uploading the image.
  • Create a model to store and an interface to return the text generated by the OCR.
  • Pagination on the first page
  • Model to handle a copy of the original image. This copy will be used to run the OCR. The idea is to use this copy in the future to apply image treatments (all triggered from the interface).
  • Basic image treatments such as: cut, rotate, threshold, 'grow'
  • After the core is working, enhance the back end with Celery.
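As an illustration of the "threshold" treatment mentioned in the list above, here is a minimal sketch on plain rows of grayscale values (0 to 255). It only shows the idea; the project would more likely apply this through Pillow, and the function name and default cutoff are assumptions.

```python
def threshold(pixels, cutoff=128):
    """Binarize grayscale rows: values >= cutoff become 255, others 0.

    Hypothetical helper for illustration; not part of this project.
    """
    return [[255 if value >= cutoff else 0 for value in row] for row in pixels]
```

Binarizing an image this way before running OCR often makes text stand out from a noisy background, though the best cutoff depends on the image.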

Screen Shots

Home Page

Image Zoom
