ctwardy / python_ocr_tutorial

This project forked from ybur-yug/python_ocr_tutorial

This is a tutorial on getting OCR running on a simple web server, using python, flask, tesseract-ocr, and leptonica

License: MIT License


Python OCR Docker

This is my branch of a RealPython Tutorial. I've updated it for Python3 / Unicode, and dropped most of the non-Docker bits so I can inherit from the tesseractshadow/tesseract4re Docker container. The result is a simple OCR Docker app using Tesseract. It provides:

  • cli.py - a command-line app that takes a URL and returns the text extracted from the image.
  • app.py - a small Flask app that does the same thing, but in the browser.

The only preprocessing is a single sharpening pass using PIL's ImageFilter.SHARPEN filter.
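A sharpening step of this kind might look like the sketch below. It assumes the Pillow package is installed; the helper name `sharpen` and the in-memory demo image are made up for illustration and are not taken from the project's code.

```python
from PIL import Image, ImageFilter


def sharpen(image: Image.Image) -> Image.Image:
    """Return a sharpened copy of the image; the input is left unchanged."""
    # ImageFilter.SHARPEN is a predefined kernel filter, applied via Image.filter().
    return image.filter(ImageFilter.SHARPEN)


if __name__ == "__main__":
    # Hypothetical stand-in for a downloaded image: a small gray square.
    demo = Image.new("L", (32, 32), color=128)
    result = sharpen(demo)
    print(result.size, result.mode)
```

Because `Image.filter()` returns a new image, the original can still be passed to Tesseract for comparison.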

Changes include:

  • Updated for Python3 & Unicode.
  • Refactored to use tesseractshadow/tesseract4re Docker container.
  • Allows file:/// URLs. (Relative to the container!)
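As a rough illustration of how one function can accept both web and file:/// URLs, the standard library's `urllib.request.urlopen` handles both schemes. This is a sketch under that assumption, not the project's actual code; `fetch_bytes` is a hypothetical name. Inside Docker, a file:/// path resolves against the container's filesystem, not the host's.

```python
import tempfile
import urllib.request
from pathlib import Path


def fetch_bytes(url: str) -> bytes:
    """Read the raw bytes behind an http(s):// or file:// URL."""
    # urlopen supports the file: scheme as well as http: and https:.
    with urllib.request.urlopen(url) as response:
        return response.read()


if __name__ == "__main__":
    # Demonstrate with a temporary local file instead of a real image.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.bin"
        sample.write_bytes(b"not really an image")
        # Path.as_uri() builds a file:/// URL from an absolute path.
        print(fetch_bytes(sample.as_uri()))
```

Accepting arbitrary file:/// URLs lets a caller read any file the process can see, so this is only reasonable for a local demo container.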

Alternatives

Usage

Install Docker

If you are not familiar with Docker, please read Docker - Get Started.

Quick Start: CLI

The following should run the Docker container straight from DockerHub.

docker container run --publish 5000:5000 --interactive --tty ctwardy/python_ocr_tutorial:2.0 python3 cli.py

That should produce the following output:

===OOOO=====CCCCC===RRRRRR=====
==OO==OO===CC=======RR===RR====
==OO==OO===CC=======RR===RR====
==OO==OO===CC=======RRRRRR=====
==OO==OO===CC=======RR==RR=====
==OO==OO===CC=======RR== RR====
===OOOO=====CCCCC===RR====RR===

A simple OCR utility
What is the url of the image you would like to analyze?

Type the following:

file:///flask_server/tests/advertisement.jpg

to see:

The raw output from tesseract with no processing is:
-----------------BEGIN-----------------
b'ADVERTISEMENT.\n\nTus publication of the Works of Joan Knox, it is\nsupposed, will extend to Five Volumes. It was thought\nadvisable to commence the series with his History of\nthe Reformation in Scotland, as the work of greatest\nimportance. The next volume will thus contain the\nThird and Fourth Books, which continue the History to\nthe year 1564; at which period his historical labours\nmay be considered to terminate. But the Fifth Book,\nforming a sequel to the History, and published under\nhis name in 1644, will also be included. His Letters\nand Miscellaneous Writings will be arranged in the\nsubsequent volumes, as nearly as possible in chronolo-\ngical order ; each portion being introduced by a separate\nnotice, respecting the manuscript or printed copies from\nwhich they have been taken.\n\nIt may perhaps be expected that a Life of the Author\nshould have been prefixed to this volume. The Life of\nKnox, by Dr. M\xe2\x80\x98Crig, is however a work so universally\nknown, and of so much historical value, as to supersede\nany attempt that might be made for a detailed bio-'
------------------END------------------
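The raw output above is a Python 3 bytes literal, which is why the quotation mark in the name appears as the escape sequence \xe2\x80\x98. Decoding the bytes as UTF-8 turns it into readable text. This is a minimal sketch using a short excerpt of that output; the variable names are illustrative only.

```python
# A short excerpt of the raw bytes shown in the output above.
raw = b"Knox, by Dr. M\xe2\x80\x98Crig, is however a work"

# Decoding as UTF-8 turns the three escaped bytes into one character,
# U+2018 (left single quotation mark).
text = raw.decode("utf-8")

print(text)
```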

Quick Start: App

The following runs the Flask app from the same image, pulled from DockerHub:

docker container run --publish 5000:5000 ctwardy/python_ocr_tutorial:2.0

Then visit localhost:5000 in your browser and type file:///flask_server/tests/advertisement.jpg to see essentially the same result, but rendered more nicely. Click "Again" to try another.

Stop it with Ctrl-C as usual.

Get this from GitHub

git clone https://github.com/ctwardy/python_ocr_tutorial.git

Build the Docker image:

From the python_ocr_tutorial folder, do:

docker image build --tag python_ocr_test .

(Note the trailing "." -- it means build from this folder.)

Pull the Docker image:

In case you want to pull the image without running it:

docker pull ctwardy/python_ocr_tutorial:2.0

Run either the CLI or the App:

We did these above, but for reference:

  • CLI: docker container run --publish 5000:5000 --interactive --tty ctwardy/python_ocr_tutorial:2.0 python3 cli.py
  • App: docker container run --publish 5000:5000 ctwardy/python_ocr_tutorial:2.0

Stop the app using Ctrl-C.

TODO (or see GitHub Issues):

  • Browser app still won't display "<" characters. The returned JSON is fine, so it's getting sanitized in the JavaScript or browser.
  • Add endpoint for POSTing images directly.
  • Add file upload widget.
  • Try Tesseract4 instead of Tesseract3: tesseractshadow/tesseract4cmp
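For the first item, one possible approach is to escape the extracted text on the server before it reaches the page. This is only a guess at the cause: it assumes the browser code inserts the text as HTML, which nothing here confirms. `escape_for_html` is a hypothetical helper, not part of the project.

```python
import html


def escape_for_html(text: str) -> str:
    """Replace &, <, > and quotes with HTML entities so they display literally."""
    return html.escape(text)


if __name__ == "__main__":
    print(escape_for_html("if a < b & b > c"))
```

If the page inserts text through a plain-text property instead, escaping on the server would show the entities literally, so the fix may belong on the JavaScript side.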

:)

Contributors

ctwardy, mjhea0, ybur-yug
