Giter Site home page Giter Site logo

azure-ai-ocr's Introduction

AI OCR with Azure

This project demonstrates 'autopilot' OCR with Azure Cognitive Services

Classic OCR models need training to extract structured information from documents. In this project I demonstrate how to use hybrid approach with LLM (multimodal) to get better results without any pre-training.

The project uses Azure Document Intelligence combined with GPT4 and GPT-Vision. Each of the tools have their strong points and the hybrid approach is better than any of them alone.

Notes:

  • The document-intelligence needs to be using the markdown preview (limited regions).
  • The openai model needs to be vision capable.

How to use

Run example projects in examples-* folder.

The examples need docker to run. Each folder has a script that you can execute to run the complete example. Each folder has also .env file that needs to be filled with your Azure service credentials.

Complete the .env files in each example folder before running.

Note: The powershell scripts don't work very well, the bash scripts are better...

Example 1

example 1 - Sample collection Extract process of water sample providing from an information flyer.

Example 2

example 2 - Complex tables Let's find some insurance products from a more complex table.

Notes on the examples

  • I used https://bjdash.github.io/JSON-Schema-Builder/ to create the json-schemas in the example folders. If the keys in the json model are not self-explanatory, you should use description fields to tell the LLM model what you mean by each key to increase accuracy.

User interface

User interface is provided for testing purposes only. To run it locally, install

poetry install --with ui

then run ./ui.sh in the root folder. (env is picked from .env file in the root folder)

Develop

Install with poetry

poetry install

azure-ai-ocr's People

Contributors

piizei avatar

Stargazers

 avatar Dominique Broeglin avatar Aymen avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.