occ-ai / obs-ocr

OCR Plugin for OBS based on Tesseract

Home Page: https://obsproject.com/forum/resources/ocr-text-recognition-detection-built-in-obs.1866/

License: GNU General Public License v2.0

CMake 50.36% Shell 4.90% C 4.53% C++ 40.22%
obs-plugin obs-studio obs-studio-plugin ocr optical-character-recognition realtime-ocr tesseract-ocr

obs-ocr's Introduction

OCR - OBS Plugin

Introduction

The OCR Plugin for OBS provides real-time Optical Character Recognition (OCR), or text recognition, over any OBS Source that provides an image: an Image, Video, Browser or any other Source. It is based on the incredible Tesseract open source OCR engine, compiled and running directly inside OBS for real-time operation on every frame rendered.

Reading scoreboards? Try https://scoresight.live, a free OCR tool made (by us) specifically for reading and broadcasting scoreboards.

If this free plugin has been valuable to you, consider adding a ⭐ to this GH repo, subscribing to my YouTube channel where I post updates, and supporting my work at https://www.patreon.com/RoyShilkrot or https://github.com/sponsors/royshil

Do more with OCR Plugin

OCR Plugin enables many use cases for enhancing your stream or recording.

Features

Available now:

  • Add OCR Filter to any source with image or video output
  • Choose from the Scoreboard model or English, French, Spanish, German, Chinese, Japanese, Arabic, Turkish, Portuguese, Hindi, Russian and Italian
  • Choose the segmentation mode: Word, Line, Page, etc.
  • "Semantic Smoothing": getting more consistent outputs with higher accuracy and confidence by "averaging" several text outputs
  • Timing/Running modes: per X-milliseconds
  • Output OCR result to an OBS Text Source
  • Output to a text file (with or without aggregation)
  • Output formatting (with inja): e.g. "Score: {{score}}"
  • Output text detection to image source (draws boxes, text, etc.)
  • Output to settings (e.g. for other plugins to use as triggers)
  • Binarization methods (threshold, Otsu, Triangle, adaptive)
  • Image Dilation
  • Rescale (optimal Tesseract performance is at 35 pixels / character)
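The "Semantic Smoothing" item above describes "averaging" several text outputs. How the plugin does this internally is not stated here, so the following is only a minimal Python sketch of one plausible approach: a majority vote over the last few readings. The class name `MajoritySmoother` and the `window` parameter are hypothetical and are not plugin settings.

```python
from collections import Counter, deque


class MajoritySmoother:
    """Keep the last `window` OCR readings and report the most frequent one."""

    def __init__(self, window: int = 5) -> None:
        # `window` is a hypothetical parameter, not a plugin setting.
        self.readings = deque(maxlen=window)

    def update(self, text: str) -> str:
        """Record a new raw reading and return the smoothed output."""
        self.readings.append(text.strip())
        counts = Counter(self.readings)
        best = max(counts.values())
        # Break ties in favour of the most recent reading.
        for reading in reversed(self.readings):
            if counts[reading] == best:
                return reading
        return ""  # not reached: the deque always holds at least one reading


# Made-up sample readings: two frames contain typical OCR confusions.
smoother = MajoritySmoother(window=5)
raw_frames = ["12:05", "12:05", "l2:05", "12:05", "12:O5"]
smoothed = [smoother.update(frame) for frame in raw_frames]
```

In this sketch the occasional misread frame is outvoted by its neighbours, at the cost of reacting a few frames late when the text really changes.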

Coming soon:

  • More languages built-in (pretrained Tesseract models)
  • Allowing external model files
  • More output capabilities e.g. Parsing, websocket event, etc.
  • Detection area selection (to prevent using Crop/Pad Filter)
  • Different timing/run modes: per X-frames, image change, etc.
  • Image stabilization
  • Optical flow tracking for fast moving text
  • Image processing: Perspective warping, auto-cropping, etc.
  • Advanced binarization: Niblack, Sauvola

Check out our other plugins:

  • Background Removal removes background from webcam without a green screen.
  • Detect will detect and track >80 types of objects in any OBS source.
  • LocalVocal speech AI assistant plugin for real-time, local transcription (captions), translation and more language functions
  • Polyglot translation AI plugin for real-time, local translation to hundreds of languages
  • URL/API Source will connect to any HTTP URL/API and bring the data/image/audio into your scene.
  • 🚧 Experimental 🚧 CleanStream for real-time filler word (uh, um) and profanity removal from a live audio stream

If you like this work, which is given to you completely free of charge, please consider supporting it https://github.com/sponsors/royshil or https://www.patreon.com/RoyShilkrot

Download

Check out the latest releases for downloads and install instructions.

Building

The plugin was built and tested on Mac OSX (Intel & Apple silicon), Windows and Linux.

Start by cloning this repo to a directory of your choice.

Mac OSX

To build locally, call the zsh script used by the CI pipeline. By default this builds a universal binary for both Intel and Apple Silicon. To build for a specific architecture, see the -arch options in .github/scripts/.build.zsh.

$ ./.github/scripts/build-macos -c Release

Install

If the above script succeeds, the plugin files (e.g. obs-ocr.plugin) will reside in the ./release/Release folder under the repository root. Copy the .plugin file to the OBS plugins directory, e.g. ~/Library/Application Support/obs-studio/plugins.

To get a .pkg installer file, run for example:

$ ./.github/scripts/package-macos -c Release

(Note that the outputs may end up in the Release folder rather than the install folder that package-macos expects, in which case you will need to rename build_x86_64/Release to build_x86_64/install.)

Linux (Ubuntu)

Use the CI scripts again

$ ./.github/scripts/build-linux.sh

Copy the results to the standard OBS folders on Ubuntu

$ sudo cp -R release/RelWithDebInfo/lib/* /usr/lib/x86_64-linux-gnu/
$ sudo cp -R release/RelWithDebInfo/share/* /usr/share/

Note: The official OBS plugins guide recommends adding plugins to the ~/.config/obs-studio/plugins folder.

Windows

Use the CI scripts again, for example:

> .github/scripts/Build-Windows.ps1 -Target x64 -CMakeGenerator "Visual Studio 17 2022"

The build output should be in the ./release folder under the repository root. You can then manually install the files in the OBS directory.

obs-ocr's People

Contributors

royshil, umireon

obs-ocr's Issues

Tesseract Errors and Missing PNG Library

First of all, I want to thank you very much; your OCR tools are exactly what I was looking for! :) Keep up the great work!

When I load and use this plugin in OBS, I get some errors that I believe are related to Tesseract. Can these be ignored?

info: [obs-ocr] Loading tesseract model from: /usr/share/obs/obs-plugins/obs-ocr/tessdata
Error in pixReadMemTiff: function not present
Error in pixReadMem: tiff: no pix returned
Error in pixaGenerateFontFromString: pix not made
Error in bmfCreate: font pixa not made
info: [obs-ocr] Starting Tesseract thread, update timer: 100
info: Switched to scene 'Szene'
info: ------------------------------------------------
info: Loaded scenes:
info: - scene 'Szene':
info: - source: 'FLIR' (xshm_input)
info: - filter: 'Schärfung' (sharpness_filter_v2)
info: - filter: 'OCR-Filter' (ocr_filter)
info: - filter: 'Farbkorrektur' (color_filter_v2)
info: ------------------------------------------------
Warning in changeFormatForMissingLib: png library missing; output bmp format
Warning in changeFormatForMissingLib: png library missing; output bmp format
Warning in changeFormatForMissingLib: png library missing; output bmp format
Warning in changeFormatForMissingLib: png library missing; output bmp format

[Request] Add image recognition to display another image

Would it be possible to add some kind of image recognition (rather than just text OCR) that then displays another image? In my use case I would like to record a TCG match and have a high-res version of the most recently played card, detected in a specified area, show up on the overlay.

Add support for outputting text to a text file

Feature Request

Support outputting to a local text file


First of all amazing plugin, I've been waiting for something like this for a long time. I can see so many great uses for this to really help the life of a streamer.

It would be great if the OCR plugin could write to a text file on the user's system. This opens up quite a few possibilities for external apps to react to the OCR results. An example might be a chat bot that can read local files, or something like Lumia Stream, where you could trigger lights based on OCR.
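To illustrate the external-app idea in this request, here is a minimal Python sketch of a script that polls a text file and reports its contents whenever they change. The class `TextFileWatcher` and the file name `ocr_output.txt` are hypothetical; the demo writes a temporary file itself to stand in for the plugin's output.

```python
import os
import tempfile
from typing import Optional


class TextFileWatcher:
    """Report the contents of a text file whenever they change."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.last_text: Optional[str] = None

    def poll(self) -> Optional[str]:
        """Return the new text if it changed since the last poll, else None."""
        try:
            with open(self.path, encoding="utf-8") as handle:
                text = handle.read().strip()
        except FileNotFoundError:
            return None  # the file has not been written yet
        if text == self.last_text:
            return None
        self.last_text = text
        return text


# Demo: a temporary file stands in for the file the OCR output is written to.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "ocr_output.txt")  # hypothetical name
    watcher = TextFileWatcher(demo_path)
    events = [watcher.poll()]          # file does not exist yet
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("3 - 1\n")        # made-up sample text
    events.append(watcher.poll())      # new text is reported once
    events.append(watcher.poll())      # unchanged text is not reported again
```

A real consumer would call `poll()` in a loop with a short sleep and trigger its action (a chat message, a light change) each time a non-`None` value comes back.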

Find and blur text on image?

@umireon you are wrong.
StreamFX can blur an X,Y,X1,Y1 rectangle.
In my example I have dynamic coordinates of the text...
It moves like a chat,
and it needs to be OCRed and blurred at different places on the screen.

Will it support several masks?

I think I can make it produce a binary mask with rectangles on the areas where text was detected.
So it's a single mask but it covers multiple text areas.
The blur filter shouldn't care either way.
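The single-mask idea in this reply can be sketched as follows. This is an illustrative Python version only, not the plugin's code: the function name `boxes_to_mask` and the `(x, y, width, height)` box layout are assumptions, and the mask is a plain list of rows rather than an OBS texture.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height): a hypothetical layout


def boxes_to_mask(width: int, height: int, boxes: List[Box]) -> List[List[int]]:
    """Return a height x width mask: 255 inside any box, 0 elsewhere."""
    mask = [[0] * width for _ in range(height)]
    for x, y, w, h in boxes:
        # Clip each box to the image so out-of-range detections are safe.
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + w, width), min(y + h, height)
        for row in range(y0, y1):
            for col in range(x0, x1):
                mask[row][col] = 255
    return mask


# Two made-up text detections; the second one overhangs the image edge.
mask = boxes_to_mask(8, 4, [(1, 0, 3, 2), (6, 2, 5, 5)])
```

Because every detected rectangle is painted into the same mask, a downstream blur filter only ever sees one mask, however many text areas were found.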

QR read possibility

Hello again. I would like to propose a suggestion for future versions of the OCR plugin, in case it can be done: reading QR codes directly and transforming them into numbers. Example: a table tennis scoreboard with numbers and QR codes. With a camera, capture 4 QR codes and transform them into a live scoreboard in OBS Studio. Thanks

[Request] Add blacklist words

Hi,
I don't know if it's easy to add a feature to only capture a list of words (a banned words list from a file or a field in the GUI).

For example, I would like the plug-in to be able to capture my email address or a list of IP addresses (or other words from a list) so that they are not visible (with composite blur).

Thanks for your project.
Alexia.
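As a rough illustration of this request, the Python sketch below filters OCR words against a blocklist and keeps the bounding boxes of the matches, which could then be blurred. Everything here is hypothetical: the function `find_blocked`, the `(text, box)` word layout, and the sample words are made up for the example and do not reflect an existing plugin feature.

```python
from typing import Iterable, List, Tuple

# (text, (x, y, width, height)): a hypothetical layout for one OCR word
Word = Tuple[str, Tuple[int, int, int, int]]


def find_blocked(words: Iterable[Word], blocklist: Iterable[str]) -> List[Word]:
    """Return the OCR words whose text matches a blocklist entry."""
    # Compare case-insensitively and ignore blank blocklist lines.
    blocked = {entry.strip().casefold() for entry in blocklist if entry.strip()}
    return [word for word in words if word[0].strip().casefold() in blocked]


# Made-up OCR output and blocklist.
ocr_words = [
    ("Contact:", (10, 10, 80, 20)),
    ("user@example.com", (95, 10, 180, 20)),
    ("192.168.0.12", (10, 40, 110, 20)),
    ("hello", (130, 40, 50, 20)),
]
hits = find_blocked(ocr_words, ["User@Example.com", "192.168.0.12"])
boxes_to_blur = [box for _, box in hits]
```

Exact matching is the simplest choice; OCR misreads of a blocked word would slip through, so a real implementation would likely need fuzzy matching or patterns.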
