matt-m-o / yomininja

Open-source OCR and dictionary tool.

License: GNU General Public License v3.0

Python 1.44% TypeScript 97.67% C++ 0.70% JavaScript 0.08% Batchfile 0.05% Shell 0.03% NSIS 0.04%

dictonary ocr overlay language-learning languages

yomininja's Introduction

YomiNinja

YomiNinja is an application for extracting text from any type of visual content and is designed with language learners in mind.

Demonstration with 10ten

v0.3.x.demo.av1.mp4

Demonstration with Yomichan

merged_github.mp4

The extracted text overlays the original content, allowing for quick look-ups with pop-up dictionaries like 10ten, Yomitan and Inkah.
It minimizes distractions and simplifies the process of looking up unfamiliar words.
This is especially beneficial for language learners who study through videos or games.

YomiNinja is perfect for:

Language learners who study through games, videos, or any other visual content.
Anyone who values a distraction-free, efficient way to look up unfamiliar words.
Users looking for seamless text extraction and workflow improvement.

Check out this video by ganqqwerty to get started quickly and easily!

Dictionary Extensions

YomiNinja supports web browser dictionary extensions, enabling convenient word lookup without external applications.
Not all extensions can currently be installed, but 10ten and Yomitan have been tested successfully and are bundled as pre-installed options to simplify setup.

Installation

Windows

You need Windows 10 or 11 and VCRedist installed.
If you are using the N or KN edition of Windows 10 or 11, please be aware that you will also need to install the Media Feature Pack. This is necessary to ensure that all the required DLLs are installed.

Download and install the latest YomiNinja release.

Linux

YomiNinja currently offers support for distros using the X11 window system. Wayland is not supported due to its limitations with global shortcuts and window positioning.
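
As a rough way to tell which window system a session is using, the sketch below classifies the value of the `XDG_SESSION_TYPE` environment variable. The helper name `detectSessionType` is hypothetical and is not part of YomiNinja; the variable is usually set by the login manager, but it may be missing, in which case the result is "unknown".

```typescript
// Hypothetical helper, not part of YomiNinja.
type SessionType = "x11" | "wayland" | "unknown";

// Classifies a session from an environment map (for example process.env in Node.js).
// Assumption: XDG_SESSION_TYPE is set by the login manager; if it is absent or has
// another value (such as "tty"), the session is reported as "unknown".
function detectSessionType(env: Record<string, string | undefined>): SessionType {
  const raw = (env["XDG_SESSION_TYPE"] || "").toLowerCase();
  if (raw === "x11") return "x11";
  if (raw === "wayland") return "wayland";
  return "unknown";
}
```

In Node.js this could be called as `detectSessionType(process.env)`; a result of "wayland" would suggest switching to an X11 session before installing.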

Install xdotool.
Download the YomiNinja package corresponding to your distribution.
Install the package. For example, on Debian-based distributions:
```
sudo dpkg -i yomininja-e_x.x.x_amd64.deb
```

macOS

Download and install the latest YomiNinja release (.dmg file).

Notes:

The list of available languages for the Apple Vision OCR engine depends on your macOS version.
Manga OCR will be supported in version 0.8 and above.
Native support for ARM64 (Apple Silicon) is coming in version 0.8 and above.

Current features

Text extraction from the entire screen or specific window.
Built-in pop-up dictionaries.
Chrome Extensions (partial support).
OCR Templates (predefined text areas, optimizing OCR efficiency).
Auto OCR.
Text to speech.
WebSocket for Texthookers.

Planned Features

Text extraction from snip.
Anki integration.
History.
Text translation.
Support for more OCR engines.
Support for more TTS voices.

Supported Languages

English
Japanese
Chinese
Korean

Supported OCR Engines

Building

Clone the git repository

```
git clone https://github.com/matt-m-o/YomiNinja.git && cd YomiNinja
```

Build OCR services:

```
cd ./ocr_services/py_ocr_service
./gen_grpc_service.bat
./build.bat
cd ../..
./copy_py_ocr_service_build.bat
```

Based on your platform, download and extract the latest build of PPOCR-Inference-Service into the appropriate directory:
- Windows: ./bin/win32/ppocr
- Linux: ./bin/linux/ppocr
(optional) Download 10ten v1.15.1 for Chrome, and place the zip file into the following directory:
```
./yomininja-e/extensions/
```
Install node modules. Note: --force is used due to outdated react-furi peerDependencies, but it should function normally.
```
cd yomininja-e && npm install --force
```
Generate gRPC Protobuf types
```
npm run grpc-types
```
Build
```
npm run dist
```

Inspired by:

yomininja's Issues

Other Languages Support

Could you add support for German? Pretty please!

YomiNinja doesn't work on Linux Mint

Hi, I installed YomiNinja and its dependencies following the guide, but unfortunately it doesn't scan anything. I tried different OCR engines, images, and videos, and I changed the hotkeys, but it still doesn't work :c.

Here's the log after launching the program from the terminal:

{
  ROOT_DIR: '/opt/YomiNinja/resources/app.asar/',
  PAGES_DIR: '/opt/YomiNinja/resources/app.asar/renderer/out',
  BIN_DIR: '/opt/YomiNinja/bin',
  EXTENSIONS_DIR: '/home/emanuel/.config/YomiNinja/extensions',
  USER_DATA_DIR: '/home/emanuel/.config/YomiNinja'
}
{
  dicPath: '/opt/YomiNinja/resources/app.asar/node_modules/kuromoji/dict'
}
hook_thread_proc [101]: Could not set thread priority 49 for thread 0x7D621360B640!
Installing 10ten-ja-reader-1.15.1-chrome
Error: EACCES: permission denied, unlink '/opt/YomiNinja/resources/extensions/10ten-ja-reader-1.15.1-chrome.zip'
    at unlinkSync (node:fs:1808:3)
    at _unlinkSync (node:internal/fs/rimraf:214:14)
    at rimrafSync (node:internal/fs/rimraf:195:7)
    at Object.rmSync (node:fs:1276:10)
    at BrowserExtensionManager.installBuiltinExtensions (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extension_manager/browser_extension_manager.js:108:34)
    at async BrowserExtensionsService.handleBuiltinExtensions (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extensions.service.js:186:9)
    at async BrowserExtensionsController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extensions.controller.js:12:9)
    at async AppController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/app/app.controller.js:52:9)
    at async App.<anonymous> (/opt/YomiNinja/resources/app.asar/main/electron-src/index.js:15:5) {
  errno: -13,
  syscall: 'unlink',
  code: 'EACCES',
  path: '/opt/YomiNinja/resources/extensions/10ten-ja-reader-1.15.1-chrome.zip'
}
Installing Google-Translate
Error: EACCES: permission denied, unlink '/opt/YomiNinja/resources/extensions/Google-Translate.zip'
    at unlinkSync (node:fs:1808:3)
    at _unlinkSync (node:internal/fs/rimraf:214:14)
    at rimrafSync (node:internal/fs/rimraf:195:7)
    at Object.rmSync (node:fs:1276:10)
    at BrowserExtensionManager.installBuiltinExtensions (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extension_manager/browser_extension_manager.js:108:34)
    at async BrowserExtensionsService.handleBuiltinExtensions (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extensions.service.js:186:9)
    at async BrowserExtensionsController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extensions.controller.js:12:9)
    at async AppController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/app/app.controller.js:52:9)
    at async App.<anonymous> (/opt/YomiNinja/resources/app.asar/main/electron-src/index.js:15:5) {
  errno: -13,
  syscall: 'unlink',
  code: 'EACCES',
  path: '/opt/YomiNinja/resources/extensions/Google-Translate.zip'
}
Installing jpd-breader_13.0_yn
Error: EACCES: permission denied, unlink '/opt/YomiNinja/resources/extensions/jpd-breader_13.0_yn.zip'
    at unlinkSync (node:fs:1808:3)
    at _unlinkSync (node:internal/fs/rimraf:214:14)
    at rimrafSync (node:internal/fs/rimraf:195:7)
    at Object.rmSync (node:fs:1276:10)
    at BrowserExtensionManager.installBuiltinExtensions (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extension_manager/browser_extension_manager.js:108:34)
    at async BrowserExtensionsService.handleBuiltinExtensions (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extensions.service.js:186:9)
    at async BrowserExtensionsController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extensions.controller.js:12:9)
    at async AppController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/app/app.controller.js:52:9)
    at async App.<anonymous> (/opt/YomiNinja/resources/app.asar/main/electron-src/index.js:15:5) {
  errno: -13,
  syscall: 'unlink',
  code: 'EACCES',
  path: '/opt/YomiNinja/resources/extensions/jpd-breader_13.0_yn.zip'
}
Installing yomitan-chrome-24.2.12.0
Error: EACCES: permission denied, unlink '/opt/YomiNinja/resources/extensions/yomitan-chrome-24.2.12.0.zip'
    at unlinkSync (node:fs:1808:3)
    at _unlinkSync (node:internal/fs/rimraf:214:14)
    at rimrafSync (node:internal/fs/rimraf:195:7)
    at Object.rmSync (node:fs:1276:10)
    at BrowserExtensionManager.installBuiltinExtensions (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extension_manager/browser_extension_manager.js:108:34)
    at async BrowserExtensionsService.handleBuiltinExtensions (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extensions.service.js:186:9)
    at async BrowserExtensionsController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/extensions/browser_extensions.controller.js:12:9)
    at async AppController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/app/app.controller.js:52:9)
    at async App.<anonymous> (/opt/YomiNinja/resources/app.asar/main/electron-src/index.js:15:5) {
  errno: -13,
  syscall: 'unlink',
  code: 'EACCES',
  path: '/opt/YomiNinja/resources/extensions/yomitan-chrome-24.2.12.0.zip'
}
Error: Bad archive
    at FsRead.readUntilFoundCallback [as callback] (/opt/YomiNinja/resources/app.asar/node_modules/node-stream-zip/node_stream_zip.js:203:39)
    at FsRead.readCallback (/opt/YomiNinja/resources/app.asar/node_modules/node-stream-zip/node_stream_zip.js:996:25)
    at FSReqCallback.wrapper [as oncomplete] (node:fs:682:5)
stdout: Notice: this application is experimental!!! 


 App settings preset root: /home/emanuel/.config/YomiNinja/ppocr/presets/

 App settings preset: default
{
    "cls_thresh": 0.9,
    "cpu_threads": 8,
    "det_db_box_thresh": 0.6,
    "det_db_score_mode": "slow",
    "det_db_thresh": 0.3,
    "det_db_unclip_ratio": 1.6,
    "inference_backend": "Open_VINO",
    "initialize_all_language_presets": false,
    "language_code": "ja",
    "language_presets": {
        "ch": "chinese_v4",
        "en": "english_v4",
        "ja": "japanese_v4",
        "ko": "korean_v4"
    },
    "max_image_width": 1600,
    "name": "default",
    "port": 12345,
    "use_dilation": true
}
language_code: ch, preset_name: chinese_v4
language_code: en, preset_name: english_v4
language_code: ja, preset_name: japanese_v4
language_code: ko, preset_name: korean_v4

 App settings preset root: /home/emanuel/.config/YomiNinja/ppocr/presets/

 App settings preset: default
{
    "cls_thresh": 0.9,
    "cpu_threads": 8,
    "det_db_box_thresh": 0.6,
    "det_db_score_mode": "slow",
    "det_db_thresh": 0.3,
    "det_db_unclip_ratio": 1.6,
    "inference_backend": "Open_VINO",
    "initialize_all_language_presets": false,
    "language_code": "ja",
    "language_presets": {
        "ch": "chinese_v4",
        "en": "english_v4",
        "ja": "japanese_v4",
        "ko": "korean_v4"
    },
    "max_image_width": 1600,
    "name": "default",
    "port": 12345,
    "use_dilation": true
}
language_code: ch, preset_name: chinese_v4
language_code: en, preset_name: english_v4

stdout: language_code: ja, preset_name: japanese_v4
language_code: ko, preset_name: korean_v4

initializing wih address: 0.0.0.0:12345
stdout: [INFO-JSON]:{"server_address":"0.0.0.0:12345"}

Error: 14 UNAVAILABLE: read ECONNRESET
    at callErrorFromStatus (/opt/YomiNinja/resources/app.asar/node_modules/@grpc/grpc-js/build/src/call.js:31:19)
    at Object.onReceiveStatus (/opt/YomiNinja/resources/app.asar/node_modules/@grpc/grpc-js/build/src/client.js:192:76)
    at Object.onReceiveStatus (/opt/YomiNinja/resources/app.asar/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:360:141)
    at Object.onReceiveStatus (/opt/YomiNinja/resources/app.asar/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:323:181)
    at /opt/YomiNinja/resources/app.asar/node_modules/@grpc/grpc-js/build/src/resolving-call.js:99:78
    at process.processTicksAndRejections (node:internal/process/task_queues:77:11)
for call at
    at ServiceClientImpl.makeUnaryRequest (/opt/YomiNinja/resources/app.asar/node_modules/@grpc/grpc-js/build/src/client.js:160:32)
    at ServiceClientImpl.<anonymous> (/opt/YomiNinja/resources/app.asar/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19)
    at clientResponse (/opt/YomiNinja/resources/app.asar/main/electron-src/@core/infra/ocr/ppocr.adapter/ppocr.adapter.js:201:92)
    at new Promise (<anonymous>)
    at PpOcrAdapter.updateSettings (/opt/YomiNinja/resources/app.asar/main/electron-src/@core/infra/ocr/ppocr.adapter/ppocr.adapter.js:201:36)
    at async UpdateSettingsPresetUseCase.handleOcrAdapterSettingsUpdate (/opt/YomiNinja/resources/app.asar/main/electron-src/@core/application/use_cases/update_settings_preset/update_settings_preset.use_case.js:85:24)
    at async UpdateSettingsPresetUseCase.execute (/opt/YomiNinja/resources/app.asar/main/electron-src/@core/application/use_cases/update_settings_preset/update_settings_preset.use_case.js:31:33)
    at async initializeApp (/opt/YomiNinja/resources/app.asar/main/electron-src/@core/infra/app_initialization.js:77:13)
    at async AppController.init (/opt/YomiNinja/resources/app.asar/main/electron-src/app/app.controller.js:57:9)
    at async App.<anonymous> (/opt/YomiNinja/resources/app.asar/main/electron-src/index.js:15:5) {
  code: 14,
  details: 'read ECONNRESET',
  metadata: Metadata { internalRepr: Map(0) {}, options: {} }
}
retrying: PpOcrAdapter.updateSettings
stdout: Saving settings...
Settings file path: /home/emanuel/.config/YomiNinja/ppocr/presets/default.json
{
    "name": "default",
    "language_presets": {
        "ch": "chinese_v4",
        "en": "english_v4",
        "ja": "japanese_v4",
        "ko": "korean_v4"
    },
    "language_code": "ja",
    "initialize_all_language_presets": false,
    "inference_backend": "Open_VINO",
    "cpu_threads": 8,
    "port": 12345,
    "max_image_width": 1600,
    "det_db_thresh": 0.3,
    "det_db_box_thresh": 0.6,
    "det_db_unclip_ratio": 1.6,
    "det_db_score_mode": "slow",
    "use_dilation": true,
    "cls_thresh": 0.9
}

stdout: Saving settings...
Settings file path: /home/emanuel/.config/YomiNinja/ppocr/presets/default.json
{
    "name": "default",
    "language_presets": {
        "ch": "chinese_v4",
        "en": "english_v4",
        "ja": "japanese_v4",
        "ko": "korean_v4"
    },
    "language_code": "ja",
    "initialize_all_language_presets": false,
    "inference_backend": "Open_VINO",
    "cpu_threads": 8,
    "port": 12345,
    "max_image_width": 1600,
    "det_db_thresh": 0.3,
    "det_db_box_thresh": 0.6,
    "det_db_unclip_ratio": 1.6,
    "det_db_score_mode": "slow",
    "use_dilation": true,
    "cls_thresh": 0.9
}

Loading extension from /home/emanuel/.config/YomiNinja/extensions/Brian Birtles.10ten Japanese Reader (Rikaichamp)
(node:515056) ExtensionLoadWarning: Warnings loading extension at /home/emanuel/.config/YomiNinja/extensions/Brian Birtles.10ten Japanese Reader (Rikaichamp):
  Manifest version 2 is deprecated, and support will be removed in 2023. See https://developer.chrome.com/blog/mv2-transition/ for more details.
  Permission 'contextMenus' is unknown or URL pattern is malformed.

(Use `yomininja-e --trace-warnings ...` to show where the warning was created)
Loading extension from /home/emanuel/.config/YomiNinja/extensions/[email protected].__MSG_8969005060131950570__
(node:515056) ExtensionLoadWarning: Warnings loading extension at /home/emanuel/.config/YomiNinja/extensions/[email protected].__MSG_8969005060131950570__:
  Manifest version 2 is deprecated, and support will be removed in 2023. See https://developer.chrome.com/blog/mv2-transition/ for more details.
  Permission 'contextMenus' is unknown or URL pattern is malformed.

Loading extension from /home/emanuel/.config/YomiNinja/extensions/[email protected]
Loading extension from /home/emanuel/.config/YomiNinja/extensions/undefined.JPDBreader
(node:515056) ExtensionLoadWarning: Warnings loading extension at /home/emanuel/.config/YomiNinja/extensions/undefined.JPDBreader:
  Manifest version 2 is deprecated, and support will be removed in 2023. See https://developer.chrome.com/blog/mv2-transition/ for more details.
  Permission 'contextMenus' is unknown or URL pattern is malformed.

[515189:0612/181314.034480:ERROR:native_extension_bindings_system.cc(600)] Failed to create API on Chrome object.
[515189:0612/181314.037748:ERROR:native_extension_bindings_system.cc(600)] Failed to create API on Chrome object.
[515189:0612/181314.038182:ERROR:native_extension_bindings_system.cc(600)] Failed to create API on Chrome object.
[515189:0612/181314.038847:ERROR:native_extension_bindings_system.cc(600)] Failed to create API on Chrome object.
[515189:0612/181314.039136:ERROR:native_extension_bindings_system.cc(600)] Failed to create API on Chrome object.
Error occurred in handler for 'app:get_active_capture_source': Error: No handler registered for 'app:get_active_capture_source'
    at WebContents.<anonymous> (node:electron/js2c/browser_init:2:89692)
    at WebContents.emit (node:events:513:28)
AppController.handleOcrCommand
AppController.handleOcrCommand
AppController.handleOcrCommand
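
For readers skimming the log, the preset it prints can be described with a type like the one below. The field names and values are copied from the log; the names `PpocrPreset` and `checkPreset`, and the checks themselves, are hypothetical and only illustrate basic sanity checks, not anything YomiNinja is known to do.

```typescript
// Field names are copied from the preset printed in the log above.
// The interface name and the checks are hypothetical.
interface PpocrPreset {
  name: string;
  language_code: string;
  language_presets: Record<string, string>;
  initialize_all_language_presets: boolean;
  inference_backend: string;
  cpu_threads: number;
  port: number;
  max_image_width: number;
  det_db_thresh: number;
  det_db_box_thresh: number;
  det_db_unclip_ratio: number;
  det_db_score_mode: string;
  use_dilation: boolean;
  cls_thresh: number;
}

// Returns a list of problems; an empty list means these basic checks passed.
function checkPreset(p: PpocrPreset): string[] {
  const problems: string[] = [];
  if (!(p.language_code in p.language_presets)) {
    problems.push("language_code has no entry in language_presets");
  }
  if (Math.floor(p.port) !== p.port || p.port < 1 || p.port > 65535) {
    problems.push("port is not a valid TCP port");
  }
  if (Math.floor(p.cpu_threads) !== p.cpu_threads || p.cpu_threads < 1) {
    problems.push("cpu_threads must be a positive integer");
  }
  return problems;
}

// Values as they appear in the log.
const presetFromLog: PpocrPreset = {
  name: "default",
  language_code: "ja",
  language_presets: { ch: "chinese_v4", en: "english_v4", ja: "japanese_v4", ko: "korean_v4" },
  initialize_all_language_presets: false,
  inference_backend: "Open_VINO",
  cpu_threads: 8,
  port: 12345,
  max_image_width: 1600,
  det_db_thresh: 0.3,
  det_db_box_thresh: 0.6,
  det_db_unclip_ratio: 1.6,
  det_db_score_mode: "slow",
  use_dilation: true,
  cls_thresh: 0.9,
};

const presetProblems = checkPreset(presetFromLog);
```

The logged preset passes these checks, so the preset values alone do not obviously explain the failure.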

OCR not detecting anything

I installed it successfully (along with all the various versions of VCRedist), and there's a red border around the whole screen. When I press Alt + G (the hotkey I set; I don't remember whether that's the default), the cursor freezes for a bit, but the red border around the whole screen is unchanged and no Japanese text is selected. I've got Yomichan set up: when there's Japanese text in my clipboard, a little window pops up. I've tried various combinations of OCR engine settings, but nothing works. It doesn't detect English either when I set the language to English.

I'm on the latest version of Windows 11 with all the updates including the October 2023 feature update. manga_ocr works fine for me. I'm using a 4K screen and I tried HDR on and off too. RTX 4090 and Ryzen 7900X3D.

Installing Yomitan via Zip completely destroys YomiNinja Install.

I downloaded Yomitan via CRX Extractor/Downloader and attempted to install it into the latest version of YomiNinja. After doing so, the YomiNinja GUI became completely unresponsive. Closing the program and relaunching it does not solve the issue; it remains unresponsive.

I uninstalled YomiNinja, reinstalled it, and tried to install the extension again to rule out a fluke, but the result was exactly the same: the program becomes permanently unresponsive.

EDIT: I'm leaving this up just in case you do want to bring Yomitan into the application itself in the future, but I am a complete idiot and forgot you have provided another way to use Yomitan. I was so excited to see the browser extension installer that I temporarily forgot how to read.

Any updates on the mac support?

The readme says "in progress". Can you give any rough dates or what the current progress has been?

Magpie Support

Magpie is a tool used for up-scaling games. It's especially useful for games with low resolutions. Unfortunately, YomiNinja doesn't seem to work well with Magpie.

For a game up-scaled with Magpie, selecting Entire screen as the Capture source causes YomiNinja's red overlays to be displayed in the correct position, but hovering over them doesn't work correctly. See the image below:

I can interact with 爽やか if I move the cursor to the black circle drawn in the image above.

Selecting the game window as the Capture source causes YomiNinja's red overlays to be displayed in the wrong position, but hovering over 爽やか interacts with the correct overlay. (More or less: the mouse interaction can be slightly offset in this case as well.)

Would it be possible to make YomiNinja work well with Magpie?

Automatic Detection

It would be nice if text were detected automatically instead of requiring a shortcut every time; it would save a lot of time.
Also, not being able to see how the OCR is actually extracting the text is quite awkward.

WebSocket/Clipboard settings

The main reason many people prefer WebSockets over the clipboard for transferring text from one application to another is that they want to avoid polluting the clipboard. But AFAICT YomiNinja currently sends the text both to the clipboard and to a hard-coded WebSocket port, and does not let us choose which we prefer. An option to choose whether text is copied to the clipboard, sent to a specified WebSocket, or both would be quite handy. It would also be better if we could specify which WebSocket port is used for sending the text, because the port YomiNinja tries to use might already be in use by some other application.
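
The requested option could look roughly like the sketch below. Everything here is hypothetical: `OutputMode`, `TextOutputSettings`, `TextSinks`, and `dispatchText` are illustrative names, not YomiNinja's actual settings or API, and the two sink functions stand in for whatever clipboard and WebSocket code the application really uses.

```typescript
// Hypothetical settings shape; none of these names come from YomiNinja.
type OutputMode = "clipboard" | "websocket" | "both";

interface TextOutputSettings {
  mode: OutputMode;
  websocketPort: number; // chosen by the user instead of being hard-coded
}

// Stand-ins for the real clipboard and WebSocket implementations.
interface TextSinks {
  copyToClipboard: (text: string) => void;
  sendToWebSocket: (text: string, port: number) => void;
}

// Sends the OCR text only to the destinations the user selected.
function dispatchText(text: string, settings: TextOutputSettings, sinks: TextSinks): void {
  if (settings.mode === "clipboard" || settings.mode === "both") {
    sinks.copyToClipboard(text);
  }
  if (settings.mode === "websocket" || settings.mode === "both") {
    sinks.sendToWebSocket(text, settings.websocketPort);
  }
}
```

Keeping the sinks behind an interface means the selection logic can be tested without a real clipboard or socket.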

An option (and maybe a hotkey) to automatically copy the hovered text would be quite useful for using YomiNinja with external tools. There's already a Copy on hover option, but it copies the whole OCRed text, whereas what I need is to copy the text starting from the mouse position. For example, suppose the OCRed text is はい、カズマです。 If I hover over で, I want です。 to be automatically copied to the clipboard/WebSocket instead of the whole text. That way I can look up the word automatically through an external tool without having to select the part I want to look up again.
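
The behaviour described above amounts to taking the suffix of the OCRed text that starts at the hovered character. A minimal sketch, assuming the overlay can report the index of the hovered character (the function name `textFromOffset` is hypothetical):

```typescript
// Returns the text starting at the hovered character.
// `offset` is assumed to be the index of the hovered character, counted in
// Unicode code points so that characters stored as surrogate pairs are not split.
function textFromOffset(text: string, offset: number): string {
  const chars = Array.from(text);
  // Clamp the offset so out-of-range values do not throw.
  const start = Math.min(Math.max(Math.floor(offset), 0), chars.length);
  return chars.slice(start).join("");
}

// In はい、カズマです。 the character で is at index 6.
const hoveredSuffix = textFromOffset("はい、カズマです。", 6); // "です。"
```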

Copy text on click doesn't seem to work when Click-through mode is enabled, regardless of whether the React to clicks with click-through enabled option is on or off. Is this by design?

0.3.1 slow/delayed ocr processing

I wanted to open an issue here just in case: a few people are seeing delayed OCR results on the most recent update. After switching back to 0.2.1, everything seems fine.

New way to implement Google's OCR

Someone recently found a way to integrate Google Lens into their OCR program. Unlike the normal Google Vision implementation, Lens can be used without limits and without an API key. It seems to work very well for now, and it would be very nice if it could be implemented in YomiNinja. Here's the GitHub repository: https://github.com/AuroraWright/owocr

0.6.4 When creating Yomitan Anki cards, surrounding text lines are included as well

First off I'm very grateful for the native Yomitan support - it's a complete game changer for me!

There's a small issue where, when I create an Anki card through Yomitan, the Yomitan {sentence} parameter contains not only the target line of text but most of the surrounding lines of text as well (see images).

I'm guessing that YomiNinja passes along all lines of text to Yomitan rather than only the target line of text?
And I'm guessing that the reason that nothing before カイト's lines are included is that Yomitan itself splits sentences at punctuation (in this case the "?" in "ん......?").
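
To illustrate the guess above, here is a minimal sketch of punctuation-based sentence extraction. It is not Yomitan's actual algorithm; the function name `sentenceAround`, the default terminator set, and the `extraTerminators` parameter are all assumptions made for this example.

```typescript
// Hypothetical terminator set; a real dictionary extension may use a different one.
const SENTENCE_END = "。！？!?";

// Returns the sentence containing the character at `offset`, scanning outward
// until a terminator is found. Pass "\n" as `extraTerminators` to also treat
// line breaks as sentence boundaries.
function sentenceAround(text: string, offset: number, extraTerminators: string = ""): string {
  const ends = SENTENCE_END + extraTerminators;
  const isEnd = (i: number): boolean => ends.indexOf(text.charAt(i)) !== -1;

  let start = offset;
  while (start > 0 && !isEnd(start - 1)) start--;

  let end = offset;
  while (end < text.length && !isEnd(end)) end++;
  // Keep the closing run of terminators with the sentence.
  while (end < text.length && isEnd(end)) end++;

  return text.slice(start, end).trim();
}

const ocrLines = "ん......?\nなんだ\nこれは。";
// Without line breaks as boundaries, the unpunctuated line pulls in the next one.
const merged = sentenceAround(ocrLines, 9);        // "なんだ\nこれは。"
// Treating line breaks as boundaries keeps only the target line.
const single = sentenceAround(ocrLines, 9, "\n");  // "なんだ"
```

Under these assumptions, lines without closing punctuation run together, which would match the behaviour described in this issue.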

[Feature Request] Option to send OCR text through websocket instead of copying to clipboard

Windows clipboard is not designed to be both written to and read from by applications on the fly, so it's not equipped to handle the workload, resulting in delay when using the application with a texthooking page and a clipboard inserter, while if you use a websocket it's pretty much instant.

Some OCR mismatches

As you can see, 喉 is misrecognized: the OCR reads it as 候. I looked in PaddleOCR's japan_dict.txt and this kanji doesn't appear there. Could that be the source of the problem?
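
A check like the one described above can be automated. The sketch below assumes the label file lists one character per line; the function name `charsMissingFromLabels` is hypothetical, and the label contents in the example are made up for illustration rather than taken from the real file.

```typescript
// Returns the distinct characters of `text` that do not appear in the label file.
// Assumption: the label file contains one character per line.
function charsMissingFromLabels(text: string, labelFileContents: string): string[] {
  const known = new Set(labelFileContents.split(/\r?\n/));
  const missing: string[] = [];
  for (const ch of Array.from(text)) {
    if (!known.has(ch) && missing.indexOf(ch) === -1) {
      missing.push(ch);
    }
  }
  return missing;
}

// Illustrative label contents only, not the real dictionary.
const sampleLabels = "候\n日\n本";
const notCovered = charsMissingFromLabels("喉候", sampleLabels); // ["喉"]
```

A character that is absent from the label file cannot be produced by the recognizer, so it would be replaced by a visually similar character that is present.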

By the way, I tried the invert capture option, both inference runtimes and different precision settings, all throwing the same result.

Mining to Anki produces errors and extra text

I'm currently testing this tool, and it's amazing for old retro games that can't be text hooked. The only problem I'm running into is with mining. I'm using the built-in Yomitan and am getting errors and extra gibberish text in the sentence field.

The error shows up on the Yomitan pop up and says "chrome.tabs.captureVisibleTab is not a function".

Any idea if these are bugs or settings issues?

Overlay broken on Linux Mint 21.1 (with standard Cinnamon desktop)

YomiNinja looks like exactly what I need, but I'm having several breaking issues:

1. When hovering over overlay rectangles, the OCR'd text glitches, either showing the correct text or another to the right. Please check the provided video (the image glitching at the top is a bug with the casting app).
2. The overlay is misplaced when applied to a window. It's fine when full screen but that doesn't solve issue 1.
3. The external border of the overlay is missing the bottom and right lines, preventing resizing it when misplaced.
4. When launching YomiNinja the desktop panels (menu bar etc) are hidden.

I would like to sponsor fixes for these, or at least for the first one, which is the only one really preventing me from using your app.
If you're interested, let's discuss.

Screencast_00000.mp4

I tried changing various options and disabling extensions.

Wayland support

Hello

As it stands, this program essentially already supports Wayland. The problems are as follows.

When requesting permission to do something, that permission is not saved (For example if you ask for a screen when pressing alt-s, you must ask for that screen every single time you press the hotkey)

You cannot manually move the red window (This would be solved if you had an option to give it a title bar)

No global hotkeys (Not solvable)

If the user could optionally set the OCR to run on a timer and enable a window decoration, this would become usable on Wayland. Is this functionality desired as initial Wayland support?

Overlay area turns black when activating the ocr (if not launched as admin)

It looks odd that the whole area turns black but it's still useable.

Integrate Manga-OCR and Scanning with browser popup dictionary

It would be very nice to also have manga-ocr as it can be used with vertical text and manga, and it works better for pixelated retro text than PaddleOCR.

Something else that would be nice though not sure if possible, would be to be able to scan the text with browser popup dictionaries as if it were part of the browser, not with a different built in dictionary.

Really appreciate your work!

Mouse Disappears

Whenever I try to use the program on things like Persona 5 Royal or Star Ocean the Second, I hit Print Screen and my mouse cursor doesn't show. This makes it impossible to use YomiNinja. Do you know a fix for this? Thank you.

Application crashes whenever trying to copy the OCR text

I've been trying the app some more, and for some reason it now crashes whenever I try to copy the OCR'd text.
It works if I close my browser, but I was able to use it just fine a few days ago. I reinstalled it and even tried the new version, but the same thing happens. I only have 8 GB of RAM and an oldish CPU, but I'm able to play modern games for the most part, so I don't know what's wrong.

initializing wih address: 0.0.0.0:12345
stdout: Notice: this application is experimental!!!

Args usage: [app_settings_preset] [language_code] [server_port]
e.g.: default ja 12345


 App settings: default
  inference_backend: Open_VINO
  server_port: 12345

  cpu_threads: 4

language_code: ch, preset_name: chinese_v4
language_code: en, preset_name: english_v4
language_code: ja, preset_name: japanese_v4
language_code: ko, preset_name: korean_v4
[INFO-JSON]:{"server_address":"0.0.0.0:12345"}


processing recognition input
stdout:
  Models:
  det_model_dir: ./models/ch_PP-OCRv4_det_infer
  cls_model_dir: ./models/ch_ppocr_mobile_v2.0_cls_infer
  rec_model_dir: ./models/japan_PP-OCRv4_rec_infer
  rec_label_file: ./recognition_label_files/dict_japan.txt


stdout: [INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(218)::fastdeploy::OpenVINOBackend::InitFromPaddle
        number of streams:1.
[INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(228)::fastdeploy::OpenVINOBackend::InitFromPaddle     affinity:YES.
[INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(240)::fastdeploy::OpenVINOBackend::InitFromPaddle     Compile OpenVINO model on device_name:CPU.

stdout: [INFO] fastdeploy/runtime/runtime.cc(286)::fastdeploy::Runtime::CreateOpenVINOBackend   Runtime initialized with Backend::OPENVINO in Device::CPU.

stdout: [INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(218)::fastdeploy::OpenVINOBackend::InitFromPaddle
        number of streams:1.
[INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(228)::fastdeploy::OpenVINOBackend::InitFromPaddle     affinity:YES.
[INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(240)::fastdeploy::OpenVINOBackend::InitFromPaddle     Compile OpenVINO model on device_name:CPU.

stdout: [INFO] fastdeploy/runtime/runtime.cc(286)::fastdeploy::Runtime::CreateOpenVINOBackend   Runtime initialized with Backend::OPENVINO in Device::CPU.

stdout: [INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(218)::fastdeploy::OpenVINOBackend::InitFromPaddle
        number of streams:1.
[INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(228)::fastdeploy::OpenVINOBackend::InitFromPaddle     affinity:YES.
[INFO] fastdeploy/runtime/backends/openvino/ov_backend.cc(240)::fastdeploy::OpenVINOBackend::InitFromPaddle     Compile OpenVINO model on device_name:CPU.

stdout: [INFO] fastdeploy/runtime/runtime.cc(286)::fastdeploy::Runtime::CreateOpenVINOBackend   Runtime initialized with Backend::OPENVINO in Device::CPU.

controller.recognize: 2.111s

"warning unknown display value" when trying to OCR

I just set this up to work with JPDBreader, and any time I try to OCR my game, which runs in windowed mode, it throws the error above. This is with manual OCR, and it happens regardless of the OCR engine I'm using.

I wasn't able to find logs in the files, but let me know what info you need to fix this. I just started using YomiNinja and am eager to put it to use.

Update: The new update seems to have fixed it. I will reopen this if the issue shows up again.

OCR mistakes

First, let me say thank you for the amazing and immensely helpful tool! I just used it for a ~20h playthrough and I'm convinced it has saved me several hours of manual dictionary lookups. With that said, the OCR is not perfect and it makes a lot of relatively simple mistakes. Not sure how much control you have over the OCR engine, but in case there are some knobs to turn, take a look at these. I tried changing the OCR engine but it didn't seem to help much.

Some improvements and bugs

Some improvements:

1. Option to clear the overlay by clicking outside the textbox instead of having to press Alt + V.
2. Option to ignore non-Japanese characters when the OCR language is set to Japanese.
3. Better support for vertical text.
- If you try it on manga, it gets really messy with all the textboxes.
4. Let the user choose the drive where YomiNinja is installed.
- This is necessary for people with limited storage space on the main drive, especially if you add more OCR engines in the future, which will take up more space.
5. Be able to remove the border around the screen and around textboxes.
- If I set it to 0, the border is still shown.
- Maybe adding a keybind to show and hide the borders would not be a bad idea.
6. Option to put the program in the system tray when you minimize it.
- This is not REALLY necessary; it's more for aesthetics.
7. Option to toggle Show/Hide the overlay.
- I don't know if Alt + C already does this; if it does, it's not working for me.
- This can be related to improvement number 5 (but here the text would also be hidden).
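
The suggestion above about ignoring non-Japanese characters could be approximated by filtering the OCR output by Unicode range. This is only a sketch: the name `keepJapanese` and the choice of ranges (hiragana, katakana, common kanji, CJK punctuation, and the full-width ！ and ？) are assumptions, and a real implementation might need a broader set.

```typescript
// Hiragana, katakana, CJK unified ideographs, CJK symbols and punctuation,
// plus full-width exclamation and question marks. Kanji outside the Basic
// Multilingual Plane are not covered by this pattern and would be dropped.
const JAPANESE_CHAR = /[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF\u3000-\u303F\uFF01\uFF1F]/;

// Removes every character that does not match the ranges above.
function keepJapanese(text: string): string {
  return Array.from(text)
    .filter((ch) => JAPANESE_CHAR.test(ch))
    .join("");
}

const cleaned = keepJapanese("HP 100 はい、カズマです。OK"); // "はい、カズマです。"
```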

Some bugs:
Just for context, I'm using windows 11.

The option to copy text on click does not work.
- instead, it loses focus from the overlay when I click on the textbox
The default keybind to copy text (C) does not work on some browsers.
- I tested with Chrome and Edge, and it does not work, but with Opera, it works.
- Instead, I need to press Alt + C to copy the text (which conflicts with the default keybind to show the overlay).
- I can just change the keybinds to other keys, but I'm reporting this so you can know and maybe change the defaults to something else.

I want to thank you for the great work you're doing here. I really think that YomiNinja can be a breakthrough for learning Japanese with games and other media, especially when we have more OCR engines and automatic detection. But the program is already really good the way it is. Good job, dude!

Maintain focus on the previous window.

Currently, when you initiate OCR, the overlay becomes the focused window. For some games this is an issue because you lose control of the game unless you tab back in. If possible, having it not steal focus would be beneficial.

JPDBreader support

JPDBreader is a browser extension that parses web pages through JPDB and provides furigana+a popup dictionary. The biggest benefit of using it is that you can connect your JPDB account to it and quickly add words to your flashcard deck, plus easily visualize which words you do/don't already know with highlighting. (you can also customize things like not showing furigana for known words, etc.)
This extension makes using JPDB as a learning aid very seamless (similar to connecting Yomichan+Anki) so supporting it or something similar in YomiNinja would be great :)

Issues with capturing some windows

Probably easiest to explain with a video...

YomiNinja_2024-01-11_15-01-59.mp4

Currently I'm having issues with デレステ. While most other DMM games don't seem to have this issue, the PC version of iM@S Song for Prism does.

It seems that once you select the window, the captured image stops updating, which leads it to OCR the same text repeatedly. Also, with デレステ, pressing PRTSC with the window open causes the text to advance (Alt+S works fine).

Of course, if I want to use these applications right now, I can just capture my whole desktop. (Also note that the DMM game launcher is region-locked! Sorry, ahaha.)

YomiNinja doesn't start

YomiNinja crashes during launch without showing any messages. I don't see any logs, but I'm pretty sure the culprit is Paddle_OCR, since crash dumps like "ppocr_infer_service_grpc.exe.4560.dmp" are created immediately after each launch attempt. I also tried running ppocr-inference-service in isolation and it doesn't work either, showing a 0xc0000142 error during launch.
My question is:
"I've heard that Paddle is bad for Japanese anyway, and I'm much more interested in using the Google Lens functionality. Is it possible to launch YomiNinja without Paddle_OCR at all?"

PrintScreen button is overridden by default Windows screenshot tool

It's mentioned multiple times that pressing the PrintScreen button is the fastest way to OCR. However, by default in Windows 11, the PrintScreen button opens the screenshot tool. Do I need to remove that default shortcut somewhere in Windows settings?

Cloud Vision API - No Text Recognized Please Try Again

Hello, I am using the latest release. I created an API key with Google Cloud Vision enabled and entered it in the settings, but when I try to use Cloud Vision I get the error: "No Text Recognized - Please Try Again"

For comparison, the Google Lens method works flawlessly, so the issue is not with the screen I am trying to OCR.

I am also using another application (Luna Translator) with the same API key, and the Google Vision method works there, so I don't think it's an API issue either.

On another note, the setting for this has API Key and Client Email fields. I am not sure what to put in Client Email, because in my previous experience with other apps that support Google Cloud Vision (the aforementioned Luna Translator and Ztranslate) I only needed to enter the API key.

Can you help with this?
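
To check whether an API key works independently of YomiNinja, a plain API-key request against the public Cloud Vision REST endpoint can be built as in the sketch below. The `images:annotate` endpoint, the `key` query parameter, and the JSON field names follow Google's public REST API. The function and interface names are hypothetical, and nothing here is meant to describe how YomiNinja itself authenticates or calls the service.

```typescript
// Hypothetical helper types; only the JSON field names mirror the public REST API.
interface AnnotateRequest {
  url: string;
  body: {
    requests: {
      image: { content: string };
      features: { type: string }[];
      imageContext: { languageHints: string[] };
    }[];
  };
}

interface AnnotateResponse {
  responses?: {
    fullTextAnnotation?: { text?: string };
    error?: { message?: string };
  }[];
}

// Build the URL and JSON body for a text-detection request authenticated by API key.
// imageBase64 is the screenshot encoded as base64 (without a "data:" prefix).
function buildAnnotateRequest(
  apiKey: string,
  imageBase64: string,
  languageHint: string
): AnnotateRequest {
  return {
    url:
      "https://vision.googleapis.com/v1/images:annotate?key=" +
      encodeURIComponent(apiKey),
    body: {
      requests: [
        {
          image: { content: imageBase64 },
          features: [{ type: "TEXT_DETECTION" }],
          imageContext: { languageHints: [languageHint] },
        },
      ],
    },
  };
}

// Return the recognized text, or "" when the service found none.
// An error reported by the service is thrown instead of being treated as "no text".
function extractText(response: AnnotateResponse): string {
  const first = response.responses && response.responses[0];
  if (!first) {
    return "";
  }
  if (first.error) {
    throw new Error("Cloud Vision error: " + (first.error.message || "unknown"));
  }
  return (first.fullTextAnnotation && first.fullTextAnnotation.text) || "";
}
```

The returned `url` and `body` can be sent with any HTTP client as a POST with a JSON body. Throwing on a service-side error, rather than returning an empty string, is a deliberate choice in this sketch: it keeps a credential or quota problem from looking the same as a screenshot that contains no text.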

Right click>Inspect can't close

Hello,

I was playing The Great Ace Attorney Chronicles and somehow managed to right click and this popped up.

I clicked on Inspect and then this popped up:

I was not able to close this screen and had to Alt+Tab to YomiNinja and press Alt+F4 in order to close it. I'm not sure if this is intended; I just wanted to let you know.

Better vertical text recognition

I've played around with this tool a bit and I've noticed it has a lot of potential. I've tried games, and for the most part it worked flawlessly, even OUTSIDE of games: because it's an OCR tool, literally anything with text can be mined, which I think is what gives this tool so much potential. However, I've noticed that vertical text gives me a weird problem. The OCR seems to get confused when it comes to vertical text and it's kind of buggy, so I think that should be fixed. But thank you for the amazing tool, you're doing an amazing job!

Russian Dictionaries don't work well for some reason

For some reason, the two dictionaries below just won't work.

RU-JP Kenrowa.zip

RU-JP Warodai.zip

In this screenshot I have both of them turned on, but there are no definitions for 読む. If I go and use the app, they will show definitions, but only for some words.

If I switch to any other dictionary it works fine. There is no such issue on Yomichan desktop and this problem persists across devices.

Please help