Error: Illegal min or max specification! "Fatal error encountered!" == NULL:Error:Assert failed:in file globaloc.cpp, line 75 (tesseract, closed)

tesseract-ocr commented on April 28, 2024
Error: Illegal min or max specification! "Fatal error encountered!" == NULL:Error:Assert failed:in file globaloc.cpp, line 75

from tesseract.

Comments (9)

zdenop commented on April 28, 2024

Did it happen with a Google traineddata file or with custom training?

oelleo commented on April 28, 2024

It happened with custom training

zdenop commented on April 28, 2024

Try setting LC_NUMERIC to C during training.
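
A minimal sketch of one way to do this from a script, assuming the training tools are launched as child processes. The helper name `c_numeric_env` is hypothetical; the only fixed points are the `LC_NUMERIC` variable itself and the fact that `LC_ALL`, when set, overrides every `LC_*` category.

```python
import os


def c_numeric_env(base=None):
    """Return a copy of the environment with numeric formatting forced to the C locale."""
    env = dict(os.environ if base is None else base)
    # LC_ALL takes precedence over every LC_* category, so it is dropped here;
    # otherwise the LC_NUMERIC setting below would be ignored.
    env.pop("LC_ALL", None)
    env["LC_NUMERIC"] = "C"
    return env
```

The result can be passed as `env=` to `subprocess.run` for each training step, for example `subprocess.run(training_command, env=c_numeric_env(), check=True)`, where `training_command` stands for whatever command line your training uses. Note that dropping `LC_ALL` also lets the other categories fall back to `LANG` or their own `LC_*` variables.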

oelleo commented on April 28, 2024

Hello,
I found that Tesseract had a patch for this problem (https://code.google.com/p/tesseract-ocr/issues/detail?id=910).
Why is this not in the new version, Tesseract 3.04?
Will it be in the next version?
Thanks

oelleo commented on April 28, 2024

Btw, the custom training I use is not mine, so I cannot run it again with LC_NUMERIC=C.

zdenop commented on April 28, 2024

Why do you think this patch is not in the current version? Issue 910, which you are referring to, was a problem with the official Google traineddata file. That was fixed.
AFAIR, the problem here is in the custom training.

oelleo commented on April 28, 2024

Ok my bad.
But I just tried with the eng.traineddata from the official Google traineddata files and I got the same error:
"Error: Illegal min or max specification!
"Fatal error encountered!" == NULL:Error:Assert failed:in file globaloc.cpp, line 75"

tfmorris commented on April 28, 2024

I'm having a hard time seeing how this could go wrong due to locale with the current code. The actual error is signaled here: https://github.com/tesseract-ocr/tesseract/blob/master/classify/clusttool.cpp#L89, which happens when it is unhappy with the results that tfscanf gets for the feature parameters. tfscanf is a private, locale-independent version of fscanf. It calls, in turn, the private tvfscanf, which implements its own parsing of floats with a hard-coded decimal separator of '.'.
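
To illustrate the point about the hard-coded separator, here is a small Python sketch. It is not Tesseract's actual code: the function name `parse_min_max` and the two-number input format are made up for the example, and the error text is borrowed from this issue only for illustration. Python's `float()` is likewise locale-independent and accepts only '.', so it behaves like the parser described above.

```python
def parse_min_max(text):
    """Parse two whitespace-separated floats; only '.' is accepted as the decimal separator."""
    fields = text.split()
    if len(fields) != 2:
        raise ValueError("Illegal min or max specification!")
    try:
        # float() ignores the process locale, so "0,25" is rejected everywhere.
        low, high = (float(field) for field in fields)
    except ValueError:
        raise ValueError("Illegal min or max specification!") from None
    return low, high
```

With this sketch, `parse_min_max("0.25 0.75")` succeeds while `parse_min_max("0,25 0,75")` raises. That is consistent with the idea that the reader's locale does not matter, but data written with a comma as the decimal separator would be unreadable.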

One thing that definitely could cause it though is a bad/corrupted feature parameter file.

I just tested with the stock tesseract 3.03 on a brand new Debian 8 installation with the locale set to fr_FR.UTF-8 and everything worked perfectly.

If you still can't get this to work, please post the output of the following commands:

uname -a
tesseract -v
locale

zdenop commented on April 28, 2024

@oelleo: Unfortunately, Tesseract currently requires training data to use a dot as the decimal separator, so you need to correct your custom training data.

I think this may be possible without retraining. Try unpacking your data (combine_tessdata -u eng.traineddata tmp/eng.) and fixing the decimal separator in eng.normproto (replace eng with the name of your custom training).
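
A hedged sketch of the "fix decimal separator" step in Python. It assumes the unpacked file is plain text in which numbers are separated by whitespace and a comma between two digits can only be a decimal separator; if your file uses commas for anything else, this would corrupt it. The function names are hypothetical, and the script keeps a .bak copy so the change can be undone.

```python
import re

# Assumption: a comma directly between two digits is a decimal separator.
_DECIMAL_COMMA = re.compile(r"(?<=\d),(?=\d)")


def fix_decimal_separators(text):
    """Replace decimal commas with dots, leaving all other text unchanged."""
    return _DECIMAL_COMMA.sub(".", text)


def fix_file(path):
    """Rewrite a text file in place, keeping a .bak copy of the original."""
    # newline="" preserves the file's existing line endings.
    with open(path, encoding="utf-8", newline="") as handle:
        original = handle.read()
    with open(path + ".bak", "w", encoding="utf-8", newline="") as handle:
        handle.write(original)
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(fix_decimal_separators(original))
```

For the paths used above this would be called as `fix_file("tmp/eng.normproto")`. It is worth diffing the result against the .bak copy before using the file.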
