ocropus-archive / dup-ocropy


Python-based tools for document analysis and OCR

License: Apache License 2.0

Python 28.38% HTML 0.04% Shell 0.27% Jupyter Notebook 71.27% Dockerfile 0.03%

dup-ocropy's People

tedyhabtegebrial bygreencn doubaokun ceubex chagge bx5974 winnetou wollmers adnanulhasan stevenlol zxytim amoliu tsivkyn kirkhadley ddohler yiiwood kaishengyao kostyll icecream4u djj88 zelladoor wqren sauravbiswasiupr rbjork vincent-ucas eric013 danvk donsunsoft overstable splade abhigarg cstollw pengming273 kushal124 sherjilozair antimatter15 xuanhan863 stamhe fangzheng354 fanfannothing ak9527lq wangdongfrank chrisrammy cgenie timwee riordan vanl kaynewest inndy abhilash-potharaju yanweifu fireae kuronekodaisuki zengqiang2006 tajmorton nonva hughp aphilippi shuk nagyistoce rd-wixproducts stweil commonssibi mhr xshhhm cdsj pgrens vrqin nicodjimenez qulogic zjucsxxd agrawal-mohit zuphilip yodebu wanghong-yang azridev spideryan gotomypc uikit0 ashokpant darkseed markismus mikepatrickryan liu4lin mnjstwins a-hilaly wavelets kba supersom ginking liulei2776 jimitit matrixplayer wikicarlos lesliekim llp1992 pythonpunters lunactic duum kalyanp

dup-ocropy's Issues

FloatingPointError while training

I followed the "run-rtrain" document to train the RNN. I got the training set with "tar -zxf tests/uw3-500.tgz". I didn't change anything, neither the RNN structure nor the parameters. I ran into the following problem while training: FloatingPointError: overflow encountered in exp. However, the program does not stop. It keeps iterating and prints the same FloatingPointError message again and again.
I reduced the learning rate to half of the default value, but the same problem occurs. Am I training the network the wrong way?

pylab.uint32 error (ubuntu 14.04)

Hi. I installed the dependencies with pip on master and ran the tests (on Ubuntu 14.04.2 LTS), but I get ImportError: cannot import name uint32 (full stack trace at the bottom). I also tried installing the exact versions from requirements_1.txt, with no luck.

I checked out v1.0, removed the pip libraries, installed the dependencies with aptitude, and still get the same error:

Traceback (most recent call last):
  File "/usr/local/bin/ocropus-nlbin", line 9, in <module>
    import ocrolib
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/__init__.py", line 12, in <module>
    from common import *
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 16, in <module>
    import ligatures
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/ligatures.py", line 8, in <module>
    from pylab import uint32
ImportError: cannot import name uint32

What do you think is wrong?

ValueError: shape mismatch in `ocropus-gpageseg`

When I run ocropus-gpageseg on the following image:

I get a ValueError which prevents any lines from being extracted:

$ ocropus-gpageseg -n --minscale 10 --maxcolseps 0 book-703662b.crop/0001.bin.png

########## /usr/local/bin/ocropus-gpageseg -n --minscale 10 --maxcolsep

book-703662b.crop/0001.bin.png
scale 13.8564064606
computing segmentation
computing column separators
computing lines
propagating labels
spreading labels
number of lines 4
finding reading order
writing lines
1: (slice(3L, 32L, None), slice(3L, 607L, None))
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-gpageseg", line 435, in safe_process1
    process1(job)
  File "/usr/local/bin/ocropus-gpageseg", line 408, in process1
    binline = psegutils.extract_masked(1-cleaned,l,pad=args.pad,expand=args.expand)
  File "/usr/local/lib/python2.7/site-packages/ocrolib/toplevel.py", line 213, in argument_checks
    result = f(*args,**kw)
  File "/usr/local/lib/python2.7/site-packages/ocrolib/psegutils.py", line 114, in extract_masked
    line = where(mask,line,amax(line))
ValueError: shape mismatch: objects cannot be broadcast to a single shape

Avoid text recognition in image areas of the page

In several tests ocropus/ocropy tries to recognize some text within an image area in my pages. How can this be avoided? Here is an example of such a page: normal-beispiel-mit-bild

I run the same sequence of commands as in run-test, and ocropus/ocropy recognizes two columns: one with the actual text and the other with some nonsensical symbols from the picture.

Installing on Unix

Hello,

Trying to install on mac OS X Yosemite (10.10.1), using anaconda. Have installed packages in ./PACKAGES manually (curl python-scipy python-matplotlib python-tables firefox imagemagick python-opencv python-bs4) mostly using brew. Following commands threw no warnings:
$ wget -nd http://www.tmbdev.net/en-default.pyrnn.gz
$ mv en-default.pyrnn.gz models/
$ sudo python setup.py install

Test throws following error:
kungfujams-mbp:ocropy-master kungfujam$ ./run-test
clang: warning: -O4 is equivalent to -O3
ld: library not found for -lgomp
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Traceback (most recent call last):
  File "/Users/kungfujam/anaconda/bin/ocropus-nlbin", line 9, in <module>
    import ocrolib
  File "/Users/kungfujam/anaconda/lib/python2.7/site-packages/ocrolib/__init__.py", line 12, in <module>
    from common import *
  File "/Users/kungfujam/anaconda/lib/python2.7/site-packages/ocrolib/common.py", line 18, in <module>
    import lstm
  File "/Users/kungfujam/anaconda/lib/python2.7/site-packages/ocrolib/lstm.py", line 32, in <module>
    import nutils
  File "/Users/kungfujam/anaconda/lib/python2.7/site-packages/ocrolib/nutils.py", line 25, in <module>
    lstm_native = compile_and_load(lstm_utils)
  File "/Users/kungfujam/anaconda/lib/python2.7/site-packages/ocrolib/native.py", line 67, in compile_and_load
    path = compile_and_find(c_string,**keys)
  File "/Users/kungfujam/anaconda/lib/python2.7/site-packages/ocrolib/native.py", line 63, in compile_and_find
    raise CompileError()
ocrolib.native.CompileError

This may be a gcc/clang issue as seen here: http://stackoverflow.com/questions/20321988/error-enabling-openmp-ld-library-not-found-for-lgomp-and-clang-errors

Any advice gratefully received.

James

ocropus-gpageseg --> TypeError in error message for scale

This is my problem when trying to ocr with my pngs

book/0001.bin.png SKIPPED too many connnected components for a page image (21112 > 1176) (use -n to disable this check)

./ocropus-gpageseg -n book/0001.bin.png
INFO:
INFO: book/0001.bin.png
INFO: scale 4.472135955
Traceback (most recent call last):
File "./ocropus-gpageseg", line 423, in safe_process1
process1(job)
File "./ocropus-gpageseg", line 373, in process1
print_error("%s: scale (%g) less than --minscale; skipping\n"%(fname,str(scale)))
TypeError: float argument required, not str
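
The TypeError in this traceback comes from passing str(scale) to a %g placeholder, which expects a number. A minimal sketch of the corrected formatting, assuming the message text itself stays unchanged (the helper name format_scale_error is made up for illustration and is not part of ocropus-gpageseg):

```python
def format_scale_error(fname, scale):
    """Build the "scale less than --minscale" message.

    %g requires a numeric argument, so scale is passed as a float
    instead of being converted with str() first.
    """
    return "%s: scale (%g) less than --minscale; skipping\n" % (fname, float(scale))
```

With the values from the log above, the call format_scale_error("book/0001.bin.png", 4.472135955) yields a message containing "scale (4.47214)".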

README virtualenv instructions

Hi,

I think that the README instructions should read:

source ocropus_venv/bin/activate

instead of

source ocropus_venv/bin/source

Using Character Probabilities as Confidence Estimates

Is there a way that I could use the character probabilities output by the LSTM network (and shown in the --show diagrams of rpred) to estimate confidence for a given transcription? It's unclear how I would actually access those values.
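
One possible way to turn per-frame probabilities into a line-level confidence is sketched below. It is not an ocropy API: it assumes you have already obtained the network output as a list of frames, each a list of class probabilities, and that class index 0 is the blank / no-character class (both are assumptions). The function name line_confidence is hypothetical.

```python
def line_confidence(probs, blank=0):
    """Estimate confidence from per-frame class probabilities.

    probs: list of frames; each frame is a list of class probabilities.
    Frames whose most likely class is `blank` are ignored.
    Returns (mean, minimum) of the winning probability over the
    remaining frames, or (None, None) if every frame is blank.
    """
    best = []
    for frame in probs:
        idx = max(range(len(frame)), key=lambda i: frame[i])
        if idx != blank:
            best.append(frame[idx])
    if not best:
        return None, None
    return sum(best) / len(best), min(best)
```

The mean gives a rough overall score, while the minimum points at the weakest character in the line; which of the two is more useful depends on the application.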

Documentation

Is there any sort of documentation on using the software and an overview of how it works for a new user? Perhaps there is already a work in progress? I would be willing to help write some basic documentation if that is of interest.

module pylab not contained

I just want to inform you that there may be some packages missing for the installation. I was following the installation instructions. When I came to the point where to run the test, I got the following error:

~/ocropy$ ./run-test
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-nlbin", line 3, in <module>
    from pylab import *

Running

sudo apt-get install python-numpy python-scipy python-matplotlib

afterwards resolved the problem for me and I could successfully run the test.

Installing Ocropus in Mac Yosemite

Hi,
I want to install Ocropus on a Mac. I've followed the guidelines from http://www.danvk.org/2015/01/09/extracting-text-from-an-image-using-ocropus.html and from the Ocropy repository. I managed to do this part:
$ brew install python
$ brew install opencv
$ brew install homebrew/python/scipy
and also this part:
$ cd /usr/local/Cellar/python/2.7.6_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
$ rm cv.py cv2.so
$ ln -s /usr/local/Cellar/opencv/2.4.9/lib/python2.7/site-packages/cv.py cv.py
$ ln -s /usr/local/Cellar/opencv/2.4.9/lib/python2.7/site-packages/cv2.so cv2.so
But I'm not sure about the instructions provided in the Ocropy GitHub README. Which section should I use, system-wide or Python Virtual Environment? I'm on Mac OS X Yosemite.

To install OCRopus dependencies system-wide:

$ sudo apt-get install $(cat PACKAGES)
$ wget -nd http://www.tmbdev.net/en-default.pyrnn.gz
$ mv en-default.pyrnn.gz models/
$ sudo python setup.py install

Alternatively, dependencies can be installed into a Python Virtual Environment:

$ virtualenv ocropus_venv/
$ source ocropus_venv/bin/activate
$ pip install -r requirements_1.txt

tables has some dependencies which must be installed first:

$ pip install -r requirements_2.txt
$ wget -nd http://www.tmbdev.net/en-default.pyrnn.gz
$ mv en-default.pyrnn.gz models/

Could someone give a step-by-step guide to follow? I'm a bit lost. thanks!!!

ocropus-rpred crashes when printing unicode text without locale

When ocropus-rpred runs without a locale (e.g. when started via subprocess.Popen) and recognizes non-ASCII characters, it tries to print them directly to the command line, which causes a UnicodeEncodeError, and the line is then skipped. The offending line is 197 in ocropus-rpred:

   print fname,":",pred

As similar calls appear to occur in a wide range of ocropus utilities it would be sensible to wrap them all in the correct encode() statements.
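
A minimal sketch of such a wrapper, assuming UTF-8 is an acceptable fallback when the output stream reports no encoding. The helper name encode_for_stream is hypothetical and does not exist in ocrolib:

```python
import sys


def encode_for_stream(text, stream=None, fallback="utf-8"):
    """Encode text for a stream whose encoding may be unset.

    When a process is started without a locale, the stream's
    `encoding` attribute can be None; in that case `fallback` is used.
    Characters the target encoding cannot represent are replaced
    rather than raising UnicodeEncodeError.
    """
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or fallback
    return text.encode(encoding, "replace")
```

The returned bytes can then be written to the underlying byte stream, so a single unencodable character no longer causes the whole line to be skipped.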

Metadata about detected characters: quality scores + alternatives

The ocropus-rpred tool outputs text files of predicted text for each image. It would be nice if there were a way for it to output quality scores for each character, as well as alternatives.

For example, this line:

is being transcribed as:
2. 14E St. Lrand Loncourse, n.w. cor.

It's possible that G is the second most-likely candidate for the first letter in Lrand and C for Loncourse. If I were to build some kind of language model as a post-processing step, it would be clear that G and C are the better choices at those positions.

Some kind of JSON output would be helpful. It might look something like:

[
  {
    "x": 216,
    "char": "L",
    "candidates": [
      {
        "char": "L",
        "score": 0.9
      },
      {
        "char": "G",
        "score": 0.8
      },
      ...
    ]
  },
  ...
]
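
A rough sketch of how such output could be assembled, assuming a probability vector per recognized character position and a list mapping class indices to characters are available. Both inputs, and the function names top_candidates and candidates_to_json, are assumptions for illustration; ocropus-rpred does not provide them in this form.

```python
import json


def top_candidates(frame_probs, charset, k=2):
    """Return the k most likely characters for one position, best first."""
    ranked = sorted(zip(charset, frame_probs), key=lambda cp: cp[1], reverse=True)
    return [{"char": c, "score": p} for c, p in ranked[:k]]


def candidates_to_json(positions, charset, k=2):
    """positions: list of (x, frame_probs) pairs, one per recognized character."""
    out = []
    for x, probs in positions:
        cands = top_candidates(probs, charset, k)
        out.append({"x": x, "char": cands[0]["char"], "candidates": cands})
    return json.dumps(out, indent=2)
```

A language model run as a post-processing step could then rescore the alternatives at each position instead of only seeing the single best character.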

can't download en-default.pyrnn.gz

$ wget -nd http://www.tmbdev.net/en-default.pyrnn.gz
--2015-01-09 20:06:04--  http://www.tmbdev.net/en-default.pyrnn.gz
Resolving www.tmbdev.net (www.tmbdev.net)... 69.163.203.33
Connecting to www.tmbdev.net (www.tmbdev.net)|69.163.203.33|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2015-01-09 20:06:05 ERROR 404: Not Found.

Pickle fails with EOFError while loading models on Windows

Here's the traceback (I edited the path to \ocropy):

D:\ocropy>python ocropus-rpred -Q 2 -m D:\ocropy\models\fraktur.pyrnn.gz  T:\0001.bin.png

########## ocropus-rpred -Q 2 -m D:\ocropy\models\fr

#inputs 1
# loading object D:\ocropy\models\fraktur.pyrnn.gz
Traceback (most recent call last):
  File "ocropus-rpred", line 103, in <module>
    network = ocrolib.load_object(args.model,verbose=1)
  File "D:\ocropy\ocrolib\common.py", line 513, in load_object
    return unpickler.load()
EOFError

0001.bin.png is:

fraktur.pyrnn.gz is freshly downloaded from http://www.tmbdev.net/fraktur.pyrnn.gz. I am running Win7 x64 and Python 2.7.6 x64.

Also, why do the instructions first suggest downloading en-default.pyrnn.gz and then running with fraktur.pyrnn.gz?

Python-based ocropy faster than c++ version

Hello,
When training a text-line recognizer, I found that the Python-based ocropy is even faster than the C++ version (on Red Hat 4.4.7, using ocropus-ltrain). This is very strange, since CLSTM is said to be faster than ocropy. Has anyone seen the same thing? What is the reason behind it?
Best,
Thanks!

GPageSeg

Having installed Ocropy, I get a CheckError trying to run the example given in the Readme.

book/0001.bin.png
Traceback (most recent call last):
  File "./ocropus-gpageseg", line 414, in safe_process1
    process1(job)
  File "./ocropus-gpageseg", line 356, in process1
    scale = psegutils.estimate_scale(binary)
  File "[...]/ocropy-master/ocrolib/psegutils.py", line 41, in estimate_scale
    objects = binary_objects(binary)
  File "[...]/ocropy-master/ocrolib/psegutils.py", line 37, in binary_objects
    objects = morph.find_objects(labels)
  File "[...]/ocropy-master/ocrolib/toplevel.py", line 209, in argument_checks
    raise e
CheckError:
CheckError for argument image of function <function find_objects at 0x10e37d9b0>
<ndarray-7f87ebc09080 (5753, 4304) uint64 [0,11315]> of type <type 'numpy.ndarray'>: array must contain integer values

ocropus-linegen does not handle unicodes well

This version of ocropus-linegen does not handle languages other than English. For example, if you use it to generate French text lines, you will see boxes in place of some accented letters. Beware!

plan for supporting CUDA or OpenCL?

Hello,
I'm currently training a big model, and learning seems rather slow without a GPU.
Do you have any plans to support CUDA or OpenCL?
Thanks!

hOCR per word basis?

Hi,

I really like your tool; its recognition seems to be better than Tesseract's in some cases. Tesseract, however, has a more detailed hOCR output:

Each word gets wrapped in a span with class ocrx_word and has a bbox and x_wconf property.

The bbox property for each word lets users write their own layout detection, while x_wconf allows omitting words that were probably not recognized correctly.

Is this also possible with ocropy or is this planned?

Thank you.
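
For reference, the word-level markup described in this issue could be generated along these lines. This is only a sketch of the hOCR span format (class ocrx_word with bbox and x_wconf in the title attribute); the helper name hocr_word is hypothetical, and the word boxes and confidences would have to come from the recognizer, which is exactly what is being asked for here.

```python
from html import escape


def hocr_word(text, bbox, wconf):
    """Wrap one word in an hOCR ocrx_word span.

    bbox: (x0, y0, x1, y1) in page pixel coordinates.
    wconf: word confidence as an integer percentage.
    """
    x0, y0, x1, y1 = bbox
    title = "bbox %d %d %d %d; x_wconf %d" % (x0, y0, x1, y1, wconf)
    return "<span class='ocrx_word' title='%s'>%s</span>" % (title, escape(text))
```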

Line detection with different font sizes

The header line (title) of a document is often set in a larger font than the normal text. I have found that ocropus sometimes cuts a line in a larger font into two lines (which are then recognized as nonsense). If the header font is not too much larger (twice the size seems okay), the lines are split correctly. But the problem occurs if the header font is 3 times the size of the normal font (36pt and 12pt). E.g. ocropus-gpageseg of 0002 bin

where the headline is split up into three lines:

i.e.

Can the parameters of ocropus-gpageseg avoid such behaviour? Or can line detection be tweaked in general?

multi-language documents

Hello,
Can ocropy handle multi-language text in the same document (image)?

differences in Softmax implementation with clstm (separate-derivs branch)

How do W2 and DW2 in ocropy's Python Softmax implementation relate to W and w in the C++ implementation in clstm?

Other Languages

Is there support for non-latin languages like Chinese, Japanese or Thai?

Understanding ocropy installation

Hello everyone,

I'm working on a project and need to use ocropy. I tried to install it on Windows but failed, so I moved to Ubuntu. I'm not very experienced with Ubuntu, so I'm stuck now.

I have installed Python 2.7 on Ubuntu and everything in requirements 1 and 2, and I've also installed opencv.

Then I tried to install ocropy as described in the README, but it failed at this line:
mv en-default.pyrnn.gz models/

I don't really understand it: the previous line downloads a .gz file, then we move it to a models directory (which has not been created yet!), and then we need to run setup.py, which is not there. So I don't know if I'm missing something. I know I might sound ignorant to some of you, but I'm really new to this and I'm doing my best to understand. I also didn't find any helpful information on the net regarding my issue.

Any help is appreciated, Thank you in advance.

Applying patterns to line recognition

I have a large photo to OCR in which each line follows the same pattern, which can be expressed with the regular expression (\d+ [A-Z]+). Many lines are recognized with errors because the documents are old and unclear, so I am wondering whether I can make the recognizer read the first part of each line as digits only, which might help. (I know I could clean up the output afterwards, but that does not seem like a good choice.) I am not familiar with LSTMs. Do you know which file I should look at?
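
Constraining the recognizer itself would need changes inside the LSTM decoding, which is not shown here. As a lighter-weight stopgap, a sketch of checking recognized lines against the pattern from this issue, so that only the non-matching lines need manual attention, might look like this (the function name split_by_pattern is hypothetical):

```python
import re

# Pattern taken from the issue text: digits, a space, then capital letters.
LINE_PATTERN = re.compile(r"\d+ [A-Z]+")


def split_by_pattern(lines, pattern=LINE_PATTERN):
    """Separate recognized lines into those that match the expected
    pattern completely and those that do not."""
    ok, suspect = [], []
    for line in lines:
        if pattern.fullmatch(line.strip()):
            ok.append(line)
        else:
            suspect.append(line)
    return ok, suspect
```

This is a post-check rather than constrained recognition, so it only narrows down where the errors are.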

Error "array must contain integer values" in ocropus-gpageseg with uint64

Hello,

I'm trying to run the test script on OS X 10.10 with Python 2.7. An exception is thrown at the second step, when running ocropus-gpageseg.

MacBook-Pro-2:ocropy pujia$ PATH=$PATH:. ./run-test 
# tests/testpage.png
=== tests/testpage.png 1
estimating skew angle
estimating thresholds
rescaling
tests/testpage.png lo-hi (0.39 1.44) angle  0.1  no-normalization
writing

########## ./ocropus-gpageseg temp/????.bin.png

temp/0001.bin.png
Traceback (most recent call last):
  File "./ocropus-gpageseg", line 414, in safe_process1
    process1(job)
  File "./ocropus-gpageseg", line 356, in process1
    scale = psegutils.estimate_scale(binary)
  File "/Users/pujia/git-workspace/ocropy/ocrolib/psegutils.py", line 41, in estimate_scale
    objects = binary_objects(binary)
  File "/Users/pujia/git-workspace/ocropy/ocrolib/psegutils.py", line 37, in binary_objects
    objects = morph.find_objects(labels)
  File "/Users/pujia/git-workspace/ocropy/ocrolib/toplevel.py", line 209, in argument_checks
    raise e
CheckError: 
CheckError for argument image of function <function find_objects at 0x103972398>
<ndarray-7fbead2b6420 (3000, 2078) uint64 [0,5187]> of type <type 'numpy.ndarray'>: array must contain integer values

########## ./ocropus-rpred -n temp/????/??????.bin.png

Traceback (most recent call last):
  File "./ocropus-rpred", line 92, in <module>
    inputs = ocrolib.glob_all(args.files)
  File "/Users/pujia/git-workspace/ocropy/ocrolib/toplevel.py", line 213, in argument_checks
    result = f(*args,**kw)
  File "/Users/pujia/git-workspace/ocropy/ocrolib/common.py", line 654, in glob_all
    raise FileNotFound("%s: expansion did not yield any files"%arg)
ocrolib.common.FileNotFound: file not found temp/????/??????.bin.png: expansion did not yield any files

I just want to check whether this issue is known before I start debugging from the bottom up. Thanks.
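
The CheckError reports a uint64 label array, which suggests (this is an assumption, not something the traceback confirms) that the argument check only accepts signed integer dtypes. A possible workaround sketch is to cast the label image before it reaches the checked function; the helper name as_signed_labels is hypothetical and where exactly to apply it in ocrolib is left open.

```python
import numpy as np


def as_signed_labels(labels):
    """Return the label image with a signed integer dtype.

    Unsigned arrays (such as the uint64 in the CheckError) are cast to
    int64; label values are small counts, so the values are unchanged.
    Arrays with any other dtype are returned as they are.
    """
    labels = np.asarray(labels)
    if np.issubdtype(labels.dtype, np.unsignedinteger):
        return labels.astype(np.int64)
    return labels
```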

Please provide en-default.pyrnn.gz and fraktur.pyrnn.gz models

http://www.tmbdev.net is not accessible.

It would be great if the en-default.pyrnn.gz and fraktur.pyrnn.gz models could be bundled with the code.

Sorry, I accidentally closed the issue "FloatingPointError while training"

My environment:

python 2.7.6

Ubuntu 14.04.1 LTS

gcc version 4.8.4

how to get the confidence of predicted output ?

Hello,
How can I get the confidence of the predicted output?
Any advice is welcome. Thanks in advance!
Best regards

Unable to install on windows... wtf is 'source'?

The virtualenv/ stuff seems to install, but wtf is 'source'? as in:

source ocropus_venv/bin/activate

Can't do it! Can't find it - try finding something called source on google. Can't be done. wtf, over....

ocropus-gpageseg - Segmentation

I'm from Thailand, and I use OCRopus to recognize text in my language. It works!
But I have a problem with segmentation: Thai script has upper and lower characters (not on the middle line),
and your segmentation cuts off part of the upper Thai characters. I tried adjusting the argument values,
but it still does not work. Can you tell me how to add height above the upper line, or point me to the part of the code that I can edit?

the upper character is cut

Question about line segmenting

(Examples taken from this pdf - https://www.dropbox.com/s/6sy77shnro7sqdf/6.pdf?dl=0)

I have a bunch of files from which I've extracted the text in both a line format and a coherent blob format, and I'm trying to understand what the best practices are for using ocropus-linegen.

An example in the document given is lines 5-8 (reproduced below):

流動資産は、たな卸資産が減少したものの、受取手形及び売掛金などが増加したことなどにより、 前連結会計年度末に比べ5億65百万円増加し、630億33百万円となりました。固定資産は、有形固 定資産、無形固定資産ともに減価償却により減少したものの、投資有価証券の評価差額が増加した ことにより、前連結会計年度末に比べ5億8百万円増加し、212億86百万円となりました。

Here, I could feed that whole blob to ocropus-linegen or I could feed it line by line:

流動資産は、たな卸資産が減少したものの、受取手形及び売掛金などが増加したことなどにより、
...
ことにより、前連結会計年度末に比べ5億8百万円増加し、212億86百万円となりました

I get the sense that the latter is what it expects. Is that right?

For another example, see the table further down on that page. The second row is:

自己資本比率    28.1    ...    23.4

Does ocropus-linegen want the full line (row), the full line with the spacing, or would it rather have each cell individually?

Thanks.

Could you recommend some materials about the algorithm you use?

Hi! I find this project very interesting and I want to learn from it.
Could you recommend some materials (papers or books) that you referred to in this project?
Thank you very much

Where can I find source code for previous Ocropus?

http://code.google.com/p/ocropus is not available any more. I am reading a few papers related to the old Ocropus and want to take a look at the code.

Error "expected a segmentation image" in ocropus-gpageseg with uint64

OS: OS X El Capitan.

 ./ocropus-gpageseg 'book/0001.bin.png'
INFO:  
INFO:  ########## ./ocropus-gpageseg book/0001.bin.png
INFO:  
INFO:  book/0001.bin.png
INFO:  scale 41.701318924
INFO:  computing segmentation
INFO:  computing column separators
INFO:  computing lines
INFO:  propagating labels
Traceback (most recent call last):
  File "./ocropus-gpageseg", line 423, in safe_process1
    process1(job)
  File "./ocropus-gpageseg", line 379, in process1
    segmentation = compute_segmentation(binary,scale)
  File "./ocropus-gpageseg", line 320, in compute_segmentation
    llabels = morph.propagate_labels(boxmap,seeds,conflict=0)
  File "/Users/lihanli/projects/ocropy/ocrolib/toplevel.py", line 209, in argument_checks
    raise e
CheckError: 
CheckError for argument labels of function <function propagate_labels at 0x1095f5578>
<ndarray-7ff76272e160 (5753, 4304) uint64 [0,168]> of type <type 'numpy.ndarray'>: expected a segmentation image

What is the "ALN" line?

When I run ocropus-rtrain, I see lines like this:

1000 56.70 (726, 48) 704213b-crop-01000d.png
   TRU: u'Eugene L. Armbruster Collection.'
   ALN: u'Eugene L. Armbbbruster Collection.'
   OUT: u're S. rrter eleoton.'

TRU and OUT are pretty clear. But what is ALN? It's usually better than OUT, especially when I first start training the model (as in the example above). Is ALN based on the predictions, or does it somehow incorporate truth data? Could I use it instead of the output of ocropus-rpred (which matches OUT)?

(Let me know if you'd prefer that I post these sorts of questions on the mailing list—I generally find content much easier to find in GitHub than mailing list archives)

ValueError: setting an array element with a sequence.

On some images, rpred and rtrain's dewarp function throws an error because the padding is insufficient. This seems to be particularly true of images which have borders or other noise near the top and bottom edges. Attached is an example.

I have a simple, but unsatisfying, fix for this where I double the amount of padding. See my github fork commit (branch bugfix) at:
braddockcg@be2e6d1

The full error is:

Traceback (most recent call last):
File "/usr/local/bin/ocropus-rpred", line 245, in safe_process1
return process1(arg)
File "/usr/local/bin/ocropus-rpred", line 145, in process1
line = lnorm.normalize(line,cval=amax(line))
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lineest.py", line 59, in normalize
dewarped = self.dewarp(img,cval=cval,dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lineest.py", line 56, in dewarp
dewarped = array(dewarped,dtype=dtype).T
ValueError: setting an array element with a sequence.

clstm.py not found

When running ocropus-ltrain, 'clstm module not found' pops up. There is a setup.py file in the clstm directory; when I run it, the following complaint pops up:
file clstm.py (for module clstm) not found

Where can I find the clstm.py file?

Error in ocropus-econf with some k-options

ocropus-econf fails when the comparison is restricted to letters or digits only, e.g.

$ ocropus-econf -k digits output/*/*.gt.txt
Traceback (most recent call last):
  File "/usr/local/bin/ocropus-econf", line 59, in <module>
    outputs = sorted(list(outputs))
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 560, in parallel_map
    result = fun(e)
  File "/usr/local/bin/ocropus-econf", line 50, in process1
    err,cs = edist.xlevenshtein(txt,gt,context=args.context)
  File "/usr/local/lib/python2.7/dist-packages/ocrolib/edist.py", line 43, in xlevenshtein
    cost = current[n]
UnboundLocalError: local variable 'current' referenced before assignment
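
An UnboundLocalError of this kind typically appears when a loop that assigns the variable never runs, for example if filtering with -k digits leaves an empty string (an assumption; the traceback does not show the inputs). The sketch below is a plain Levenshtein distance that initializes its row before the loop and therefore handles empty input. It is not ocrolib's xlevenshtein, which also takes a context argument.

```python
def levenshtein(a, b):
    """Edit distance between sequences a and b.

    `previous` is initialized before the loop, so the result is
    defined even when `a` is empty and the loop body never runs.
    """
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(previous[j] + 1,                 # deletion
                               current[j - 1] + 1,              # insertion
                               previous[j - 1] + (ca != cb)))   # substitution
        previous = current
    return previous[-1]
```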

Possible bug in ocropus-rtrain

During training, an exception is raised if a 'floating point error' occurs. The script (ocropus-rtrain) then loads the previous model using load_lstm(last_save) (line 289 in ocropus-rtrain).
Now, if the floating point error is raised during the first mini-batch, no model is found, because last_save is still None. The result is:

Traceback (most recent call last):
File "../ocropy/ocropus-rtrain", line 313, in <module>
network = load_lstm(last_save)
File "../ocropy/ocropus-rtrain", line 191, in load_lstm
network = ocrolib.load_object(last_save)
File "/export/home/adnan/ocropy/ocrolib/common.py", line 503, in load_object
fname = ocropus_find_file(fname)
File "/export/home/adnan/ocropy/ocrolib/common.py", line 682, in ocropus_find_file
if os.path.exists(fname):
File "/usr/lib/python2.7/genericpath.py", line 18, in exists
os.stat(path)
TypeError: coercing to Unicode: need string or buffer, NoneType found

In this case, there should be a way to restart the training.
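
One way to handle this case, sketched under the assumption that the script can construct a fresh network again: guard the reload on last_save being set. The names recover_network, load_model and make_model are placeholders for illustration, not functions in ocropus-rtrain.

```python
def recover_network(last_save, load_model, make_model):
    """Return a network to continue training with after a numeric error.

    last_save:  path of the most recent saved model, or None if nothing
                has been saved yet.
    load_model: callable that loads a model from a path.
    make_model: callable that builds a freshly initialized model.
    """
    if last_save is None:
        # No checkpoint exists yet, so restart from a new model
        # instead of trying to load None.
        return make_model()
    return load_model(last_save)
```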

Install guide for fedora

Hello,

not sure this is a valid issue, so please just close it, if not.

But could someone add a description how to install everything on fedora?

I tried it and they're obviously not using exactly the same package names,
since I couldn't install the requirements with yum/dnf or even pip.

Or even just a confirmation that someone got it running on fedora would be nice.

Possible bugs?

Hello,
When I run ocropus-ltrain, it occasionally warns "FloatingPointError: overflow encountered in exp", and the program seems to restart from the most recent saved state. The problem occurs mainly in the "ffunc" function in lstm.py, which is defined as 1.0/(1.0+exp(-x)). The same problem also occurs in the "sigmoid" function. I think it may be caused by large values in x. In the CLSTM source code, x is clipped to 20 for positive values and -20 for negative values. With this clipping, the program runs without the warning.
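
The clipping described here can be sketched for a single value as follows. The limit of 20 is the one quoted in this report; the function name clipped_sigmoid is made up, and the array version in lstm.py would clip with the array library's equivalent instead.

```python
import math


def clipped_sigmoid(x, limit=20.0):
    """Logistic function with the input clamped to [-limit, limit].

    Clamping keeps exp(-x) within range, so very negative inputs no
    longer overflow; at |x| = 20 the result is already within about
    2e-9 of 0 or 1.
    """
    x = max(-limit, min(limit, x))
    return 1.0 / (1.0 + math.exp(-x))
```

Without the clamp, math.exp(1000.0) raises OverflowError, which is the scalar counterpart of the FloatingPointError reported above.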

Another problem is that the "backward" method in class "Parallel" returns None. This is correct for a 1-layer BLSTM system, but in a multi-layer configuration that stacks parallel BLSTMs on top of one another it leads to an error, since the deltas of the subsequent layer are assigned from the current deltas. So maybe the method should return deltas.

Best,

A maxheight option for lines

I'm running ocropus-nlbin and ocropus-gpageseg on this image:

The first line that comes out of ocropus-gpageseg is this:

i.e. two lines joined into one. Naturally this produces nonsensical output when I run it through ocropus-rpred. Admittedly this is a hard case (there's literally one pixel separating the two lines in that image), but it might be nice to have some kind of maxlineheight option I could pass to ocropus-gpageseg to give it a hint that this is wrong.

Running ocropus-gpageseg --scale 11 --minscale 11 does split the lines, but I'm reluctant to explicitly set --scale for all my images. There are at least two fonts present in the collection, one of which is taller than the other. So it's hard to say the x-height in advance. But I can safely say that if any image of a line is >50 px, then something went wrong.

I'm not sure if such an option would make sense or if there's a better solution, but I wanted to toss the idea out there!
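
Until such an option exists, a check after segmentation could at least flag the suspicious lines. A small sketch, assuming the heights of the extracted line images have already been read into a dict (how they are read is left out), with the 50 px threshold mentioned above and a hypothetical function name:

```python
def flag_tall_lines(line_heights, max_height=50):
    """Return the names of line images taller than max_height pixels.

    line_heights: dict mapping line image name -> height in pixels.
    """
    return sorted(name for name, height in line_heights.items()
                  if height > max_height)
```

The flagged pages could then be re-run through ocropus-gpageseg with an explicit --scale, while the rest keep the automatic estimate.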

Commands:

ocropus-nlbin -n 734090b.crop.png -o book
ocropus-gpageseg -n [--scale 11 --minscale 11] --maxcolseps 0 book/????.bin.png

Thanks for the great OCR library!

error while training

After executing (on 156 files of groundtruth text and imagery):
ocropus-rtrain gt/????/*.png -F 10000 -o mub_combined &
I get the following reproducible error:

454 150.32 (1486, 48) gt/0001/01000b.bin.png
TRU: u'quod dicitur Fulda, quod est situm in pago Grapfeld, constructum in honore sancti'
ALN: u'quuod dicituur Fuulda, qquod et situumm in pagoo Grapfeld, construuctuuumm in honnore '
OUT: u' iiii ii te ti imm tm e iii eutmut m mi eii '

oops, got FloatingPointError overflow encountered in exp

Traceback (most recent call last):
File "/usr/local/bin/ocropus-rtrain", line 228, in <module>
pcs = network.trainSequence(line,cs,update=do_update,key=fname)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 863, in trainSequence
self.outputs = array(self.lstm.forward(xs))
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 587, in forward
xs = net.forward(xs)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 636, in forward
outputs = [net.forward(xs) for net in self.nets]
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 545, in forward
self.WIP,self.WFP,self.WOP)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 419, in forward_py
go[t] = ffunc(gox[t])
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 367, in ffunc
return 1.0/(1.0+exp(-x))
FloatingPointError: overflow encountered in exp
Traceback (most recent call last):
File "/usr/local/bin/ocropus-rtrain", line 232, in <module>
network = ocrolib.load_object(last_save)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 502, in load_object
fname = ocropus_find_file(fname)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 680, in ocropus_find_file
if os.path.exists(fname):
File "/usr/lib/python2.7/genericpath.py", line 18, in exists
os.stat(path)
TypeError: coercing to Unicode: need string or buffer, NoneType found

another case with half of the files (dir 0001 only):

960 110.63 (1490, 48) gt/0001/010022.bin.png
TRU: u'in honorem\u2074 domini salvatoris Jesu Christi et beate Marie genetricis\u2075 eius episco-'
ALN: u'in honorem~ domini salvatoris Jesu Christi et beate MMarie genetricis eius episco-'
OUT: u'iu bouoreu ouiui salvatoris lesu bristi et beate arie geuetricis eius episoo-'

oops, got FloatingPointError overflow encountered in exp

Traceback (most recent call last):
File "/usr/local/bin/ocropus-rtrain", line 228, in <module>
pcs = network.trainSequence(line,cs,update=do_update,key=fname)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 863, in trainSequence
self.outputs = array(self.lstm.forward(xs))
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 587, in forward
xs = net.forward(xs)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 636, in forward
outputs = [net.forward(xs) for net in self.nets]
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 619, in forward
return self.net.forward(xs[::-1])[::-1]
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 545, in forward
self.WIP,self.WFP,self.WOP)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 419, in forward_py
go[t] = ffunc(gox[t])
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 367, in ffunc
return 1.0/(1.0+exp(-x))
FloatingPointError: overflow encountered in exp
Traceback (most recent call last):
File "/usr/local/bin/ocropus-rtrain", line 232, in <module>
network = ocrolib.load_object(last_save)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 502, in load_object
fname = ocropus_find_file(fname)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/common.py", line 680, in ocropus_find_file
if os.path.exists(fname):
File "/usr/lib/python2.7/genericpath.py", line 18, in exists
os.stat(path)
TypeError: coercing to Unicode: need string or buffer, NoneType found
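
The overflow in the tracebacks above comes from ffunc evaluating exp(-x) for a large negative x. A minimal sketch of a sigmoid that avoids it is shown here, assuming scalar inputs and only the standard library. stable_sigmoid is a hypothetical name, not part of ocrolib; ocrolib applies ffunc to NumPy arrays, where the same branching would have to be done element-wise.

```python
import math


def stable_sigmoid(x):
    """Logistic function that never calls exp() on a large positive value."""
    if x >= 0:
        # exp(-x) is at most 1 here, so it cannot overflow.
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, rewrite 1/(1+exp(-x)) as exp(x)/(1+exp(x));
    # exp(x) is below 1 and at worst underflows to 0.0.
    z = math.exp(x)
    return z / (1.0 + z)


if __name__ == "__main__":
    for value in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
        print(value, stable_sigmoid(value))
```

This only removes the overflow itself. If the network weights have already diverged, the gates will still saturate at 0 or 1, so a lower learning rate may be needed as well.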

doc

Is there any documentation for this?

"ocropus-gtedit" in ./run-test does not exist

Could you please fix the issue on line 9 of ./run-test?

That line calls a script, "ocropus-gtedit", which does not exist in the repo.

test case failed

After installing everything, I ran the tests and got an error in ocropus-hocr related to matplotlib:

 File "./ocropus-hocr", line 13, in <module>
  from pylab import *
File "/usr/local/lib/python2.7/dist-packages/pylab.py", line 1, in <module>
  from matplotlib.pylab import *
File "/usr/local/lib/python2.7/dist-packages/matplotlib/pylab.py", line 274, in <module>
  from matplotlib.pyplot import *
File "/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 109, in <module>
  _backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/__init__.py", line 32, in pylab_setup
  globals(),locals(),[backend_name],0)
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_gtk.py", line 36, in <module>
  from matplotlib.backends.backend_gdk import RendererGDK, FigureCanvasGDK
File "/usr/local/lib/python2.7/dist-packages/matplotlib/backends/backend_gdk.py", line 33, in <module>
  from matplotlib.backends._backend_gdk import pixbuf_get_pixels_array
ImportError: No module named _backend_gdk

Please have a look.
My matplotlib version is 1.4.2; which version should I use?
I am using Python 2.7 on Ubuntu 14.04.

Thank you.
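
The ImportError above is raised while matplotlib loads its GTK backend (backend_gtk imports _backend_gdk). One possible workaround is to select the non-interactive Agg backend before pylab is imported. This sketch assumes the OCRopus scripts only need to write files and never open a window. select_headless_backend is a hypothetical helper name; matplotlib.use and matplotlib.get_backend are existing matplotlib functions.

```python
def select_headless_backend(name="Agg"):
    """Switch matplotlib to a non-GUI backend.

    Returns the active backend name, or None if matplotlib is not installed.
    """
    try:
        import matplotlib
    except ImportError:
        return None
    # Call this before the first `from pylab import *` or pyplot import:
    # older matplotlib releases fix the backend at that point.
    matplotlib.use(name)
    return matplotlib.get_backend()


if __name__ == "__main__":
    print(select_headless_backend())
```

The same choice can be made without touching any script by setting `backend: Agg` in the matplotlibrc file.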

TypeError: dot() takes no keyword arguments

When I run ./run-test and ./run-rtrain, I get the error below. Please check and help me:

TypeError: dot() takes no keyword arguments
Traceback (most recent call last):
File "/usr/local/bin/ocropus-rpred", line 245, in safe_process1
return process1(arg)
File "/usr/local/bin/ocropus-rpred", line 150, in process1
pred = network.predictString(line)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 934, in predictString
cs = self.predictSequence(xs)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 884, in predictSequence
self.outputs = array(self.lstm.forward(xs))
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 605, in forward
xs = net.forward(xs)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 661, in forward
outputs = [net.forward(xs) for net in self.nets]
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 559, in forward
self.WIP,self.WFP,self.WOP)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 414, in forward_py
dot(WGI,source[t],out=gix[t])

Running ./run-rtrain gives the following error message:

$ ./run-rtrain

tar -zxf tests/uw3-500.tgz
ocropus-rtrain 'book//.bin.png' -d 5 -o uw3-500-model
inputs 500

tests None

CenterNormalizer

using default codec

charset size 157 [ ~!"#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz[]^_`abcdefghijklmnopqrstuvwxyz{|}隆垄拢搂漏芦庐掳露禄驴妹?
妹妹妹妹妹妹妹犆⒚っγ┟疵睹访姑幻济颗排糕犫♀⑩ｂ光衡猹猥猞]
last_trial 0
Traceback (most recent call last):
File "/usr/local/bin/ocropus-rtrain", line 285, in <module>
pcs = network.trainSequence(line,cs,update=do_update,key=fname)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 890, in trainSequence
self.outputs = array(self.lstm.forward(xs))
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 605, in forward
xs = net.forward(xs)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 661, in forward
outputs = [net.forward(xs) for net in self.nets]
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 559, in forward
self.WIP,self.WFP,self.WOP)
File "/usr/local/lib/python2.7/dist-packages/ocrolib/lstm.py", line 414, in forward_py
dot(WGI,source[t],out=gix[t])
TypeError: dot() takes no keyword arguments
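
The traceback shows that the installed NumPy rejects the out= keyword that lstm.py passes to dot(), so upgrading NumPy is the likely fix. As an illustration only, here is a sketch of a fallback wrapper, assuming the missing keyword is the only problem. dot_into and legacy_dot are hypothetical helpers, not part of ocrolib or NumPy; the dot function is passed in as an argument so the sketch runs without NumPy installed.

```python
def dot_into(dot, a, b, out):
    """Compute dot(a, b) and store the result in the preallocated buffer `out`."""
    try:
        # A dot() that supports `out=` writes straight into the buffer.
        return dot(a, b, out=out)
    except TypeError:
        # A dot() without `out=`: compute first, then copy into the buffer.
        # Caveat: this also hides unrelated TypeErrors raised inside dot().
        out[:] = dot(a, b)
        return out


def legacy_dot(a, b):
    """Stand-in for an old dot(): matrix-vector product, no `out=` keyword."""
    return [sum(x * y for x, y in zip(row, b)) for row in a]


if __name__ == "__main__":
    buf = [0.0, 0.0]
    dot_into(legacy_dot, [[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0], buf)
    print(buf)  # [3.0, 7.0]
```

With NumPy, the failing call from the traceback would become dot_into(numpy.dot, WGI, source[t], gix[t]).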

Cannot run the tests on OS X Yosemite

Problem

I am following the installation guide inside a virtualenv and managed to install requirements 1 and 2 successfully. The next step is to run the tests, but I cannot execute them. At first, simply running ./run-test failed with something like zsh: Command ocropus-nlbin cannot be found, so I changed the script as follows:

#!/bin/zsh -e

rm -rf temp
./ocropus-nlbin tests/testpage.png -o temp
./ocropus-gpageseg 'temp/????.bin.png'
./ocropus-rpred -n 'temp/????/??????.bin.png'
./ocropus-hocr 'temp/????.bin.png' -o temp.html
./ocropus-visualize-results temp
./ocropus-gtedit html temp/????/??????.bin.png -o temp-correction.html

echo "to see recognition results, type: firefox temp.html"
echo "to see correction page, type: firefox temp-correction.html"
echo "to see details on the recognition process, type: firefox temp/index.html"

After that the commands were found, but ./run-test failed with the following error:

clang: warning: -O4 is equivalent to -O3
ld: library not found for -lgomp
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Traceback (most recent call last):
  File "./ocropus-nlbin", line 9, in <module>
    import ocrolib
  File "/Users/nyxz/dev/workspace/python/ocropy/ocrolib/__init__.py", line 12, in <module>
    from common import *
  File "/Users/nyxz/dev/workspace/python/ocropy/ocrolib/common.py", line 18, in <module>
    import lstm
  File "/Users/nyxz/dev/workspace/python/ocropy/ocrolib/lstm.py", line 32, in <module>
    import nutils
  File "/Users/nyxz/dev/workspace/python/ocropy/ocrolib/nutils.py", line 25, in <module>
    lstm_native = compile_and_load(lstm_utils)
  File "/Users/nyxz/dev/workspace/python/ocropy/ocrolib/native.py", line 67, in compile_and_load
    path = compile_and_find(c_string,**keys)
  File "/Users/nyxz/dev/workspace/python/ocropy/ocrolib/native.py", line 63, in compile_and_find
    raise CompileError()
ocrolib.native.CompileError

Previous steps

To get the requirements to install, I first had to install some additional packages. Here is everything I had to install:

pip
python3
virtualenv
hdf5
pylab

I use ZSH, as you may have already noticed. I don't know what impact this has on the installation, so I am just mentioning it.

Any suggestions on how to finish the installation so that the tests run normally are highly appreciated!
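
The linker message ld: library not found for -lgomp suggests that the compiler on this machine cannot link the GNU OpenMP runtime (libgomp) that the native LSTM code is built against. A small probe like the one below can confirm that before rerunning the tests. It is a sketch for Python 3 (it relies on shutil.which and subprocess.run). compiler_supports_openmp and the probe program are hypothetical, and -fopenmp is assumed to be the flag the build uses.

```python
import os
import shutil
import subprocess
import tempfile

# Tiny C program that only builds when OpenMP headers and runtime are available.
C_SOURCE = (
    "#include <omp.h>\n"
    "int main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }\n"
)


def compiler_supports_openmp(cc="cc"):
    """Return True/False if `cc` can build an OpenMP program, None if `cc` is not on PATH."""
    if shutil.which(cc) is None:
        return None
    workdir = tempfile.mkdtemp()
    try:
        src = os.path.join(workdir, "probe.c")
        exe = os.path.join(workdir, "probe")
        with open(src, "w") as handle:
            handle.write(C_SOURCE)
        # Compile and link in one step; a non-zero exit code means no OpenMP.
        result = subprocess.run(
            [cc, "-fopenmp", src, "-o", exe],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
    finally:
        shutil.rmtree(workdir, ignore_errors=True)


if __name__ == "__main__":
    print(compiler_supports_openmp("cc"))
```

If this returns False for the default compiler, running it again with the name of another installed compiler shows whether switching compilers would help.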


