ocrd_segment's Introduction

ocrd_segment

This repository aims to provide a number of OCR-D compliant processors for layout analysis and evaluation.

Installation

In your Python virtual environment, run:

pip install ocrd_segment

Usage

Contains processors for various tasks:

  • exporting segment images (including results from preprocessing like cropping/masking, deskewing, dewarping or binarization) along with polygon coordinates and metadata:
  • importing layout segmentations from other formats:
  • post-processing or repairing layout segmentations:
    • ocrd-segment-repair (validity and consistency of all coordinates; also, for regions, reduce overlaps/redundancy between neighbours, and/or simplify polygons, and/or shrink to the alpha shape of foreground contours)
    • ocrd-segment-project (remake segment coordinates into the concave hull / alpha shape of their constituents)
    • ocrd-segment-replace-original (rebase all segments on cropped+deskewed border frame as new full page)
    • ocrd-segment-replace-page (2 input fileGrps; overwrite segmentation below page of first fileGrp by all segments of second fileGrp, rebasing all coordinates; "inverse" of replace-original)
    • ocrd-segment-replace-text (insert text below page from single-segment text files; "inverse" of extract-*)
  • comparing different layout segmentations:
  • pattern-based segmentation (input file groups N=1, based on a PAGE template, e.g. from Aletheia, and some XSLT or Python to apply it to the input file group)
    • ocrd-segment-via-template 🚧 (unpublished)
  • data-driven segmentation (input file groups N=1, based on a statistical model, e.g. Neural Network)
    • ocrd-segment-via-model 🚧 (unpublished)

For detailed behaviour, see --help on each processor CLI. For a detailed description of input/output and parameters, see ocrd-tool.json or --dump-json on each processor CLI.

Development

Prerequisites

Building the shapely binary requirement needs the libgeos-dev library; see Shapely Installation from source. Please ensure it is available before trying to install the local requirements.

Testing

None yet.

ocrd_segment's People

Contributors

bertsky, jbballing, joschrew, kba, m3ssman, tboenig, wrznr


ocrd_segment's Issues

change default output filegroup for `ocrd-segment-replace-original`

The default output filegroup for ocrd-segment-replace-original is set to OCR-D-IMG-CROP, which already exists in the majority of METS files. It would be great to change the default value, so that users are not forced to specify it themselves (for the other processors this is purely optional).
@kba suggested that changing the default value might not actually be necessary if we could drop the rule for two output filegroups for this processor as well.

more geometry heuristics for validate/repair

We should have heuristics to check for

  • polygon containment (overlapping regions, word outside line etc.)
  • artifacts from annotation like point or line-like regions
  • lines with (way) too much whitespace (bad cropping, or bad segmentation)
  • probably even: missing @orientation
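
As a rough illustration of the second heuristic (point-like or line-like regions), here is a minimal sketch using the shoelace formula in plain Python. The function names and the `min_area` threshold are assumptions made for this example, not part of ocrd_segment:

```python
def polygon_area(points):
    """Return the absolute area of a polygon given as [(x, y), ...] (shoelace formula)."""
    area2 = 0
    # Pair every vertex with its successor, wrapping around to the first one.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area2 += x1 * y2 - x2 * y1
    return abs(area2) / 2


def is_degenerate(points, min_area=1.0):
    """Flag point-like or line-like outlines: too few vertices or (almost) no area."""
    return len(points) < 3 or polygon_area(points) < min_area


# A proper rectangle versus a line-like "region" whose points are all collinear.
rectangle = [(237, 383), (237, 438), (443, 438), (443, 383)]
line_like = [(10, 10), (50, 10), (90, 10)]
```

A validator could report segments for which `is_degenerate` is true rather than passing them on to geometry operations.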

Originally posted by @kba in OCR-D/assets#28 (comment)

Error in shapely/ocrd_segment

The segment-repair processor in the following workflow:

ocrd process \
"olena-binarize -I OCR-D-IMG -O OCR-D-BIN -P impl sauvola" \
"anybaseocr-crop -I OCR-D-BIN -O OCR-D-CROP" \
"olena-binarize -I OCR-D-CROP -O OCR-D-BIN2 -P impl kim" \
"cis-ocropy-denoise -I OCR-D-BIN2 -O OCR-D-BIN-DENOISE -P level-of-operation page" \
"cis-ocropy-deskew -I OCR-D-BIN-DENOISE -O OCR-D-BIN-DENOISE-DESKEW -P level-of-operation page" \
"tesserocr-segment-region -I OCR-D-BIN-DENOISE-DESKEW -O OCR-D-SEG-REG" \
"segment-repair -I OCR-D-SEG-REG -O OCR-D-SEG-REPAIR -P plausibilize true" \
"cis-ocropy-deskew -I OCR-D-SEG-REPAIR -O OCR-D-SEG-REG-DESKEW -P level-of-operation region" \
"cis-ocropy-clip -I OCR-D-SEG-REG-DESKEW -O OCR-D-SEG-REG-DESKEW-CLIP -P level-of-operation region" \
"tesserocr-segment-line -I OCR-D-SEG-REG-DESKEW-CLIP -O OCR-D-SEG-LINE" \
"segment-repair -I OCR-D-SEG-LINE -O OCR-D-SEG-REPAIR-LINE -P sanitize true" \
"cis-ocropy-dewarp -I OCR-D-SEG-REPAIR-LINE -O OCR-D-SEG-LINE-RESEG-DEWARP" \
"calamari-recognize -I OCR-D-SEG-LINE-RESEG-DEWARP -O OCR-D-OCR -P checkpoint_dir qurator-gt4histocr-1.0"

executed on the DEFAULT file group inside this workspace:
https://content.staatsbibliothek-berlin.de/dc/PPN631277528.mets.xml

produces the following error:

  12:45:52.522 INFO processor.RepairSegmentation - INPUT FILE 0 / PHYS_0001
  12:45:52.524 INFO ocrd.page_validator.validate - Validating input file 'FILE_0001_OCR-D-SEG-LINE'
  12:45:52.652 INFO processor.RepairSegmentation - INPUT FILE 1 / PHYS_0002
  12:45:52.654 INFO ocrd.page_validator.validate - Validating input file 'FILE_0002_OCR-D-SEG-LINE'
  12:45:52.776 INFO processor.RepairSegmentation - INPUT FILE 2 / PHYS_0003
  12:45:52.777 INFO ocrd.page_validator.validate - Validating input file 'FILE_0003_OCR-D-SEG-LINE'
  12:45:52.912 INFO processor.RepairSegmentation - INPUT FILE 3 / PHYS_0004
  12:45:52.914 INFO ocrd.page_validator.validate - Validating input file 'FILE_0004_OCR-D-SEG-LINE'
  12:45:53.017 INFO processor.RepairSegmentation - INPUT FILE 4 / PHYS_0005
  12:45:53.019 INFO ocrd.page_validator.validate - Validating input file 'FILE_0005_OCR-D-SEG-LINE'
  12:45:53.026 WARNING processor.RepairSegmentation - Fixed CoordinateValidityError for SeparatorRegion 'region0011'
  12:45:53.027 WARNING processor.RepairSegmentation - Fixed CoordinateValidityError for SeparatorRegion 'region0012'
  12:45:53.119 WARNING processor.RepairSegmentation - Zero contour area in region "region0000"
  12:45:53.730 WARNING processor.RepairSegmentation - Zero contour area in region "region0011"
  12:45:53.734 WARNING processor.RepairSegmentation - Zero contour area in region "region0012"
  12:45:54.609 INFO processor.RepairSegmentation - INPUT FILE 5 / PHYS_0006
  12:45:54.610 INFO ocrd.page_validator.validate - Validating input file 'FILE_0006_OCR-D-SEG-LINE'
  12:45:54.708 INFO processor.RepairSegmentation - INPUT FILE 6 / PHYS_0007
  12:45:54.710 INFO ocrd.page_validator.validate - Validating input file 'FILE_0007_OCR-D-SEG-LINE'
  12:45:54.812 WARNING processor.RepairSegmentation - Zero contour area in region "region0003"
  12:45:55.186 ERROR shapely.geos - TopologyException: side location conflict at 262 1071. This can occur if the input geometry is invalid.
  Traceback (most recent call last):
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/bin/ocrd-segment-repair", line 8, in <module>
      sys.exit(ocrd_segment_repair())
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
      return self.main(*args, **kwargs)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1055, in main
      rv = self.invoke(ctx)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
      return ctx.invoke(self.callback, **ctx.params)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 760, in invoke
      return __callback(*args, **kwargs)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/cli.py", line 21, in ocrd_segment_repair
      return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/decorators/__init__.py", line 108, in ocrd_cli_wrap_processor
      run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 88, in run_processor
      processor.process()
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 188, in process
      padding=self.parameter['sanitize_padding'])
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 559, in shrink_regions
      if len(contour) >= 3], scale=scale)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/project.py", line 179, in join_polygons
      jointp = unary_union(polygons)
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/shapely/ops.py", line 161, in unary_union
      return geom_factory(lgeos.methods['unary_union'](collection))
    File "/home/mm/venv37-ocrd/sub-venv/headless-tf1/lib/python3.7/site-packages/shapely/geometry/base.py", line 73, in geom_factory
      raise ValueError("No Shapely geometry can be created from null value")
  ValueError: No Shapely geometry can be created from null value

This is the input image:
FILE_0007_DEFAULT

Build fails for MacOS (ocrd-fork-pycocotools)

Running make all for ocrd_all or pip install . for ocrd_segment fails on MacOS with Homebrew:

      Compiling pycocotools/_mask.pyx because it changed.
      [1/1] Cythonizing pycocotools/_mask.pyx
      /private/var/folders/wf/g2hmm5bd72v2r_p0r1smct_00000gn/T/pip-install-0xp4jh31/ocrd-fork-pycocotools_7b0159a305264f708a622a0e4daa80bd/.eggs/Cython-3.0.0a11-py3.9.egg/Cython/Compiler/Main.py:345: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /private/var/folders/wf/g2hmm5bd72v2r_p0r1smct_00000gn/T/pip-install-0xp4jh31/ocrd-fork-pycocotools_7b0159a305264f708a622a0e4daa80bd/pycocotools/_mask.pyx
        tree = Parsing.p_module(s, pxd, full_module_name)
      building 'pycocotools._mask' extension
      creating build/common
      creating build/temp.macosx-12-arm64-cpython-39
      creating build/temp.macosx-12-arm64-cpython-39/common
      creating build/temp.macosx-12-arm64-cpython-39/pycocotools
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -I/OCR-D/venv-20221112/lib/python3.9/site-packages/numpy/core/include -I./common -I/OCR-D/venv-20221112/include -I/opt/homebrew/opt/python@3.9/Frameworks/Python.framework/Versions/3.9/include/python3.9 -c ../common/maskApi.c -o build/temp.macosx-12-arm64-cpython-39/../common/maskApi.o -Wno-cpp -Wno-unused-function -std=c99
      clang: error: no such file or directory: '../common/maskApi.c'
      clang: error: no input files
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for ocrd-fork-pycocotools

evaluate: false redundant matches if overlaps occur on any side already

The multi-match overlap algorithm (necessary to calculate over- and undersegmentation) still has a glitch: it will create fake/redundant pairings if either side has a segmentation that already overlaps locally. For example, take a page with a GraphicRegion overlapping multiple TextRegions, and evaluate that against itself: the matching will not only produce the 1:1 pairs, but also other matches. That's probably not what we want.

Update README, only announce features that are actually provided

README announces ocrd-segment-via-template and ocrd-segment-via-model, neither of which is actually provided by this package.
It does provide some ocrd-segment-extract-* features; these do not do any segmentation though (or I could not find out how).

Provide more output image formats

Hello orc-deers,

please provide a possibility to control what kind of image-data is created by segmentation. For instance, for an easy out-of-the-box integration with tesstrain it's fine to output image data in TIFF-format as well.

Thanks in advance!

ocrd-segment-extract-lines - Lines are not extracted, in case they are in an area of other lines

Hi,
I think I have found a bug in ocrd-segment-extract-lines:
I cannot prove it with 100% certainty, but in my environment it looks as if a line is not extracted (no image is created) when its coordinates lie within another line of the same region.
In this case I extract images only, using this command:

ocrd-segment-extract-lines -I $infolder -O $extractLineImagesFolder  -P  output-types '[]' -P min-line-length 0 -P min-line-width 5 -P min-line-height 5

PAGE excerpt: here the line TR-15_line0002 was not extracted:

    <pc:TextRegion id="TR-15" orientation="0.">
      <pc:AlternativeImage filename="OCR-D-REG-VL-BL/OCR-D-REG-VL-BL_4749_007817786_00183_TR-15.IMG-DESKEW.png" comments=",binarized,deskewed,verticallinesremoved" />
      <pc:Coords points="237,383 237,438 443,438 443,383" />
      <pc:TextLine id="TR-15_line0001">
        <pc:Coords points="237,438 237,383 239,383 253,391 311,391 320,383 349,383 357,390 365,383 384,383 402,391 419,383 427,383 430,418 428,438 302,438 298,435 289,435 284,438" />
        <pc:Baseline points="227,415 430,418" />
      </pc:TextLine>
      <pc:TextLine id="TR-15_line0003">
        <pc:Coords points="261,438 269,433 274,433 295,438" />
        <pc:Baseline points="254,475 295,475" />
      </pc:TextLine>
      <pc:TextLine id="TR-15_line0002">
        <pc:Coords points="385,438 388,435 388,434 409,434 409,438" />
        <pc:Baseline points="343,478 412,475" />
      </pc:TextLine>
    </pc:TextRegion>

Logfile content for this case:

2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.189 WARNING processor.ExtractLines - Line 'TR-14_line0001' contains no text content
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.201 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-14_TR-14_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-14_TR-14_line0001.bin.png
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.242 WARNING processor.ExtractLines - Line 'TR-15_line0001' contains no text content
2022-08-11_14-21-13-extractlines.log:2022-08-11 14:21:31.255 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0001.bin.png
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.256 WARNING processor.ExtractLines - Line 'TR-15_line0003' contains no text content
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.267 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0003.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-15_TR-15_line0003.bin.png
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.268 WARNING processor.ExtractLines - Line 'TR-15_line0002' contains no text content
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.311 WARNING processor.ExtractLines - Line 'TR-16_line0001' contains no text content
2022-08-11_14-21-13-extractlines.log-2022-08-11 14:21:31.348 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-16_TR-16_line0001.bin, file_grp: OCR-D-SEG-LINE-CCS-IMG-BL, path: OCR-D-SEG-LINE-CCS-IMG-BL/OCR-D-SEG-LINE-CCS-IMG-BL-4749_007817786_00183_TR-16_TR-16_line0001.bin.png

Conversion Error

When using ocrd-segment-repair, I encountered the following error:

10:30:48.758 INFO processor.RepairSegmentation - Sanitizing region "region0071"
/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/bin/ocrd-segment-repair", line 8, in <module>
    sys.exit(ocrd_segment_repair())
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/cli.py", line 13, in ocrd_segment_repair
    return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd/decorators.py", line 60, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd/processor/base.py", line 57, in run_processor
    processor.process()
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/repair.py", line 88, in process
    self.sanitize_page(page, page_id)
  File "/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/ocrd_segment/repair.py", line 202, in sanitize_page
    scale = int(np.median(np.array(heights)))
ValueError: cannot convert float NaN to integer

The region in question belongs to a newspaper digitization.

It is possible to work around this at line 202 in repair.py (see above) with a check like

            _median = np.median(np.array(heights))
            if not np.isnan(_median):
                scale = int(_median)
            else:
                scale = 1

which then yields, at the same place,

10:48:47.496 INFO processor.RepairSegmentation - Sanitizing region "region0071"
/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/hartwig/Projekte/ulb-dd-ocr-eval-ocrd/venv/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
10:48:47.699 WARNING processor.RepairSegmentation - Zero contour area in region "region0071"

but this way the processing moves on.
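
The same guard can also be expressed by checking for an empty list before taking the median, which avoids the NumPy runtime warnings altogether. The following is only a sketch: it uses the standard library's `statistics.median` in place of `np.median`, and the helper name `median_scale` and its `default` parameter are made up for illustration:

```python
import statistics


def median_scale(heights, default=1):
    """Return the integer median of `heights`, or `default` if there is nothing to measure."""
    if not heights:
        # No contour heights were collected, so there is no meaningful median.
        return default
    return int(statistics.median(heights))
```

For example, `median_scale([])` falls back to the default, while `median_scale([10, 30, 20])` gives 20.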

impossible repair: local variable referenced before assignment

platform: ocrd/all:maximum
makefile: crop-anyocr-binarize-page-olena-sauvola-denoise-ocropy-deskew-page-ocropy-segment-tesseract-ocropy-dewarp-ocr-ocropy-tesseract (added binarize before anyocr)

Actual Behavior

Recipe crashing at Segmentation

Stacktrace

07:20:24.307 INFO processor.RepairSegmentation - INPUT FILE 13 / IMG_OCR-D-SEG-BLOCK-tesseract_16457324
Traceback (most recent call last):
  File "/usr/bin/ocrd-segment-repair", line 8, in <module>
    sys.exit(ocrd_segment_repair())
  File "/usr/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/ocrd_segment/cli.py", line 13, in ocrd_segment_repair
    return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/ocrd/decorators.py", line 54, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/usr/lib/python3.6/site-packages/ocrd/processor/base.py", line 57, in run_processor
    processor.process()
  File "/usr/lib/python3.6/site-packages/ocrd_segment/repair.py", line 166, in process
    _plausibilize_group(regionspolys, rogroup, mark_for_deletion, mark_for_merging)
  File "/usr/lib/python3.6/site-packages/ocrd_segment/repair.py", line 312, in _plausibilize_group
    for elem in regionrefs:
UnboundLocalError: local variable 'regionrefs' referenced before assignment
Makefile:304: recipe for target 'OCR-D-plausible' failed
make[1]: *** [OCR-D-plausible] Error 1
make[1]: Leaving directory '/data'
Makefile:194: recipe for target '.' failed
make: Leaving directory '/data'
make: *** [.] Error 2

Suggested Fix

Add regionrefs = list() to the initialized variables, like this:

def _plausibilize_group(regionspolys, rogroup, mark_for_deletion, mark_for_merging):
    wait_for_deletion = list()
    reading_order = dict()
    regionrefs = list()
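
To show why the up-front initialization helps, here is a small stand-alone reproduction of the failure pattern. Both functions are hypothetical and only mimic the shape of the problem; they are not taken from repair.py:

```python
def collect_refs_unsafe(group):
    """Mimics the failing pattern: `refs` is only bound inside one branch."""
    if group is not None:
        refs = list(group)
    # Raises UnboundLocalError when `group` is None, because `refs` was never assigned.
    return list(refs)


def collect_refs_safe(group):
    """Initialize `refs` up front so every code path sees a bound name."""
    refs = list()
    if group is not None:
        refs = list(group)
    return list(refs)
```

With the initialization in place, the loop simply iterates over an empty list when no branch has filled it.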

repair: inaccurate coordinates (tiny inconsistency/invalidity)

From discussion on OCR-D/core#418:

Additionally, IMO the coordinate checks should be made a little less strict (and thus more compatible with Aletheia) to avoid crying wolf.

Things I see frequently:

  1. very small (up to 1 pixel) violations of non-containment in parent element
    • Shapely does not have almost_within, but one could try containment within the dilated version:
      if not (child_poly.within(node_poly) or
              child_poly.within(node_poly.buffer(0.5)))
  2. tiny (direct neighbour) self-intersections because of back-and-forth (probably caused by internal rounding)
    • This must be repaired on the spot, otherwise Shapely will not operate on these polygons. Possibly:
      if not node_poly.is_valid:
          if node_poly.simplify(0.8).is_valid:
              node_poly = node_poly.simplify(0.8)

But it could be more prudent to keep a strict validator, and outsource these repairs into a dedicated Aletheia postprocessor (e.g. ocrd-segment-repair with a new correct-coords=true).

Originally posted by @bertsky in OCR-D/core#418 (comment)

documentation: README completeness, debug ocrd-tool.json

Please debug your ocrd-tool.json file.
I found an error:

<report valid="false">
  <error>[tools.ocrd-segment-evaluate] 'output_file_grp' is a required property</error>
</report>

You can find the ocrd-tool.json documentation at: https://ocr-d.github.io/ocrd_tool

Please check your README file and complete it. An ideal README file looks like this:

# Name of application


## Introduction
...

## Installation
...

## Usage
...

## Testing
...

Thank you very much.

cannot upload to pypi anymore

The change c875627 depends on ppwwyyxx/cocoapi#7, an addition of mine to the current pycocotools version 2.0.3 on PyPI. Such git URL references are allowed in requirements.txt / setuptools, but the PyPI server refuses to accept such builds:

Invalid value for requires_dist. Error: Can't have direct dependency: 'pycocotools @ git+https://github.com/bertsky/pycocotools#subdirectory=PythonAPI'

@kba, do you know what to do under such circumstances?

Expand regions via repair/sanitize

Before sanitization:
image

Regions are often too small and do not span the lines they (should) contain.

After sanitization:
image

Situation is not much better, although

$ ocrd-segment-repair -J
...
"sanitize": {
   "type": "boolean",
   "default": false,
   "description": "Shrink and/or expand a region in such a way that it coordinates include those of all its lines"
  }
...

Expansion does not work, or is at least incomplete.

ocrd-segment-repair: handle case where points is empty

Version 0.1.20, ocrd/core 2.33.0

I have a PAGE file, which does not have any real content - like this:

    <pc:Page imageFilename="OCR-D-IMG/0038_IMAGE000918_00001.tif" imageWidth="1420" imageHeight="2313" orientation="0.">
        <pc:AlternativeImage filename="OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png" comments=",binarized"/>
        <pc:TextRegion id="TR-1" orientation="0.">
            <pc:Coords points=""/>
        </pc:TextRegion>
    </pc:Page>

If I call ocrd-segment-extract-lines, I get an exception like this:

09:19:19.733 DEBUG ocrd.workspace.image_from_page - page 'P_0038_IMAGE000918_00001' has  orientation=0 skew=0.00
09:19:19.733 DEBUG ocrd.workspace.image_from_page - Using AlternativeImage 1 {'', 'binarized'} for page 'P_0038_IMAGE000918_00001'
09:19:19.734 DEBUG ocrd.workspace.download_file - download_file <OcrdFile fileGrp=OCR-D-BIN ID=OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN, mimetype=image/png, url=OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png, local_filename=OCR-D-BIN/OCR-D-BIN_0038_IMAGE000918_00001.IMG-BIN.png]/>  [_recursion_count=0]
09:19:19.735 DEBUG PIL.PngImagePlugin - STREAM b'IHDR' 16 13
09:19:19.735 DEBUG PIL.PngImagePlugin - STREAM b'IDAT' 41 65536
Traceback (most recent call last):
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/bin/ocrd-segment-extract-lines", line 8, in <module>
    sys.exit(ocrd_segment_extract_lines())
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_segment/cli.py", line 65, in ocrd_segment_extract_lines
    return ocrd_cli_wrap_processor(ExtractLines, *args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/decorators/__init__.py", line 88, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/processor/helpers.py", line 88, in run_processor
    processor.process()
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_segment/extract_lines.py", line 171, in process
    transparency=self.parameter['transparency'])
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/workspace.py", line 829, in image_from_segment
    fill=fill, transparency=transparency)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd/workspace.py", line 1012, in _crop
    segment_polygon = coordinates_of_segment(segment, parent_image, parent_coords)
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_utils/image.py", line 136, in coordinates_of_segment
    polygon = np.array(polygon_from_points(segment.get_Coords().points))
  File "/home/ocrdadmin/ocrd_all/venv/sub-venv/headless-tf1/lib/python3.6/site-packages/ocrd_utils/image.py", line 148, in polygon_from_points
    polygon.append([float(x_y[0]), float(x_y[1])])
ValueError: could not convert string to float: 

My expectation would be that this PAGE file is simply ignored.
--> please, clarify ...
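
One conceivable way to make such a case skippable is to parse the points string defensively and check the vertex count before cropping. This is a sketch only: `parse_points` and `has_usable_coords` are hypothetical helpers written for this example and are not the actual `polygon_from_points` from ocrd_utils:

```python
def parse_points(points):
    """Parse a PAGE-style points string ("x1,y1 x2,y2 ...") into [[x, y], ...].

    Returns an empty list for an empty or whitespace-only string instead of raising.
    """
    polygon = []
    # str.split() without arguments yields no items at all for an empty string.
    for pair in points.split():
        x, y = pair.split(',')
        polygon.append([float(x), float(y)])
    return polygon


def has_usable_coords(points, min_points=3):
    """A segment can only be cropped if its outline has at least `min_points` vertices."""
    return len(parse_points(points)) >= min_points
```

A processor could then log a warning and skip any segment for which `has_usable_coords` is false, instead of aborting the whole run.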

Processor ocrd-segment-repair exits with exception

Log output:

12:06:01.982 INFO ocrd.task_sequence.run_tasks - Start processing task 'segment-repair -I OCR-D-SEG-REG -O OCR-D-SEG-REPAIR -p '{"plausibilize": true, "sanitize": false, "plausibilize_merge_min_overlap": 0.9}''
Traceback (most recent call last):
  File "/venv-20200919/bin/ocrd", line 8, in <module>
    sys.exit(cli())
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/venv-20200919/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/venv-20200919/lib/python3.7/site-packages/ocrd/cli/process.py", line 28, in process_cli
    run_tasks(mets, log_level, page_id, tasks, overwrite)
  File "/venv-20200919/lib/python3.7/site-packages/ocrd/task_sequence.py", line 149, in run_tasks
    raise Exception("%s exited with non-zero return value %s. STDOUT:\n%s\nSTDERR:\n%s" % (task.executable, returncode, out, err))
Exception: ocrd-segment-repair exited with non-zero return value 1. STDOUT:

STDERR:
12:06:02.420 INFO processor.RepairSegmentation - INPUT FILE 0 / PHYS_0001
12:06:02.423 INFO ocrd.page_validator - Validating input file 'FILE_0001_OCR-D-SEG-REG'
12:06:02.439 INFO processor.RepairSegmentation - INPUT FILE 1 / PHYS_0002
12:06:02.440 INFO ocrd.page_validator - Validating input file 'FILE_0002_OCR-D-SEG-REG'
Traceback (most recent call last):
  File "/venv-20200919/local/sub-venv/headless-tf1/bin/ocrd-segment-repair", line 8, in <module>
    sys.exit(ocrd_segment_repair())
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/cli.py", line 16, in ocrd_segment_repair
    return ocrd_cli_wrap_processor(RepairSegmentation, *args, **kwargs)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/decorators.py", line 102, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd/processor/helpers.py", line 69, in run_processor
    processor.process()
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_segment/repair.py", line 94, in process
    parents = list(set([region.parent_object_ for region in page.get_AllRegions(classes=['Text'])]))
  File "/venv-20200919/local/sub-venv/headless-tf1/lib/python3.7/site-packages/ocrd_models/ocrd_page_generateds.py", line 2905, in __hash__
    return hash(self.id)
AttributeError: 'PageType' object has no attribute 'id'
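
The traceback shows `set()` calling a `__hash__` that depends on an `id` attribute the page object does not have. One possible way around this, sketched below, is to deduplicate by object identity so that `hash()` is never called on the objects themselves. The helper `unique_by_identity` and the class `NoIdParent` are illustrative assumptions, not code from ocrd_segment or ocrd_models:

```python
def unique_by_identity(objects):
    """Deduplicate objects by identity, preserving first-seen order, without calling hash()."""
    seen = {}
    for obj in objects:
        # id(obj) is an int, so the object's own __hash__ is never invoked.
        seen.setdefault(id(obj), obj)
    return list(seen.values())


class NoIdParent:
    """Hypothetical stand-in for an object whose __hash__ relies on a missing attribute."""

    def __hash__(self):
        return hash(self.id)  # raises AttributeError: there is no `id` attribute
```

Applied to the list of parent objects, this would yield each distinct parent once, whether or not it carries an `id`.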

Processor segment-repair ends with Exception

The processor 'segment-repair' ends with the exception "Exception: ocrd-segment-repair exited with non-zero return value 1" if it comes after processor 'cis-ocropy-segment' in the workflow.

In a modified workflow, where processor 'cis-ocropy-segment' is replaced by processor 'tesserocr-segment-line', the processing runs.
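The traceback suggests that at least one text region has the page itself as its parent, and that building a `set` of parents then calls a `__hash__` which relies on an `id` attribute the page object does not have. The following is a minimal sketch of that failure mode and one possible workaround (de-duplicating by object identity). The classes `Generated`, `Page` and `Region` and the helper `unique_by_identity` are hypothetical stand-ins, not the actual `ocrd_models` or `ocrd_segment` code, and this is not necessarily how the problem was or should be fixed upstream.

```python
class Generated:
    """Hypothetical base class mimicking the __hash__ seen in the traceback."""

    def __hash__(self):
        return hash(self.id)


class Page(Generated):
    """Hypothetical stand-in for PageType: it never gets an `id` attribute."""


class Region(Generated):
    """Hypothetical stand-in for a region whose parent may be the page itself."""

    def __init__(self, region_id, parent):
        self.id = region_id
        self.parent_object_ = parent


def unique_by_identity(objects):
    """De-duplicate objects by identity, without ever calling __hash__ on them."""
    seen = {}
    for obj in objects:
        seen.setdefault(id(obj), obj)
    return list(seen.values())


page = Page()
regions = [Region("r1", page), Region("r2", page)]

# Same pattern as in the traceback: hashing the page fails for lack of `id`.
try:
    set(region.parent_object_ for region in regions)
    error = None
except AttributeError as exc:
    error = str(exc)

# Workaround sketch: collect the distinct parents without hashing them.
parents = unique_by_identity(region.parent_object_ for region in regions)
print(error)
print(len(parents))
```

Keying on `id(obj)` keeps one entry per distinct Python object, so it works regardless of whether a parent is a region (with an `id`) or the page (without one).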

evaluate: create PRImA layouteval schema as report

Even if we lack a free GUI, we should strive to generate the layouteval schema from PAGE as an evaluation report.
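For orientation when reading the current JSON report (example further down): the per-match values `IoGT`, `IoDT` and `IoU` look like plain ratios of the reported areas. The sketch below assumes `IoGT = I.area / GT.area`, `IoDT = I.area / DT.area` and `IoU = I.area / (GT.area + DT.area - I.area)`. This is inferred from the numbers in the example, not taken from the `ocrd_segment` source, and the function name `overlap_metrics` is hypothetical.

```python
def overlap_metrics(gt_area, dt_area, i_area):
    """Return IoGT, IoDT and IoU for one GT/DT match, given pixel areas."""
    # Union by inclusion-exclusion (assumption, see lead-in).
    union = gt_area + dt_area - i_area
    return {
        "IoGT": i_area / gt_area,  # share of the ground truth that is covered
        "IoDT": i_area / dt_area,  # share of the detection that is covered
        "IoU": i_area / union,     # intersection over union
    }


# Areas taken from the entry for GT.ID "TextRegion_1479743652909_687"
# in the example report.
metrics = overlap_metrics(gt_area=7684, dt_area=112796, i_area=7136)
print(metrics)
```

A low `IoDT` combined with an `IoGT` near 1.0, as in many of the `TextRegion` entries, indicates a small ground-truth region lying almost entirely inside a much larger detected region.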

Currently, our output looks like this:

example report

{
  "level-of-operation": "region",
  "ignore-subtype": true,
  "only-fg": true,
  "for-categories": "",
  "by-image": {
    "phys_0001": {
      "true_positives": {
        "SeparatorRegion": [
          {
            "GT.ID": "r_4",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_sep0004",
            "GT.area": 2215,
            "DT.area": 892,
            "I.area": 892,
            "IoGT": 0.40270880361173816,
            "IoDT": 1.0,
            "IoU": 0.40270880361173816
          }
        ],
        "TextRegion": [
          {
            "GT.ID": "region_1475249588483_205",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 364,
            "DT.area": 112796,
            "I.area": 364,
            "IoGT": 1.0,
            "IoDT": 0.003227064789531544,
            "IoU": 0.003227064789531544
          },
          {
            "GT.ID": "r_3_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 66605,
            "DT.area": 112796,
            "I.area": 66605,
            "IoGT": 1.0,
            "IoDT": 0.5904907975460123,
            "IoU": 0.5904907975460123
          },
          {
            "GT.ID": "r_3_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 4215,
            "DT.area": 112796,
            "I.area": 4215,
            "IoGT": 1.0,
            "IoDT": 0.03736834639526224,
            "IoU": 0.03736834639526224
          },
          {
            "GT.ID": "r_3_3",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 3995,
            "DT.area": 112796,
            "I.area": 3995,
            "IoGT": 1.0,
            "IoDT": 0.03541792262136955,
            "IoU": 0.03541792262136955
          },
          {
            "GT.ID": "r_3_4",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 4447,
            "DT.area": 112796,
            "I.area": 4447,
            "IoGT": 1.0,
            "IoDT": 0.03942515692045817,
            "IoU": 0.03942515692045817
          },
          {
            "GT.ID": "r_3_5",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 5095,
            "DT.area": 112796,
            "I.area": 5095,
            "IoGT": 1.0,
            "IoDT": 0.04517004149083301,
            "IoU": 0.04517004149083301
          },
          {
            "GT.ID": "r_3_6",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 6237,
            "DT.area": 112796,
            "I.area": 6237,
            "IoGT": 1.0,
            "IoDT": 0.055294513989857796,
            "IoU": 0.055294513989857796
          },
          {
            "GT.ID": "TextRegion_1479743672641_693",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 10032,
            "DT.area": 112796,
            "I.area": 10032,
            "IoGT": 1.0,
            "IoDT": 0.08893932408950672,
            "IoU": 0.08893932408950672
          },
          {
            "GT.ID": "TextRegion_1479743652909_687",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0001",
            "GT.area": 7684,
            "DT.area": 112796,
            "I.area": 7136,
            "IoGT": 0.9286829776158251,
            "IoDT": 0.06326465477499202,
            "IoU": 0.0629587803500847
          },
          {
            "GT.ID": "TextRegion_1475249644781_208",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0002",
            "GT.area": 7297,
            "DT.area": 7238,
            "I.area": 6689,
            "IoGT": 0.9166780868850213,
            "IoDT": 0.924150317767339,
            "IoU": 0.8525363242416518
          },
          {
            "GT.ID": "r_3_12",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0003",
            "GT.area": 1988,
            "DT.area": 52347,
            "I.area": 1988,
            "IoGT": 1.0,
            "IoDT": 0.03797734349628441,
            "IoU": 0.03797734349628441
          },
          {
            "GT.ID": "r_3_13",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0003",
            "GT.area": 49005,
            "DT.area": 52347,
            "I.area": 49003,
            "IoGT": 0.9999591878379758,
            "IoDT": 0.9361185932336141,
            "IoU": 0.9360828287073296
          },
          {
            "GT.ID": "r_3_14",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_region0003",
            "GT.area": 639,
            "DT.area": 52347,
            "I.area": 639,
            "IoGT": 1.0,
            "IoDT": 0.012207003266662846,
            "IoU": 0.012207003266662846
          }
        ]
      },
      "false_positives": {
        "SeparatorRegion": [
          {
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0001_sep0005",
            "area": 1061
          }
        ],
        "TextRegion": []
      },
      "false_negatives": {
        "SeparatorRegion": [
          {
            "GT.ID": "Separator_1475249605846_207",
            "area": 13984
          }
        ],
        "TextRegion": []
      },
      "oversegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.0
      },
      "undersegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.6153846153846154
      }
    },
    "phys_0002": {
      "true_positives": {
        "SeparatorRegion": [
          {
            "GT.ID": "r_25",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0002",
            "GT.area": 2574,
            "DT.area": 1046,
            "I.area": 992,
            "IoGT": 0.3853923853923854,
            "IoDT": 0.9483747609942639,
            "IoU": 0.3774733637747336
          },
          {
            "GT.ID": "r_28",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0003",
            "GT.area": 642,
            "DT.area": 648,
            "I.area": 642,
            "IoGT": 1.0,
            "IoDT": 0.9907407407407407,
            "IoU": 0.9907407407407407
          },
          {
            "GT.ID": "r_31",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0004",
            "GT.area": 489,
            "DT.area": 526,
            "I.area": 482,
            "IoGT": 0.9856850715746421,
            "IoDT": 0.9163498098859315,
            "IoU": 0.9043151969981238
          },
          {
            "GT.ID": "r_30",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0005",
            "GT.area": 419,
            "DT.area": 369,
            "I.area": 362,
            "IoGT": 0.863961813842482,
            "IoDT": 0.981029810298103,
            "IoU": 0.8497652582159625
          },
          {
            "GT.ID": "r_35",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0006",
            "GT.area": 321,
            "DT.area": 289,
            "I.area": 285,
            "IoGT": 0.8878504672897196,
            "IoDT": 0.986159169550173,
            "IoU": 0.8769230769230769
          },
          {
            "GT.ID": "r_29",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0007",
            "GT.area": 717,
            "DT.area": 726,
            "I.area": 716,
            "IoGT": 0.99860529986053,
            "IoDT": 0.9862258953168044,
            "IoU": 0.984869325997249
          },
          {
            "GT.ID": "r_34",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0008",
            "GT.area": 644,
            "DT.area": 661,
            "I.area": 641,
            "IoGT": 0.9953416149068323,
            "IoDT": 0.9697428139183056,
            "IoU": 0.9653614457831325
          },
          {
            "GT.ID": "r_32",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0009",
            "GT.area": 652,
            "DT.area": 1403,
            "I.area": 652,
            "IoGT": 1.0,
            "IoDT": 0.4647184604419102,
            "IoU": 0.4647184604419102
          },
          {
            "GT.ID": "r_36",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0009",
            "GT.area": 686,
            "DT.area": 1403,
            "I.area": 686,
            "IoGT": 1.0,
            "IoDT": 0.488952245188881,
            "IoU": 0.488952245188881
          },
          {
            "GT.ID": "r_33",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0010",
            "GT.area": 464,
            "DT.area": 404,
            "I.area": 401,
            "IoGT": 0.8642241379310345,
            "IoDT": 0.9925742574257426,
            "IoU": 0.8586723768736617
          },
          {
            "GT.ID": "r_37",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_sep0011",
            "GT.area": 465,
            "DT.area": 454,
            "I.area": 451,
            "IoGT": 0.9698924731182795,
            "IoDT": 0.9933920704845814,
            "IoU": 0.9636752136752137
          }
        ],
        "TextRegion": [
          {
            "GT.ID": "r_1_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 448,
            "DT.area": 151852,
            "I.area": 448,
            "IoGT": 1.0,
            "IoDT": 0.002950241024155098,
            "IoU": 0.002950241024155098
          },
          {
            "GT.ID": "r_1_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 9564,
            "DT.area": 151852,
            "I.area": 9564,
            "IoGT": 1.0,
            "IoDT": 0.06298237757816821,
            "IoU": 0.06298237757816821
          },
          {
            "GT.ID": "r_2_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 356,
            "DT.area": 151852,
            "I.area": 356,
            "IoGT": 1.0,
            "IoDT": 0.0023443879566946765,
            "IoU": 0.0023443879566946765
          },
          {
            "GT.ID": "TextRegion_1479403886958_5",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 5661,
            "DT.area": 151852,
            "I.area": 5661,
            "IoGT": 1.0,
            "IoDT": 0.03727971972710271,
            "IoU": 0.03727971972710271
          },
          {
            "GT.ID": "TextRegion_1479403909734_6",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 5987,
            "DT.area": 151852,
            "I.area": 5987,
            "IoGT": 1.0,
            "IoDT": 0.03942654690092985,
            "IoU": 0.03942654690092985
          },
          {
            "GT.ID": "r_5_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 6434,
            "DT.area": 151852,
            "I.area": 6434,
            "IoGT": 1.0,
            "IoDT": 0.04237020256565603,
            "IoU": 0.04237020256565603
          },
          {
            "GT.ID": "r_6_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 577,
            "DT.area": 151852,
            "I.area": 577,
            "IoGT": 1.0,
            "IoDT": 0.0037997523904854725,
            "IoU": 0.0037997523904854725
          },
          {
            "GT.ID": "TextRegion_1479403947382_7",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 7550,
            "DT.area": 151852,
            "I.area": 7550,
            "IoGT": 1.0,
            "IoDT": 0.04971946368832811,
            "IoU": 0.04971946368832811
          },
          {
            "GT.ID": "TextRegion_1479403979431_8",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 8223,
            "DT.area": 151852,
            "I.area": 8223,
            "IoGT": 1.0,
            "IoDT": 0.05415141058398967,
            "IoU": 0.05415141058398967
          },
          {
            "GT.ID": "r_10_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 5452,
            "DT.area": 151852,
            "I.area": 5452,
            "IoGT": 1.0,
            "IoDT": 0.035903379606458924,
            "IoU": 0.035903379606458924
          },
          {
            "GT.ID": "r_11_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 809,
            "DT.area": 151852,
            "I.area": 809,
            "IoGT": 1.0,
            "IoDT": 0.005327555777994363,
            "IoU": 0.005327555777994363
          },
          {
            "GT.ID": "TextRegion_1479404023350_9",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 10238,
            "DT.area": 151852,
            "I.area": 10238,
            "IoGT": 1.0,
            "IoDT": 0.0674209098332587,
            "IoU": 0.0674209098332587
          },
          {
            "GT.ID": "TextRegion_1479404138750_10",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 12574,
            "DT.area": 151852,
            "I.area": 12574,
            "IoGT": 1.0,
            "IoDT": 0.08280430945921029,
            "IoU": 0.08280430945921029
          },
          {
            "GT.ID": "r_14_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 5145,
            "DT.area": 151852,
            "I.area": 5145,
            "IoGT": 1.0,
            "IoDT": 0.03388167426178121,
            "IoU": 0.03388167426178121
          },
          {
            "GT.ID": "r_11_6",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 626,
            "DT.area": 151852,
            "I.area": 626,
            "IoGT": 1.0,
            "IoDT": 0.0041224350025024364,
            "IoU": 0.0041224350025024364
          },
          {
            "GT.ID": "TextRegion_1479404164750_11",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 10218,
            "DT.area": 151852,
            "I.area": 10218,
            "IoGT": 1.0,
            "IoDT": 0.06728920264468034,
            "IoU": 0.06728920264468034
          },
          {
            "GT.ID": "TextRegion_1479404196214_12",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 12333,
            "DT.area": 151852,
            "I.area": 12333,
            "IoGT": 1.0,
            "IoDT": 0.08121723783684114,
            "IoU": 0.08121723783684114
          },
          {
            "GT.ID": "r_19_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 5776,
            "DT.area": 151852,
            "I.area": 5776,
            "IoGT": 1.0,
            "IoDT": 0.038037036061428234,
            "IoU": 0.038037036061428234
          },
          {
            "GT.ID": "r_17_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 504,
            "DT.area": 151852,
            "I.area": 504,
            "IoGT": 1.0,
            "IoDT": 0.003319021152174486,
            "IoU": 0.003319021152174486
          },
          {
            "GT.ID": "TextRegion_1479404228230_13",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 13351,
            "DT.area": 151852,
            "I.area": 13351,
            "IoGT": 1.0,
            "IoDT": 0.08792113373547929,
            "IoU": 0.08792113373547929
          },
          {
            "GT.ID": "TextRegion_1479404237182_14",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 14758,
            "DT.area": 151852,
            "I.area": 14758,
            "IoGT": 1.0,
            "IoDT": 0.09718673445196639,
            "IoU": 0.09718673445196639
          },
          {
            "GT.ID": "region_1475750962946_8",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 808,
            "DT.area": 151852,
            "I.area": 808,
            "IoGT": 1.0,
            "IoDT": 0.005320970418565446,
            "IoU": 0.005320970418565446
          },
          {
            "GT.ID": "r_23_6",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0002_region0001",
            "GT.area": 1021,
            "DT.area": 151852,
            "I.area": 1021,
            "IoGT": 1.0,
            "IoDT": 0.006723651976924901,
            "IoU": 0.006723651976924901
          }
        ]
      },
      "false_positives": {
        "SeparatorRegion": [],
        "TextRegion": []
      },
      "false_negatives": {
        "SeparatorRegion": [
          {
            "GT.ID": "Graphic_1488818077814_475",
            "area": 18908
          }
        ],
        "TextRegion": []
      },
      "oversegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.0
      },
      "undersegmentation": {
        "SeparatorRegion": 0.016666666666666666,
        "TextRegion": 1.0
      }
    },
    "phys_0003": {
      "true_positives": {
        "SeparatorRegion": [
          {
            "GT.ID": "r_11",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_sep0002",
            "GT.area": 2419,
            "DT.area": 2463,
            "I.area": 2418,
            "IoGT": 0.9995866060355518,
            "IoDT": 0.9817295980511571,
            "IoU": 0.9813311688311688
          },
          {
            "GT.ID": "r_31",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_sep0003",
            "GT.area": 475,
            "DT.area": 753,
            "I.area": 475,
            "IoGT": 1.0,
            "IoDT": 0.6308100929614874,
            "IoU": 0.6308100929614874
          },
          {
            "GT.ID": "r_33",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_sep0003",
            "GT.area": 113,
            "DT.area": 753,
            "I.area": 102,
            "IoGT": 0.9026548672566371,
            "IoDT": 0.13545816733067728,
            "IoU": 0.13350785340314136
          },
          {
            "GT.ID": "r_30",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_sep0004",
            "GT.area": 628,
            "DT.area": 965,
            "I.area": 606,
            "IoGT": 0.964968152866242,
            "IoDT": 0.627979274611399,
            "IoU": 0.6139817629179332
          },
          {
            "GT.ID": "r_32",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_sep0004",
            "GT.area": 638,
            "DT.area": 965,
            "I.area": 445,
            "IoGT": 0.6974921630094044,
            "IoDT": 0.46113989637305697,
            "IoU": 0.3842832469775475
          }
        ],
        "TextRegion": [
          {
            "GT.ID": "r_1_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 506,
            "DT.area": 199098,
            "I.area": 506,
            "IoGT": 1.0,
            "IoDT": 0.0025414619935910958,
            "IoU": 0.0025414619935910958
          },
          {
            "GT.ID": "r_4_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 5747,
            "DT.area": 199098,
            "I.area": 5747,
            "IoGT": 1.0,
            "IoDT": 0.0288651819706878,
            "IoU": 0.0288651819706878
          },
          {
            "GT.ID": "r_3_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 664,
            "DT.area": 199098,
            "I.area": 664,
            "IoGT": 1.0,
            "IoDT": 0.0033350410350681575,
            "IoU": 0.0033350410350681575
          },
          {
            "GT.ID": "TextRegion_1475753702385_214",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 1580,
            "DT.area": 199098,
            "I.area": 1580,
            "IoGT": 1.0,
            "IoDT": 0.007935790414770615,
            "IoU": 0.007935790414770615
          },
          {
            "GT.ID": "TextRegion_1475753868353_284",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2904,
            "DT.area": 199098,
            "I.area": 2904,
            "IoGT": 1.0,
            "IoDT": 0.01458578187626194,
            "IoU": 0.01458578187626194
          },
          {
            "GT.ID": "TextRegion_1475753868353_283",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 3748,
            "DT.area": 199098,
            "I.area": 3748,
            "IoGT": 1.0,
            "IoDT": 0.0188249003003546,
            "IoU": 0.0188249003003546
          },
          {
            "GT.ID": "TextRegion_1479743908679_710",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 3674,
            "DT.area": 199098,
            "I.area": 3674,
            "IoGT": 1.0,
            "IoDT": 0.018453224040422305,
            "IoU": 0.018453224040422305
          },
          {
            "GT.ID": "TextRegion_1479743908679_709",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 3725,
            "DT.area": 199098,
            "I.area": 3725,
            "IoGT": 1.0,
            "IoDT": 0.018709379300645913,
            "IoU": 0.018709379300645913
          },
          {
            "GT.ID": "TextRegion_1479743989171_724",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 3640,
            "DT.area": 199098,
            "I.area": 3640,
            "IoGT": 1.0,
            "IoDT": 0.0182824538669399,
            "IoU": 0.0182824538669399
          },
          {
            "GT.ID": "TextRegion_1479743994179_732",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 3727,
            "DT.area": 199098,
            "I.area": 3727,
            "IoGT": 1.0,
            "IoDT": 0.018719424604968407,
            "IoU": 0.018719424604968407
          },
          {
            "GT.ID": "TextRegion_1479743994178_731",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 4158,
            "DT.area": 199098,
            "I.area": 4158,
            "IoGT": 1.0,
            "IoDT": 0.020884187686465962,
            "IoU": 0.020884187686465962
          },
          {
            "GT.ID": "r_5_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 3753,
            "DT.area": 199098,
            "I.area": 3753,
            "IoGT": 1.0,
            "IoDT": 0.018850013561160835,
            "IoU": 0.018850013561160835
          },
          {
            "GT.ID": "r_5_3",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 3889,
            "DT.area": 199098,
            "I.area": 3888,
            "IoGT": 0.999742864489586,
            "IoDT": 0.01952807160292921,
            "IoU": 0.019533094255090457
          },
          {
            "GT.ID": "r_7_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 5793,
            "DT.area": 199098,
            "I.area": 5793,
            "IoGT": 1.0,
            "IoDT": 0.029096223970105174,
            "IoU": 0.029096223970105174
          },
          {
            "GT.ID": "TextRegion_1475753825328_258",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 890,
            "DT.area": 199098,
            "I.area": 889,
            "IoGT": 0.998876404494382,
            "IoDT": 0.004465137771348783,
            "IoU": 0.00447016042351003
          },
          {
            "GT.ID": "r_6_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 1513,
            "DT.area": 199098,
            "I.area": 1513,
            "IoGT": 1.0,
            "IoDT": 0.007599272719967051,
            "IoU": 0.007599272719967051
          },
          {
            "GT.ID": "TextRegion_1475753831444_264",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2002,
            "DT.area": 199098,
            "I.area": 2001,
            "IoGT": 0.9995004995004995,
            "IoDT": 0.010050326974655696,
            "IoU": 0.010055349626816944
          },
          {
            "GT.ID": "TextRegion_1475753831444_263",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2625,
            "DT.area": 199098,
            "I.area": 2625,
            "IoGT": 1.0,
            "IoDT": 0.013184461923273966,
            "IoU": 0.013184461923273966
          },
          {
            "GT.ID": "TextRegion_1479743867699_705",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2488,
            "DT.area": 199098,
            "I.area": 2488,
            "IoGT": 1.0,
            "IoDT": 0.012496358577183096,
            "IoU": 0.012496358577183096
          },
          {
            "GT.ID": "TextRegion_1479743867698_704",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2570,
            "DT.area": 199098,
            "I.area": 2570,
            "IoGT": 1.0,
            "IoDT": 0.012908216054405367,
            "IoU": 0.012908216054405367
          },
          {
            "GT.ID": "r_8_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2471,
            "DT.area": 199098,
            "I.area": 2471,
            "IoGT": 1.0,
            "IoDT": 0.012410973490441892,
            "IoU": 0.012410973490441892
          },
          {
            "GT.ID": "r_8_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2616,
            "DT.area": 199098,
            "I.area": 2616,
            "IoGT": 1.0,
            "IoDT": 0.01313925805382274,
            "IoU": 0.01313925805382274
          },
          {
            "GT.ID": "TextRegion_1479743815947_697",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2659,
            "DT.area": 199098,
            "I.area": 2659,
            "IoGT": 1.0,
            "IoDT": 0.013355232096756372,
            "IoU": 0.013355232096756372
          },
          {
            "GT.ID": "TextRegion_1479743815946_696",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2374,
            "DT.area": 199098,
            "I.area": 2374,
            "IoGT": 1.0,
            "IoDT": 0.011923776230800912,
            "IoU": 0.011923776230800912
          },
          {
            "GT.ID": "region_1475753754785_230",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 2693,
            "DT.area": 199098,
            "I.area": 2693,
            "IoGT": 1.0,
            "IoDT": 0.013526002270238776,
            "IoU": 0.013526002270238776
          },
          {
            "GT.ID": "r_9_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 119496,
            "DT.area": 199098,
            "I.area": 119495,
            "IoGT": 0.9999916315190467,
            "IoDT": 0.6001818200082372,
            "IoU": 0.6001868426603983
          },
          {
            "GT.ID": "r_10_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0003_region0001",
            "GT.area": 803,
            "DT.area": 199098,
            "I.area": 803,
            "IoGT": 1.0,
            "IoDT": 0.004033189685481522,
            "IoU": 0.004033189685481522
          }
        ]
      },
      "false_positives": {
        "SeparatorRegion": [],
        "TextRegion": []
      },
      "false_negatives": {
        "SeparatorRegion": [
          {
            "GT.ID": "Separator_1475753666973_204",
            "area": 12556
          }
        ],
        "TextRegion": []
      },
      "oversegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.0
      },
      "undersegmentation": {
        "SeparatorRegion": 0.4444444444444444,
        "TextRegion": 1.0
      }
    },
    "phys_0004": {
      "true_positives": {
        "SeparatorRegion": [
          {
            "GT.ID": "r_10",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_sep0006",
            "GT.area": 1785,
            "DT.area": 1234,
            "I.area": 1206,
            "IoGT": 0.6756302521008404,
            "IoDT": 0.9773095623987034,
            "IoU": 0.6651958080529509
          },
          {
            "GT.ID": "r_11",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_sep0007",
            "GT.area": 258,
            "DT.area": 568,
            "I.area": 254,
            "IoGT": 0.9844961240310077,
            "IoDT": 0.4471830985915493,
            "IoU": 0.44405594405594406
          },
          {
            "GT.ID": "r_14",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_sep0008",
            "GT.area": 279,
            "DT.area": 497,
            "I.area": 270,
            "IoGT": 0.967741935483871,
            "IoDT": 0.5432595573440644,
            "IoU": 0.5335968379446641
          }
        ],
        "TextRegion": [
          {
            "GT.ID": "r_6_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0001",
            "GT.area": 497,
            "DT.area": 137997,
            "I.area": 497,
            "IoGT": 1.0,
            "IoDT": 0.0036015275694399156,
            "IoU": 0.0036015275694399156
          },
          {
            "GT.ID": "r_6_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0001",
            "GT.area": 106875,
            "DT.area": 137997,
            "I.area": 106875,
            "IoGT": 1.0,
            "IoDT": 0.7744733581164808,
            "IoU": 0.7744733581164808
          },
          {
            "GT.ID": "region_1475755598239_323",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0001",
            "GT.area": 9170,
            "DT.area": 137997,
            "I.area": 9170,
            "IoGT": 1.0,
            "IoDT": 0.06645071994318717,
            "IoU": 0.06645071994318717
          },
          {
            "GT.ID": "r_9_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0001",
            "GT.area": 10298,
            "DT.area": 137997,
            "I.area": 10288,
            "IoGT": 0.9990289376577977,
            "IoDT": 0.07455234534084074,
            "IoU": 0.0745469432709935
          },
          {
            "GT.ID": "r_9_3",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0001",
            "GT.area": 7774,
            "DT.area": 137997,
            "I.area": 5079,
            "IoGT": 0.6533316182145613,
            "IoDT": 0.036805147937998654,
            "IoU": 0.03610013362522389
          },
          {
            "GT.ID": "r_9_4",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0002",
            "GT.area": 37009,
            "DT.area": 9341,
            "I.area": 9318,
            "IoGT": 0.2517765948823259,
            "IoDT": 0.9975377368590087,
            "IoU": 0.2516202203499676
          },
          {
            "GT.ID": "r_9_4",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0003",
            "GT.area": 37009,
            "DT.area": 1873,
            "I.area": 1873,
            "IoGT": 0.0506093112486152,
            "IoDT": 1.0,
            "IoU": 0.0506093112486152
          },
          {
            "GT.ID": "r_9_4",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0004",
            "GT.area": 37009,
            "DT.area": 3687,
            "I.area": 3686,
            "IoGT": 0.09959739522818774,
            "IoDT": 0.9997287767832926,
            "IoU": 0.09959470413401783
          },
          {
            "GT.ID": "r_9_4",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0005",
            "GT.area": 37009,
            "DT.area": 19158,
            "I.area": 18480,
            "IoGT": 0.4993379988651409,
            "IoDT": 0.964610084559975,
            "IoU": 0.49035476424231167
          },
          {
            "GT.ID": "r_9_5",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_region0005",
            "GT.area": 590,
            "DT.area": 19158,
            "I.area": 590,
            "IoGT": 1.0,
            "IoDT": 0.030796534084977557,
            "IoU": 0.030796534084977557
          }
        ],
        "GraphicRegion": [
          {
            "GT.ID": "Graphic_1478617598537_277",
            "DT.ID": "Graphic_1478617598537_277",
            "GT.area": 1543,
            "DT.area": 1543,
            "I.area": 1543,
            "IoGT": 1.0,
            "IoDT": 1.0,
            "IoU": 1.0
          },
          {
            "GT.ID": "Graphic_1478617605105_278",
            "DT.ID": "Graphic_1478617605105_278",
            "GT.area": 959,
            "DT.area": 959,
            "I.area": 959,
            "IoGT": 1.0,
            "IoDT": 1.0,
            "IoU": 1.0
          },
          {
            "GT.ID": "Graphic_1478617608880_279",
            "DT.ID": "Graphic_1478617608880_279",
            "GT.area": 990,
            "DT.area": 990,
            "I.area": 990,
            "IoGT": 1.0,
            "IoDT": 1.0,
            "IoU": 1.0
          },
          {
            "GT.ID": "Graphic_1478617637709_280",
            "DT.ID": "Graphic_1478617637709_280",
            "GT.area": 945,
            "DT.area": 945,
            "I.area": 945,
            "IoGT": 1.0,
            "IoDT": 1.0,
            "IoU": 1.0
          },
          {
            "GT.ID": "Graphic_1478617640501_281",
            "DT.ID": "Graphic_1478617640501_281",
            "GT.area": 1156,
            "DT.area": 1156,
            "I.area": 1156,
            "IoGT": 1.0,
            "IoDT": 1.0,
            "IoU": 1.0
          },
          {
            "GT.ID": "Graphic_1478617643372_282",
            "DT.ID": "Graphic_1478617643372_282",
            "GT.area": 1668,
            "DT.area": 1668,
            "I.area": 1668,
            "IoGT": 1.0,
            "IoDT": 1.0,
            "IoU": 1.0
          }
        ]
      },
      "false_positives": {
        "SeparatorRegion": [
          {
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0004_sep0009",
            "area": 621
          }
        ],
        "TextRegion": [],
        "GraphicRegion": []
      },
      "false_negatives": {
        "SeparatorRegion": [
          {
            "GT.ID": "r_12",
            "area": 296
          },
          {
            "GT.ID": "r_13",
            "area": 284
          },
          {
            "GT.ID": "Separator_1475754927411_317",
            "area": 13420
          },
          {
            "GT.ID": "Separator_1475755518880_322",
            "area": 71
          }
        ],
        "TextRegion": [],
        "GraphicRegion": []
      },
      "oversegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.11428571428571428,
        "GraphicRegion": 0.0
      },
      "undersegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.4,
        "GraphicRegion": 0.0
      }
    },
    "phys_0005": {
      "true_positives": {
        "SeparatorRegion": [],
        "TextRegion": [
          {
            "GT.ID": "region_1475759430611_18",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0001",
            "GT.area": 497,
            "DT.area": 156616,
            "I.area": 497,
            "IoGT": 1.0,
            "IoDT": 0.0031733667058282677,
            "IoU": 0.0031733667058282677
          },
          {
            "GT.ID": "r_6_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0001",
            "GT.area": 126553,
            "DT.area": 156616,
            "I.area": 126457,
            "IoGT": 0.9992414245414964,
            "IoDT": 0.8074334678449201,
            "IoU": 0.8069388432283424
          },
          {
            "GT.ID": "r_6_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0001",
            "GT.area": 21852,
            "DT.area": 156616,
            "I.area": 21798,
            "IoGT": 0.9975288303130149,
            "IoDT": 0.13918118199928486,
            "IoU": 0.13913320993170358
          },
          {
            "GT.ID": "r_8_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0001",
            "GT.area": 4028,
            "DT.area": 156616,
            "I.area": 4026,
            "IoGT": 0.9995034756703078,
            "IoDT": 0.02570618583031108,
            "IoU": 0.025705857564264644
          },
          {
            "GT.ID": "r_8_3",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0002",
            "GT.area": 1758,
            "DT.area": 358,
            "I.area": 358,
            "IoGT": 0.2036405005688282,
            "IoDT": 1.0,
            "IoU": 0.2036405005688282
          },
          {
            "GT.ID": "r_9_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0003",
            "GT.area": 1622,
            "DT.area": 1779,
            "I.area": 1621,
            "IoGT": 0.9993834771886559,
            "IoDT": 0.911186059584036,
            "IoU": 0.9106741573033708
          },
          {
            "GT.ID": "r_8_4",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0004",
            "GT.area": 1845,
            "DT.area": 347,
            "I.area": 347,
            "IoGT": 0.1880758807588076,
            "IoDT": 1.0,
            "IoU": 0.1880758807588076
          },
          {
            "GT.ID": "r_8_7",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0007",
            "GT.area": 561,
            "DT.area": 566,
            "I.area": 561,
            "IoGT": 1.0,
            "IoDT": 0.991166077738516,
            "IoU": 0.991166077738516
          },
          {
            "GT.ID": "r_10_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0008",
            "GT.area": 1123,
            "DT.area": 1133,
            "I.area": 1123,
            "IoGT": 1.0,
            "IoDT": 0.9911738746690203,
            "IoU": 0.9911738746690203
          }
        ],
        "GraphicRegion": [
          {
            "GT.ID": "Graphic_1488818176830_476",
            "DT.ID": "Graphic_1488818176830_476",
            "GT.area": 1311,
            "DT.area": 1311,
            "I.area": 1311,
            "IoGT": 1.0,
            "IoDT": 1.0,
            "IoU": 1.0
          }
        ]
      },
      "false_positives": {
        "SeparatorRegion": [
          {
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_sep0009",
            "area": 814
          }
        ],
        "TextRegion": [
          {
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0005",
            "area": 5840
          },
          {
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0005_region0006",
            "area": 2890
          }
        ],
        "GraphicRegion": []
      },
      "false_negatives": {
        "SeparatorRegion": [
          {
            "GT.ID": "r_11",
            "area": 8904
          },
          {
            "GT.ID": "Separator_1478617711310_283",
            "area": 15141
          }
        ],
        "TextRegion": [
          {
            "GT.ID": "r_8_2",
            "area": 18463
          },
          {
            "GT.ID": "r_8_5",
            "area": 25529
          },
          {
            "GT.ID": "TextRegion_1475759982805_45",
            "area": 21648
          },
          {
            "GT.ID": "TextRegion_1475759982805_44",
            "area": 28512
          }
        ],
        "GraphicRegion": []
      },
      "oversegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.0,
        "GraphicRegion": 0.0
      },
      "undersegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.038461538461538464,
        "GraphicRegion": 0.0
      }
    },
    "phys_0006": {
      "true_positives": {
        "SeparatorRegion": [
          {
            "GT.ID": "r_10",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_sep0006",
            "GT.area": 1606,
            "DT.area": 558,
            "I.area": 558,
            "IoGT": 0.34744707347447074,
            "IoDT": 1.0,
            "IoU": 0.34744707347447074
          }
        ],
        "TextRegion": [
          {
            "GT.ID": "r_5_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0001",
            "GT.area": 600,
            "DT.area": 83299,
            "I.area": 600,
            "IoGT": 1.0,
            "IoDT": 0.007202967622660536,
            "IoU": 0.007202967622660536
          },
          {
            "GT.ID": "r_5_2",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0001",
            "GT.area": 10250,
            "DT.area": 83299,
            "I.area": 10250,
            "IoGT": 1.0,
            "IoDT": 0.12305069688711749,
            "IoU": 0.12305069688711749
          },
          {
            "GT.ID": "r_5_3",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0001",
            "GT.area": 30461,
            "DT.area": 83299,
            "I.area": 30461,
            "IoGT": 1.0,
            "IoDT": 0.36568266125643767,
            "IoU": 0.36568266125643767
          },
          {
            "GT.ID": "r_5_4",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0001",
            "GT.area": 38093,
            "DT.area": 83299,
            "I.area": 38062,
            "IoGT": 0.9991862021893786,
            "IoDT": 0.4569322560895089,
            "IoU": 0.45676227049081963
          },
          {
            "GT.ID": "r_7_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0002",
            "GT.area": 2433,
            "DT.area": 2504,
            "I.area": 2433,
            "IoGT": 1.0,
            "IoDT": 0.9716453674121406,
            "IoU": 0.9716453674121406
          },
          {
            "GT.ID": "TextRegion_1475761269876_83",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0003",
            "GT.area": 13713,
            "DT.area": 13718,
            "I.area": 13710,
            "IoGT": 0.9997812294902647,
            "IoDT": 0.9994168246100015,
            "IoU": 0.9991983091611398
          },
          {
            "GT.ID": "region_1475761259018_82",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0004",
            "GT.area": 184,
            "DT.area": 51862,
            "I.area": 184,
            "IoGT": 1.0,
            "IoDT": 0.0035478770583471523,
            "IoU": 0.0035478770583471523
          },
          {
            "GT.ID": "region_1475761389513_84",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0004",
            "GT.area": 4528,
            "DT.area": 51862,
            "I.area": 4528,
            "IoGT": 1.0,
            "IoDT": 0.08730862674019513,
            "IoU": 0.08730862674019513
          },
          {
            "GT.ID": "r_8_1",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0004",
            "GT.area": 45761,
            "DT.area": 51862,
            "I.area": 45761,
            "IoGT": 1.0,
            "IoDT": 0.8823608807990436,
            "IoU": 0.8823608807990436
          },
          {
            "GT.ID": "TextRegion_1475761237646_79",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0004",
            "GT.area": 690,
            "DT.area": 51862,
            "I.area": 690,
            "IoGT": 1.0,
            "IoDT": 0.01330453896880182,
            "IoU": 0.01330453896880182
          },
          {
            "GT.ID": "TextRegion_1475761237646_78",
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_region0004",
            "GT.area": 901,
            "DT.area": 51862,
            "I.area": 901,
            "IoGT": 1.0,
            "IoDT": 0.01737302842158035,
            "IoU": 0.01737302842158035
          }
        ],
        "ImageRegion": []
      },
      "false_positives": {
        "SeparatorRegion": [],
        "TextRegion": [],
        "ImageRegion": [
          {
            "DT.ID": "OCR-D-GT-SEG-PAGE-BINPAGE-sauvola-DENOISE-ocropy-DESKEW-tesseract-DESKEW-ocropy_0006_image0005",
            "area": 8754
          }
        ]
      },
      "false_negatives": {
        "SeparatorRegion": [
          {
            "GT.ID": "Separator_1475761205058_77",
            "area": 16380
          }
        ],
        "TextRegion": [],
        "ImageRegion": []
      },
      "oversegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.0,
        "ImageRegion": -1
      },
      "undersegmentation": {
        "SeparatorRegion": 0.0,
        "TextRegion": 0.4090909090909091,
        "ImageRegion": -1
      }
    }
  },
  "by-category": {
    "bg": {},
    "SeparatorRegion": {
      "oversegmentation": 0.0,
      "undersegmentation": 0.027649769585253454,
      "segment-precision": 0.8571428571428571,
      "segment-recall": 0.6774193548387096,
      "IoGT": 0.8520799638945556,
      "IoDT": 0.7868156800908347,
      "IoU": 0.6601135855639871,
      "pixel-precision": 0.9190657251493753,
      "pixel-recall": 0.412130069419072
    },
    "TextRegion": {
      "oversegmentation": 0.0019342359767891683,
      "undersegmentation": 0.3568665377176016,
      "segment-precision": 0.9090909090909091,
      "segment-recall": 0.9574468085106383,
      "IoGT": 0.9439081132167711,
      "IoDT": 0.20961115880140255,
      "IoU": 0.15851362650646703,
      "pixel-precision": 0.9987255178164354,
      "pixel-recall": 0.9878159438943457
    },
    "GraphicRegion": {
      "oversegmentation": 0.0,
      "undersegmentation": 0.0,
      "segment-precision": 1.0,
      "segment-recall": 1.0,
      "IoGT": 1.0,
      "IoDT": 1.0,
      "IoU": 1.0,
      "pixel-precision": 1.0,
      "pixel-recall": 1.0
    },
    "ImageRegion": {
      "oversegmentation": -1,
      "undersegmentation": -1,
      "segment-precision": -1,
      "segment-recall": -1,
      "IoGT": -1,
      "IoDT": -1,
      "IoU": -1,
      "pixel-precision": -1,
      "pixel-recall": -1
    }
  },
  "scores": {
    "Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all ]": 0.4196631613710822,
    "Average Precision  (AP) @[ IoU=0.50      | area=   all ]": 0.4683218321832183,
    "Average Precision  (AP) @[ IoU=0.75      | area=   all ]": 0.42063052459092065,
    "Average Precision  (AP) @[ IoU=0.50:0.95 | area= small ]": 0.18292079207920792,
    "Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium ]": 0.41578893603646083,
    "Average Precision  (AP) @[ IoU=0.50:0.95 | area= large ]": 0.028855220686903858,
    "Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all ]": 0.4659574468085107,
    "Average Recall     (AR) @[ IoU=0.50:0.95 | area= small ]": 0.1866666666666667,
    "Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium ]": 0.5143097643097644,
    "Average Recall     (AR) @[ IoU=0.50:0.95 | area= large ]": 0.04565217391304348
  }
}
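
The per-match ratios in the report above appear to follow directly from the three area values of each match. The following is a minimal sketch, assuming that `IoGT` is the intersection over the ground-truth area, `IoDT` the intersection over the detection area, and `IoU` the intersection over the union. The function name `overlap_ratios` is hypothetical and not part of ocrd_segment. The input numbers are taken from the `r_10` separator match on `phys_0004`.

```python
def overlap_ratios(gt_area, dt_area, i_area):
    """Derive IoGT, IoDT and IoU from pixel areas (assumed definitions)."""
    # Union of two regions = sum of both areas minus the shared part.
    union = gt_area + dt_area - i_area
    return {
        "IoGT": i_area / gt_area,  # share of the ground truth that is covered
        "IoDT": i_area / dt_area,  # share of the detection that is correct
        "IoU": i_area / union,     # symmetric overlap measure
    }


# Areas from the report entry GT.ID "r_10" on page phys_0004.
ratios = overlap_ratios(gt_area=1785, dt_area=1234, i_area=1206)
for name, value in ratios.items():
    print(f"{name}: {value}")
```

Read this way, a high `IoGT` combined with a very low `IoDT` (as in many of the `TextRegion` matches) means that one large detected region swallows a small ground-truth region, which matches the high undersegmentation values reported for those pages.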

ocrd-segment-extract-lines ignores lines with "\n" in <TextEquiv><Unicode>

I ran ocrd-segment-extract-lines on a PAGE file in which some <TextLine> elements contain a "\n" in the <Unicode> element.
Unfortunately, these lines are not extracted.

This example works:

    <pc:TextEquiv>
      <pc:Unicode>1889</pc:Unicode>
    </pc:TextEquiv>

This example does not work:

    <pc:TextEquiv>
      <pc:Unicode>1889
      </pc:Unicode>
    </pc:TextEquiv>

==> please clarify ...
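
Until the behaviour is clarified, one possible workaround is to normalise the text content before running the extraction. The following is only a sketch under assumptions: the helper name `strip_unicode_whitespace` is hypothetical, the sample document and its namespace URI are illustrative, and whether the cleaned file is then extracted correctly has not been verified here.

```python
import xml.etree.ElementTree as ET

# Illustrative sample mirroring the failing case; the namespace URI is an assumption.
SAMPLE = """<pc:PcGts xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <pc:TextEquiv>
    <pc:Unicode>1889
      </pc:Unicode>
  </pc:TextEquiv>
</pc:PcGts>"""


def strip_unicode_whitespace(root):
    """Strip leading/trailing whitespace (including newlines) from all Unicode elements.

    Returns the number of elements whose text was changed.
    """
    changed = 0
    for elem in root.iter():
        # Compare only the local tag name so that any PAGE namespace version matches.
        if elem.tag.rsplit("}", 1)[-1] == "Unicode" and elem.text:
            cleaned = elem.text.strip()
            if cleaned != elem.text:
                elem.text = cleaned
                changed += 1
    return changed


root = ET.fromstring(SAMPLE)
n_changed = strip_unicode_whitespace(root)
texts = [e.text for e in root.iter() if e.tag.endswith("}Unicode")]
print(n_changed, texts)
```

Note that this strips the whitespace at the ends of the text only; a newline in the middle of a line's text would be left untouched.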

plausibilize and sanitize are too broad terms

ocrd-segment-repair has the optional operations "plausibilize" and "sanitize", and I have no idea what exactly they do :) I would prefer something like this:

  • shrink-regions-to-hull-of-lines
  • whatever-plausibilize-does

ocrd-segment-repair also seems to do something else beyond these two operations.

In other words: Make operations explicit.
