karolzak / boxdetect

BoxDetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character or checkbox boxes on scanned forms.

License: MIT License

Python 100.00%

computer-vision cv2 rectangle-detection box-detection forms documents scanned-documents scanned-images scanned-image-pdfs bounding-boxes

boxdetect's Issues

As a user I want to automatically get optimal configuration based on provided ground truth

As described in #29 it would be great to have some kind of guidance on how to find the right config.

Can't detect table cells

Shouldn't it also be possible to detect tables and cells of tables?

What if I wanted to detect all the cells of such a table?

Unhandled exception when no rectangles found

Need to add a check and raise a graceful exception if no rects are found:
--> 109 mean_width = np.mean(rects[:, 2])

TypeError Traceback (most recent call last)
in
2
3 rects, grouping_rects, img, output_image = get_boxes(
----> 4 os.path.join(DATA_PATH, file_name), config=config, plot=False)

/media/shane/HD/anaconda3/envs/nlp/lib/python3.7/site-packages/boxdetect/pipelines.py in get_boxes(img, config, plot)
107 # merge rectangles into group if overlapping
108 rects = group_countours(cnts_list)
--> 109 mean_width = np.mean(rects[:, 2])
110 # mean_height = np.mean(rects[:, 3])
111 # group rectangles vertically (line by line)

TypeError: tuple indices must be integers or slices, not tuple
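
A guard of the kind requested here could look like the sketch below. It is not the library's actual fix: `mean_width_or_none` is a hypothetical helper, and it assumes, as the traceback suggests, that the grouping step can return an empty tuple instead of an N x 4 array whose column 2 holds the widths.

```python
import numpy as np

def mean_width_or_none(rects):
    """Return the mean of column 2 (width), or None when nothing was detected."""
    arr = np.asarray(rects)
    # An empty tuple or list becomes a 1-D array of size 0, which cannot be
    # indexed with [:, 2]; treat that case as "no rectangles found".
    if arr.size == 0 or arr.ndim != 2:
        return None
    return float(np.mean(arr[:, 2]))

print(mean_width_or_none(()))                                  # None
print(mean_width_or_none([(0, 0, 30, 32), (50, 0, 34, 32)]))   # 32.0
```

A caller can then log a warning and return early when the helper yields None, instead of letting the indexing error propagate.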

Which configurations should I use?

I'm currently trying to get started with the package.
When I use the configs and try to get results by following the steps in the readme, I just get WARNING: No rectangles were found in the input image., which is quite frustrating.

It's not very clear to me what exactly the different config options do.
So it would be great to have some more in-depth documentation about that.
It would also be great to have some recommended settings that work for a wide variety of projects (although I guess that's not so easy).

Add an option to save/load configs

Make it possible to save/load configs to .yaml or .json files
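
As a rough illustration of what this request amounts to, the sketch below round-trips a set of config attributes through a JSON file using only the standard library. It does not use boxdetect's own classes: `save_config`, `load_config` and the `SimpleNamespace` container are hypothetical stand-ins, and the attribute names are the ones quoted in the issues on this page.

```python
import json
import os
import tempfile
from types import SimpleNamespace

def save_config(cfg, path):
    """Write the attributes of a config object to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(vars(cfg), fh, indent=2)

def load_config(path):
    """Read a JSON file back into a simple attribute container."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    # JSON has no tuple type, so ranges come back as lists.
    return SimpleNamespace(**data)

# Stand-in config object (an assumption, not boxdetect's PipelinesConfig).
cfg = SimpleNamespace(width_range=(30, 70), height_range=(30, 70),
                      scaling_factors=[1.0], dilation_iterations=1)
path = os.path.join(tempfile.mkdtemp(), "boxdetect_cfg.json")
save_config(cfg, path)
restored = load_config(path)
print(restored.width_range)  # [30, 70]
```

Note the tuple-to-list conversion on reload; code that compares ranges after loading should not rely on the values being tuples.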

Cumulative results?

I have 2 documents which are almost identical, except that one is shrunk a bit and is a bit more grainy.

Boxdetect can get all the checkboxes on DocA but only some of the checkboxes on DocB, with this configuration:

# important to adjust these values to match the size of boxes on your image
cfg.width_range = [(30, 70)]
cfg.height_range = [(30, 70)]

# w/h ratio range for boxes/rectangles filtering
cfg.wh_ratio_range = [(0.8, 1.2)]

# num of iterations when running dilation transformation (to enhance the image)
cfg.dilation_iterations = [1]
cfg.dilation_kernel = [(1,1)]

When I attempt to capture more checkboxes in DocB with this configuration:

# important to adjust these values to match the size of boxes on your image
cfg.width_range = [(30, 70),(40, 70),(40, 70)]
cfg.height_range = [(30, 70),(40, 70),(40, 70)]

# w/h ratio range for boxes/rectangles filtering
cfg.wh_ratio_range = [(0.8, 1.2),(0.8, 1.2),(0.8, 1.2)]

# num of iterations when running dilation transformation (to enhance the image)
cfg.dilation_iterations = [1,2,1]
cfg.dilation_kernel = [(1,1),(2,2),(1,4)]

I'm finding that the checkboxes boxdetect originally captured in DocA are no longer captured, even though the original configuration is at index 0 of each list. Why is that?

Failure in UnitTests

Hi!

I've been having this issue for a while. Apart from an extra warning logged when running my code and my unit tests, there were no major issues either locally or in my company's Jenkins job.

But recently the Jenkins job started rejecting those tests, so I downloaded boxdetect, applied the recommendation from the first image, and now everything works fine!

(The failed test has nothing to do with boxdetect hehe)

Add missing docstrings

Need to make sure docstrings for all the functions and classes are in place

Removing noise while preserving the boundary of the checkbox

This is not a bug but a request for help.

I am trying to identify the checkboxes in the attached image (clip2.png). The top 4 are identified but the bottom 2 are not. I've tried various dilation and kernel sizes but I haven't been able to successfully detect those boxes. At the same time, I would like to get rid of the peppering to avoid false positives, as other docs have checkmarks that are much smaller.

I've attached the configuration (boxdetect_cfg.yaml.txt) being used as well.

Any suggestion will be appreciated.

boxdetect_cfg.yaml.txt
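
One common way to suppress isolated "pepper" pixels before detection is a small median filter (OpenCV provides `cv2.medianBlur` for this). The sketch below is a NumPy-only 3x3 version on a synthetic page, so it runs without OpenCV; it is an illustration of the general technique, not something taken from boxdetect, and it assumes the box outline is at least about 3 pixels thick, since thinner lines would be erased along with the noise.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_3x3(img):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    padded = np.pad(img, 1, mode="edge")
    windows = sliding_window_view(padded, (3, 3))  # shape (H, W, 3, 3)
    return np.median(windows, axis=(2, 3)).astype(img.dtype)

# Synthetic white page with a 3-pixel-thick box outline and one noise speck.
page = np.full((20, 20), 255, dtype=np.uint8)
page[4:16, 4:16] = 0        # filled square
page[7:13, 7:13] = 255      # hollow it out, leaving the outline
page[1, 1] = 0              # isolated pepper pixel

cleaned = median_filter_3x3(page)
print(cleaned[1, 1])   # 255: the speck is gone
print(cleaned[4, 10])  # 0: the outline edge survives
```

Outline corners get slightly rounded by this filter, so size and ratio ranges may need a little slack afterwards.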

Not detecting all the boxes

Hi,

Thanks for the amazing work! I am trying to use the config to detect boxes on my image, but it's not detecting all the boxes. What should I change?

I'm not sure why it's not detecting some of the boxes. Any ideas?

from boxdetect import config

cfg = config.PipelinesConfig()

# important to adjust these values to match the size of boxes on your image
cfg.width_range = (30,500)
cfg.height_range = (40,500)

# the more scaling factors the more accurate the results but also it takes more time to process
# too small scaling factor may cause false positives
# too big scaling factor will take a lot of processing time
cfg.scaling_factors = [1.0]

# w/h ratio range for boxes/rectangles filtering
cfg.wh_ratio_range = (0.5,4.0)

# group_size_range starting from 2 will skip all the groups
# with a single box detected inside (like checkboxes)
cfg.group_size_range = (0,0)

# num of iterations when running dilation transformation (to enhance the image)
cfg.dilation_iterations = 1

Strategies for getting accurate checkboxes on documents with Serif Font

Hi,

I am actually using this as part of an OCR pipeline I am building in a commercial product. First I want to say thank you for building something that really works and is open source. AWS Textract does checkboxes at 6 cents a page, which is too expensive to be used at the load the project is going for. So thank you for this amazing library!

I was wondering if you had any insight about how to get the right parameters for an image of grainy quality. I have written an OpenCV pipeline that takes PDFs and splits them into images, then rotates them using a deskewing library, applies a crop, and then produces a cropped correctly rotated color image and a correctly rotated BW thresholded image. I am trying to run boxdetect on both color and thresholded images, and facing a few challenges.

I was wondering if you had some general tips on how to ascertain checkboxes only - I am picking up zeros, lowercase N's (especially with Serif fonts), and other things. I rarely get 4 checkboxes which is what there is in the sample image, I sometimes get 3, or 5, or 8, etc. I also confess I don't use the True/False too often, but I love the percentage feature and the cropped matrix of the box as I personally find it very accurate (I notice checked boxes typically are 55% as opposed to 25% black)

Since I have to tally up the number of boxes I find, match them to an unfilled reference document, and find the checkbox based on the percentage of the region, it is most important to avoid missing true positives, but it would also be nice to have fewer false positives.

Here are some parameters I am using, and here is a reference image (3 checkboxes are found except the one that says en rampant de toitures). In other images, n's and E's are picked:

rotated = {
    "w": (25, 50),
    "h": (25, 30),
    "wh": (0.85, 1.15),
    "scale": [1.2, 1.0, .9, .7],
    "group": (2, 100),
    "iterations": 5,
}
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 129, 27)

The post-rotated images come in a variety of sizes of 11 by 8.5 sheets with about +/-200 pixels of white padding, so it is difficult for me to have a width or height range, but wh ratio is generally easy, and I could stretch them to a defined width and height. I have actually found the px_threshold (I sometimes use 0.1, or 0.3) to be very helpful. How is that used exactly? Also, what do you think the recommended kernel is for a 3600 by 2600 image? Any help would be appreciated!
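
The percentage-based check described in this issue can be sketched as follows. This is only a guess at the general idea, not boxdetect's implementation: `dark_fraction`, `looks_checked`, the `dark_below` cutoff and the 0.4 default are all hypothetical, and only the name `px_threshold` is borrowed from the question.

```python
import numpy as np

def dark_fraction(crop, dark_below=128):
    """Fraction of pixels in a grayscale crop darker than `dark_below`."""
    crop = np.asarray(crop)
    return float(np.mean(crop < dark_below))

def looks_checked(crop, px_threshold=0.4):
    """Call a box checked when its dark fraction exceeds the threshold."""
    return dark_fraction(crop) > px_threshold

# 10x10 crop with only a 1-pixel border: 36 of 100 pixels are dark.
empty_box = np.full((10, 10), 255, dtype=np.uint8)
empty_box[[0, -1], :] = 0
empty_box[:, [0, -1]] = 0

# Same crop with an X drawn on both diagonals: 52 of 100 pixels are dark.
checked_box = empty_box.copy()
idx = np.arange(1, 9)
checked_box[idx, idx] = 0
checked_box[idx, 9 - idx] = 0

print(dark_fraction(empty_box), looks_checked(empty_box))      # 0.36 False
print(dark_fraction(checked_box), looks_checked(checked_box))  # 0.52 True
```

Because the border itself contributes to the dark fraction, a usable threshold depends on line thickness relative to box size, which may explain why values need tuning per document.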

Failed detection of cropped image

Hello,
1. Boxdetect works well for the first image
Input:

Output:

2. Boxdetect fails to detect the checkboxes when used on a crop of the image
Input:

Output:

Why does this happen? I want Boxdetect to be able to detect checkboxes in cropped images as well.

New release for scikit-learn installation

Hello maintainers,

I saw there was a commit recently to replace sklearn with scikit-learn in the repository's requirements, as the former is now deprecated. Will there be a new release (1.0.1?) that includes this change?

Thanks,

Tomi

Is there a way to extract coordinates

Hi,

I am working with pdf files and I came across the box-detect library. Thanks for creating this amazing library. I am using the following PDF file
Form_49A.PDF
I am trying to annotate over the pdf file, however, for achieving this I was looking at ways to extract these boxes and then annotate. Is there a way to extract the coordinates for the boxes present in the pdf-file?
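
If the detected rectangles come back as pixel tuples, mapping them onto the PDF page is mostly a unit and origin conversion. The sketch below assumes an (x, y, w, h) layout with a top-left origin (the traceback earlier on this page only confirms that columns 2 and 3 are width and height), a known rasterization DPI, and a known page height in points; `rect_px_to_pdf_points` is a hypothetical helper, not part of boxdetect.

```python
def rect_px_to_pdf_points(rect, dpi, page_height_pt):
    """Convert an (x, y, w, h) pixel rect (top-left origin) into a PDF
    (x0, y0, x1, y1) box in points (bottom-left origin)."""
    x, y, w, h = rect
    scale = 72.0 / dpi  # default PDF user space has 72 points per inch
    x0 = x * scale
    x1 = (x + w) * scale
    y1 = page_height_pt - y * scale         # top edge of the box
    y0 = page_height_pt - (y + h) * scale   # bottom edge of the box
    return (x0, y0, x1, y1)

# US Letter page (612 x 792 points) rendered at 144 DPI.
print(rect_px_to_pdf_points((288, 144, 72, 36), dpi=144, page_height_pt=792))
# (144.0, 702.0, 180.0, 720.0)
```

The DPI must match whatever was used to render the PDF page to an image before detection, otherwise the annotations will be offset.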

challenging case on checkbox crossing outside box

Dear Boxdetect friends and the author Karolzak,

My image, attached, shows a checked X that crosses slightly outside the box. So far we cannot detect any checkbox, although we have tried different configs. I hope to hear your expertise on this challenging case.

Best,
--Luke

using boxdetect in a lambda errors due to GUI artifacts

When trying to use boxdetect in an AWS lambda, this error occurs when deploying:

[ERROR] Runtime.ImportModuleError: Unable to import module 'pd_ocr/handlers/s3_write': libGL.so.1: cannot open shared object file: No such file or directory Traceback (most recent call last):

The error occurs because it is trying to load OpenCV artifacts required for a GUI, which a lambda does not need.

Proposed solution: boxdetect should use opencv-python-headless instead of opencv-python. That way these unneeded GUI artifacts are not included allowing boxdetect to be used in a lambda.

Potential bug in `boxdetect.pipelines.get_boxes()` when resizing the image

It seems like it's scaling the height (image.shape[0]) with scaling_factor and then passing this value as the desired width into imutils.resize()

boxdetect/boxdetect/pipelines.py

Lines 48 to 49 in 37147f0

    
        image = imutils.resize(
            image, width=int(image.shape[0] * scaling_factor))

Proposed fix:

        image = imutils.resize(
            image, width=int(image.shape[1] * scaling_factor))
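
The difference is easy to see on a non-square array, since NumPy image arrays are shaped (rows, columns, channels), i.e. height first. In the sketch below `scaled_width` is a hypothetical helper used only to illustrate the proposed fix.

```python
import numpy as np

def scaled_width(image, scaling_factor):
    """Target width in pixels; image.shape is (rows, cols[, channels])."""
    return int(image.shape[1] * scaling_factor)

image = np.zeros((100, 300, 3), dtype=np.uint8)  # 100 px tall, 300 px wide

print(int(image.shape[0] * 0.5))  # 50: height-based value, too narrow here
print(scaled_width(image, 0.5))   # 150: width-based value
```

For square images both expressions agree, which may be why the problem is easy to miss.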

AttributeError: module 'boxdetect.config' has no attribute 'update_num_iterations'. Did you mean: 'dilation_iterations'?

I am trying to run a basic demo of the get_boxes method and there seems to be an internal error or maybe I'm missing a new parameter:

from boxdetect import config
from boxdetect.pipelines import get_boxes
import matplotlib.pyplot as plt

config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1,100)
config.horizontal_max_distance_multiplier = 2

image_path = 'input\large.png'
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)

print("Indv boxes (green):", rects)
print("Grouped boxes (red): ",grouped_rects)
plt.figure(figsize=(25,25))
plt.imshow(output_image)
plt.show()

==========================================
Processing file: input\large.png
Traceback (most recent call last):
File "c:\repos\wwex\boxes.py", line 15, in
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\repos\wwex.venv\Lib\site-packages\boxdetect\pipelines.py", line 147, in get_boxes
cfg.update_num_iterations()
^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'boxdetect.config' has no attribute 'update_num_iterations'. Did you mean: 'dilation_iterations'?

AttributeError: module 'boxdetect.config' has no attribute 'update_num_iterations'
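
One reading of this traceback is that the `config` module itself was passed to get_boxes, whereas other reports on this page first create an object with `config.PipelinesConfig()`. The message says the lookup failed on a module, which fits that reading. The sketch below reproduces the situation with stand-ins only: `PipelinesConfigStandIn`, `config_module` and `run_pipeline` are hypothetical and do not mirror boxdetect's real code beyond the method name shown in the traceback.

```python
import types

class PipelinesConfigStandIn:
    """Hypothetical stand-in for a config class that carries its own methods."""
    def __init__(self):
        self.dilation_iterations = 0
        self.num_iterations = 1

    def update_num_iterations(self):
        self.num_iterations = 1

# A bare module object, standing in for `from boxdetect import config`.
config_module = types.ModuleType("config")
config_module.dilation_iterations = 0
config_module.PipelinesConfig = PipelinesConfigStandIn

def run_pipeline(cfg):
    cfg.update_num_iterations()  # same call as in the traceback
    return cfg.num_iterations

try:
    run_pipeline(config_module)
except AttributeError as exc:
    print("module passed:", exc)

print("instance passed:", run_pipeline(config_module.PipelinesConfig()))
```

If that reading is right, setting the same attributes on a `config.PipelinesConfig()` instance and passing that instance would avoid the error.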

Default config for vertical grouping has bad results for vertically aligned checkboxes

I am creating this issue to help anybody having the same issue with vertically aligned checkboxes not being detected well.

The group_size_range config option gets overwritten to a hardcoded value of (1, 1) at the start of the get_checkboxes pipeline. So setting that config option does nothing when using this function.

By default in the config the vertical_max_distance option is set to 10, meaning if you are trying to detect vertically aligned checkboxes (like in a form) it will give really bad results as it will see the whole column as a single group. I don't know if this is intended and what the use case is. I don't quite understand the grouping logic in the library.

Ways to fix it would be to either set this option to 0, and then find and filter out unwanted close detections with your own needed logic. Or copy over the get_checkboxes function without that first hardcoding line (but this might group horizontal checkboxes). I don't understand the difference between the vertical and the horizontal grouping but vertical grouping for checkboxes seems to be a bit faulty.
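
The "filter out unwanted close detections with your own logic" step could be as simple as the sketch below. It assumes rectangles are (x, y, w, h) tuples and keeps the first of any pair whose centers are closer than a chosen distance; `drop_close_detections` is a hypothetical helper, not part of boxdetect.

```python
def drop_close_detections(rects, min_distance):
    """Keep a rect only if its center is at least `min_distance` pixels
    away from the center of every rect kept so far."""
    kept = []
    for x, y, w, h in rects:
        cx, cy = x + w / 2, y + h / 2
        too_close = any(
            ((cx - (kx + kw / 2)) ** 2 + (cy - (ky + kh / 2)) ** 2) ** 0.5
            < min_distance
            for kx, ky, kw, kh in kept
        )
        if not too_close:
            kept.append((x, y, w, h))
    return kept

rects = [(10, 10, 30, 30), (12, 11, 30, 30), (10, 60, 30, 30)]
print(drop_close_detections(rects, min_distance=15))
# [(10, 10, 30, 30), (10, 60, 30, 30)]
```

A distance of roughly half the expected checkbox size is a reasonable starting point, since two real checkboxes cannot have centers closer than that.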

checkbox detect fails with sloppy crosses

Hi, amazing module! Very useful. I use it to detect student answers in a research project.

Some of the students make the crosses over the border of the checkbox. Most of the time those square checkboxes seem to be not detected, many others get detected reliably. I tried many different values in the config. Any suggestions on how to proceed a bit smarter than just guessing values?
I attached some images of undetected and detected ones (the blue marks are made by the program to do a manual check. If a checkbox is detected as "checked" the blue square is filled out).

cfg.width_range = (25,42)
cfg.height_range = (25,42)
cfg.scaling_factors = [0.5]
cfg.wh_ratio_range = (0.3, 1.6)
cfg.dilation_iterations = 0

Check box mapping with text

Is there any way to map words or text to checkboxes, i.e. to determine which checkbox relates to which text? For example:

City : NEWTON Pin code : 07860
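
A simple heuristic for this, assuming word boxes are available from a separate OCR step, is to pick the nearest word that starts to the right of the checkbox on the same line. Everything in the sketch below is hypothetical: the (text, x, y, w, h) word format, the sample coordinates, and the helper `label_for_checkbox`; boxdetect itself is not involved.

```python
def label_for_checkbox(box, words):
    """Pick the nearest word that starts to the right of the box on the same line.

    box: (x, y, w, h); words: list of (text, x, y, w, h).
    """
    bx, by, bw, bh = box
    box_mid = by + bh / 2
    candidates = [
        (wx - (bx + bw), text)
        for text, wx, wy, ww, wh in words
        # word starts right of the box and spans the box's vertical midpoint
        if wx >= bx + bw and wy <= box_mid <= wy + wh
    ]
    return min(candidates)[1] if candidates else None

words = [("City", 60, 100, 40, 20), ("NEWTON", 120, 100, 80, 20),
         ("Pin", 60, 140, 30, 20)]
print(label_for_checkbox((20, 100, 20, 20), words))  # City
print(label_for_checkbox((20, 140, 20, 20), words))  # Pin
```

Forms that place labels to the left of or above the checkbox would need the direction test adjusted accordingly.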

Add full tests coverage

Need unit tests for all the functions
