
Comments (4)

githubharald avatar githubharald commented on July 20, 2024

Thanks for reporting this.
Interestingly, on my machine imread (OpenCV 3.0.0) returns None for damaged images, even when I use Python 2.7.12. I handled these damaged images in the function preprocess.

The OpenCV docs say the following about imread:

The function imread loads an image from the specified file and returns it. If the image cannot be read (because of missing file, improper permissions, unsupported or invalid format), the function returns an empty matrix ( Mat::data==NULL ).

Although this describes the C++ API, it says that damaged images yield an empty matrix (which presumably translates to None in Python), so there shouldn't be any crash.

Now the question is: why is your imread crashing the program instead of returning None? I don't think that this is the desired behaviour. What happens if you try to open the damaged image directly from the Python 2.7 interactive interpreter:
>>> print(cv2.imread('r06-022-03-05.png'))
It simply prints None on my machine.
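If the failure on the other machine turns out to be a Python exception rather than a hard interpreter crash, the two behaviours could be made uniform with a small wrapper. This is only a sketch: read_image_or_none and its reader parameter are hypothetical names, not part of the project or of OpenCV, and reader is assumed to be something like cv2.imread.

```python
def read_image_or_none(path, reader):
    """Call reader(path) and return its result, or None if it raises.

    reader is assumed to be an image-loading callable such as cv2.imread.
    Callers then only have to handle one failure signal: a None result.
    """
    try:
        img = reader(path)
    except Exception:
        # Treat a raised exception the same way as an unreadable image.
        return None
    return img
```

Note that this cannot catch a crash inside native code (for example a segfault), so it only helps if the damaged file triggers an ordinary exception.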

As a work-around, I would suggest checking for damaged files (files with 0 size) directly when reading the file words.txt, e.g. by adding the following code just before adding the word to the list of samples. This way, the check is only done once at the beginning of training.

# check if image is not empty
if not os.path.getsize(fileName):
	print('damaged img file:', fileName)
	continue

On my machine it reports the following two damaged files, which are now skipped during training:
damaged img file: ../data/words/a01/a01-117/a01-117-05-02.png
damaged img file: ../data/words/r06/r06-022/r06-022-03-05.png
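The same size check can also be tried on its own, outside the loader, to see which files would be skipped. The sketch below is a standalone illustration, not the code that was merged: split_by_size is a hypothetical helper, and treating a missing file as damaged is an extra assumption that the snippet above does not make.

```python
import os


def split_by_size(file_names):
    """Split paths into (usable, damaged).

    A path counts as damaged if it is missing or has a size of 0 bytes.
    """
    usable, damaged = [], []
    for file_name in file_names:
        if os.path.isfile(file_name) and os.path.getsize(file_name) > 0:
            usable.append(file_name)
        else:
            damaged.append(file_name)
    return usable, damaged
```

Calling it once on the list of image paths before training keeps the check out of the per-batch loading code, which matches the intent of doing the check only once at the beginning.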

from simplehtr.

Chazzz avatar Chazzz commented on July 20, 2024

Thanks for the detailed response. Based on the docs, this sounds like a bug in cv2. Your proposed workaround resolves the cv2.imread issue on my system, and is significantly more mergeable than my fix. My only suggestion would be to tell the user that detecting these 2 damaged image files is expected behaviour.


githubharald avatar githubharald commented on July 20, 2024

Merged your PR. I additionally used __future__ imports for print and division. Please let me know if you encounter any more problems with Python 2.


Chazzz avatar Chazzz commented on July 20, 2024

Good idea. Builds and runs without errors.

