Comments (4)
Thanks for reporting this.
Interestingly, on my machine imread (OpenCV 3.0.0) returns None in case of damaged images (even if I use Python 2.7.12). I handled these damaged images in the function preprocess.
The OpenCV docs say the following about imread:
The function imread loads an image from the specified file and returns it. If the image cannot be read (because of missing file, improper permissions, unsupported or invalid format), the function returns an empty matrix ( Mat::data==NULL ).
So, even though they talk about the C++ API, they say that damaged images yield empty matrices (which might be translated to None in Python). So there shouldn't be any crash.
Now the question is: why is your imread crashing the program instead of returning None? I don't think that this is the desired behaviour. What happens if you try to open the damaged image directly from the Python 2.7 interactive interpreter?
>>> print(cv2.imread('r06-022-03-05.png'))
It simply prints None on my machine.
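If imread behaves differently across builds, one option is to wrap the read so that both outcomes end up as None. The following is only a minimal sketch: safe_read is a hypothetical helper (not part of SimpleHTR or OpenCV), and the reader argument stands in for a callable such as cv2.imread. It can only catch Python-level exceptions, so a native crash inside OpenCV would still bring the interpreter down.

```python
def safe_read(path, reader):
    """Return the image from reader(path), or None if reading fails.

    `reader` is any callable such as cv2.imread (hypothetical wiring);
    it is passed in so the sketch does not depend on an OpenCV build.
    """
    try:
        img = reader(path)
    except Exception:
        # Python-level errors only; a native crash cannot be caught here.
        return None
    # Treat a missing result or an empty array (size == 0) as unreadable.
    if img is None or getattr(img, 'size', 1) == 0:
        return None
    return img
```

A caller would then use something like safe_read(fileName, cv2.imread) and skip the sample whenever the result is None.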
As a work-around, I would suggest checking for damaged files (files with 0 size) directly when reading the file words.txt, e.g. by adding the following code just before adding the word to the list of samples. This way, the check is only done once at the beginning of training.
# check if image is not empty
if not os.path.getsize(fileName):
    print('damaged img file:', fileName)
    continue
It outputs the following two damaged files on my machine which won't be used for training now:
damaged img file: ../data/words/a01/a01-117/a01-117-05-02.png
damaged img file: ../data/words/r06/r06-022/r06-022-03-05.png
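The same size check can also be written as a standalone pass over a list of paths. This is a sketch under assumptions: split_damaged is a hypothetical name, and treating a missing file as damaged is my choice rather than something the loader does. It only detects zero-byte files; a non-empty but corrupt image would still need the None check on the imread result.

```python
import os


def split_damaged(paths):
    """Partition paths into (usable, damaged).

    Zero-byte files count as damaged; missing or unreadable files are
    treated the same way (an assumption made for this sketch).
    """
    usable, damaged = [], []
    for path in paths:
        try:
            is_empty = os.path.getsize(path) == 0
        except OSError:
            is_empty = True  # file is missing or cannot be stat'ed
        (damaged if is_empty else usable).append(path)
    return usable, damaged
```

Running it once before training and printing the damaged list gives the same kind of report as above without touching the training loop.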
from simplehtr.
Thanks for the detailed response. Based on the docs, sounds like a bug in cv2. Your proposed workaround resolves the cv2.imread issue on my system, and is significantly more mergeable than my fix. My only suggestion would be to indicate to the user that detecting these 2 damaged image files is expected behaviour.
I merged your PR and additionally used __future__ imports for print and division. Please let me know if you encounter any more problems with Python 2.
Good idea. Builds and runs without errors.