Given an image containing a paragraph written in Arabic, the system classifies the paragraph's font as one of four classes (labeled 0 to 3) using classical machine learning techniques.
Font Code | Font Name |
---|---|
0 | Scheherazade New |
1 | Times New Roman |
2 | Lemonada |
3 | IBM Plex Sans Arabic |
This preprocessing module is designed to enhance the quality of images containing text before further analysis or processing. Here's a breakdown of the module:
- Salt and Pepper Noise Detection and Removal:
  - The detect_salt_and_pepper function checks whether the image contains salt and pepper noise.
  - If noise is detected, a median filter (median_filter) is applied to reduce it.
- Binarization:
  - The binarizeImage function converts the grayscale image to a binary image, making text clearer for extraction.
  - It ensures the result has black text on a white background.
- Hough Line Transform:
  - The hough_transforms function detects lines in the binary image using the Hough line transform.
  - It rotates the image to align detected lines horizontally.
- Text Orientation Correction:
  - The pytesseract_orientation function uses Tesseract OCR to detect the text orientation.
  - It rotates the image based on the detected orientation for proper alignment.
- Image Preprocessing Pipeline:
  - The preprocess function orchestrates the entire preprocessing pipeline:
    - Loads the image.
    - Detects salt and pepper noise and removes it if present.
    - Binarizes the image.
    - Applies the Hough transform to align text horizontally.
    - Corrects the text orientation.
    - Saves the preprocessed image.
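The first stages of the pipeline above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the project's implementation: the function names echo the ones listed (binarize_image stands in for binarizeImage), but the impulse-noise heuristic, its thresholds (min_fraction, min_jump), and the use of Otsu's method for binarization are assumptions. The Hough-based deskewing and Tesseract orientation steps are omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(gray, size=3):
    """Replace each pixel with the median of its size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(gray, pad, mode="edge")  # repeat border pixels
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1)).astype(gray.dtype)


def detect_salt_and_pepper(gray, min_fraction=0.01, min_jump=100):
    """Assumed heuristic: count pure black/white pixels that differ sharply
    from their local median, and flag the image if there are enough of them."""
    local_median = median_filter(gray).astype(np.int16)
    extreme = (gray == 0) | (gray == 255)
    jump = np.abs(gray.astype(np.int16) - local_median) >= min_jump
    return bool(np.mean(extreme & jump) >= min_fraction)


def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                    # pixels with value <= t
    w1 = hist.sum() - w0                    # pixels with value > t
    sum0 = np.cumsum(hist * levels)
    mu0 = sum0 / np.maximum(w0, 1)
    mu1 = (sum0[-1] - sum0) / np.maximum(w1, 1)
    return int(np.argmax(w0 * w1 * (mu0 - mu1) ** 2))


def binarize_image(gray):
    """Binarize an 8-bit grayscale array to black text on a white background."""
    binary = np.where(gray > otsu_threshold(gray), 255, 0).astype(np.uint8)
    # Assumption: text covers less area than the background,
    # so a mostly black result means the polarity is inverted.
    if np.mean(binary == 255) < 0.5:
        binary = 255 - binary
    return binary


def preprocess_array(gray):
    """Denoise only when noise is detected, then binarize."""
    if detect_salt_and_pepper(gray):
        gray = median_filter(gray)
    return binarize_image(gray)
```

Here preprocess_array expects a 2-D uint8 array, for example one obtained by loading an image in grayscale mode, and returns an array containing only the values 0 and 255.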
- Segmentation of Text:
  - The find_contours function identifies contours within the thresholded and dilated image.
  - Contours represent the boundaries of distinct objects or regions within the image.
  - The function filters out small or insignificant contours based on width and height thresholds.
  - For each valid contour, it creates a bounding box around the region of interest and saves it as a separate image file in a specified output directory.
  - The function returns the count of extracted regions.
For feature extraction, we tried the following approaches:
- Horizontal and vertical histograms
- Entropy
- Histogram of Oriented Gradients (HOG)
- Scale-Invariant Feature Transform (SIFT)
- Local Phase Quantization (LPQ)

After comparing the results of the approaches listed above, we chose LPQ because it yielded the best results.
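To make the LPQ descriptor concrete, here is a minimal NumPy sketch of the basic variant: a uniform window, four low-frequency points, and no decorrelation step. The function name lpq_histogram and the default window size of 7 are assumptions, and the project's actual implementation may differ.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def lpq_histogram(gray, window=7):
    """Return a normalized 256-bin LPQ histogram for a 2-D grayscale array."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    img = np.asarray(gray, dtype=np.float64)
    radius = (window - 1) // 2
    x = np.arange(-radius, radius + 1)

    # 1-D basis functions of the short-term Fourier transform at frequency 1 / window.
    w0 = np.ones(window, dtype=np.complex128)
    w1 = np.exp(-2j * np.pi * x / window)
    w2 = np.conj(w1)

    # All window x window neighbourhoods: shape (H - window + 1, W - window + 1, window, window).
    patches = sliding_window_view(img, (window, window))

    def response(col, row):
        # Separable 2-D convolution; kernels are flipped because einsum correlates.
        return np.einsum("ijkl,k,l->ij", patches, col[::-1], row[::-1])

    # Four frequency points: (a, 0), (0, a), (a, a), (a, -a) with a = 1 / window.
    responses = [response(w0, w1), response(w1, w0), response(w1, w1), response(w1, w2)]

    # The signs of the real and imaginary parts give 8 bits per pixel.
    parts = []
    for resp in responses:
        parts.extend([resp.real, resp.imag])
    bits = np.stack(parts, axis=-1) > 0
    codes = (bits * (1 << np.arange(8))).sum(axis=-1)

    # The descriptor is the histogram of the 8-bit codes over the whole image.
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

The resulting 256-dimensional vector can be used directly as the feature vector for a classifier.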
For classification, we tried the following models:
- KNN
- SVM
- Decision Tree
- Random Forest

After comparing the results of the models listed above, we chose Random Forest because it yielded the best results with LPQ features.
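A comparison like the one described above could be run with scikit-learn as sketched below. Everything here is illustrative: compare_classifiers and make_placeholder_features are hypothetical helpers, the hyperparameters are library defaults or arbitrary choices, and the features are random placeholders standing in for LPQ histograms, so the scores they produce say nothing about the project's reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_placeholder_features(samples_per_font=50, n_bins=256, seed=0):
    """Random stand-in for LPQ histograms: each font gets its own mean vector."""
    rng = np.random.default_rng(seed)
    centers = rng.random((4, n_bins))
    features = np.vstack(
        [center + 0.05 * rng.standard_normal((samples_per_font, n_bins)) for center in centers]
    )
    labels = np.repeat(np.arange(4), samples_per_font)  # font codes 0 to 3
    return features, labels


def compare_classifiers(features, labels, seed=0):
    """Train each candidate on the same split and return its test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, stratify=labels, random_state=seed
    )
    candidates = {
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores
```

Calling compare_classifiers(*make_placeholder_features()) returns a dictionary mapping each model name to its accuracy on the held-out split.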
Testing the above classifiers gave the following total accuracies:
- KNN: 91.54%
- Decision Tree: 93.35%
- SVM: 71.19%
- Random Forest: 98.80%
Our system scored an accuracy of 98% on the test data.
Contributors:
- Ahmed Samy
- Kareem Samy
- Nancy Ayman
- Yara Hisham
This software is licensed under the MIT License. See License.