Certainly! Here's a clear and concise README file for your project:
This project leverages OpenCV, EasyOCR, and Google's Text-to-Speech (gTTS) to read and annotate text from an image and convert detected text to speech. The recognized text is highlighted in the original image, and the extracted text is saved as an audio file.
- Text Detection: Detects text from an input image using EasyOCR.
- Text Annotation: Highlights detected text with bounding boxes and labels it with the recognized text.
- Text-to-Speech Conversion: Converts all detected text to an audio file using gTTS.
Before running the project, ensure you have the following installed:
- Python 3.6+
- OpenCV (
opencv-python
) - EasyOCR
- matplotlib
- gTTS
You can install the required libraries with:
pip install opencv-python easyocr matplotlib gtts
-
Prepare Your Image: Place your target image in a known directory and set its path in the
image_path
variable within the script. -
Run the Script: Execute the script to detect and annotate text, then save it as an audio file.
python text_detection_to_speech.py
- Output:
- Annotated Image: Displays the image with highlighted text.
- Audio File: Generates an audio file (
output_audio.mp3
) containing the detected text.
- Confidence Threshold: The confidence threshold for accepting recognized text can be adjusted using the
threshold
variable. - Language Support: EasyOCR supports multiple languages. Modify the language list in
easyocr.Reader
as needed.
When you run the script, you will see an annotated image showing the detected text and an audio file containing the recognized content.
Let me know if you'd like to add more information or details!