This Python script solves CAPTCHAs using image processing techniques and OCR (Optical Character Recognition). It preprocesses CAPTCHA images, aligns letters for better OCR accuracy, and extracts the text from the images. It also scrapes data from the EPFO website using Selenium, automating the process of solving the CAPTCHA, entering company details, and downloading the payment details Excel file.
The CAPTCHA solver script performs the following tasks:
- Preprocesses CAPTCHA images to enhance readability.
- Aligns letters in CAPTCHA images to improve OCR accuracy.
- Extracts text from preprocessed CAPTCHA images using OCR.
- Provides a method to integrate with Selenium for solving CAPTCHAs in web automation tasks.
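The preprocessing and alignment steps listed above can be illustrated with a small, dependency-free sketch. It is not the code from utils.py, which is not shown here; the function names `binarize`, `split_letters`, and `align_letter`, the fixed threshold of 128, and the nested-list image format are all assumptions made for illustration. The sketch thresholds a grayscale image, splits it into letters at ink-free columns, and shifts each letter to a common top edge.

```python
from typing import List, Tuple

# Hypothetical image format: rows of grayscale values, 0 (black) to 255 (white).
Image = List[List[int]]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (ink) if darker than the threshold, else 0 (background)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def split_letters(binary: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges of letters, separated by ink-free columns."""
    width = len(binary[0]) if binary else 0
    ink_per_column = [sum(row[x] for row in binary) for x in range(width)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(ink_per_column):
        if ink and start is None:
            start = x  # first inked column of a new letter
        elif not ink and start is not None:
            spans.append((start, x))  # an empty column closes the letter
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


def align_letter(binary: Image, span: Tuple[int, int], height: int) -> Image:
    """Crop one letter, move its top ink row to row 0, and pad below to `height` rows."""
    start, end = span
    rows = [row[start:end] for row in binary]
    ink_rows = [y for y, row in enumerate(rows) if any(row)]
    cropped = rows[ink_rows[0]:ink_rows[-1] + 1]
    blank = [0] * (end - start)
    return cropped + [blank[:] for _ in range(height - len(cropped))]


# Tiny 4x5 example: two one-pixel-wide "letters" at different vertical offsets.
captcha = [
    [255, 255, 255, 255, 255],
    [10, 255, 255, 255, 255],
    [10, 255, 255, 20, 255],
    [255, 255, 255, 20, 255],
]
binary = binarize(captcha)
spans = split_letters(binary)
letters = [align_letter(binary, span, height=len(binary)) for span in spans]
```

After alignment both letters occupy the same rows, which is the property an OCR engine benefits from. A real pipeline would more likely load the image with an imaging library and pass the cleaned result to an OCR engine rather than work on nested lists.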
- Python 3.11 or higher
- Chrome driver
- The requirements.txt file should be verified before running the code.
- Clone the repository.
- Install the required dependencies from requirements.txt.
- Run the main.py file.
The script performs the following tasks:
- Solves the CAPTCHA automatically using image processing techniques.
- Preprocesses CAPTCHA images to enhance readability.
- Aligns letters in CAPTCHA images to improve OCR accuracy.
- Extracts text from preprocessed CAPTCHA images using OCR.
- Enters the company name and solves the CAPTCHA on the EPFO website.
- Navigates to the payment details page and downloads the Excel file containing payment details.
- main.py: Main Python script to initiate the scraping process.
- requirements.txt: File containing the required Python packages.
- data/: Directory where the downloaded Excel file is stored.
- utils.py: Helper functions for image processing and CAPTCHA solving.
- Install the required dependencies using pip install -r requirements.txt.
- Execute python main.py.
- Follow the on-screen instructions.
Unit tests are provided to verify the correctness of the scraped data. Run the test_scrape_data() function after scraping to ensure data integrity.
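The body of test_scrape_data() is not shown here, so the following is only a sketch of one integrity check such a test might perform. It assumes the download is an .xlsx file (a ZIP archive, which starts with the bytes `PK\x03\x04`); the helper name `find_valid_downloads` is hypothetical, and a temporary directory stands in for data/.

```python
import tempfile
from pathlib import Path
from typing import List

# .xlsx files are ZIP archives, so a real download starts with these bytes.
XLSX_MAGIC = b"PK\x03\x04"


def find_valid_downloads(directory: Path) -> List[Path]:
    """Return the .xlsx files in `directory` whose first bytes match the ZIP signature."""
    valid = []
    for path in sorted(directory.glob("*.xlsx")):
        with path.open("rb") as fh:
            if fh.read(4) == XLSX_MAGIC:
                valid.append(path)
    return valid


# Demonstration: one plausible download and one saved error page.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "good.xlsx").write_bytes(XLSX_MAGIC + b"rest of archive")
    (data_dir / "bad.xlsx").write_bytes(b"<html>error page</html>")
    found = [p.name for p in find_valid_downloads(data_dir)]
```

A check like this catches the common failure where a rejected CAPTCHA leaves an HTML error page saved under the expected file name. It does not validate the spreadsheet contents.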
This CAPTCHA solver is designed for learning purposes and should not be used for any illegal or unethical activities. Respect the terms of service of websites and APIs when using CAPTCHA solving techniques.