This project extracts, transforms, and loads (ETL) data related to the market capitalization of the largest banks. The data is fetched from a web archive of a Wikipedia page, transformed according to specified exchange rates, and then loaded into a CSV file and an SQLite database.
- Clone the repository:
git clone https://github.com/ash-codess/etl-python.git
- Navigate to the project directory:
cd etl-python
- Install the required dependencies:
pip install -r requirements.txt
- Ensure you have the exchange rates CSV file (exchange_rate.csv) in the project directory.
- Run the main script:
python main.py
Fetches the HTML content of the given URL.
Extracts a table from the given HTML content based on the provided table attributes.
Extracts a table from the given URL and attributes.
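The extraction step can be illustrated with a small, dependency-free sketch. It is not the project's own code: the class `TableExtractor`, the helper `extract_table`, and the sample HTML are hypothetical, and the sketch parses an inline string instead of fetching a URL so it runs offline. It assumes the target table is identified by exact attribute values and does not handle nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the rows of the first <table> whose attributes match wanted_attrs."""

    def __init__(self, wanted_attrs):
        super().__init__()
        self.wanted = wanted_attrs
        self.rows = []          # completed rows, each a list of cell strings
        self._active = False    # True while inside the matching table
        self._done = False      # True once the matching table has closed
        self._row = None        # row currently being built
        self._cell = None       # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table" and not self._active and not self._done:
            if all(attrs.get(k) == v for k, v in self.wanted.items()):
                self._active = True
        elif self._active and tag == "tr":
            self._row = []
        elif self._active and tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if not self._active:
            return
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "table":
            self._active = False
            self._done = True

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def extract_table(html, wanted_attrs):
    """Return the matching table as a list of rows (header row first)."""
    parser = TableExtractor(wanted_attrs)
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample data standing in for the archived page.
SAMPLE_HTML = """
<table class="other"><tr><td>ignored</td></tr></table>
<table class="wikitable">
  <tr><th>Rank</th><th>Bank name</th><th>Market cap (US$ billion)</th></tr>
  <tr><td>1</td><td>Example Bank A</td><td>100.0</td></tr>
  <tr><td>2</td><td>Example Bank B</td><td>50.0</td></tr>
</table>
"""

rows = extract_table(SAMPLE_HTML, {"class": "wikitable"})
```

Here `rows[0]` holds the header cells and the remaining entries hold the data rows, which is the shape a DataFrame constructor would typically expect.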
Transforms the 'Market cap (US$ billion)' column in the DataFrame to different currencies.
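As a rough illustration of the transformation, the sketch below converts the US dollar figure into other currencies using a rates table. Several things here are assumptions: the `Currency,Rate` layout of the rates file, the rate values (invented for the example), the helper names `load_rates` and `add_converted_columns`, and the naming of the new columns. It also uses plain dictionaries in place of the DataFrame so that it runs without extra dependencies.

```python
import csv
import io

# Hypothetical file layout with made-up rates; the real exchange_rate.csv may differ.
RATES_CSV = "Currency,Rate\nEUR,0.5\nGBP,0.25\n"


def load_rates(csv_text):
    """Parse the rates CSV into a {currency code: rate} mapping."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Currency"]: float(row["Rate"]) for row in reader}


def add_converted_columns(records, rates, usd_key="Market cap (US$ billion)"):
    """Return copies of the records with one extra column per currency."""
    converted = []
    for record in records:
        new_record = dict(record)
        for currency, rate in rates.items():
            # Assumed column naming; values are rounded to two decimals.
            new_record[f"Market cap ({currency} billion)"] = round(record[usd_key] * rate, 2)
        converted.append(new_record)
    return converted


banks = [
    {"Bank name": "Example Bank A", "Market cap (US$ billion)": 100.0},
    {"Bank name": "Example Bank B", "Market cap (US$ billion)": 50.0},
]
rates = load_rates(RATES_CSV)
result = add_converted_columns(banks, rates)
```

The original records are left untouched, so the converted copy can be compared against the extracted data when debugging.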
Saves the DataFrame to a CSV file.
Saves the DataFrame to an SQLite database table.
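The two load steps might look roughly like the following, shown with the standard library only. The function names `save_to_csv` and `save_to_db`, the table name `largest_banks`, and the sample records are hypothetical; the project itself works with a DataFrame, so treat this as a sketch of the idea and not as its implementation. The example writes to a temporary directory and an in-memory database to avoid touching project files.

```python
import csv
import os
import sqlite3
import tempfile


def save_to_csv(records, path):
    """Write a list of dictionaries to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)


def save_to_db(records, connection, table_name):
    """Replace table_name with the given records (columns taken from the first record)."""
    columns = list(records[0].keys())
    # Double quotes let SQLite accept column names containing spaces or symbols.
    quoted = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    connection.execute(f'DROP TABLE IF EXISTS "{table_name}"')
    connection.execute(f'CREATE TABLE "{table_name}" ({quoted})')
    connection.executemany(
        f'INSERT INTO "{table_name}" VALUES ({placeholders})',
        [tuple(r[c] for c in columns) for r in records],
    )
    connection.commit()


# Made-up sample records.
banks = [
    {"Bank name": "Example Bank A", "Market cap (US$ billion)": 100.0},
    {"Bank name": "Example Bank B", "Market cap (US$ billion)": 50.0},
]

csv_path = os.path.join(tempfile.mkdtemp(), "banks.csv")
save_to_csv(banks, csv_path)

conn = sqlite3.connect(":memory:")
save_to_db(banks, conn, "largest_banks")
row_count = conn.execute('SELECT COUNT(*) FROM "largest_banks"').fetchone()[0]
```

Dropping and recreating the table makes repeated runs idempotent, which is one reasonable choice for a small ETL job; appending would be the alternative if history needs to be kept.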
Runs an SQL query on the database and returns the results.
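A query helper along these lines could return both the column names and the rows. The name `run_query`, the table `largest_banks`, and the sample data are assumptions made for this sketch; only the `sqlite3` calls are standard.

```python
import sqlite3


def run_query(connection, query, params=()):
    """Execute a query and return (column names, list of row tuples)."""
    cursor = connection.execute(query, params)
    columns = [description[0] for description in cursor.description]
    return columns, cursor.fetchall()


# Build a small in-memory table with made-up data to query against.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE largest_banks ("Bank name" TEXT, "Market cap (US$ billion)" REAL)'
)
conn.executemany(
    "INSERT INTO largest_banks VALUES (?, ?)",
    [("Example Bank A", 100.0), ("Example Bank B", 50.0)],
)

# Parameters are passed separately so values are never spliced into the SQL text.
columns, rows = run_query(
    conn,
    'SELECT "Bank name" FROM largest_banks WHERE "Market cap (US$ billion)" > ?',
    (75,),
)
```

With the sample data, only the first bank exceeds the threshold, so `rows` contains a single tuple.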
For questions or suggestions, please contact the project maintainer.