Contents
- About The Project
- Scope of the Project
- Data Sources
- System Requirements
- Python Libraries
- Steps To Run
- ER Diagram
- Processes Covered
- This project aims to identify the factors that affect customer experience throughout a passenger's flight journey and, by analyzing the collected data, to recommend ways to improve that experience.
- Many factors determine whether a passenger has a pleasant travel experience.
- Customer experience is one of them. It shows where the aviation industry falls short of its passengers' expectations, from the time they book a flight until they reach their destination.
- A passenger's travel experience is shaped by services such as flight booking, baggage handling, availability of the desired meal, comfortable seat allotment, washroom facilities, safety assurance and many more.
The scope of our project includes the following:
- Database design
- Web scraping and data cleaning
- Analyzing factors affecting customer satisfaction and experience for various airlines
- Providing suitable recommendations for improving customer experience
- Determining useful data points to analyze why an airline is succeeding or failing
- Skytrax Website
- Kayak Website
- Kaggle CSV datasets
- Python 3.9
- pip 22.2 or above
- Jupyter
- sqlalchemy
- pymysql
- pandas
- numpy
- re
- sntwitter
- snscrape
- Run DDL commands - ./SQL/Table DDL/create_tables.sql
- Run Jupyter file to insert CSV files - ./Python Scripts/Data Ingestion Scripts/data_ingestion.ipynb
- Run insert_flight DML - ./SQL/Table DDL/insert_flight.sql
- Run views, triggers and indices - ./SQL/Table DDL/
- Run all queries - ./SQL/use_cases.sql
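The ingestion step above (`data_ingestion.ipynb`) is not reproduced here, so the following is only a minimal sketch of how a CSV file could be loaded into the MySQL database with the listed libraries (pandas, sqlalchemy, pymysql). The function names, credentials, CSV path and table name are hypothetical placeholders, not names taken from the project.

```python
from urllib.parse import quote_plus


def mysql_url(user, password, host, database, port=3306):
    """Build a SQLAlchemy URL for MySQL using the pymysql driver.

    The password is URL-encoded so characters such as '@' do not break the URL.
    """
    return f"mysql+pymysql://{user}:{quote_plus(password)}@{host}:{port}/{database}"


def ingest_csv(csv_path, table, url):
    """Append the rows of one CSV file to an existing table.

    Assumes the table was already created by the DDL step and that the
    CSV column names match the table's column names.
    """
    # Imported here so the URL helper can be used without these packages.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine(url)
    frame = pd.read_csv(csv_path)
    frame.to_sql(table, engine, if_exists="append", index=False)
    return len(frame)


# Hypothetical usage (placeholder credentials, file and table names):
# url = mysql_url("root", "secret", "localhost", "airline")
# ingest_csv("reviews.csv", "review", url)
```

Using `if_exists="append"` keeps the schema defined by the DDL script instead of letting pandas recreate the table.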
- Passenger review data extracted from the Skytrax website.
- Python scripts written in a Jupyter notebook to extract the data using the Python libraries Beautiful Soup, Scrapy and Selenium.
- Flight details and information extracted from the Kayak website.
- Further datasets collected from GitHub, Kaggle and other sources.
- Jupyter scripts created to clean and munge the extracted data.
- Script created to populate the Airline Database with the data collected after scraping.
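As an illustration of the cleaning and munging step, here is a minimal sketch that uses only the `re` module from the library list. It assumes scraped review text may contain leftover HTML tags and irregular whitespace, and that ratings arrive as strings such as `7/10`. Both the helper names and the rating format are assumptions for the example, not details taken from the project's notebooks.

```python
import re

_TAG = re.compile(r"<[^>]+>")   # any leftover HTML tag
_WHITESPACE = re.compile(r"\s+")  # runs of spaces, tabs, newlines


def clean_review_text(raw):
    """Strip HTML tags and collapse whitespace in a scraped review."""
    without_tags = _TAG.sub(" ", raw)
    return _WHITESPACE.sub(" ", without_tags).strip()


def parse_rating(raw):
    """Return the score from a string like '7/10', or None if it does not match."""
    match = re.match(r"\s*(\d+)\s*/\s*10", raw or "")
    return int(match.group(1)) if match else None
```

For example, `clean_review_text("<p>Great  flight!</p>\n Friendly crew ")` returns `"Great flight! Friendly crew"`, and `parse_rating("n/a")` returns `None`, so unparseable ratings can be stored as NULL rather than guessed.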