This project aims to determine the factors that influence tip amounts in services. The datasets used include New York City taxi trip records, retrieved from the Taxi and Limousine Commission: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page.
Many factors may influence how much customers or passengers tip. Using the New York City taxi dataset as an example, we identify the factors that influence tip amount and how strongly each one influences it. The raw dataset is large and has been pre-processed before analysis.
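As a rough illustration of what "how much a factor influences tip amount" can mean, the sketch below ranks candidate factors by the strength of their Pearson correlation with the tip. It is not the analysis from the notebook or the report. The column names (`tip_amount`, `fare_amount`, `trip_distance`, `passenger_count`) are assumed to match the pre-processed data, the function names are hypothetical, and the four rows are made-up toy values, not records or results from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def rank_factors(trips, target="tip_amount"):
    """Rank every non-target column by |correlation| with the target column.

    `trips` is a list of dicts, one per trip (hypothetical layout).
    """
    factors = [key for key in trips[0] if key != target]
    tips = [trip[target] for trip in trips]
    scores = {f: pearson([trip[f] for trip in trips], tips) for f in factors}

    def strength(item):
        # Undefined correlations sort last.
        return 0.0 if math.isnan(item[1]) else abs(item[1])

    return sorted(scores.items(), key=strength, reverse=True)


# Toy rows invented for illustration only; column names are assumptions.
trips = [
    {"fare_amount": 5.0, "trip_distance": 1.0, "passenger_count": 1, "tip_amount": 1.0},
    {"fare_amount": 10.0, "trip_distance": 2.5, "passenger_count": 2, "tip_amount": 2.0},
    {"fare_amount": 15.0, "trip_distance": 4.0, "passenger_count": 1, "tip_amount": 3.0},
    {"fare_amount": 20.0, "trip_distance": 6.0, "passenger_count": 3, "tip_amount": 4.0},
]

ranked = rank_factors(trips)
for name, r in ranked:
    print(f"{name}: r = {r:.3f}")
```

Correlation only captures linear association with one factor at a time. A regression model would be the usual next step when several factors need to be compared while holding the others fixed.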
The datasets used are:
- Yellow Taxi Trip Records 2015: Contains data for all recorded yellow taxi trips in 2015. Found in 'pre_processed_dataset.csv'.
- Borough Labor Force Data: Contains data about the number of people employed and New York City's unemployment rate. Found in 'Revised 2014-2018 Borough Labor Force.xls'.

The folder 'Visualisation' contains all visualisations used in this project. 'Code.ipynb' contains the Python code used for this experiment, while all R code is provided in the report.
- Python 3
- R
- Dr. Chris Culnane
- The University of Melbourne