kekyei / airline-booking-prediction

This repository contains the code and resources for a flight booking predictive model. The model is trained on customer booking data and predicts the likelihood that a customer completes a booking. It is evaluated on accuracy and F1 score, and the findings are summarized in a single presentation slide for management.

Jupyter Notebook 100.00%
airline-booking datascience machinelearning machinelearningmodel customerbooking


Flight Booking Predictive Model

Problem Statement

Customers are more empowered than ever because they have access to a wealth of information at their fingertips. This is one of the reasons the buying cycle is very different to what it used to be. Today, if you’re hoping that a customer purchases your flights or holidays as they come into the airport, you’ve already lost! Being reactive in this situation is not ideal; airlines must be proactive in order to acquire customers before they embark on their holiday.

This is possible with the use of data and predictive models. The most important factor with a predictive model is the quality of the data you use to train the machine learning algorithms. For this task, you must manipulate and prepare the provided customer booking data so that you can build a high-quality predictive model.

Objectives

  • Explore and prepare the customer booking data for use in a predictive model
  • Train a machine learning model to predict the likelihood of a customer making a booking
  • Evaluate the model's performance and interpret the results to understand the contributions of each variable to the model's predictive power
  • Summarize findings in a single slide for presentation to management

Evaluation Criteria

  • Accuracy of the predictive model
  • Interpretability of the model and its contributions from each variable
  • Quality of the summary slide presentation

Data Description

The dataset for this project is customer booking data provided in the Customer Booking.csv file. Each row describes one booking attempt using the following columns:

  • num_passengers = number of passengers travelling
  • sales_channel = sales channel booking was made on
  • trip_type = trip type (Round Trip, One Way, Circle Trip)
  • purchase_lead = number of days between travel date and booking date
  • length_of_stay = number of days spent at destination
  • flight_hour = hour of flight departure
  • flight_day = day of week of flight departure
  • route = origin -> destination flight route
  • booking_origin = country from where booking was made
  • wants_extra_baggage = if the customer wanted extra baggage in the booking
  • wants_preferred_seat = if the customer wanted a preferred seat in the booking
  • wants_in_flight_meals = if the customer wanted in-flight meals in the booking
  • flight_duration = total duration of flight (in hours)
  • booking_complete = flag indicating if the customer completed the booking
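
As a rough illustration of how the columns above separate into features and a target, the sketch below parses CSV text with Python's standard csv module. The two sample rows are made up for illustration and are not taken from Customer Booking.csv; the value encodings (for example 0/1 flags and names such as "Internet" or "RoundTrip") and the helper names parse_row and load_bookings are assumptions, not part of the original project.

```python
import csv
import io

# Two made-up rows for illustration only; NOT taken from the real dataset.
# The value encodings (0/1 flags, "Internet", "RoundTrip", ...) are assumptions.
SAMPLE_CSV = """num_passengers,sales_channel,trip_type,purchase_lead,length_of_stay,flight_hour,flight_day,route,booking_origin,wants_extra_baggage,wants_preferred_seat,wants_in_flight_meals,flight_duration,booking_complete
2,Internet,RoundTrip,30,7,9,Sat,AAABBB,CountryA,1,0,0,5.5,0
1,Mobile,OneWay,3,0,14,Mon,CCCDDD,CountryB,0,0,1,8.8,1
"""

# Columns assumed to hold whole numbers or 0/1 flags.
INT_COLUMNS = {
    "num_passengers", "purchase_lead", "length_of_stay", "flight_hour",
    "wants_extra_baggage", "wants_preferred_seat", "wants_in_flight_meals",
    "booking_complete",
}
# Columns assumed to hold decimal numbers.
FLOAT_COLUMNS = {"flight_duration"}


def parse_row(row: dict) -> dict:
    """Convert the string values of one CSV row to int/float where expected."""
    parsed = {}
    for name, value in row.items():
        if name in INT_COLUMNS:
            parsed[name] = int(value)
        elif name in FLOAT_COLUMNS:
            parsed[name] = float(value)
        else:
            parsed[name] = value  # categorical columns stay as strings
    return parsed


def load_bookings(text: str) -> list:
    """Parse CSV text into a list of typed row dictionaries."""
    return [parse_row(row) for row in csv.DictReader(io.StringIO(text))]


rows = load_bookings(SAMPLE_CSV)

# booking_complete is the prediction target; every other column is a feature.
features = [{k: v for k, v in r.items() if k != "booking_complete"} for r in rows]
target = [r["booking_complete"] for r in rows]
```

The categorical columns (sales_channel, trip_type, flight_day, route, booking_origin) would still need to be encoded numerically before most models can use them.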


Evaluation Metric

The primary evaluation metric for this project is the accuracy of the predictive model, estimated through cross-validation. Precision, recall, and F1 score are calculated alongside it to show how well the model identifies completed bookings.
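
To make these metrics concrete, here is a minimal pure-Python sketch of how accuracy, precision, recall, and F1 are computed from binary labels, together with a simple way to form cross-validation folds. The function names and the toy labels are hypothetical, and this is not the notebook's actual evaluation code, which may rely on a library instead.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for the positive class (1)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Toy labels, made up for illustration: 1 = booking completed.
y_true = [1, 0, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
folds = list(k_fold_indices(10, 3))
```

On these toy labels accuracy is 0.75 while precision, recall, and F1 are each 2/3, which shows why accuracy alone can overstate performance when the positive class is the smaller one.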

Plotting the important features for the model


Plotting Categorical values

import pandas as pd
import seaborn as sns

def plot_categorical_distribution(data: pd.DataFrame, column: str, height: int = 8, aspect: float = 2):
    """
    Plot the distribution of a categorical variable
    :param data: The dataframe containing the data
    :param column: The column to plot
    :param height: The height of the plot
    :param aspect: The aspect ratio of the plot
    :return: None
    """
    # Show only the 10 most frequent categories, ordered by count
    sns.catplot(
        data=data,
        x=column,
        kind='count',
        height=height,
        aspect=aspect,
        order=data[column].value_counts().iloc[:10].index
    ).set(title=f'Distribution of {column}')
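
The order argument above limits the plot to the ten most frequent categories. A plain-Python sketch of that selection, using collections.Counter in place of pandas value_counts, is shown below; top_categories and the sample values are hypothetical. Tie ordering may differ from pandas, and unlike value_counts this version does not drop missing values.

```python
from collections import Counter


def top_categories(values, n=10):
    """Return the n most frequent categories, most common first."""
    return [category for category, _ in Counter(values).most_common(n)]


# Made-up route labels for illustration only.
routes = ["A-B"] * 5 + ["C-D"] * 3 + ["E-F"]
top_two = top_categories(routes, n=2)
```

Here top_two contains "A-B" followed by "C-D"; the rarest label is left out, just as rare categories are left out of the count plot.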


Plotting Continuous variables

import pandas as pd
import seaborn as sns

def plot_continuous_distribution(data: pd.DataFrame, column: str, height: int = 8):
    """
    Plot the distribution of a continuous variable
    :param data: The dataframe containing the data
    :param column: The column to plot
    :param height: The height of the plot
    :return: None
    """
    # Histogram with a kernel density estimate overlaid
    sns.displot(data, x=column, kde=True, height=height, aspect=height/5).set(title=f'Distribution of {column}')


Plotting the Correlation of the variables

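import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

def correlation_plot(data: pd.DataFrame):
    """
    Plot the correlation matrix of the data
    :param data: The dataframe containing the data
    :return: None
    """
    # Restrict to numeric columns; recent pandas versions raise an error on string columns otherwise
    corr = data.corr(numeric_only=True)
    # The original corr.style.background_gradient(...) call discarded its result,
    # so the colormap is applied to the heatmap directly
    sns.heatmap(corr, xticklabels=corr.columns.values, yticklabels=corr.columns.values,
                cmap='coolwarm', annot=True, annot_kws={'size': 10})
    # Axis ticks size
    plt.xticks(fontsize=10)
    plt.yticks(fontsize=10)
    plt.show()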
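
By default pandas computes the Pearson correlation coefficient for each pair of columns. The sketch below spells out that calculation in plain Python for readers who want to see what each heatmap cell represents; pearson and correlation_matrix are hypothetical helper names and the numbers are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is constant
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> numeric values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up values for illustration only.
sample = {
    "purchase_lead": [1, 2, 3, 4],
    "length_of_stay": [2, 4, 6, 8],
    "flight_hour": [4, 3, 2, 1],
}
matrix = correlation_matrix(sample)
```

Values near 1 or -1 indicate a strong linear relationship between two columns, while values near 0 indicate little linear relationship; the diagonal is always 1 because each column is perfectly correlated with itself.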
