flight-explorer's Introduction

Comparing Kayak flight prices ✈

Goals

  • Collect and analyze plane ticket prices - scraped from Kayak Explore - to determine whether factors like day of the week and time of the day directly influence ticket prices.
  • Find low ticket prices programmatically for my own use
  • Set up an email alert system to let me know when the ticket prices are lower than usual (or ever) for some route I'm interested in.

Note: Why not just use Google Flights or another competitor, you might ask? With this project, I can track price changes and receive alerts for hundreds of routes compiled in a single email, without having to set up several alerts. I also want to gain experience integrating several systems and architectures using Python, SQL, BigQuery, GitHub Actions (and more to come).

Airports of interest so far:

  • Porto (OPO)
  • Lisbon (LIS)
  • Madrid (MAD)
  • Milano Malpensa (MXP)
  • Napoli (NAP)

Challenges

In addition to the obvious learning curve of each tool and package used:

  • Security - Concerns regarding secure access to the Google Cloud account in order to use a BigQuery database (using GitHub Secrets)
  • Scalability - Pretty soon we could be dealing with hundreds of files and thousands of rows
  • Resource efficiency - Reducing BigQuery storage and queries must be a priority
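
As an illustration of the security point, here is a minimal sketch of reading service-account credentials that a GitHub Actions workflow exposes to the script as an environment variable. The variable name `GCP_SERVICE_ACCOUNT_JSON` and the helper `load_service_account_info` are assumptions made for this example, not names taken from the project.

```python
import json
import os

# Keys that a Google service-account key file normally contains.
REQUIRED_KEYS = {"project_id", "client_email", "private_key"}


def load_service_account_info(env_var: str = "GCP_SERVICE_ACCOUNT_JSON") -> dict:
    """Parse service-account JSON from an environment variable.

    The variable name is hypothetical. Error messages never echo the
    secret's content, so it cannot leak into workflow logs.
    """
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    try:
        info = json.loads(raw)
    except json.JSONDecodeError:
        raise RuntimeError(f"{env_var} does not contain valid JSON") from None
    if not isinstance(info, dict):
        raise RuntimeError(f"{env_var} must contain a JSON object")
    missing = REQUIRED_KEYS - set(info)
    if missing:
        raise RuntimeError(f"{env_var} is missing keys: {sorted(missing)}")
    return info
```

The returned dictionary could then be handed to `google.oauth2.service_account.Credentials.from_service_account_info` to build a BigQuery client, so the key never has to be committed to the repository.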

Roadmap

Should (and will) do it:

  • Set up automated scraper script (Python)
  • Determine criteria for finding out if a price should be reported
  • Test e-mail output
  • Integrate GitHub Actions and Google BigQuery
  • Determine price baseline for each destination
  • Read baselines from files stored in GitHub
  • Set up alert system that sends an automated e-mail when tickets are available at good prices
  • Save new baselines to Google BigQuery
  • Read baselines from Google BigQuery at the end of each execution
  • Store results in a Google BigQuery database
  • Add one airport of interest: São Paulo
  • Adjust small details on email sent with deals:
    • Add weekday
    • Order by price (asc)
    • Add number of days
  • Design diagram explaining project/data flow
  • Create slide deck with project summary and goals attained
  • Set up new automation to acquire data from another of Kayak's APIs, which shows every possible flight on a given route
  • Decide what to do regarding large files resulting from calling the second API (1.5K rows for each route)
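
The email adjustments listed above (weekday, ascending price order, number of days) could look roughly like the sketch below. The `Deal` fields and the line layout are assumptions for illustration; the project's actual email format is not shown in this README.

```python
from dataclasses import dataclass
from datetime import date

# Fixed English abbreviations, so the output does not depend on the locale.
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


@dataclass
class Deal:
    """Hypothetical record for one round-trip deal."""
    origin: str
    destination: str
    depart: date
    return_date: date
    price: float


def format_deals(deals: list) -> str:
    """Return one text line per deal, cheapest first."""
    lines = []
    for deal in sorted(deals, key=lambda d: d.price):
        weekday = WEEKDAYS[deal.depart.weekday()]
        days = (deal.return_date - deal.depart).days
        lines.append(
            f"{deal.origin}-{deal.destination}  {weekday} {deal.depart.isoformat()}"
            f"  {days} days  {deal.price:.2f}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_deals([
        Deal("OPO", "MXP", date(2024, 3, 15), date(2024, 3, 18), 80.0),
        Deal("OPO", "MAD", date(2024, 3, 18), date(2024, 3, 25), 35.5),
    ]))
```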

Would be nice:

  • Create parameter file for reading configs like departing airports and filter conditions (time of the year, one-way or round trip, etc.)
  • Create parameter with different email addresses to send results for each city
  • Streamlit page to show results
  • PowerPoint presentation highlighting challenges, solutions and results
  • Ad-hoc analysis to determine, with a good amount of certainty, the best times to buy
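
One possible shape for the parameter file mentioned above is a small JSON document parsed into typed records. Everything in this sketch (the `departures` key, the field names, the example address) is an assumption made for illustration, not the project's actual configuration format.

```python
import json
from dataclasses import dataclass, field


@dataclass
class RouteConfig:
    """Hypothetical per-airport settings."""
    airport: str                                     # IATA code of the departing airport
    round_trip: bool = True                          # False means one-way searches
    months: list = field(default_factory=list)       # empty list: no month filter
    recipients: list = field(default_factory=list)   # who receives this city's deals


def parse_config(text: str) -> list:
    """Turn the JSON text of a parameter file into RouteConfig objects."""
    raw = json.loads(text)
    return [RouteConfig(**entry) for entry in raw["departures"]]


EXAMPLE = """
{
  "departures": [
    {"airport": "OPO", "round_trip": true, "months": [6, 7],
     "recipients": ["me@example.com"]},
    {"airport": "LIS", "round_trip": false}
  ]
}
"""

if __name__ == "__main__":
    for cfg in parse_config(EXAMPLE):
        print(cfg)
```

Keeping the recipients next to each airport would cover both the config file and the per-city email list with a single file.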

Automated data collection task:

Scraping flights from Kayak

Limitations

  • No way to obtain historical prices as the API used only returns upcoming flights.
  • Straightforward logic for determining which prices to report: it simply reports flights whose prices are lower than the historical minimum price for that route/month
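
The reporting rule described above can be sketched as follows. The dictionary layout, the function names, and the choice to skip routes that have no baseline yet are assumptions for this example; the README does not show the actual implementation.

```python
def find_deals(flights: list, baselines: dict) -> list:
    """Return flights priced below the stored minimum for their route/month.

    flights: dicts with "route", "month" and "price" keys (hypothetical layout).
    baselines: maps (route, month) to the lowest price seen so far.
    Flights without a baseline are not reported (an assumption).
    """
    deals = []
    for flight in flights:
        baseline = baselines.get((flight["route"], flight["month"]))
        if baseline is not None and flight["price"] < baseline:
            deals.append(flight)
    return deals


def update_baselines(flights: list, baselines: dict) -> dict:
    """Return a new baseline mapping that includes the latest prices."""
    updated = dict(baselines)
    for flight in flights:
        key = (flight["route"], flight["month"])
        updated[key] = min(flight["price"], updated.get(key, flight["price"]))
    return updated


if __name__ == "__main__":
    baselines = {("OPO-MXP", 6): 60.0}
    flights = [
        {"route": "OPO-MXP", "month": 6, "price": 48.0},
        {"route": "OPO-MXP", "month": 6, "price": 75.0},
        {"route": "LIS-NAP", "month": 7, "price": 30.0},
    ]
    print(find_deals(flights, baselines))
    print(update_baselines(flights, baselines))
```

Because the baseline is a single minimum per route/month, one unusually cheap fare permanently raises the bar for later alerts, which is part of why this logic is listed as a limitation.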

flight-explorer's People

Contributors

actions-user, rafabelokurows

Forkers

sandy4321
