siegelpaul / api_project_aviation_weather

A look at American Airlines operations data during and after a significant weather event in 2019.

License: MIT License

Languages: Jupyter Notebook 99.88%, Python 0.12%
Topics: api, aviation, aviation-data, aviation-weather, sql

api_project_aviation_weather's Introduction

SQL-API project

It's project time again 🎉!
In this project you will combine your SQL, Python and API skills.

Objective

As we have learned, the two main tools of Data Analysts are SQL and Python.
In the last lectures and exercises, you have learned how to use SQL, how to get data out of a database into a pandas DataFrame, and how to enrich your data with the help of APIs. Now you will do it all together.

Task Overview

The Research Center for Aerospace (RCA), where you work as a Data Analyst, wants to keep track of accumulated flight data in combination with weather data. Your task is to find a situation where the weather affected flight performance and use it to contribute some knowledge about how different weather affects flights in different cities.

Setting-up working environment

  1. Fork this repository on GitHub. Note that team members work best collaborating on a single repository.
  2. Open VS Code and a terminal window in VS Code
  3. It's up to you: use your existing nf_sql environment from our exercises or create a new conda environment for this project. In both cases you can install any additional packages needed for EDA and visualizations.
# Create a new environment, e.g.:
$ conda create --clone nf_sql --name <your_env_name>
# Activate the environment
$ conda activate <your_env_name>
# Install packages
$ conda install <package_name>

Activate your environment.

  4. Add your sql_functions.py file (and, if needed, your .env) from the external data sourcing notebooks:
$ cp ../da-external-data-sourcing/.env .
$ cp ../da-external-data-sourcing/sql_functions.py .
  5. Check your .gitignore to avoid pushing credentials to GitHub.
  6. Open one or more new notebooks in VS Code to work in.
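
A minimal sketch of reading credentials from a local .env file at runtime, so that keys never appear in a notebook. The helper name `read_env_file` and the variable name `RAPIDAPI_KEY` are assumptions made for illustration; this stands in for whatever loading mechanism your own sql_functions.py uses and is not taken from it.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Hypothetical helper: blank lines, comment lines starting with '#',
    and lines without '=' are skipped; quotes around values are removed.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


# Look the key up at runtime instead of pasting it into the notebook.
credentials = read_env_file(".env")
api_key = credentials.get("RAPIDAPI_KEY")  # None if the file or key is missing
```

Because .env is listed in .gitignore, the key stays on your machine while the notebook itself can be pushed safely.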

Task steps

  1. Choose a historical weather event that occurred in the United States sometime in the past 30 years that you believe would have led to the cancellation of flights.

  2. Get data on flights and set up a connection to our SQL database.
    a. As described in this notebook, download the CSV file containing flights data for the specific years and months you need from the Bureau of Transportation Statistics website.
    b. Clean your data (e.g. specify which columns you want to keep, rename columns etc.).
    c. Reduce your dataframe to include at least 5 origin airports (choose either big cities or locations from this list of locations with weather stations). You can expand your dataset to include more locations or destinations if this is helpful in your analysis.
    d. Perform an EDA on the flights data you have downloaded to explain what data you have and any unexpected findings.
    e. Connect to the database and join the data with the airports table of our database to get the latitude, longitude or city names for the airports in your dataset.

  3. As the next step, get historical weather data using the Meteostat API.
    a. Follow the steps here and sign up for RapidAPI in order to get access to the Meteostat API.
    b. Read the docs to find out what the call limits are for this API. Ensure your data retrieval needs (including testing) fit within these limits.
    c. Use your API key to get weather data for your chosen month/year and locations.
    d. If necessary, flatten your JSON data and transform it into a DataFrame for future analysis.
    e. Make sure to have primary and foreign keys so that it's possible to join the weather data to your flights data.

  4. Perform a basic EDA on both of the tables.
    a. Come up with three different hypotheses regarding your available data, taking into account both of the datasets you have. You should start by asking these questions: "Can we see the weather event in the weather data?", "Can we see the weather event in the flights data?", "Can we see a correlation between the data?".
    b. Go deeper into your hypotheses (perhaps linking dep_delay to weather) and clearly outline your findings (either that everything is as expected or any unexpected results).
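
Steps 2 to 4 above can be sketched end to end on toy data. This is only an illustration under stated assumptions: the column names (`FL_DATE`, `ORIGIN`, `DEP_DELAY`, `CANCELLED`) follow the style of the BTS download but should be checked against your own CSV, the `airports` and `weather` frames are small stand-ins for the database table and the Meteostat response, and every value is invented.

```python
import pandas as pd

# Toy stand-ins for the real tables; all values are invented for illustration.
flights_raw = pd.DataFrame({
    "FL_DATE": ["2019-01-15", "2019-01-15", "2019-01-16",
                "2019-01-16", "2019-01-17", "2019-01-15"],
    "ORIGIN": ["ORD", "ORD", "ORD", "ORD", "ORD", "JFK"],
    "DEP_DELAY": [5.0, 15.0, 90.0, 110.0, 30.0, 12.0],
    "CANCELLED": [0, 0, 0, 1, 0, 0],
    "UNUSED_COLUMN": ["x"] * 6,
})
airports = pd.DataFrame({"faa": ["ORD"], "city": ["Chicago"]})
weather = pd.DataFrame({
    "date": ["2019-01-15", "2019-01-16", "2019-01-17"],
    "faa": ["ORD", "ORD", "ORD"],
    "snow": [0.0, 120.0, 40.0],
})

# Step 2b: keep only the needed columns and give them friendlier names.
column_names = {"FL_DATE": "date", "ORIGIN": "origin",
                "DEP_DELAY": "dep_delay", "CANCELLED": "cancelled"}
flights = flights_raw[list(column_names)].rename(columns=column_names)

# Step 2c: restrict to the chosen origin airports (one here, at least 5 in the project).
chosen_origins = ["ORD"]
flights = flights[flights["origin"].isin(chosen_origins)]

# Step 2e: add airport details; the same join could be written in SQL instead.
flights = flights.merge(airports, left_on="origin", right_on="faa", how="left")

# Step 3e: (faa, date) identifies one weather row, so aggregate the flights to
# one row per airport and day before joining on the same pair of keys.
daily = flights.groupby(["faa", "date"], as_index=False).agg(
    mean_dep_delay=("dep_delay", "mean"),
    cancelled_flights=("cancelled", "sum"),
)
combined = daily.merge(weather, on=["faa", "date"], how="left",
                       validate="one_to_one")

# Step 4: a first look at the relationship between delays and snow.
correlation = combined["mean_dep_delay"].corr(combined["snow"])
print(combined)
print("Pearson correlation:", correlation)
```

The `validate="one_to_one"` argument makes the merge fail loudly if either side has duplicate keys, which is a cheap way to confirm that the primary and foreign keys behave as intended.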

Deliverables

  1. A clean, structured .ipynb notebook containing well-documented code that connects to the database and the API, along with the required EDA.
  2. A ~10-minute technical presentation (e.g. via Google Slides) to your colleagues, presenting the results of your data exploration and addressing your hypotheses.

And:
Keep in mind that your API calls are limited!
When possible, separate code calling the API from other code working on the data.
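
One way to keep API calls separate from the analysis is to save each raw response to disk once and read it back on later runs. The sketch below assumes a hypothetical helper `fetch_with_cache` and a caller-supplied `fetch` function (for example, one that wraps `requests.get(...).json()`); neither name comes from the project.

```python
import json
from pathlib import Path


def fetch_with_cache(cache_file, fetch):
    """Return cached JSON if present; otherwise call `fetch` once and save it.

    `fetch` is any zero-argument callable returning JSON-serialisable data.
    Hypothetical helper for illustration only.
    """
    cache_path = Path(cache_file)
    if cache_path.exists():
        # No API call is made when the response is already on disk.
        return json.loads(cache_path.read_text(encoding="utf-8"))
    data = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

With one cache file per airport (for instance `raw/ORD.json`), re-running the notebook while you develop the EDA costs no additional API calls.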

api_project_aviation_weather's Issues

Looping over multiple lists

Take this example from your code, which loops over three lists:

import os
import time

import pandas as pd
import requests

# NOTE: lat[3] repeats lat[2] and lon[4] repeats lon[3]; check these against the airports table
lat = ["41.97694","35.21375","32.897233","32.897233","39.872084","33.434278"]
lon = ["-87.908149","-80.949055","-97.037694","-80.290115","-80.290115","-112.011582"]
airport_code = ["ORD","CLT","DFW","MIA","PHL","PHX"]

url = "https://meteostat.p.rapidapi.com/point/daily"
headers = {
    # read the key from the environment instead of hard-coding it
    # (the variable name RAPIDAPI_KEY is an assumption)
    "X-RapidAPI-Key": os.environ["RAPIDAPI_KEY"],
    "X-RapidAPI-Host": "meteostat.p.rapidapi.com"}

weather_daily_df = pd.DataFrame([])
for i in range(6):
    # this for loop goes through lat and lon as a pair via the shared index i
    querystring = {"lat": lat[i], "lon": lon[i], "start": "2019-01-15", "end": "2019-01-25"}

    response = requests.get(url, headers=headers, params=querystring)
    weather_dtd = response.json()
    weather_dtd_df = pd.json_normalize(weather_dtd,
                                       sep='_',
                                       record_path='data')
    weather_dtd_df['faa'] = airport_code[i]  # create a column called faa for the airport code
    weather_daily_df = pd.concat([weather_daily_df, weather_dtd_df])
    time.sleep(1)  # pause between requests to stay within the API limits

Instead of iterating over an index as you do in this example, you can use zip to combine the three lists into a single iterable that yields one item from each list per iteration. It is functionally the same but more elegant and easier to read. In addition, you can use Python's sequence unpacking for even better readability.
See below.

latitudes =["41.97694","35.21375","32.897233","32.897233","39.872084","33.434278"]
longitudes =["-87.908149","-80.949055","-97.037694","-80.290115","-80.290115","-112.011582"]
airport_codes = ["ORD","CLT","DFW","MIA","PHL","PHX"]

# loop with no unpacking
for items in zip(latitudes, longitudes, airport_codes):
    lat = items[0]
    lon = items[1]
    airport_code = items[2]
    ...

# loop with sequence unpacking!
for lat, lon, airport_code in zip(latitudes, longitudes, airport_codes):
    ...
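
One design point worth noting: `zip` stops silently at the shortest input, whereas the indexed loop would raise an IndexError on a short list. A small sketch, using only the first two airports from the lists above, shows a length check plus `enumerate` for the cases where the position is still needed. The `summaries` list is just an illustrative placeholder for the real loop body.

```python
latitudes = ["41.97694", "35.21375"]
longitudes = ["-87.908149", "-80.949055"]
airport_codes = ["ORD", "CLT"]

# zip() stops silently at the shortest input, so compare the lengths first.
if not (len(latitudes) == len(longitudes) == len(airport_codes)):
    raise ValueError("latitudes, longitudes and airport_codes differ in length")

# enumerate() supplies a counter for the cases where the position is still needed.
summaries = []
for position, (lat, lon, airport_code) in enumerate(
    zip(latitudes, longitudes, airport_codes), start=1
):
    summaries.append(f"{position}: {airport_code} at ({lat}, {lon})")

print("\n".join(summaries))
```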
