openaq-api's People

Contributors

andrewharvey, danielfdsilva, dolugen, jflasher, kamicut, nickolasclarke, olafveerman, rocketd0g, russbiggs, sethvincent, sruti, vgeorge, webbkyr

openaq-api's Issues

Nicer error message

Return a nicer error message when specifying a source that doesn't exist:

$ node fetch.js --dryrun --source=au.json
--- Dry run for Testing, nothing is saved to the database. ---
/home/olaf/projects/openaq-api/fetch.js:46
    var adapter = findAdapter(source.adapter);
                                    ^
TypeError: Cannot read property 'adapter' of undefined
    at /home/olaf/projects/openaq-api/fetch.js:46:37
    [...]

@jflasher
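
One way to guard the lookup, as a minimal sketch. It assumes the sources are available as an array of objects with `name` and `adapter` fields; the actual structure used by fetch.js is not shown in this issue. `findSource` and the example source names are hypothetical.

```javascript
// Hypothetical helper: look up a source by name and fail with a readable
// message instead of "Cannot read property 'adapter' of undefined".
function findSource(sources, name) {
  const source = sources.find((s) => s.name === name);
  if (!source) {
    const known = sources.map((s) => s.name).join(', ');
    throw new Error(
      `Unknown source "${name}". Available sources: ${known || '(none)'}`
    );
  }
  return source;
}

// Made-up source definitions, for illustration only.
const exampleSources = [
  { name: 'example-a', adapter: 'adapter-a' },
  { name: 'example-b', adapter: 'adapter-b' }
];
```

Calling `findSource(exampleSources, 'au.json')` would then throw an error that names the missing source and lists the ones that exist.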

Store organization / provider

For proper attribution, it could make sense to store the organization providing the data. This could either be set on source level, or set per measurement by the adapter.

In the case of the Dutch (#25) and the Chilean (#29) data, there is a difference between data provider and the maintainer of the station.

@jflasher @RocketD0g Any idea if and how you want to store this?

Comment from Knight Foundation Application - Chisato

As a social scientist conducting research on air pollution management in Ulaanbaatar, Mongolia, I highly commend this Open AQ initiative. Air pollution is certainly the most pressing environmental health issue in this capital city. Since 2012, there have been increased efforts to improve air quality monitoring by installing more air quality monitors throughout Ulaanbaatar. However, more monitors do not necessarily catalyze better data sharing. Even if these technologies produce reliable, real-time air quality data, this data will not make an impact on the research community, development field, and most importantly, the public-at-large if there is no sustainable, user-friendly, robust system set in place for data sharing. After all, air pollution is an inherently social problem. We need to connect the data to people. I believe that Open AQ would provide this foundational connection. Open source sharing of air quality data would create and sustain a global commitment around air pollution issues -- connecting people across cities and regions to examine the various ways to tackle this environmental challenge. The Open AQ initiative also values the socio-cultural dimensions of the air pollution issue. Different cultures with different political-economic systems approach the air pollution problem, its management, and potential solutions in different ways. Open AQ will host its first workshop in Ulaanbaatar this November with the goal of bringing together local experts, media, and community members to develop, dispute, and deploy strategies that would best suit Ulaanbaatar. This demonstrates that the Open AQ initiative will engage with local communities as key members of developing this platform.

As a suggestion, (once the air quality data is calibrated and complete) I recommend including a section or layer on guidelines and policies from different countries. I think it would be beneficial to make information about different interventions (especially related to health) accessible. How is Delhi tackling the air pollution issue? Can Mexico City use the same model? Why are Chinese residents wearing masks but not residents in Jakarta? People across the globe can learn how different governments are tackling the air pollution problem and/or how local communities are using the data to hold different actors and institutions accountable for air pollution reduction. For example, a lot of air pollution protection/air pollution-induced illness information is not readily available or part of the public discourse. In order for people to take ownership over their health as inhabitants of polluted cities, I think that a guidelines "layer" would create more urgency around this issue and strengthen efforts to improve air quality from within communities. I foresee a global "blog" on air pollution issues where people discuss, debate, and learn from each other on how to best tackle the air pollution problem in their own communities.

Add timeouts to the requests in adapters

If we don't set timeouts on the requests, Heroku may time us out which feels worse. Makes me wonder if we should have a system-wide request object that gets passed around so we can set defaults in one place?
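
A minimal sketch of the "defaults in one place" idea, assuming adapters build a plain options object for each request. `REQUEST_DEFAULTS`, `requestOptions`, and the 60 second value are hypothetical, not existing code in the repo.

```javascript
// Hypothetical shared defaults; 60 seconds is an arbitrary illustrative value.
const REQUEST_DEFAULTS = Object.freeze({ timeout: 60 * 1000 });

// Merge per-request options over the shared defaults, so an adapter only
// states what differs (typically just the URL) and can still override
// the timeout when a source is known to be slow.
function requestOptions(overrides) {
  return Object.assign({}, REQUEST_DEFAULTS, overrides);
}
```

An adapter would call `requestOptions({ url: source.url })` and pass the result to whatever HTTP client it uses. If the adapters rely on the `request` library, its `request.defaults()` function offers a similar mechanism.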

Comment from Knight Foundation Application - MH

Connecting individuals (including the media, politicians, citizen-scientists, and even other scientists) with scientific data is a constant challenge, and OpenAQ is an outstanding leap forward. By providing data in a programmatic method that ANYONE can utilize, you all are setting the standard for how this should be done. High school student writing about air pollution downwind from a power plant in your neighborhood? Click a button and get data. Reporter writing a story on how pollution in your city compares to another across the world? Click two buttons and download the data. Scientist wanting to do complex queries across multiple locations, adjusting for seasonality and time of day? Incorporate the API into your Python script.

One suggestion for long-term, future work. I agree with many of the other commenters that OpenAQ would make for a fantastic platform to extend to additional types of data. Meteorological, sea ice, and terrestrial flux data, for example, can be difficult data sets to access for both non-scientists and scientists alike. Often this data is squirreled away on a server in a proprietary format, is confusing to access, is not available by API, etc. Your platform could and should set the standard for how all types of scientific data are made easily and quickly accessible to the public.

How to include alternate unit sources

This just came up when looking to include Chilean data #37. For some of the measurements, they're reporting data in ppb or ppm (it looks like we may also be able to get it in ug/m3, but for the sake of argument, forget about that). We have the unit field in the measurement record for exactly this scenario, but do we actually want to use it? If some of the sources are reporting in ug/m3 and some are reporting in ppb or another alternate unit, it would seem to severely lessen the ability to directly visualize the data next to each other.

Is there an easy way to convert between ppb and ug/m3 or should we even do that?

cc/ @RocketD0g @olafveerman
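
On the conversion question: for a gas, ppb can be converted to ug/m3 with the ideal gas law, given the pollutant's molar mass and an assumed temperature and pressure. A minimal sketch, assuming 25 °C and 101.325 kPa as defaults (reference conditions differ between agencies, with 20 °C and 25 °C both in use, so this choice is an assumption). The function and constant names are hypothetical.

```javascript
const GAS_CONSTANT = 8.314462618; // J/(mol*K)

// Molar masses in g/mol for a few common gaseous pollutants.
const MOLAR_MASS = { o3: 48.0, no2: 46.01, so2: 64.07, co: 28.01 };

// Molar volume of an ideal gas in L/mol. With pressure in kPa,
// R*T/P comes out directly in litres per mole.
function molarVolume(tempC, pressureKPa) {
  return (GAS_CONSTANT * (tempC + 273.15)) / pressureKPa;
}

// Convert a mixing ratio in ppb to a mass concentration in ug/m3.
// The default temperature and pressure are assumptions, not measured values.
function ppbToUgm3(ppb, molarMass, tempC = 25, pressureKPa = 101.325) {
  return (ppb * molarMass) / molarVolume(tempC, pressureKPa);
}
```

With these defaults, 1 ppb of ozone comes out at roughly 1.96 ug/m3. Because the result depends on conditions the source may not report, it may be safer to store the original value and unit and treat any converted value as derived.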

Better email handling

Too many emails are getting sent; we need to figure out a saner way to handle this. They are currently disabled via a Heroku scheduler task until this is fixed.

2015 GBD/WHO template for including data in their global databases

No Immediate Action Intended - Background Info

FYI, below is a useful template showing the type of information collected for the upcoming 2015 WHO and GBD global databases of annual average PM2.5 and PM10 pollution (they are currently looking primarily for 2014 data). I can't find the issue, but I think @olafveerman brought up the categorization of sites before (e.g. what are the criteria for residential, urban, industrial, etc.?). It has been indicated that there are currently no strict criteria for this, and countries are directed to fill out the template using their best judgement.

http://www.who.int/entity/phe/health_topics/outdoorair/databases/PHE-Template-OAP-database-entries-June2015.xls?ua=1

Automatically create csv of last day's data and put on S3

Paraphrasing Slack conversation:

Basically, I’d like some way to dump either/both database dumps and daily/weekly/monthly csv dumps to an S3 bucket and make them available for easy download. Want to make it easy for someone to grab all of our data at once, and that’s probably not through the API.
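
A minimal sketch of the CSV half of this, assuming measurements are already available as an array of plain objects. `toCsv` and `csvField` are hypothetical helpers; the upload to S3 and the scheduling are left out because they depend on which client and scheduler are used.

```javascript
// Quote a single field when it contains a comma, quote, or line break,
// doubling any embedded quotes (the usual CSV convention).
function csvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build a CSV string with a header row from an array of objects.
// `columns` fixes both the column order and the fields that are exported.
function toCsv(rows, columns) {
  const header = columns.map(csvField).join(',');
  const lines = rows.map((row) =>
    columns.map((column) => csvField(row[column])).join(',')
  );
  return [header].concat(lines).join('\n') + '\n';
}
```

A daily job could then query the previous day's measurements, pass them through `toCsv`, and write the result to a dated key in the bucket.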

Documentation

I'm working on some project related documentation. Mostly a glossary of the project (source, station, measurement), the application's flow and some guidelines on how to contribute.

I can imagine this living in a couple of places:

  1. in the wiki of the openaq.github.io repo
    Advantage: easy to edit
  2. as a chapter on the Open AQ website
    Advantage: easy to read, Disadvantage: less easy to contribute
  3. as markdown files in the /docs folder of a repo
    Disadvantage: not easy to read, not easy to contribute

@RocketD0g @jflasher Any thoughts on how you want to set this up?

Handle dates in a better way

We need to do some thinking about how to best handle dates across the platform. Dates should be stored in UTC in the database, but we probably need to keep some track of timezones for location and whether it supports DST? ugh.
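
One possible approach, as a sketch rather than a decision: have each adapter produce a timestamp that carries its UTC offset, then store both the UTC instant and the original local string. Since the offset travels with every measurement, DST does not need separate tracking. `toDateRecord` and the `utc` / `local` field names are hypothetical.

```javascript
// Accept only ISO 8601 timestamps that end in "Z" or an explicit offset,
// so the conversion to UTC never depends on the server's own timezone.
const HAS_OFFSET = /(Z|[+-]\d{2}:\d{2})$/;

function toDateRecord(localTimestamp) {
  if (!HAS_OFFSET.test(localTimestamp)) {
    throw new Error(`Timestamp needs an explicit UTC offset: ${localTimestamp}`);
  }
  const parsed = new Date(localTimestamp);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Unparseable timestamp: ${localTimestamp}`);
  }
  // utc is what gets queried and sorted on; local preserves what the
  // source reported, including its offset at that moment.
  return { utc: parsed.toISOString(), local: localTimestamp };
}
```

For example, `toDateRecord('2015-10-04T09:00:00+08:00')` yields a `utc` value of `2015-10-04T01:00:00.000Z`.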

Chile data sources

Sinca is the Chilean AQ information system. It contains measurements from 194 stations, including the one in Valdivia (see #28).

Have to check:

  • the license under which this is published
  • whether there is an API we can use

Belgian sources

Measurements for a lot of Belgian measuring stations: http://www.ircel.be/nl/luchtkwaliteit/metingen
This page shows rolling averages for most parameters.

When you drill down, it's possible to get the actual hourly measurements and not the rolling averages. For example:

  1. go to this page
  2. click on table with detailed info per monitoring site

Have not found a programmatic way to access this data; it might need to be scraped.

Turkey - Sources

Map of stations with coordinates and current readings and stations' current data are here (though not on unique urls):

http://www.havaizleme.gov.tr/Default.ltr.aspx

'TÜM İSTASYONLAR' = all stations

Clicking on the stations reveals site coordinates (click 'station description') and pollutant types measured.

(Sidenote: They use an AQI system with the same breakpoints as the US EPA)

Comment from Knight Foundation Application - Lodoysamba

This is a constructive comment that was posted on our open Knight Foundation News Challenge (url at bottom).


This is an important work as scientists, researchers, and students should have access to air quality data via an internet portal. This is crucial for timely review and analysis at both the level of individual cities and regions, and at the international level. Superficial reports of atmospheric conditions solely from air quality stations are not adequate. Air quality depends not only on source emissions but also on weather conditions, population activity, and other factors. In some cases, initial data are not always available. In other cases, air quality offices refuse to share their data. Hence, while the environmental scientist aims to analyze and interpret the data, these problems of poor data quality and restricted access injure the scientist’s ability to generate quality measures to reduce air pollution. On the other hand, when comprehensive scientific data is available, policymakers and air quality officers are better able to orient their strategies to reduce air pollution.
An internet forum to warehouse data will help scientists from all nations to learn from one another. Such a forum would lead to improved methods to record data, analyze data, and utilize data more efficiently. By accessing a data warehouse, scientists in less developed countries could quickly learn how reduction measures affect air quality in other cities around the globe.
Furthermore, many students and researchers from non-environmental disciplines could also find value in the data. Mathematicians, for example, could use these types of data sets to improve tools of statistical data analysis.
The prototype http://openaq.org contains data for some analysis, but could use certain enhancements. For instance, the site ought to include information concerning the air quality measuring station type, along with the extent of validation or calibration of recording media to give researchers more confidence in the quality of the information.

prof.S.Lodoysamba, Mongolia

https://www.newschallenge.org/challenge/data/entries/openaq-the-first-open-air-quality-data-hub-for-the-world#c-b367e525a7e574817c19ad24b7b35607

Rename Heroku app

Or else I will forget what it's doing in 2 months and try to delete it.

Japanese Sources

Sources for Yokohama:

http://cgi.city.yokohama.lg.jp/kankyou/saigai/data/taiki/all/all_0000_00_001.html
http://www.ihe.pref.miyagi.jp/telem/dayreportitem/?itemSelect=10&day=2015%E5%B9%B410%E6%9C%8804%E6%97%A5

Appears to be hourly but unsure. Joe is contacting Miyagi Prefecture regarding details, potentially existing API, station coordinates.

Tokyo:
Just oxides? Need help with translation:
http://www.ox.kankyo.metro.tokyo.jp/index.php?chiku=1
http://www.ox.kankyo.metro.tokyo.jp/

Main page: http://www.kankyo.metro.tokyo.jp/nature/index.html

Validated / unvalidated

@RocketD0g What do you feel about adding a validated / un-validated flag? This might be valuable information, especially when we start adding validated sources.

@jflasher's fine with it. Just checked with him.

How to handle negative values?

With the inclusion of #61, we are going to be pulling some negative values into the platform. There may already be some, just noticed with latest data source. Some of the negative values are -0.25 and some are -999.

For right now, I think we just store these as is, but in the future do we throw out measurements with negative values? Do we keep them in the platform and leave it up to others to remove them?
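
If we keep storing values as is, one option is to classify them at read time so that consumers can filter. A minimal sketch, assuming -999 is a placeholder for a missing reading and that small negatives such as -0.25 are real instrument output near zero. Both are assumptions about the sources, and `classifyValue` is a hypothetical helper.

```javascript
// Values that a source uses to mean "no reading". Assumed, not confirmed
// with the data providers.
const MISSING_SENTINELS = [-999];

// Label a raw measurement value without modifying or discarding it.
function classifyValue(value) {
  if (MISSING_SENTINELS.includes(value)) return 'missing';
  if (value < 0) return 'negative';
  return 'ok';
}
```

This keeps the stored data untouched while letting the API or a downstream user drop `missing` values and decide separately what to do with `negative` ones.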

Comment from Knight Foundation Application - Langley

As a research scientist working on air pollution issues, more accessible AQ data in different regions of the world is highly critical to understanding sources, transport, and transformation of air pollutants in the atmosphere. A major global health issue, atmospheric particulate formation and transport is still not fully understood by the scientific community, and increasing the geospatial resolution of available AQ measurements for modellers and researchers could really raise our understanding of these issues. Additionally, this platform could help inform the public about local air quality issues and provide needed data for medical workers and journalists.
During my time working in a developing country, I have found it frustrating to try to scour scientific papers for names of scientists that may or may not know where AQ data is kept (when preparing proposals, briefings, and other official reports).
I like the suggestion from Chisato about including a section on guidelines and policies in different countries. This will allow direct impact of regulations to be observed. I am currently working in a developing country attempting to regulate AQ, and it is difficult for the officials to decide what type of AQ monitoring equipment to purchase and which regulations to push initially. Knowing what countries with similar air pollution sources and available resources have done in the past, and how this worked, would really be an asset for developing countries beginning to address AQ issues.
On a more scientific note, if possible and if available, including the meteorological data often captured by AQ monitoring stations (wind direction and wind speed) and a general description of measurement locations would help scientists best utilize these data in models.

Averaging period

As mentioned in #36, there are a couple of sources that report rolling averages. We seem to agree to store this with every measurement, but how to go about it?

We can either do a general purpose note field that can be used for anything:

{
  parameter: 'pm25',
  value: 4,
  note: '24 hour rolling average'
}

or we can attempt to standardize it in some way:

{
  parameter: 'pm25',
  value: 4,
  averagingPeriod: 24
}

@RocketD0g Do averaging periods tend to fall within a 4 - 24 hour range? Thoughts @jflasher ?
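
A possible variant of the standardized option, sketched here in case periods turn out not to be whole hours: store the period together with an explicit unit. The `{ value, unit }` shape and the `withAveragingPeriod` helper are hypothetical, not something agreed on in this thread.

```javascript
const AVERAGING_UNITS = ['minutes', 'hours'];

// Return a copy of the measurement with a validated averaging period attached.
function withAveragingPeriod(measurement, period, unit) {
  if (!Number.isFinite(period) || period <= 0) {
    throw new Error(`Averaging period must be a positive number, got ${period}`);
  }
  if (!AVERAGING_UNITS.includes(unit)) {
    throw new Error(`Unsupported averaging unit: ${unit}`);
  }
  return Object.assign({}, measurement, {
    averagingPeriod: { value: period, unit: unit }
  });
}
```

Calling `withAveragingPeriod({ parameter: 'pm25', value: 4 }, 24, 'hours')` returns the same measurement with `averagingPeriod: { value: 24, unit: 'hours' }` added.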

Standardize data fields

Which data fields will the platform support and what will they be called?

Currently we have defined names for pm25 and pm10.
