Giter Site home page Giter Site logo

louisvillemetro / wazeccpprocessor Goto Github PK

View Code? Open in Web Editor NEW
79.0 79.0 25.0 15.79 MB

Waze WARP takes your CCP data feed and processes it into your cloud provider for access, analysis, and visualization. An Open Government Coalition (OGC) project. @govintheopen

Home Page: https://www.govintheopen.com/

License: MIT License

JavaScript 2.42% HCL 29.85% TypeScript 65.15% HTML 2.58%
amazon-aws aws cloudformation louisville mobility terraform transportation waze waze-ccp

wazeccpprocessor's People

Contributors

gabrielbogo avatar hmdavis avatar jrstill avatar lamontadams avatar schnuerle avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

wazeccpprocessor's Issues

Web-based interactive map that overlays current snapshot of alerts, jams, irregularities. Provides a date/time selector and slider to look back in time. Will likely use the API to overlay data as GeoJSON.

  • Overlays current snapshot of alerts, jams, irregularities, with filters
  • Provides a date/time selector and slider to look back in time
  • Good to get leadership/mayoral buy-in, business intelligence cases

Technical: A open source mapping platform (OpenStreetMap base layer with Leaflet.js) that has selectors that can pass XHR data into the API query string and return a clickable GeoJSON overlay of current or past conditions, running serverless to spin up only on demand. Allow JSON data export from map view (eg, API call).

Improvements in RDS Schema

Hi, I have some suggestions with respect to the RDS Schema::

  • Store jam's list of coordinates as JSON (helps simplify queries, makes further analyses easier). I am not sure, but I would also say it has better performance since it would be one less JOIN.
  • Store ALL JSONs from waze, regardless of them containing any jams, alerts or irregularities. This is important to calculate the Traffic Probability (empty JSONs are needed).
  • Store startDate and endDate as TIMESTAMP WITH TIMEZONE. It is best practice, avoids errors from the analysts that will use the data in the future
  • A few other UNIQUE and FOREIGN KEY constraints

What do you guys think? Should I do a Pull Request?

Create geometry fields with line/point data for GIS applications

Is there a way to update a geometry column where there are lat/long columns? I would like to put it into a view and can't use UPDATE in it. Here's what I run manually to create the Geom field in a new column.

`DROP TABLE IF EXISTS waze.wazejams;
CREATE TABLE waze.wazejams AS
SELECT j.speed, j.level, j.length, j.delay, j.city, j.street, j.line, j.pub_utc_date at time zone 'utc' at time zone 'america/denver',j.id
from waze.jams j
ORDER BY j.pub_utc_date DESC;
SELECT AddGeometryColumn ('waze','wazejams','geom_line',4326,'LINESTRING',2);
WITH
lines AS(
   SELECT id,
          ST_MakeLine(
            ST_MakePoint(
              (line -> n ->> 'x')::NUMERIC,
              (line -> n ->> 'y')::NUMERIC
            )
          ) AS geom
   FROM waze.wazejams
   CROSS JOIN generate_series(0, jsonb_array_length(line)) AS n
   GROUP BY id)

 
UPDATE waze.wazejams AS a
SET geom_line = ST_SetSRID(b.geom, 4326)
FROM lines AS b
WHERE a.id = b.id`

Your Use Cases

Leave comments here on some use cases you'd like to see supported by this project.

If you've got a fleshed out idea, create a new issue for it and label it Use Case for others to take a look at.

We'll consolidate and get the best ones onto the Roadmap.

Create Carto.com Integration with a Connector

Once the RDS is complete in Phase 1, we'd like to support more data visualization integrations. If you use Carto, you can make a pull request to add support for it with examples to this repo so everyone can benefit. Use this issue as a place to discuss and collaborate.

Optimize how files are stored or processed

Once the RDS is complete in Phase 1, we'd like to optimize the data storage and processing process if possible. If you have ideas on how to do this, you can make a pull request to add support for it with examples to this repo so everyone can benefit.

One idea is to store each JSON file in a zip, thus reducing storage costs.

Use this issue as a place to discuss and collaborate.

Add type_id to the Alerts table

I'd like to add a type_id that references the alert_types table. Currently this table is unused. I also think the type and subtype columns should be removed from the alerts table.

-- modify database
ALTER TABLE waze.alerts ADD COLUMN IF NOT EXISTS type_id INTEGER;
CREATE INDEX CONCURRENTLY alerts_type_id_idx ON waze.alerts (type_id);

-- add data to new type_id field in alerts
UPDATE waze.alerts AS a 
SET type_id = t.id
FROM waze.alert_types AS t
WHERE a.type = t.type 
 and a.subtype = t.subtype 
 and a.type_id is null; 

Optimize Data Store for performance

Once the RDS is complete in Phase 1, we'd like to optimize it for performance. If you have ideas on how to do this with index or other methods, you can make a pull request to add support for it with examples to this repo so everyone can benefit. Use this issue as a place to discuss and collaborate.

Release Versioning - Upgrade docs and new install integration

We are working on our first release that changes the database structure.

Currently we are at 2.0.1, and are working on 2.1:
https://github.com/LouisvilleMetro/WazeCCPProcessor/releases

This 2.1 release changes the database by adding fields, filling in data in those fields, enriching the DB, adding a new user, etc. See this branch for details:
https://github.com/LouisvilleMetro/WazeCCPProcessor/tree/schema/refactoring-1

And these issues labelled v2.1:
https://github.com/LouisvilleMetro/WazeCCPProcessor/issues?q=is%3Aissue+is%3Aopen+label%3Av2.1

To accommodate that we are doing two things.

  1. Fresh installs - (assigned to Jason S)
    If you are coming and getting the code fresh, then these changes will be handled automatically for you. Terraform will run a Lambda that runs the code here in an initial install file that will be kept up to date:
    https://github.com/LouisvilleMetro/WazeCCPProcessor/blob/schema/refactoring-1/code/sql/initial-install.sql

  2. Upgrades - (assigned to Michael S)
    If you are upgrading from one version to another, you will have to manually run DB scripts to make changes. There will be step-by-step directions written for this, and the script will be named and stored like this one:
    https://github.com/LouisvilleMetro/WazeCCPProcessor/blob/schema/refactoring-1/code/sql/update-2.0.1-2.1.sql

We hope to have this release complete before August 8, 2018.

Create Tableau integration

Once the RDS is complete in Phase 1, we'd like to support more data visualization integrations. If you use Tableau, you can make a pull request to add support for it with examples to this repo so everyone can benefit. Use this issue as a place to discuss and collaborate.

Enrich DB tables with useful fields - Direction, Day of Week

Once the RDS is complete in Phase 1, we'd like to enrich the database with more fields that are useful. If you have special fields or views you need for your analysis, you can make a pull request to add support for it with examples to this repo so everyone can benefit. Use this issue as a place to discuss and collaborate.

Traffic Study. Create QuickSight integration with Data Store. User can select bounding box, date range, and time grouping to do a mini traffic study to see a before and after analysis.

  • Provides a traffic study replacement tool that allows comparison of a traffic area before and after a change (signal retiming, road diet, reconfiguration, traffic events)
  • Working tool showing how Waze data can replace studies that can cost $25K - 75K
  • Good to get traffic department, internal gov operations, data analytics buy-in

Technical: Louisville currently has a useful working version in Power BI that the traffic department uses that can be recreated in the cloud. User selects area of geographic interest, times of day/week, date ranges, direction, and gets a before and after analysis to show effectiveness.

Screenshot from current internal traffic tool in Louisville
Power BI Screenshot
https://imgur.com/download/lioITKh

Add local timezone field

Create a new field in Jams and Alerts that is a date/time field adjusted to a single local timezone.

The timezone is set by the user at the time of deployment as a variable.

You may not be able to set a timezone in a database field. If not, then adjust the date time from UTC to local timezone through a query upon data ingestion.

Also have a script to back fill existing deployments with this data.

Create a read-only database user

Create method for external entities to read-only query final Data Store RDS via SQL connection.

Add a user with read only access to the database that you can share with other users in your organization to run queries and reports.

Create API call to access the data in JSON/GeoJSON/XML format by passing in things like bounding box, start and end time, Jam fields, Alert types, etc.

  • Allows easy access to chunks of data from the DB
  • Proof of concept limited for use of Interactive Map
  • Good to get developer buy-in, service integrations

Technical: HTTP endpoint would return GeoJSON file from DB. Parameters include start date/time, end date/time, lat/lon bounding box, Waze categories (Jam types, Alert types, Irregularities).

Examples:

List of reports
https://endpoint.com/api/waze/reports?startdate=2018-02-17&enddate=2018-02-19&starttime=1600&endtime=1800&minlat=37.9971&maxlat=38.38051&minlon=-85.948441&maxlon=-85.4051&jams=2,3,4&alerttypes=ACCIDENT_MAJOR,JAM_STAND_STILL_TRAFFIC,HAZARD_ON_SHOULDER_CAR_STOPPED&irregularity=0&roadtypes=3,5,8,9&streetname=&delay=&speed=&length=&format=json

Detailed data on a single report
https://endpoint.com/api/waze/reportdetail?reportid=FB6E701F-E8A6-3667-BC5A-434A593E5C76&format=json

Geodata (line or point) on a single report
https://endpoint.com/api/waze/segmentgeo?reportid=FB6E701F-E8A6-3667-BC5A-434A593E5C76&format=json

Create an import script

Some CCP members would need to transfer existing json files to an S3 (or GS, etc...) buckets.

We need to create a simple script that would allow uploading from a source (starting with a local source) to a destination. Making it possible to resume broken uploads and converting directory structure, etc.

Support Google Cloud within Terraform

We refactored our code base between version 1.0 and 2.0. Everything used to be written in AWS Cloud Formation, but now is written in Terraform. This is a cloud-agnostic language that gets you 80% of the way to any cloud platform using infrastructure-as-code.

Currently we have this project working 100% in AWS using Terraform. It's 80% working in Google Cloud because of Terraform, but we need the remaining 20% of the work to be complete (some functions and code tweaks). Help us get there by writing the rest if you can, or finding a funder that can pay to complete the work.

Getting ready for Terraform release v2.0

This Issue it to prep the Terraform branch for merger back into master as v2.0, with documentation in the readme and testing in the cloud to find any issues. This branch is the Phase 1 project, which is the end-to-end processor and RDS.
https://github.com/LouisvilleMetro/WazeCCPProcessor/projects/1

v2.0 branch:
https://github.com/LouisvilleMetro/WazeCCPProcessor/tree/feature/terraform-buildout

We will want to resolve or comment on open numbered issues (7-12) as part of Phase 1:
https://github.com/LouisvilleMetro/WazeCCPProcessor/issues?q=is%3Aissue+is%3Aopen+label%3A%22v2.0+release%22

Everyone can help test by:

  1. grabbing the code and deploying to their cloud
  2. recording any issues or idea here
  3. writing or editing the readme to be accurate and helpful
  4. etc

FYI, I marked the current master as v1.0: https://github.com/LouisvilleMetro/WazeCCPProcessor/releases

Work to reduce database costs

Once the RDS is complete in Phase 1, we'd like to find ways to reduce the overall cloud costs. If you have ideas, you can make a pull request to add support for it with examples to this repo so everyone can benefit.

One idea is to pull data into a database that is real-time in nature (gets updated every 2 minutes, but maybe only has 1 day of data) that runs all the time, and then have a second serverless database that is historic in nature (has all data and gets an update from the real-time table nightly) which will only be active when queried, thus saving costs.

Use this issue as a place to discuss and collaborate.

Useful SQL Query Collection

Post as comments here sample SQL queries that you find useful. Make sure to:

  • say what you use it for, eg a title
  • show how people can modify it for themselves
  • explain its value or how it integrates into a use case
  • how long it takes to run and note any suggestions for database indexes that might speed up the query
  • wrap the SQL with code markdown formatting, eg using ``` around the SQL

DB Testing Example

Not that useful but shows off some joins and runs quickly.

A quick query to select weather hazard alerts joined with road type and returns the latitude and longitude of the location. Take a few seconds to run if you have some new recommended indexes setup.

select a.pub_utc_date, a.type, a.subtype, a.street, a.reliability, 
 r."name", c.latitude, c.longitude 
from waze.alerts a 
join waze.roads r on a.road_type = r.value
join waze.coordinates c on a.id = c.alert_id
where a.type = 'WEATHERHAZARD'
limit 100;

Support Microsoft Azure within Terraform

We refactored our code base between version 1.0 and 2.0. Everything used to be written in AWS Cloud Formation, but now is written in Terraform. This is a cloud-agnostic language that gets you 80% of the way to any cloud platform using infrastructure-as-code.

Currently we have this project working 100% in AWS using Terraform. It's 80% working in Azure because of Terraform, but we need the remaining 20% of the work to be complete (some functions and code tweaks). Help us get there by writing the rest if you can, or finding a funder that can pay to complete the work.

Creating Historical Database for Analysis

Thanks for the awesome codebase/infrastructure, @schnuerle!

City of LA has now deployed and is archiving the CCP feed. As mentioned, we'd like to find a way to put it into a traditional RDBMS or similar for querying. This is an issue to discuss how to build out the architecture.

Create R integration

Once the RDS is complete in Phase 1, we'd like to support more data visualization integrations. If you use R, you can make a pull request to add support for it with examples to this repo so everyone can benefit. Use this issue as a place to discuss and collaborate.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.