ward-wise / data-analysis
Data analysis on Chicago infrastructure and infrastructure spending
License: MIT License
Businesses don't know what modes of transport their customers are using, and there are no good tools to track this. This data is important because it drives decisions about how street space is allocated among pedestrians, cyclists, and parked vehicles.
This should be started as a new repo.
https://usaddress.readthedocs.io/en/latest/
This package might do a better job pattern matching than our current setup.
John Ruf geocoded data from 2022 back to 2005. We'd like to get this into GeoJSON format.
https://github.com/JohnCRuf/alderman_machine/tree/master/tasks/data_geocode_menu/output
He's got an ID column on each row. He's generating those IDs as part of his scraping, so each project will have a unique ID. I believe the geocoded data can have multiple rows with the same ID because he split text with multiple locations into multiple rows.
In a Google Doc? There are data sets on the Chicago Open Data Portal to sort through. Some agencies have data on their websites that is not in the portal.
https://geocoding.geo.census.gov/geocoder/
Should experiment first to see how well it handles Chicago street data.
Use current bike map, upcoming bike lanes, and roads <10 mph to show a map of current bike network.
Geocoding a street address ("1234 N Name Ave") returns the location of the property at that address. These points should be projected to the nearest part of the street centerline.
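A minimal sketch of that projection step, assuming the centerline is available as a list of (x, y) vertices in a planar coordinate system (not raw latitude/longitude). The function names are hypothetical and not part of the existing code. If shapely is already a dependency, `line.interpolate(line.project(point))` should give the same result.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def project_to_segment(p: Point, a: Point, b: Point) -> Point:
    """Return the point on segment a-b that is closest to p."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: both ends are the same point
        return a
    # Fraction along the segment of the perpendicular foot, clamped to the ends
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_centerline(p: Point, centerline: List[Point]) -> Point:
    """Snap p to the nearest point on a polyline given as a list of vertices."""
    candidates = [
        project_to_segment(p, a, b)
        for a, b in zip(centerline, centerline[1:])
    ]
    return min(
        candidates,
        key=lambda c: (c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2,
    )
```

For example, a property point at (1, 2) next to a centerline running from (0, 0) to (10, 0) snaps to a point on the line directly below it.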
https://www.epa.gov/smartgrowth/smart-location-mapping
https://enviroatlas.epa.gov/enviroatlas/DataFactSheets/pdf/Supplemental/Numberofhouseholdswithzerovehicles.pdf
Data source for measuring "location efficiency" (housing density, land use, etc.). Might be usable for something.
The 5-year CIP books contain how the city spends its entire budget. The neighborhood and streetscape sections are relevant to infrastructure spending.
https://www.chicago.gov/city/en/depts/obm/provdrs/cap_improve/svcs/cip-archive.html
Most of the ward spending address data is in the format "N Streetname Ave & W Otherstreetname Ave & N Western Ave & W Belmont Ave". We represent these as a polygon by finding the intersection of the street centerlines, but sometimes two north/south streets intersect way off in the distance and it produces strange results.
Scripts and tests reference the installed package, so we have to reinstall it every time we make a change. Python supports relative imports (leading periods in the import path), which would let us use the source files directly.
https://storymaps.arcgis.com/stories/da5601c3e0924e5ab3ee07ade9954f7a
The geographic data is useful. I would find out where it's coming from, then download by hand and/or record the location so we can make fetch requests in the future.
Take project spending with multiple locations (multiple points, street segment, or polygon) and turn into points with evenly distributed project costs.
Lines -> line of points
Polygons -> point cloud
Take the generated point data and map as a heatmap in kepler.gl
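The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the project's implementation: geometries are plain lists of (x, y) vertices in planar coordinates, the polygon "point cloud" is a regular grid, and all function names are hypothetical.

```python
import math


def points_along_line(vertices, n):
    """Return evenly spaced points along a polyline (ends included when n > 1)."""
    segments = list(zip(vertices, vertices[1:]))
    lengths = [math.dist(a, b) for a, b in segments]
    total = sum(lengths)
    if n == 1 or total == 0:
        targets = [total / 2]  # a single point at the midpoint
    else:
        targets = [total * i / (n - 1) for i in range(n)]
    points = []
    for target in targets:
        remaining = target
        for (a, b), length in zip(segments, lengths):
            if length > 0 and remaining <= length:
                t = remaining / length
                points.append((a[0] + t * (b[0] - a[0]),
                               a[1] + t * (b[1] - a[1])))
                break
            remaining -= length
        else:
            # Rounding pushed the target past the end; use the last vertex
            points.append(vertices[-1])
    return points


def point_in_polygon(p, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def grid_points_in_polygon(polygon, spacing):
    """Return grid points (spaced `spacing` apart) that fall inside the polygon."""
    xs = [v[0] for v in polygon]
    ys = [v[1] for v in polygon]
    points = []
    y = min(ys) + spacing / 2
    while y < max(ys):
        x = min(xs) + spacing / 2
        while x < max(xs):
            if point_in_polygon((x, y), polygon):
                points.append((x, y))
            x += spacing
        y += spacing
    return points


def spread_cost(points, cost):
    """Split a project's cost evenly across its points."""
    share = cost / len(points)
    return [{"x": x, "y": y, "cost": share} for x, y in points]
```

The resulting rows can be written to a CSV with coordinate and cost columns, with the cost column used as the heatmap weight.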
There are regex statements set up to handle single-word street names (S BLACKSTONE AVE), but they currently don't detect multi-word street names (S DR MARTIN LUTHER KING JR DR). We need to modify the regex to detect these cases. Other functions inside the address_processing module may need to be tweaked to accept the changes.
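One possible pattern is sketched below. It is not the regex currently in address_processing; the suffix list and the `parse_street` name are assumptions for illustration. The name group matches one or more words lazily, so the final word is always reserved for the street suffix.

```python
import re

# Assumed (incomplete) list of street suffixes; extend as needed.
SUFFIXES = "AVE|ST|DR|BLVD|RD|PL|CT|PKWY|TER|LN|WAY"

STREET_RE = re.compile(
    r"(?P<direction>[NSEW])\s+"
    r"(?P<name>[A-Z0-9]+(?:\s+[A-Z0-9]+)*?)\s+"  # one or more words, lazy
    rf"(?P<suffix>{SUFFIXES})"
)


def parse_street(text):
    """Split 'S DR MARTIN LUTHER KING JR DR' into direction, name, suffix.

    Returns None when the text does not look like a street name.
    """
    match = STREET_RE.fullmatch(text.strip().upper())
    return match.groupdict() if match else None
```

Because `fullmatch` requires the suffix to be the last word, "DR" at the start of "S DR MARTIN LUTHER KING JR DR" is treated as part of the name rather than as the suffix.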
Write a Python function that takes an address and returns GPS coordinates.
Use geopy and the Nominatim API?
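A minimal sketch of such a function, assuming geopy is installed. The function names, the user agent string, and the ", Chicago, IL" suffix are illustrative choices. The geocoder is passed in as a callable so the lookup logic can be tested without network access; the delay of one second between calls is meant to respect Nominatim's usage policy.

```python
from typing import Callable, Optional, Tuple


def address_to_coords(
    address: str, geocode: Callable
) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for an address, or None if not found.

    `geocode` is any callable that takes a query string and returns an
    object with `latitude` and `longitude` attributes (or None).
    """
    location = geocode(f"{address}, Chicago, IL")
    if location is None:
        return None
    return (location.latitude, location.longitude)


def make_nominatim_geocoder(user_agent: str = "ward-wise-data-analysis"):
    """Build a rate-limited Nominatim geocode callable (requires geopy)."""
    from geopy.extra.rate_limiter import RateLimiter
    from geopy.geocoders import Nominatim

    geolocator = Nominatim(user_agent=user_agent)
    # Wait at least one second between requests
    return RateLimiter(geolocator.geocode, min_delay_seconds=1)
```

Usage would look like `address_to_coords("1234 N Name Ave", make_nominatim_geocoder())`.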
Preferably something free. Use GitHub sites? Leaflet for maps?
We have a custom geocoding setup that falters with some of Chicago's unique street names (e.g., "North Ave", "North Broadway", "Fulton Market", etc.). We either need to add code for these edge cases or switch to a cheap and effective geocoding API.
This can be used to set up a local geocoding API.
Alley location text tends to produce geometry that stretches across half the city. We need to fix the initial geocoding and implement a script to find poorly geocoded alleys and redo them.
The main issue is we're looking for all intersections between streets that bound the alley, and most parallel streets actually cross at some point in the distance. We need to interpret which streets are North-South and East-West, and then only find the intersections between N-S and E-W streets.
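A rough sketch of that filtering step, assuming each street name starts with its direction prefix (N, S, E, or W). The function names are hypothetical. One caveat: diagonal streets such as N Milwaukee Ave carry an N prefix but do not run strictly north-south, so they would need separate handling.

```python
from itertools import product


def street_orientation(street: str) -> str:
    """Classify a street as 'NS' or 'EW' from its direction prefix.

    Assumes streets prefixed N or S run north-south and streets
    prefixed E or W run east-west.
    """
    prefix = street.strip().upper().split()[0]
    if prefix in ("N", "S"):
        return "NS"
    if prefix in ("E", "W"):
        return "EW"
    raise ValueError(f"No direction prefix in {street!r}")


def crossing_pairs(streets):
    """Return only the (north-south, east-west) pairs worth intersecting."""
    ns = [s for s in streets if street_orientation(s) == "NS"]
    ew = [s for s in streets if street_orientation(s) == "EW"]
    return list(product(ns, ew))
```

For an alley bounded by two N-S streets and two E-W streets this yields four pairs, one per corner, and never pairs two parallel streets.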
CIP Archive - Previous Aldermanic Menu Program Books by Year section
Yes, someone has already asked the city if this data is available in a CSV file. It is not.
Write Python functions to convert the text in the PDFs to a CSV format. You can do OCR with pytesseract, but you might be able to get the text directly out of the PDF file using PyMuPDF or some other library. The PDFs for different years have different formats.
Write a Python module that serves as a wrapper for the Chicago Open Data Portal API. This might need to be done dataset by dataset.
The API docs have example code: https://dev.socrata.com/foundry/data.cityofchicago.org/dpkg-upyz
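A minimal sketch of such a wrapper using only the standard library, assuming the portal's standard Socrata resource endpoint and the `$limit`, `$offset`, and `$where` query parameters. The function names are hypothetical, and an app token (not shown) may be needed to avoid throttling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://data.cityofchicago.org/resource"


def build_query_url(dataset_id, limit=1000, offset=0, where=None):
    """Build the JSON endpoint URL for one page of a dataset."""
    params = {"$limit": limit, "$offset": offset}
    if where:
        params["$where"] = where  # SoQL filter expression
    return f"{BASE_URL}/{dataset_id}.json?{urlencode(params)}"


def fetch_rows(dataset_id, **query):
    """Fetch one page of rows as a list of dicts (makes a network request)."""
    with urlopen(build_query_url(dataset_id, **query)) as response:
        return json.load(response)
```

For example, `fetch_rows("dpkg-upyz", limit=5)` would request the first five rows of the dataset linked above. Dataset-specific helpers could then be thin functions built on `fetch_rows`.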
The current street resurfacing scripts find the last time a street segment (section of street between intersections) was resurfaced. However, sometimes only a portion of a street segment is resurfaced at a time, which means we might be missing old sections of street because they're part of the same segment as a recently resurfaced section of street.
Changes to make: