taylorterry3 / covid_dc
Tools for gathering and processing Washington DC covid data
Right now this is just happening in the notebook in two lines, but it needs some null handling added and should go in a function.
Add a side table of ward, address, DCPS codes, and other info for each school. This will also require creating a canonical short name for each school.
Add elementary/middle/high/other field to dataset. This will require a decode dict.
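One way the decode dict could look is sketched below. The school names, the `SCHOOL_LEVEL` mapping, and the `decode_level` helper are all hypothetical placeholders, not names from the repo; the real dict would be keyed on whatever canonical school names the dataset ends up using.

```python
from typing import Dict

# Hypothetical decode dict: canonical school name -> level.
# These entries are placeholders, not real DCPS data.
SCHOOL_LEVEL: Dict[str, str] = {
    "Example Elementary School": "elementary",
    "Example Middle School": "middle",
    "Example High School": "high",
}


def decode_level(school_name: str, default: str = "other") -> str:
    """Return the level for a school, falling back to `default` if unknown."""
    # Strip whitespace so stray spaces from scraping do not cause a miss.
    return SCHOOL_LEVEL.get(school_name.strip(), default)
```

Falling back to "other" keeps the new field populated for schools that have not been added to the dict yet, at the cost of hiding missing entries unless they are checked for separately.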
The main script needs some sort of secrets storage so that it can be deployed somewhere other than my laptop and run on a cron.
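A minimal sketch of one option is to read secrets from environment variables, which works both on a laptop and under cron or most hosted schedulers. The `get_secret` helper and the variable name in the usage comment are assumptions for illustration, not part of the existing script.

```python
import os


def get_secret(name: str) -> str:
    """Read a required secret from the environment, failing loudly if absent."""
    value = os.environ.get(name)
    if not value:
        # Fail early with a clear message instead of passing None downstream.
        raise RuntimeError(
            f"Missing required secret: set the {name} environment variable"
        )
    return value


# Hypothetical usage; the variable name is a placeholder:
# token = get_secret("COVID_DC_API_TOKEN")
```

This keeps credentials out of the repo. A dedicated secrets manager would be a heavier alternative if the deployment target provides one.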
Similar to #14, the PDFs are about to spill off into the dustbin of history and need to be archived.
A friend sent this to me as a link, so I need to document where it came from and what I did to it in the README
About 450 letters from now I'll need to repeat the process in #14 so we don't lose data as letters fall off the page they're scraped from.
A few new school names have shown up in the data set, so they need to be added to the decode dict.
I've been pretty inconsistent with adding type annotations outside of function signatures, so at some point I need to fix that.
Existing functions could use some tests.
Right now this system scrapes everything every time, which will stop working soon because the source page only shows the most recent 500 letters (see https://dcpsreopenstrong.com/health/response/notifications/). I had planned to switch to reading the PDFs and keep the old ones in the repo, which would be slow but stateless. The PDFs have been buggy and occasionally malformed, though, so I need a plan B.
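One possible plan B, sketched here under assumptions, is to keep an archive of previously scraped letters and merge each new scrape into it, so letters that fall off the 500-letter page are not lost. The record shape (a dict with a `url` key that uniquely identifies a letter) and the function names are hypothetical; the real scraper may key letters differently.

```python
import json
from pathlib import Path
from typing import Dict, List

Letter = Dict[str, str]


def merge_letters(
    archive: List[Letter], scraped: List[Letter], key: str = "url"
) -> List[Letter]:
    """Return the archive plus any scraped letters not already in it."""
    seen = {row[key] for row in archive}
    merged = list(archive)
    for row in scraped:
        if row[key] not in seen:
            merged.append(row)
            seen.add(row[key])
    return merged


def load_archive(path: Path) -> List[Letter]:
    """Load the archived letters, or an empty list on the first run."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save_archive(path: Path, letters: List[Letter]) -> None:
    """Write the merged letters back to disk as JSON."""
    path.write_text(json.dumps(letters, indent=2), encoding="utf-8")
```

Unlike the PDF approach this is stateful, since it depends on the archive file surviving between runs, but it does not depend on the PDFs being well formed.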
When the year rolls over there will be a need to fix things that assume that the dates being parsed are in the current year. This may just amount to versioning the data through December and starting a fresh round in January.
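An alternative to versioning the data is to infer the year at parse time. The sketch below assumes dates arrive as text like "December 15" with no year, and that a letter is never dated after the day it is scraped; both the format and the `parse_month_day` helper are assumptions for illustration.

```python
from datetime import date, datetime


def parse_month_day(text: str, reference: date) -> date:
    """Parse a 'Month day' string, choosing the latest year not after `reference`."""
    # Try the reference year first, then the previous year. Putting the year
    # in the parsed string also handles February 29 correctly.
    for year in (reference.year, reference.year - 1):
        try:
            parsed = datetime.strptime(f"{text} {year}", "%B %d %Y").date()
        except ValueError:
            # The day does not exist in this year (e.g. February 29).
            continue
        if parsed <= reference:
            return parsed
    raise ValueError(f"Could not place {text!r} on or before {reference}")
```

With a scrape date of January 5, 2021, "December 15" resolves to 2020 and "January 4" resolves to 2021. This only looks back one year, so older data would still need to be versioned or stored with explicit years.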
Because some of the one-shots find and replace values within the scraped data, they will need to be versioned along with the archived data so they aren't looking for things that aren't there. Another option would be to handle the error.
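The "handle the error" option could look roughly like this. It assumes a one-shot is a plain string find-and-replace over scraped text; the `apply_one_shots` name and the tuple-based fix list are hypothetical, not how the repo currently stores its one-shots.

```python
import logging
from typing import List, Tuple

logger = logging.getLogger(__name__)


def apply_one_shots(
    text: str, fixes: List[Tuple[str, str]]
) -> Tuple[str, List[str]]:
    """Apply each (old, new) fix, skipping targets that are not present."""
    skipped: List[str] = []
    for old, new in fixes:
        if old not in text:
            # The target has probably aged out of the scraped data.
            logger.warning("One-shot target not found, skipping: %r", old)
            skipped.append(old)
            continue
        text = text.replace(old, new)
    return text, skipped
```

Returning the skipped targets lets the caller decide whether a miss is expected, because the letter was archived, or a sign that the source data changed.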