
census-communities-usa's People

Contributors

gmisshula, hunterowens, matthewgee

census-communities-usa's Issues

Setup Varnish (or some kind of caching engine)

Response times are generally pretty slow. Once the process of fixing the length of the tract codes is finished, we should probably go ahead and set up Varnish or something similar to start caching the API responses, since they really won't ever change.
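
Until Varnish is actually in place (and to give it a sensible TTL once it is), the app itself could mark responses as cacheable. A minimal sketch, assuming the API is a Flask app served from web/app.py:

# Sketch: mark every API response as cacheable. Assumes a Flask app (web/app.py).
from flask import Flask

app = Flask(__name__)

@app.after_request
def add_cache_headers(response):
    # The underlying LODES releases never change, so a long max-age is safe.
    # Varnish derives its TTL from Cache-Control max-age/s-maxage on the
    # backend response, so this keeps working once Varnish sits in front.
    response.headers['Cache-Control'] = 'public, max-age=86400'
    return response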

endpoint for totals per census tract

Based on the conversations and use cases so far, this endpoint described by @evz will be useful:

http://ec2-54-212-141-93.us-west-2.compute.amazonaws.com/tract-average/

[{
    "2010": {
        "17031010600": {
            "S000": 10809,
            "SE01": 2037,
            "SE02": 4472,
            "SE03": 4300
        }
    }
},
{
    "2011": {
        "17031010600": {
            "S000": 12191,
            "SE01": 2681,
            "SE02": 4561,
            "SE03": 4949
        }
    }
}]

Eventually, it would be nice to optionally include the tract boundary GeoJSON, perhaps with a flag like boundary=1 in the query string. @evz, how hard would it be to add that?

Relates to #17
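
For what it's worth, here's a rough sketch of what that route (including the optional boundary flag) might look like. It assumes a Flask + pymongo setup; the database, collection, and field layout ('census', 'tract_totals', 'geo_xwalk', 'year', 'tract') are hypothetical:

# Sketch of /tract-average/ with an optional boundary=1 flag.
# Database/collection/field names are assumptions, not the real schema.
import json
from flask import Flask, request, Response
from pymongo import MongoClient

app = Flask(__name__)
db = MongoClient()['census']

@app.route('/tract-average/')
def tract_average():
    include_boundary = request.args.get('boundary') == '1'
    out = []
    for doc in db.tract_totals.find({}, {'_id': 0}):
        year, tract = doc['year'], doc['tract']
        totals = {k: doc[k] for k in ('S000', 'SE01', 'SE02', 'SE03')}
        if include_boundary:
            # assumes tract polygons are already available in geo_xwalk as GeoJSON
            geo = db.geo_xwalk.find_one({'tract': tract}, {'_id': 0, 'geometry': 1})
            if geo:
                totals['boundary'] = geo['geometry']
        out.append({year: {tract: totals}})
    return Response(json.dumps(out), mimetype='application/json')

Since this walks every tract, the caching discussed in the Varnish issue above would matter a lot here.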

endpoint for current job totals for a given CBSA

For the homepage of Chicago Breadwinners, I'd like to show a choropleth of jobs per tract in the Chicago CBSA.

This might be a heavy call, so we could limit it to just the current year and just the S000 value. Oh ... and cache it too!

Scraper can be actual scraper instead of loader

According to the docs that you link to in the readme, it would seem to me that you could build a script that went and fetched the data one file at a time and shoved it into Mongo instead of attempting to load it all in one go. That would make this project way easier for others to stand up on their own, rather than relying on a 53GB file and a particular Mongo endpoint in order for it to work.
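
A rough sketch of that approach, assuming Python 3 with requests and pymongo; the LODES URL pattern and the database/collection names are assumptions and should be checked against the LODES docs linked in the readme:

# Sketch: fetch LODES files one state at a time and load them into Mongo,
# instead of depending on a single 53GB dump.
import csv
import gzip
import io

import requests
from pymongo import MongoClient

BASE = 'https://lehd.ces.census.gov/data/lodes/LODES7'   # assumed URL layout
STATES = ['al', 'ak', 'az', 'ar']                        # ...eventually all states
YEAR = 2011

db = MongoClient()['census']                             # database/collection names are assumptions

for state in STATES:
    url = '{0}/{1}/od/{1}_od_main_JT00_{2}.csv.gz'.format(BASE, state, YEAR)
    resp = requests.get(url)
    resp.raise_for_status()
    text = gzip.decompress(resp.content).decode('latin-1')
    batch = []
    for row in csv.DictReader(io.StringIO(text)):
        batch.append(row)
        if len(batch) == 10000:                          # insert in chunks so memory stays flat
            db.origin_destination.insert_many(batch)
            batch = []
    if batch:
        db.origin_destination.insert_many(batch)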

Data Questions

OK, I swear I started this last night...

Anyways, Derek and I were talking about questions we might have about the data and thought it might be smart to start compiling a list so that the next time we have someone from the Census Bureau on the phone we can see about getting some answers. I'll start:

• Is there any way of accounting for telecommuting in the origin destination data?
• What do the job types (primary, private, private primary, federal, and federal primary) actually refer to?
• How reliable is the geocoding? If a worker works for a McDonalds, does their job exist at the location of the McDonalds or at the corporate headquarters?
• If the company that a worker works for is in, for example, the retail industry but their job could be described as a job in technology, what gets reported in these data?

limiting results again?

Good news: the site seems to be responsive again!

Bad news: looks like we're limiting results to the top 100 connected tracts again.

[screenshot, 2013-10-02 10:12 AM]

Load polygons of census blocks into geo_xwalk table

It would seem to me that, since the current version of Mongo supports storing and indexing polygon data, it would be feasible to download, parse, and load all that data along with all the geographic crosswalk data. You can get shapefiles grouped by state here and there are a few python libraries that will translate those into GeoJSON that we can put into MongoDB.
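
A sketch of what that could look like with pyshp and pymongo (the shapefile name, database name, and the choice to insert standalone documents are all assumptions):

# Sketch: convert a TIGER block shapefile to GeoJSON geometries and load them
# into Mongo with a 2dsphere index. Requires pyshp (pip install pyshp).
import shapefile                      # pyshp
from pymongo import MongoClient, GEOSPHERE

db = MongoClient()['census']
blocks = db.geo_xwalk

sf = shapefile.Reader('tl_2013_04_tabblock.shp')   # e.g. Arizona blocks (assumed filename)
field_names = [f[0] for f in sf.fields[1:]]        # skip the DeletionFlag field

for sr in sf.shapeRecords():
    doc = dict(zip(field_names, sr.record))
    doc['geometry'] = sr.shape.__geo_interface__   # GeoJSON geometry dict
    blocks.insert_one(doc)

# MongoDB 2.4+ can index GeoJSON geometries for spatial queries
blocks.create_index([('geometry', GEOSPHERE)])

In practice we'd probably want to match each polygon to its existing geo_xwalk document by block GEOID and $set the geometry there, rather than inserting separate documents as above.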

rename repo to something more descriptive

census-communities-usa doesn't make a lot of sense to me considering we are serving up origin-destination data for workers across the country.

Suggestions

  • census-lodes-api
  • census-home-work-commute-api
  • where-we-work

Unicode errors when loading CSV

As we discovered last night, it would seem that when we attempt to push the data from the CSV into MongoDB, it needs to be Unicode (or at least something that is trivial for pymongo to make into Unicode). The python-unicodecsv module got us halfway there but in order to take it the rest of the way, I needed to actually declare the incoming encoding as well. So, after a bit of trial and error, it seems that the proper encoding, at least for the Arizona geographic crosswalk table, is latin-1.
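
For reference, the pattern that seems to work is roughly this (the file path and collection name here are made up):

# Sketch: read a geographic crosswalk CSV with python-unicodecsv, declaring the
# incoming encoding as latin-1, then push the rows into Mongo.
import unicodecsv
from pymongo import MongoClient

db = MongoClient()['census']

with open('az_xwalk.csv', 'rb') as f:              # unicodecsv wants a binary file handle
    reader = unicodecsv.DictReader(f, encoding='latin-1')
    for row in reader:
        db.geo_xwalk.insert_one(row)               # every value is now a unicode string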

That means that in a few hours we'll have Alabama, Arkansas, Alaska and Arizona loaded.

research existing work with this data

@JoeGermuska pointed me to this presentation:

Digging Deeper on Employment: Local Employment Dynamics Data

Paul Overberg, database editor, USA TODAY

Most local business reporting on employment focuses on net numbers because that’s all that has been available until recently. Now reporters can analyze the tremendous churn of workers in and out of new and old firms in various sectors of their local economies using LED data. Learn how to use browser-based tools to analyze and visualize this data for a state or metro audience and how to extract a local slice for more analysis.

Download PowerPoint
Watch video of session

endpoint for traveling to/from of a given tract

When a user clicks on a tract, I'd like to show the inflows and outflows, similar to this: http://www.forbes.com/special-report/2011/migration.html

[screenshot, 2013-08-26 5:22 PM]

Example call/response:

http://ec2-54-212-141-93.us-west-2.compute.amazonaws.com/tract-origin-destination/17031010600

[{
    "17031010600": {
        "traveling-from": {
            "17031010601": 5,
            "17031010602": 3,
            "17031010603": 17,
            "17031010604": 43
        },
        "traveling-to": {
            "17031010608": 1,
            "17031010609": 90,
            "17031010610": 4,
            "17031010611": 12
        }
    }
}]

Relates to #17

@evz doable?

Convert API from Mongo to Postgres

Based on @evz's testing, Postgres seems to be a much better option than Mongo. Accordingly, we should convert the remaining parts of web/app.py to run on Postgres and update the documentation. [Parts of the API already use psycopg2/Postgres.]

Also, should we use an ORM (i.e., SQLAlchemy)?

Thoughts? Feel free to debate this contention.
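
A hedged sketch of what a converted route could look like using SQLAlchemy Core (the connection string, table, and column names are all assumptions):

# Sketch: a Postgres-backed route via SQLAlchemy Core instead of pymongo.
import json
from flask import Flask, Response
from sqlalchemy import create_engine, text

app = Flask(__name__)
engine = create_engine('postgresql://localhost/census')   # connection string is an assumption

@app.route('/tract-average/')
def tract_average():
    sql = text("""
        SELECT year, tract, s000, se01, se02, se03
        FROM tract_totals                 -- table/column names are assumptions
        ORDER BY year, tract
    """)
    with engine.connect() as conn:
        rows = [dict(r) for r in conn.execute(sql).mappings()]
    return Response(json.dumps(rows), mimetype='application/json')

Core with textual SQL like this gets us connection pooling and consistent parameter handling without committing to full ORM models; declarative models could still be layered on later if they pull their weight.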
