api's People

Contributors

loganpowell

api's Issues

Block group IDs coming back as null for the 2013-2017 ACS

Hi @loganpowell -

A tidycensus user just alerted me to an issue with block group IDs coming out of the 2013-2017 5-year ACS API over in walkerke/tidycensus#347. The block group field returns null when requested from the API - see here for an example. I've run through a few states/counties to test this out and it is consistently missing. This does not appear to be an issue for any other year. Thanks for looking into this when you get the chance!

Kyle

Accept summary level codes for "for" attribute as well as strings

To specify the geographic summary level requested, the various API endpoints take a for URL parameter whose acceptable values are English-language strings.

It would be good if the summary level codes were usable in addition to strings, e.g. 310 for metropolitan statistical area/micropolitan statistical area
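Until something like this exists server-side, a client-side lookup can stand in. The sketch below maps a handful of summary level codes to the strings that the for parameter accepts today; the table is a small hand-written subset, and `for_clause` is a hypothetical helper, not part of any Census client library.

```python
from urllib.parse import quote

# Hand-picked subset of summary level codes and the `for` strings the API accepts.
SUMLEV_TO_FOR = {
    "040": "state",
    "050": "county",
    "140": "tract",
    "150": "block group",
    "310": "metropolitan statistical area/micropolitan statistical area",
}


def for_clause(sumlev, ids="*"):
    """Build the value of the `for` URL parameter from a summary level code."""
    code = str(sumlev).zfill(3)  # accept 40, "40" or "040"
    try:
        name = SUMLEV_TO_FOR[code]
    except KeyError:
        raise ValueError(f"no mapping for summary level {code!r}") from None
    # Percent-encode spaces but keep the "/" that appears in some names.
    return f"{quote(name, safe='/')}:{ids}"


print(for_clause("310"))
print(for_clause(150))
```

A complete table would have to cover every summary level a given dataset supports, which is exactly the information the API would be better placed to own.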

Add explanations for top-coded values to API doc

repost from Slack

Should top-coded variables be mentioned in Notes on ACS Estimate and Annotation Values?
I just noticed that API returns and files downloaded from data.census.gov store top-coded values differently.
For the median household income variable in Table B19013, two tracts in New York County are coded as 250,000+ in the csv file I downloaded via data.census.gov, but in the API return the value is coded as 250,001. It might be a good idea to mention this special case in the Notes, since I rarely request annotations when I send API requests.
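A minimal sketch of how a client might reconcile the two representations, assuming the only top code of interest is the 250,001 value reported above for B19013. The `TOP_CODES` table and `display_estimate` helper are hypothetical, and other special estimate values are not handled.

```python
# Assumed sentinel, taken from the report above: the API returns 250001 where
# the data.census.gov CSV shows "250,000+" for median household income.
TOP_CODES = {"B19013_001E": 250001}


def display_estimate(variable, raw):
    """Format an API estimate, marking a top-coded value with a trailing '+'."""
    value = int(raw)  # the API serves estimates as quoted strings
    top = TOP_CODES.get(variable)
    if top is not None and value == top:
        return f"{top - 1:,}+"
    return f"{value:,}"


print(display_estimate("B19013_001E", "250001"))  # 250,000+
print(display_estimate("B19013_001E", "86000"))   # 86,000
```

Flagging the value at read time keeps a top-coded 250,001 from being treated as an exact dollar amount in later calculations.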


2000 Decennial sf1 api call issues

Hi,

Thanks for all the work you do with the census api. I'm using tidycensus to download dissertation data for all tracts in the country for a number of variables. I've run the call below several times over the past few days to get 2000 decennial data, but I keep getting errors. (I'm honing my script by calling data for just CT and 75% of the variables I need.) I've contacted the author of the tidycensus package, but we don't think the error is with the package.

# assign selected variables
my_vars00sf1 <- c(pop = "P001001", 
             
                # race-ethnicity
                racetot = "P007001", nhwht = "P007002", nhblk = "P007003", nhnat = "P007004", nhasian = "P007005",
                nhpac = "P007006", nhother = "P007007", nhtwo = "P007008", hisptot = "P008001", nothisp = "P008002", 
                hisp = "P008010",
                
                # sex 
                sex_tot = "P012001", sex_m = "P012002", sex_f = "P012026",
                
                # median age
                mage = "P013001", mage_m = "P013002", mage_f = "P013003", 

                # housing unit
                hu = "H003001", occ = "H003002", vac = "H003003", occ_tot = "H004001", owner = "H004002", 
                renter = "H004003", hhsize = "H012001", hhsizeo = "H012002", hhsizer = "H012003")

# request SF1 data for all tracts in CT (test run before the national call)
sf1 <- get_decennial(state = "CT",
                     geography = "tract",  # specify tract geography
                     year = 2000,  # specify year, 2000
                     variables = my_vars00sf1 , # set variables 
                     sumfile = "sf1", # specify sum file for decennial
                     show_call = TRUE, # show call for troubleshooting
                     geometry = FALSE) # do not download tract geometry

# glimpse to verify call worked        
head(sf1)

Error in load_data_decennial(geography, variables, key, year, sumfile = "sf3",  : 
The Census API has returned the error message error: error: unknown variable 'H003002'.
This may be due to mixing SF1 and SF3 variables. If so, separate your requests to SF1 and SF3 when using `get_decennial()` by using the `sumfile` argument in separate calls.
Error in gather.default(., key = variable, value = value, -GEOID, -NAME) : 
  object 'NAME' not found

I know there were some issues over the last few months with new endpoints, so perhaps some issues remain. I can run the call with different combinations of variables, and sometimes it is successful and sometimes not. "H003002" is a problematic variable that seems to be tripping up the call. Calling H003002 by itself usually works, but whenever it is included in a call with more than two other vars, it seems to break the call.

Is anyone else having issues with this, or is there something I'm missing here? I'm having similar issues with SF3 calls too. Any help would be greatly appreciated.

Thanks.

FR: Serve numeric data as numbers (without quotes)

Census APIs which return data as JSON should take advantage of JSON's data typing. Numeric values should not be served with quotes. Take this API call, for instance:

https://api.census.gov/data/timeseries/bds/firms?get=emp,estabs,estabs_entry,state,year,year2&for=state:*&time=from+1977+to+2014

Viewed in a browser JSON viewer, all values are quoted, and thus interpreted by JSON parsers as strings. However, the API documentation clearly shows that the predicateType for most of these variables is int.

By returning the numeric values without quotes, most JSON tools would parse the API response and return those as numeric values, saving API users from redundant work and making the real nature of the data more clear.
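The redundant work currently falls on the client and looks roughly like the sketch below. The sample payload is made up (only its array-of-arrays shape follows the API), and the `PREDICATE_TYPES` mapping is an assumption standing in for what a client would read from the dataset's variable metadata.

```python
import json

# Made-up sample in the API's array-of-arrays shape: header row, then data rows.
raw = '[["emp","estabs","state","year"],["1500","120","01","1977"]]'

# Assumed predicateType per variable; a real client would read these from the
# dataset's variable metadata instead of hard-coding them.
PREDICATE_TYPES = {"emp": "int", "estabs": "int", "state": "string", "year": "int"}

CASTS = {"int": int, "float": float}


def typed_rows(payload, types):
    """Convert quoted cells to numbers wherever the predicateType is numeric."""
    header, *rows = json.loads(payload)
    casts = [CASTS.get(types.get(name), str) for name in header]
    return [
        {
            name: (cast(cell) if cell is not None else None)
            for name, cast, cell in zip(header, casts, row)
        }
        for row in rows
    ]


print(typed_rows(raw, PREDICATE_TYPES))
```

Identifier columns such as state stay as strings here, so that leading zeros in FIPS codes survive the conversion.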

FR: Add pre-2000 population estimates dataset to APIs

It would be incredibly useful if earlier population estimates were added to the population estimates APIs, https://www.census.gov/data/developers/data-sets/popest-popproj/popest.html.

State estimates are only available for 2000 onward. The 1990s intercensals only have counties broken down by subgroups, with no county totals or state totals. For earlier data I could only find a bunch of different txt and csv files that aren’t formatted consistently from year to year. If there’s ever time to go through the backlog, adding older intercensals would be great.

add WKT string for geometry info into Census API

It's very tedious that you have to get the census data (e.g., ACS) via Census API and then find a shapefile to download through TIGER and merge them. Usually you need to merge them based on their FIPS code, but it sometimes changes from year to year. Let's say the Census decides to add a few more tracts in a county or change their area. In other words, you have to download, for example, ACS 2015 shapefile to match with ACS 2015 data. This is very frustrating and also error-prone as some people don't realize this.

I think a very simple solution would be to store the geometry info as WKT and add it as a field in the JSON returned, so that users can convert it to a geo-file on their end after the request.

https://en.wikipedia.org/wiki/Well-known_text_representation_of_geometry
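To illustrate the idea, here is a sketch of what consuming such a field could look like. Everything in it is hypothetical: the API returns no geometry today, the `geometry_wkt` field name is invented, and the coordinates are a made-up unit square. The parser handles only single-ring POLYGON strings; real tract geometries would call for a full WKT library.

```python
import re

# Hypothetical record: "geometry_wkt" is an invented field name and the
# coordinates are a made-up square, not a real tract boundary.
record = {
    "GEO_ID": "1400000US01001020100",
    "geometry_wkt": "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))",
}

_POLYGON = re.compile(r"\s*POLYGON\s*\(\(([^()]+)\)\)\s*", re.IGNORECASE)


def parse_wkt_polygon(wkt):
    """Parse a single-ring 'POLYGON ((x y, ...))' string into (x, y) tuples."""
    match = _POLYGON.fullmatch(wkt)
    if match is None:
        raise ValueError("only single-ring POLYGON strings are handled in this sketch")
    points = []
    for pair in match.group(1).split(","):
        x, y = pair.split()
        points.append((float(x), float(y)))
    return points


print(parse_wkt_polygon(record["geometry_wkt"]))
```

Because the geometry would travel in the same record as the estimates, there would be no join on FIPS codes and no risk of pairing data with the wrong vintage of boundaries.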

Household serial numbers for 2018 PUMS API

I know that the PUMS API is not "official" yet; however I did notice that Census is starting to advertise the data.census.gov Microdata tool (e.g. http://apdu.org/2020/05/27/new-microdata-access-tool-for-acs-and-cps/), which has the API as one of the output options. Right now, household serial numbers for the 2018 ACS do not come through correctly, e.g. at this example call to the API:

[["SERIALNO","SPORDER","WGTP","PWGTP","AGEP","PUMA","ST"],
["*","1","42","43","58","100","50"],
["*","4","43","16","14","200","50"],
["*","3","43","37","15","200","50"],
["*","2","43","23","42","200","50"],
["*","1","43","43","49","200","50"],
["*","2","58","34","10","300","50"],
...

It would be fantastic to get household serial numbers to come through the API for 2018 correctly as this would enable use of replicate weights, appropriate variance estimation in modeling, etc. I haven't noticed this issue for earlier years. Thanks for all your hard work on this, I appreciate it!
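Until this is fixed, a client can at least detect the problem before grouping person records into households. The sketch below reuses the first rows of the response quoted above; `masked_fraction` is a hypothetical helper.

```python
import json

# First rows of the response quoted above, where SERIALNO comes back as "*".
payload = """[["SERIALNO","SPORDER","WGTP","PWGTP","AGEP","PUMA","ST"],
["*","1","42","43","58","100","50"],
["*","4","43","16","14","200","50"]]"""


def masked_fraction(payload, column="SERIALNO", mask="*"):
    """Return the share of data rows whose `column` holds the mask value."""
    header, *rows = json.loads(payload)
    idx = header.index(column)
    if not rows:
        return 0.0
    return sum(row[idx] == mask for row in rows) / len(rows)


if masked_fraction(payload) > 0:
    print("SERIALNO is masked; households cannot be reconstructed from this response")
```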

Document geographic variants in GEOIDs

Data returned from data.census.gov and api.census.gov can include a GEO_ID value which is a key for linking data across queries, as well as linking to GIS data.

However, the structure of some of these GEO_IDs is obscure, and should be documented more clearly. There is a page, Understanding Geographic Identifiers (GEOIDs), which is a good start. However, the only mention it makes of the "geographic variant" is pointing out that it is 00 in an example GEO_ID.

However, for several geographies, the GEO_ID values returned have something other than 00 in that position.

  • Core-Based Statistical Areas (CBSAs) and Combined Statistical Areas (CSAs) have M# with different numeric values for #
  • State Legislative Districts - Lower have L# with different numeric values for #
  • State Legislative Districts - Upper have U# with different numeric values for #

There may be more; these are the ones I've found.

Whether in the API or some other source, there ought to be a complete enumeration of possible geographic variants with reference information about their sources. At a minimum, a reference from a given variant to a given TIGER vintage should be documented.

For CBSA/CSAs, which are formally delineated as lists of one or more counties, it would be valuable to be able to link the variant to a specific delineation. Experimentally, I've worked out that M5 is the Sep. 2018 delineation, and M4 is the Apr. 2018 delineation, but it ought to be systematically published.

For the State Legislative Districts, the API returns names which include a year in parentheses, and one can infer that L6 and U6 are the 2018 legislative year, and L5 and U5 are the 2016 legislative year, but, again, this ought to be made explicit.
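For reference, the layout described on the Understanding Geographic Identifiers page (three characters of summary level, two of geographic variant, two of geographic component, the literal US, then the GEOID) can be split as in the sketch below. The example strings are illustrative, and the variant table records only the experimental findings above, not an official mapping.

```python
from typing import NamedTuple


class GeoId(NamedTuple):
    summary_level: str
    variant: str
    component: str
    geoid: str


# Inferred experimentally in this issue; not an official mapping.
CBSA_VARIANT_DELINEATION = {"M5": "Sep. 2018", "M4": "Apr. 2018"}


def parse_geo_id(geo_id):
    """Split a GEO_ID into summary level, variant, component and GEOID."""
    prefix, sep, geoid = geo_id.partition("US")
    if not sep or len(prefix) != 7:
        raise ValueError(f"unexpected GEO_ID layout: {geo_id!r}")
    return GeoId(prefix[:3], prefix[3:5], prefix[5:7], geoid)


tract = parse_geo_id("1400000US01001020100")  # variant "00"
cbsa = parse_geo_id("310M500US35620")         # variant "M5"
print(tract)
print(cbsa, CBSA_VARIANT_DELINEATION.get(cbsa.variant))
```

A published enumeration of variants would let the lookup table above be generated rather than reverse-engineered.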

FR: API to return tree structure for ACS table

Here's an example.

Table B11002 has a structure like the following, which you can only see via the Census interface:

[image: screenshot of the B11002 table structure]

It is really hard to see this structure and level if you only download data via API. You have to go over a long table and each level is separated with !!
https://api.census.gov/data/2018/acs/acs5/groups/B11002.html


It is also really error-prone and I can't recall how many times I've seen people get white population when attempting to get the non-Hispanic white.

It would be great if there were an API that returned a tree structure resembling what users see via the interface. For example:

Estimate
├── Total -- B11002_001E
└── Total
    ├── In family households -- B11002_002E
    ├── In family households
    │   ├── In married-couple family -- B11002_003E
    │   ├── In married-couple family
    │   │   ├── Relatives -- B11002_004E
    │   │   └── Nonrelatives -- B11002_005E
    │   ├── In male householder, no wife present, family -- B11002_006E
    │   ├── In male householder, no wife present, family
    │   │   ├── Relatives -- B11002_007E
    │   │   └── Nonrelatives -- B11002_008E
    │   ├── In female householder, no husband present, family -- B11002_009E
    │   └── In female householder, no husband present, family
    │       ├── Relatives -- B11002_010E
    │       └── Nonrelatives -- B11002_011E
    └── In nonfamily households -- B11002_012E
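In the meantime, a client can rebuild this hierarchy by splitting each label on !!. The sketch below uses a few labels reconstructed from the tree above rather than fetched from the API; `build_tree` and `render` are hypothetical helpers.

```python
# A few B11002 labels, reconstructed from the tree above (levels joined by "!!").
LABELS = {
    "B11002_001E": "Estimate!!Total",
    "B11002_002E": "Estimate!!Total!!In family households",
    "B11002_003E": "Estimate!!Total!!In family households!!In married-couple family",
    "B11002_004E": "Estimate!!Total!!In family households!!In married-couple family!!Relatives",
    "B11002_012E": "Estimate!!Total!!In nonfamily households",
}


def build_tree(labels):
    """Nest variables by the '!!'-separated parts of their labels."""
    root = {"variable": None, "children": {}}
    for variable, label in labels.items():
        node = root
        for part in label.split("!!"):
            node = node["children"].setdefault(
                part, {"variable": None, "children": {}}
            )
        node["variable"] = variable  # the last part names this variable
    return root


def render(node, indent=0):
    """Return one indented line per node, depth-first."""
    lines = []
    for name, child in node["children"].items():
        suffix = f" -- {child['variable']}" if child["variable"] else ""
        lines.append("  " * indent + name + suffix)
        lines.extend(render(child, indent + 1))
    return lines


print("\n".join(render(build_tree(LABELS))))
```

This works only as long as every label encodes its full path, which is one more reason to have the API publish the hierarchy explicitly.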

Some 2005-2009 ACS variables not migrated to new endpoint

Referencing walkerke/tidycensus#424, we see that the old API endpoint for the 2005-2009 ACS still exists at https://api.census.gov/data/2009/acs5, and contains some variables that are not available at the newer endpoint https://api.census.gov/data/2009/acs/acs5. For example:

https://api.census.gov/data/2009/acs5?get=B08101_001E,NAME&for=tract&in=state:01

works,

but https://api.census.gov/data/2009/acs/acs5?get=NAME,B08101_001E&for=state does not.

Posting here to put on your radar for after the holiday. Thanks!

FR: returns all available geo-levels for each variable in ACS Tables

Some variables in some ACS tables are only available for certain geo-levels. It would be nice if you could add a new field to the following json to indicate all the available geo-levels.

https://api.census.gov/data/2018/acs/acs5/variables/B01001A_001E.json

{
  "name": "B01001A_001E",
  "label": "Estimate!!Total",
  "concept": "SEX BY AGE (WHITE ALONE)",
  "predicateType": "int",
  "group": "B01001A",
  "limit": 0,
  "attributes": "B01001A_001EA,B01001A_001M,B01001A_001MA"
}

Include geographic limitations in ACS group metadata

While the ACS 5-year is often described as having "data for all geographies down to block group," there are a number of tables for which data are not available for all geographic levels.

This is laid out in detail in a spreadsheet. It would be great to be able to get these restrictions as part of the group metadata.

A minimum would be to simply add the text from column C as a field in the per-group metadata; my preference would be an explicit list of summary levels (codes and/or text identifiers as used in the API) that are either available or not available.
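As a sketch of the preferred form, the per-group metadata could carry an explicit list of summary level codes that clients check before sending a request. The `summaryLevels` field below does not exist in the API, and the codes listed for the group are placeholders rather than the actual restrictions from the spreadsheet.

```python
# Hypothetical group metadata: "summaryLevels" is NOT a real API field, and
# the codes listed are placeholders, not this table's actual restrictions.
group_meta = {
    "name": "B01001A",
    "description": "SEX BY AGE (WHITE ALONE)",
    "summaryLevels": ["040", "050", "140"],
}


def is_available(meta, sumlev):
    """Return True/False if the metadata lists summary levels, else None."""
    levels = meta.get("summaryLevels")
    if levels is None:
        return None  # no restriction info published for this group
    return str(sumlev).zfill(3) in levels


print(is_available(group_meta, "140"))
print(is_available(group_meta, "150"))
```

Returning None for groups without the field lets a client tell "unknown" apart from "not available".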
