
tidyqwi's Introduction


tidyqwi

The goal of tidyqwi is to make it easier to access the US Census Bureau’s Quarterly Workforce Indicators (QWI) in a tidy format. The package lets a user specify the years and states of interest, along with additional parameters (desired cross-tabs, MSA- vs. county-level data, firm size, etc.), and submits the request to the US Census API. It stays within the US Census guidelines for API submissions and returns a combined tidy data frame for further analysis.

Installation

Install via CRAN with:

install.packages("tidyqwi")

Or install the development version from GitHub with:

remotes::install_github("medewitt/tidyqwi")

Use

After installation, load the package and retrieve the desired data:

library(tidyqwi)

nc_qwi <- get_qwi(years = "2010", 
                  states = "37",          # FIPS code for North Carolina, matching the output below
                  geography = "county", 
                  apikey =  census_key,   # your US Census API key
                  endpoint = "rh",
                  variables = c("sEmp", "Emp"), all_groups = FALSE,
                  industry_level = "2", processing = "multiprocess")
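
The call above assumes that `census_key` already holds a US Census API key. A minimal sketch of one way to supply it without hard-coding it in a script is shown here; the environment variable name `CENSUS_API_KEY` is an assumption for illustration, not something tidyqwi requires, and the key should be defined before calling `get_qwi()`.

```r
# Read the key from an environment variable (the variable name is an assumption).
census_key <- Sys.getenv("CENSUS_API_KEY")

# Sys.getenv() returns "" when the variable is unset, so check before calling the API.
if (!nzchar(census_key)) {
  message("CENSUS_API_KEY is not set; request a key from the US Census Bureau first.")
}
```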

And look at your data:

head(nc_qwi)
#> # A tibble: 6 × 12
#>   year  quarter agegrp sex   ownercode seasonadj industry state county Emp   sEmp  year_time 
#>   <chr> <chr>   <chr>  <chr> <chr>     <chr>     <chr>    <chr> <chr>  <chr> <chr> <date>    
#> 1 2010  1       A00    0     A00       U         11       37    001    45    1     2010-01-01
#> 2 2010  1       A00    0     A00       U         11       37    003    101   1     2010-01-01
#> 3 2010  1       A00    0     A00       U         11       37    005    82    1     2010-01-01
#> 4 2010  1       A00    0     A00       U         11       37    007    207   1     2010-01-01
#> 5 2010  1       A00    0     A00       U         11       37    009    104   1     2010-01-01
#> 6 2010  1       A00    0     A00       U         11       37    011    77    1     2010-01-01

Labels can be added if desired:

labelled_nc <- add_qwi_labels(nc_qwi)
Hmisc::describe(labelled_nc$Emp)
#> labelled_nc$Emp : Beginning-of-Quarter Employment: Counts 
#>        n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95 
#>     7345      411     2851        1     2018     3129       24       40      132      448     1550     4355     8099 
#> 
#> lowest :     0     1     3     4     5, highest: 65243 81884 82723 84038 84674

library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

labelled_nc %>%
  as_tibble() %>% 
  dplyr::filter(county == "067") %>% 
  ggplot(aes(year_time, Emp, color = county))+
  geom_line()+
  scale_y_log10()+
  facet_wrap(~industry)+
  labs(
    title = "Quarterly Workforce Indicators for Forsyth County",
    subtitle = attributes(labelled_nc$Emp)$label,
    caption = "Data: US Census Bureau QWI",
    x = "Month"
  )+
  theme_minimal()
#> Warning in vp$just: partial match of 'just' to 'justification'

Please note that the ‘tidyqwi’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

tidyqwi's People

Contributors

medewitt, monaahmadiani, mrembert, twedl, adamhyde


tidyqwi's Issues

State sequence loop

The function does not loop over states. It only looks at the single state in the sequence that the function call reaches.
The fix should make the following documented behaviour work:
#'@param states state fips code to fetch (e.g. 37, or c(37, 13))
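
As a rough illustration of the documented behaviour, the sketch below normalises state codes such as `37` or `c(37, 13)` to two-digit FIPS strings and then visits each one in turn. `normalize_fips()` is a hypothetical helper, not part of tidyqwi, and the loop body is a placeholder for the per-state request.

```r
# Hypothetical helper: accept numeric or character codes, return two-digit strings.
normalize_fips <- function(states) {
  sprintf("%02d", as.integer(states))
}

states <- normalize_fips(c(37, 13, 1))

# Visit every state instead of only one of them.
for (st in states) {
  # A real implementation would build and send the request for `st` here.
  message("Would request state: ", st)
}
```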

add helper function to convert to proper tsibble

The tsibble package is immensely powerful and allows data frames to be converted to time series objects. This will be helpful for working with forecast and the other time series packages.

I'm thinking of two or more functions: one that converts to a tsibble and another that converts to a standard ts object.
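
A minimal sketch of what the two helpers might look like follows. The function names and arguments are hypothetical, the column names (`year`, `quarter`, `year_time`, `state`, `county`, `industry`) are taken from the README output, and the chosen key is an assumption: it must uniquely identify each series, so other endpoints or cross-tabs would need extra key columns.

```r
# Hypothetical helper: convert ONE series (e.g. one county and industry)
# to a base-R quarterly `ts` object.
qwi_to_ts <- function(df, value = "Emp") {
  # QWI columns arrive as character, so coerce before sorting.
  yr  <- as.integer(df$year)
  qtr <- as.integer(df$quarter)
  ord <- order(yr, qtr)
  stats::ts(as.numeric(df[[value]])[ord],
            start = c(yr[ord][1], qtr[ord][1]),
            frequency = 4)
}

# Hypothetical helper: convert the whole data frame to a tsibble.
# Requires the tsibble package; the default key is an assumption.
qwi_to_tsibble <- function(df, key = c("state", "county", "industry")) {
  if (!requireNamespace("tsibble", quietly = TRUE)) {
    stop("Package 'tsibble' is required for this conversion.")
  }
  df$year_quarter <- tsibble::yearquarter(df$year_time)
  tsibble::as_tsibble(df, key = tidyselect::all_of(key), index = year_quarter)
}
```

Keeping the `ts` helper to a single series avoids guessing how multiple counties or industries should be spread across columns.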

more informative error message when census api fails

We need to update check_census_api_call to something like the following, to make it clear when the failure comes from the Census API and not from the package:

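check_census_api_call <- function(call){
  # inherits() is safer than comparing class() directly, since class() can have length > 1
  if(!inherits(call, "response")){
    stop("A valid response was not returned")
  }
  returned_call <- httr::content(call, as = "text", encoding = "UTF-8")

  # show_condition() is the package's internal helper; it is assumed here to
  # return "error" when the wrapped expression fails
  if(show_condition(xml2::as_xml_document(returned_call)) != "error"){
    # The body parses as XML/HTML: extract its text and trim the whitespace
    returned_call %>%
      xml2::as_xml_document() %>%
      xml2::xml_find_all("body") %>%
      xml2::xml_text() %>%
      gsub(pattern = "\\s{2}", replacement = "") %>%
      gsub(pattern = "^\\s", replacement = "")
  } else{
    stop(paste0("The following is a message from the US Census API:\n", returned_call))
  }
}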

Fix for dplyr version 1.0.0

[ ] Trace the error (maybe something with bind_rows)
[ ] Fix it

The error message:

== CHECK RESULTS ========================================

  • checking tests ...
     ERROR
    Running the tests in ‘tests/testthat.R’ failed.
    Last 13 lines of output:
      >
      > test_check("tidyqwi")
      ── 1. Error: Try that labels are added (@test.R#97)  ───────────────────────────
x must be a vector, not a tbl_df/tbl/data.frame/qwi object.
Backtrace:
1. tidyqwi::add_qwi_labels(nc_qwi)
15. vctrs:::stop_scalar_type(...)
16. vctrs:::stop_vctrs(msg, "vctrs_error_scalar_type", actual = x)

  ══ testthat results  ═══════════════════════════════════════════════════════════
[ OK: 19 | SKIPPED: 1 | WARNINGS: 0 | FAILED: 1 ]
1. Error: Try that labels are added (@test.R#97)

  Error: testthat unit tests failed
  Execution halted

quarters

Always request all four quarters in a single call; this reduces the number of requests substantially.
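
To make the saving concrete, here is a small sketch that builds the repeated `quarter` query fragment and compares request counts. Both function names are hypothetical and only illustrate the idea; they are not part of tidyqwi.

```r
# Hypothetical helper: build "&quarter=1&quarter=2&quarter=3&quarter=4".
quarter_query <- function(quarters = 1:4) {
  quarters <- sort(unique(as.integer(quarters)))
  stopifnot(all(quarters %in% 1:4))
  paste0("&quarter=", quarters, collapse = "")
}

# Hypothetical helper: count requests for a state x year x industry grid.
n_requests <- function(n_states, n_years, n_industries, per_quarter = FALSE) {
  n_states * n_years * n_industries * if (per_quarter) 4 else 1
}

quarter_query()
n_requests(51, 1, 20, per_quarter = TRUE)  # one request per quarter
n_requests(51, 1, 20)                      # all four quarters per request
```

Requesting all four quarters at once cuts the request count to a quarter of the per-quarter approach.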

CRAN Notice

Dear maintainer,

Please see the problems shown on
https://cran.r-project.org/web/checks/check_results_tidyqwi.html.

Please correct before 2024-05-12 to safely retain your package on CRAN.

It seems we need to remind you of the CRAN policy:

'Packages which use Internet resources should fail gracefully with an informative message
if the resource is not available or has changed (and not give a check warning nor error).'

This needs correction whether or not the resource recovers.

The CRAN Team
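
One possible way to meet the policy quoted above is to wrap each network call so that a failure produces a message and a `NULL` result rather than an error or warning. The sketch below is an assumption about how that could look, not the package's actual fix; `fail_gracefully()` and its arguments are hypothetical names.

```r
# Hypothetical wrapper: `request_fun` is any zero-argument function that
# performs the download (for example, a closure around httr::GET()).
fail_gracefully <- function(request_fun, resource = "the US Census API") {
  tryCatch(
    request_fun(),
    error = function(e) {
      # Emit a message (not a warning or an error) and return NULL.
      message("Could not retrieve data from ", resource, ": ", conditionMessage(e))
      invisible(NULL)
    }
  )
}

ok  <- fail_gracefully(function() 42)
bad <- fail_gracefully(function() stop("connection timed out"))
```

Callers, examples, and tests would then check for `NULL` and return early instead of failing.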

Recommendation for modifying get_qwi for collecting variables based on different values of firmsize and firmage

For firmage:

# Assumes `state`, `year`, `industry` (character vectors) and `census_key` are defined
collector <- list()

for (st in state) {
  for (yr in year) {
    for (ind in industry) {

      url <- paste0("https://api.census.gov/data/timeseries/qwi/sa?get=",
                    "Emp,EmpEnd,HirA,Sep,FrmJbC,FrmJbCS,FrmJbGn,FrmJbLs,sEmp,sEmpEnd,sHirA,sSep,sFrmJbC,sFrmJbCS,sFrmJbGn,sFrmJbLs,EarnS,EarnBeg,EarnHirAS,EarnHirNS,EarnSepS,sEarnS,sEarnBeg,sEarnHirAS,sEarnHirNS,sEarnSepS",
                    "&for=metropolitan+statistical+area/micropolitan+statistical+area",
                    "&in=state:", st,
                    "&year=", yr,
                    "&quarter=1&quarter=2&quarter=3&quarter=4",
                    "&sex=0",
                    "&agegrp=A00",
                    "&ownercode=A05",
                    "&firmage=1&firmage=2&firmage=3&firmage=4&firmage=5",
                    "&seasonadj=U",
                    "&industry=", ind,
                    "&key=", census_key)

      call <- httr::GET(url)

      if (!call$status_code %in% c(200, 202)) {
        # Any status other than 200/202 is treated as a failure:
        # report it first, then skip to the next request
        print(call$status_code)
        print(url)
        next
      }

      # Parse the JSON body; the first row holds the column names
      dat <- jsonlite::fromJSON(httr::content(call, as = "text", encoding = "UTF-8"))
      colnames(dat) <- dat[1, ]
      dat <- tibble::as_tibble(dat[-1, , drop = FALSE])

      # Store each result and bind once after the loops
      collector[[length(collector) + 1]] <- dat
    }
  }
}

collect_industry <- dplyr::bind_rows(collector)

And for firmsize:

# Assumes `state`, `year`, `industry` (character vectors) and `census_key` are defined
collector <- list()

for (st in state) {
  for (yr in year) {
    for (ind in industry) {

      url <- paste0("https://api.census.gov/data/timeseries/qwi/sa?get=",
                    "Emp,EmpEnd,HirA,Sep,FrmJbC,FrmJbCS,FrmJbGn,FrmJbLs,sEmp,sEmpEnd,sHirA,sSep,sFrmJbC,sFrmJbCS,sFrmJbGn,sFrmJbLs,EarnS,EarnBeg,EarnHirAS,EarnHirNS,EarnSepS,sEarnS,sEarnBeg,sEarnHirAS,sEarnHirNS,sEarnSepS",
                    "&for=metropolitan+statistical+area/micropolitan+statistical+area",
                    "&in=state:", st,
                    "&year=", yr,
                    "&quarter=1&quarter=2&quarter=3&quarter=4",
                    "&sex=0",
                    "&agegrp=A00",
                    "&ownercode=A05",
                    "&firmsize=1&firmsize=2&firmsize=3&firmsize=4&firmsize=5",
                    "&seasonadj=U",
                    "&industry=", ind,
                    "&key=", census_key)

      call <- httr::GET(url)

      if (!call$status_code %in% c(200, 202)) {
        # Any status other than 200/202 is treated as a failure:
        # report it first, then skip to the next request
        print(call$status_code)
        print(url)
        next
      }

      # Parse the JSON body; the first row holds the column names
      dat <- jsonlite::fromJSON(httr::content(call, as = "text", encoding = "UTF-8"))
      colnames(dat) <- dat[1, ]
      dat <- tibble::as_tibble(dat[-1, , drop = FALSE])

      # Store each result and bind once after the loops
      collector[[length(collector) + 1]] <- dat
    }
  }
}

collect_industry <- dplyr::bind_rows(collector)
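
Since the two loops differ only in the `firmage` versus `firmsize` fragment of the URL, that fragment could be generated rather than typed out. The following is a sketch under that assumption; `firm_query()` is a hypothetical helper, and the levels 1 to 5 simply mirror the values used in the URLs above.

```r
# Hypothetical helper: build a repeated query parameter such as
# "&firmage=1&firmage=2" for one firm characteristic.
firm_query <- function(dimension = c("firmage", "firmsize"), levels = 1:5) {
  dimension <- match.arg(dimension)
  paste0("&", dimension, "=", levels, collapse = "")
}

firm_query("firmage")
firm_query("firmsize", levels = 1:3)
```

With a helper like this, a single loop could take the firm characteristic as an argument instead of being duplicated.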

tibble instead of data_frame

We should modify the code to use dplyr::tibble() instead of the deprecated dplyr::data_frame(). Please see the code and error below:

state <- c("01","02","04","05","06","08","09","10","11","12",
           "13","15","16","17","18","19","20","21","22","23",
           "24","25","26","27","28","29","30","31","32","33",
           "34","35","36","37","38","39","40","41","42","44",
           "45","46","47","48","49","50","51","53","54","55",
           "56")

tidyqwi::get_qwi(years = year,
        states = state,
        geography = "county",
        apikey = census_key,
        endpoint = "se",
        variables = c("Emp","EmpEnd","HirA","Sep", "sEmp","sEmpEnd","sHirA","sSep"), all_groups = FALSE,
        industry_level = "2", processing = "sequential")

Error in eval_tidy(enquo(var), var_env) : object 'parameter' not found
In addition: Warning message:
as_data_frame() is deprecated as of tibble 2.0.0.
Please use as_tibble() instead.
The signature and semantics have changed, see ?as_tibble.
This warning is displayed once every 8 hours.
Call lifecycle::last_warnings() to see where this warning was generated.
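
As an illustration of the suggested replacement, the sketch below swaps the deprecated constructors for their tibble equivalents. The small matrix is made-up sample data shaped like a parsed Census response (first row holds the column names). This only addresses the deprecation warning; it does not attempt to diagnose the `object 'parameter' not found` error.

```r
library(tibble)

# Made-up sample shaped like a parsed response: the first row is the header.
raw <- matrix(c("Emp", "state",
                "45",  "37",
                "101", "37"),
              ncol = 2, byrow = TRUE)

colnames(raw) <- raw[1, ]
dat <- as_tibble(raw[-1, , drop = FALSE])  # instead of dplyr::as_data_frame()

empty <- tibble()                          # instead of dplyr::data_frame()
```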

Possible to consolidate license files?

RE: JOSS review openjournals/joss-reviews#1462, is there a way to combine the LICENSE and LICENSE.md files into one plain-text (.md or other) license file?

I'm not sure what the CRAN license file requirements are, but LICENSE doesn't say what type of license it is, and LICENSE.md says MIT, so one consistent file would avoid the confusion 🙏.
