idr-scraper's Introduction

This tool consists of two data collection processes.

The first scrapes data from the Illinois Department of Revenue's (IDOR) Standard Industrial Classification (SIC) Reporting page and calculates the total taxable retail sales revenue for municipalities and counties. This process is found within the ret_sales folder of this repository.

The second pulls table 28 for a given year from IDOR's tax statistics database, extracts equalized assessed value (EAV) figures for the northeast region of Illinois by municipality, and converts them into a machine-readable format. This process is found within the eav folder of this repository.

The tool follows the task-based workflow approach used by the Human Rights Data Analysis Group (HRDAG), albeit in a less mature form. It does not use makefiles to run the tasks. Instead, the user must run them manually in the following order:

  1. /ret_sales/import
  2. /ret_sales/transform
  3. /eav
  4. /export
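
The ordering above could be captured in a small driver. The following is only a minimal sketch and is not part of the repository: `TASK_ORDER`, `run_tasks`, and the `run_one` callback are hypothetical names, and how each task folder is actually executed is left to the caller because the README does not say.

```python
from pathlib import Path
from typing import Callable, List

# Task folders in the order listed above, relative to the repository root.
TASK_ORDER = ["ret_sales/import", "ret_sales/transform", "eav", "export"]


def run_tasks(repo_root: Path, run_one: Callable[[Path], None]) -> List[Path]:
    """Call run_one on each task folder in order.

    run_one is a hypothetical callback supplied by the caller; if it raises,
    the remaining tasks are not run.
    """
    completed = []
    for task in TASK_ORDER:
        task_dir = repo_root / task
        run_one(task_dir)  # an exception here aborts the remaining tasks
        completed.append(task_dir)
    return completed
```

For a dry run, `run_tasks(Path("."), print)` lists the folders in the order they would be processed.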

The config.yaml file specifies which year (or years) the process pulls data for. Historic pulls are limited by data availability. EAV data has only been tested from 2019 to the present, though table 28 is available for prior years. Retail sales revenue currently works only for years after 1999.
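
The year limits above could be checked before a run. This is a sketch under stated assumptions: `check_years` and the two constants are hypothetical names, "years after 1999" is read as 2000 onward, and the years are passed in as plain integers because the layout of config.yaml is not documented here.

```python
from typing import Dict, Iterable, List

# Assumption: "years after 1999" means 2000 is the first supported year.
RET_SALES_FIRST_YEAR = 2000
# EAV pulls have only been tested from 2019; earlier years are untested.
EAV_FIRST_TESTED_YEAR = 2019


def check_years(years: Iterable[int]) -> Dict[int, List[str]]:
    """Map each requested year to a list of caveats (empty if none apply)."""
    notes = {}
    for year in years:
        caveats = []
        if year < RET_SALES_FIRST_YEAR:
            caveats.append("retail sales: not supported")
        if year < EAV_FIRST_TESTED_YEAR:
            caveats.append("EAV: untested")
        notes[year] = caveats
    return notes
```

For example, `check_years([1998, 2010, 2021])` flags both caveats for 1998, only the EAV caveat for 2010, and nothing for 2021.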

idr-scraper's People

Contributors

ethanjantz

idr-scraper's Issues

Fix file paths

I forgot to update the file paths when I copied this from its original project. This needs to be fixed.

Improve workflow

I would like to eventually implement this process in line with HRDAG's task-based workflow approach. It currently follows that approach very loosely.

  • Implement makefiles
  • Move config.yaml to a hand folder or otherwise change the way selecting years works

Update README

The README currently provides little detail on this tool or what it does. This should be addressed.
