chekos / pypums

Download Public Use Microdata Sample (PUMS) data files from the US Census Bureau's FTP server.

Home Page: https://pypums.readthedocs.io

License: Apache License 2.0

Python 100.00%
census census-api python

pypums's Introduction

pypums

Usage

PyPUMS can be used in a project in two ways: in a Jupyter notebook or as a CLI.

Installation

Install this library using pip:

pip install pypums

Development

To contribute to this library, first check out the code. Then create a new virtual environment:

cd pypums
python -m venv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest

📃 Citation

@misc{pypums,
  author = {chekos},
  title = {Download Public Use Micro Sample (PUMS) data files from the US Census Bureau's FTP server.},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/chekos/pypums}}
}

pypums's People

Contributors

chekos, dependabot[bot], pyup-bot, sourcery-ai-bot, yonran

Forkers

yonran

pypums's Issues

Add 2020 data

🚀 Feature Request

The Census Bureau did not release 2020 1-year ACS data because of COVID, but the data is available under the experimental section (with experimental weights). There is a 5-year PUMS dataset, though.

📎 Additional context

Add the experimental flag.

`--sample-unit` causes an error

🐛 Bug Report

🔬 How To Reproduce

Steps to reproduce the behavior:

  1. Run pypums download-acs --year 2018 --state alaska --sample-unit household in the mybinder environment.

Code sample

pypums download-acs --year 2018 --state alaska --sample-unit household

Environment

  • OS: Linux
  • Python version: 3.7

📈 Expected behavior

Download a household-level dataset

📎 Additional context

Error:

jovyan@jupyter-chekos-2dpypums-2dgdr3re8j:~$ pypums download-acs --year 2018 --state alaska --sample-unit household
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/bin/pypums", line 8, in <module>
    sys.exit(cli())
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/typer/main.py", line 500, in wrapper
    return callback(**use_params)  # type: ignore
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/pypums/cli.py", line 94, in download_acs
    _download_data(url, "acs", data_directory, extract)
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/pypums/utils.py", line 125, in _download_data
    total = int(response.headers["Content-Length"])
  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/httpx/_models.py", line 993, in __getitem__
    raise KeyError(key)
KeyError: 'Content-Length'
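
The traceback shows _download_data indexing response.headers["Content-Length"] directly, which raises KeyError when the server does not send that header. A minimal sketch of a defensive lookup follows. The helper name content_length_or_none is hypothetical and is not part of pypums; it only illustrates one way the lookup could tolerate a missing header.

```python
from typing import Mapping, Optional


def content_length_or_none(headers: Mapping[str, str]) -> Optional[int]:
    """Return the declared body size in bytes, or None if absent or invalid.

    Hypothetical helper for illustration; not taken from pypums.
    """
    # .get() returns None instead of raising KeyError when the key is missing.
    # Note: a plain dict is case-sensitive, so the key must match exactly here.
    raw = headers.get("Content-Length")
    if raw is None:
        return None
    try:
        size = int(raw)
    except ValueError:
        # A malformed value is treated the same as a missing header.
        return None
    return size if size >= 0 else None
```

With a helper like this, a caller could show a progress bar with a known total when a size is returned and fall back to an indeterminate one when the result is None.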

change CLI assumed data directory

Description

The CLI assumes the data directory is ../data/. This assumption holds when pypums is used in a Jupyter notebook (for example, one in the notebooks folder) but not when it is used as a CLI.

Change

The default should be changed to ./data/ for the CLI.
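
The proposed change can be sketched as a single function that picks the default based on how the package is being run. The function name default_data_directory and its running_as_cli parameter are assumptions made for illustration, not the package's actual interface.

```python
from pathlib import Path


def default_data_directory(running_as_cli: bool) -> Path:
    """Pick the default data directory described in the issue.

    Hypothetical helper for illustration; not taken from pypums.
    """
    # CLI: ./data/ relative to the current working directory.
    # Notebook: ../data/, assuming the notebook lives in a notebooks folder.
    return Path("data") if running_as_cli else Path("..") / "data"
```

Both paths are relative, so they resolve against whatever directory the process is started from.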

Move package to hyper-modern-python-package template

Description

The template at https://github.com/TezRomacH/python-package-template would be a big improvement.

According to its README, it comes with tooling prepared for development, for building and deployment, and for creating an open source community.

Moving to simonw/python-lib

Reference: https://github.com/simonw/python-lib/

🔈 Motivation

Moving to the hyper-modern-python-package template was a mistake. I do not use Poetry at all, and the template comes with so much extra automation that I did not set up myself that I find the package hard to develop. This package is small enough that I can rewrite it from scratch and set it up in a way that is actually easier to develop.

fix __repr__ for invalid surveys

  • pypums version: 0.0.4
  • Python version: 3.7
  • Operating System: MacOS

Description

When creating an instance of ACS with an invalid survey type (3-year in 2016, for example), a message tells the user that there is no 3-year ACS for 2016 and that the survey will default to 5-year. However, the repr still shows 3-year. It should change to 5-year when that is the defaulted value, to avoid confusion.
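
One way to get the behavior described here is to store the survey value that is actually used, so that the repr reads from it. The sketch below uses a hypothetical SurveyRequest class, not the real ACS class, and assumes for illustration that a 3-Year product exists only for end years 2007 through 2013.

```python
class SurveyRequest:
    """Hypothetical stand-in for an ACS-like object; not the pypums class."""

    # Assumption for illustration: end years that have a 3-Year product.
    THREE_YEAR_END_YEARS = range(2007, 2014)

    def __init__(self, year: int, survey: str = "1-Year") -> None:
        self.year = year
        if survey == "3-Year" and year not in self.THREE_YEAR_END_YEARS:
            print(f"There is no 3-Year ACS for {year}; defaulting to 5-Year.")
            survey = "5-Year"
        # Store the value actually used so __repr__ reflects the fallback.
        self.survey = survey

    def __repr__(self) -> str:
        return f"SurveyRequest(year={self.year}, survey='{self.survey}')"
```

Because the fallback is applied before the attribute is assigned, the message and the repr cannot disagree.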

year < 2006 still uses "1-Year" in URL

  • pypums version: 0.0.4
  • Python version: 3.6+
  • Operating System: MacOS

Description

When constructing ACS()._SURVEY_URL, surveys from before 2006 should not have "1-Year" in their URL. The if statement in pypums.surveys skips this case when the current value is 1-year, which is the default.
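
A rough sketch of the intended logic is to decide on the year first, so the default survey value cannot bypass the check. The function survey_url, its parameters, and the URL layout are assumptions for illustration; the 2006 cutoff is taken from the issue text, and the base URL in the test values is a placeholder rather than the Census Bureau's address.

```python
def survey_url(base_url: str, year: int, survey: str = "1-Year") -> str:
    """Build a survey URL, omitting the survey segment for years before 2006.

    Hypothetical helper for illustration; not taken from pypums.
    """
    parts = [base_url.rstrip("/"), str(year)]
    # Check the year before looking at the survey value, so the default
    # "1-Year" cannot short-circuit the pre-2006 case.
    if year >= 2006:
        parts.append(survey)
    return "/".join(parts) + "/"
```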
