ropensci / usaboundaries

Historical and Contemporary Boundaries of the United States of America

Home Page: https://docs.ropensci.org/USAboundaries

License: Other

R 92.76% TeX 6.33% Makefile 0.90%
spatial-data history digital-history r rstats r-package

usaboundaries's Introduction

USAboundaries

Overview

This R package includes contemporary state, county, and Congressional district boundaries, as well as zip code tabulation area centroids. It also includes historical boundaries from 1629 to 2000 for states and counties from the Newberry Library’s Atlas of Historical County Boundaries, as well as historical city population data from Erik Steiner’s “United States Historical City Populations, 1790-2010.” The package has some helper data, including a table of state names, abbreviations, and FIPS codes, and functions and data to get State Plane Coordinate System projections as EPSG codes or PROJ.4 strings.

This package can serve a number of purposes. The spatial data can be joined to any other kind of data in order to make thematic maps. Unlike other R packages, this package also contains historical data for use in analyses of the recent or more distant past. See the “A sample analysis using USAboundaries” vignette for an example of how the package can be used for both historical and contemporary maps.
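
The join described above can be sketched in base R. This is an illustration under stated assumptions, not the package's own workflow: the `boundaries` table is a toy stand-in for the attribute columns of a boundaries object, the key column name `stusps` is borrowed from the `us_counties()` output shown elsewhere on this page, and the `rate` values are invented.

```r
# Toy stand-in for the attribute columns of a boundaries table. With the
# package installed, an object from us_states() could take its place
# (an assumption, not something this sketch verifies).
boundaries <- data.frame(
  stusps = c("VA", "OH", "CA"),
  name   = c("Virginia", "Ohio", "California"),
  stringsAsFactors = FALSE
)

# Hypothetical values to map; these numbers are invented for illustration.
values <- data.frame(
  stusps = c("CA", "VA"),
  rate   = c(4.2, 3.1),
  stringsAsFactors = FALSE
)

# Left join: keep every boundary row and fill NA where no value exists.
joined <- merge(boundaries, values, by = "stusps", all.x = TRUE)
print(joined)
```

Keeping every boundary row (`all.x = TRUE`) means states without data still appear on a map, just without a value.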

Citation

If you use this package in your research, we would appreciate a citation.

citation("USAboundaries")
#> 
#> To cite the USAboundaries package in publications, please cite the
#> paper in the Journal of Open Source Software:
#> 
#>   Lincoln A. Mullen and Jordan Bratt, "USAboundaries: Historical and
#>   Contemporary Boundaries of the United States of America," Journal of
#>   Open Source Software 3, no. 23 (2018): 314,
#>   https://doi.org/10.21105/joss.00314.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Article{,
#>     title = {{USAboundaries}: Historical and Contemporary Boundaries
#> of the United States of America},
#>     author = {Lincoln A. Mullen and Jordan Bratt},
#>     journal = {Journal of Open Source Software},
#>     year = {2018},
#>     volume = {3},
#>     issue = {23},
#>     pages = {314},
#>     url = {https://doi.org/10.21105/joss.00314},
#>     doi = {10.21105/joss.00314},
#>   }

Installation

You can install this package from CRAN.

install.packages("USAboundaries")

Almost all of the data for this package is provided by the USAboundariesData package. That package will be automatically installed (with your permission) from the rOpenSci package repository the first time that you need it.

Or you can install the development versions from GitHub using remotes.

# install.packages("remotes")
remotes::install_github("ropensci/USAboundaries")
remotes::install_github("ropensci/USAboundariesData")

Use

This package provides a set of functions, one for each of the types of boundaries that are available. These functions have a consistent interface.

Passing a date to us_states() or us_counties() returns the historical boundaries for that date, and passing a date to us_cities() returns the historical city data for that date. If no date argument is passed, contemporary data is returned. The functions us_congressional() and us_zipcodes() offer only contemporary data.

For almost all functions, pass a character vector of state names or abbreviations to the states = argument to return only those states or territories.

For certain functions, more or less detailed boundary information is available by setting the resolution = argument.

See the examples below to see how the interface works, and see the documentation for each function for more details.

library(USAboundaries) 
library(sf) # for plotting and projection methods
#> Linking to GEOS 3.9.1, GDAL 3.3.2, PROJ 8.1.1

states_1840 <- us_states("1840-03-12")
plot(st_geometry(states_1840))
title("U.S. state boundaries on March 12, 1840")

states_contemporary <- us_states()
plot(st_geometry(states_contemporary))
title("Contemporary U.S. state boundaries")

counties_va_1787 <- us_counties("1787-09-17", states = "Virginia")
plot(st_geometry(counties_va_1787))
title("County boundaries in Virginia in 1787")

counties_va <- us_counties(states = "Virginia")
plot(st_geometry(counties_va))
title("Contemporary county boundaries in Virginia")

counties_va_highres <- us_counties(states = "Virginia", resolution = "high")
plot(st_geometry(counties_va_highres))
title("Higher resolution contemporary county boundaries in Virginia")

congress <- us_congressional(states = "California")
plot(st_geometry(congress))
title("Congressional district boundaries in California")

State plane projections

The state_plane() function returns EPSG codes and PROJ.4 strings for the State Plane Coordinate System. You can use these to select a suitable projection for a specific state.

va <- us_states(states = "VA", resolution = "high")
plot(st_geometry(va), graticule = TRUE)

va_projection <- state_plane("VA")
va <- st_transform(va, va_projection)
plot(st_geometry(va), graticule = TRUE)

Related packages

Each function returns an sf object from the sf package, which can be mapped using the leaflet or ggplot2 packages.

If you need U.S. Census Bureau boundary files which are not provided by this package, consider using the tigris package, which downloads those shapefiles.

License

The historical boundary data provided in this package is available under the CC BY-NC-SA 2.5 license from John H. Long, et al., Atlas of Historical County Boundaries, Dr. William M. Scholl Center for American History and Culture, The Newberry Library, Chicago (2010). Please cite that project if you use this package in your research and abide by the terms of their license if you use the historical information.

The historical population data for cities is provided by the U.S. Census Bureau and Erik Steiner, Spatial History Project, Center for Spatial and Textual Analysis, Stanford University. See the data in this repository.

The contemporary data is provided by the U.S. Census Bureau and is in the public domain.

All code in this package is copyright Lincoln Mullen and is released under the MIT license.


usaboundaries's People

Contributors

cboettig, hadley, jeroen, jfbratt, karthik, lmullen, maelle

usaboundaries's Issues

Albers USA projection

Get an Albers USA projection the same way that you can get state plane projections.

Duplicate column names in us_counties()

The object returned by us_counties() has two columns named state_name, which causes problems for functions that require unique names. make.names(unique = TRUE) makes all of the names unique.

Reprex:

usCounties <- USAboundaries::us_counties()
dplyr::left_join(usCounties, usCounties)
#> Error in `dplyr::left_join()`:
#> ! Input columns in `x` must be unique.
#> ✖ Problem with `state_name`.
usCountiesNames <- names(usCounties) |> make.names(unique = TRUE)
usCountiesUniqueNames <- purrr::set_names(usCounties, usCountiesNames)
dplyr::left_join(usCountiesUniqueNames, usCountiesUniqueNames)
#> Joining, by = c("statefp", "countyfp", "countyns", "affgeoid", "geoid", "name",
#> "namelsad", "stusps", "state_name", "lsad", "aland", "awater", "state_name.1",
#> "state_abbr", "jurisdiction_type", "geometry")
#> Error:
#> ! All columns in a tibble must be vectors.
#> ✖ Column `geometry` is a `sfc_MULTIPOLYGON/sfc` object.
#> 
#> Note that this second error is different and can only occur because the error with non-unique column names has been resolved.

Created on 2022-07-28 by the reprex package (v2.0.1)
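
The renaming step in the reprex can be looked at in isolation. The sketch below uses a made-up vector of column names (not the full set returned by us_counties()) to show what make.names(unique = TRUE) does with a repeated name.

```r
# Column names as they might come back with a duplicated `state_name`.
# This short vector is illustrative, not the real us_counties() header.
cols <- c("statefp", "state_name", "aland", "state_name")

# With unique = TRUE, repeated names get a numeric suffix (.1, .2, ...).
fixed <- make.names(cols, unique = TRUE)
print(fixed)

# Confirm that no duplicates remain.
stopifnot(!anyDuplicated(fixed))
```

The suffixed name matches the `state_name.1` column that appears in the join message above.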

County ID

Hello,

I made a SpatialPolygonsDataFrame at the congressional level and was hoping to match these counties to FIPS county codes.

I saw that the SpatialPolygonsDataFrame had an ID slot. I looked at the ID slot for the counties in Ohio. I could not tell what they were or which entry in the object corresponded to which county.

Can you tell me what these slot IDs mean and how to match them with the correct FIPS code for each county?

Here is part of my code:

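ohiocount <- us_boundaries(type = "county", resolution = "high", state = "Ohio")
# Collect the ID slot of each county polygon (sp's Polygons class names this slot ID)
x <- vapply(ohiocount@polygons, function(p) p@ID, character(1))
x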

Ohio 9th congressional district

Hello,

First, let me thank you for developing such a wonderful data set. I have found it to be most useful. I did notice a mistake in the congressional boundaries for Ohio's 9th district in both the high resolution and the low resolution data sets. The boundaries in each case covered only a small subset of the district. It is a strange district on the edge of the lake, which makes it hard to draw, I'm sure.

Best wishes

Can us_cities() return an sf object ?

Is there a reason that us_cities() returns a data frame and not an sf points object? Returning an sf object would make it easier to use us_cities() in combination with us_states(). I tried sf::plot(st_geometry(us_cities())), expecting it to work the same as us_states().

Also, can use of us_cities() be added to the readme?
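
One possible workaround, sketched under assumptions: the data frame below is a toy stand-in for the us_cities() result, and the coordinate column names `lon` and `lat` are guesses that should be checked against names(us_cities()). The conversion itself uses sf::st_as_sf(), and it only runs if sf is installed.

```r
# Toy stand-in for the us_cities() data frame. The column names `lon` and
# `lat` are an assumption; the coordinates are approximate.
cities <- data.frame(
  city = c("Richmond", "Columbus"),
  lon  = c(-77.44, -83.00),
  lat  = c(37.54, 39.96)
)

# Convert to an sf points object (WGS 84) when sf is available.
if (requireNamespace("sf", quietly = TRUE)) {
  cities_sf <- sf::st_as_sf(cities, coords = c("lon", "lat"), crs = 4326)
  plot(sf::st_geometry(cities_sf))
}
```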

Use broom for fortifying

We currently use ggplot's fortify function, which I think is based on plyr, and so is slow. Broom's tidy function seems to use dplyr and so is fast.

Add historical Congressional districts

The historic congressional districts for the first 113 Congresses are available as shapefiles. I'd like to include these.

Questions to decide:

  • Should users retrieve the boundaries by date, as in the current implementation of us_boundaries(), by the number of Congress, or both?
  • Should I include each shapefile individually, or should I merge them and then filter them?

Issues downloading from github

Hi;
Is there an alternate site where I can download USAboundaries or USAboundariesData? I tried both options and got the following error messages:

remotes::install_github("ropensci/USAboundaries")
Downloading GitHub repo ropensci/USAboundaries@HEAD
Running R CMD build...

STDOUT:

STDERR:
Error in library(USAboundaries) :
there is no package called 'USAboundaries'
Calls: suppressPackageStartupMessages -> withCallingHandlers -> library
In addition: There were 13 warnings (use warnings() to see them)
Execution halted
Error: Failed to install 'USAboundaries' from GitHub:
Failed to R CMD build package, try build = FALSE.

I am using R 4.1

Add an example of a real world analysis.

Add to the readme an example of how the package data can be joined to some other data and mapped. This can help new users see how they can use it with their own data.
From the JOSS guidelines for reviewers: "Example usage: The authors should include examples of how to use the software (ideally to solve real-world analysis problems)."

statement of need can be strengthened in the readme

I am sure this is a hugely useful package and that it could help users with lots of spatial analyses. Could a brief discussion of potential uses be added to the readme ? Currently the readme might suggest to a naive reader that it can only be used for plotting.

Add current boundaries from Census

This will probably take the form of a new function us_current_boundaries() that provides state, county, ZIP code, and other boundaries from the census. The state and county boundary data will live directly in this package. The ZIP code and metropolitan area data will be in a separate data-only package, since they are much larger. But they will be accessed through the same function.

Can potential for joining county fields from us_cities() and us_counties() be improved ?

us_cities() returns a data frame with fields called 'county' and 'county_name'. These don't quite seem to link to the 'name' field from us_counties(). It would be good to have an indication of how best these datasets could be used together, or to have an issue for how this can be improved in future. For example, for Alaska, some county names in the cities dataset have Borough or Census Area appended, and some, such as Anchorage, have a very different name: 'THIRD JUDICIAL DIVISION'.

counties_ak <- us_counties(states = "Alaska")
unique(counties_ak$name)

[1] "Bristol Bay" "Skagway" "Ketchikan Gateway" "Nome"
[5] "Yakutat" "Kenai Peninsula" "Lake and Peninsula" "Juneau"
[9] "Bethel" "Kodiak Island" "Aleutians East" "Anchorage"
[13] "North Slope" "Aleutians West" "Southeast Fairbanks" "Fairbanks North Star"
[17] "Sitka" "Denali" "Prince of Wales-Hyder" "Yukon-Koyukuk"
[21] "Haines" "Petersburg" "Valdez-Cordova" "Matanuska-Susitna"
[25] "Dillingham" "Northwest Arctic" "Hoonah-Angoon" "Kusilvak"
[29] "Wrangell"

dplyr::filter(uscities,state=='AK') %>% dplyr::distinct(county)

1 THIRD JUDICIAL DIVISION
2 North Slope Borough
3 Bethel Census Area
4 FOURTH JUDICIAL DIVISION
5 Kenai Peninsula Borough
6 FIRST JUDICIAL DIVISION
7 Kodiak Island Borough
8 Northwest Arctic Borough
9 NORTHERN DISTRICT
10 Matanuska-Susitna Borough
11 Wrangell-Petersburg Census Area
12 Sitka Borough
13 Aleutians West Census Area
14 Valdez-Cordova Census Area
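
As a rough illustration of how far simple cleaning gets, the sketch below strips the trailing "Borough" and "Census Area" suffixes from a few of the names listed above and checks them against the us_counties() names. The helper `strip_suffix()` is hypothetical, not part of the package.

```r
# A few names copied from the two listings above.
county_names  <- c("North Slope", "Bethel", "Kenai Peninsula", "Sitka")
city_counties <- c("North Slope Borough", "Bethel Census Area",
                   "Kenai Peninsula Borough", "THIRD JUDICIAL DIVISION")

# Hypothetical helper: drop a trailing " Borough" or " Census Area".
strip_suffix <- function(x) sub(" (Borough|Census Area)$", "", x)

cleaned <- strip_suffix(city_counties)
matched <- cleaned %in% county_names
print(data.frame(city_counties, cleaned, matched))
```

The judicial division entries still fail to match, so suffix stripping alone does not solve the problem; those rows would need a lookup table or a spatial join instead.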

Reduce imports

Currently you have:

Imports:
    assertthat (>= 0.1),
    dplyr (>= 0.3.0.1),
    ggplot2 (>= 1.0.0),
    lubridate (>= 1.3.3),
    maptools (>= 0.8-30),
    rgeos (>= 0.3-8),
    sp (>= 1.0.15)

I think you could eliminate quite a few of these:

  • ggplot2 and dplyr (I think) are only used in the vignette, so they could be moved to Suggests
  • I don't see where rgeos is used
  • I think you could replace lubridate::ymd with as.Date()
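
The last suggestion can be sketched in base R. The range check is only an example of what a date argument might be used for; the `earliest` and `latest` bounds are hypothetical values derived from the 1629 to 2000 span mentioned in the README, not the package's actual limits.

```r
# as.Date() parses year-month-day strings without any extra packages.
map_date <- as.Date("1840-03-12")

# Hypothetical validity range, based on the 1629 to 2000 span in the README.
earliest <- as.Date("1629-01-01")
latest   <- as.Date("2000-12-31")

in_range <- map_date >= earliest && map_date <= latest
print(in_range)
```

One trade-off: by default as.Date() tries only the "%Y-%m-%d" and "%Y/%m/%d" formats for character input, so it is stricter than lubridate::ymd() about how dates are written.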
