svraka / teroszt

🗺🇭🇺 R package with data for Hungarian administrative and statistical divisions

License: Creative Commons Zero v1.0 Universal

Makefile 3.29% R 96.71%
r geocoding administrative-divisions geographic-visualizations

teroszt's Issues

Replace OSM maps

Data downloaded from OpenStreetMap is outdated. It looks like OSM still uses the original 2013 boundaries, although districts have been redrawn several times since. The whole point of including OSM data was to have district-level maps, so this needs to be fixed.

A better approach would be to use Eurostat's regularly updated GISCO data: download settlement maps (using giscoR) and aggregate them to district level using sf. As a proof of concept, something like this could be bundled into a function:

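library(dplyr)
library(giscoR)
library(sf)
library(teroszt)  # provides tsz_2018

# Settlement (LAU) polygons for Hungary from GISCO
lau_hu <- gisco_get_lau(country = "hu")

sf_use_s2(TRUE)
# Attach district codes to settlements, then dissolve polygons by district
jaras_sf <- lau_hu %>%
  left_join(select(tsz_2018, torzsszam, jaras, jaras_nev),
            by = c("LAU_CODE" = "torzsszam")) %>%
  group_by(jaras, jaras_nev) %>%
  summarise()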

This needs some further cleaning, though, as there are some strange districts, e.g. Kaszaper (in the Mezőkovácsháza District) has a borough called Pusztaszőlős, which is an exclave in the Orosháza District.

This would also fix R CMD check notes about package size.

Improve documentation

The datasets are well documented but at least the README should have some examples and plots, especially maps. A vignette would be too much at this point.

Add territorial IDs to map data

kozighatarok_2018 can only be joined to other data using names (NAME column). IDs are probably safer, and they are available for every ADMIN_LEVE value except Budapest neighbourhoods.
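
To illustrate why IDs are safer than names, here is a minimal base R sketch. Everything in it is made up for the example: the `id` column is hypothetical (kozighatarok_2018 does not have it yet), and the two small data frames and their values are placeholders, not package data.

```r
# Toy stand-in for map attributes; `id` is a hypothetical column and
# all values are placeholders.
map_attrs <- data.frame(
  NAME = c("District A", "District B"),
  id   = c("001", "002"),
  stringsAsFactors = FALSE
)

# Other data where one name is spelled slightly differently.
stats <- data.frame(
  name  = c("District A", "District  B"),  # note the double space
  id    = c("001", "002"),
  value = c(10, 20),
  stringsAsFactors = FALSE
)

# Joining on names silently drops the row whose spelling differs.
by_name <- merge(map_attrs, stats, by.x = "NAME", by.y = "name")

# Joining on IDs keeps both rows.
by_id <- merge(map_attrs, stats, by = "id")

print(nrow(by_name))
print(nrow(by_id))
```

The name-based join loses a row without any warning, which is the kind of problem an ID column would avoid.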

Should this package have helper functions?

There are some recurring tasks which could be automated in functions, e.g. getting a data frame of county names and codes:

tsz_2018 %>% 
  distinct(megye_nev, megye)

This would require moving dplyr to Imports.
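
For simple cases like this one, a base R helper might avoid the extra dependency. The following is only a sketch: `tsz_distinct()` is a hypothetical name, and `toy_tsz` is a placeholder stand-in for tsz_2018 that reuses the column names from the snippet above with made-up values.

```r
# Hypothetical helper: unique combinations of the given columns, base R only.
# Rows are kept in order of first appearance.
tsz_distinct <- function(data, cols) {
  out <- unique(data[, cols, drop = FALSE])
  rownames(out) <- NULL
  out
}

# Toy stand-in for tsz_2018; values are placeholders, not real codes.
toy_tsz <- data.frame(
  torzsszam = c("s1", "s2", "s3", "s4"),
  megye     = c("02", "01", "02", "01"),
  megye_nev = c("County B", "County A", "County B", "County A"),
  stringsAsFactors = FALSE
)

megyek <- tsz_distinct(toy_tsz, c("megye_nev", "megye"))
print(megyek)
```

Whether this is worth it depends on how many helpers are planned. Anything beyond simple lookups would probably be easier to write with dplyr.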

Add postal codes for post box addresses

Post box addresses and similar technical postal codes can be found in many databases, but unfortunately Magyar Posta's database does not list them and the list of post offices (xlsx) is incomplete. Try to ask Posta for them, or manually compile a list from postal codes found in real datasets.
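
Compiling candidates from real datasets could start from a simple set difference. This is a rough sketch with placeholder vectors; `known_codes` and `observed_codes` are hypothetical names and the values are invented for illustration.

```r
# Placeholder vectors: codes from an official list vs. codes seen in a dataset.
known_codes    <- c("1011", "1012", "7621")
observed_codes <- c("1011", "1300", "7621", "1300", NA)

# Codes present in the data but missing from the official list are
# candidates for post box or other technical postal codes.
observed_clean <- observed_codes[!is.na(observed_codes)]
candidates <- sort(setdiff(observed_clean, known_codes))
print(candidates)
```

Candidates found this way would still need manual checking, since typos in the source data end up in the same list.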

Better support for working with postal codes

As postal codes can cross county and district boundaries, postal code based classifications require significant data cleaning. Ideally there should be crosswalk tables where overlapping postal codes are manually cleaned and validated. The following crosswalks are relevant:

  • Regions: very few overlapping postal codes
  • Counties: more, but easily manageable
  • Districts: even more, needs heavy manual cleaning
  • Metropolitan areas: also a lot of manual cleaning
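
As a first step towards any of these crosswalks, overlapping postal codes could be listed automatically before manual cleaning. A minimal base R sketch, assuming a settlement table with one postal code and one county code per row; the column names `irsz` and `megye` and all values are placeholders:

```r
# Toy settlement table: postal code and county per settlement.
# Column names and values are placeholders.
toy <- data.frame(
  irsz  = c("1000", "1000", "2000", "3000", "3000"),
  megye = c("01",   "01",   "02",   "02",   "03"),
  stringsAsFactors = FALSE
)

# One row per distinct postal code / county pair.
pairs <- unique(toy[, c("irsz", "megye")])

# Count how many counties each postal code appears in.
n_megye <- table(pairs$irsz)

# Postal codes spanning more than one county need manual review.
overlapping <- names(n_megye)[n_megye > 1]
print(overlapping)
```

The same check works for regions, districts and metropolitan areas by swapping the grouping column.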

Use factors in label names

Label type columns should be factors to keep their commonly used ordering based on their coding system. E.g. regions are usually sorted based on NUTS codes, not alphabetically. This is relevant for tsz_2018 and nav_igazgatosagi_kodok.
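
A minimal sketch of the idea in base R, using a made-up lookup table (the codes and labels are placeholders, not real NUTS codes): factor levels are taken from the labels sorted by their codes, so sorting follows the coding system.

```r
# Toy lookup of region codes and labels; values are placeholders.
regions <- data.frame(
  code  = c("R1", "R2", "R3"),
  label = c("Zeta", "Alpha", "Mid"),
  stringsAsFactors = FALSE
)

# Order labels by code, then use that order as factor levels.
lvl <- regions$label[order(regions$code)]

x <- c("Alpha", "Mid", "Zeta", "Alpha")
f <- factor(x, levels = lvl)

# Sorting the factor follows the code order, not the alphabet.
sorted_labels <- as.character(sort(f))
print(levels(f))
print(sorted_labels)
```

Plots and tables built from such columns would then keep the conventional ordering without extra work from the user.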
