svraka / teroszt
🗺🇭🇺 R package with data for Hungarian administrative and statistical divisions
License: Creative Commons Zero v1.0 Universal
This introduction by Ryan Hafen offers some possibilities, including the geofacet package.
The data downloaded from OpenStreetMap is outdated. It appears to still use the original 2013 boundaries, although district boundaries have been redrawn several times since then. The whole point of including OSM data was to have district-level maps, so this needs to be fixed.
A better approach would be to use Eurostat's regularly updated GISCO data: download settlement maps (using giscoR) and aggregate them to district level using sf. As a proof of concept, something like this could be bundled into a function:
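library(dplyr)
library(giscoR)
library(sf)
library(teroszt)  # provides tsz_2018
# Download Hungarian LAU (settlement level) boundaries from GISCO
lau_hu <- gisco_get_lau(country = "hu")
sf_use_s2(TRUE)
# Attach district codes and names, then union settlement geometries by district
jaras_sf <- lau_hu %>%
  left_join(select(tsz_2018, torzsszam, jaras, jaras_nev),
            by = c("LAU_CODE" = "torzsszam")) %>%
  group_by(jaras, jaras_nev) %>%
  summarise()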
This needs some further cleaning, however, as there are some unusual districts. For example, Kaszaper (in the Mezőkovácsháza District) has a borough called Pusztaszőlős, which is an exclave in the Orosháza District.
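One way to find such cases after aggregation is to flag districts whose merged geometry consists of more than one part. The following is only a sketch on toy data: the unit squares, the `LAU_CODE` values, the district codes `A` and `B`, and the `square()` helper are all made up for illustration, and real GISCO data may also produce multi-part geometries for other reasons (e.g. slivers between settlement polygons).

```r
library(dplyr)
library(sf)

# Hypothetical helper: a unit square with its lower-left corner at (x, y)
square <- function(x, y) {
  st_polygon(list(rbind(
    c(x, y), c(x + 1, y), c(x + 1, y + 1), c(x, y + 1), c(x, y)
  )))
}

# Toy settlements: 0001 and 0002 are adjacent, 0003 and 0004 are far apart
settlements <- st_sf(
  LAU_CODE = c("0001", "0002", "0003", "0004"),
  geometry = st_sfc(square(0, 0), square(1, 0), square(5, 0), square(9, 0))
)

# Toy lookup table standing in for tsz_2018
lookup <- data.frame(
  torzsszam = c("0001", "0002", "0003", "0004"),
  jaras = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Same aggregation pattern as in the proof of concept
districts <- settlements %>%
  left_join(lookup, by = c("LAU_CODE" = "torzsszam")) %>%
  group_by(jaras) %>%
  summarise()

# Districts that did not merge into a single polygon need manual review
is_multi <- as.character(st_geometry_type(districts)) == "MULTIPOLYGON"
multi_part <- districts$jaras[is_multi]
print(multi_part)
```

In this toy setup only district `B` is flagged, because its two settlements do not touch.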
This would also fix R CMD check notes about package size.
The datasets are well documented but at least the README should have some examples and plots, especially maps. A vignette would be too much at this point.
kozighatarok_2018 can only be joined to other data using names (NAME column). IDs are probably safer, and are available for ADMIN_LEVE values, except for Budapest neighbourhoods.
Currently the latest data is for 2018. Check if there were any changes.
The R community moved away from Travis.
There are some recurring tasks that could be automated with functions, e.g. getting a data frame of county names and codes:
tsz_2018 %>%
distinct(megye_nev, megye)
This would require moving dplyr to Imports.
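If adding dplyr to Imports is undesirable, a helper like this could also be written in base R. The sketch below is an assumption about how such a function might look, not an existing part of the package: `get_megye()` is a hypothetical name and `tsz_toy` is a made-up stand-in for `tsz_2018` with toy values. Unlike `distinct()`, it sorts the result by county code.

```r
# Toy stand-in for tsz_2018 (values are illustrative only)
tsz_toy <- data.frame(
  torzsszam = c("00001", "00002", "00003"),
  megye = c("02", "01", "02"),
  megye_nev = c("Baranya", "Budapest", "Baranya"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: unique county names and codes, sorted by code
get_megye <- function(data) {
  out <- unique(data[, c("megye_nev", "megye")])
  out <- out[order(out$megye), ]
  rownames(out) <- NULL
  out
}

megye_tbl <- get_megye(tsz_toy)
print(megye_tbl)
```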
Post box addresses and similar technical postal codes can be found in many databases, but unfortunately Magyar Posta's database does not list them, and the list of post offices (xlsx) is incomplete. Try asking Posta for them, or manually compile a list from postal codes found in real datasets.
As postal codes can cross county and district boundaries, postal code based classifications require significant data cleaning. Ideally there should be crosswalk tables where overlapping postal codes are manually cleaned and validated. The following crosswalks are relevant:
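Before any crosswalk can be validated, the overlapping postal codes have to be identified. A minimal base R sketch of that step, assuming a table with one row per postal code and settlement: the column name `irsz` and all values are hypothetical and do not come from the package.

```r
# Toy postal code table (column names and values are illustrative only)
postal <- data.frame(
  irsz = c("1000", "1000", "2000", "3000", "3000"),
  megye = c("01", "01", "02", "02", "03"),
  stringsAsFactors = FALSE
)

# Keep one row per postal code and county pair
pairs <- unique(postal[, c("irsz", "megye")])

# Count how many counties each postal code appears in
n_megye <- table(pairs$irsz)

# Postal codes spanning more than one county need manual cleaning
ambiguous <- names(n_megye)[n_megye > 1]
print(ambiguous)
```

The same check works for districts by swapping the county column for a district column.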
Label type columns should be factors to keep their commonly used ordering based on their coding system. E.g. regions are usually sorted based on NUTS codes, not alphabetically. This is relevant for tsz_2018 and nav_igazgatosagi_kodok.
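The conversion to code-ordered factors can be sketched in base R as follows. The data frame and its column names (`nuts2`, `regio_nev`) are hypothetical and only illustrate the idea. The factor levels are taken from the labels sorted by their codes.

```r
# Toy region table (column names are illustrative only)
regions <- data.frame(
  nuts2 = c("HU33", "HU21", "HU23"),
  regio_nev = c("Dél-Alföld", "Közép-Dunántúl", "Dél-Dunántúl"),
  stringsAsFactors = FALSE
)

# Levels follow the order of the codes rather than the alphabet
lvls <- unique(regions$regio_nev[order(regions$nuts2)])
regions$regio_nev <- factor(regions$regio_nev, levels = lvls)

# Sorting by the factor now gives the same order as sorting by code
sorted <- regions[order(regions$regio_nev), ]
print(sorted)
```

With a plain character column, sorting would put Dél-Alföld first. With the factor, Közép-Dunántúl (HU21) comes first, matching the code order.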
jogallas_2005_nev in tsz_2018.