This repository has been archived. The former README is now in README-NOT.md.
Example: attempting to fetch a KML file fails:

```r
library(datagovau)
library(dplyr)

# search for datasets with "trees" in their name:
trees_md <- search_data("name:trees", limit = 1000)

# fails because get_data() cannot yet handle KML:
brimbank <- trees_md %>%
  filter(name == "Brimbank Street Trees - Google kml") %>%
  get_data()
```
... and draws on global variables, among other things. In combination with `multi.sapply` it looks complicated and could be refactored to be easier for future maintainers to understand. Currently this takes place in lines 23 to 51 of `./pkg/R/understanding_metadata.R`.
`search_data` is too slow for interactive use:

```r
> system.time(search_data("name:fire"))
   user  system elapsed
   0.06    0.00   12.43
```
The performance could be improved by caching the metadata. The pros and cons of the first option are obvious: it would be more reliable and faster, but wouldn't return metadata created after the release. Frequent releases would limit the downside, and I think it's my preferred method.
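In the meantime, per-session caching would remove most of the pain for interactive use. A sketch using the memoise package (this assumes `search_data` keeps its current signature; it is not part of the package today):

```r
library(memoise)
library(datagovau)

# Wrap search_data so repeated identical queries are served from an
# in-memory cache; only the first call pays the full network cost.
search_data_cached <- memoise(search_data)

system.time(search_data_cached("name:fire"))  # first call: hits the API
system.time(search_data_cached("name:fire"))  # repeat call: returned from cache
```

This doesn't solve staleness (the cache lives only for the session), but it makes repeated exploratory queries effectively free.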
Some of the data on data.gov.au is available by API. At the moment we don't support this, which seems unfair!
e.g. in `show_data()`. For example:

```r
trees_md <- search_data("name:trees", limit = 1000)
geelong <- trees_md %>%
  filter(name == "Geelong Trees GeoJSON") %>%
  get_data()
```
For example, the result below mixes id and date into one field, and cause and location into another, rather than parsing them into separate columns:

```r
res <- search_data("name:fire", limit = 20)
res %>% filter(can_use == "yes") %>% slice(3) %>% get_data() %>% View()
```
```r
library(datagovau)
library(dplyr)
res <- search_data("name:water", limit = 20)
res %>% filter(can_use == "yes") %>% slice(2) %>% show_data()
```

gives:

```
[1] "https://datagovau.s3.amazonaws.com/bioregionalassessments/NIC/MBC/DATA/RiskAndUncertainty/FiguresMBC_drawdown_time_series_figure/352a2f65-ddbf-4251-a401-c7070d2c9208.zip"
Working with .zip (shp) file... Evaluate returned object.
trying URL 'https://datagovau.s3.amazonaws.com/bioregionalassessments/NIC/MBC/DATA/RiskAndUncertainty/FiguresMBC_drawdown_time_series_figure/352a2f65-ddbf-4251-a401-c7070d2c9208.zip'
Content type 'binary/octet-stream' length 766946 bytes (748 KB)
downloaded 748 KB
Error in show_data(.) : Unexpected unzipping of files.
```
That is, what does each of these 30 columns mean?

```r
> res <- search_data("name:water", limit = 20)
> names(res)
 [1] "cache_last_updated"        "package_id"                "webstore_last_updated"
 [4] "id"                        "size"                      "state"
 [7] "last_modified"             "hash"                      "description"
[10] "format"                    "mimetype_inner"            "url_type"
[13] "mimetype"                  "cache_url"                 "name"
[16] "created"                   "url"                       "webstore_url"
[19] "position"                  "revision_id"               "resource_type"
[22] "verified_date"             "verified"                  "resource_locator_protocol"
[25] "resource_locator_function" "Description"               "autoupdate"
[28] "datastore_active"          "wms_layer"                 "can_use"
```

Only `can_use` is made in R; the others all come from data.gov.au. We should document them (certainly the most important ones) as a definition list in the help file for `search_data`.
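A definition list in roxygen2 might look like the sketch below. The field descriptions here are illustrative guesses only and would need to be checked against the data.gov.au (CKAN) API before shipping:

```r
#' Search data.gov.au metadata
#'
#' @return A data frame with one row per resource. Columns include:
#' \describe{
#'   \item{id}{Resource identifier assigned by data.gov.au.}
#'   \item{name}{Resource name as displayed on data.gov.au.}
#'   \item{format}{File format reported by data.gov.au (e.g. "CSV", "KML").}
#'   \item{url}{Download URL for the resource.}
#'   \item{can_use}{Computed in R: whether \code{get_data()} can handle
#'     this resource. All other columns come directly from data.gov.au.}
#' }
```

Even documenting just the handful of columns users filter on (`name`, `format`, `url`, `can_use`) would be a big improvement.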
Example attempt to get a JSON file; it returns "sorry, can't work with this file yet":

```r
library(datagovau)
library(dplyr)

# search for datasets with "trees" in their name:
trees_md <- search_data("name:trees")

# what datasets do we have?
trees_md[, "name"]

# get one:
wyndham <- trees_md %>%
  filter(name == 'Wyndham Trees and latest inspection') %>%
  slice(1) %>%
  get_data()
```
e.g. one function to download the data and another to turn it into a map; or at least this separation should be an option.
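A sketch of how that split might look. Note that `draw_map()` is a hypothetical function name used for illustration; it does not exist in the package:

```r
# Step 1: download and parse only -- no plotting side effects,
# so the returned object can be inspected, filtered, or saved.
trees <- trees_md %>%
  filter(name == "Geelong Trees GeoJSON") %>%
  get_data()

# Step 2 (hypothetical): render the map as a separate, explicit step.
draw_map(trees)
```

Keeping download/parse and visualisation separate would also make the first step testable without a graphics device.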
Some of the data is available in WMS format, usually to give people a preview in the browser. Do we want to support this in R - or is it sort of missing the point?