datagovau's Introduction

datagovau

Project Status: Abandoned

This repository has been archived. The former README is now in README-NOT.md.

datagovau's People

Contributors

ellisp, hughparsonage, maelle


datagovau's Issues

Add kml support

Example of an attempt to fetch a KML file:

library(datagovau)
library(dplyr)

# search for datasets with "trees" in their name:
trees_md <- search_data("name:trees", limit = 1000)

# fails because get_data() doesn't yet handle kml:
brimbank <- trees_md %>%
  filter(name == "Brimbank Street Trees - Google kml") %>%
  get_data()
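One possible route for KML support (a sketch, not the package's plan): the sf package reads KML through GDAL's KML driver. The file path below is hypothetical.

```r
# Sketch: reading a downloaded KML with sf. Assumes the sf package is
# installed with a GDAL build that includes the KML driver; the file
# path is hypothetical.
library(sf)

brimbank <- st_read("brimbank_street_trees.kml")
# st_read() returns an sf data frame: KML placemark attributes become
# columns and the geometry is stored in a list-column
```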

characterise_data seems complicated...

... and draws on global variables, among other things. In combination with multi.sapply it looks more complicated than it needs to be, and could be refactored so that future maintainers find it easier to understand.

Currently this takes place in lines 23 to 51 of ./pkg/R/understanding_metadata.R

Cache metadata?

search_data is too slow for interactive use:

> system.time(search_data("name:fire"))
   user  system elapsed 
   0.06    0.00   12.43 

The performance could be improved by caching the metadata:

  • in each package release
  • after the first invocation in a session
  • if the user requests it

The pros and cons of the first are obvious: it would be more reliable and faster, but it won't return metadata created after the release. Frequent releases would limit that downside, and I think it's my preferred option.
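The second option (cache after the first invocation in a session) could look roughly like this; `.metadata_cache` and `fetch_all_metadata()` are hypothetical names, a sketch only:

```r
# Sketch of per-session caching. fetch_all_metadata() stands in for
# whatever slow call search_data() currently makes on every invocation.
.metadata_cache <- new.env(parent = emptyenv())

get_metadata <- function(refresh = FALSE) {
  if (refresh || !exists("metadata", envir = .metadata_cache)) {
    # the slow call happens at most once per session (or on refresh)
    .metadata_cache$metadata <- fetch_all_metadata()
  }
  .metadata_cache$metadata
}
```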

Dealing with APIs

Some of the data on data.gov.au is available by API. At the moment we don't support this, which seems unfair!
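If data.gov.au exposes the standard CKAN action API (it has historically been a CKAN site), a first step could be to call it directly; a sketch, with the endpoint assumed rather than confirmed:

```r
# Sketch: querying the CKAN package_search action directly.
# Assumes data.gov.au serves the standard CKAN action API.
library(httr)
library(jsonlite)

resp <- GET("https://data.gov.au/api/3/action/package_search",
            query = list(q = "trees", rows = 5))
stop_for_status(resp)
parsed <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
datasets <- parsed$result$results  # one row per matching dataset
```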

Add GeoJSON support

For example:

library(datagovau)
library(dplyr)

trees_md <- search_data("name:trees", limit = 1000)

geelong <- trees_md %>%
  filter(name == "Geelong Trees GeoJSON") %>%
  get_data()
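As with KML, sf reads GeoJSON via GDAL, so support might be a thin wrapper; a sketch, with `geojson_url` a hypothetical name:

```r
# Sketch: downloading and reading a GeoJSON resource with sf.
# geojson_url is hypothetical; sf must be installed.
library(sf)

tf <- tempfile(fileext = ".geojson")
download.file(geojson_url, tf, mode = "wb")
geelong <- st_read(tf)  # returns an sf data frame with a geometry column
```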

downloads data but gets columns wrong

For example, the result mixes id and date into one column, and cause and location into another; the separate columns aren't recognised:

library(datagovau)
library(dplyr)

res <- search_data("name:fire", limit = 20)
res %>% filter(can_use == "yes") %>% slice(3) %>% get_data() %>% View()
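One way to diagnose this, assuming the underlying resource is a CSV at a hypothetical `csv_url`: readr records exactly where parsing went wrong:

```r
# Sketch: inspect readr's parsing diagnostics for a problem file.
# csv_url is hypothetical.
library(readr)

fire <- read_csv(csv_url)
problems(fire)  # rows and columns where parsing failed or guessed wrong
spec(fire)      # the column specification readr inferred
```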

'unexpected unzipping of files'

library(datagovau)
library(dplyr)
res <- search_data("name:water", limit = 20)
res %>% filter(can_use == "yes") %>% slice(2) %>% show_data

gives:

[1] "https://datagovau.s3.amazonaws.com/bioregionalassessments/NIC/MBC/DATA/RiskAndUncertainty/FiguresMBC_drawdown_time_series_figure/352a2f65-ddbf-4251-a401-c7070d2c9208.zip"
Working with .zip (shp) file... Evaluate returned object.
trying URL 'https://datagovau.s3.amazonaws.com/bioregionalassessments/NIC/MBC/DATA/RiskAndUncertainty/FiguresMBC_drawdown_time_series_figure/352a2f65-ddbf-4251-a401-c7070d2c9208.zip'
Content type 'binary/octet-stream' length 766946 bytes (748 KB)
downloaded 748 KB

 Error in show_data(.) : Unexpected unzipping of files. 
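Rather than erroring on unexpected archive contents, show_data could list what's in the zip and look for a shapefile; a sketch using base R's unzip(), with `zipfile` a hypothetical path:

```r
# Sketch: tolerant handling of a downloaded zip; zipfile is a
# hypothetical path to the archive.
contents <- unzip(zipfile, list = TRUE)  # data frame: Name, Length, Date
shp_files <- grep("\\.shp$", contents$Name, value = TRUE)
if (length(shp_files) == 0) {
  stop("No .shp found in archive. Contents: ",
       paste(contents$Name, collapse = ", "))
}
unzip(zipfile, exdir = tempdir())  # extract everything, then read the .shp
```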

document the columns returned by search_data

That is, what does each of these 30 columns mean:

> res <- search_data("name:water", limit = 20)
> names(res)
 [1] "cache_last_updated"        "package_id"                "webstore_last_updated"    
 [4] "id"                        "size"                      "state"                    
 [7] "last_modified"             "hash"                      "description"              
[10] "format"                    "mimetype_inner"            "url_type"                 
[13] "mimetype"                  "cache_url"                 "name"                     
[16] "created"                   "url"                       "webstore_url"             
[19] "position"                  "revision_id"               "resource_type"            
[22] "verified_date"             "verified"                  "resource_locator_protocol"
[25] "resource_locator_function" "Description"               "autoupdate"               
[28] "datastore_active"          "wms_layer"                 "can_use"     

Only can_use is created in R; the others all come from data.gov.au. We should document them (at least the most important ones) as a definition list in the help file for search_data.
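In roxygen2, that definition list could use \describe{} markup in the @return block; a sketch with illustrative (unverified) descriptions:

```r
#' @return A data frame of resource metadata, one row per resource. All
#'   columns except \code{can_use} come straight from data.gov.au.
#'   (Descriptions below are illustrative placeholders.)
#' \describe{
#'   \item{id}{Unique identifier of the resource.}
#'   \item{format}{File format reported for the resource, e.g. "CSV".}
#'   \item{last_modified}{When the resource was last changed.}
#'   \item{can_use}{Computed in R: whether get_data() can handle it.}
#' }
```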

Add json support

An example attempt to get a JSON file, which returns "sorry, can't work with this file yet":

library(datagovau)
library(dplyr)

# search for datasets with "trees" in their name:
trees_md <- search_data("name:trees")

# what datasets do we have?:
trees_md[ , "name"]

# get one:
wyndham <- trees_md %>% 
  filter(name == 'Wyndham Trees and latest inspection') %>%
  slice(1) %>%
  get_data()
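jsonlite could cover the basic case; a sketch, with `json_url` a hypothetical name:

```r
# Sketch: minimal JSON support via jsonlite; json_url is hypothetical.
library(jsonlite)

wyndham <- fromJSON(json_url, flatten = TRUE)
str(wyndham, max.level = 1)  # many open-data JSON files parse to a data frame
```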

WMS (web map service) support

Some of the data is available in WMS format, usually to give people a preview in the browser. Do we want to support this in R - or is it sort of missing the point?
