
ddhconnect's People

Contributors

alpaziz, drkarthi, seladore, tonyfujs, yukun1218


ddhconnect's Issues

download_resource() does not use the root_url parameter

download_resource() uses the URL set by dkanr_setup() rather than the URL passed as the root_url parameter. This is because it calls get_resource_url() and fix_download_url(), which do not take the URL as a parameter in the CRAN version.

get_resource_nid() not working

get_resource_nids() already exists in dkanr. Delete the existing get_resource_nid() from ddhconnect and re-export the dkanr function from ddhconnect.

Improve create_json_body

  1. Create helper function that maps metadata from / to machine_names space to pretty_names space
  2. Create a helper function that takes 2 columns from an Excel sheet and creates the list programmatically (Clara)

Parse error from get_fields()

from Alp

id = 118383
updated_data = c(
"title" = "Togo - Firm Surveys for Comparing Personal Initiative Training to Traditional Business Training 2013-2016 TEST",
"workflow_status" = "published"
)
updated_data_json = create_json_body(values = updated_data, node_type = "dataset")
update_dataset(nid = id, body = updated_data_json)

error

Error: parse error: trailing garbage
                                      3{     "microdata": {         "d
                     (right here) ------^

happens in:

out <- httr::content(out)

Add function to download resources

New function for downloadable resources that saves the files. Otherwise, users have to locate the file URL in field_link_api$und$url and declare their own file name. It might also be good to print the citation (since we want to encourage use).

example
current

library(RCurl)
resource_metadata = get_metadata(nid = 94974)
download.file(url = resource_metadata$field_link_api$und[[1]]$url, destfile = "WDI.zip")

suggested

# extract the existing file name? 
resource_download(resource_nid)
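
One possible answer to the file name question, as a rough sketch only: take the last path segment of the resource URL. Both function names below are hypothetical and not part of ddhconnect, and the metadata layout is assumed to match the "current" example above.

```r
# Hypothetical helper: derive a local file name from a resource URL.
file_name_from_url <- function(url) {
  # Drop any query string or fragment before taking the last path segment.
  path <- sub("[?#].*$", "", url)
  utils::URLdecode(basename(path))
}

# Hypothetical wrapper; assumes the URL lives at field_link_api$und[[1]]$url.
download_from_metadata <- function(resource_metadata, dest_dir = ".") {
  url <- resource_metadata$field_link_api$und[[1]]$url
  destfile <- file.path(dest_dir, file_name_from_url(url))
  utils::download.file(url = url, destfile = destfile, mode = "wb")
  invisible(destfile)
}
```

This would fall back to a poor name when the URL does not end in a file name, so an explicit destfile argument would probably still be worth keeping.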

Add function to filter resource search results

Is there a machine name that indicates whether the resource is actually the data? It might help to have a function that filters the resource results by access type (download/query tool) to make the actual data more obvious. Identifying the data file, query tool, or link is hard when there are many resources (e.g. WDI has ~70).

example
current

indicators_resources = get_resource_nid(nid = wdi_nid)
# need to locate the tid for dataset resource_type
tid_download = 986

for (resource_id in indicators_resources){
  resource_metadata = get_metadata(nid = resource_id)
  if (resource_metadata$field_wbddh_resource_type$und$tid == tid_download){
    print(resource_metadata$title)
  }
}

library(RCurl)
resource_metadata = get_metadata(nid = 94974)
download.file(url = resource_metadata$field_link_api$und$url, destfile = "WDI")

suggested

indicators_resources = get_resource_nid(nid = wdi_nid)
get_data(indicators_resources, access_type = c("download", "query tool"))

Add roxygen comments for get_datasets_count

Does this function need to stand independently? If so, users need clarification about what can be passed in the datatype parameter (i.e. single strings, a combination of filters, etc.).

Update resource format lookup table

The resource_json_format_lookup is missing some machine_names and the corresponding json_template values. Need to update the lookup to match the current form fields and values.

missing

  • format
  • geospatial api formats
  • harvest source
  • harvest source id
  • workflow status?
  • remove_file?

Add a build_json() function

Function that takes a list of named arguments, checks whether the values are valid, and builds a valid JSON string.

Remove default node type in create_json_body()

The default value for the node_type parameter is 'dataset'. If the node type is 'resource' and you update the node without passing node_type, the node type is changed to dataset.
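
A minimal sketch of what dropping the default could look like, assuming the only valid node types are 'dataset' and 'resource'. The function name and the placeholder return value are hypothetical; the real create_json_body() builds a JSON string.

```r
# Hypothetical variant of create_json_body() with no default node_type.
create_json_body_checked <- function(values, node_type) {
  if (missing(node_type)) {
    stop("node_type is required: use \"dataset\" or \"resource\"", call. = FALSE)
  }
  # match.arg() raises an error for anything outside the allowed choices.
  node_type <- match.arg(node_type, choices = c("dataset", "resource"))
  # Placeholder body: the real function would build the JSON string here.
  list(type = node_type, values = as.list(values))
}
```

Failing loudly when node_type is missing means a resource can no longer be silently converted to a dataset.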

Update test json

The current test files have not been updated to the latest JSON format.
We might need to consider adding a check.

Update UI Names look ups

Hey @alpaziz and @seladore, I remember that you mentioned this UI lookup table in a team meeting. I found some fields that are out of date. I'm not sure whether you've had a chance to look at it. I'm adding some this week.

map_tids returns wrong tids

The map_tids() function breaks for some values. For instance:

metadata = c("field_topic" = "Poverty", "field_wbddh_data_class" = "Public")
map_tids(metadata)
           field_topic field_wbddh_data_class
                 "376"                  "378"

field_wbddh_data_class should have value "358"

It may be safer to build this function using inner_join()

  1. Create a dataframe from the input vector
  2. Do an inner_join with the output of get_lovs()
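
The two steps above can be sketched as follows. This is only an illustration under assumptions: base R merge() stands in for dplyr::inner_join() so the sketch has no dependencies, the lookup table is a made-up stand-in for the output of get_lovs(), and its column names (machine_name, list_value_name, tid) are assumed. The third row is invented to show one way a value-only lookup could pick the wrong tid.

```r
# Toy stand-in for get_lovs(); column names and the third row are assumptions.
lovs <- data.frame(
  machine_name    = c("field_topic", "field_wbddh_data_class", "field_example"),
  list_value_name = c("Poverty", "Public", "Public"),
  tid             = c("376", "358", "378"),
  stringsAsFactors = FALSE
)

# Hypothetical replacement for map_tids() built on a join.
map_tids_by_join <- function(metadata, lovs) {
  # Step 1: turn the named input vector into a data frame.
  input <- data.frame(
    machine_name    = names(metadata),
    list_value_name = unname(metadata),
    pos             = seq_along(metadata),
    stringsAsFactors = FALSE
  )
  # Step 2: inner join on BOTH the field and the value, so identical
  # labels under different fields cannot be confused.
  joined <- merge(input, lovs, by = c("machine_name", "list_value_name"))
  # Restore the input order, which merge() does not guarantee.
  joined <- joined[order(joined$pos), ]
  stats::setNames(joined$tid, joined$machine_name)
}

metadata <- c("field_topic" = "Poverty", "field_wbddh_data_class" = "Public")
result <- map_tids_by_join(metadata, lovs)
```

Values with no match are dropped by the inner join, so a real implementation would probably want to warn about them.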

Improve output format for `search_catalog()`

Currently, the output is formatted as a list of lists, including the "und" nesting. We might want to consider a cleaner, more standardized output format, especially since multiple results can be returned at once.

example
current

[[1]]$field_contact_email
[[1]]$field_contact_email$und
[[1]]$field_contact_email$und[[1]]
[[1]]$field_contact_email$und[[1]]$value
[1] "[email protected]"

[[1]]$field_contact_email$und[[1]]$format
NULL

recommended
named list format or dataframe?

$star
[1] "wars"
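
As a rough sketch of the named list option, the "und" level could be stripped from one search result like this. The function names are hypothetical, and taking the first non-NULL entry of each item is an assumption that will not suit every field type.

```r
# Return the first non-NULL element of one "und" item, or NA if there is none.
first_value <- function(item) {
  non_null <- Filter(Negate(is.null), item)
  if (length(non_null) == 0) NA else non_null[[1]]
}

# Hypothetical helper: collapse fields of the form field$und[[i]]$... into
# plain vectors and leave all other fields untouched.
simplify_node <- function(node) {
  lapply(node, function(field) {
    if (is.list(field) && "und" %in% names(field)) {
      unlist(lapply(field$und, first_value))
    } else {
      field
    }
  })
}

# Made-up node mimicking the structure shown in the "current" output.
node <- list(
  title = "Example dataset",
  field_contact_email = list(
    und = list(list(value = "someone@example.org", format = NULL))
  )
)
flat <- simplify_node(node)
```

Applying simplify_node() to each element of the search results would give a list of flat named lists, which could then be bound into a data frame if every result has the same fields.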

Display more informative errors/warnings

Neither create_dataset() nor create_resource() displays informative errors/warnings when a field is populated incorrectly.

example
current

Error: Client error: (406) Not Acceptable - error: An illegal choice has been detected. Please contact the site administrator.

suggested

{"form_errors":{"field_topic][und":"An illegal choice has been detected. Please contact the site administrator.","field_wbddh_data_type][und":"An illegal choice has been detected. Please contact the site administrator."}}

Add caching for get_lovs()

Cache it as a data file after the first call, with an option to update. Can also do this for get_fields(), get_required_fields() and get_lov_fields()?
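
A minimal sketch of the file-based cache, assuming an RDS file is an acceptable "data file". The wrapper name cached_fetch and its arguments are hypothetical; fetch_fun stands in for get_lovs() or any of the other lookup functions.

```r
# Hypothetical caching wrapper: read from cache_file when it exists,
# otherwise call fetch_fun() and store the result.
cached_fetch <- function(fetch_fun, cache_file, refresh = FALSE) {
  if (!refresh && file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  result <- fetch_fun()
  saveRDS(result, cache_file)
  result
}

# Intended usage (not run here, since it needs a live connection):
# lovs <- cached_fetch(get_lovs, file.path(tempdir(), "lovs.rds"))
# lovs <- cached_fetch(get_lovs, file.path(tempdir(), "lovs.rds"), refresh = TRUE)
```

The same wrapper would cover get_fields(), get_required_fields() and get_lov_fields() by passing a different function and cache file.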

New function mapping strings to tids

Using search_catalog() with lov field names has a confusing workflow. Currently, the user must map their search string to the corresponding tid with get_lovs(). It might be simpler for the user if we automated looking up the string and returning the tid.

example:
current input
filters = c("field_wbddh_data_type" = "295")
suggested input
filters = c("field_wbddh_data_type" = "geospatial")
