wis2-metadata-search's Introduction

WMO WIS 2.0 Discovery Metadata exchange, harvesting and search pilot project

Date: 2020-05-03

Focal point: Tom Kralidis

Introduction

WIS 1.0 discovery consists primarily of the WMO Core Metadata Profile (WCMP) for metadata description, OAI-PMH for harvesting and SRU for search.

Current realities of the interfaces and encodings include:

  • use of XML for metadata description and utilization in web applications
  • based on an era of service-oriented architecture
  • overloading of web architecture principles
    • using HTTP as a tunnel
    • little to no use of HTTP status codes
    • large, monolithic standards and systems
    • not "of the web" or "webby"
    • challenging for web developers to implement
    • challenging for mass market integration (search engine optimization)

As a result, WIS and weather/climate/water data services related to discovery and search should be improved to take advantage of current approaches and opportunities.

Weather/climate/water data is by nature geospatial and temporal. The W3C Spatial Data on the Web Best Practices provides guidelines on how best to publish spatiotemporal data in order to lower the barrier for users and to support search engine optimization and linked data.

The current evolution in data exchange standards, systems and architecture is grounded in the following:

  • Resource-oriented architecture (ROA)
  • Representational State Transfer (REST)
  • JSON and HTML as core web formats

Following this trend is the current evolution of OGC interface standards via OGC API, which makes a clean break from legacy standards and implements APIs using core, broad industry approaches (W3C, OpenAPI, JSON, etc.).

OGC APIs are designed to be web developer friendly and are being developed with a minimal core and extension mechanism. Example:

  • Service-oriented: /api?request=GetFeature&typename=roads&featureid=5
  • Resource-oriented: /api/collections/roads/items/5
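The contrast between the two styles above can be sketched in a few lines of Python. This is only an illustration: the host `https://example.org` and both helper function names are assumptions made for the example, not part of any WIS or OGC specification.

```python
from urllib.parse import quote, urlencode

BASE = "https://example.org"  # hypothetical host, for illustration only


def service_oriented_url(typename, featureid):
    """Single endpoint; the operation and its target are tunnelled through query parameters."""
    query = urlencode(
        {"request": "GetFeature", "typename": typename, "featureid": featureid}
    )
    return f"{BASE}/api?{query}"


def resource_oriented_url(collection, item_id):
    """Each collection and item has its own URL; the HTTP method carries the operation."""
    return (
        f"{BASE}/api/collections/{quote(collection, safe='')}"
        f"/items/{quote(str(item_id), safe='')}"
    )


print(service_oriented_url("roads", 5))
print(resource_oriented_url("roads", 5))
```

In the resource-oriented form, the item URL can be bookmarked, linked, cached and crawled like any other web page, which is the property the rest of this project relies on.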

Project description

This project aims to experiment with implementing WMO discovery metadata as DCAT using the OGC API - Records draft standard. It will also experiment with actionable linkages to demonstration project 1 (AMQP/MQTT), with search/access of collections of variables of NWP data, and with enabling search capability against WIS 2.0 topics.
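As a rough sketch of what a search against such a catalogue might look like, the following builds a record search URL. It assumes the `q`, `bbox`, `datetime` and `limit` query parameters of the OGC API - Records draft (which may change as the draft evolves); the base URL and the collection name `discovery-metadata` are hypothetical.

```python
from urllib.parse import urlencode


def records_search_url(base_url, collection, q=None, bbox=None, datetime=None, limit=10):
    """Build a search URL for the items of a record collection.

    Parameter names follow the OGC API - Records draft (an assumption);
    no request is sent, the function only constructs the URL.
    """
    params = {}
    if q:
        params["q"] = q  # free-text search terms
    if bbox:
        # minx, miny, maxx, maxy
        params["bbox"] = ",".join(str(v) for v in bbox)
    if datetime:
        params["datetime"] = datetime  # instant or start/end interval
    params["limit"] = limit
    # Keep commas, slashes and colons readable in the query string
    query = urlencode(params, safe=",/:")
    return f"{base_url.rstrip('/')}/collections/{collection}/items?{query}"


print(
    records_search_url(
        "https://example.org/api",  # hypothetical endpoint
        "discovery-metadata",       # hypothetical collection name
        q="surface temperature",
        bbox=(-142, 42, -52, 84),
        datetime="2020-01-01/2020-12-31",
        limit=5,
    )
)
```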

WIS 2.0 Targeted Principles: 1, 2, 3, 4, 6, 10 and 11.

wis2-metadata-search's Issues

initial WIS 2.0 metadata/search brainstorming/ideas

@wmo-im/tt-wismd / @wmo-im/tt-wigosmd: in relation to WIS 2.0 and the metadata search demonstration project, here are notes from an initial discussion with @6a6d74 (2020-12-15).

Note that these are initial ideas only for discussion with ET-Metadata. Please review and provide your thoughts and perspectives here, thanks.

Drivers

  • lower the barrier to entry
  • FAIR data principles
  • Web architecture/hypermedia
  • webby/of the web
  • search engine friendly

Metadata Standards

  • WIS and WIGOS metadata
    • linkage between a dataset and the platform that generated/collected the data
    • a discovery metadata record should be able to reference a WIGOS metadata record (in OSCAR)
  • DCAT2: dataset+multiple realizations
    • unique identifiers are first class
    • consider community standards
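To make the "dataset + multiple realizations" idea concrete, here is a minimal sketch of one dataset with several distributions, expressed as JSON-LD using DCAT and Dublin Core terms. The function name, the example identifiers and URLs, and the choice of `dct:relation` for the link to a WIGOS metadata record are assumptions for illustration; plain strings are used for media types as a simplification.

```python
import json


def dcat_dataset(identifier, title, distributions, wigos_record_url=None):
    """Describe one dataset and its realizations (distributions) as JSON-LD."""
    doc = {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": identifier,
        "@type": "dcat:Dataset",
        "dct:identifier": identifier,  # the unique identifier is first class
        "dct:title": title,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": url},
                "dcat:mediaType": media_type,  # simplified: plain string
            }
            for url, media_type in distributions
        ],
    }
    if wigos_record_url:
        # Assumption: a generic relation is used to point at the WIGOS record
        doc["dct:relation"] = {"@id": wigos_record_url}
    return doc


record = dcat_dataset(
    "urn:example:dataset:1",  # hypothetical identifier
    "Hourly surface observations",
    [
        ("https://example.org/data/obs.csv", "text/csv"),
        ("https://example.org/data/obs.nc", "application/x-netcdf"),
    ],
    wigos_record_url="https://example.org/oscar/station/123",  # hypothetical
)
print(json.dumps(record, indent=2))
```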

Harvesting

  • suppliers provide URLs to metadata
  • harvest a set of metadata terms out of that record, from a set of known formats (adapter pattern)
  • core tooling for data providers for converting their bespoke metadata into recognized formats if needed
    • data providers can contribute their converter to tooling (core+extension/plugin)
  • what is the machinery to harvest/push/pull records to a GISC destination?
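The adapter pattern and the core+extension/plugin idea above could look roughly like the sketch below. Everything here is hypothetical: the registry, the decorator, the format names `schema.org` and `simple-xml`, and the three harvested terms are chosen only to show the shape of the approach, not to propose an agreed set of formats.

```python
import json
import xml.etree.ElementTree as ET

# Registry of converters, keyed by format name (hypothetical names)
ADAPTERS = {}


def adapter(format_name):
    """Decorator a data provider could use to contribute a converter (plugin)."""
    def register(func):
        ADAPTERS[format_name] = func
        return func
    return register


@adapter("schema.org")
def from_schema_org(text):
    doc = json.loads(text)
    return {
        "identifier": doc.get("identifier") or doc.get("@id"),
        "title": doc.get("name"),
        "abstract": doc.get("description"),
    }


@adapter("simple-xml")
def from_simple_xml(text):
    # Stand-in for a bespoke provider format
    root = ET.fromstring(text)
    return {
        "identifier": root.findtext("id"),
        "title": root.findtext("title"),
        "abstract": root.findtext("abstract"),
    }


def harvest(text, format_name):
    """Extract the common set of terms from a record in a known format."""
    try:
        convert = ADAPTERS[format_name]
    except KeyError:
        raise ValueError(f"no adapter registered for format {format_name!r}") from None
    return convert(text)


print(harvest('{"@id": "urn:x:1", "name": "Obs", "description": "Hourly"}', "schema.org"))
print(harvest("<r><id>urn:x:2</id><title>NWP</title><abstract>Model</abstract></r>", "simple-xml"))
```

Whatever the input format, the harvester sees the same small dictionary, so adding a new format means adding one function rather than changing the harvester.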

Catalogue options

The browser as the catalogue

  • the browser is the search engine
  • WIS catalogue is NOT a primary search endpoint
  • probably doesn't need to be duplicated in each GISC
  • harvest from closest point to authoritative source
  • Structured data
    • e.g. Google Dataset search
  • schema.org annotations

Definitive WIS catalogue

  • People don't trust search engines
  • provide a vanilla search experience without "value add" from search engines to prioritize or promote various things
  • to assert the definitive list [authoritative data] as recognized by WMO
    • approved by PRs
    • quality statement
    • use this 'quality statement' to identify the quality / authority of datasets; enable search engines to see what is official
  • Searching from applications (e.g. GIS Desktop, QGIS, ArcGIS)
    • sensible for WIS Catalogue to provide an API
    • need to consider performance/availability
  • Metadata in the WIS Catalogue
    • WIS Catalogue only holds the smallest amount of metadata needed
    • refer back to the original metadata for the full description
    • meta-metadata, with link back to full metadata record
    • example:
      • identifier
      • type
      • title
      • abstract
      • keywords
      • extents
      • links
      • license
      • provenance
      • schema.org annotations
  • availability/uptime considerations
    • operational? 24x7?
    • number of instances? Synchronization? Or harvest metadata direct from source
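The minimal "meta-metadata" record listed under "Metadata in the WIS Catalogue" above could be modelled as follows. The field names come from that list; the field types, the link structure and the use of `rel="canonical"` to mark the link back to the full metadata record are assumptions made for this sketch.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class CatalogueRecord:
    """Smallest amount of metadata held by the catalogue (hypothetical model)."""
    identifier: str
    type: str
    title: str
    abstract: str
    keywords: list = field(default_factory=list)
    extents: dict = field(default_factory=dict)  # e.g. spatial and temporal
    links: list = field(default_factory=list)    # dicts with "rel" and "href"
    license: str = ""
    provenance: str = ""

    def full_metadata_link(self):
        """Return the link back to the original, full metadata record, if any."""
        for link in self.links:
            # Assumption: rel="canonical" marks the authoritative record
            if link.get("rel") == "canonical":
                return link["href"]
        return None


record = CatalogueRecord(
    identifier="urn:example:dataset:1",  # hypothetical identifier
    type="dataset",
    title="Hourly surface observations",
    abstract="Surface observations from a national network.",
    keywords=["temperature", "observations"],
    extents={"spatial": [-142, 42, -52, 84], "temporal": ["2020-01-01", None]},
    links=[{"rel": "canonical", "href": "https://example.org/metadata/1.xml"}],
)
print(record.full_metadata_link())
print(sorted(asdict(record)))
```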

Guidance and support to members

  • NCs and DCPCs need guidance and support to do this, for example to make their data searchable on Google
    • e.g. publishing a schema.org record
    • tools for transformation/migration from WCMP
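As an illustration of "publishing a schema.org record", the sketch below generates a schema.org `Dataset` description as JSON-LD and wraps it in the script element that search engines read from a landing page. The function names and example values are hypothetical; the `box` value uses latitude/longitude order ("south west north east"), which is an assumption based on common Dataset markup guidance and should be verified before use.

```python
import json


def schema_org_dataset(identifier, title, abstract, keywords, license_url,
                       bbox=None, temporal=None):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "identifier": identifier,
        "name": title,
        "description": abstract,
        "keywords": list(keywords),
        "license": license_url,
    }
    if bbox:
        west, south, east, north = bbox
        doc["spatialCoverage"] = {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # Assumption: lat/lon order, "south west north east"
                "box": f"{south} {west} {north} {east}",
            },
        }
    if temporal:
        doc["temporalCoverage"] = temporal  # ISO 8601 instant or interval
    return doc


def as_html_script(doc):
    """Wrap the JSON-LD so it can be embedded in a dataset landing page."""
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(doc, indent=2)
        + "\n</script>"
    )


dataset = schema_org_dataset(
    "urn:example:dataset:1",  # hypothetical identifier
    "Hourly surface observations",
    "Surface observations from a national network.",
    ["temperature", "observations"],
    "https://example.org/licence",  # hypothetical licence URL
    bbox=(-142, 42, -52, 84),
    temporal="2020-01-01/2020-12-31",
)
print(as_html_script(dataset))
```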

WIS 2.0 / Ocean InfoHub brainstorming/ideas

Discussion with @fils from Ocean InfoHub (2021-02-05):

Areas of discussion:

  • lowering the barrier to discovery of WMO and Ocean community resources
    using structured data and schema.org annotations
  • mass market, being "webby" / "of the web": extending the reach of our
    data holdings via SEO, accessible via web browsers/platforms
  • applying W3C Spatial Data on the Web principles [1]
  • controlled vocabularies (concepts, themes, etc.)

Synergies:

  • schema.org implementation
  • as well as WMO standards
  • OGC spatiotemporal concepts
  • OGC API - Records
