yaml-ld's Introduction

YAML-LD

This repository describes the YAML serialization of JSON-LD 1.1 as developed by the JSON for Linking Data Community Group. The editors’ draft of the Note can also be read directly.

The Use Cases and Requirements document can also be read directly.

Disclaimer

UNDER THE EXCLUSIVE LICENSE, THIS DOCUMENT AND ALL DOCUMENTS, TESTS AND SOFTWARE THAT LINK THIS STATEMENT ARE PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, OR TITLE; THAT THE CONTENTS OF THE DOCUMENT ARE SUITABLE FOR ANY PURPOSE; NOR THAT THE IMPLEMENTATION OF SUCH CONTENTS WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS. COPYRIGHT HOLDERS WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THE DOCUMENT OR THE PERFORMANCE OR IMPLEMENTATION OF THE CONTENTS THEREOF.

Code of Conduct

W3C functions under a code of conduct.

yaml-ld's People

Contributors

anatoly-scherbakov, bigbluehat, gkellogg, ioggstream, pchampin, rob-metalinkage, tallted

yaml-ld's Issues

Choose prefix character to replace @

As an architect of a YAML-LD-aware system and a developer, I am a bit uncomfortable having my users write JSON-LD @-keywords and quote them, because that means additional effort. I would prefer some other prefix character from the ASCII character set.

Choice of prefix characters

| @id | Notes | Resolution |
| @id | Is reserved. | x |
| `id | Is reserved. | x |
| ~id | Works | |
| !id | Used for tags | x |
| #id | Used for comments | x |
| $id | Works | x |
| %id | Used for directives | x |
| ^id | Works | |
| &id | Used for anchors | x |
| *id | Used for aliases | x |
| -id | Works, but is confusable with lists: - id | x |
| +id | Works | |
| _id | Works but is confusable with blank nodes (#57) | x |
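
The table can be cross-checked against the YAML 1.2 rule for the first character of a plain (unquoted) scalar. The sketch below is a simplified, hand-written reading of that rule rather than a YAML parser; the function name `can_be_unquoted_key` is hypothetical, and only the first one or two characters of the key are inspected.

```python
# Characters that YAML 1.2 treats as indicators. A plain (unquoted) scalar
# may not start with any of them, with a partial exception for "-", "?", ":".
INDICATORS = set("-?:,[]{}#&*!|>'\"%@`")
# These three may start a plain scalar when followed by a non-space character.
CONDITIONAL = set("-?:")


def can_be_unquoted_key(key: str) -> bool:
    """Rough check: can `key` be written as a block mapping key without quotes?

    Simplification: keys containing ": " or " #" further in are not handled,
    and flow-style contexts are ignored.
    """
    if not key:
        return False
    first = key[0]
    if first in CONDITIONAL:
        return len(key) > 1 and not key[1].isspace()
    return first not in INDICATORS


for candidate in ["@id", "`id", "~id", "!id", "#id", "$id", "%id",
                  "^id", "&id", "*id", "-id", "+id", "_id"]:
    verdict = "works unquoted" if can_be_unquoted_key(candidate) else "needs quotes"
    print(f"{candidate}: {verdict}")
```

Under these assumptions the check agrees with the Notes column: `~`, `$`, `^`, `-`, `+` and `_` can start an unquoted key, while the remaining candidates cannot.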

Also, _ in Python means private or internal things so I am subconsciously against that :)

I suggest using $ for the Convenience Context. This character also is good for JS: something.$type is workable in that language.

The users however may do whatever they want. For instance, users might want something like this:

=: 🏠:cat
∊: 🐱
🏷️:
  🇺🇸: Ray

I won't stop them. Our $ is really just a way to trigger everyone's imagination.

get rid of "@" in a yaml spec

I was pleased to learn about this effort to facilitate encoding of LD with YAML... i enjoy writing yaml in my editor. what i don't enjoy is making special syntactic adjustments because of special symbols introduced that require quotes around keys like '@context' or '@graph'. why not just use context: or graph: ? make these reserved symbols, and keep the writing simple.

YARRRML Lessons learned

Hi all, this is just a way to present some lessons learned we had when developing YARRRML:
a DSL that translates to (R2)RML, a mapping language that converts heterogeneous data sources to RDF.
Although the goal of YARRRML (a developer-friendly DSL to write RML documents that happen to be in Turtle) is different from YAML-LD,
it is at least a bit related :). Some things we encountered:

  • Using ~ as a convenience to 'tag' specific values, e.g., to specify, when mapping a value to a node, whether to map it to a literal (http://example.com/homepage~literal => "http://example.com/homepage") or an IRI (http://example.com/homepage~iri => <http://example.com/homepage>). We did not use _ since this might make users think of blank nodes; see e.g. https://rml.io/yarrrml/spec/#tabs-19. Also related to #55
  • A limitation of YAML we encountered was not being able to refer to another YAML-document. We had to include a mechanism to 'combine' multiple YAML files and handle external values (see https://github.com/RMLio/yarrrml-parser#yarrrml-parser-1)
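
To make the ~ convention from the first bullet concrete, here is a minimal sketch of how such a suffix could be split off and turned into an RDF term. This is not YARRRML's parser: the function names are hypothetical, unmarked values are assumed to be literals, and literal escaping is ignored.

```python
from typing import Optional, Tuple

# Suffixes recognised by this sketch; taken from the two examples above.
MARKERS = {"iri", "literal"}


def split_marker(value: str) -> Tuple[str, Optional[str]]:
    """Split 'text~marker' into (text, marker); marker is None if absent."""
    text, sep, marker = value.rpartition("~")
    if sep and marker in MARKERS:
        return text, marker
    return value, None


def to_term(value: str) -> str:
    """Render the value as a Turtle-like term (no escaping, sketch only)."""
    text, marker = split_marker(value)
    if marker == "iri":
        return f"<{text}>"
    # Assumption: anything not marked as an IRI is treated as a literal.
    return f'"{text}"'


print(to_term("http://example.com/homepage~iri"))
print(to_term("http://example.com/homepage~literal"))
```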

If it makes sense to take this into account for YAML-LD, happy to continue the discussion!
If irrelevant, feel free to close this without further ado ;)

Work on YAML-LD

Issue w3c/json-ld-syntax#389 on w3c/json-ld-syntax proposes advancing work on YAML-LD. I was not able to transfer the issue to this repository, but further discussion and votes in support of starting the initiative in the JSON-LD CG should be voiced here. Given sufficient support, we'll create a yaml-ld repository for work to proceed.

Links to sections of YAML spec do not work

In PR #59, @gkellogg introduced links to sections of the YAML specification that define particular terms of the spec.

Why?

[[YAML]] is dereferenced by ReSpec as per SpecRef into http://yaml.org/spec/1.2/spec.html. Thus, YAML#92-streams is converted into http://yaml.org/spec/1.2/spec.html#92-streams.

This does not work as expected: the URL redirects to https://yaml.org/spec/1.2.2/ and the fragment identifier is lost.

What?

I would like to propose to change the YAML spec link to version 1.2.2 instead of 1.2. That should alleviate the redirect.

meld JSON Schema and JSON-LD context/frame

As an information architect.
I want a harmonized way of specifying validation (JSON Schema) and semantic binding (JSON-LD context & frames).
So that I can reap both benefits for my JSON and YAML data.

These are very complementary:

  • JSON Schema specifies the shape of JSON data for validation
  • JSON-LD specifies the binding of JSON data to semantics, and how to convert RDF<->JSON

What's the relation to YAML:

This is a sub-UCR of #19, which itself:

  • considers a wider context
  • doesn't have a specific goal yet, i.e. is just informational
  • considers simple data modeling languages based on YAML, wherein JSON Schema is derived but is not the source

"JSON Schema plus JSON-LD" is an especially relevant case for our community, thus this UCR

  • @ioggstream "JSON-LD and JSON Schema... I travel these boundaries quite often":
  • Wouldn't it be nice to "construct a smooth path" so you don't need to cross any boundaries, and can think more about your data model rather than the various modeling mechanisms?

Prior art

(from #2):

1: @OR13: I often use OAS (OpenAPI Specification) / YAML with JSON-LD and JSON Schema. I like the idea of controlling both semantics and data shape at the same time, using only 1 file. OAS supports JSON Schema represented in YAML.
We tweaked the JSON Schema to support JSON-LD terms ($linkedData), so now we can present RDF types and JSON Schema types in a single YAML file. This helps us keep semantics and security in sync (more discussion in #2 (comment)).
For example:

$linkedData:
  term: AgActivity
  '@id': https://w3id.org/traceability#AgActivity
title: Agricultural Activity

2: @ioggstream added new keywords (x-jsonld-context, x-jsonld-type) to be compatible with OAS 3.0.

Considerations

Modularization

TODO

Potential Conflicts

(from #51):

JSON Schema includes the following $ keywords:
$schema, $vocabulary, $defs, $ref, $id, $anchor, $comment, $dynamicRef, $dynamicAnchor

If we decide to use the same sigil for both kinds of keywords, we should look out for conflicts:

  • @id is a conflict with $id
  • @vocab is a near-conflict with $vocabulary (i.e. could be confusing)
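
The two bullets above can be reproduced mechanically. The sketch below re-prefixes the JSON-LD 1.1 keywords with `$` and compares them with the JSON Schema keywords listed earlier; the JSON-LD keyword set is typed in by hand from the JSON-LD 1.1 syntax spec, and the "near-conflict" test (one name being a prefix of the other) is an assumption made only for this illustration.

```python
JSON_SCHEMA_KEYWORDS = {
    "$schema", "$vocabulary", "$defs", "$ref", "$id", "$anchor",
    "$comment", "$dynamicRef", "$dynamicAnchor",
}

# JSON-LD 1.1 keywords, listed by hand for this sketch.
JSON_LD_KEYWORDS = {
    "@base", "@container", "@context", "@direction", "@graph", "@id",
    "@import", "@included", "@index", "@json", "@language", "@list",
    "@nest", "@none", "@prefix", "@propagate", "@protected", "@reverse",
    "@set", "@type", "@value", "@version", "@vocab",
}


def with_sigil(keywords, sigil):
    """Replace the leading '@' of each keyword with another sigil."""
    return {sigil + keyword[1:] for keyword in keywords}


dollar_ld = with_sigil(JSON_LD_KEYWORDS, "$")

# Exact clashes: the same spelling would mean two different things.
conflicts = dollar_ld & JSON_SCHEMA_KEYWORDS

# Near-conflicts: one keyword is a proper prefix of the other.
near_conflicts = {
    (ld, schema)
    for ld in dollar_ld
    for schema in JSON_SCHEMA_KEYWORDS
    if ld != schema and (schema.startswith(ld) or ld.startswith(schema))
}

print(sorted(conflicts))
print(sorted(near_conflicts))
```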

But maybe there is no problem if these keywords are localized to the Context vs Schema parts?

  • After all, @id is already "overloaded" in JSON-LD:
"@container": "@id" # Node Identifier Indexing
"@id": "bart"             # Node identifier
"@id": {"@id": "bart", "age": 42}  # triple, for which RDF-star annotations will follow

Defining various interoperability profiles

As an Editor
I want a set of different interoperability profiles
So that I can select the YAML features that are supported by a YAML-LD implementation

Note

For example I could define

  • a BASE profile that is just JSON with YAML syntax: no anchors / alias nodes, only string keywords, only JSON datatype values, comments
  • an EXTENDED profile: with anchors / alias nodes but only with a rooted directed acyclic graph

Elements to be discussed in profiling

File signature

As a data consumer
I want an indicator to tell me that a file is probably YAML-LD
So that I know when to expect YAML-LD

Strictly checking whether a YAML document is valid YAML-LD requires following the full specification. Nevertheless,
some kind of magic file number would be useful. As suggested here, a YAML global tag could be used for this purpose (see RFC 4151):

!<tag:json-ld.org,2022>
$context: http://schema.org/
$type: Person
name: Pierre-Antoine Champin

YAML processors will raise an "unknown tag" error when trying to process the document without knowledge of YAML-LD. It can still be parsed as valid YAML, but there is no default mapping to JSON. This is not a bug, but a feature.
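
A consumer could sniff for such a signature without a full parse. The following is a minimal sketch, assuming the tag from the example above (it is a suggestion in this issue, not a registered signature) and a hypothetical helper name `looks_like_yaml_ld`; it only looks at the first meaningful line.

```python
# Tag copied from the example above; an assumption, not a standardized value.
YAML_LD_TAG = "!<tag:json-ld.org,2022>"


def looks_like_yaml_ld(text: str) -> bool:
    """Return True if the first meaningful line starts with the YAML-LD tag.

    Blank lines, comments, directives and a bare '---' marker are skipped.
    This is a heuristic indicator, not a validity check.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("---"):
            # Allow the tag to follow the document start marker on one line.
            stripped = stripped[3:].strip()
        if not stripped or stripped[0] in "#%":
            continue
        return stripped.startswith(YAML_LD_TAG)
    return False


document = """!<tag:json-ld.org,2022>
$context: http://schema.org/
$type: Person
name: Pierre-Antoine Champin
"""
print(looks_like_yaml_ld(document))
```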

YAML-LD datatypes (and tags for datatypes)

  • RDF uses explicitly tagged literals, in particular lang strings and XSD datatypes, including infinite precision integers and decimals.
  • JSON carries faithfully strings and small numbers, everything else must be represented as a string with a separate field to indicate the datatype (@type in JSON-LD). Eg see w3c/json-ld-syntax#387 for the pitfalls of using large integers or decimals
  • YAML can use tags to carry literals faithfully (including infinite precision, "markers" like -.inf and .nan, datetimes), and even more complex structures. One could declare "YAML schemas" with additional tags, eg to represent all XSD datatypes

Why might we want more than "string plus @type"?

  • convenience (eg see dc:date below and many other examples)
  • normalization (reduce/eliminate lexical vs value space differences): it seems to me easier for a processor to normalize 02022-05-18 to 2022-05-18 if tagged as !xsd!date rather than looking at a parallel @type field.

Let's collect below examples of what we could want.


@gkellogg in ietf-wg-httpapi/mediatypes#8 (comment)

If I were to revisit anything in the JSON-LD data model, it would be the interpretation of JSON numbers to allow for decimal values. As it is now, JSON numbers are either interpreted as integers (long) or doubles based on the range of the number. But, in JSON-LD 1.1, we use The JSON Canonicalization Scheme (RFC8785) as a way to represent numbers in the rdf:JSON datatype serialization, which allows for a serialization form of either integer, decimal, or double. This really only comes into play in JSON-LD when creating RDF literals from native JSON numbers (something which is generally a bad design point, but is there to allow a reasonable interpretation of native JSON forms), but could also come into play when representing those numbers in the data model, and thus in serializations to forms such as YAML.


@VladimirAlexiev from #2:

  • Tags are comparable to datatypes.
  • the YAML json schema and core schema handle string, boolean, integer, float (the latter allows things like -.inf and .nan).
  • https://yaml.org/type/ handles a wider set, in particular dates and datetimes. But please note these are considered deprecated in 1.2 and are being removed in 1.3 yaml/yaml-spec#268 (comment)
  • Maybe we should define a YAML schema to handle more xsd datatypes?
    • It should aim to eliminate problems related to the limited and non-standardized set of JSON literals. Eg the JSON number 12345678901234567890.12345 is converted to RDF literal "12345678901234567168"^^xsd:integer (see jsonld playground)
    • And could even work as a replacement of @type, eg
# short form using tags
dc:date: !xsd!date 2022-05-18

# instead of long form
dc:date: {"@type": xsd:date, "@value": 2022-05-18}
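
The precision loss mentioned in the list above is easy to reproduce with any parser that maps JSON numbers to IEEE 754 doubles. The sketch below uses Python's standard `json` module as a stand-in for such a parser (it is not the JSON-LD playground) and contrasts it with a decimal-preserving parse, which is roughly what an explicitly tagged literal would allow.

```python
import json
from decimal import Decimal

text = "12345678901234567890.12345"

# Default behaviour: the number becomes a double and loses precision.
as_double = json.loads(text)

# Keeping the lexical form as a decimal preserves every digit.
as_decimal = json.loads(text, parse_float=Decimal)

print(int(as_double))   # 12345678901234567168
print(as_decimal)       # 12345678901234567890.12345
```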

New ones:

  • is it at all feasible to write "foo"@en in YAML rather than a separate @language field?
  • JSON-LD cannot capture GeoJSON because that uses nested arrays. Can this be worked around somehow with a YAML tag for "2D array"?

Conformance Tests

  • YAML-LD must include comprehensive conformance tests for all its features.
  • In fact the features should be designed after agreed archetypical examples, so it's feasible for the development process to be "test-driven"
  • It should replicate/re-render all JSON-LD tests. Despite no official support in the spec, the gazillion JSON-LD conformance tests are written in YAML: https://github.com/w3c/json-ld-syntax/tree/main/yaml
  • It should add extra tests for any YAML feature that has no JSON analog

Postpone discussion on "$" and "@"

Proposal

JSON-LD is complex. YAML is very complex and - as the YAML community says - it's a live specification that is undergoing
continuous revisions. For example:

  • YAML 1.2 deprecated many 1.1 functionalities but some of them are still present in the current spec examples;
  • Some deprecated but "killer" features like merge keys <<: have not been replaced by new ones, so implementers still support them;
  • While YAML serialized using the flow collection styles can appear like JSON, YAML MUST not be considered a superset of JSON;
  • "@" in YAML is reserved for future use since forever, and this might change.

In this context, focusing on switching "@" with "$" to avoid quoting could lead to a dead end. For example, the following is valid in both YAML and JSON, and it's just the first example that comes to my mind:

{
  "@context": {
    "@vocab": "https://www.w3.org/2019/wot/json-schema#",
    "title": { "@id": "dct:title", "@type": "xsd:string" },
    "description": { "@id": "dct:description", "@type": "xsd:string" },
    "$id": { "@id": "@id" },
    "type": { "@id": "@type" },
    "object": "ObjectSchema"
  },
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/product.schema.json",
  "title": "Product",
  "description": "A product in the catalog",
  "type": "object"
}

I'd focus on YAML structure

imho I would focus:

Once we have identified the relations with these specific YAML features, I think we will be ready to tame things like tags (which are not very interoperable in the wild).

The ongoing work on YAML fragment identifiers could even provide further benefits since it can reference both JSON Paths and alias nodes.

My 2¢,
R.

YAML-LD IRI tags

The so-called "extended" profile of YAML-LD makes use of YAML's tag mechanism to cause values to be interpreted as RDF literals with a datatype coming from the expanded tag (or language, through restricted use of the i18n namespace). This is particularly useful for environments where fetching a remote context remains an issue. What this doesn't provide for is a way to natively represent IRIs and Blank Nodes. Currently, this requires using a new node object:

"@id": https://json-ld.github.io/yaml-ld/spec # This is interpreted as an IRI by JSON-LD algs
homepage:
  "@id": https://example.org/foo

If "homepage" were defined with "@type": "@id" in a term definition, the indirection through a node object isn't required.

As IRIs don't have their own datatypes, using the YAML node tag mechanism, where the tag is a URI, becomes semantically problematic. But YAML allows the use of local tags, and we could consider defining such a tag to be interpreted by YAML-LD processors (operating in extended mode). For example:

"@id": https://json-ld.github.io/yaml-ld/spec # This is interpreted as an IRI by JSON-LD algs
homepage: !id  https://example.org/foo # Can be treated as an IRI (or blank node) by a YAML-LD processor.

This would cause a YAML-LD processor to interpret values tagged with the !id local tag as either an IRI or a blank node, similar to how values of the @id key in JSON-LD are interpreted. It avoids the indirection through a node object by extending the internal representation so that IRIs and blank nodes can be represented as scalar values, in addition to RDF literals, strings, numbers, booleans and null.
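
The equivalence this proposal relies on can be sketched without a YAML library. Below, `Tagged` is a hypothetical stand-in for whatever a YAML loader would hand over for a tagged scalar, and `downgrade` rewrites tagged values into plain JSON-LD forms: `!id` becomes a node object, and any other tag is assumed to be a datatype IRI and becomes a value object. This illustrates the mapping only; it is not how any existing processor is implemented.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Tagged:
    """Hypothetical in-memory form of a tagged YAML scalar."""
    tag: str
    value: str


def downgrade(node: Any) -> Any:
    """Recursively replace Tagged scalars with plain JSON-LD structures."""
    if isinstance(node, Tagged):
        if node.tag == "!id":
            # Same meaning as the node-object indirection shown earlier.
            return {"@id": node.value}
        # Assumption: every other tag names a datatype IRI.
        return {"@type": node.tag, "@value": node.value}
    if isinstance(node, dict):
        return {key: downgrade(value) for key, value in node.items()}
    if isinstance(node, list):
        return [downgrade(item) for item in node]
    return node


doc = {
    "@id": "https://json-ld.github.io/yaml-ld/spec",
    "homepage": Tagged("!id", "https://example.org/foo"),
}
print(downgrade(doc))
```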

Convert JSON-LD to YAML-LD using standard YAML libraries

As a developer,
I want to be able to convert JSON-LD documents to YAML-LD by simply serializing the document using any standard YAML library,
So that the resulting YAML is valid YAML-LD, resolving to the same graph as the original JSON-LD.

Conversely, I would be very surprised (and annoyed) if such a simple conversion did not work.

This is why, although I do not oppose $-keywords (#11) for authoring YAML-LD from scratch, I want @-keywords to be also supported.

YAML-LD UCRs

Related to w3c/json-ld-syntax#389, #3.
TODO: scan these issues to extract more requirements.

Like any great W3C work, we should start with a Use Case Requirements specification / collection.

Until we have a repo (@gkellogg) where we can collect UCR per issue, let's use the following format. It could help UCR editors collect the issues:

  • PRIO: Title (@ contacts)
    • description
    • considerations
    • considerations

Notes:

  • The checkbox shows which UCRs have been posted as distinct issues in this repo
  • PRIO is MUST/COULD/SHOULD and is approximate and subjective, so let's not spend too much effort discussing them. This is specifically not a "requirement compliance" rating
  • Can include contacts of people who suggested the idea, or could be interested to develop it further. Not intended to track authorship precisely
  • Give a detailed description
  • Give as many relevant considerations, examples and links as you can

Here we go


  • MUST: Least Surprise (@pchampin)

    • YAML-LD should work the same as JSON-LD unless specific processor options are selected
    • Since YAML is an extension of JSON, YAML-LD -> RDF should produce the same as YAML -> JSON-LD -> RDF
    • YAML-LD should cover all features of JSON-LD 1.1
  • MUST: Human Readability (@anatoly-scherbakov)

  • MUST: Compatibility (@gkellogg)

    • YAML-LD must be compatible with other related syntaxes (see "Polyglot Modeling")
    • E.g. using the $ sigil ("namespace") will overlap with other existing uses. For example, JSON Schema has $schema, $vocabulary and other keywords that would not overlap with the JSON-LD keyword namespace (that uses the @ sigil)
  • SHOULD: YAML Intro (@VladimirAlexiev)

    • The YAML-LD spec or UCR spec should provide an introduction to YAML, since the YAML spec is rather technical
    • In particular, it should cover YAML extensions compared to JSON
  • SHOULD: Archetypical Examples (@VladimirAlexiev)

    • The UCR spec should include archetypical YAML examples from various domains, eg software projects, modules, issues, etc
  • SHOULD: Shortcuts (@VladimirAlexiev)

  • COULD: Versions (@VladimirAlexiev)

  • SHOULD: Extensions (@VladimirAlexiev)

    • Leverage YAML features over JSON. Quoting from the YAML spec: "The YAML 1.0 specification was published in early 2004. The YAML 1.1 specification was published in 2005. Around this time, the developers became aware of JSON. By sheer coincidence, JSON was almost a complete subset of YAML (both syntactically and semantically). The YAML 1.2 specification was published in 2009. Its primary focus was making YAML a strict superset of JSON".
    • Some of them can be quite useful in an RDF context:
    • Anchors and Aliases can represent non-tree graph structures and should mesh with JSON-LD Frames
  • Tags are comparable to datatypes #17.

    • the YAML json schema and core schema handle string, boolean, integer, float (the latter allows things like -.inf and .nan).
    • https://yaml.org/type/ handles a wider set, in particular dates and datetimes. But please note these are considered deprecated in 1.2 and are being removed in 1.3 yaml/yaml-spec#268 (comment)
    • Maybe we should define a YAML schema to handle more xsd datatypes?
    • It should aim to eliminate problems related to the limited and non-standardized set of JSON literals. Eg the JSON number 12345678901234567890.12345 is converted to RDF literal "12345678901234567168"^^xsd:integer (see jsonld playground)
    • And could even work as a replacement of @type, eg
# short form using tags
dc:date: !xsd!date 2022-05-18

# instead of long form
dc:date:
  "@type": xsd:date
  "@value": 2022-05-18
  • COULD: Polyglot Modeling #19

    • For efficient RDF modeling, you need to define multiple related artefacts: ontology, shapes (SHACL (@HolgerKnublauch) or SHEX (@ericprud)), JSON-LD context, maybe JSON-LD frames, JSON schema or Avro schema
    • Many communities like LD expression of their data, but mostly care about defining it with JSON schema (eg w3c-ccg @OR13 @nissimsan @msporny)
    • Many people have expressed the desire to define a unified or "technology independent" modeling language
    • Most polyglot frameworks are YAML-based. Examples include:
    • See eg w3c-ccg/traceability-vocab#296 for a brief list of modeling framework requirements
    • YAML-LD should not take over these modeling-framework efforts, but should show how they can be used together, show archetypical examples, and maybe make a comparison
  • #20

    • YAML-LD must include comprehensive conformance tests for all its features.
    • In fact the features should be designed after agreed archetypical examples, so it's feasible for the development process to be "test-driven"
    • It should replicate/re-render all JSON-LD tests. Despite no official support in the spec, the gazillion JSON-LD conformance tests are written in YAML: https://github.com/w3c/json-ld-syntax/tree/main/yaml
  • SHOULD #42

  • MUST #43

Polyglot Modeling

WHO: As an information architect
WHAT: I want data modeling language(s) independent of technical artefacts
WHY: So that:

  • the language is understandable to domain experts
  • it can generate a variety of required technical artefacts
  • all these artefacts are kept in sync, thus lowering maintenance effort

For efficient RDF modeling, you need to define multiple related artefacts:

  • ontology
  • shapes (SHACL (@HolgerKnublauch) or SHEX (@ericprud))
  • diagrams and other documentation
  • JSON-LD context,
  • maybe JSON-LD frames,
  • JSON schema or Avro schema
  • API bindings and hypertext controls (HATEOAS)
  • etc

Thus, many people have expressed the desire to define a unified or "technology independent" modeling language.

Many communities like to have a LD expression of their data, but mostly care about defining the data with JSON schema. Efforts to marry JSON Schema with JSON-LD contexts have been undertaken in:

Examples of polyglot frameworks include the following (many but not all are YAML-based):

YAML-LD should not take over these modeling-framework efforts, but should show how they can be used together, show archetypical examples, and maybe make a comparison.

Even if no Modeling requirements make it into any YAML-LD spec, this git repo could serve as a "meeting place" for people working on these topics, potentially resulting in some unification.

Work Plan

The general work plan adopted at the August 17th meeting:

  • Define limitations on use of YAML Alias Nodes.
    • Use of aliases limited to the Extended profile.
  • Define more specifically the YAML-LD-JSON Profile. (PR yaml-ld#70)
  • Define YAML-LD-Extended Profile.
    • Transparent use of %TAG for XSD and I18N namespaces (pending library support) (issue yaml-ld#17)
    • Full use of YAML Alias nodes. Not round-tripped.
  • Define extension points to JSON-LD API.
  • Test Suite (see #20)

This issue serves as a meta-issue to manage completion of different milestones.

Define anchor usage in yaml-ld

As a JSON-LD editor … WHO
I want to use yaml anchors … WHAT
So that I can easily reuse content … WHY

Note

The specification should define:

  • when it is legitimate to use anchors
  • what the expectations on anchor usage are (e.g. do they represent a specific JSON-LD node, or can they just be used to represent content?)
  • whether there are any constraints on anchor usage (e.g. the representation graph MAY / MUST NOT be a cyclic graph...)
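
If the spec forbids cyclic representation graphs, an implementation needs a way to detect them. The sketch below assumes a YAML loader that turns aliases into shared in-memory objects (no loader call is shown, and the data is built by hand); `has_cycle` is a hypothetical helper that walks dicts and lists and reports whether any node is reachable from itself.

```python
def has_cycle(node, _path=None):
    """Depth-first search over dicts/lists; True if a container contains itself.

    `_path` holds the ids of the containers on the current descent, so shared
    (aliased) nodes are fine as long as they do not loop back.
    """
    path = set() if _path is None else _path
    if not isinstance(node, (dict, list)):
        return False
    if id(node) in path:
        return True
    path.add(id(node))
    children = node.values() if isinstance(node, dict) else node
    found = any(has_cycle(child, path) for child in children)
    path.discard(id(node))
    return found


# Shared node, as an alias would produce: a DAG, not a cycle.
homer = {"@id": "http://example.org/#homer"}
family = [homer, {"@id": "http://example.org/#bart", "parent": [homer]}]
print(has_cycle(family))

# A node that refers to itself: a genuine cycle.
loop = {"@id": "http://example.org/#loop"}
loop["self"] = loop
print(has_cycle(loop))
```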

example 1

---
- "@id": &homer http://example.org/#homer  # Anchor the homer url
  http://example.com/vocab#name:
  - "@value": Homer
- "@id": http://example.org/#bart
  http://example.com/vocab#name:
  - "@value": Bart
  http://example.com/vocab#parent:
  - "@id": *homer                               # reuse the anchor instead of re-typing the homer url
- "@id": http://example.org/#lisa
  http://example.com/vocab#name:
  - "@value": Lisa
  http://example.com/vocab#parent:
  - "@id": *homer

example 2

Using anchor and alias nodes https://gist.github.com/ioggstream/31f3226fa9976b3baf0800f44bc19c98

Compatibility with existing libraries

My experience with ShExJ as YAML has indicated no difference in the way some libraries, e.g. js-yaml, and tools, e.g. json2yaml.com, map JSON to YAML, at least as far as semantics go (perhaps not as far as which things get quoted and line wraps vs. '\n's). A survey of corner cases would help reveal if there's already a tacit "standard".

Round-trip safe json-ld -> yaml-ld -> json-ld

As an <user with json-ld files> … WHO
I want to <convert them to yaml-ld> … WHAT
So that <they are round-trip safe> … WHY

Note

imho any other behavior hinders interoperability

Conformance section

I expect

in conformance section

  • include yaml
  • include HTTP
  • include linked data

YAML-LD canonicalization (c14n)

As an information architect.
I want no variation in YAML format for the same semantic content.
So that I can easily compare or sign YAML.

Canonicalization (also called c14n or normalization) is quite useful to enable the following use cases:

  • meaningful diff
  • signed content. As Manu puts it "to ensure that different expressions result in the same hash"
  • TODO: what more?

Prior art:

NOTE THAT this UCR is quite the opposite of #42. So if we cater to both:

  • we should define a YAML c14n style (cosmetic controls) to produce canonic YAML
  • we should describe that using different YAML styles is not recommended for the above "canonicalized" use cases
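
As a rough illustration of why a fixed serialization matters for diffs and signatures, the sketch below hashes a deliberately simple canonical form: JSON text with sorted keys and no insignificant whitespace. This is neither RFC 8785 nor RDF dataset canonicalization, and `canonical_digest` is a hypothetical name; it only shows that two documents differing in key order can be made to hash identically.

```python
import hashlib
import json


def canonical_digest(data) -> str:
    """SHA-256 over a simplistic canonical JSON rendering of `data`."""
    canonical = json.dumps(
        data, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same content as it might come out of two differently styled YAML files.
first = {"@id": "bart", "age": 42, "spouse": None}
second = {"spouse": None, "age": 42, "@id": "bart"}

print(canonical_digest(first) == canonical_digest(second))
```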

sigil: change prefix char in context

As an information architect and developer.
I want to write YAML-LD keywords using identifier chars accepted in YAML and in my programming language.
So that I can access them using dot notation, rather than using "string index" bracketed notation.

Example: if the prefix char is _ :

  • I can have more readable YAML without quoting keys (see Example with context below)
  • I can write this in JavaScript:
label._none // with Language Indexing, if the label has no lang tag
label._value // with langString

instead of

label["@none"]
label["@value"]

(see JSON-LD Language Indexing)


JSON-LD takes a per-keyword approach, i.e. you can define keyword aliases, eg

"@context":{
  "type":"@type",
  "id":"@id",
  "lang":"@language",
  "none":"@none"
}

A more uniform way in YAML-LD could be to specify the prefix char ($ or _ or even empty) with an option.

Example with context (TODO make more)

  "@context":
    "@sigil": $
    $base: http://example.org/resource/
    $vocab: http://example.org/ontology/
  $graph:
    $id: bart
    spouse: marge
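
A processor-side view of this idea, as a minimal sketch: read the proposed `@sigil` entry from the context and rewrite every key that starts with that character back to its `@` form before handing the document to a regular JSON-LD processor. `@sigil` is only a proposal in this issue, the function names are hypothetical, string values are left untouched, and an empty sigil is not handled.

```python
def rewrite_keys(node, sigil):
    """Recursively rename keys starting with `sigil` to start with '@'."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "@sigil":
                continue  # the option itself is dropped from the output
            if key.startswith(sigil):
                key = "@" + key[len(sigil):]
            out[key] = rewrite_keys(value, sigil)
        return out
    if isinstance(node, list):
        return [rewrite_keys(item, sigil) for item in node]
    return node


def desigil(doc):
    """Apply the sigil declared in doc['@context'], if there is one."""
    context = doc.get("@context", {})
    sigil = context.get("@sigil") if isinstance(context, dict) else None
    return rewrite_keys(doc, sigil) if sigil else doc


# The example above, as the data a YAML loader would produce.
doc = {
    "@context": {
        "@sigil": "$",
        "$base": "http://example.org/resource/",
        "$vocab": "http://example.org/ontology/",
    },
    "$graph": {"$id": "bart", "spouse": "marge"},
}
print(desigil(doc))
```

Note that such a blanket rewrite would also rename a JSON Schema `$id` or `$schema` key appearing in the same document.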

TODO: is there any way to avoid the use of @ in the first 2 lines?

@ioggstream and @anatoly-scherbakov I tried \ escaping but it doesn't work at https://onlineyamltools.com/convert-yaml-to-json, maybe that converter is non-conforming?

\@context:
  \@sigil: $
  $base: http://example.org/resource/
  $vocab: http://example.org/ontology/
$graph:
  $id: bart
  spouse: marge

JSON Schema includes some $ keywords:
$schema, $vocabulary, $defs, $ref, $id, $anchor, $comment, $dynamicRef, $dynamicAnchor

  • on one hand, if we come up with some YAML beast to combine YAML-LD Context and YAML Schema (see #19, I think @OR13 uses such things) it would be nice to use the same sigil for keywords
  • on the other hand, we should look out for conflicts
    • @id is a conflict with $id
    • @vocab is a near-conflict with $vocabulary (i.e. could be confusing)
  • But maybe there is no problem if these keywords are localized to the Context vs Schema parts?
    • After all, @id is already "overloaded" in JSON-LD:
"@container": "@id" # Node Identifier Indexing
"@id": "bart"             # Node identifier
"@id": {"@id": "bart", "age": 42}  # triple, for which RDF-star annotations will follow

Related issues:

  • Even if #42 is rejected, this still applies since it affects YAML use in programs
  • I think we can close #9 because it has a lot of great discussion but it's unfocused: @ioggstream do you agree?

Serializing JSON or YAML literal in YAML-LD

The YAML examples in the JSON-LD 1.1 spec (e.g., https://github.com/w3c/json-ld-syntax/blob/main/yaml/JSON-Literal-compacted.yaml) do not preserve the JSON serialization of a JSON literal.

Example 062: JSON Literal-compacted
---
"@context":
  "@version": 1.1
  e:
    "@id": http://example.com/vocab/json
    "@type": "@json"
e:
- 56.0
- d: true
  '10': 
  '1': []

It should instead be the following:

Example 062: JSON Literal-compacted
---
"@context":
  "@version": 1.1
  e:
    "@id": http://example.com/vocab/json
    "@type": "@json"
e: [56.0,{"d":true,"10":null,"1":[]}]

But a simple YAML.dump of the parsed JSON does not take this into consideration. The spec should describe the requirements for serializing JSON literals in YAML-LD.
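
One possible way to meet that requirement, sketched with Python's standard `json` module: serialize the value of the JSON literal as compact JSON text, which is also valid YAML flow style, and emit it verbatim instead of letting the YAML dumper restyle it. Key order and number formatting here are whatever `json.dumps` produces, not a canonical form.

```python
import json

# The parsed value of the JSON literal from the example above.
literal = [56.0, {"d": True, "10": None, "1": []}]

# Compact JSON text; insertion order of keys is preserved.
flow = json.dumps(literal, separators=(",", ":"))

print("e: " + flow)  # e: [56.0,{"d":true,"10":null,"1":[]}]
```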

Leave YAML-LD Extended profile out of bounds of the Community Report

The Extended YAML-LD profile #35 was conceived as a way to improve the user experience by utilizing expressive means that YAML 1.2 enjoys and which JSON does not possess, in particular:

  • tags #79,
  • anchors and references #13 ,
  • datatypes #17 .

Two alternative methods of tackling this have been proposed:

  • #6 Extend the JSON-LD Internal Representation and augment the inner workings of JSON-LD processor software,
    • which means implementation of language and library specific specialized processors;
  • #84 or downgrade an Extended YAML-LD document to more standardized, JSON compatible form using JSON-LD native features like value objects and language maps,
    • which might be accomplished by a converter tool accepting an Extended Profile YAML-LD document and returning a JSON Profile YAML-LD document,
    • if implemented as a part of a library this tool would also be language specific.

Each of these approaches requires efforts, and it is unclear whether the participants of this community have sufficient time to put into the project. That said, we still want to drive the Community Group Draft document to a state where it can be accepted by the Working Group and thereafter be published as a Recommendation.

At the Feb 15, 2023 Community Group meeting, @gkellogg proposed that we postpone the implementation of the Extended Profile:

  • the spec should restrict YAML-LD to features that JSON natively supports;
  • Extended Profile and related YAML features, as well as possible methods for using them, might be described in the non-normative part of the spec document.

Thus,

  • we shall not lose the work already put into the Extended Profile discussions,
  • and we will have much higher chances of getting the spec to a recommendation status.

This issue is to present this proposal for public discussion as a request for comments. I suggest we use reactions on this issue to vote: 👍 to support and 👎 to disapprove.

Thoughts?

YAML-LD-star

As an information architect.
I want to be able to represent embedded triples and annotations.
So that I can reap the benefits of RDF-star in YAML (YAML-LD-star).

Given the JSON-LD-star effort https://json-ld.github.io/json-ld-star/, I think YAML-LD should support the same.

Examples from that spec can be translated to YAML in a straightforward way.
Notes:

Turtle-star example 3

<< :bob :age 42 >> :certainty 0.8 .

YAML-LD-star example 3:

$id:
  $id: bob
  age: 42
certainty: 0.8

YAML-LD-star example 4:

$id:
  $id: bob
  age: 
    $value: 42
    $annotation:
      certainty: 0.8

Turtle-star example 5

:alice :claims << :bob :knows :alice >> .

YAML-LD-star example 5:

  $context:
    $base: http://example.org/
    $vocab: http://example.org/
    knows: {$type: $id}
    claimedBy: {$reverse: http://example.org/claims, $type: $id}
  $id:
    $id: bob
    knows: alice
  claimedBy: alice

Turtle-star example 6

:bob :knows :alice .
<< :bob :knows :alice >> :accordingTo :alice .
:bob :claims << :bob :knows :alice >> .

YAML-LD-star example 6

  $id: bob
  knows:
    $id: alice
    $annotation:
      accordingTo: alice
      claimedBy: bob

Turtle-star example 7

<< :bob :knows :alice >> :certainty 0.8 .
<< << :bob :knows :alice >> :certainty 0.8 >> :claims :ted .

YAML-LD-star example 7

  $id:
    $id: bob
    knows: {$id: alice}
  certainty: {
    $value: 0.8,
    $annotation: {claims: ted}}
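To make the mapping between the nested `$id` form and Turtle-star concrete, here is a minimal Python sketch. It is only an illustration, not a proposed algorithm: the function names `term` and `triple` are hypothetical, every plain string is assumed to be a relative IRI in the default namespace, each node is assumed to carry exactly one predicate, and `$annotation` is not handled at all.

```python
def term(value):
    """Render a subject or object as a Turtle-star term (sketch only)."""
    if isinstance(value, dict):
        # A node with only "$id" is a plain node reference.
        if set(value) == {"$id"}:
            return term(value["$id"])
        # Any other mapping is treated as an embedded triple.
        return "<< " + triple(value) + " >>"
    if isinstance(value, str):
        # Assumption: strings are IRIs in the default ":" namespace.
        return ":" + value
    # Numbers are written as-is.
    return str(value)


def triple(node):
    """Render a node with "$id" and exactly one other key as one triple."""
    subject = term(node["$id"])
    predicates = [key for key in node if key != "$id"]
    if len(predicates) != 1:
        raise ValueError("this sketch handles exactly one predicate per node")
    predicate = predicates[0]
    return f"{subject} :{predicate} {term(node[predicate])}"


# YAML-LD-star example 3, written as the Python structure a YAML parser
# would produce for it.
example_3 = {"$id": {"$id": "bob", "age": 42}, "certainty": 0.8}
print(triple(example_3) + " .")
```

For example 3 this prints the same statement as Turtle-star example 3. A real implementation would go through the JSON-LD-star processing algorithms rather than string formatting.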

But there is a more YAMLish way to represent these examples:
In YAML, keys may be arbitrary nodes, so example 3 can be represented like this in flow style:
YAML-LD example 3F:

{$id: bob, age: 42}: {certainty: 0.8}

And like this in block style:
YAML-LD example 3B:

? $id: bob
  age: 42
: certainty: 0.8

YAML-LD example 7F:

  {{$id: bob, knows: {$id: alice}}: {certainty: 0.8}}: {claims: ted}

Best Practice 2 and 3 not specific to YAML-LD

Question

BP2
While this is reasonable, I think it's not specific to YAML-LD and not related to YAML, and is thus off-topic.

Do not force users to author contexts
Instead, provide pre-built contexts that the user can reference by URL for a majority of common use cases

Best Practice 3: Use a default context

This is not specific to YAML-LD, and is thus off-topic.

Replace $-keywords with @-keywords

As an author of YAML-LD files … WHO
I want an ability to type keywords without quotes … WHAT
So that my authoring experience is better … WHY

Motivation

I believe the primary purpose of having a Linked Data format based on YAML is to simplify manual authoring of the linked data documents. This means that, in an information system, we could ask domain experts to write YAML documents to describe their knowledge.

YAML is much easier to write manually than JSON because it does not require as much syntactic noise. Normally, keys can be written without quoting at all:

date: 2022-05-29
title: This is my latest blog article

However, sooner or later the document author will have to define @type, @context, @language, or any other JSON-LD keyword; and then they have to remember that @ is a reserved character and that in such cases quoting is mandatory.

  • This is an edge case which requires cognitive effort,
  • And it requires extra time to type.

The potential author of YAML-LD documents is not necessarily a programmer; they might be a history student, an anthropologist, a biologist, a physicist.

Let's not make their life harder than it has to be.

Potential risks

  • The @ character might suddenly acquire a new meaning under the YAML specification. This is not a risk because, under my proposal, we will not have @ in our documents anyway; we will have $.
  • The $ character might suddenly acquire a new meaning under the YAML specification. I consider that unlikely: it has no special meaning now, and such a change would carry a great risk of breaking backwards compatibility, especially for those who combine YAML with JSON Schema.
  • YAML-LD keywords will intersect with JSON Schema keywords. I believe the sets of keywords in JSON-LD and JSON Schema do not intersect. For instance, JSON-LD does not have @schema or @ref, and JSON Schema does not have $context or $type. I am saying this based on brief web research; please do correct me if I am wrong.

Possible implementations

  • Not replace anything, as per #9: possible, but I have been writing YAML-LD documents for some time now and my subjective feeling is that the replacement has really improved my writing experience. Not a good argument though, as this is very much IMHO.
  • !yaml-ld tag proposed at #6: interesting, but for a non-programmer it will be just a nuisance, a piece of syntactic noise. I would propose that we prioritize making YAML-LD as concise as possible without losing its writeability and readability.

Proposal

Let us replace $ → @ and vice versa only for the particular keywords. For instance,

$schema: boo
$context: foo

will be converted into

{
  "$schema": "boo",
  "@context": "foo"
}

because @context is a JSON-LD keyword and @schema is not.

Thus, we will minimize the possibilities for conflict while still getting rid of the nasty quotes.
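A minimal Python sketch of this replacement, applied to already-parsed YAML data, could look as follows. The function name `dollar_to_at` is hypothetical. The keyword set is the JSON-LD 1.1 keyword list. Converting string values as well as keys (so that `{$type: $id}` works) is an assumption that goes beyond the example above.

```python
# JSON-LD 1.1 keywords, written without their "@" prefix.
JSONLD_KEYWORDS = frozenset({
    "base", "container", "context", "direction", "graph", "id", "import",
    "included", "index", "json", "language", "list", "nest", "none",
    "prefix", "propagate", "protected", "reverse", "set", "type", "value",
    "version", "vocab",
})


def _convert(token):
    """Turn "$keyword" into "@keyword"; leave everything else untouched."""
    if (
        isinstance(token, str)
        and token.startswith("$")
        and token[1:] in JSONLD_KEYWORDS
    ):
        return "@" + token[1:]
    return token


def dollar_to_at(node):
    """Recursively rewrite $-keywords in mappings, sequences and scalars."""
    if isinstance(node, dict):
        return {_convert(key): dollar_to_at(value) for key, value in node.items()}
    if isinstance(node, list):
        return [dollar_to_at(item) for item in node]
    return _convert(node)


print(dollar_to_at({"$schema": "boo", "$context": "foo"}))
```

Because `$schema` is not in the keyword set, it passes through unchanged, which is what keeps JSON Schema documents safe under this scheme.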

Upcoming teleconference

I sent out a Doodle to find a time for a CG teleconference. Since the main activity of the group right now is YAML-LD, it would be good for people working on this to be involved. Historically, teleconferences are an important way to drive consensus and prioritize work, so even if you haven't been involved in the past, please consider attending at least the first such meeting. Calls are typically on Zoom, with meeting minutes recorded on the #json-ld IRC channel on irc.w3.org. See https://www.w3.org/2018/json-ld-wg/Meetings/#teleconferences.

Also, if you want to help in actively creating the UCR and Specifications, please request access, as noted in #1.

cc/ @VladimirAlexiev, @anatoly-scherbakov, @tetron, @nichtich, @ioggstream, @cmungall,

Fragment identifier

As a user … WHO
I want a fragment identifier specification … WHAT
So that I can reference specific elements of a yaml-ld … WHY

Notes

  • yaml fragid processing is described here

Q0. Do you think that yaml processing is enough?
Q1. Does .jsonld support JSON Pointer or another referencing method?
Q2. Do we need a way to reference a specific resource via id, e.g. #id=<URI>? If so, this would be a YAML-LD-specific extension that generic YAML parsers might not support.
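To ground Q1, here is a small Python sketch of what JSON Pointer (RFC 6901) style referencing looks like when applied to the data structure obtained by parsing a document. It says nothing about whether JSON-LD or YAML-LD should adopt it; the function name `resolve_pointer` and the sample document are hypothetical.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed data (sketch)."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw in pointer[1:].split("/"):
        # RFC 6901 escaping: "~1" means "/", "~0" means "~" (in that order).
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Hypothetical parsed document, for illustration only.
document = {
    "@context": {"epcis": "https://ns.gs1.org/epcis/"},
    "items": [{"@id": "a"}, {"@id": "b"}],
}
print(resolve_pointer(document, "/items/1/@id"))
```

Note that such a pointer addresses a position in the document tree, not a resource by its identifier, which is the gap Q2 is asking about.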

YAML presentation ("cosmetic") controls

As an information architect.
When serializing YAML.
I want control over all YAML presentation ("cosmetic") features.
So that I can obtain a YAML representation that is most readable and usable for my case.

What "cosmetic" features do I mean:

  • optional header --- and footer ...
  • Number of spaces used to indent
  • Use of flow-style vs block-style for particular pieces of YAML
  • Ordering of keys
  • Alias names
  • Formatting of text blocks
  • String quoting
  • Use of escapes and code points in strings
  • Serialization of booleans
  • etc etc

How to list all controllable features systematically?

Here are the options of some serializers:

Most of these are for Perl, could you please add links to serializers in other languages?

Maybe we should also turn to linters. https://megalinter.github.io/latest/descriptors/yaml/ can use 3 YAML linters:

Finally, this specifically aims to fix presentation, but currently has a somewhat limited set of options

use regular expressions to resolve literals; URIs, CURIEs

As a data architect
I want YAML plain values to be recognized by regular expression
So that I don't have to explicitly tag them

Examples (Tagging @OR13 and @mgh128 who work with EPCIS data):

"@context": 
  epcis: https://ns.gs1.org/epcis/
issued: 2022-09-01
stringThatOnlyLooksLikeADate: !string 2022-09-01
homepage: https://example.org/foo
stringThatOnlyLooksLikeAUrl: !string https://example.org/foo
urlThatMayBeMisspelled: !anyURI hxxp:\\i-cannot spell,con/my home page
epcis:readPoint: urn:epc:id:sgln:952005385.011.ts4711
epcis:epcList: https://id.gs1.org/01/70614141123451/21/2018
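As a rough illustration of the idea, the following Python sketch classifies untagged plain scalars with regular expressions. The two patterns and the function name `resolve_plain_scalar` are assumptions made up for this example; they are deliberately simplistic and are not proposed as the actual rules.

```python
import re

# Assumed patterns, for illustration only.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")          # e.g. 2022-09-01
IRI_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:\S+")  # scheme ":" rest, no spaces


def resolve_plain_scalar(text):
    """Guess how an untagged plain scalar should be interpreted."""
    if DATE_RE.fullmatch(text):
        return ("xsd:date", text)
    if IRI_RE.fullmatch(text):
        return ("@id", text)
    return ("xsd:string", text)


for value in (
    "2022-09-01",
    "https://example.org/foo",
    "urn:epc:id:sgln:952005385.011.ts4711",
    "This is my latest blog article",
):
    print(value, "->", resolve_plain_scalar(value)[0])
```

The misspelled URL from the example above contains spaces, so it would fall through to a plain string here; an explicit tag such as !anyURI would still be needed for it.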

Note: the benefit of datatype xsd:anyURI (tag !anyURI) is that:

  • the URL may be misspelled, because it is stored as a literal and semantic repos (at least rdf4j) don't check its syntax. How's that a benefit? Try to import a million CrunchBase "homepage" props and you'll see
  • OWL ontologies frown upon ObjectProperties that don't lead to any triples (eg rdf:type owl:NamedIndividual and some others)

We could also use explicit delimiters eg <...> around URNs (URIs, IRIs), which will also enable the use of CURIEs.
Eg below each of the props epcis:readPoint, epcis:epcList has 2 identical values (first a full URN, then a CURIE),
without having to declare that these are @id properties:

"@context": 
  epcis: https://ns.gs1.org/epcis/
  gtin: https://id.gs1.org/01/
  sgln: "urn:epc:id:sgln:"
epcis:readPoint: 
  - <urn:epc:id:sgln:952005385.011.ts4711>
  - <sgln:952005385.011.ts4711>
epcis:epcList:
  - <https://id.gs1.org/01/70614141123451/21/2018>
  - <gtin:70614141123451/21/2018>

The CURIE spec says

rules for disambiguation in situations where the same string could be interpreted as either a CURIE or an IRI.

I've seen this once in practice:

  • geo:lat (prefix for eg the WGS ontology), vs
  • <geo:1.23,4.56> (point using the geo: URI scheme, so the above prefix if used in a context precludes you from using this scheme)

One way to do this is to require that all CURIEs be expressed as Safe_CURIEs, implying that all unbracketed strings are to be interpreted directly as IRIs.

Safe_CURIEs use [...] delimiters, so we could rewrite the above example as follows:

"@context": 
  epcis: https://ns.gs1.org/epcis/
  gtin: https://id.gs1.org/01/
  sgln: "urn:epc:id:sgln:"
epcis:readPoint: 
  - urn:epc:id:sgln:952005385.011.ts4711
  - [sgln:952005385.011.ts4711]
epcis:epcList:
  - https://id.gs1.org/01/70614141123451/21/2018
  - [gtin:70614141123451/21/2018]
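The Safe_CURIE rule can be sketched in a few lines of Python, operating on string values. This assumes the values have somehow reached the processor as strings (which is exactly what the flow-style ambiguity makes difficult); the function name `expand` and the error handling are hypothetical.

```python
def expand(value, prefixes):
    """Expand a Safe_CURIE "[prefix:reference]"; treat anything else as an IRI."""
    if value.startswith("[") and value.endswith("]"):
        prefix, separator, reference = value[1:-1].partition(":")
        if not separator or prefix not in prefixes:
            raise ValueError(f"cannot expand Safe_CURIE {value!r}")
        return prefixes[prefix] + reference
    # Unbracketed strings are interpreted directly as IRIs.
    return value


prefixes = {
    "epcis": "https://ns.gs1.org/epcis/",
    "gtin": "https://id.gs1.org/01/",
    "sgln": "urn:epc:id:sgln:",
}
print(expand("[sgln:952005385.011.ts4711]", prefixes))
print(expand("urn:epc:id:sgln:952005385.011.ts4711", prefixes))
```

Under this rule, an unbracketed value such as geo:1.23,4.56 is never mistaken for a CURIE, even if a geo prefix is declared in the context.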

Here we avoid the need for any delimiters in URNs, but the brackets can be confused for "array in flow style":

epcis:epcList: [ https://id.gs1.org/01/70614141123451/21/2018, [gtin:70614141123451/21/2018] ]

We'd need extra spaces around the array brackets, and some damn specialized YAML parsers to grok this.

@ioggstream @gkellogg @anatoly-scherbakov

  • Do you think this is useful, or on the contrary, it is dangerous?
  • Should we define an "extended RDF schema" with such regexes, or stay clear of it?
  • Which variants for expressing URI/CURIE do you like?

File extension

JSON-LD uses .jsonld as the (recommended) file extension as can be seen from the IANA considerations and elsewhere in the specification. So I would like to bring up an obvious question (because I am selecting the "main" file extension for a project):

What will be the (recommended) file extension for YAML-LD?

I suggest .yamlld for consistency and the "no surprises" principle.

Previous materials on the topic:

  • @gkellogg used .ymld in an example in this comment: #8 (comment)
  • #14 has used just plain .yaml.
    • (The upcoming IANA media type application also apparently has to define a file extension?)

Add section: How to read this document

The JSON-LD specification contains a section "How to Read this Document".

That section defines two matters that I think are relevant in this stage of specification development:

  • The audience of the document.
  • Prerequisites to understand the specification.

I think especially the prerequisites are important. JSON-LD requires familiarity with JSON.

Do we require familiarity e.g. with

  1. YAML
  2. JSON-LD

At least the current draft of the YAML-LD specification seems to require familiarity with JSON-LD. I'm not sure if this has been discussed and if it is wise to require familiarity with JSON-LD to read the YAML-LD spec.

For reference, the JSON-LD specification does not require knowledge of RDF, as explained in its introduction: "JSON-LD is designed to be usable directly as JSON, with no knowledge of RDF [RDF11-CONCEPTS]."

Any opinions? Perhaps discuss this in the call?

add section outlining conversion to Internal Representation

The primary issue is that there seems to be no normative specification for how to turn YAML into JSON, or more importantly, into the internal representation shared between YAML, JSON, and other systems. If one can be found, it can be referenced; otherwise, this section would describe this transformation.

due 27 Jul 2022 (@gkellogg)

YAML-LD context and frame

A YAML-LD Context defines the conversion of a YAML-LD document to RDF.
It should include:

  • same as JSON-LD context, but represented in YAML, maybe using a more convenient char than @ (#9)
  • definitions of datatype tags (#17)
  • TODO anything else?

A YAML-LD Frame, together with the context, defines the serialization of some RDF data to YAML-LD.
It should include:

  • same as JSON-LD frame
  • maybe some anchor-related options, to handle shared and cyclic structures (#13)
  • presentation options (#42)

Use tags to distinguish "plain" YAML-LD from "idiomatic" YAML-LD

By "plain" YAML-LD, I mean "YAML-LD documents that can be interpreted as Linked Data by simply converting them to JSON, then processing them with a standard JSON-LD processor".

By "idiomatic" YAML-LD, I mean "YAML-LD documents that are easier to author, but require some specific processing steps to be interpreted as Linked Data". An example of such an additional processing step would be the conversion of $-keywords into standard @-keywords (as discussed in #3).

Tags are a feature of YAML that has no correspondence in JSON, so their presence mechanically requires additional processing. I suggest that

  • YAML-LD documents are considered "plain" if they don't use any local tag (beyond the standard tags used to specify JSON types, of course);
  • we define some tag(s) to signal "idiomatic" YAML-LD.

The first proposal I made in that direction is too error-prone. However, I think the general design principle deserves to be explored further.

Multiple documents in YAML

Should YAML-LD allow or prohibit multiple documents in YAML?

  • Which YAML parsers support multiple documents?
  • What are useful examples of using multiple documents?
  • If we decide to use them in YAML-LD, how should they be represented? As RDF graphs?
  • Below I formulate a positive use case, but I'm not quite certain we want this because of its complexity

PLEASE VOTE with 👍 or 👎, thanks!


Eg1: multiple identical keys are forbidden by YAML linters.
But they are ok if they are in different documents.
Example by @ioggstream from #42 (comment):

---
a: 1
...
---
a: 2
...

Eg2: YAML metadata followed by a markdown textual body is widely used in some blog/content management systems:

---
created: 2022-07-03
published: 2022-07-04
title: Frobnification
author: A. U. Thor
...
Frobnification was invented in prehistoric times.
It's a useful meta-process wherein...
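For Eg2, the usual way such systems separate the metadata from the body can be sketched in plain Python as below. This is a naive, line-based illustration: the function name `split_front_matter` is hypothetical, both `...` and `---` are assumed to be accepted as the closing marker, and the YAML part is returned as text for a real YAML parser to handle.

```python
def split_front_matter(text):
    """Split "---" delimited YAML front matter from the body that follows.

    Returns (yaml_text, body), or (None, text) if there is no front matter.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for index in range(1, len(lines)):
        # Assumption: either document-end "..." or a new "---" closes the block.
        if lines[index].strip() in ("...", "---"):
            metadata = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return metadata, body
    return None, text


sample = (
    "---\n"
    "created: 2022-07-03\n"
    "title: Frobnification\n"
    "...\n"
    "Frobnification was invented in prehistoric times.\n"
)
metadata, body = split_front_matter(sample)
print(metadata)
print(body)
```

Strictly speaking, the text after the end marker is not a second YAML document, so this pattern is a convention layered on top of YAML rather than a use of YAML multi-document streams.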

As an information architect.
I want to be able to use multiple documents in YAML-LD.
So that I can transmit several closely related documents (graphs) together.
