w3c / rdf-star

RDF-star specification

Home Page: https://w3c.github.io/rdf-star/

License: Other

HTML 98.95% Shell 0.01% CSS 0.03% Makefile 0.01% Python 0.27% Ruby 0.14% Haml 0.60%

rdf-star's Introduction

RDF-star

RDF-star has now been taken over by the RDF-star working group.

The homepage of the RDF-star community group effort is https://w3c.github.io/rdf-star/

rdf-star's Issues

Interplay with named graphs?

IMHO the spec should clarify the interplay of RDF-star triples with named graphs.

I always thought that RDF-star triples are very similar to singleton graphs. E.g., aren't these similar?

:Alice :believes _:x.
_:x {:Bob :loves :Mary}

and

:Alice :believes << :Bob :loves :Mary>>

@gkellogg wrote in #16 "specify if a quoted triple used with a named graph is asserted to be in the same named graph, or in the default graph"

CSV and TSV results

As a user,
I want to be able to retrieve SPARQL results as CSV or TSV
So that I can use different toolchains to analyze the results.

There are proposals (#43) for extending SPARQL results in JSON and XML for RDF*. However, SPARQL also defines results in CSV and TSV formats.

Both formats allow a quoted field to contain delimiters (and quotes), so they can be used to express embedded triple results as well.

Consider the "bob-bind" query:

PREFIX : <http://bigdata.com/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?a ?b ?c WHERE {
   ?bob foaf:name "Bob" .
   BIND( <<?bob foaf:age ?age>> AS ?a ) .
   ?a ?b ?c .
}

when run against:

@prefix : <http://bigdata.com/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:  <http://example.org/> .

:bob foaf:name "Bob" .
<<:bob foaf:age 23>> <http://example.org/certainty> 0.9 .

In JSON results, this would produce the following:

{
  "head": {"vars": ["a","b","c"]},
  "results": {
    "bindings": [
      {
        "a": {
          "type": "triple",
          "value": {
            "subject": {
              "type": "uri",
              "value": "http://bigdata.com/bob"
            },
            "predicate": {
              "type": "uri",
              "value": "http://xmlns.com/foaf/0.1/age"
            },
            "object": {
              "type": "typed-literal",
              "datatype": "http://www.w3.org/2001/XMLSchema#integer",
              "value": "23"
            }
          }
        },
        "b": {
          "type": "uri",
          "value": "http://example.org/certainty"
        },
        "c": {
          "type": "typed-literal",
          "datatype": "http://www.w3.org/2001/XMLSchema#decimal",
          "value": "0.9"
        }
      }
    ]
  }
}

In CSV, this might produce the following:

a,b,c
"http://bigdata.com/bob,http://xmlns.com/foaf/0.1/age,23",http://example.org/certainty,0.9

This requires that a client detect that the cell content is, itself, in CSV form, and interpret it as subject,predicate,object.
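The client-side detection described above could be sketched as follows. This is only an illustration using Python's standard csv module: the function name read_results and the heuristic "a cell that parses as exactly three CSV fields is a triple" are assumptions, not part of any proposal in this issue.

```python
import csv
import io

def read_results(text):
    """Parse SPARQL CSV results; cells holding exactly 3 CSV fields become (s, p, o) tuples.

    The 3-field test is a heuristic assumed for this sketch only.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        parsed = {}
        for var, cell in row.items():
            # Re-parse the cell content as a CSV record of its own.
            inner = next(csv.reader(io.StringIO(cell)), [])
            parsed[var] = tuple(inner) if len(inner) == 3 else cell
        rows.append(parsed)
    return rows

SAMPLE = (
    "a,b,c\n"
    '"http://bigdata.com/bob,http://xmlns.com/foaf/0.1/age,23",'
    "http://example.org/certainty,0.9\n"
)

if __name__ == "__main__":
    for row in read_results(SAMPLE):
        print(row)
```

Note that the heuristic cannot tell an embedded triple from a plain string literal that happens to contain two commas, which is one reason an explicit marker may be preferable.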

The TSV form could provide a datatype to the string-encoded embedded TSV for spo, to make it more explicit.

?a\t?b\t?c
"<http://bigdata.com/bob>\t<http://xmlns.com/foaf/0.1/age>\t23"	<http://example.org/certainty>	0.9

Do you need referential opacity?

This issue is intended as a strawpoll to determine if referential opacity is a required feature of RDF*, because it has several implications, especially on the ability or not to "encode" RDF* in standard RDF.

The problem

The question boils down to this: does the RDF* triple :alice :says << :Paris :population 2229621>> mean

(1) Alice says that Paris has a population of 2229621.

(referential transparency) or

(2) Alice says “Paris has a population of 2229621”.

(referential opacity)?

From (1), it would be acceptable to infer

(3) Alice says that the capital of France has a population of 2229621.

if we know that Paris = capital of France.
However, from (2), it would not be acceptable to infer

(4) Alice says “The capital of France has a population of 2229621”

because

those are not the terms used by Alice, and
we don't even know if Alice knows that Paris is the capital of France, so she could endorse the quoted sentence in (2) but not the one in (4).
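The difference between the two readings can be illustrated with a toy substitution. This is a rough sketch, not the draft's formal semantics: the Quoted class, the rewrite functions and the IRI :CapitalOfFrance are all hypothetical names introduced for the illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quoted:
    """A quoted (embedded) triple used as a term."""
    s: object
    p: object
    o: object

def rewrite(term, same_as, transparent):
    """Replace a term by its known co-referent.

    Quoted triples are only descended into under the transparent reading.
    """
    if isinstance(term, Quoted):
        if not transparent:
            return term  # opaque: the quoted terms stay exactly as written
        return Quoted(*(rewrite(t, same_as, transparent)
                        for t in (term.s, term.p, term.o)))
    return same_as.get(term, term)

def rewrite_triple(triple, same_as, transparent):
    return tuple(rewrite(t, same_as, transparent) for t in triple)

# Assumed background knowledge: Paris = capital of France.
same_as = {":Paris": ":CapitalOfFrance"}
stmt = (":alice", ":says", Quoted(":Paris", ":population", 2229621))

if __name__ == "__main__":
    print(rewrite_triple(stmt, same_as, transparent=True))   # corresponds to (3)
    print(rewrite_triple(stmt, same_as, transparent=False))  # (2) is left untouched
```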

Rationale of the current draft

The semantics in the current draft supports referential opacity. This choice was made because referential opacity is required by some use-cases such as

representing knowledge and beliefs (:alice :knows << ... >>);
representing provenance (:alice :said << ... >>)

Furthermore, from sentence (2) above, sentence (1) can be reconstructed if needed. The opposite is not true, as sentence (1) does not convey which precise terms were used by Alice.

Strawpoll

Please vote with emojis on this issue:

👍 yes, we need referential opacity
👎 no, we don't need it
👀 no strong opinion either way

Section 1.2 links to 1999 RDF spec to explain reification

Is there a specific reason that section 1.2 links the term "reification" to the "Resource Description Framework (RDF) Model and Syntax Specification" W3C Recommendation from 22 February 1999? The RDF 1.1 Semantics from 2014 is the currently authoritative spec, the RDF 1.0 Primer from 2004 features an IMO very accessible introduction.
IIUC the 1999 spec doesn't differentiate between triple type and occurrence whereas the later specs do explain the difference in detail. To properly describe the differences of reification in RDF* and RDF the current spec and its direct predecessor from 2004 seem more appropriate.

Embedded Quads: Should RDF* allow terms to be included in the graph position?

The purpose of this issue is to consider what are the merits and what are the drawbacks of allowing a term to appear in the graph position of an embedded triple. In other words, should RDF* deal explicitly in triples, or should embedded triples be generalized to embedded quads?

For example:

<< :a :b :c :graph >> :p1 :o1 .

Now I can certainly imagine legitimate use cases for embedded quads, but have not dedicated the time to really think this over. So perhaps instead, I will just open the discussion with some basic implications and others can chime in whether they see utility or hindrance:

if embedded quads generalize embedded triples, then the semantics of an embedded triple where the graph component is missing needs to be defined. Presumably this would inherit the graph of the surrounding context and fall back to the "DefaultGraph" when the context is not explicit.
think about SPARQL* results formats, "type": "quad", ... and <quad>...</quad> rather than "type": "triple", ... and <triple>...</triple> for all embedded statements?
the equivalent RDF reification does not have a notion of graphs, i.e., an rdf:Statement has no such rdf:graph relation. This seems to set precedent against allowing a graph component.

UniProt: attributed/evidenced triples

As a UniProt developer, I want an easier way to talk about triples we have asserted, so that querying and parsing data models from evidenced triples becomes simpler.

<P26948> up:annotation <P26948#SIPDB6A831D8E2E2D2A> .
<#_kb.P26948_up.annotation_A144DC8D56EA0928> rdf:type rdf:Statement ;
  rdf:subject <P26948> ;
  rdf:predicate up:annotation ;
  rdf:object <P26948#SIPDB6A831D8E2E2D2A> ;
  up:attribution <P26948#attribution-89AC1B682EEB440D50C4AEBB24FCA860> .

That is a lot of bytes to type, and, even worse, the five joins are very expensive to perform.

Our use-case for RDFstar etc. is to allow us to talk about triples as we do now, but with a higher performance and lower barrier to entry.

We currently depend on the RDF/XML rdf:ID to easily parse this in our in-house custom RDF parsers, and would like to keep this option open.

At the same time, we deal with a lot of renaming (multiple IRIs for the same thing). For example, a related database might use http://identifiers.org/uniprot/P05067 instead of http://purl.uniprot.org/uniprot/P05067, and an owl:sameAs is used to merge these datasets. Our attributions/evidences should be found no matter which IRI is used.

distinguish interpretation from representation

As a data service, from an architectural perspective, interpretation should be distinct from and cleanly layered over representation - both abstract and concrete, in order to facilitate implementation and assure the ability to support future use cases as knowledge evolves over time.

From this perspective, the standard treatment of blank node labels is an anti-requirement, the lessons of which should dissuade from unnecessarily restrictive relations between interpretation and representation.

The situation with blank node labels is a case where the interpretation was fixed: they are treated as referentially opaque. That was a mistake. It would have been better to separate the interpretation from the representation and to allow alternatives.

https://lists.w3.org/Archives/Public/semantic-web/2009Nov/0040.html
https://www.slideshare.net/PatHayes/blogic-iswc-2009-invited-talk
http://videolectures.net/iswc09_hayes_blogic/ (which, just now, did not work)

SPARQL* pattern with only a reified statement

SPARQL* query question: Suppose I have the reified triples

:r1 rdf:subject :S1; rdf:predicate :P1; rdf:object :O1 .
:r2 rdf:subject :S2; rdf:predicate :P2; rdf:object :O2; a rdf:Statement .

and I want to discover the reified triples. I believe the following SPARQL* graph pattern would be syntactically incorrect:

select * { << :S1 :P1 ?O >> }

However this would be syntactically correct:

select * { << :S2 :P2 ?O >> a rdf:Statement }

So this means that :O1 cannot be matched using SPARQL*, but it can using SPARQL:

select * {:S1 ^rdf:subject [ rdf:predicate :P1; rdf:object ?o ] }

FIND instead of BIND

Instead of reusing the keyword BIND for SPARQL* (as in my original proposal), we may want to consider using a different keyword for this functionality because the behavior is a bit different. For instance, @klinovp has mentioned this issue in an email on the mailing list. In another email, @afs has proposed to use the keyword FIND instead.

Create a test suite

for the semantics (what RDF* graph should be entailed, under which entailment regime)
include the README of the test suite in the manifest.html
for Turtle*
for SPARQL* syntax (query, update)
for SPARQL* evaluation (query, update)

https://w3c.github.io/rdf-star/Minutes/2020-11-20.html#ActionSummary

New mime types for RDF-star serializations (inc. SPARQL results)

In addition to defining the extended formats for serializing the result of a SPARQL* SELECT query (#12 and #13), we have to decide whether we need/want new mime types for these extended formats? Similarly, do we need/want to introduce another namespace for the extended XML result format?

Add a standardized extension of SPARQL Query Results XML format

The result of a SPARQL SELECT query is serialized in XML using the SPARQL Query Results XML format. This format will need to be extended to deal with the RDF* triple being a new possible value type for a binding. For example, the result of a query where variable ?a is bound to an RDF* triple:

?a	?b	?c
`<<<http://example.org/bob> <http://xmlns.com/foaf/0.1/age> 23>>`	`<http://example.org/certainty>`	`0.9`

Currently, different implementations all have their own, slightly diverging, extensions. For example, in Eclipse RDF4J, the extension looks as follows:

<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
	<head>
		<variable name='a'/>
		<variable name='b'/>
		<variable name='c'/>
	</head>
	<results>
		<result>
			<binding name='a'>
				<triple>
					<subject>
						<uri>http://example.org/bob</uri>
					</subject>
					<predicate>
						<uri>http://xmlns.com/foaf/0.1/age</uri>
					</predicate>
					<object>
						<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
					</object>
				</triple>
			</binding>
			<binding name='b'>
				<uri>http://example.org/certainty</uri>
			</binding>
			<binding name='c'>
				<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
			</binding>
		</result>
	</results>
</sparql>

In Apache Jena, the extension is almost identical, except for the choice to name the middle element property (where RDF4J uses predicate):

<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
	<head>
		<variable name='a'/>
		<variable name='b'/>
		<variable name='c'/>
	</head>
	<results>
		<result>
			<binding name='a'>
				<triple>
					<subject>
						<uri>http://example.org/bob</uri>
					</subject>
					<property>
						<uri>http://xmlns.com/foaf/0.1/age</uri>
					</property>
					<object>
						<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
					</object>
				</triple>
			</binding>
			<binding name='b'>
				<uri>http://example.org/certainty</uri>
			</binding>
			<binding name='c'>
				<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
			</binding>
		</result>
	</results>
</sparql>

In Stardog, the implemented extension currently looks as follows:

<?xml version='1.0' encoding='UTF-8'?>
<sparql xmlns='http://www.w3.org/2005/sparql-results#'>
	<head>
		<variable name='a'/>
		<variable name='b'/>
		<variable name='c'/>
	</head>
	<results>
		<result>
			<binding name='a'>
				<statement>
					<s>
						<uri>http://example.org/bob</uri>
					</s>
					<p>
						<uri>http://xmlns.com/foaf/0.1/age</uri>
					</p>
					<o>
						<literal datatype='http://www.w3.org/2001/XMLSchema#integer'>23</literal>
					</o>
				</statement>
			</binding>
			<binding name='b'>
				<uri>http://example.org/certainty</uri>
			</binding>
			<binding name='c'>
				<literal datatype='http://www.w3.org/2001/XMLSchema#decimal'>0.9</literal>
			</binding>
		</result>
	</results>
</sparql>

In summary, RDF4J, Jena and Stardog all differ in the element names used (triple vs statement, property vs predicate, subject vs s, etc).

Other implementations may have yet other, slightly deviant variants. This makes it difficult to process query results from different endpoint implementations. A single recommended extension would be a benefit for parser implementors and users alike.
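Until a single extension is recommended, a client has to accept all three element vocabularies. A minimal sketch of such a tolerant parser, using only Python's standard xml.etree.ElementTree, might look like this. The function names and the tuple layout returned are assumptions made for the illustration; only the element names come from the three examples above.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2005/sparql-results#}"

# Element names seen in the RDF4J, Jena and Stardog examples.
TRIPLE_TAGS = {"triple", "statement"}
ROLE = {
    "subject": "s", "s": "s",
    "predicate": "p", "property": "p", "p": "p",
    "object": "o", "o": "o",
}

def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.split("}", 1)[-1]

def parse_term(elem):
    """Turn a <uri>, <bnode>, <literal>, <triple> or <statement> element into a tuple."""
    name = local(elem.tag)
    if name in TRIPLE_TAGS:
        parts = {}
        for child in elem:
            # Each role element wraps exactly one term element.
            parts[ROLE[local(child.tag)]] = parse_term(child[0])
        return ("triple", parts["s"], parts["p"], parts["o"])
    if name == "literal":
        return ("literal", elem.text, elem.get("datatype"))
    return (name, elem.text)  # uri or bnode

def parse_bindings(xml_text):
    """Return one dict per <result>, mapping variable names to parsed terms."""
    root = ET.fromstring(xml_text)
    return [
        {b.get("name"): parse_term(b[0]) for b in result.findall(NS + "binding")}
        for result in root.iter(NS + "result")
    ]
```

With this, the three documents shown above yield the same Python value for ?a, which is the interoperability a recommended extension would give for free.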

Section that defines SPARQL* Update

We need a section that defines SPARQL* Update. The text for this section can be taken from the following document: https://blog.liu.se/olafhartig/documents/sparql-update/

submit an issue on embedded quads

@blake-regalia
https://w3c.github.io/rdf-star/Minutes/2020-11-27.html#ActionSummary

Filename extension for Turtle* files and for SPARQL* files

I propose to use the file name extension ttls for Turtle* files and, for SPARQL* files, either rqs or rsq.

TODOs in this context:

agree on which file extensions to use
adapt the test suites to use these extensions
extend the spec draft to mention this suggestion

May an embedded triple be a graph name in SPARQLstar ?

Would the following be allowed?

SELECT *
WHERE {
  GRAPH << ex:s ex:o ex:p >> {
   ?s ?p ?o .
 }
}

SELECT *
FROM << ex:s ex:o ex:p >> 
WHERE {
   ?s ?p ?o .
}

Include Annotation syntax in Turtle* and SPARQL*

This has already been discussed on the mailing list.

The idea would be to have a notation like

:bob :age 42 {| :source <http://example.org/~bob/> |}.

as shortcut for

:bob :age 42.
<< :bob :age 42 >> :source <http://example.org/~bob/>.
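The expansion of the shortcut can be sketched as a small string-level helper. This is only an illustration of the rewriting idea: expand_annotation is a hypothetical name, and the terms are assumed to be already serialized Turtle* strings.

```python
def expand_annotation(s, p, o, annotations):
    """Expand `s p o {| ap ao; ... |} .` into the asserted triple plus one
    annotation statement per (ap, ao) pair."""
    lines = [f"{s} {p} {o} ."]
    for ap, ao in annotations:
        lines.append(f"<< {s} {p} {o} >> {ap} {ao} .")
    return lines

if __name__ == "__main__":
    for line in expand_annotation(":bob", ":age", "42",
                                  [(":source", "<http://example.org/~bob/>")]):
        print(line)
```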

Natural representation of PG data in RDF

As a KG vendor, we want Stardog customers to have easy-to-use means to attach properties to edges in their RDF graph or load property graph data with edge properties. Here "easy" specifically means that neither the customer nor the database should have to wreck the data model (and queries) to use any of the workarounds available in plain RDF for that purpose (like the RDF reification). For example, if the customer has :pavel :worksAt :Stardog edge in the data and wants to add ... :since 2011 to it, neither they nor the database should have to transform it into a bunch of different triples like [] rdf:subject :pavel ; rdf:predicate :worksAt ... (and then also rewrite queries so that ?s :worksAt :Stardog still returns :pavel).

Also we want to enable customers to store that annotated statement in any named graph they want so we don't want to use named graphs for representing statement-level metadata.

Mapping between Triples and IRIs

There has been some discussion about "long URIs" to represent embedded triples in a backwards-compatible way. If we go down this road, we need to decide on a syntax for this mapping. The mapping should be bi-directional so that systems can parse URIs back to triples if needed. Ideally, the URIs should be as short as possible and be reasonably human-readable in case someone encounters them through a "leak".

PROPOSAL:
Given a triple S, P, O produce an IRI using the template urn:triple:${encode(S)}:${encode(P)}:${encode(O)} where the encode(N) function is (JavaScript) encodeURIComponent(ttl(N)) and ttl(N) is the Turtle serialization of N, without using prefixes but using absolute IRIs only. Blank nodes would become _:ID where ID is some internal ID that the current system uses (e.g. the Jena blank node label). See the sections including https://www.w3.org/TR/turtle/#sec-iri. For literals, the available short forms need to be used, e.g. "1"^^xsd:integer becomes 1, see https://www.w3.org/TR/turtle/#literals
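A rough Python sketch of this template and its inverse follows. It assumes the three terms are already given as their Turtle serializations (the ttl(N) step is not implemented here), and it approximates JavaScript's encodeURIComponent with urllib.parse.quote; the function names are made up for the illustration.

```python
from urllib.parse import quote, unquote

PREFIX = "urn:triple:"

def encode(term):
    """Approximate JavaScript's encodeURIComponent, which leaves
    A-Z a-z 0-9 - _ . ! ~ * ' ( ) unescaped and percent-encodes the rest as UTF-8."""
    return quote(term, safe="!~*'()")

def triple_to_iri(s, p, o):
    """Build the IRI from the Turtle serializations of subject, predicate, object."""
    return PREFIX + ":".join(encode(t) for t in (s, p, o))

def iri_to_triple(iri):
    """Inverse mapping. Splitting on ':' is safe because every ':' inside a
    term was percent-encoded as %3A."""
    if not iri.startswith(PREFIX):
        raise ValueError("not a urn:triple: IRI")
    parts = iri[len(PREFIX):].split(":")
    if len(parts) != 3:
        raise ValueError("expected exactly three encoded components")
    return tuple(unquote(part) for part in parts)

if __name__ == "__main__":
    iri = triple_to_iri("<http://example.org/bob>",
                        "<http://xmlns.com/foaf/0.1/age>", "23")
    print(iri)
    print(iri_to_triple(iri))
```

The round trip only recovers the Turtle strings; turning them back into terms would still need a Turtle term parser.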

We might want to use 'a' for rdf:type as there is a large number of triples of this form, but I have no strong opinion on that. Potentially the system could also rely on a number of hard-coded "well-known" prefixes such as rdf, owl, sh, skos. This would further shorten the URIs in case the implementation has them occupy memory.

See http://datashapes.org/reification.html#uriReification for an earlier version that is currently implemented in TopBraid. I have since convinced myself that relying on locally defined prefixes (per file) is not desirable, as prefixes may change and then these identifiers break.

Is it possible for an RDF* triple to be its own subject or object?

In other words, can we represent in the RDF* abstract syntax (I know we can't in Turtle*) something like

"This sentence was stated by Pierre-Antoine."

More formally, the definition http://ceur-ws.org/Vol-1912/paper12.pdf states that if t is an RDF* triple, p is an IRI and o a literal, then (t p o) is an RDF* triple. But then, t could be (t p o) itself...

Of course, the question extends to "indirect containment", like t = ((t p o) p' o')...

@hartig I have a strong feeling that this was not your intention, but I'd rather check with you, before I make an explicit note in the document that this is not allowed...

Activate GitHub Wiki

That would help to compile links and other material such as pro and contra from the mailing list.

SPARQL/RDF and RDFS support

I see from other posts discussion of 'standard' reification, which I assume to be that defined in rdf, viz
rdf:Statement rdf:subject, rdf:predicate, rdf:object
vs what I guess is non-standard reification using
rdf:Triple rdfx:subjectTerm, rdf:predicateTerm, rdf:objectTerm
and other variants.
My question is related to 'standard' reification and to what extent RDFS will be supported.
I use the equivalent of reification extensively in models. However I do create rdfs:subClassOf rdf:Statement and the corresponding rdfs:subPropertyOf rdf:subject, rdf:predicate, and rdf:object.

For example, when modeling a resource's attributes (aka observations, measurements), instead of the simple
:aResource :hasHeight "25"^^xsd:double
we need to reify this to capture further details such as units-of-measure, time-of-measurement, accuracy etc (yes Reification-101):

def:Attribute_1
  rdf:type def:Attribute ;
  def:attribute.of.Item id:Peter ;
  def:attribute.Property def:hasHeight ;
  def:attribute.Value "25"^^xsd:double ;
  
  def:attribute.UOM def:CM ;
  def:attribute.accuracy ".32"^^xsd:double ;
.

Where the following have been declared:

def:Attribute
  rdfs:subClassOf rdf:Statement 
.

def:attribute.of.Item
  rdfs:domain def:Attribute ;
  rdfs:subPropertyOf rdf:subject ;
.

def:attribute.Property
  rdfs:domain def:Attribute ;
  rdfs:subPropertyOf rdf:predicate ;
.

def:attribute.Value
  rdfs:domain def:Attribute ;
  rdfs:subPropertyOf rdf:object ;
.

I guess one could use only 'standard' reification, but this loses the disambiguation that RDFS offers:

def:Attribute_1
  rdf:type rdf:Statement ;
  rdf:predicate def:hasHeight ;
  rdf:subject id:Peter ;
  rdf:object "25"^^xsd:double ;
  
  def:attribute.UOM def:CM ;
  def:attribute.accuracy ".32"^^xsd:double ;
.

I realise that SPARQL alone does not recognize RDFS, but the implications (I think) of <<...>> and standard reification is that rdf:subject, rdf:predicate, and rdf:object have rdfs:domain rdf:Statement. Following this same slippery slope, have rdfs:subClassOf and rdfs:subPropertyOf also been considered?

To illustrate this pattern's use, the following is the equivalent of select * {?s ?p ?o}

select  ?reificationType ?s ?p ?o
{
VALUES(?reificationType ?s ?p  ){ (UNDEF UNDEF UNDEF  )}
OPTIONAL{ ?subject rdfs:domain ?reificationType; rdfs:subPropertyOf rdf:subject  . ?reifiedRelationship  ?subject ?s .}
OPTIONAL{ ?predicate rdfs:domain ?reificationType  ; rdfs:subPropertyOf rdf:predicate  . ?reifiedRelationship  ?predicate ?p . }
OPTIONAL{ ?object  rdfs:domain ?reificationType  ; 	rdfs:subPropertyOf rdf:object  . ?reifiedRelationship  ?object ?o .}
}

For example

select  ?reificationType ?s ?p ?o
{
VALUES(?reificationType  ?s ?p  ){ (def:Attribute   id:Peter def:hasHeight  )}
OPTIONAL{ ?subject rdfs:domain ?reificationType; rdfs:subPropertyOf rdf:subject  . ?reifiedRelationship  ?subject ?s .}
OPTIONAL{ ?predicate rdfs:domain ?reificationType  ; rdfs:subPropertyOf rdf:predicate  . ?reifiedRelationship  ?predicate ?p . }
OPTIONAL{ ?object  rdfs:domain ?reificationType  ; 	rdfs:subPropertyOf rdf:object  . ?reifiedRelationship  ?object ?o .}
}

Why? In my experience this pattern of 'extended reification' appears over and over again in the models I have encountered, so it is useful for me to standardize on the pattern and reuse it whenever possible. Sometimes the pattern is not obvious. For example a purchase order's line item follows this pattern, as do many of the '3D/4D' modeling needs [http://hdl.handle.net/1854/LU-5721901]

RDF* Dataset

Should this CG extend the work to define an "RDF* dataset"?

Or is that only a collection of RDF* graphs?

Example:

For systems where the default graph is the union of named graphs, the named graphs form a means to manage data which is accessed as a union of all the data in the store.

But a subgraph of an RDF* graph may not be a legal RDF* graph (PG mode) if the annotation is separate from the target triple.

This may be too ambitious at this time because it gives rise to multiple independent annotations of the same triple.

Support Wikidata/Wikibase data model

As a member of the Wikidata community I would like to see triple stores supporting the Wikidata/Wikibase data model as much as possible, so that provenance information etc. can be represented in a way which is pleasing to the mind and software-systems such as Wikibase.

See:

One issue per Use Case

Please let us have one issue per Use Case instead of one issue for all Use Cases.

The comments in #29 are already becoming a mess.

MIME types and file extensions

Standardize MIME types and file extensions of rdf-star and sparql-star formats.

GraphDB and rdf4j do this: https://graphdb.ontotext.com/documentation/free/devhub/rdf-sparql-star.html#mime-types-and-file-extensions-for-rdf-in-rdf4j

@klinovp what does Stardog do?

@hartig How about Blazegraph?

Meta-properties over properties

As a data engineer,
I want to add meta-properties over properties, from an existing SQL-DB to a Graph-DB; that is for example
for a node of type PERSON, with a property :birthDate='1985-01-20', I want to be able to say
the property was created on '2020-11-10', by a user 'editor1' with the role 'writer'.

So that I do not need to transform the property into a relation plus yet another node.

Undefined notation in SPARQL* "Evaluation semantics"

@hartig wrote:

⟦B⟧_G is not defined anywhere here, and neither is ⟦(tp AS ?v)⟧_G

Actually, the first time this notation is used, it says

⟦B⟧_G (where ⟦B⟧_G is the evaluation of B over RDF* graph G)

where "evaluation" links to its definition, yet to be written in the ~~Algebra~~ Evaluation Semantics section.
So I consider this issue to be a duplicate of #4.

Decide on a policy for editorship/authorship

Although the document currently has two editors and two authors, nothing is fixed in that respect.
We have to decide collectively which criteria determine who is an editor and who is an author, and even whether we want to distinguish between the two.

Should RDF* be just syntactic sugar on top of RDF?

In other words, does RDF* need its own abstract syntax and semantics, or can it be "encoded" in standard RDF?

It largely depends on the answer to issue #22.

If we want embedded triples to be referentially transparent, then they can be internally represented using, e.g., standard reification or singleton properties.
If we want them to be referentially opaque, and forbid them to contain blank nodes, then they can be internally represented using specific IRIs (see #23).
Otherwise, IMO, we need to somehow extend RDF semantics. However, I still see two paths here:
a) either we promote RDF* triples as a new kind of terms, as done in the original papers and the current version of the report, or
b) we extend RDF's semantics with a built-in datatype for representing IRIs and literals, and we represent RDF* triples using an adapted form of reification.

To illustrate the last bullet:

<< :alice :age 26 >> :accordingTo :bob.

could be seen as syntactic sugar for

_:stmt :accordingTo :bob.
_:stmt
    rdfx:subjectTerm "<http://example.org/alice>"^^rdfx:term;
    rdfx:predicateTerm "<http://example.org/age>"^^rdfx:term;
    rdfx:objectTerm "\"26\"^^<http://www.w3.org/2001/XMLSchema#integer>"^^rdfx:term.

There are several reasons why I believe this modelling needs a small extension to RDF semantics, but I'll develop them if we come to a point where we consider this option seriously...

CG Report scope : document alternatives or only make one proposal?

Will the CG report include alternatives (e.g. SA vs PG and variations) or be a draft of a spec?

are embedded triples asserted?

It is still unclear to me whether embedded triples are also supposed to be asserted: the initial RDF* said yes, the current document says no, and some people in the group seem to be arguing for yes.

These sorts of foundational issues need to be cleared up early.

Turtle* syntax tests

This issue is for Turtle* syntax tests.

A first set is provided in PR #52.

summarizing syntax issues

What's in a name? RDF*, RDF+, RDF#, etc.

RDF* is an unsearchable string, but searches for RDF do bring it to light occasionally.

This evening, while searching for something else, I randomly discovered the existence of the RDF# proposal, which led me to the RDF+ proposal (which is no longer at the URL the RDF# paper linked to, this link takes you into the Internet Archive for it).

(I am not prepared to do a full analysis of either of these. I do wonder whether anyone else has done some comparison, and could add such to this repo...)

Unsurprisingly, neither of the others included a searchable version of their name (e.g., rdf-hash, rdf-plus). Being unsearchable might help explain how they remain apparently unpopular. It doesn't do anything to explain the popularity of RDF*, however, which I continue to believe is in desperate need of a renaming.

(What's the next one to be, RDF**? RDF*+#?)

N-Triples*

N-Triples fulfills the role of database dump format. As such it might be useful to define N-Triples* as exactly what is in the database, with only the << >> syntax (no annotation support as in issue #9) and without the automatic generation of the implied triple for << >>. This preserves the one line - one triple feature of N-Triples found in the wild.

This would also make it a format for writing tests for Turtle*.
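As a sketch of what such a dump could look like, the serializer below writes exactly the triples it is given, one per line, and nests the angle-bracket syntax for embedded triples without emitting any implied triple. The representation is an assumption for the example: a Python 3-tuple stands for an embedded triple, and every other term is taken to be an already valid N-Triples string.

```python
def nt_term(term):
    """Serialize one term; a 3-tuple is an embedded triple, written recursively."""
    if isinstance(term, tuple):
        s, p, o = term
        return f"<< {nt_term(s)} {nt_term(p)} {nt_term(o)} >>"
    return term  # assumed to be N-Triples text already

def nt_line(triple):
    """One triple, one line."""
    s, p, o = triple
    return f"{nt_term(s)} {nt_term(p)} {nt_term(o)} ."

def dump(triples):
    """Write only the triples passed in; nothing is added for embedded triples."""
    return "\n".join(nt_line(t) for t in triples)

if __name__ == "__main__":
    inner = ("<http://example.org/bob>",
             "<http://xmlns.com/foaf/0.1/age>",
             '"23"^^<http://www.w3.org/2001/XMLSchema#integer>')
    print(dump([(inner, "<http://example.org/certainty>",
                 '"0.9"^^<http://www.w3.org/2001/XMLSchema#decimal>')]))
```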

Property path patterns in SPARQL*

I have added the definition of a SPARQL* property path pattern into the draft just for the sake of having such a definition. We need to think about whether it is useful to add this to SPARQL*, in which case we need to define the semantics of such SPARQL* property path patterns.

In fact, no matter what we decide, even for standard property path patterns, the semantics may have to be extended to use them over RDF* graphs.

are embedded triples unique?

The uniqueness of embedded triples does not appear to me to be determined: the initial RDF* and the current documents say yes in their formal sections, but central examples have fatal flaws if embedded triples are unique.

These foundational issues should be cleared up early.

Do we need more things in the 'conformance' section?

For the moment, we only have the boilerplate text generated by respec.

Similarity use case

As a materials scientist working with ontologies,
I want to enrich relationships between objects. For example A --similar--> B with a similarity measure of 0.5,
So that I can identify the pairs of objects with relationships of certain "strength". Visualization of the edges with accordingly weighted thickness would be a huge plus.

SPARQL-star constructor and accessors

https://graphdb.ontotext.com/documentation/free/devhub/rdf-sparql-star.html

GraphDB introduces several new SPARQL functions for manipulating embedded triples

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT * WHERE {
    VALUES ?triple { <<:man :hasSpouse :woman>> }

    # Checks if the variable is of type embedded triple
    BIND (rdf:isTriple(?triple) as ?isTriple)

    # Extract the subject, predicate or object from an embedded triple
    BIND (rdf:subject(?triple) as ?subject)
    BIND (rdf:predicate(?triple) as ?predicate)
    BIND (rdf:object(?triple) as ?object)

    # Create a new embedded statement
    BIND (rdf:Statement(?subject, ?predicate, ?object) as ?newTriple)
}

I think it would be beneficial to standardize similar functions

split and rename test embeded-triple-everywhere

https://w3c.github.io/rdf-star/Minutes/2020-11-27.html#ActionSummary

Add a standardized extension of SPARQL 1.1 Query Results JSON format

(related issue: #12)

The result of a SPARQL SELECT query is serialized in JSON using the SPARQL 1.1 Query Results JSON format. This format will need to be extended to deal with the RDF* triple being a new possible value type for a binding. For example, the result of a query where variable ?a is bound to an RDF* triple:

?a	?b	?c
`<<<http://example.org/bob> <http://xmlns.com/foaf/0.1/age> 23>>`	`<http://example.org/certainty>`	`0.9`

Currently, different implementations all have their own, slightly diverging, extensions. For example, in Eclipse RDF4J, the extension looks as follows:

{
  "head" : {
    "vars" : [
      "a",
      "b",
      "c"
    ]
  },
  "results" : {
    "bindings": [
      { "a" : {
          "type" : "triple",
          "value" : {
            "s" : {
              "type" : "uri",
              "value" : "http://example.org/bob"
            },
            "p" : {
              "type" : "uri",
              "value" : "http://xmlns.com/foaf/0.1/age"
            },
            "o" : {
              "datatype" : "http://www.w3.org/2001/XMLSchema#integer",
              "type" : "literal",
              "value" : "23"
            }
          }
        },
        "b": {
          "type": "uri",
          "value": "http://example.org/certainty"
        },
        "c" : {
          "datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
          "type" : "literal",
          "value" : "0.9"
        }
      }
    ]
  }
}

In Apache Jena, the extension looks as follows:

{
  "head" : {
    "vars" : [
      "a",
      "b",
      "c"
    ]
  },
  "results" : {
    "bindings": [
      { "a" : {
          "type" : "triple",
          "value" : {
            "subject" : {
              "type" : "uri",
              "value" : "http://example.org/bob"
            },
            "property" : {
              "type" : "uri",
              "value" : "http://xmlns.com/foaf/0.1/age"
            },
            "object" : {
              "datatype" : "http://www.w3.org/2001/XMLSchema#integer",
              "type" : "literal",
              "value" : "23"
            }
          }
        },
        "b": { 
          "type": "uri",
          "value": "http://example.org/certainty"
        },
        "c" : {
          "datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
          "type" : "literal",
          "value" : "0.9"
        }
      }
    ]
  }
}

In Stardog, the format extension as currently implemented is as follows:

{
  "head" : {
    "vars" : [
      "a",
      "b",
      "c"
    ]
  },
  "results" : {
    "bindings": [
      { "a" : {
          "type" : "statement",
          "s" : {
            "type" : "uri",
            "value" : "http://example.org/bob"
          },
          "p" : {
            "type" : "uri",
            "value" : "http://xmlns.com/foaf/0.1/age"
          },
          "o" : {
            "datatype" : "http://www.w3.org/2001/XMLSchema#integer",
            "type" : "literal",
            "value" : "23"
          }
        },
        "b": { 
          "type": "uri",
          "value": "http://example.org/certainty"
        },
        "c" : {
          "datatype" : "http://www.w3.org/2001/XMLSchema#decimal",
          "type" : "literal",
          "value" : "0.9"
        }
      }
    ]
  }
}

In summary, Jena and RDF4J differ only by the names of the keys inside the new RDF* triple type (s vs subject, p vs property, etc). Stardog deviates slightly more in that it does not wrap the individual components of the RDF* triple into a value.
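To make the differences concrete, here is a minimal Python sketch that maps the three variants shown above onto one common shape. The canonical key names (`subject`, `predicate`, `object`) and the function names `normalize_term` and `normalize_results` are assumptions made for this illustration only. They are not part of any implementation or of a proposed standard.

```python
# Key names used for the triple components in the examples above.
# The canonical names on the left are this sketch's own choice.
_KEY_ALIASES = {
    "subject": ("s", "subject"),
    "predicate": ("p", "property"),
    "object": ("o", "object"),
}
# RDF4J and Jena use "triple"; Stardog uses "statement".
_TRIPLE_TYPES = {"triple", "statement"}


def normalize_term(term):
    """Return a copy of one binding value with triple terms in canonical form."""
    if term.get("type") not in _TRIPLE_TYPES:
        return dict(term)
    # RDF4J and Jena nest the components under "value";
    # Stardog places them directly in the term object.
    value = term.get("value")
    components = value if isinstance(value, dict) else term
    normalized = {}
    for canonical, aliases in _KEY_ALIASES.items():
        for alias in aliases:
            if alias in components:
                # Recurse so that nested embedded triples are handled too.
                normalized[canonical] = normalize_term(components[alias])
                break
        else:
            raise ValueError(f"missing {canonical} component in {term!r}")
    return {"type": "triple", "value": normalized}


def normalize_results(results):
    """Normalize every binding of a parsed SPARQL JSON results document."""
    return {
        "head": results.get("head", {}),
        "results": {
            "bindings": [
                {var: normalize_term(term) for var, term in binding.items()}
                for binding in results["results"]["bindings"]
            ]
        },
    }
```

Passing the parsed RDF4J, Jena, or Stardog document from above through `normalize_results` should give the same structure in all three cases, which shows that the divergence is purely syntactic.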

Consider defining an IRI to represent the type of an embedded triple or triple pattern

This would be useful for representing queries and metadata in a structured way. For example, representing a SPARQL* query in a format such as SPIN or indicating the type of a database column using R2RML's termType. This would work alongside other IRIs such as rdfs:Literal and rr:BlankNode.

get back admin privilege on BBB or find a new telco platform

Rename "RDF*" - avoiding regular expression wildcard in name

Copied from w3c/EasierRDF#76

The name "RDF*" has the negative property that it is not just a string, but also a regular expression. Some search engines make it difficult or impossible to search for "RDF*" without interpreting the name as a regular expression.
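As a small illustration of the problem, the following Python sketch uses the standard `re` module to show what happens when the name is read as a pattern rather than as a plain string. Search engines may apply different wildcard rules (for example glob-style matching), so this only demonstrates the general effect. The helper name `matches` is made up for this example.

```python
import re

NAME = "RDF*"

# Read as a regular expression, "RDF*" means "RD" followed by zero or more "F".
as_pattern = re.compile(NAME)

# Escaping turns the asterisk into a literal character.
as_literal = re.compile(re.escape(NAME))


def matches(pattern, text):
    """Return True if the whole text matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


print(matches(as_pattern, "RD"))    # True: zero occurrences of "F"
print(matches(as_pattern, "RDF*"))  # False: the literal asterisk is not matched
print(matches(as_literal, "RDF*"))  # True: only the exact name matches
```

Interpreted as a pattern, the name does not even match itself, which is why a search for it tends to return unrelated results.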

Some suggestions:

RDF star

RDFx

@hartig commented in w3c/EasierRDF#76 (comment)

I don't think it is a good idea to rename RDF* at this point. However, I understand the issue related to search engines. A possible way to address this issue is to include keywords such as "RDF star", "RDFstar", "SPARQL star", etc., in the metadata sections of the documents about the approach; this way, search engines can pick up these keywords, and searches for these keywords will then show the right hits.

I think it is not too late to rename the specification by replacing "RDF*" with "RDF star". Also:

Quite a few articles about RDF* already write RDF star instead
This repository is named rdf-star
HTML metadata keywords are almost never provided on the web today, and it is questionable whether search engines use them
HTML metadata keywords cannot be provided for PDF files etc.

SPARQL* semantics needs to be completed

The RDF* paper was not relying directly on the SPARQL11-QUERY spec, but on a paper by Pérez et al., which cannot(?) be used as a normative reference. So we need to adapt the definitions to those in the spec.

@hartig wrote:

I have added the definition above for the moment. However, there is additional work to be done to define the semantics of such SPARQL* property path patterns. In fact, even for standard property path patterns, the semantics has to be extended to use them over RDF* graphs.

No entailment when anonymizing IRIs

The current semantics was designed to ensure that renaming blank nodes in an RDF* graph did not change the semantics of the graph. However, we overlooked the situation where an IRI is replaced with a blank node. E.g.

   << :s :p :o >> :p2 :o2 .

currently does not entail

  << _:x :p :o >> :p2 :o2.

which is, IMO, a problem, because in standard RDF, this transformation produces a graph that is entailed by the original one.

Gather the Use Case and Requirements for RDF*

This issue is to collect together the use cases for RDF*.

Please add use cases you are aware of as comments below.

The document started in EasierRDF has been copied into this repo and is visible as HTML at:

https://w3c.github.io/rdf-star/UCR/

Recording a UC&R is not a promise by the CG to address that use case.

A useful format is:

As an <actor>       … WHO
I want a <feature>  … WHAT
So that <benefit>   … WHY

Add evaluation semantics for BGP* and BIND*/FIND into the draft

Currently, the Evaluation Semantics section of the draft is just a copy from the corresponding subsection of the original tech report. The actual definition of the formal semantics of a BGP* and of the SPARQL* version of BIND is in a different part of that tech report (where it defines ⟦B⟧_G and ⟦(tp AS ?v)⟧_G). These definitions still need to be adapted and moved into the Evaluation Semantics section of the draft.