
Comments (7)

chaals commented on June 21, 2024

I think the answer is that the spec should say that, in a typed item, a property name that is not an absolute URL is parsed relative to the URL of the itemtype.

This seems like an important sub-issue of #34 - or did I miss something (again)?

from microdata.

gkellogg commented on June 21, 2024

The mechanism used in Microdata to RDF is described in Generate Predicate URI.

The current vocabulary is taken from the closest itemtype. Looking at it afresh, the language is a bit inconsistent. But step 7 of Generate the Triples shows how to construct vocab if there is no registry entry:

Otherwise, if type is not empty, construct vocab by removing everything following the last SOLIDUS U+002F ("/") or NUMBER SIGN U+0023 ("#") from the path component of type.

Essentially, for schema.org, if item type is http://schema.org/Thing, vocab becomes http://schema.org/. This is then used as the current vocabulary when creating a property from name.
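As a rough illustration of the quoted step, here is a minimal Python sketch. The function names `vocab_from_type` and `property_uri` are hypothetical and not taken from the Note. The sketch assumes the item type is an absolute URL with a non-empty path, and it cuts the whole string rather than isolating the path component first.

```python
from urllib.parse import urlsplit


def vocab_from_type(item_type: str) -> str:
    """Keep everything up to and including the last '/' or '#'."""
    cut = max(item_type.rfind("/"), item_type.rfind("#"))
    # Assumption: if neither delimiter occurs, return the type unchanged.
    # The quoted step does not say what to do in that case.
    return item_type[: cut + 1] if cut >= 0 else item_type


def property_uri(item_type: str, name: str) -> str:
    """Build a property URI from a name, using the type's vocab as prefix."""
    # A name that already has a scheme is treated as an absolute URL.
    if urlsplit(name).scheme:
        return name
    return vocab_from_type(item_type) + name


print(vocab_from_type("http://schema.org/Thing"))       # http://schema.org/
print(property_uri("http://schema.org/Thing", "name"))  # http://schema.org/name
```

With this sketch, a type with a deeper path such as https://schema.org/Person/Teacher yields https://schema.org/Person/ rather than https://schema.org/.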

A lot of the wording from the algorithm could be improved with a rewrite that did not try to hold as close to the original algorithm removed from the first version of Microdata.

The registry was a concession to other vocabulary schemes, which aren't particularly important for schema.org, and the registry has gone mostly unmodified since the beginning:

{
  "http://schema.org/": {
    "properties": {
      "additionalType": {"subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"}
    }
  },
  "http://microformats.org/profile/hcard": {}
}

There's a concession for hcard (of questionable value IMHO), and an inference rule for additionalType.

chaals commented on June 21, 2024

@gkellogg wrote:

Otherwise, if type is not empty, construct vocab by removing everything following the last SOLIDUS U+002F ("/") or NUMBER SIGN U+0023 ("#") from the path component of type.

Essentially, for schema.org, if item type is http://schema.org/Thing, vocab becomes http://schema.org/. This is then used as the current vocabulary when creating a property from name.

This matches what I think people do, and what I think people should do, but as far as I can tell that isn't documented in the microdata spec itself, and effectively contradicts the microdata spec, unless it said something meaningful about vocabularies - which as per #34 I don't think it does…

Which seems like an issue that should be fixed. And that means looking at what else people do with microdata, in case there is a real-world interoperability problem. If "everyone" uses it that way (for a sufficiently inclusive value of "everyone") then I think it is easiest to lift that as a general approach in the core algorithm. If people are actually relying on the current wording in the microdata spec, we'll need to do something like state that relevant types may be defined by vocabularies to take that approach.

Personally, I can't see the use case for typed items not to use the itemtype to act like RDFa's vocab. And it seems to me that the intention of itemtype taking URLs is to do this. But I am making some big assumptions there.

Hence wanting more people to weigh in.

/@Hixie ?

chaals commented on June 21, 2024

As I quickly read (past tense) Microdata to RDF, itemtype="https://schema.org/Thing" means that the root vocabulary is "https://schema.org/" but itemtype="https://schema.org/Person/Teacher" actually results in a root vocabulary of "https://schema.org/Person/".

Reading more carefully, it appears that the intent is that the registry is a list of known vocabulary prefixes, optionally associating them with properties for subtypes and equivalent types.

Given the ambiguous language in the spec, and my hunch that processors don't actually fetch the registry in practice, I'm leaning toward the following, which essentially adopts the relevant bits from the Note:

  • Request that W3C establish the registry, and a mechanism for updating it, unless that was already done in working on the µdata-RDF note.
  • Use the raw prefixes in the registry as a list, and where there is a match, specify the prefix as the vocabulary prefix.
  • If there is no prefix in the registry, a processor-specific extension MAY be used to associate a vocabulary. Note that I don't like this much, but it's reality. Such extensions should be submitted to the W3C registry.
  • Otherwise, use "the itemtype up to the last NUMBER SIGN or, if there is none, up to the last SOLIDUS or, if there is none, the value of the itemtype with a NUMBER SIGN appended" as the vocabulary prefix.
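
The lookup order in the list above could be sketched roughly as follows. This is only an illustration under assumptions: `vocabulary_prefix` and the `extension` hook are hypothetical names, the prefix list is copied from the registry quoted earlier in this thread, and picking the longest match when several registry prefixes apply is a choice of this sketch, not something the proposal states.

```python
from typing import Callable, Optional, Sequence

# Raw prefixes taken from the registry JSON quoted earlier in the thread.
REGISTRY_PREFIXES = (
    "http://schema.org/",
    "http://microformats.org/profile/hcard",
)


def vocabulary_prefix(
    item_type: str,
    registry: Sequence[str] = REGISTRY_PREFIXES,
    extension: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    # 1. Registry: use a registered prefix if the itemtype starts with one.
    #    Assumption: the longest matching prefix wins.
    matches = [prefix for prefix in registry if item_type.startswith(prefix)]
    if matches:
        return max(matches, key=len)
    # 2. Processor-specific extension (hypothetical hook), if it has an answer.
    if extension is not None:
        found = extension(item_type)
        if found is not None:
            return found
    # 3. Fallback: last NUMBER SIGN, else last SOLIDUS, else append '#'.
    if "#" in item_type:
        return item_type[: item_type.rfind("#") + 1]
    if "/" in item_type:
        return item_type[: item_type.rfind("/") + 1]
    return item_type + "#"
```

For example, `vocabulary_prefix("http://schema.org/Person/Teacher")` returns the registered prefix "http://schema.org/", while an unregistered type such as "https://example.org/a/b" falls through to the last-SOLIDUS rule and gives "https://example.org/a/".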

@danbri?

chaals commented on June 21, 2024

@danbri wondered offline if there is an obvious default we can use to avoid needing a registry, but I don't think so.

gkellogg commented on June 21, 2024

Reading more carefully, it appears that the intent is that the registry is a list of known vocabulary prefixes, optionally associating them with properties for subtypes and equivalent types.

Indeed, but really, the only one that mattered was schema.org. At the time, there was thought of using URL hierarchies for extensions, but I believe this became disfavored, so it may be more theoretical.

Indeed, I don't think that processors dynamically fetch the registry; they build it in. There is a mechanism for updating it, and it has been done through the efforts of @iherman, typically when the Note is revised, but it can be done at any time. Given that it is a Note, there is no real formal mechanism for doing this. Note that the registry also includes vocabulary expansion, which allows us to go from http://schema.org/additionalType to rdf:type.
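
To make the expansion concrete, here is a small Python sketch that applies the subPropertyOf rule from the registry JSON quoted earlier in this thread. The function name `expand_predicates` is hypothetical, the sketch only handles a single string value for subPropertyOf, and returning the original predicate alongside the inferred one is a choice made here for illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Registry content as quoted earlier in the thread.
REGISTRY = {
    "http://schema.org/": {
        "properties": {
            "additionalType": {"subPropertyOf": RDF_TYPE},
        }
    },
    "http://microformats.org/profile/hcard": {},
}


def expand_predicates(predicate: str, registry: dict = REGISTRY) -> list:
    """Return the predicate plus any super-property the registry declares."""
    result = [predicate]
    for prefix, entry in registry.items():
        if not predicate.startswith(prefix):
            continue
        # The property name is whatever follows the vocabulary prefix.
        name = predicate[len(prefix):]
        rule = entry.get("properties", {}).get(name, {})
        if "subPropertyOf" in rule:
            result.append(rule["subPropertyOf"])
    return result
```

Here `expand_predicates("http://schema.org/additionalType")` returns both the original predicate and the rdf:type URI, while a property with no registry rule, such as http://schema.org/name, comes back unchanged.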

As you say, in the absence of an entry in the registry, the @itemtype value is taken up to the last NUMBER SIGN or SOLIDUS. Certainly, a processor may use its own heuristics, at the risk of losing interoperability.

chaals commented on June 21, 2024

The alternative would be to specify some particular rules in the spec.

Does anyone know of a use of microdata other than schema.org where you might have itemtype="https://some.host/compound/path" but want the prefix associated to be just "https://some.host/" ? I can imagine this being the case for W3C, if they publish a couple of substantial vocabularies... but that's not a current concrete use case.
