Comments (7)
I think the answer is that the spec should say a typed item whose value is not an absolute URL causes property names to be parsed relative to the URL of the itemtype
…
This seems like an important sub-issue of #34 - or did I miss something (again)?
from microdata.
The mechanism used in Microdata to RDF is described in Generate Predicate URI.
The current vocabulary is taken from the closest itemtype. Looking at it afresh, the language is a bit inconsistent. But step 7 of Generate the Triples shows how to construct vocab if there is no registry entry:
Otherwise, if type is not empty, construct vocab by removing everything following the last SOLIDUS U+002F ("/") or NUMBER SIGN U+0023 ("#") from the path component of type.
Essentially, for schema.org, if item type is http://schema.org/Thing, vocab becomes http://schema.org/. This is then used as the current vocabulary when creating a property from name.
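As a rough illustration of that fallback step, here is a minimal Python sketch. The function names `fallback_vocab` and `property_uri` are hypothetical, and the absolute-URL check is a simplification assumed for this sketch rather than the wording of the Note:

```python
def fallback_vocab(item_type: str) -> str:
    """Keep everything up to and including the last '/' or '#' of the type."""
    cut = max(item_type.rfind("/"), item_type.rfind("#"))
    # If neither character is present, this sketch returns the type unchanged.
    return item_type[: cut + 1] if cut >= 0 else item_type


def property_uri(item_type: str, name: str) -> str:
    """Build a property URI from a property name and the item's type."""
    if "://" in name:  # crude absolute-URL test, an assumption of this sketch
        return name
    return fallback_vocab(item_type) + name


print(fallback_vocab("http://schema.org/Thing"))        # http://schema.org/
print(property_uri("http://schema.org/Thing", "name"))  # http://schema.org/name
```

Note that this covers only the no-registry-entry case; a type with a deeper path keeps the deeper prefix.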
A lot of the wording in the algorithm could be improved with a rewrite that did not try to stay so close to the original algorithm, which was removed from the first version of Microdata.
The registry was a concession to other vocabulary schemes, which aren't particularly important for schema.org, and the registry has gone mostly unmodified since the beginning:
{
  "http://schema.org/": {
    "properties": {
      "additionalType": {"subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"}
    }
  },
  "http://microformats.org/profile/hcard": {}
}
There's a concession for hcard (of questionable value IMHO), and an inference rule for additionalType.
from microdata.
@gkellogg wrote:
…
Otherwise, if type is not empty, construct vocab by removing everything following the last SOLIDUS U+002F ("/") or NUMBER SIGN U+0023 ("#") from the path component of type.
Essentially, for schema.org, if item type is http://schema.org/Thing, vocab becomes http://schema.org/. This is then used as the current vocabulary when creating a property from name.
This matches what I think people do, and what I think people should do, but as far as I can tell that isn't documented in the microdata spec itself, and effectively contradicts the microdata spec, unless it said something meaningful about vocabularies - which as per #34 I don't think it does…
Which seems like an issue that should be fixed. And that means looking at what else people do with microdata, in case there is a real-world interoperability problem. If "everyone" uses it that way (for a sufficiently inclusive value of "everyone") then I think it is easiest to lift that as a general approach in the core algorithm. If people are actually relying on the current wording in the microdata spec, we'll need to do something like state that relevant types may be defined by vocabularies to take that approach.
Personally, I can't see the use case for typed items not to use the itemtype to act like RDFa's vocab. And it seems to me that the intention of itemtype taking URLs is to do this. But I am making some big assumptions there.
Hence wanting more people to weigh in.
/@Hixie ?
from microdata.
As I quickly read (past tense) Microdata to RDF, itemtype="https://schema.org/Thing" means that the root vocabulary is "https://schema.org/" but itemtype="https://schema.org/Person/Teacher" actually results in a root vocabulary of "https://schema.org/Person/".
Reading more carefully, it appears that the intent is that the registry is a list of known vocabulary prefixes, optionally associating them with properties for subtypes and equivalent types.
Given the ambiguous language in the spec, and my hunch that processors don't actually fetch the registry in practice, I'm leaning toward the following, which essentially adopts the relevant bits from the Note:
- Request that W3C establish the registry, and a mechanism for updating it, unless that was already done in working on the µdata-RDF note.
- Use the raw prefixes in the registry as a list, and where there is a match, specify the prefix as the vocabulary prefix.
- If there is no prefix in the registry, a processor-specific extension MAY be used to associate a vocabulary. Note that I don't like this much, but it's reality. Such extensions should be submitted to the W3C registry.
- Otherwise, use the "itemtype up to the last NUMBER SIGN or, if there is none, the last SOLIDUS or, if there is none, a NUMBER SIGN appended to the value of the itemtype" as the vocabulary prefix.
from microdata.
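The resolution order proposed in the list above could be sketched roughly as follows. This is only an illustration: `vocabulary_prefix`, `REGISTRY_PREFIXES`, and the `extension` callback are hypothetical names, and the two prefixes are simply the keys of the registry quoted earlier in the thread.

```python
from typing import Callable, Optional, Sequence

# Assumed for illustration: the raw prefixes of the registry quoted in this thread.
REGISTRY_PREFIXES = (
    "http://schema.org/",
    "http://microformats.org/profile/hcard",
)


def vocabulary_prefix(
    item_type: str,
    registry: Sequence[str] = REGISTRY_PREFIXES,
    extension: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    # 1. A matching registry prefix wins.
    for prefix in registry:
        if item_type.startswith(prefix):
            return prefix
    # 2. Otherwise a processor-specific extension MAY supply a prefix.
    if extension is not None:
        found = extension(item_type)
        if found:
            return found
    # 3. Otherwise cut at the last '#', else at the last '/', else append '#'.
    if "#" in item_type:
        return item_type[: item_type.rfind("#") + 1]
    if "/" in item_type:
        return item_type[: item_type.rfind("/") + 1]
    return item_type + "#"


print(vocabulary_prefix("http://schema.org/Person/Teacher"))  # http://schema.org/
print(vocabulary_prefix("https://some.host/compound/path"))   # https://some.host/compound/
```

With a registry match, a nested type such as Person/Teacher still resolves to the registered prefix; without one, the path-based fallback keeps the deeper prefix unless an extension says otherwise.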
@danbri wondered offline if there is an obvious default we can use to avoid needing a registry, but I don't think so.
from microdata.
Reading more carefully, it appears that the intent is that the registry is a list of known vocabulary prefixes, optionally associating them with properties for subtypes and equivalent types.
Indeed, but really, the only one that mattered was schema.org. At the time, there was thought of using URL hierarchies for extensions, but I believe this became disfavored, so it may be more theoretical.
Indeed, I don't think that processors dynamically fetch the registry; they build it in. There is a mechanism for updating it, and it has been done through the efforts of @iherman, typically when the Note is revised, but it can be done at any time. Given that it is a Note, there is no real formal mechanism for doing this. Note that the registry also includes vocabulary expansion, which allows us to go from http://schema.org/additionalType to rdf:type.
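To make the expansion idea concrete, here is a small Python sketch that adds an rdf:type triple for each schema.org additionalType triple, driven by the registry entry quoted earlier in the thread. The `expand` function and the tuple representation of triples are assumptions of this sketch, and it handles only a single string-valued subPropertyOf:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Registry content as quoted earlier in this thread.
REGISTRY = {
    "http://schema.org/": {
        "properties": {"additionalType": {"subPropertyOf": RDF_TYPE}}
    },
    "http://microformats.org/profile/hcard": {},
}


def expand(triples, registry=REGISTRY):
    """Return the input triples plus any triples implied by subPropertyOf rules."""
    base = list(triples)
    out = list(base)
    for subject, predicate, obj in base:
        for prefix, entry in registry.items():
            if not predicate.startswith(prefix):
                continue
            local_name = predicate[len(prefix):]
            rule = entry.get("properties", {}).get(local_name, {})
            super_property = rule.get("subPropertyOf")
            if super_property and (subject, super_property, obj) not in out:
                out.append((subject, super_property, obj))
    return out


triples = [("_:item", "http://schema.org/additionalType", "http://example.org/Widget")]
for triple in expand(triples):
    print(triple)
```

The original triple is kept and the inferred one is appended, so a consumer that only understands rdf:type still sees the extra type.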
As you say, in the absence of an entry in the registry, the @itemtype value is taken up to the last NUMBER SIGN or SOLIDUS. Certainly, a processor may use its own heuristics, at the risk of losing interoperability.
from microdata.
The alternative would be to specify some particular rules in the spec.
Does anyone know of a use of microdata other than schema.org where you might have itemtype="https://some.host/compound/path" but want the prefix associated to be just "https://some.host/"? I can imagine this being the case for W3C, if they publish a couple of substantial vocabularies... but that's not a current concrete use case.
from microdata.