Dear all,
In addition to the comments by Makx Dekkers (see my other issue), I had a call today with the entire SEMIC team.
They identified some points in our latest draft of mobilityDCAT-AP v1.0 that might create interoperability issues in the DCAT-AP ecosystem, for instance when datasets fall both within "our" scope (ITS Directive) and under other frameworks (e.g. the High Value Dataset directive).
I noted the following points and proposed solutions.
As this is a last-minute action for our v1.0 revision, please comment/reply by Sept. 20th at the latest.
Property "contact point"
We removed the "contact point" property...
https://mobilitydcat-ap.github.io/mobilityDCAT-AP/drafts/latest/#recommended-properties-for-dataset
... and concentrated all contact details under the roles associated with the class "Agent", e.g.:
https://mobilitydcat-ap.github.io/mobilityDCAT-AP/drafts/latest/#dataset-publisher
SEMIC emphasizes the "contact point" as the main property for establishing direct interaction with the data provider, whereas "Agent" should be the official entity behind the data provider. The "contact point" might be more specific, sometimes volatile, or even external to the official entity.
-> We will re-introduce "contact point" under class "Dataset", and revise chapter 7.
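To illustrate the distinction, here is a minimal Python sketch (standard library only) that describes one dataset with both an official publisher and a separate contact point, and prints the result as Turtle. All example.org URIs, the e-mail address, and the helper names are made up for illustration. The use of vcard:Kind and vcard:hasEmail for the contact point follows common DCAT practice and is an assumption here, not something taken from our draft.

```python
# Hypothetical example data: every example.org URI below is invented.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "ex": "https://example.org/",
}


def describe_dataset(dataset, publisher, contact, email):
    """Return (subject, predicate, object) triples using prefixed names."""
    return [
        (dataset, "a", "dcat:Dataset"),
        # The official entity behind the data provision.
        (dataset, "dct:publisher", publisher),
        (publisher, "a", "foaf:Agent"),
        # The direct, possibly volatile or external, point of contact.
        (dataset, "dcat:contactPoint", contact),
        (contact, "a", "vcard:Kind"),
        (contact, "vcard:hasEmail", f"<mailto:{email}>"),
    ]


def to_turtle(triples):
    """Serialize the triples as simple one-statement-per-line Turtle."""
    header = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    body = [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(header + [""] + body)


triples = describe_dataset(
    "ex:dataset-1", "ex:road-authority", "ex:helpdesk", "helpdesk@example.org"
)
print(to_turtle(triples))
```

The point of the sketch is only that the publisher and the contact point are two separate resources, so the contact can change without touching the Agent description.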
Class "Data Service"
We recommend not using the class "Data Service"...
https://mobilitydcat-ap.github.io/mobilityDCAT-AP/drafts/latest/#properties-for-data-service
...but recommend declaring endpoint-APIs and similar under class "Distribution".
We optionally allow the class "Data Service" to be linked from a "Distribution", creating a 4-layer hierarchy "Catalogue Record"->"Dataset"->"Distribution"->"Data Service" (the last layer being optional).
DCAT-AP introduced the class "Data Service" on purpose, as an equivalent to a "Dataset" described in a catalogue, for cases where data is provided via APIs etc. Relying solely on the Dataset->Distribution concept for such forms of provision is seen as outdated.
-> We will refine the usage note under class "Data Service" and state that any constellation using this class (even beyond the 4-layer hierarchy) is allowed, while noting that the 4-layer hierarchy represents the status quo of NAPs so far.
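The two constellations can be contrasted in a small Python sketch. It assumes the standard DCAT properties dcat:accessService (Distribution to Data Service) and dcat:servesDataset (Data Service to Dataset). The ex: resources, the endpoint URL, and the function names are hypothetical.

```python
def four_layer(dataset, distribution, service):
    """Status quo of NAPs: the Data Service hangs off a Distribution."""
    return [
        (dataset, "a", "dcat:Dataset"),
        (dataset, "dcat:distribution", distribution),
        (distribution, "a", "dcat:Distribution"),
        (distribution, "dcat:accessService", service),
        (service, "a", "dcat:DataService"),
    ]


def standalone_service(dataset, service, endpoint):
    """Data Service catalogued on its own, pointing at the Dataset it serves."""
    return [
        (service, "a", "dcat:DataService"),
        (service, "dcat:endpointURL", f"<{endpoint}>"),
        (service, "dcat:servesDataset", dataset),
    ]


nap_style = four_layer("ex:dataset-1", "ex:dist-1", "ex:api-1")
dcat_style = standalone_service(
    "ex:dataset-1", "ex:api-1", "https://example.org/api/v1"  # made-up endpoint
)

for s, p, o in nap_style + dcat_style:
    print(f"{s} {p} {o} .")
```

Both sets of statements can coexist for the same service, which is why allowing either constellation does not break existing NAP records.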
Properties "rights" vs. "licence"
We allow both: the first is mandatory, the second recommended.
SEMIC argues that "licence" is the most granular information possible, denoting concrete licences in machine- and human-readable formats, and is therefore preferred. In contrast, "rights" is used when no specific information on licensing is available.
-> This does not contradict our understanding. However, I will make this clearer in the usage notes.
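As a rough illustration of how a publisher tool might apply this, the following Python sketch always emits a "rights" statement (mandatory in our draft) and adds a "licence" whenever a concrete one is known. Attaching both properties to a Distribution, the ex: resources, and the function name are assumptions made for this example only.

```python
def licensing_triples(distribution, rights_statement, licence=None):
    """Always state rights; add the more granular licence when it is known."""
    triples = [(distribution, "dct:rights", rights_statement)]
    if licence is not None:
        triples.append((distribution, "dct:license", licence))
    return triples


# A concrete licence is known: both properties are emitted.
with_licence = licensing_triples(
    "ex:dist-1",
    "ex:rights-1",  # hypothetical rights statement resource
    "<https://creativecommons.org/licenses/by/4.0/>",
)

# No specific licence information: only the rights statement is emitted.
rights_only = licensing_triples("ex:dist-2", "ex:rights-2")

for s, p, o in with_licence + rights_only:
    print(f"{s} {p} {o} .")
```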
Property "theme"
We allow a proprietary (granular) controlled vocabulary in addition to the (coarse) vocabulary of the EU Authority List. Both are linked via the same property "dcat:theme".
SEMIC recommends aligning theme definitions when datasets address more than one domain (e.g. geo AND mobility). In general, a super-vocabulary should be valid for every domain, and sub-domains might introduce their own vocabularies. However, different properties should be used to better distinguish the different layers of vocabularies.
-> I stated that this is exactly what we are proposing: two vocabularies are noted for "dcat:theme" under chapter 5.2. I will introduce a new property "mobilitydcatap:mobilityTheme", as a sub-property of "dcat:theme", which will link to our proprietary vocabulary.
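The practical effect of declaring a sub-property can be shown with a small Python sketch of RDFS-style inference: a consumer that only understands "dcat:theme" still sees the mobility theme once the sub-property relation is applied. The mobility concept URI (ex:mobility-theme/parking), the dataset URI, and the function name are hypothetical. The EU Authority List value is the existing "TRAN" data theme.

```python
# Declared schema knowledge: child property -> parent property.
SUBPROPERTY_OF = {"mobilitydcatap:mobilityTheme": "dcat:theme"}

EU_TRANSPORT = "<http://publications.europa.eu/resource/authority/data-theme/TRAN>"
MOBILITY_PARKING = "ex:mobility-theme/parking"  # hypothetical concept

data = [
    ("ex:dataset-1", "dcat:theme", EU_TRANSPORT),
    ("ex:dataset-1", "mobilitydcatap:mobilityTheme", MOBILITY_PARKING),
]


def infer_superproperties(triples, subproperty_of):
    """Add (s, parent, o) for every (s, child, o), following the chain upwards."""
    inferred = set(triples)
    for s, p, o in triples:
        parent = subproperty_of.get(p)
        while parent is not None:
            inferred.add((s, parent, o))
            parent = subproperty_of.get(parent)
    return inferred


inferred = infer_superproperties(data, SUBPROPERTY_OF)
generic_themes = sorted(o for s, p, o in inferred if p == "dcat:theme")
mobility_themes = sorted(
    o for s, p, o in inferred if p == "mobilitydcatap:mobilityTheme"
)
print(generic_themes)
print(mobility_themes)
```

A generic DCAT-AP harvester thus keeps working with "dcat:theme" alone, while mobility-aware tools can query the dedicated property to get only the granular values.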
Property "frequency"
We link this to a Controlled Vocab from the EU Authority List. Because this vocab is missing some values that we need, we added another proprietary vocab. Both vocabs together provide the entire list of possible values.
SEMIC states that this is possible, but it is better to contact the team of the EU Authority List, and ask them to upgrade their vocab, so we don't need a proprietary vocab.
-> I will contact the EU Authority List team, but I don't expect a reaction before our v1.0 publication.
Property "identifier"
We recommend using the property "dct:identifier" as an additional, strong identifier. The RDF URI is the technical identifier, but might not be the best way to clearly identify a dataset.
SEMIC likes our approach of strong identification. However, the alternative property "adms:identifier" should be used. The "dct:identifier" should be identical to the RDF URI, for technical reasons.
-> I will change the property to "adms:identifier", and refine the usage notes.
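A minimal Python sketch of the resulting pattern: "dct:identifier" repeats the RDF URI, while the strong identifier sits on a separate node linked via "adms:identifier". Modelling that node as an adms:Identifier with skos:notation follows the usual ADMS pattern and is an assumption here. The dataset URI and the identifier value "NAP-0001" are invented.

```python
def identifier_triples(dataset_uri, strong_id, id_node="_:id1"):
    """Return triples for the technical and the strong identifier."""
    subject = f"<{dataset_uri}>"
    return [
        # Technical identifier: identical to the RDF URI of the dataset.
        (subject, "dct:identifier", f'"{dataset_uri}"'),
        # Strong identifier: a separate node carrying the actual value.
        (subject, "adms:identifier", id_node),
        (id_node, "a", "adms:Identifier"),
        (id_node, "skos:notation", f'"{strong_id}"'),
    ]


id_triples = identifier_triples(
    "https://example.org/dataset/1",  # hypothetical dataset URI
    "NAP-0001",                       # hypothetical strong identifier
)

for s, p, o in id_triples:
    print(f"{s} {p} {o} .")
```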
Property "service category"
We introduced this to link a dataset to an MMTIS service.
SEMIC thinks this might cause confusion with "Data Service" (see above) and recommends renaming it.
-> I will rename it to "mobilitydcatap:intendedInformationService".
Updates in recent DCAT-AP 3
https://semiceu.github.io/DCAT-AP/releases/3.0.0/
SEMIC explains, among other things:
- The range of temporal information (for our properties "end date" and "start date") is more generic, using the data type "Temporal Literal" instead of the former "rdfs:Literal typed as xsd:date or xsd:dateTime"
- Versioning (in the context of Dataset Series) uses a different namespace: "dcat:version" instead of "owl:versionInfo"
- Classes "Category" and "Category Scheme" are replaced with the more generic classes "Concept" and "Concept Scheme"
-> I will make these replacements, provided I don't see any problems arising from them.
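To make the first point more tangible, here is a simplified Python sketch that picks a datatype for a start or end date from its lexical form. It assumes, as I understand the DCAT-AP 3 definition, that a "Temporal Literal" may be typed as xsd:gYear, xsd:gYearMonth, xsd:date or xsd:dateTime. The function name and the regular expressions are my own and deliberately ignore edge cases such as negative years or time zones on plain dates.

```python
import re

# Ordered from most to least specific lexical form.
_PATTERNS = [
    ("xsd:dateTime",
     re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?$")),
    ("xsd:date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),
    ("xsd:gYearMonth", re.compile(r"^\d{4}-\d{2}$")),
    ("xsd:gYear", re.compile(r"^\d{4}$")),
]


def temporal_literal(value):
    """Return a typed literal in Turtle syntax, chosen by the shape of the value."""
    for datatype, pattern in _PATTERNS:
        if pattern.match(value):
            return f'"{value}"^^{datatype}'
    raise ValueError(f"not a supported temporal value: {value!r}")


for example in ["2024", "2024-09", "2024-09-20", "2024-09-20T12:00:00Z"]:
    print("ex:period-1 dcat:startDate", temporal_literal(example), ".")
```

Values that were valid under the old range (xsd:date or xsd:dateTime) stay valid, so the change only widens what publishers may provide.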