rdf's People

Contributors: aamedina
rdf's Issues

Investigate Asami for MOP Bootstrapping environment

https://github.com/quoll/asami

I used XTDB when bootstrapping because I needed a datalog database without a schema, so I could prepare one for Datomic based on the RDF models required by the user. XTDB also supports full-text search with Lucene, which has proven useful for exploring datasets. Asami, however, supports the open-world assumption by default and offers transitive queries out of the box. Furthermore, it supports CLJS. I think it has significant potential to be a superior bootstrapping environment to XTDB for RDF data in Clojure.

I think having the ability to choose a bootstrapping database for users would also be a good enhancement in general.
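A minimal sketch of what this could look like, assuming Asami's asami.core API (connect/transact/q) and its transitive attribute-path syntax (the `+` suffix on an attribute in a where clause):

```clojure
(require '[asami.core :as d])

;; In-memory connection for experimentation; :mem could be :local.
(def conn (d/connect "asami:mem://bootstrap"))

;; Asami is schemaless, so raw triples with keywords in any
;; position can be asserted directly.
@(d/transact conn
   {:tx-triples [[:schema/Movie        :rdfs/subClassOf :schema/CreativeWork]
                 [:schema/CreativeWork :rdfs/subClassOf :schema/Thing]]})

;; Transitive query out of the box: the + suffix walks
;; :rdfs/subClassOf to any depth.
(d/q '[:find [?super ...]
       :where [:schema/Movie :rdfs/subClassOf+ ?super]]
     (d/db conn))
;; e.g. a collection containing :schema/CreativeWork and :schema/Thing
```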

Use RDFa and JSON-LD vocabulary to describe all Clojure metadata

Problem:
Right now I use a made-up attribute, :rdf/ns-prefix-map, to annotate a namespace with its prefixes. This is non-standard and can't be represented in RDF automatically.

Solution:
Instead, I think it should be something I can embed in the :owl/Ontology instance directly, using RDFa combined with JSON-LD vocabulary.

{:rdf/type :owl/Ontology
 "@context" {"ex" "http://example.com/"}}

The "@context" key can be processed by JSON-LD processors to expand into a JSON-LD context.

Alternative:
I was also thinking of using a reader macro like #jsonld/Context {...}, but including that in the ontology is too verbose, I think.

{:rdf/type :owl/Ontology
 :jsonld/context #jsonld/Context {"ex" "http://example.com/"}}

The reader macro would expand into something like:

{:rdf/type          :jsonld/Context
 :jsonld/definition [{:rdf/type    :jsonld/TermDefinition
                      :jsonld/term "ex"
                      :jsonld/id   {:rdfa/uri "http://example.com/"}}]}
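A sketch of the reader function behind such a macro (hypothetical names; it would be registered in data_readers.clj as {jsonld/Context my.ns/read-context}):

```clojure
(defn read-context
  "Reader for #jsonld/Context: expands a prefix map into a
   :jsonld/Context metaobject."
  [prefix-map]
  {:rdf/type          :jsonld/Context
   :jsonld/definition (mapv (fn [[term iri]]
                              {:rdf/type    :jsonld/TermDefinition
                               :jsonld/term term
                               :jsonld/id   {:rdfa/uri iri}})
                            prefix-map)})
```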

Questions:
Does sh:declare fit into this?

Separate class finalization from Universal Translator

Problem:
The Universal Translator component does too much right now. If people want to use a metaobject protocol, it isn't clear how to configure it effectively. It was useful while developing the MOP, but it now makes it harder to introduce the technology to other people. Instead, I want to perform the minimum amount of metaobject inferencing when bootstrapping, and move the full MOP (which does class finalization on every loaded OWL class) into its own component. I originally did write it like this, but the cyclic dependencies between the RDF module and the MOP module made it convenient to move everything into one library.

Solution:
The essential elements of the MOP I have found are:

  • compute-class-precedence-list / class-precedence-list
  • compute-slots / class-slots / class-direct-slots
  • class-direct-subclasses / class-direct-superclasses

In the MOP RDF vocabulary these are represented by the following RDF properties:

  • :mop/classPrecedenceList
  • :mop/classSlots / :mop/classDirectSlots
  • :mop/classDirectSubclasses

class-direct-superclasses just looks at :rdfs/subClassOf, so it doesn't need to be materialized beyond RDFS.

The approach I am considering taking during bootstrap is as follows:

  1. Gather your vocabulary by requiring Clojure namespaces that represent RDF models
  2. Transact all of the vocabulary into Asami local storage "asami:local://.vocab"
  3. Query Asami for all classes (not individuals)
  4. Run the "poor man's" finalize in parallel, materializing those essential properties per class.
  5. Transact the updated metaobjects into Asami which becomes your RDF graph
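The steps above might be sketched as follows, assuming Asami's asami.core API; `vocabulary` and `finalize` are hypothetical names:

```clojure
(require '[asami.core :as d])

;; 1-2. Transact the gathered vocabulary (entity maps from required
;;      namespaces) into local Asami storage.
(def conn (d/connect "asami:local://.vocab"))
@(d/transact conn {:tx-data vocabulary}) ; `vocabulary` is assumed

;; 3. Query for all classes (not individuals).
(def classes
  (d/q '[:find [?class ...]
         :where [?class :rdf/type :owl/Class]]
       (d/db conn)))

;; 4. "Poor man's" finalize in parallel; `finalize` is a hypothetical
;;    function that materializes :mop/classPrecedenceList,
;;    :mop/classSlots, :mop/classDirectSlots, and
;;    :mop/classDirectSubclasses for one class.
(def metaobjects (pmap finalize classes))

;; 5. Transact the updated metaobjects back in; this database is now
;;    your RDF graph.
@(d/transact conn {:tx-data (vec metaobjects)})
```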

Document RDF/EDN

Problem:
I need a way to serialize arbitrary RDF models as EDN.

Solution:
I use Apache Jena and Aristotle to read from some RDF source (JSON-LD, Turtle, RDF/XML, etc.) and convert the triples into maps grouped by subject. Blank nodes are represented by subjects with a :db/id, and named resources are given a :db/ident if and only if a prefix mapping exists at the time the RDF model is parsed by Jena. If no prefix mapping is found, resources are identified by :rdfa/uri.

Typed literals are read as EDN tagged literals using the IRI of the datatype as the tag.

Language tagged literals are read as #rdf/langString "hello@en".

#rdf/List [:owl/Class :rdfs/Class]  
=> 
{:rdf/type :rdf/List,
 :rdf/first :owl/Class,
 :rdf/rest
 {:rdf/type :rdf/List, :rdf/first :rdfs/Class, :rdf/rest :rdf/nil}}
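The reading direction might be sketched like this, assuming Jena's org.apache.jena.rdf.model API; `subject->key`, `predicate->key`, and `object->value` are hypothetical helpers implementing the :db/ident / :rdfa/uri / :db/id rules above:

```clojure
(import '(org.apache.jena.rdf.model ModelFactory))

(defn model->edn
  "Reads an RDF source into maps grouped by subject (sketch)."
  [url lang]
  (let [model (.read (ModelFactory/createDefaultModel) url lang)]
    (->> (iterator-seq (.listStatements model))
         (group-by #(.getSubject %))
         (map (fn [[subject statements]]
                ;; Hypothetical helpers: :db/ident when a prefix
                ;; mapping exists, :rdfa/uri otherwise, :db/id for
                ;; blank nodes; typed and language-tagged literals
                ;; become the tagged literals described above.
                (into {:db/id (subject->key subject)}
                      (for [st statements]
                        [(predicate->key (.getPredicate st))
                         (object->value (.getObject st))])))))))
```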

Subsumes #6

Document exporting RDF graph from Datomic

Problem:
It isn't obvious how to create an RDF graph out of a Datomic database.

Solution:
Document the process and offer a Clojure API that makes this possible with one function, given a database and arguments (time- or transaction-based filters?).
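A sketch of such a one-function API, assuming Datomic's peer API (`datomic.api`'s `datoms` and `since`); `->rdf-term` is a hypothetical mapping from Datomic entities, attributes, and values to RDF terms:

```clojure
(require '[datomic.api :as d])

(defn export-rdf-graph
  "Returns a seq of RDF-style triples from a Datomic database.
   When :since is given, only datoms asserted after that t are
   included."
  [db & {:keys [since]}]
  (let [db (cond-> db since (d/since since))]
    (for [[e a v] (d/datoms db :eavt)]
      ;; `->rdf-term` is hypothetical: it would resolve entity ids to
      ;; :db/ident keywords or :rdfa/uri maps, and box typed literals.
      [(->rdf-term db e) (->rdf-term db a) (->rdf-term db v)])))
```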

Questions:
Do I want to make this a conforming RDFa processor that operates on a "document" which is a Datomic database?

Materialize "The Semantics of Schema Vocabulary" OWL RL subset for Datomic

Problem:
In order to get to OWL RL reasoning I need to bootstrap a Datomic schema that will support the necessary datalog queries.

Proposed solution:
Implement the following inferences over the bootstrapping environment:

scm-cls
  If:   T(?c, rdf:type, owl:Class)
  Then: T(?c, rdfs:subClassOf, ?c)
        T(?c, owl:equivalentClass, ?c)
        T(?c, rdfs:subClassOf, owl:Thing)
        T(owl:Nothing, rdfs:subClassOf, ?c)

scm-sco
  If:   T(?c1, rdfs:subClassOf, ?c2)
        T(?c2, rdfs:subClassOf, ?c3)
  Then: T(?c1, rdfs:subClassOf, ?c3)

scm-eqc1
  If:   T(?c1, owl:equivalentClass, ?c2)
  Then: T(?c1, rdfs:subClassOf, ?c2)
        T(?c2, rdfs:subClassOf, ?c1)

scm-eqc2
  If:   T(?c1, rdfs:subClassOf, ?c2)
        T(?c2, rdfs:subClassOf, ?c1)
  Then: T(?c1, owl:equivalentClass, ?c2)

scm-op
  If:   T(?p, rdf:type, owl:ObjectProperty)
  Then: T(?p, rdfs:subPropertyOf, ?p)
        T(?p, owl:equivalentProperty, ?p)

scm-dp
  If:   T(?p, rdf:type, owl:DatatypeProperty)
  Then: T(?p, rdfs:subPropertyOf, ?p)
        T(?p, owl:equivalentProperty, ?p)

scm-spo
  If:   T(?p1, rdfs:subPropertyOf, ?p2)
        T(?p2, rdfs:subPropertyOf, ?p3)
  Then: T(?p1, rdfs:subPropertyOf, ?p3)

scm-eqp1
  If:   T(?p1, owl:equivalentProperty, ?p2)
  Then: T(?p1, rdfs:subPropertyOf, ?p2)
        T(?p2, rdfs:subPropertyOf, ?p1)

scm-eqp2
  If:   T(?p1, rdfs:subPropertyOf, ?p2)
        T(?p2, rdfs:subPropertyOf, ?p1)
  Then: T(?p1, owl:equivalentProperty, ?p2)

scm-dom1
  If:   T(?p, rdfs:domain, ?c1)
        T(?c1, rdfs:subClassOf, ?c2)
  Then: T(?p, rdfs:domain, ?c2)

scm-dom2
  If:   T(?p2, rdfs:domain, ?c)
        T(?p1, rdfs:subPropertyOf, ?p2)
  Then: T(?p1, rdfs:domain, ?c)

scm-rng1
  If:   T(?p, rdfs:range, ?c1)
        T(?c1, rdfs:subClassOf, ?c2)
  Then: T(?p, rdfs:range, ?c2)

scm-rng2
  If:   T(?p2, rdfs:range, ?c)
        T(?p1, rdfs:subPropertyOf, ?p2)
  Then: T(?p1, rdfs:range, ?c)

scm-hv
  If:   T(?c1, owl:hasValue, ?i)
        T(?c1, owl:onProperty, ?p1)
        T(?c2, owl:hasValue, ?i)
        T(?c2, owl:onProperty, ?p2)
        T(?p1, rdfs:subPropertyOf, ?p2)
  Then: T(?c1, rdfs:subClassOf, ?c2)

scm-svf1
  If:   T(?c1, owl:someValuesFrom, ?y1)
        T(?c1, owl:onProperty, ?p)
        T(?c2, owl:someValuesFrom, ?y2)
        T(?c2, owl:onProperty, ?p)
        T(?y1, rdfs:subClassOf, ?y2)
  Then: T(?c1, rdfs:subClassOf, ?c2)

scm-svf2
  If:   T(?c1, owl:someValuesFrom, ?y)
        T(?c1, owl:onProperty, ?p1)
        T(?c2, owl:someValuesFrom, ?y)
        T(?c2, owl:onProperty, ?p2)
        T(?p1, rdfs:subPropertyOf, ?p2)
  Then: T(?c1, rdfs:subClassOf, ?c2)

scm-avf1
  If:   T(?c1, owl:allValuesFrom, ?y1)
        T(?c1, owl:onProperty, ?p)
        T(?c2, owl:allValuesFrom, ?y2)
        T(?c2, owl:onProperty, ?p)
        T(?y1, rdfs:subClassOf, ?y2)
  Then: T(?c1, rdfs:subClassOf, ?c2)

scm-avf2
  If:   T(?c1, owl:allValuesFrom, ?y)
        T(?c1, owl:onProperty, ?p1)
        T(?c2, owl:allValuesFrom, ?y)
        T(?c2, owl:onProperty, ?p2)
        T(?p1, rdfs:subPropertyOf, ?p2)
  Then: T(?c2, rdfs:subClassOf, ?c1)

scm-int
  If:   T(?c, owl:intersectionOf, ?x)
        LIST[?x, ?c1, ..., ?cn]
  Then: T(?c, rdfs:subClassOf, ?c1)
        T(?c, rdfs:subClassOf, ?c2)
        ...
        T(?c, rdfs:subClassOf, ?cn)

scm-uni
  If:   T(?c, owl:unionOf, ?x)
        LIST[?x, ?c1, ..., ?cn]
  Then: T(?c1, rdfs:subClassOf, ?c)
        T(?c2, rdfs:subClassOf, ?c)
        ...
        T(?cn, rdfs:subClassOf, ?c)
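As an illustration, one rule, scm-sco, as a Datomic datalog query that computes the assertions to transact (a sketch, assuming :rdfs/subClassOf is installed as a ref-typed attribute):

```clojure
(require '[datomic.api :as d])

(defn scm-sco-tx
  "scm-sco: if ?c1 subClassOf ?c2 and ?c2 subClassOf ?c3,
   then assert ?c1 subClassOf ?c3. Returns tx-data."
  [db]
  (for [[c1 c3] (d/q '[:find ?c1 ?c3
                       :where
                       [?c1 :rdfs/subClassOf ?c2]
                       [?c2 :rdfs/subClassOf ?c3]]
                     db)]
    [:db/add c1 :rdfs/subClassOf c3]))
```

Rules like this would need to be applied to a fixpoint: transact, re-query, and repeat until no new triples are produced.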

Rethink and rewrite datomic schema inferencing

Problem:

Right now there are multimethods used to navigate the metaobject type hierarchy, and while this works, I find it cumbersome to write and reason about.

Solution:
Instead, I think the user interface needs to be EDN that expands into multimethod implementations, hiding the multimethods from the user. The hierarchical dispatch can still be useful. Another possibility is adding an interactive mode where, if a schema attribute is ambiguous, the user is prompted to disambiguate.
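A sketch of the idea (the EDN shape and names are assumptions): the user supplies plain data, and the library expands it into defmethod implementations.

```clojure
;; User-facing EDN: dispatch value -> schema fragment (shape assumed).
(def schema-rules
  {:owl/ObjectProperty   {:db/valueType :db.type/ref}
   :owl/DatatypeProperty {:db/valueType :db.type/string}})

;; Hidden machinery: one multimethod, with methods installed from data.
(defmulti infer-schema :rdf/type)

(defmethod infer-schema :default [metaobject] metaobject)

(doseq [[dispatch-val fragment] schema-rules]
  ;; defmethod evaluates its dispatch value, so this installs one
  ;; method per EDN entry; hierarchical dispatch via the metaobject
  ;; type hierarchy still applies.
  (defmethod infer-schema dispatch-val
    [metaobject]
    (merge metaobject fragment)))
```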

Document approach to RDF datatype boxing

Problem:
Boxing RDF literals can be complex and confusing. Most of the time RDF literals are used with vocabularies declaring properties that are well-typed, but occasionally you need to refer to a data value with an IRI, which presents complications for Datomic attributes.

Solution:
Document the current approach which is based on creating a blank node with a single property: the IRI of the XSD datatype and the corresponding value.
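For example, a boxed value looks something like this (illustrative values; the property is the compact IRI of the XSD datatype):

```clojure
;; A boxed :xsd/decimal value: a blank node whose single property is
;; the datatype IRI, paired with the corresponding value.
{:xsd/decimal 1.1M}

;; A boxed :xsd/anyURI value:
{:xsd/anyURI "http://example.com/"}
```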

:xsd/min(max)Inclusive(Exclusive) datomic schema should be a bigdec

Problem:
When I defined the XSD datatype schema for Datomic I used doubles, which forced me to special-case the use of these datatypes as properties when boxing RDF literal numeric values for database serialization. However, Datomic supports bigdec, and I would like to uniformly represent :xsd/decimal as a bigdec when both reading and writing RDF.

Solution:
Update the schema definitions in net.wikipunk.rdf.xsd and the special case boxing logic in rdf.clj, and test bootstrapping the boot + ext JSON-LD context to verify behavior and catch edge cases.
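The change amounts to something like the following (a sketch; the actual attribute definitions in net.wikipunk.rdf.xsd may differ):

```clojure
;; Current shape (sketch): facet attributes stored as doubles.
{:db/ident       :xsd/minInclusive
 :db/valueType   :db.type/double
 :db/cardinality :db.cardinality/one}

;; Proposed: represent :xsd/decimal values uniformly as bigdecs,
;; using Datomic's :db.type/bigdec.
{:db/ident       :xsd/minInclusive
 :db/valueType   :db.type/bigdec
 :db/cardinality :db.cardinality/one}
```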

Automate MOP bootstrap environment with embedded Blazegraph + OWL Axioms

  • The dev system should use a vocabulary written to a Blazegraph journal
  • The environment should support full-text search over metaobjects in Blazegraph and not require re-initialization unless new ontologies are loaded by the user
  • Datafy with materialization using embedded Blazegraph via DESCRIBE, and consider linked-data use cases over HTTP
  • Need to figure out Blazegraph namespace management
  • Is it possible to SPARQL-federate in-process over the namespaces?
  • Redesign the universal translator component around Blazegraph's embedded capabilities; what can it enable when Blazegraph is available in-process to a Clojure program with reasoning?
  • Datomic schema generation should be redesigned so that inference is done on a materialized ontology + RDF in process
  • Is it possible to remove all dependencies EXCEPT Blazegraph and Clojure outside of the main library?
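An in-process DESCRIBE might be sketched like this, assuming Blazegraph's Sesame-style SAIL API (com.bigdata.rdf.sail.BigdataSail / BigdataSailRepository) and the journal-file property key from Blazegraph's configuration docs:

```clojure
(import '(com.bigdata.rdf.sail BigdataSail BigdataSailRepository)
        '(org.openrdf.query QueryLanguage))

(defn describe-resource
  "Evaluates DESCRIBE <iri> against an embedded Blazegraph journal
   and returns the resulting statements (sketch)."
  [journal-file iri]
  (let [props (doto (java.util.Properties.)
                (.setProperty "com.bigdata.journal.AbstractJournal.file"
                              journal-file))
        repo  (doto (BigdataSailRepository. (BigdataSail. props))
                (.initialize))]
    (with-open [conn (.getConnection repo)]
      (let [result (.evaluate (.prepareGraphQuery
                               conn QueryLanguage/SPARQL
                               (str "DESCRIBE <" iri ">")))]
        ;; Materialize statements before the connection closes.
        (loop [acc []]
          (if (.hasNext result)
            (recur (conj acc (.next result)))
            acc))))))
```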

Fully specify what a bootstrapping environment is & MOP vocabulary

Problem:
The mop/*env* is a dynamic variable that specifies an environment where metaobjects are resolved during processing. If it is nil, the assumption is that you are searching required Clojure namespaces for the RDF terms that have names. Usually it is bound to either an XTDB node or Datomic database.

While convenient, having this be a dynamic variable complicates many aspects of the protocol, since the implementation of each method depends on the class of the environment.

Furthermore, I think this is a good time to start thinking about what a fully specified MOP ontology in OWL would look like.
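One way to make the environment explicit rather than dynamic is a protocol whose methods take the environment as an argument (a sketch; the protocol and helper names are hypothetical):

```clojure
;; Hypothetical protocol: the environment is an explicit argument
;; rather than the mop/*env* dynamic variable.
(defprotocol MOPEnvironment
  (find-metaobject [env ident]
    "Resolves the metaobject named by `ident` in this environment."))

;; nil environment: fall back to searching required Clojure
;; namespaces for RDF terms that have names.
(extend-protocol MOPEnvironment
  nil
  (find-metaobject [_ ident]
    (find-ns-metaobject ident))) ; hypothetical helper

;; Further implementations would extend the protocol to XTDB nodes
;; and Datomic databases, keeping each dispatch visible in one place.
```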

Consider :mop/descendants, :mop/ancestors, :mop/parents

I was thinking maybe I should persist the multimethod hierarchy by attaching these relations to each entity, to enable restoration from a versioned database in the cloud. For example, for :schema/Movie:

(parents :schema/Movie)
;;=>
#{:schema/CreativeWork :rdfs/Class}

Could be persisted by attaching :mop/parents to the :schema/Movie entity and transacting that into the db.

Then restoration would be: construct the multimethod hierarchy by querying the database for :mop/parents etc., and reduce the results into {:parents {:schema/Movie ...}}.
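A sketch of the restoration step, assuming a Datomic-style database with :mop/parents persisted as a ref attribute:

```clojure
(require '[datomic.api :as d])

(defn restore-hierarchy
  "Rebuilds a Clojure hierarchy by deriving each entity from its
   persisted :mop/parents."
  [db]
  (reduce (fn [h [child parent]] (derive h child parent))
          (make-hierarchy)
          (d/q '[:find ?child-ident ?parent-ident
                 :where
                 [?child :mop/parents ?parent]
                 [?child :db/ident ?child-ident]
                 [?parent :db/ident ?parent-ident]]
               db)))
```

After restoration, (isa? (restore-hierarchy db) :schema/Movie :schema/CreativeWork) would then hold.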

Document RDFa initial context & JSON-LD context design

I need to write documentation explaining that the net.wikipunk.boot namespace contains the RDFa initial context for "wikipunk" RDF processors.

net.wikipunk.ext contains an extended context that may be included if desired.

The design concept is to treat Clojure data structures, which either describe data themselves or contain metadata in source, as potential sources of RDF triples. Using RDFa vocabulary I can annotate Clojure data and retrieve embedded triples. To do this I needed to settle on some vocabulary that I could rely on the processor being "aware" of, so the RDFa initial context fits the bill.

Usually these prefix mappings exist as blank nodes in a graph, but how do you figure out what they are before you have a graph? To sort through the potential mappings, I treat Clojure namespaces with an :rdf/type of :jsonld/Context as the places where these prefix or term mappings reside. Any required namespaces with this type will be searched for mappings when the system is started.
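An illustrative shape for such a namespace (the exact metadata and mapping shapes are assumptions):

```clojure
;; A namespace whose metadata types it as a JSON-LD context; a
;; processor that requires it can search it for prefix mappings.
(ns my.app.vocab
  "Prefix mappings for an application's RDF processors."
  {:rdf/type :jsonld/Context})

;; One possible place for the mappings themselves (shape assumed).
(def prefixes
  {"ex"     "http://example.com/"
   "schema" "https://schema.org/"})
```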
