ShEx support (serd issue, open, 6 comments)

drobilla commented on August 28, 2024
ShEx support

from serd.

Comments (6)

drobilla commented on August 28, 2024

Sure, though I'm not really familiar with ShEx at all. I will read up and get a better idea of how this might fit into serd. Were you just thinking of syntax support, or...?

ericprud commented on August 28, 2024

I was thinking of a full implementation. It's not a ton of code and I already have the yacc. Though I guess you don't already have yacc and bison in your build dependencies.

drobilla commented on August 28, 2024

Yeah, there are no dependencies at all. Some of the unique things about serd stem from it being a hand-hacked parser, which is pretty tedious to write and maintain but lets me control everything.

Are you thinking ShExC? That would probably be quite some work, but ShExJ would be easier since I already have JSON reading code lying around (even though JSON-LD isn't in master yet... herculean effort, that one, and serd could only ever support a subset since the spec doesn't allow streaming. Oh well).

drobilla commented on August 28, 2024

Although that makes me realize an important question: can ShEx be parsed as a stream (i.e. emitted as a sequence of statement(s, p, o) calls in the same order they are found in the document) without significant readahead? Serd is fundamentally based on this; things that can't stream don't really fit.
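
As a rough illustration of that streaming model, the Python sketch below reads a simplified, N-Triples-like input one line at a time and hands each statement to a sink callback in document order, holding no more than the current line. The function name `stream_statements`, the `Sink` signature, and the whitespace-based tokenizing are assumptions made for the example; this is neither serd's API nor a real N-Triples parser.

```python
from typing import Callable, Iterable

# Hypothetical sink signature: called once per statement, in document order.
Sink = Callable[[str, str, str], None]


def stream_statements(lines: Iterable[str], sink: Sink) -> int:
    """Emit one sink(s, p, o) call per statement line, with no readahead.

    Assumes a simplified line format, `subject predicate object .`,
    where subject and predicate contain no whitespace.
    """
    count = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("."):
            line = line[:-1].rstrip()  # drop the statement terminator
        s, p, o = line.split(None, 2)
        sink(s, p, o)  # emitted immediately; nothing is buffered
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical input; the predicates are made up for the example.
    doc = [
        "<Obs1> <subject> <Patient2> .",
        '<Patient2> <name> "Bob Smith" .',
    ]
    stream_statements(doc, lambda s, p, o: print(s, p, o))
```

A format fits this model only if each statement can be produced from what has been read so far, which is the property being asked about for ShEx.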

(Sorry if this is obvious, I haven't found the time to read the spec in detail yet)

ericprud commented on August 28, 2024

ShExJ makes a lot of sense. There are plenty of tools to convert between ShExC and ShExJ if folks want to work in ShExC.

Re streaming, I guess everything is stream-able if you are willing to buffer enough. I believe @iovka and Jérémie Dusart are working on something related to this. The challenge is that validation is typically top-down, e.g. you start by validating <Obs1> as <ObservationShape>. In the process of that, you must then validate <Patient2>@<PatientShape>. The big challenge is: at what point do you decide you've seen all of the triples related to <Obs1> or <Patient2>?
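
To make the top-down order concrete, here is a minimal Python sketch of a recursive check over an in-memory graph. Everything in it is an assumption for illustration: the toy schema format (each shape lists required predicates, optionally pointing at another shape), the `conforms` function, and the `<subject>` and `<name>` predicates. It is far simpler than real ShEx semantics (no cardinalities, value sets, or closed shapes).

```python
from typing import Optional

# Data graph: subject -> list of (predicate, object) arcs.
Graph = dict[str, list[tuple[str, str]]]
# Toy schema: shape -> {predicate: referenced shape, or None for "any value"}.
Schema = dict[str, dict[str, Optional[str]]]


def conforms(graph: Graph, schema: Schema, node: str, shape: str,
             in_progress: Optional[set[tuple[str, str]]] = None) -> bool:
    """Check top-down that `node` has every arc that `shape` requires."""
    in_progress = set() if in_progress is None else in_progress
    if (node, shape) in in_progress:
        return True  # already being checked further up; assume it holds
    in_progress.add((node, shape))
    arcs = graph.get(node, [])
    for predicate, ref in schema[shape].items():
        values = [o for p, o in arcs if p == predicate]
        if not values:
            return False  # a required arc is missing
        if ref is not None and not all(
            conforms(graph, schema, o, ref, in_progress) for o in values
        ):
            return False  # a referenced node failed its own shape
    return True


schema: Schema = {
    "<ObservationShape>": {"<subject>": "<PatientShape>"},
    "<PatientShape>": {"<name>": None},
}
graph: Graph = {
    "<Obs1>": [("<subject>", "<Patient2>")],
    "<Patient2>": [("<name>", '"Bob"')],
}
print(conforms(graph, schema, "<Obs1>", "<ObservationShape>"))
```

Note that `graph.get(node, [])` needs every arc of the node up front, which is exactly what a stream cannot promise.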

This is similar to the problem of serialization; at some point you decide that you're not waiting for more triples from some node and you go ahead and write a . or ]. (Making a bad call doesn't seem as dire in serialization because you can always write a node out again, but that's not true of an anonymous blank node.) While we can construct screw cases, I expect we can address a lot of bulk-validation use cases with some heuristics to say when we assume we have all arcs out of a node. Particularly easy would be nested anonymous BNodes such as what you see in FHIR/RDF.
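
One such heuristic, sketched in Python under the assumption that arcs for a node usually arrive contiguously: treat a node as complete as soon as a triple with a different subject shows up, and flag the cases where that guess turns out wrong. The function name `close_on_subject_change` and the sample triples are hypothetical.

```python
from typing import Iterable, Iterator

Triple = tuple[str, str, str]
Arcs = list[tuple[str, str]]


def close_on_subject_change(
    triples: Iterable[Triple],
) -> Iterator[tuple[str, Arcs, bool]]:
    """Yield (subject, arcs, reopened) each time the subject changes.

    Heuristic: a node is assumed complete once a triple with a
    different subject arrives. `reopened` is True when the subject
    was already yielded earlier, i.e. the earlier guess was wrong.
    """
    closed: set[str] = set()  # subjects already declared complete
    current = None
    arcs: Arcs = []
    for s, p, o in triples:
        if s != current:
            if current is not None:
                yield current, arcs, current in closed
                closed.add(current)
            current, arcs = s, []  # start collecting the next node
        arcs.append((p, o))
    if current is not None:
        yield current, arcs, current in closed  # flush the last node


stream = [
    ("_:b1", "<p>", "1"),
    ("_:b1", "<q>", "2"),
    ("<Obs1>", "<r>", "_:b1"),
    ("_:b1", "<s>", "3"),  # a late arc: the "screw case"
]
for subject, node_arcs, reopened in close_on_subject_change(stream):
    print(subject, node_arcs, "REOPENED" if reopened else "")
```

The `closed` set exists only to detect bad calls and grows with the number of distinct subjects; a real implementation would need to bound it or drop the check.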

drobilla commented on August 28, 2024

I think needing a model for validation itself is fine, and assumed that'd be the case, though streaming validation would be awesome if possible.

To support reading ShEx* and writing the corresponding Turtle (or building a model out of it), though, that would need to be streamable. Essentially, in order to parse a file, serd needs to be able to spit it out as triples as it goes. Seems like this should be possible here (maybe with some restrictions on key order, as it goes with JSON-LD, but I'm not sure in this case).
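
The key-order concern can be shown with a small Python sketch. It walks the members of one flat JSON object in document order and emits a triple per member; anything that appears before the subject-bearing key has to be held back. The choice of `"id"` as that key, the `emit_as_read` function, the `_:anon` fallback, and the sample objects are all assumptions for the example, not the actual ShExJ-to-RDF mapping. Python's `json` module is not a streaming parser, so `object_pairs_hook=list` is used here only to preserve member order.

```python
import json
from typing import Iterable, Iterator


def emit_as_read(
    members: Iterable[tuple[str, str]],
) -> Iterator[tuple[str, str, str, bool]]:
    """Yield (subject, key, value, delayed) for a flat object's members.

    Members seen before "id" have no subject yet, so they are buffered
    and emitted with delayed=True once "id" arrives.
    """
    subject = None
    pending: list[tuple[str, str]] = []
    for key, value in members:
        if key == "id":
            subject = value
            for k, v in pending:  # flush what had to wait
                yield subject, k, v, True
            pending.clear()
        elif subject is None:
            pending.append((key, value))  # forced readahead
        else:
            yield subject, key, value, False
    for k, v in pending:  # no "id" at all: fall back to a placeholder
        yield "_:anon", k, v, True


id_first = json.loads('{"id": "<S>", "type": "Shape"}', object_pairs_hook=list)
id_last = json.loads('{"type": "Shape", "id": "<S>"}', object_pairs_hook=list)
print(list(emit_as_read(id_first)))
print(list(emit_as_read(id_last)))
```

With the identifying key first, nothing is buffered; with it last, the whole object has to be read before the first triple can go out, which is the kind of restriction on key order mentioned above.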

I imagine it would look something like parsing the ShEx file into a model, then having a function that takes that, and a data model, and validates one against the other (or, alternatively, just mash them all in the same model if that makes sense for ShEx).
