Comments (6)
Sure, though I'm not really familiar with ShEx at all. I will read up and get a better idea of how this might fit into serd. Were you just thinking of syntax support, or...?
from serd.
I was thinking of a full implementation. It's not a ton of code and I already have the yacc grammar. Though I guess you don't already have yacc and bison in your build dependencies.
from serd.
Yeah, there are no dependencies at all. Some of the unique things about serd stem from it being a hand-hacked parser, which is pretty tedious to write and maintain but lets me control everything.
Are you thinking ShExC? That would probably be quite some work, but ShExJ would be easier since I already have JSON reading code lying around (even though JSON-LD isn't in master yet. Herculean effort, that one, and serd could only ever support a subset since the spec doesn't allow streaming. Oh well.)
from serd.
Although that makes me realize an important question: can ShEx be parsed as a stream (i.e. emitted as a sequence of statement(s, p, o) calls in the same order they are found in the document) without significant readahead? Serd is fundamentally based on this, so things that can't stream don't really fit.
(Sorry if this is obvious, I haven't found the time to read the spec in detail yet)
from serd.
ShExJ makes a lot of sense. There are plenty of tools to convert between ShExC and ShExJ if folks want to work in ShExC.
Re streaming, I guess everything is stream-able if you are willing to buffer enough. I believe @iovka and Jérémie Dusart are working on something related to this. The challenge is that validation is typically top-down, e.g. you start by validating <Obs1> as <ObservationShape>. In the process of that, you must then validate <Patient2>@<PatientShape>. The big challenge is: at what point do you decide you've seen all of the triples related to <Obs1> or <Patient2>?
This is similar to the problem of serialization; at some point you decide that you're not waiting for more triples from some node and you go ahead and write a "." or "]". (Making a bad call doesn't seem as dire in serialization because you can always write a node out again, but that's not true of an anonymous blank node.) While we can construct screw cases, I expect we can address a lot of bulk-validation use cases with some heuristics to say when we assume we have all arcs out of a node. Particularly easy would be nested anonymous BNodes such as what you see in FHIR/RDF.
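To make the heuristic concrete, here is a minimal sketch in Python. It assumes triples arrive grouped by subject, so a change of subject is taken to mean the previous node is complete. The "shape" is a hypothetical toy (a map from predicate to a min/max cardinality), not real ShEx, and the names `conforms`, `validate_stream`, and the `ex:` terms are made up for illustration.

```python
from collections import Counter

# Hypothetical toy "shape": predicate -> (min, max) cardinality. Not real ShEx.
OBSERVATION_SHAPE = {"ex:subject": (1, 1), "ex:value": (1, 1)}


def conforms(arcs, shape):
    """Check the predicate cardinalities of one node's outgoing arcs (closed shape)."""
    counts = Counter(p for p, _ in arcs)
    if any(p not in shape for p in counts):
        return False  # an arc the shape does not mention
    return all(lo <= counts[p] <= hi for p, (lo, hi) in shape.items())


def validate_stream(triples, shape):
    """Yield (subject, ok) as soon as the subject changes.

    Heuristic (an assumption, not guaranteed by RDF): all arcs out of a node
    arrive contiguously, so a new subject means the previous node is complete.
    """
    current, arcs = None, []
    for s, p, o in triples:
        if current is not None and s != current:
            yield current, conforms(arcs, shape)
            arcs = []
        current = s
        arcs.append((p, o))
    if current is not None:
        yield current, conforms(arcs, shape)  # flush the last node at end of input


triples = [
    ("ex:Obs1", "ex:subject", "ex:Patient2"),
    ("ex:Obs1", "ex:value", "42"),
    ("ex:Obs2", "ex:value", "7"),  # missing ex:subject
]
results = list(validate_stream(triples, OBSERVATION_SHAPE))
```

If a later triple for ex:Obs1 showed up after ex:Obs2, this sketch would report ex:Obs1 twice, which is exactly the "bad call" case described above.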
from serd.
I think needing a model for validation itself is fine, and assumed that'd be the case, though streaming validation would be awesome if possible.
To support reading ShEx* and writing the corresponding Turtle (or building a model out of it), though, that would need to be streamable. Essentially, in order to parse a file, serd needs to be able to spit it out as triples as it goes. Seems like this should be possible here (maybe with some restrictions on key order, as with JSON-LD, but I'm not sure in this case).
I imagine it would look something like parsing the ShEx file into a model, then having a function that takes that, and a data model, and validates one against the other (or, alternatively, just mash them all in the same model if that makes sense for ShEx).
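A rough sketch of that shape of API, in Python rather than serd's C, and with everything in it hypothetical: the schema is a plain dict standing in for a parsed ShEx file, each predicate is required exactly once, and `build_model` and `validate` are illustrative names, not serd functions.

```python
from collections import defaultdict


def build_model(triples):
    """Index triples by subject: subject -> list of (predicate, object)."""
    model = defaultdict(list)
    for s, p, o in triples:
        model[s].append((p, o))
    return model


# Hypothetical schema model: shape -> {predicate: referenced shape, or None for any value}.
SCHEMA = {
    "ObservationShape": {"ex:subject": "PatientShape", "ex:value": None},
    "PatientShape": {"ex:name": None},
}


def validate(model, schema, node, shape, seen=None):
    """Top-down check of `node` against `shape`, following shape references."""
    seen = set() if seen is None else seen
    if (node, shape) in seen:
        return True  # already being checked; assume conformance on cycles
    seen.add((node, shape))
    arcs = model.get(node, [])
    for predicate, ref in schema[shape].items():
        values = [o for p, o in arcs if p == predicate]
        if len(values) != 1:
            return False  # toy rule: every listed predicate exactly once
        if ref is not None and not validate(model, schema, values[0], ref, seen):
            return False
    return True


data = build_model([
    ("ex:Obs1", "ex:subject", "ex:Patient2"),
    ("ex:Obs1", "ex:value", "42"),
    ("ex:Patient2", "ex:name", "Ann"),
    ("ex:Obs2", "ex:subject", "ex:Patient3"),  # ex:Patient3 has no ex:name
    ("ex:Obs2", "ex:value", "7"),
])
ok_obs1 = validate(data, SCHEMA, "ex:Obs1", "ObservationShape")
ok_obs2 = validate(data, SCHEMA, "ex:Obs2", "ObservationShape")
```

Here the schema and the data live in separate models; putting both in one model would only change where `validate` looks up the shape definitions.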
from serd.