rdflib / pyshacl
A Python validator for SHACL
License: Apache License 2.0
According to the SHACL spec:
The validation with recursive shapes is not defined in SHACL and is left to SHACL processor implementations.
Does pySHACL support recursive shapes, or are there any plans to support them?
Related to #55
When building the report graph, if a valueNode and focusNode are the same Blank Node, the validation result's valueNode and focusNode never have the same ID in the Report Graph. In the current implementation they are copied over separately (so they become two new blank nodes). I could add a simple check: if they are the same node, copy it into the report graph only once and use that copy for both valueNode and focusNode, so that they have the same ID.
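The proposed check could look roughly like the sketch below. It is plain Python and not pySHACL's actual code: `copy_node` and `bnode_map` are hypothetical names, and blank nodes are modelled as strings starting with `_:`.

```python
import itertools

_counter = itertools.count(1)


def copy_node(node, bnode_map):
    """Return the report-graph identifier for a data-graph node.

    Blank nodes (modelled here as strings starting with "_:") get a fresh
    identifier the first time they are seen; later calls reuse that copy.
    IRIs and other terms are returned unchanged.
    """
    if not node.startswith("_:"):
        return node
    if node not in bnode_map:
        bnode_map[node] = "_:report{}".format(next(_counter))
    return bnode_map[node]


bnode_map = {}
focus_node = copy_node("_:b0", bnode_map)
value_node = copy_node("_:b0", bnode_map)  # same source node, so same copy
other_node = copy_node("_:b1", bnode_map)  # different source node, new copy
```

Keeping the map per validation result (or per report) is a design choice; either way the focus node and value node of one result end up sharing an ID.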
Hello,
is there any way to measure the prevalence of executed SHACL-tests, like getting the total number of instances of the sh:targetClass or a percentage like 0.95 of the instances of the given sh:targetClass fulfill the restrictions? If not I think it would be nice to have.
Best Regards
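As a rough illustration of the requested number: given the instances of the sh:targetClass and the focus nodes that appear in the validation report, the ratio is simple arithmetic. The function name and the sample node names are made up for this sketch; it does not use any pySHACL API.

```python
def conformance_ratio(focus_nodes, violating_nodes):
    """Fraction of targeted focus nodes that produced no validation result."""
    targeted = set(focus_nodes)
    if not targeted:
        return None  # nothing was targeted, so a ratio is undefined
    failed = targeted & set(violating_nodes)
    return (len(targeted) - len(failed)) / len(targeted)


instances = ["ex:a", "ex:b", "ex:c", "ex:d"]  # instances of the sh:targetClass
violations = ["ex:b"]                          # sh:focusNode values in the report
ratio = conformance_ratio(instances, violations)
print(ratio)
```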
I tried using the SHACL found at http://datashapes.org/schema.ttl to validate some data and received the following error:
NotImplementedError: SHACL Advanced Feature SPARQLFunction is not yet supported.
I'll add my vote to getting this feature implemented.
In the report, the resource referenced by http://www.w3.org/ns/shacl#value is empty; see the JSON below.
[
{
...
"@type": [
"http://www.w3.org/ns/shacl#ValidationResult"
],
"http://www.w3.org/ns/shacl#focusNode": [
{
"@id": "http://vangoghmuseum.nl/data/artwork/d0005V1962"
}
],
...
"http://www.w3.org/ns/shacl#value": [
{
"@id": "_:N6087b61f1f1d44e08519420c185ba3f2"
}
]
},
{
"@id": "_:N6087b61f1f1d44e08519420c185ba3f2"
},
This report is the result of a propertyShape with a sh:node constraint. The first validation result in the example contains the information of the shape containing the sh:node. This is fine. The value (N6087b61f1f1d44e08519420c185ba3f2) should contain the information of the result for the sh:node. I confirmed this in TopBraid.
The path is not getting interpreted correctly:
file://c:\my\full\path\test.ttl/ does not look like a valid URI, trying to serialize this will break.
I've tried with forward slashes, backslashes, no slashes (all files in current directory), full filespec (with and without c:), etc. Couldn't get any of them to work.
Windows 10.
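For comparison, this is what a well-formed file URI for that Windows path looks like when built with Python's standard library. Whether pyshacl accepts this form on the command line is an assumption I have not verified; the sketch only shows the expected shape of the URI.

```python
from pathlib import PureWindowsPath

# PureWindowsPath works on any platform, so this runs on Linux or macOS too.
windows_path = PureWindowsPath(r"c:\my\full\path\test.ttl")
file_uri = windows_path.as_uri()
print(file_uri)  # file:///c:/my/full/path/test.ttl
```

Note the three slashes after `file:` and the forward slashes in the path, in contrast to the `file://c:\...` form in the error message.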
When using pre-inferencing options, there is a bug in the current version of pyshacl (and some previous versions): after executing run(), the target_graph property of the Validator contains the pre-inferenced data graph, not the expanded data graph that is expected.
The attached files illustrate the behavior I am seeing. The W3C validator (https://shacl.org/playground/) does flag this as non-conforming, so I am trusting this is not operator error on my part.
The shacl graph includes a property shape defined as follows:
ex:Func a owl:Class , sh:NodeShape ;
rdfs:label "Func" ;
rdfs:subClassOf ex:Function ;
sh:property [ a sh:PropertyShape ;
sh:class ex:FuncParam_Func_a ;
sh:path ex:hasParameter ;
sh:minCount 1;
sh:name "Func_a"
] .
and the graph being validated includes
test:FuncNode a ex:Func;
ex:hasParameter test:FuncParam_b .
test:FuncParam_a a ex:FuncParam_Func_a .
test:FuncParam_b a ex:FuncParam_Func_b .
I want to request a feature for validation reports of SPARQLConstraint(Component) as described in the SHACL Recommendation.
I'd really appreciate the possibility to use variables from SELECT queries in the sh:message, i.e.:
:VerifyPowerAdapterSupplyShape
a sh:NodeShape ;
sh:targetClass ex:Computer ;
sh:sparql [
a sh:SPARQLConstraint ;
sh:message "The power adapter ({?availablePower} W) must provide more power than the parts of the computer consume ({?requiredPower} W)." ;
sh:prefixes ex: ;
sh:select """
SELECT $this ?availablePower ?requiredPower
WHERE {
$this ex:hasPowerAdapter ?powerAdapter .
?powerAdapter ex:hasPowerSupply ?availablePower .
{
SELECT (SUM(?power) as ?requiredPower)
WHERE {
$this ex:hasPart ?device .
?device ex:hasRequiredPower ?power .
}
}
FILTER(?availablePower < ?requiredPower) .
}
""" ;
] .
That would help me a lot to debug ontologies that rely on complex SHACL-SPARQL constraints :)
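To make the request concrete, here is a minimal sketch of the substitution itself, assuming the bindings of one SELECT result row are available as a dictionary. `render_message` is a hypothetical helper, not an existing pySHACL function.

```python
import re

# Matches {?name} and {$name} placeholders in a message template.
_PLACEHOLDER = re.compile(r"\{[?$]([A-Za-z_][A-Za-z0-9_]*)\}")


def render_message(template, bindings):
    """Replace {?var} and {$var} with the bound value; leave unbound ones as-is."""
    def substitute(match):
        name = match.group(1)
        if name in bindings:
            return str(bindings[name])
        return match.group(0)
    return _PLACEHOLDER.sub(substitute, template)


template = ("The power adapter ({?availablePower} W) must provide more power "
            "than the parts of the computer consume ({?requiredPower} W).")
# Example values only; in practice they come from the query solution.
message = render_message(template, {"availablePower": 300, "requiredPower": 450})
print(message)
```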
Please look at https://github.com/xyzisinus/shacl-example for the shape/data files that we think should generate a particular error, but pyshacl does not report it. The readme explains why we think that way. It may well be our misunderstanding. Thanks.
PySHACL is maturing and becoming an increasingly powerful and relevant tool for validating SHACL. I believe it is the go-to tool for SHACL validation on the commandline, and should be easily accessible for as many users as possible.
I want to get pySHACL packaged as a debian package and available from the official debian repositories, and in turn into Ubuntu repositories.
PySHACL has two dependencies, RDFLib and OWL-RL. RDFLib is already packaged and available in the debian repositories, so I need to get owlrl in too before I can package and publish a pySHACL debian package.
I've already submitted an ITP (Intent to Package) for both owlrl and pySHACL, to the Debian WNPP list.
I've created an Uploader account on the Debian Mentors site, so that I can request a sponsor to sponsor the package (to authorize it on my behalf) once the package is uploaded to the Mentors site staging area.
The validator says there is a runtime error, and turning on debug gives no additional details:
$ pyshacl -s 03-Network.ttl -e 03-Network.ttl sample-network.ttl
Validator encountered a Runtime Error.
Info:
$ python3.6
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyshacl
INFO:rdflib:RDFLib Version: 4.2.2
>>> print(pyshacl.__version__)
0.11.3.post1
Turtle files attached as text files.
03-Network.ttl.txt
sample-network.ttl.txt
[Python 3.7.0, rdflib 4.2.2, pyshacl 0.9.8.post1]
I am using a graph as shacl_graph shown below.
conforms, v_graph, v_text = validate(g, shacl_graph=g2,
data_graph_format='turtle',
shacl_graph_format='turtle',
inference='rdfs', debug=True,
serialize_report_graph=True)
Validation Report
Conforms: True
The g2 graph I use is the following:
@prefix hei: <http://hei.org/customer/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
hei:HeiAddressShape a sh:NodeShape ;
sh:property [ rdfs:comment "Street constraint" ;
sh:datatype xsd:string ;
sh:minlength 30 ;
sh:path hei:Ship_to_street ] ;
sh:targetClass hei:Hei_customer .
Data validated is:
hei:hei_cust_1281 a hei:Sfg_customer ;
rdfs:label "XYZHorecagroothandel" ;
hei:Klant_nummer 1281 ;
hei:Ship_to_City "Middenmeer" ;
hei:Ship_to_postcode "1799 AB" ;
hei:Ship_to_street "Industrieweg"
The issue is when I pass a graph object no validation is done; passing the g2 validation graph as a string works fine. I did expect both options to work fine.
I am getting the following error when I try to install pyshacl
Could not find a version that satisfies the requirement RDFClosure (from pyshacl) (from versions: )
No matching distribution found for RDFClosure (from pyshacl)
Assume this requires the deployment of coverage.py or similar package? Assume that's not already implemented?
It would be good to wrap pySHACL as a Windows EXE so windows users can execute the CLI without necessarily having to install python
Instead of the inference arg being a bool, it should be a string for 'none', 'owl-rl', 'rdfs' or 'both' with default being 'both' - the various OWL-RL inference regimes or none.
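A sketch of how such a string option could be validated, using only the four values named above. The function name and the error messages are made up for illustration.

```python
ALLOWED_INFERENCE = ("none", "rdfs", "owl-rl", "both")


def normalise_inference(value="both"):
    """Validate the inference option and return its canonical lower-case form."""
    if not isinstance(value, str):
        raise TypeError("inference must be a string, got {!r}".format(value))
    option = value.strip().lower()
    if option not in ALLOWED_INFERENCE:
        raise ValueError(
            "inference must be one of {}, got {!r}".format(ALLOWED_INFERENCE, value)
        )
    return option


print(normalise_inference())        # default
print(normalise_inference("RDFS"))  # case-insensitive input
```

Rejecting a bool outright, rather than silently mapping it to a string, makes old call sites fail loudly after the change.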
So, very new to all this... but have a question.
Is it possible to define an owl:imports in a shape file to pull in previously defined shapes from another file? Ref https://github.com/ESIPFed/science-on-schema.org/blob/master/tools/sospy/shapegraphs/reqrec.ttl
I'm looking at https://book.validatingrdf.com/bookHtml011.html section 5.4 for inspiration.
Note: Maybe I'm being too cute trying to import from a github raw URL?
I get a proper violation from the recomended.ttl file, but I cannot import it and use it. I don't know if this is not possible or (more likely) I'm doing it wrong.
Thanks
pyshacl -s ./shapegraphs/recomendShape.ttl -m -f human -df json-ld ./datagraphs/dataset-minimal.json-ld
Validation Report
Conforms: False
Results (1):
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
Severity: sh:Violation
Source Shape: [ sh:maxCount Literal("1", datatype=xsd:integer) ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:path <http://schema.org/citation> ]
Focus Node: [ ]
Result Path: <http://schema.org/citation>
pyshacl -s ./shapegraphs/reqrec.ttl -m -f human -df json-ld ./datagraphs/dataset-minimal.json-ld
Validation Report
Conforms: True
I've been happily using pyshacl (installed via pip3) to work on shacl rules, and in the process have broken my rules file in a manner I can't seem to correct, so was hoping you may have some tips for a newcomer ...
The error I get is simply
Validator encountered a Runtime Error:
Shape pointed to by sh:property does not exist or is not a well-formed SHACL PropertyShape.If you believe this is a bug in pyshacl, open an Issue on the pyshacl github page.
I don't believe it is a bug in pyshacl; the same two files in the shacl playground produce only VALIDATION FAILURE: Missing subject -- when I load the shacl into RDFlib and query, I can't find any sh:property triple with an unbound subject, but that may be a very naive approach.
A run through meta-shacl shows nothing terribly helpful, only unrelated things that I know work in some engines, such as using rdf:list items instead of spelling out a first/rest list.
In the -d debug trace, do the last few Constraint Report/Violations give clues to the bad rule?
How might I get more information about what I've messed up in the shacl?
I have a jsonld data file which looks something like this:
{ "@type": "ex:Activity",
"schema:description": "example schema",
"ui": { "order": [
"john",
"mark",
"lisy" ],
"shuffle": false } }
How do I write a constraint for the property ui, since it is an object?
If @version: 1.1 is mentioned in the context of a JSON-LD data graph, pySHACL throws the error:
"TypeError: argument of type 'float' is not iterable."
How do I validate data graphs which are written in json-ld 1.1 version?
I also posted this here: https://webmasters.stackexchange.com/questions/127215/how-to-validate-data-graph-expressed-in-json-ld-version-1-1-with-shacl
Can somebody here help? @ashleysommer ?
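One possible stopgap, sketched below, is to strip the `@version` entry from the context before handing the document to the parser. This is an assumption about the cause (a parser that does not expect a numeric `@version`), not a confirmed fix, and it will not help if the data relies on features that only exist in JSON-LD 1.1. `strip_jsonld_version` is a hypothetical helper.

```python
import json


def strip_jsonld_version(document):
    """Return a copy of a JSON-LD document with "@version" removed from its contexts."""
    def clean_context(context):
        if isinstance(context, dict):
            return {k: v for k, v in context.items() if k != "@version"}
        if isinstance(context, list):
            return [clean_context(item) for item in context]
        return context  # e.g. a remote context URL, left untouched

    def clean(node):
        if isinstance(node, dict):
            return {
                key: (clean_context(value) if key == "@context" else clean(value))
                for key, value in node.items()
            }
        if isinstance(node, list):
            return [clean(item) for item in node]
        return node

    return clean(document)


source = {
    "@context": {"@version": 1.1, "schema": "http://schema.org/"},
    "@type": "schema:Thing",
}
cleaned = strip_jsonld_version(source)
print(json.dumps(cleaned, indent=2))
```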
According to 8.4 General Execution Instructions for SHACL Rules, implementations modify the data graph if triples get inferred, and/or may "construct a logical data graph that has the original data as one subgraph and a dedicated inferences graph as another subgraph, and where the inferred triples get added to the inferences graph only."
I've been following the Classification With SHACL Rules article and I would like to extract the inferred triples, either merged into the data graph or as a separate inference graph. They would include:
<http://bakery.com/ns#AppleTartC> a <http://bakery.com/ns#NonGlutenFreeBakedGood>, <http://bakery.com/ns#VeganBakedGood> .
Is this feature available?
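Conceptually, a dedicated inferences graph is the set difference between the expanded data graph and the original one. The sketch below shows this with plain tuples instead of an RDF library; the `ns#BakedGood` class in the input is a placeholder I made up, and only the two inferred types come from the example above.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BAKERY = "http://bakery.com/ns#"


def inferred_triples(original, expanded):
    """Return the triples present in the expanded graph but not in the original."""
    return set(expanded) - set(original)


# Placeholder input triple; ns#BakedGood is a made-up class for illustration.
original = {
    (BAKERY + "AppleTartC", RDF_TYPE, BAKERY + "BakedGood"),
}
# What the data graph might contain after the rules have run.
expanded = original | {
    (BAKERY + "AppleTartC", RDF_TYPE, BAKERY + "NonGlutenFreeBakedGood"),
    (BAKERY + "AppleTartC", RDF_TYPE, BAKERY + "VeganBakedGood"),
}

inferred = inferred_triples(original, expanded)
for triple in sorted(inferred):
    print(triple)
```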
Following the trick mentioned in https://www.w3.org/wiki/SHACL/Examples I wanted to write a shape to validate the existence of a node.
The shape
{
"@context": {
"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"sh": "http://www.w3.org/ns/shacl#",
"schema": "http://schema.org/"
},
"@graph": [
{
"@id": "_:forceDatasetShape",
"@type": "sh:NodeShape",
"sh:targetNode": "schema:DigitalDocument",
"sh:property": [
{
"sh:path": [
{
"sh:inversePath": [{
"@id": "rdf:type",
"@type": "@id"
}]
}
],
"sh:minCount": 1
}
]
}
]
}
with the graph
{}
throws a validation error in the SHACL playground https://shacl.org/playground/
But pyshacl says that it's conforming. Does this inversePath trick not work with pySHACL?
Command I'm using is: pyshacl -a -m -s shape.json graph.json -sf json-ld -df json-ld
With pyshacl version 0.11.3
On a side note, SHACL playground validates successfully with
{
"@context": { "schema": "http://schema.org/", "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },
"@id": "http://example.org/ns#Bob",
"rdf:type": "http://schema.org/DigitalDocument"
}
but not with
{
"@context": { "schema": "http://schema.org/", "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },
"@id": "http://example.org/ns#Bob",
"@type": "http://schema.org/DigitalDocument"
}
or
{
"@context": { "schema": "http://schema.org/", "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#" },
"@id": "http://example.org/ns#Bob",
"rdf:type": "schema:DigitalDocument"
}
which is weird. I was under the impression that @type is an alias for rdf:type.
Hi,
This validator arrived just at the right time to enable more adoption of SHACL. Thank you for this effort.
I'm using a jupyter notebook running on python 3.6 and I installed the pyshacl module with:
!pip install git+https://github.com/RDFLib/[email protected]#egg=pyshacl
As suggested in the 'Use' section of the README file, I tried a basic validation by running:
from pyldapi import validate
validate(target_graph, shacl_graph, inference='rdfs', abort_on_error=False)
But I got a ModuleNotFoundError: No module named 'pyldapi'
I guess it's because the validate function seems to be part of the pyshacl module. I got it right by running :
from pyshacl import validate
validate(target_graph, shacl_graph, inference='rdfs', abort_on_error=False)
Thanks.
PySHACL was originally built to be a basic (but fully standards compliant) SHACL validator. That is, it uses SHACL shapes to check conformance of a data graph, and gives you the result (True/False, plus a ValidationReport).
PySHACL does that job quite well. It can be called from python or from the command line, and it delivers the results users expect.
Over the last 12 months, I've been slowly implementing more of the SHACL Advanced Features spec, and pySHACL is now almost AF-complete.
The Advanced features add capability to SHACL which extends beyond that of just validating. Eg, the SHACL Rules allow you to run SHACL-based entailment on your data graph. SHACL Functions allow you to execute parameterised custom SPARQL Functions over the data graph. Custom Targets allow you to bypass the standard SHACL node-targeting mechanism and use SPARQL to select targets.
These features can be useful to execute validation in a more customisable way, but their major benefit is in general use beyond just validating a data graph against constraints.
With these new features I see the possibility of PySHACL operating in additional alternative modes, besides that of just validating. Eg, expansion mode could run SHACL-AF Functions and Rules on the data graph, then return the expanded data graph (without validating).
Related to #20
When using pyshacl with -m option, pyshacl reports a traceback about a ValueError: read of closed file.
### pyshacl -m -s shape.ttl data.ttl
Traceback (most recent call last):
File "/usr/local/bin/pyshacl", line 71, in <module>
is_conform, v_graph, v_text = validate(args.data, **validator_kwargs)
File "/usr/local/lib/python3.7/site-packages/pyshacl/validate.py", line 194, in validate
rdf_format=shacl_graph_format)
File "/usr/local/lib/python3.7/site-packages/pyshacl/util.py", line 176, in load_into_graph
data = target.read()
ValueError: read of closed file
shape.ttl and data.ttl are valid files with valid shapes and RDF data.
When using pyshacl -s shape.ttl data.ttl - so without -m - pyshacl works as expected.
This may be related to the changes made for #59
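For context, this ValueError is what Python raises when read() is called on a file object that has already been closed. The sketch below reproduces that with the standard library only and shows one common way around it: read the content once and hand out fresh in-memory streams. Whether this matches the actual cause inside pySHACL is an assumption.

```python
import io
import os
import tempfile

# Create a small throwaway Turtle file to read from.
with tempfile.NamedTemporaryFile("w", suffix=".ttl", delete=False) as tmp:
    tmp.write("@prefix ex: <http://example.org/> .\n")
    path = tmp.name

handle = open(path, "rb")
first_read = handle.read()
handle.close()

try:
    handle.read()  # second read on the same, now closed, file object
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Reusing the bytes through a new in-memory stream avoids the closed handle.
second_stream = io.BytesIO(first_read)
second_read = second_stream.read()

os.remove(path)
print(error_message)
```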
Using the script below and the SHACL from http://datashapes.org/schema.ttl, I get the following error:
ConstraintLoadError: sh:namespace value must be an RDF Literal with type xsd:anyURI.
https://www.w3.org/TR/shacl/#sparql-prefixes
However, running pyshacl from the command line, appears to work correctly.
pyshacl -s ./schema_org_validation.ttl ./test_data.ttl
Validation Report
Conforms: False
Results (1):
Constraint Violation in ClassConstraintComponent (http://www.w3.org/ns/shacl#ClassConstraintComponent):
Severity: sh:Violation
Source Shape: schema:CommunicateAction-about
Focus Node: ex:asdgjkj
Value Node: [ rdf:type sch:GameServer ; sch:playersOnline Literal("42", datatype=xsd:integer) ]
Result Path: schema:about
Message: Value does not have class schema:Thing
(I am not including the schema.org schema, hence the validation error)
Python script:
import rdflib
from pyshacl import validate
data = """
@prefix ex: <http://example.org/> .
@prefix sch: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:asdgjkj a sch:CommunicateAction ;
sch:about [ a sch:GameServer ;
sch:playersOnline "42"^^xsd:integer ] .
"""
dataGraph = rdflib.Graph().parse( data = data, format = 'ttl' )
print( dataGraph.serialize( format='ttl' ).decode( 'utf8' ) )
shaclData = open( "./schema_org_validation.ttl", "r" ).read()
shaclGraph = rdflib.Graph().parse( data = shaclData, format = 'ttl' )
report = validate( dataGraph, shacl_graph = shaclGraph, abort_on_error = False, meta_shacl = False, debug = False, advanced = True, do_owl_imports = True )
print( report[2] )
Hi,
we are trying to use SPARQL-based targets in our SHACL-Tests.
Our test should use all non-anonymous instances of owl:Class as focus nodes, but it seems it's not working:
The Test:
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
<#LODE-class-comment-violation>
a sh:Shape ;
sh:target [
a sh:SPARQLTarget ;
sh:select """
SELECT ?this WHERE {
?this <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
FILTER ( !isBlank(?this) )
}
""";
];
sh:severity sh:Violation;
sh:path rdfs:comment;
sh:nodeKind sh:Literal;
sh:minCount 1;
sh:name "comment not correctly specified"@en;
sh:message "rdfs:comment is missing or is no Literal"@en .
The result conforms as true with this ontology as data graph in pyshacl (with advanced=true in the validate function), but does not conform if we try the same in the shacl play service.
Is this a bug or did we miss something?
Thanks in advance,
Denis
If the data graph is this:
{
"@context": { "@vocab": "http://schema.org/" },
"@id": "http://example.org/ns#Bob",
"@type": "Person",
"givenName": "Robert",
"familyName": "Junior",
"birthDate": "1971-07-07",
"deathDate": "1968-09-10",
"address": {
"@id": "http://example.org/ns#BobsAddress",
"streetAddress": "1600 Amphitheatre Pkway",
"postalCode": 9404
}
}
and the shapes graph is this:
@prefix dash: <http://datashapes.org/dash#> .
@prefix rdf: <https://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <https://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <https://www.w3.org/ns/shacl#> .
@prefix xsd: <https://www.w3.org/2001/XMLSchema#> .
schema:PersonShape
a sh:NodeShape ;
sh:targetClass schema:Person ;
sh:property [
sh:path schema:givenName ;
sh:datatype xsd:string ;
sh:name "given name" ;
] ;
sh:property [
sh:path schema:birthDate ;
sh:lessThan schema:deathDate ;
sh:maxCount 1 ;
] ;
sh:property [
sh:path schema:gender ;
sh:in ( "female" "male" ) ;
] ;
sh:property [
sh:path schema:address ;
sh:node schema:AddressShape ;
] .
schema:AddressShape
a sh:NodeShape ;
sh:closed true ;
sh:property [
sh:path schema:streetAddress ;
sh:datatype xsd:string ;
] ;
sh:property [
sh:path schema:postalCode ;
sh:or ( [ sh:datatype xsd:string ] [ sh:datatype xsd:integer ] ) ;
sh:minInclusive 10000 ;
sh:maxInclusive 99999 ;
] .
when I do this:
pyshacl -s /path/to/shapesGraph.ttl -m -i rdfs -a -f human /path/to/dataGraph.json-ld -df json-ld
Why doesn't it show validation errors? As we can clearly see, there are errors in address and birthDate in the data graph.
The load.py script currently loads all passed files into an rdflib Graph object. For JSON-LD files that contain multiple named graphs, this means that the resulting graph object g in the referenced line [1] will be empty, and the validation will succeed without warning.
I know that pySHACL currently does not support TriG or NQuads files, but if you allow for JSON-LD, you should allow for these as well, as the big difference is the support for named graphs.
There are three ways around this:
Load into a ConjunctiveGraph. This disregards all named graph information, but makes all triples available for validation. This is the behaviour of the SHACL playground implementation. This is not ideal either, as we would like to validate individual graphs (as per the SHACL spec).
Load into a Dataset and then iterate over each contained graph for validation purposes. This is a more involved fix that requires the loader to always load into a Dataset, and then the validator should iterate over each graph contained in the dataset.
[1] pySHACL/pyshacl/rdfutil/load.py, line 134 in a42cb6b
I have a gist with all of the relevant files at: https://gist.github.com/James-Hudson3010/2588d9b17dd33e15922122b8b5cf1bd7
If I execute:
$ pyshacl -a -f human employees.ttl
I get the following, correct validation report...
Validation Report
Conforms: False
Results (3):
Constraint Violation in MaxInclusiveConstraintComponent (http://www.w3.org/ns/shacl#MaxInclusiveConstraintComponent):
Severity: sh:Violation
Source Shape: hr:jobGradeShape
Focus Node: d:e4
Value Node: Literal("8", datatype=xsd:integer)
Result Path: hr:jobGrade
Constraint Violation in DatatypeConstraintComponent (http://www.w3.org/ns/shacl#DatatypeConstraintComponent):
Severity: sh:Violation
Source Shape: hr:jobGradeShape
Focus Node: d:e3
Value Node: Literal("3.14", datatype=xsd:decimal)
Result Path: hr:jobGrade
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
Severity: sh:Violation
Source Shape: hr:jobGradeShape
Focus Node: d:e2
Result Path: hr:jobGrade
However, if I split employees.ttl into three files containing the schema, shape, and instance data and run:
pyshacl -s shape.ttl -e schema.ttl -a -f human instance.ttl
the result is:
Validation Report
Conforms: True
I assume I am calling pyshacl correctly.
I'm doing some validation in data with anonymous nodes. The output of validate (e.g., results_text) shows:
Focus Node: [ ]
Value Node: [ ]
That makes it pretty hard to know which node has the issue. Any thoughts on how to identify them. I'm wondering if there could be an option to print either the datafile:line_number (hard, I know) or perhaps the whole anonymous node (ours are small, I know they could be very big, but even a few lines would probably help locate them).
hey RDFlib! I'm working on validation functions for schema.org content, and specifically we have yaml (and frontend matter of html) definitions of specifications (that load nicely into JSON). I'm wondering if there would be some logical way to use pySHACL to validate these inputs? See our discussion here --> schemaorg/schemaorg#2069 (comment) and here is an example input with yaml as frontend matter (that can be loaded as json of course). I'm also wondering if there is development space to be able to define tests / criteria in yaml, since this is the current language of many continuous integration services like Travis, Circle, etc. I started some thinking about this but before implementing something new, wanted to check with what standards are used in the community. Generally the criteria I am looking for are:
Thanks for your feedback! Please join in on the first issue listed above if you have thoughts! I'm very happy to contribute something here (with guidance) or to create a simplified version that goes from a yaml criteria to a validated specification.
I am trying to validate a data graph with its corresponding shapes graph. When I use the command-line method it works fine. However, I get an error when using the Python module.
I am doing this:
r = validate(data_graph, shacl_graph='./validation/ActivityShape.ttl', ont_graph=None, advanced=True, inference='rdfs', abort_on_error=False)
conforms, results_graph, results_text = r
I get the following error:
Traceback (most recent call last):
File "validation/test1.py", line 64, in <module>
r = validate(data_graph=data_graph, shacl_graph='./validation/ActivityShape.ttl', ont_graph=None, advanced=True, inference='rdfs', abort_on_error=False)
File "/Users/sanuann/validation-env/lib/python3.6/site-packages/pyshacl/validate.py", line 253, in validate
do_owl_imports=False) # no imports on data_graph
File "/Users/sanuann/validation-env/lib/python3.6/site-packages/pyshacl/rdfutil/load.py", line 110, in load_from_source
first_char = source[0]
IndexError: string index out of range
What am I doing wrong?
Hey out there.
I get a really confusing error while trying to set up pySHACL.
I just try to run from pyshacl import validate in a Python script and get the following error.
Traceback (most recent call last):
File ".\shaclCheck.py", line 1, in <module>
from pyshacl import validate
File "C:\Python37x64\lib\site-packages\pyshacl\__init__.py", line 3, in <module>
from pyshacl.validate import validate, Validator
File "C:\Python37x64\lib\site-packages\pyshacl\validate.py", line 5, in <module>
import owlrl
File "C:\Python37x64\Scripts\owlrl.py", line 4, in <module>
from owlrl import convert_graph, RDFXML, TURTLE, JSON, AUTO, RDFA
ImportError: cannot import name 'convert_graph' from 'owlrl' (C:\Python37x64\Scripts\owlrl.py)
I think I set up all the path variables related to the packages, but I can't get this error fixed.
My system is Windows 10. I don't know whether this causes the problem.
Can somebody help me here?
Best regards
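The last frame of the traceback suggests that `import owlrl` resolved to the script C:\Python37x64\Scripts\owlrl.py rather than to the installed owlrl package. A quick way to check which file a module name resolves to, using only the standard library, is sketched below; `module_origin` is a name made up for this example.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module name resolves to, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Prints the resolved path, or None when owlrl is not importable at all.
print(module_origin("owlrl"))
```

If the printed path points into a Scripts directory instead of site-packages, the script is shadowing the package on the import path.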
Given the Shapes Graph:
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/ex#> .
ex:Parent a rdfs:Class ;
rdfs:isDefinedBy ex: ;
rdfs:comment "The parent class"@en ;
rdfs:subClassOf owl:Thing .
ex:ParentShape a sh:NodeShape ;
rdfs:isDefinedBy ex: ;
sh:property [
sh:datatype xsd:string ;
sh:path ex:name ;
sh:maxCount 1 ;
sh:minCount 1 ;
] ;
sh:closed true ;
sh:ignoredProperties ( rdf:type ) ;
sh:targetClass ex:Parent .
and the Data Graph:
{
"@context": {
"@vocab": "http://example.com/ex#"
},
"@type": "Parent",
"name": "Father",
"dummy": "Dummy value"
}
I expect to see a sh:ClosedConstraintComponent validation failure because of (ex:ParentShape sh:closed, true) in the Shapes Graph and the presence of the property "dummy": "Dummy value" in the Data Graph.
However, using the pyshacl (0.9.5) validate function no such validation failure is generated. Instead the text result is:
Validation Report
Conforms: True
In http://shacl.org/playground/ the expected validation failure is produced.
The SHACL recommendation states the following option to define a focus node:
specified as explicit input to the SHACL processor for validating a specific RDF term against a shape
https://www.w3.org/TR/shacl/#focusNodes
It would be great to have this feature.
BTW. Great project! It got me started with SHACL in minutes.
If I run the following code:
shapes = rdf.Graph()
shapes.parse(data="""
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex: <http://example.org/ns#> .
ex:Person
a owl:Class ;
a sh:NodeShape ;
sh:property ex:NameConstraint ;
.
ex:NameConstraint
a sh:PropertyShape ;
sh:path ex:name ;
sh:minCount 1 ;
.
""",format="ttl")
data = rdf.Graph()
data.parse(data="""
@prefix ex: <http://example.org/ns#> .
ex:Bob
a ex:Person ;
.
""",format="ttl")
r = sh.validate(data_graph=data,shacl_graph=shapes,inference='rdfs')
print(r[2])
no validation errors are reported. To force the error to be recognized, I have to explicitly declare ex:Person sh:targetClass ex:Person in the shapes graph, which shouldn't be necessary.
This is how TopQuadrant products represent classes and node shapes by default, so it would be great if pyshacl could support this.
pyshacl is giving an unexpected violation, one that I'm not seeing on the javascript https://shacl.org/playground/ (and pyshacl is also not showing the sh:message of the only property).
Data:
@prefix ex: <http://example.org/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:Document
a schema:Document ;
schema:isTargetOf [ a schema:HasAuthor ;
schema:isPresent true ] ;
schema:isTargetOf [ a schema:otherClass ;
schema:isPresent true ] ;
.
shacl constraints
@prefix dash: <http://datashapes.org/dash#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
schema:DocumentShape
a sh:NodeShape ;
sh:targetClass schema:Document ;
sh:property [
sh:message "At least one Author" ;
sh:path schema:isTargetOf ;
sh:qualifiedMinCount 1 ;
sh:qualifiedValueShape [
sh:class schema:HasAuthor ;
]
] ;
.
Python code used:
import rdflib
from pyshacl import validate
data_filename = "data/shacl/example_data_value.ttl"
data_graph = rdflib.Graph()
data_graph.parse(data_filename, format='n3')
constraints_filename = "data/shacl/shacl_constraints_value.ttl"
constraints_graph = rdflib.Graph()
constraints_graph.parse(constraints_filename, format='n3')
r = validate(data_graph,
shacl_graph=constraints_graph,
# ont_graph=og,
inference='rdfs', abort_on_error=False,
meta_shacl=False, debug=True, advanced=True)
conforms, results_graph, results_text = r
conforms
What I'm seeing in the terminal (note the absence of the sh:message):
$ python3 data/shacl/validate_transition.py
Constraint Violation in ClassConstraintComponent (http://www.w3.org/ns/shacl#ClassConstraintComponent):
Severity: sh:Violation
Source Shape: [ sh:class schema:HasAuthor ]
Focus Node: [ rdf:type rdfs:Resource, schema:otherClass ; schema:isPresent Literal("true" = True, datatype=xsd:boolean) ]
Value Node: [ rdf:type rdfs:Resource, schema:otherClass ; schema:isPresent Literal("true" = True, datatype=xsd:boolean) ]
pyshacl seems to ignore constraints defined on a superclass when validating an instance of a subclass. For example, given the SHACL
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/ex#> .
ex: a owl:Ontology ;
rdfs:label "Example"@en ;
rdfs:comment "Example"@en ;
owl:versionInfo "" ;
sh:declare [ sh:namespace "http://example.com/ex#" ;
sh:prefix "ex" ] .
ex:Parent a rdfs:Class ;
rdfs:isDefinedBy ex: ;
rdfs:comment "The parent class"@en ;
rdfs:subClassOf owl:Thing .
ex:ParentShape a sh:NodeShape ;
rdfs:isDefinedBy ex: ;
sh:property [
sh:datatype xsd:string ;
sh:path ex:name ;
sh:maxCount 1 ;
sh:minCount 1 ;
] ;
sh:targetClass ex:Parent .
ex:Child a rdfs:Class ;
rdfs:isDefinedBy ex: ;
rdfs:comment "The child class"@en ;
rdfs:subClassOf ex:Parent .
ex:ChildShape a sh:NodeShape ;
rdfs:isDefinedBy ex: ;
rdfs:subClassOf ex:ParentShape ;
sh:property [
sh:datatype xsd:integer ;
sh:path ex:age ;
sh:maxCount 1 ;
sh:minCount 1 ;
] ;
sh:targetClass ex:Child .
Validating a json-ld instance of Child that is missing the name property from Parent against the above SHACL
{
"@context": {
"@vocab": "http://example.com/ex#"
},
"@type": "Child",
"age": 3
}
does not find a violation. I had expected that validating a subclass instance would also include constraints from the superclass.
Is this a misunderstanding of SHACL on my part or an issue with pyshacl?
It is quite a common requirement to test an IRI to check whether it is in a specific namespace, or contains a path element which is a specific character string or pattern. While the SHACL spec appears to restrict application of sh:pattern to string literals, it would be helpful to allow a 'relaxed' mode where it can also apply to IRIs (which are, after all, just a sequence of characters).
Note that the TopBraid SHACL engine (maintained by the SHACL editor @HolgerKnublauch ) does operate in this mode - see https://groups.google.com/forum/?utm_source=digest&utm_medium=email#!topic/topbraid-users/BUoROZt0BhM
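As a rough sketch of the requested behaviour, the check below applies a regular expression to the string form of an IRI. The helper name iri_matches, its flags argument, and the example IRIs are illustrative assumptions; none of this is pySHACL's own API.

```python
import re


def iri_matches(iri: str, pattern: str, flags: str = "") -> bool:
    """Return True if the pattern is found anywhere in the IRI string.

    Hypothetical helper: uses re.search (partial match) rather than
    re.fullmatch, so a pattern only has to match part of the IRI.
    """
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.search(pattern, iri, re_flags) is not None


# Example: require that an IRI lives in a given namespace.
namespace_pattern = r"^http://example\.com/ex#"
print(iri_matches("http://example.com/ex#Child", namespace_pattern))
print(iri_matches("http://example.org/other#Child", namespace_pattern))
```

The first call prints True and the second prints False, because only the first IRI starts with the namespace.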
It seems that RDFClosure has been renamed to OWL-RL and that RDFClosure has been removed from the online repositories. As a result it is not installed as a dependency of pyshacl, and pyshacl throws a lot of ModuleNotFoundErrors, e.g.:
### pyshacl --help
Traceback (most recent call last):
File "/usr/local/bin/pyshacl", line 17, in <module>
from pyshacl import validate
File "/usr/local/lib/python3.7/site-packages/pyshacl/__init__.py", line 3, in <module>
from pyshacl.validate import validate
File "/usr/local/lib/python3.7/site-packages/pyshacl/validate.py", line 12, in <module>
from pyshacl.inference import CustomRDFSSemantics, CustomRDFSOWLRLSemantics
File "/usr/local/lib/python3.7/site-packages/pyshacl/inference/__init__.py", line 2, in <module>
from .custom_rdfs_closure import CustomRDFSSemantics, CustomRDFSOWLRLSemantics
File "/usr/local/lib/python3.7/site-packages/pyshacl/inference/custom_rdfs_closure.py", line 2, in <module>
from RDFClosure.RDFSClosure import RDFS_Semantics as OrigRDFSSemantics
ModuleNotFoundError: No module named 'RDFClosure'
I'm using version 0.9.7 of pyshacl from PyPI, and the issue happens for every pyshacl command I have tested so far.
Hello,
I noticed that pySHACL supports SHACL Advanced Features (SPARQL).
I wonder if you also plan to support the SHACL JavaScript Extensions (SHACL-JS)?
Best Regards,
Angelo
Updated the library to pull in some recent changes, and ran into this error: super(type, obj): obj must be an instance or subtype of type. The function below was running fine before the update. Any idea what could be causing this?
try:
conforms, v_graph, v_text = validate(places, shacl_graph=places_shape,
data_graph_format=data_file_format,
shacl_graph_format=shapes_file_format,
inference='rdfs', debug=True,
serialize_report_graph=True)
print(conforms)
except Exception as e:
print(e)
pass
When I try to do SPARQL-based SHACL validation, I am getting the wrong results. I am trying to filter out processes (Testsparql:Process) where Testsparql:Cranecapacity is less than Testsparql:Moduleweight. I get the desired output when my data file and shape file are in a single RDF file. However, when I split them into two RDF files, I do not get the correct inference.
Two-file case:
from pyshacl import validate
shapes_file = '''
@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
Testsparql:PrefixDeclaration
rdf:type sh:PrefixDeclaration ;
sh:namespace "http://semanticprocess.x10host.com/Ontology/Testsparql#"^^xsd:anyURI ;
sh:prefix "Testsparql" ;
.
Testsparql:Processshape
rdf:type rdfs:Class ;
rdf:type sh:NodeShape ;
rdfs:subClassOf owl:Class ;
sh:sparql [
sh:message "Invalid process" ;
sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;
sh:select """SELECT $this
WHERE {
$this rdf:type Testsparql:Process.
$this Testsparql:hasResource ?crane.
$this Testsparql:hasAssociation ?module.
?crane Testsparql:Cranecapacity ?cc.
?module Testsparql:Moduleweight ?mw.
FILTER (?cc <= ?mw).
}""" ;
] ;
.
'''
shapes_file_format = 'turtle'
data_file = '''
@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://semanticprocess.x10host.com/Ontology/Testsparql>
rdf:type owl:Ontology ;
owl:imports <http://datashapes.org/dash> ;
owl:versionInfo "Created with TopBraid Composer" ;
sh:declare Testsparql:PrefixDeclaration ;
.
Testsparql:Crane
rdf:type rdfs:Class ;
rdfs:subClassOf owl:Class ;
.
Testsparql:Crane_1
rdf:type Testsparql:Crane ;
Testsparql:Cranecapacity "500"^^xsd:decimal ;
.
Testsparql:Crane_2
rdf:type Testsparql:Crane ;
Testsparql:Cranecapacity "5000"^^xsd:decimal ;
.
Testsparql:Cranecapacity
rdf:type owl:DatatypeProperty ;
rdfs:domain Testsparql:Crane ;
rdfs:range xsd:decimal ;
rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Module
rdf:type rdfs:Class ;
rdfs:subClassOf owl:Class ;
.
Testsparql:Module_1
rdf:type Testsparql:Module ;
Testsparql:Moduleweight "800"^^xsd:decimal ;
.
Testsparql:Moduleweight
rdf:type owl:DatatypeProperty ;
rdfs:domain Testsparql:Module ;
rdfs:range xsd:decimal ;
rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Process
rdf:type rdfs:Class ;
rdfs:subClassOf owl:Class ;
.
Testsparql:ProcessID
rdf:type owl:DatatypeProperty ;
rdfs:domain Testsparql:Process ;
rdfs:range xsd:string ;
rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Process_1
rdf:type Testsparql:Process ;
Testsparql:ProcessID "P1" ;
Testsparql:hasAssociation Testsparql:Module_1 ;
Testsparql:hasResource Testsparql:Crane_1 ;
.
Testsparql:Process_2
rdf:type Testsparql:Process ;
Testsparql:ProcessID "P2" ;
Testsparql:hasAssociation Testsparql:Module_1 ;
Testsparql:hasResource Testsparql:Crane_2 ;
.
Testsparql:hasAssociation
rdf:type owl:ObjectProperty ;
rdfs:domain Testsparql:Process ;
rdfs:range Testsparql:Module ;
rdfs:subPropertyOf owl:topObjectProperty ;
.
Testsparql:hasResource
rdf:type owl:ObjectProperty ;
rdfs:domain Testsparql:Process ;
rdfs:range Testsparql:Crane ;
rdfs:subPropertyOf owl:topObjectProperty ;
.
'''
data_file_format = 'turtle'
conforms, v_graph, v_text = validate(data_file, shacl_graph=shapes_file,
target_graph_format=data_file_format,
shacl_graph_format=shapes_file_format,
inference='rdfs', debug=True,
serialize_report_graph=True)
print(conforms)
print(v_graph)
print(v_text)
Result is:
True
b'@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .\n@prefix owl: <http://www.w3.org/2002/07/owl#> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xml: <http://www.w3.org/XML/1998/namespace> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n[] a sh:ValidationReport ;\n sh:conforms true .\n\n'
Validation Report
Conforms: True
However, if the same data is given in a single file:
from pyshacl import validate
data_file = '''
# baseURI: http://semanticprocess.x10host.com/Ontology/Testsparql
# imports: http://datashapes.org/dash
# prefix: Testsparql
@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://semanticprocess.x10host.com/Ontology/Testsparql>
rdf:type owl:Ontology ;
owl:imports <http://datashapes.org/dash> ;
owl:versionInfo "Created with TopBraid Composer" ;
sh:declare Testsparql:PrefixDeclaration ;
.
Testsparql:Crane
rdf:type rdfs:Class ;
rdfs:subClassOf owl:Class ;
.
Testsparql:Crane_1
rdf:type Testsparql:Crane ;
Testsparql:Cranecapacity "500"^^xsd:decimal ;
.
Testsparql:Crane_2
rdf:type Testsparql:Crane ;
Testsparql:Cranecapacity "5000"^^xsd:decimal ;
.
Testsparql:Cranecapacity
rdf:type owl:DatatypeProperty ;
rdfs:domain Testsparql:Crane ;
rdfs:range xsd:decimal ;
rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Module
rdf:type rdfs:Class ;
rdfs:subClassOf owl:Class ;
.
Testsparql:Module_1
rdf:type Testsparql:Module ;
Testsparql:Moduleweight "800"^^xsd:decimal ;
.
Testsparql:Moduleweight
rdf:type owl:DatatypeProperty ;
rdfs:domain Testsparql:Module ;
rdfs:range xsd:decimal ;
rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:PrefixDeclaration
rdf:type sh:PrefixDeclaration ;
sh:namespace "http://semanticprocess.x10host.com/Ontology/Testsparql#"^^xsd:anyURI ;
sh:prefix "Testsparql" ;
.
Testsparql:Process
rdf:type rdfs:Class ;
rdf:type sh:NodeShape ;
rdfs:subClassOf owl:Class ;
sh:sparql [
sh:message "Invalid process" ;
sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;
sh:select """SELECT $this
WHERE {
$this rdf:type Testsparql:Process.
$this Testsparql:hasResource ?crane.
$this Testsparql:hasAssociation ?module.
?crane Testsparql:Cranecapacity ?cc.
?module Testsparql:Moduleweight ?mw.
FILTER (?cc <= ?mw).
}""" ;
] ;
.
Testsparql:ProcessID
rdf:type owl:DatatypeProperty ;
rdfs:domain Testsparql:Process ;
rdfs:range xsd:string ;
rdfs:subPropertyOf owl:topDataProperty ;
.
Testsparql:Process_1
rdf:type Testsparql:Process ;
Testsparql:ProcessID "P1" ;
Testsparql:hasAssociation Testsparql:Module_1 ;
Testsparql:hasResource Testsparql:Crane_1 ;
.
Testsparql:Process_2
rdf:type Testsparql:Process ;
Testsparql:ProcessID "P2" ;
Testsparql:hasAssociation Testsparql:Module_1 ;
Testsparql:hasResource Testsparql:Crane_2 ;
.
Testsparql:hasAssociation
rdf:type owl:ObjectProperty ;
rdfs:domain Testsparql:Process ;
rdfs:range Testsparql:Module ;
rdfs:subPropertyOf owl:topObjectProperty ;
.
Testsparql:hasResource
rdf:type owl:ObjectProperty ;
rdfs:domain Testsparql:Process ;
rdfs:range Testsparql:Crane ;
rdfs:subPropertyOf owl:topObjectProperty ;
.
'''
data_file_format = 'turtle'
conforms, v_graph, v_text = validate(data_file, shacl_graph=None,
target_graph_format=data_file_format,
shacl_graph_format=shapes_file_format,
inference='rdfs', debug=True,
serialize_report_graph=True)
print(conforms)
print(v_graph)
print(v_text)
It gives the correct inference.
False
b'@prefix Testsparql: <http://semanticprocess.x10host.com/Ontology/Testsparql#> .\n@prefix owl: <http://www.w3.org/2002/07/owl#> .\n@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix xml: <http://www.w3.org/XML/1998/namespace> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n[] a sh:ValidationReport ;\n sh:conforms false ;\n sh:result [ a sh:ValidationResult ;\n sh:focusNode Testsparql:Process_1 ;\n sh:resultMessage "Invalid process" ;\n sh:resultSeverity sh:Violation ;\n sh:sourceConstraint [ sh:message "Invalid process" ;\n sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ;\n sh:select """SELECT $this \n WHERE {\n\t\t\t $this rdf:type Testsparql:Process.\n\t\t\t$this Testsparql:hasResource ?crane.\n\t\t\t$this Testsparql:hasAssociation ?module.\n\t\t\t?crane Testsparql:Cranecapacity ?cc.\n\t\t\t?module Testsparql:Moduleweight ?mw.\n\t\t\t\t\tFILTER (?cc <= ?mw).\n\n }""" ] ;\n sh:sourceConstraintComponent sh:SPARQLConstraintComponent ;\n sh:sourceShape Testsparql:Process ;\n sh:value Testsparql:Process_1 ] .\n\n'
Validation Report
Conforms: False
Results (1):
Constraint Violation in SPARQLConstraintComponent (http://www.w3.org/ns/shacl#SPARQLConstraintComponent):
Severity: sh:Violation
Source Shape: Testsparql:Process
Focus Node: Testsparql:Process_1
Value Node: Testsparql:Process_1
Source Constraint: [ sh:message Literal("Invalid process") ; sh:prefixes <http://semanticprocess.x10host.com/Ontology/Testsparql> ; sh:select Literal("SELECT $this
WHERE {
$this rdf:type Testsparql:Process.
$this Testsparql:hasResource ?crane.
$this Testsparql:hasAssociation ?module.
?crane Testsparql:Cranecapacity ?cc.
?module Testsparql:Moduleweight ?mw.
FILTER (?cc <= ?mw).
}") ]
Message: Invalid process
Can you help me understand why this is giving the wrong inference?
This is clearly a low priority, minor issue, but considering the amount of time I spent trying to figure out what was wrong with my rather large SHACL data, I thought it was worth considering and suggesting a change.
A minimal example to illustrate the issue is:
import rdflib
from pyshacl import validate
data = """
@prefix asdf: <http://example.org/asdf/> .
@prefix ex: <http://example.org/> .
asdf:e2e a ex:termA ;
ex:child asdf:23e .
asdf:23e a ex:termB .
"""
shaclData = """
@prefix ex: <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
ex:termShape a sh:NodeShape ;
sh:ignoredProperties ( rdf:type ) ;
sh:targetClass ex:termB .
"""
dataGraph = rdflib.Graph().parse( data = data, format = 'ttl' )
shaclGraph = rdflib.Graph().parse( data = shaclData, format = 'ttl' )
report = validate( dataGraph, shacl_graph = shaclGraph, abort_on_error = False, meta_shacl = False, debug = False, advanced = True, do_owl_imports = True )
This generates what I found to be a confusing error message:
ConstraintLoadError: ClosedConstraintComponent must have at least one sh:closed predicate.
https://www.w3.org/TR/shacl/#ClosedConstraintComponent
The issue is that when using sh:ignoredProperties, sh:closed is expected.
pySHACL is reporting that one is using something related to closed shapes without having a closed shape.
If possible, I would love to see some improvement to the clarity of the error.
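One way such a clearer message could look is sketched below. It is a standalone toy check over plain dictionaries, not pySHACL code: the function name closed_shape_problems, the dictionary layout, and the message wording are all assumptions made for illustration.

```python
def closed_shape_problems(shapes):
    """Return readable messages for shapes that set sh:ignoredProperties without sh:closed.

    `shapes` maps a shape name to a dict of its SHACL properties
    (a simplified, hypothetical stand-in for a parsed shapes graph).
    """
    problems = []
    for name, props in shapes.items():
        if "sh:ignoredProperties" in props and "sh:closed" not in props:
            problems.append(
                f"{name}: sh:ignoredProperties is set but sh:closed is missing; "
                "add 'sh:closed true' or remove sh:ignoredProperties."
            )
    return problems


# The shape from the minimal example above, written as a dict.
shapes = {
    "ex:termShape": {
        "sh:ignoredProperties": ["rdf:type"],
        "sh:targetClass": "ex:termB",
    }
}
for message in closed_shape_problems(shapes):
    print(message)
```

Naming the shape and the property that triggered the check makes it easier to locate the problem in a large shapes file.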
I have the following:
graph_data = """
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sch: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:JohnDoe a ex:XXXX .
ex:JohnDoe ex:name "hello.txt" .
"""
shape_data = """
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sch: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:PersonShape
a sh:NodeShape ;
sh:targetClass ex:XXXX ;
sh:property ex:PersonShape-name .
ex:PersonShape-name
a sh:PropertyShape ;
sh:path ex:name ;
sh:minCount 1 ;
sh:pattern ".*.txt" .
"""
data = rdflib.Graph().parse( data = graph_data, format = 'ttl' )
shape = rdflib.Graph().parse( data = shape_data, format = 'ttl' )
print( f"{data.serialize( format = 'ttl' ).decode( 'utf8' )}" )
report = validate( data, shacl_graph=shape, abort_on_error = False, meta_shacl = False, debug = True, advanced = True )
print( report[2] )
The sh:pattern should be ".*\.txt", but when I do that, the following errors are generated:
... notation3.py", line 1591, in strconst "bad escape")
... notation3.py", line 1615, in BadSyntax raise BadSyntax(self._thisDoc, self.lines, argstr, i, msg)
File "<string>", line unknown
BadSyntax
At least according to http://www.datypic.com/books/xquery/chapter19.html, I am using the escape correctly.
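One likely explanation, offered as an assumption rather than a confirmed diagnosis: Turtle string literals accept only a fixed set of backslash escapes, and `\.` is not one of them, so the backslash itself has to be escaped in the Turtle text. In addition, a regular (non-raw) Python string consumes one level of backslashes before the parser sees the text. The sketch below uses only the standard library to show the layers; the variable names are illustrative.

```python
import re

# Raw Python string: the Turtle text contains ".*\\.txt" (two backslashes),
# which a Turtle parser should unescape to the regex  .*\.txt
turtle_literal = r'".*\\.txt"'

# Simulate only the Turtle unescaping step for this one escape sequence.
regex = turtle_literal.strip('"').replace("\\\\", "\\")
print(regex)

# The unescaped dot in the original pattern matches any character.
print(bool(re.search(".*.txt", "hello_txt")))
print(bool(re.search(regex, "hello_txt")))
print(bool(re.search(regex, "hello.txt")))
```

The unescaped pattern accepts "hello_txt", while the escaped pattern accepts only a literal ".txt".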
It can sometimes (often) be the case that the combination of the SHACL Shape file and the Data File together do not give the pySHACL validation engine enough information to generate a correct validation result, even if inferencing is run across the input data file.
For example:
I have a shape file which asserts that for all instances of the class Human, if they have a property called hasPet, the target object of that property must be an instance of the class Animal.
I have a data file containing statements:
Person1: instance of Human named "Amy"; she has a property hasPet with the target Pet1.
Pet1: instance of Lizard named "Sebastian".
If I run the validator across those inputs, it will return a validation result indicating failure because the pet is not of type Animal. Even if inferencing is run on the data file, there is no way for the validator to know that Lizard is a subclass of Animal, so the validation still returns the result.
In order for this validation to work, there needs to be a statement of (Lizard, rdfs:subClassOf, Animal) included in the data file before submitting it to the validator, and basic RDFS inferencing must be run on the data graph before validating, to ensure the (Pet1, rdf:type, Animal) triple is created in the data graph.
This is a very simple example but hopefully highlights the problem faced, where any extra ontological information required for inferencing needs to be added into the data file before passing it to the validator. This is inconvenient because in most practical applications of pySHACL, the data file is an isolated data snippet, without any accompanying ontological information.
It is sometimes the case that extra ontological information is added into the SHACL Shape file, or indeed that the SHACL Shapes are included as part of an ontology document itself. This does not help in this situation, because the file passed into the validator and parsed into the SHACL Shapes graph does not get mixed into the data graph, so those extra ontological statements do not take effect in the inferencing step on the data graph (and inferencing is never applied to the SHACL graph).
I propose an extra feature for pySHACL where you can optionally specify the location to an extra static ontology document, which gets ingested and mixed into the data graph prior to the inferencing step.
This will be a new feature in the python module, and exposed as an option on the command line tool, and as an optional field on the web tool.
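The idea can be modelled with a small, library-independent sketch. Plain Python tuples stand in for RDF triples, and the names mix_in and infer_types are hypothetical; a real implementation would work on rdflib graphs and a proper RDFS reasoner instead.

```python
# Plain tuples stand in for RDF triples in this toy model.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def mix_in(data, ontology):
    """Return a new triple set: the data graph plus the ontology statements."""
    return set(data) | set(ontology)


def infer_types(triples):
    """Apply the RDFS subclass rule until no new rdf:type triple appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                new = (s, RDF_TYPE, sup)
                if o == sub and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


data = {
    ("Person1", RDF_TYPE, "Human"),
    ("Person1", "hasPet", "Pet1"),
    ("Pet1", RDF_TYPE, "Lizard"),
}
ontology = {("Lizard", SUBCLASS_OF, "Animal")}

pet_is_animal = ("Pet1", RDF_TYPE, "Animal")
print(pet_is_animal in infer_types(data))                    # data alone
print(pet_is_animal in infer_types(mix_in(data, ontology)))  # data + ontology
```

Inference over the data alone cannot derive that Pet1 is an Animal; once the ontology statement is mixed in before inferencing, the triple appears and a class constraint on Animal could be satisfied.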
Hi,
I'm just trying to clarify whether pySHACL supports advanced features such as sh:target, sh:filterShapeNode, etc. It looks like it doesn't support those properties currently.
Thanks,
Yi