Anno4j

This library mainly provides programmatic access to the W3C Web Annotation Data Model (formerly known as the W3C Open Annotation Data Model), allowing Annotations to be read from and written to local or remote SPARQL endpoints. An easy-to-use and extensible Java API allows the creation and querying of Annotations even for non-experts. This API is augmented with various supporting functionalities that ease working with W3C Web Annotations.
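
For orientation, a minimal Web Annotation expressed in Turtle, using the WADM vocabulary (the example IRIs are illustrative placeholders, not part of Anno4j):

    @prefix oa: <http://www.w3.org/ns/oa#> .

    <http://example.org/anno1> a oa:Annotation ;
        oa:hasBody <http://example.org/body1> ;
        oa:hasTarget <http://example.org/target1> .

Anno4j maps such RDF structures to and from plain Java objects.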

With the latest iteration, Anno4j has also been extended to work with generic metadata models. It is now possible to parse an RDFS or OWL Lite schema and generate the respective Anno4j classes on the fly via code generation.

Build Status

master branch: Build Status develop branch: Build Status

Table of Contents

The use of the Anno4j library and its features is documented in the Anno4j GitHub Wiki. Its features are the following:

  • Extensible creation of Web/Open Annotations based on Java Annotations syntax (see Getting Started)
  • Built-in and predefined implementations for nearly all RDF classes conforming to the W3C Web Annotation Data Model
  • Created (and annotated) Java POJOs are transformed to RDF and automatically transmitted to local/remote SPARQL 1.1 endpoints using the SPARQL Update functionality
  • Querying of annotations with path-based criteria (see Querying)
    • Basic comparisons like "equal", "greater", and "lower"
    • String comparisons: "equal", "contains", "starts with", and "ends with"
    • Union of different paths
    • Type condition
    • Custom filters
  • Addition of custom behaviour to otherwise simple Anno4j classes through partial/support classes (see Support Classes)
  • Input and output to and from different standardised RDF serialisation formats (see RDF Input and Output)
  • Parsing of RDFS or OWL Lite schemata to automatically generate respective Anno4j classes (see Java File Generation)
  • Schema/Validation annotations that can be added to Anno4j classes to enforce schema-correctness, which is checked at the point of creation (see Schema Validation and Schema Annotations)
  • A tool to support the generation of so-called proxy classes, which speed up the creation of instances of large and deep schemata
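
As a quick taste of the API, the following sketch creates an annotation and queries it back. It is modelled on the usage shown in the Anno4j Wiki; the exact method names (createObject, addCriteria, execute) are version-dependent, so treat this as an illustration rather than a verbatim recipe:

    // Sketch only: assumes anno4j-core on the classpath and a configured endpoint.
    Anno4j anno4j = new Anno4j();

    // Created POJOs are transparently persisted as RDF triples.
    Annotation annotation = anno4j.createObject(Annotation.class);

    // Query annotations via a path-based (LDPath) criterion.
    QueryService queryService = anno4j.createQueryService();
    queryService.addCriteria("oa:hasTarget", "http://example.org/target1");
    List<Annotation> result = queryService.execute(Annotation.class);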

Status of Anno4j and the implemented WADM specification

Version 2.4 of Anno4j supports the current W3C Recommendation of the Web Annotation Data Model.

Development Guidelines

Snapshot

Each push to the develop branch triggers the build of a snapshot version. Snapshots are publicly available:

	<dependency>
	  <groupId>com.github.anno4j</groupId>
	  <artifactId>anno4j-core</artifactId>
	  <version>X.X.X-SNAPSHOT</version>
	</dependency>
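
Snapshot artifacts are usually not served from Maven Central, so a snapshot repository declaration may additionally be required in your pom.xml. The repository URL below is an assumption (the Sonatype OSS snapshot repository); check the project's CI/deployment configuration for the actual location:

	<repositories>
	  <repository>
	    <id>ossrh-snapshots</id>
	    <!-- Assumed location; verify against the project's deployment setup. -->
	    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
	    <snapshots>
	      <enabled>true</enabled>
	    </snapshots>
	  </repository>
	</repositories>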

Compile, Package and Install

Package with:

      mvn package

Install to your local repository with:

      mvn install

Participate

  1. Create an issue
  2. Fork Anno4j
  3. Add features
  4. Add JUnit Tests
  5. Create pull request to anno4j/develop

3rd party integration of custom LDPath expressions

To contribute custom LDPath (test) functions and thereby custom LDPath syntax, the following two classes have to be provided:

  1. Step:

Create a Java class that extends either the SelectorFunction class or the TestFunction class. This class defines the actual syntax that has to be injected into the Anno4j evaluation process.

    public class GetSelector extends SelectorFunction<Node> {
    
        // The local name determines the syntax of the function inside
        // LDPath expressions, e.g. fn:getSelector(.).
        @Override
        protected String getLocalName() {
            return "getSelector";
        }
    
        // The actual evaluation is delegated to the @Evaluator-annotated
        // class registered for this function, so apply() is not used here.
        @Override
        public Collection<Node> apply(RDFBackend<Node> backend, Node context, Collection<Node>... args) throws IllegalArgumentException {
            return null;
        }
    
        // Signature and description are used for documentation purposes.
        @Override
        public String getSignature() {
            return "fn:getSelector(Annotation) : Selector";
        }
    
        @Override
        public String getDescription() {
            return "Selects the Selector of a given annotation object.";
        }
    }

  2. Step:

Create a Java class that actually evaluates the newly provided LDPath expression. This class needs to be flagged with the @Evaluator Java annotation. The @Evaluator annotation requires the class of the description mentioned in the first step. Besides that, the evaluator has to implement either the QueryEvaluator or the TestEvaluator interface. Inside the prepared evaluate method, the actual SPARQL query has to be generated using the Apache Jena framework.

    // Translates the fn:getSelector function into SPARQL triple patterns
    // using Apache Jena's ARQ API.
    @Evaluator(GetSelector.class)
    public class GetSelectorFunctionEvaluator implements QueryEvaluator {
        @Override
        public Var evaluate(NodeSelector nodeSelector, ElementGroup elementGroup, Var var, LDPathEvaluatorConfiguration evaluatorConfiguration) {
            // Resolve the node the function is applied to.
            Var evaluate = new SelfSelectionEvaluator().evaluate(nodeSelector, elementGroup, var, evaluatorConfiguration);
            Var target = Var.alloc("target");
            Var selector = Var.alloc("selector");
    
            // Adds: ?node oa:hasTarget ?target . ?target oa:hasSelector ?selector .
            elementGroup.addTriplePattern(new Triple(evaluate.asNode(), new ResourceImpl(OADM.HAS_TARGET).asNode(), target));
            elementGroup.addTriplePattern(new Triple(target.asNode(), new ResourceImpl(OADM.HAS_SELECTOR).asNode(), selector));
            return selector;
        }
    }

Contributors

  • Kai Schlegel (University of Passau)
  • Andreas Eisenkolb (University of Passau)
  • Emanuel Berndl (University of Passau)
  • Thomas Weißgerber (University of Passau)
  • Matthias Fisch (University of Passau)

This software was partially developed within the MICO project (Media in Context - European Commission 7th Framework Programme grant agreement no: 610480) and the ViSIT project (Virtuelle Verbund-Systeme und Informations-Technologien für die touristische Erschließung von kulturellem Erbe - Interreg Österreich-Bayern 2014-2020, project code: AB78).

License

Apache License Version 2.0 - http://www.apache.org/licenses/LICENSE-2.0


Issues

resolvable objects

Each object with a URI should be resolvable (e.g. an SVG selector). A resolve method should do an HTTP GET call to fetch the real content.

Create a method to add criteria objects to the query service

In the current version it is only possible to specify criteria using the shortcut methods, which require three parameters (ldpath, constraint and comparison method). It should be possible to add a criteria object that already contains this information.

[QueryService] Search using a URI

It seems that searching using a URI does not work as expected.

Assume that I want to get all annotations that have as body a specific resource. I guess I would have to use something like

queryService.setAnnotationCriteria("oa:hasBody", uri).execute();

If the string of the uri is of the form "http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34" I get an exception. It seems that since this is a URI it accepts a URI in the form . The exception is given below.

Caused by: org.openrdf.rio.RDFParseException: Expected '<' or '_', found: @ [line 1]
    at org.openrdf.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:441)
    at org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:671)
    at org.openrdf.rio.ntriples.NTriplesParser.reportFatalError(NTriplesParser.java:613)
    at org.openrdf.rio.ntriples.NTriplesParser.parseSubject(NTriplesParser.java:350)
    at org.openrdf.rio.ntriples.NTriplesParser.parseTriple(NTriplesParser.java:282)
    at org.openrdf.rio.ntriples.NTriplesParser.parse(NTriplesParser.java:193)
    at org.openrdf.http.client.BackgroundGraphResult.run(BackgroundGraphResult.java:133)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

On the other hand if I use as the uri string the following "http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34" I get no results...

Checking the finally constructed SPARQL query in Virtuoso

SELECT  ?annotation
WHERE
  { ?annotation <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/oa#Annotation> .
    ?annotation <http://www.w3.org/ns/oa#hasBody> ?var1
    FILTER regex(str(?var1), "http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34", "")
  }

returns the correct results, while "<http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34 >"

SELECT  ?annotation
WHERE
  { ?annotation <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/oa#Annotation> .
    ?annotation <http://www.w3.org/ns/oa#hasBody> ?var1
    FILTER regex(str(?var1), "<http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34>", "")
  }

does not.

optimize tests

Tests take long because before each test a re-initialisation of the memory store takes place.

Solution 1:
a reset function on the object repository

Solution 2:
re-initialise the memory store instead of the whole object repository

Reuse of blank nodes

For example, instead of creating a new blank node for provenance agents per each annotation, when all annotations are serialized by the same software agent/the same person, etc.

Limit and Offset ignored in QueryService

QueryService<Annotation> queryService = anno4j.createQueryService(Annotation.class);
queryService.limit(5);
queryService.execute();

Creates the following query:

SELECT  ?annotation
WHERE
  { ?annotation <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/oa#Annotation>}

(NB: IntelliJ marks private Integer limit and private Integer offset in QueryService.java as unused, so I think it is not implemented.)

delete unnecessary alibaba tests

Testing duration is very long. I think we can remove many tests of the AliBaba test suite, because we only use a subpart of the whole project and we test it in our project anyway.

Recommendation, open questions

Looked at Apache Mahout - Taste (https://mahout.apache.org/users/recommender/recommender-documentation.html):

  • Mainly focused on collaborative filtering, item-based recommendations are possible but only in a very restrictive way. They claim that items are rather static, so their similarity can be precomputed.
  • Their basic engine needs to be supported with a list of precomputed item-similarities.
  • Maybe workflow: Given an annotation, define requested similarities -> Server queries related annotations (referred from its type) and calculates similarities -> Feed to taste -> Generate Recommendations

-> Questions:

  • Should our annotation similarity be static, calculated from time to time?
  • Should they be dynamic, calculated at request-time?
  • Between what kind of annotations/annotation types does one want to have similarities? All of them?
  • Cross-annotation similarity: how to define those similarities?
  • Can the desired similarity be defined on recommendation-request time?
  • Can the algorithms from taste be used on lower level annotation features?
  • Do we need the support of a framework altogether? Recommendations or similarities could be implemented at annotation level.

-> Ideas:

  • Basic annotation similarity based on annotation level and target level (e.g. User, Agent, Target resource, same selector/fragment, etc.)
  • Specific annotation similarity: Defined by some rules?

[QueryService] Missing fields in Annotations returned by the QueryService

The annotation returned after executing a QueryService (built with

createQueryService(Annotation.class, graph) 

and with an annotation criteria over oa:hasBody) is missing fields like body, serializationAt, serializationBy, targets, etc., as shown below (JSON-LD representation of the annotation).

{
  "@id" : "http://lifewatchgreece.eu/entity/annotations/235332c4-1543-439f-8433-bddb95f2295f",
  "@type" : "http://www.w3.org/ns/oa#Annotation",
  "http://www.w3.org/ns/oa#annotatedAt" : "2015-10-15T19:27:34"
}

Using Virtuoso as a SPARQL endpoint throws an exception due to a missing default graph

Virtuoso does not have an explicit unnamed default graph. As a result inserting an annotation throws the following exception:

Virtuoso 37000 Error SP031: SPARQL compiler: No default graph specified in the preamble, but it is needed for triple in INSERT DATA {...} without GRAPH {...}[\n]"

This is with Virtuoso 7.1.

ldpath unit test

Create unit tests for each LDPath language element we support.

SPARQLMM Integration

We should provide a generic addFilterCriteria to allow 3rd-party filters (e.g. SPARQLMM).

[Development guidelines]

Is it possible to add some development guidelines (e.g. how to build, test, etc)?

I am new to the world of Maven, and up to now, whenever I make a change to anno4j-core, I run "mvn package" at the anno4j-parent to build the corresponding jar I want to test. I guess this is not the correct process.
