Anno4j

This library mainly provides programmatic access to the W3C Web Annotation Data Model (formerly known as the W3C Open Annotation Data Model), allowing Annotations to be read from and written to local or remote SPARQL endpoints. An easy-to-use and extensible Java API allows the creation and querying of Annotations even for non-experts. This API is augmented with various supporting functionalities that ease working with W3C Web Annotations.
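
For orientation, a minimal Web Annotation expressed in Turtle, using the WADM vocabulary (the example IRIs are illustrative placeholders, not part of Anno4j):

    @prefix oa: <http://www.w3.org/ns/oa#> .

    <http://example.org/anno1> a oa:Annotation ;
        oa:hasBody <http://example.org/body1> ;
        oa:hasTarget <http://example.org/target1> .

Anno4j maps such RDF structures to and from plain Java objects.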

With the latest iteration, Anno4j has also been extended to work with generic metadata models. It is now possible to parse an RDFS or OWL Lite schema and generate the respective Anno4j classes on the fly via code generation.

Build Status

master branch: Build Status develop branch: Build Status

Table of Contents

The use of the Anno4j library and its features is documented in the Anno4j GitHub Wiki. Its features are the following:

  • Extensible creation of Web/Open Annotations based on Java Annotations syntax (see Getting Started)
  • Built-in and predefined implementations for nearly all RDF classes conforming to the W3C Web Annotation Data Model
  • Created (and annotated) Java POJOs are transformed to RDF and automatically transmitted to local/remote SPARQL 1.1 endpoints using the SPARQL Update functionality
  • Querying of annotations with path-based criteria (see Querying)
    • Basic comparisons like "equal", "greater", and "lower"
    • String comparisons: "equal", "contains", "starts with", and "ends with"
    • Union of different paths
    • Type condition
    • Custom filters
  • Addition of custom behaviour to otherwise simple Anno4j classes through partial/support classes (see Support Classes)
  • Input and output to and from different standardised RDF serialisation formats (see RDF Input and Output)
  • Parsing of RDFS or OWL Lite schemata to automatically generate respective Anno4j classes (see Java File Generation)
  • Schema/Validation annotations that can be added to Anno4j classes to enforce schema-correctness, which is checked at the point of creation (see Schema Validation and Schema Annotations)
  • A tool to support the generation of so-called proxy classes, which speed up the creation of instances of large and deep schemata
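
As a quick taste of the API, the following sketch creates an annotation and queries it back. It is modelled on the usage shown in the Anno4j Wiki; the exact method names (createObject, addCriteria, execute) are version-dependent, so treat this as an illustration rather than a verbatim recipe:

    // Sketch only: assumes anno4j-core on the classpath and a configured endpoint.
    Anno4j anno4j = new Anno4j();

    // Created POJOs are transparently persisted as RDF triples.
    Annotation annotation = anno4j.createObject(Annotation.class);

    // Query annotations via a path-based (LDPath) criterion.
    QueryService queryService = anno4j.createQueryService();
    queryService.addCriteria("oa:hasTarget", "http://example.org/target1");
    List<Annotation> result = queryService.execute(Annotation.class);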

Status of Anno4j and the implemented WADM specification

Version 2.4 of Anno4j supports the current W3C Recommendation of the Web Annotation Data Model.

Development Guidelines

Snapshot

Each push to the develop branch triggers the build of a snapshot version. Snapshots are publicly available:

	<dependency>
	  <groupId>com.github.anno4j</groupId>
	  <artifactId>anno4j-core</artifactId>
	  <version>X.X.X-SNAPSHOT</version>
	</dependency>
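
Snapshot artifacts are usually not served from Maven Central, so a snapshot repository declaration may additionally be required in your pom.xml. The repository URL below is an assumption (the Sonatype OSS snapshot repository); check the project's CI/deployment configuration for the actual location:

	<repositories>
	  <repository>
	    <id>ossrh-snapshots</id>
	    <!-- Assumed location; verify against the project's deployment setup. -->
	    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
	    <snapshots>
	      <enabled>true</enabled>
	    </snapshots>
	  </repository>
	</repositories>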

Compile, Package and Install

Package with:

      mvn package

Install to your local repository with:

      mvn install

Participate

  1. Create an issue
  2. Fork Anno4j
  3. Add features
  4. Add JUnit Tests
  5. Create pull request to anno4j/develop

3rd party integration of custom LDPath expressions

To contribute custom LDPath (test) functions and thereby custom LDPath syntax, the following two classes have to be provided:

  1. Step:

Create a Java class that extends either the SelectorFunction class or the TestFunction class. This class defines the actual syntax that has to be injected into the Anno4j evaluation process.

    public class GetSelector extends SelectorFunction<Node> {
    
        // The local name determines the syntax of the function inside
        // LDPath expressions, e.g. fn:getSelector(.).
        @Override
        protected String getLocalName() {
            return "getSelector";
        }
    
        // The actual evaluation is delegated to the @Evaluator-annotated
        // class registered for this function, so apply() is not used here.
        @Override
        public Collection<Node> apply(RDFBackend<Node> backend, Node context, Collection<Node>... args) throws IllegalArgumentException {
            return null;
        }
    
        // Signature and description are used for documentation purposes.
        @Override
        public String getSignature() {
            return "fn:getSelector(Annotation) : Selector";
        }
    
        @Override
        public String getDescription() {
            return "Selects the Selector of a given annotation object.";
        }
    }

  2. Step:

Create a Java class that actually evaluates the newly provided LDPath expression. This class needs to be flagged with the @Evaluator Java annotation. The @Evaluator annotation requires the class of the description mentioned in the first step. Besides that, the evaluator has to implement either the QueryEvaluator or the TestEvaluator interface. Inside the prepared evaluate method, the actual SPARQL query has to be generated using the Apache Jena framework.

    // Translates the fn:getSelector function into SPARQL triple patterns
    // using Apache Jena's ARQ API.
    @Evaluator(GetSelector.class)
    public class GetSelectorFunctionEvaluator implements QueryEvaluator {
        @Override
        public Var evaluate(NodeSelector nodeSelector, ElementGroup elementGroup, Var var, LDPathEvaluatorConfiguration evaluatorConfiguration) {
            // Resolve the node the function is applied to.
            Var evaluate = new SelfSelectionEvaluator().evaluate(nodeSelector, elementGroup, var, evaluatorConfiguration);
            Var target = Var.alloc("target");
            Var selector = Var.alloc("selector");
    
            // Adds: ?node oa:hasTarget ?target . ?target oa:hasSelector ?selector .
            elementGroup.addTriplePattern(new Triple(evaluate.asNode(), new ResourceImpl(OADM.HAS_TARGET).asNode(), target));
            elementGroup.addTriplePattern(new Triple(target.asNode(), new ResourceImpl(OADM.HAS_SELECTOR).asNode(), selector));
            return selector;
        }
    }

Contributors

  • Kai Schlegel (University of Passau)
  • Andreas Eisenkolb (University of Passau)
  • Emanuel Berndl (University of Passau)
  • Thomas Weißgerber (University of Passau)
  • Matthias Fisch (University of Passau)

This software was partially developed within the MICO project (Media in Context - European Commission 7th Framework Programme grant agreement no: 610480) and the ViSIT project (Virtuelle Verbund-Systeme und Informations-Technologien für die touristische Erschließung von kulturellem Erbe - Interreg Österreich-Bayern 2014-2020, project code: AB78).

License

Apache License Version 2.0 - http://www.apache.org/licenses/LICENSE-2.0


Issues

resolvable objects

Each object with a URI should be resolvable (e.g. an SVG selector). A resolve method should do an HTTP GET call to fetch the real content.

Create a method to add criteria objects to the query service

In the current version it is only possible to specify criteria using the shortcut methods, which require three parameters (ldpath, constraint and comparison method). It should be possible to add a criteria object that already contains this information.

[QueryService] Search using a URI

It seems that searching using a URI does not work as expected.

Assume that I want to get all annotations that have as body a specific resource. I guess I would have to use something like

queryService.setAnnotationCriteria("oa:hasBody", uri).execute();

If the string of the uri is of the form "http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34" I get an exception. It seems that since this is a URI it accepts a URI in the form . The exception is given below.

Caused by: org.openrdf.rio.RDFParseException: Expected '<' or '_', found: @ [line 1]
    at org.openrdf.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:441)
    at org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:671)
    at org.openrdf.rio.ntriples.NTriplesParser.reportFatalError(NTriplesParser.java:613)
    at org.openrdf.rio.ntriples.NTriplesParser.parseSubject(NTriplesParser.java:350)
    at org.openrdf.rio.ntriples.NTriplesParser.parseTriple(NTriplesParser.java:282)
    at org.openrdf.rio.ntriples.NTriplesParser.parse(NTriplesParser.java:193)
    at org.openrdf.http.client.BackgroundGraphResult.run(BackgroundGraphResult.java:133)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

On the other hand if I use as the uri string the following "http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34" I get no results...

Checking the finally constructed SPARQL query in Virtuoso

SELECT  ?annotation
WHERE
  { ?annotation <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/oa#Annotation> .
    ?annotation <http://www.w3.org/ns/oa#hasBody> ?var1
    FILTER regex(str(?var1), "http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34", "")
  }

returns the correct results, while "<http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34 >"

SELECT  ?annotation
WHERE
  { ?annotation <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/oa#Annotation> .
    ?annotation <http://www.w3.org/ns/oa#hasBody> ?var1
    FILTER regex(str(?var1), "<http://domain.com/entity/annotations/bodies/061f499e-c93d-4268-af91-55865e597f34>", "")
  }

does not.

optimize tests

Tests take long because before each test a re-initialisation of the memory store takes place.

Solution 1:
a reset function on the object repository

Solution 2:
re-initialise the memory store instead of the whole object repository

Reuse of blank nodes

For example, instead of creating a new blank node for provenance agents per each annotation, when all annotations are serialized by the same software agent/the same person, etc.

Limit and Offset ignored in QueryService

QueryService<Annotation> queryService = anno4j.createQueryService(Annotation.class);
queryService.limit(5);
queryService.execute();

Creates the following query:

SELECT  ?annotation
WHERE
  { ?annotation <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/oa#Annotation>}

(NB: IntelliJ marks private Integer limit and private Integer offset in QueryService.java as unused, so I think it is not implemented.)

delete unnecessary alibaba tests

Testing duration is very long. I think we can remove many tests of the AliBaba test suite, because we only use a subpart of the whole project and we test it in our project anyway.

Recommendation, open questions

Looked at Apache Mahout - Taste (https://mahout.apache.org/users/recommender/recommender-documentation.html):

  • Mainly focused on collaborative filtering, item-based recommendations are possible but only in a very restrictive way. They claim that items are rather static, so their similarity can be precomputed.
  • Their basic engine needs to be supported with a list of precomputed item-similarities.
  • Maybe workflow: Given an annotation, define requested similarities -> Server queries related annotations (referred from its type) and calculates similarities -> Feed to taste -> Generate Recommendations

-> Questions:

  • Should our annotation similarity be static, calculated from time to time?
  • Should they be dynamic, calculated at request-time?
  • Between what kind of annotations/annotation types does one want to have similarities? All of them?
  • Cross-annotation similarity: how to define those similarities?
  • Can the desired similarity be defined on recommendation-request time?
  • Can the algorithms from taste be used on lower level annotation features?
  • Do we need the support of a framework altogether? Recommendations or similarities could be implemented at annotation level.

-> Ideas:

  • Basic annotation similarity based on annotation level and target level (e.g. User, Agent, Target resource, same selector/fragment, etc.)
  • Specific annotation similarity: Defined by some rules?

[QueryService] Missing fields in Annotations returned by the QueryService

The annotation returned after executing a QueryService (built with

createQueryService(Annotation.class, graph) 

and with an annotation criteria over oa:hasBody) is missing fields like body, serializationAt, serializationBy, targets, etc., as shown below (JSON-LD representation of the annotation).

{
  "@id" : "http://lifewatchgreece.eu/entity/annotations/235332c4-1543-439f-8433-bddb95f2295f",
  "@type" : "http://www.w3.org/ns/oa#Annotation",
  "http://www.w3.org/ns/oa#annotatedAt" : "2015-10-15T19:27:34"
}

Using Virtuoso as a SPARQL endpoint throws an exception due to a missing default graph

Virtuoso does not have an explicit unnamed default graph. As a result inserting an annotation throws the following exception:

Virtuoso 37000 Error SP031: SPARQL compiler: No default graph specified in the preamble, but it is needed for triple in INSERT DATA {...} without GRAPH {...}[\n]"

This is with Virtuoso 7.1.

ldpath unit test

Create unit tests for each LDPath language element we support.

SPARQLMM Integration

We should provide a generic addFilterCriteria to allow 3rd-party filters (e.g. SPARQLMM).

[Development guidelines]

Is it possible to add some development guidelines (e.g. how to build, test, etc)?

I am new to the world of Maven, and up to now, whenever I make a change to anno4j-core, I run "mvn package" at the anno4j-parent to build the corresponding jar I want to test. I guess this is not the correct process.
