pelagios / recogito2

Semantic Annotation Without the Pointy Brackets

License: Apache License 2.0

Scala 51.63% HTML 14.20% CSS 0.03% JavaScript 26.05% Shell 0.22% Python 0.05% Emacs Lisp 0.61% NewLisp 0.38% Dockerfile 0.09% Less 6.74%
annotation linkeddata iiif elasticsearch

recogito2's Introduction

Recogito

Current version: v3.3

Home of Recogito - a Semantic Annotation tool for texts and images, developed by Pelagios Commons.

Prerequisites

  • Java 8 JDK
  • SBT (version 1.0.x)
  • node.js (version 10.4.1), along with npm (version 6.1.0) and the webpack and webpack-cli npm packages (install globally via npm install -g {package-name} )
  • PostgreSQL DB (tested with version 9.5)
  • ElasticSearch v5.6.5 (important: do not use ES v6.x, since this introduced breaking changes not compatible with the current version of Recogito)
  • To use image annotation, you need to have the vips image processing system installed. If vips is not available on the command line, Recogito is set to reject uploaded images as 'unsupported content'. (Note: on Ubuntu, 'libvips-tools' is the package you need.)

Installation

  • Clone this repository
  • Create a copy of the file conf/application.conf.template and name it conf/application.conf. Make any environment-specific changes there. (For the most part, the defaults should be fine.)
  • Create a database named 'recogito' on your Postgres DB server. (If you want a different name, adjust the settings in your conf/application.conf accordingly.)
  • Type npm install to download required JS dependencies
  • Type sbt run to start the application in development mode.
  • Point your browser to http://localhost:9000
  • Recogito automatically creates a single user with administrator privileges with username 'recogito' and password 'recogito'. Be sure to remove this user - or at least change the password - for production use!
  • To generate an Eclipse project, type sbt eclipse.

Importing gazetteers

You can import gazetteers through the administration dashboard.

  • Log in with a user that has admin privileges (such as the default 'recogito' user created automatically)
  • Point your browser to http://localhost:9000/admin/authorities
  • Click Add Authority File to upload a gazetteer (see our Wiki for information on supported data formats).

Running in production

  • When running in production you must define a location where Recogito can store user files, using an absolute path. The relevant property in the conf/application.conf file is recogito.upload.dir.
  • To test production mode before deploying, type sbt runProd
  • To change to a different port (than default 9000), type sbt "runProd -Dhttp.port=9876"
  • For full production deployment, refer to the current Play Framework docs
  • Be sure to set a random application secret in conf/application.conf. Play includes a utility to generate one for you - type sbt playGenerateSecret.
  • Last but not least: another reminder to remove the default 'recogito' admin user - or at least change its password!

Contributing to development

If you want to contribute to the development of Recogito, get in touch with us with your ideas via [email protected]. Want to contribute but don't know where to start? The easiest way to get started is to help translate the user interface and help resources. Check the Wiki for more information.

Acknowledgements

Recogito was developed predominantly as part of the Pelagios 6 & 7 research projects. Investigative team:

  • Elton Barker
  • Leif Isaksen
  • Rebecca Kahn
  • Rainer Simon
  • Valeria Vitale

License

Recogito is licensed under the terms of the Apache 2.0 license.

recogito2's People

Contributors

andrew01ait, brishti55, dependabot[bot], dimidd, hcayless, kiolalis, masoumeh, milvanidou, nicolasfranck, rsimon, vrazanajao

recogito2's Issues

Explicit support for gazetteer correspondence lists?

Should we add support for ingesting URI correspondence lists into the gazetteer index? Since such close-/exactMatch pairs would not be linked to a gazetteer record, this would impact the data model. I.e. we'd need to either move up matches to the level of the place itself (could make sense in terms of query performance anyway?); or support matches at the place level and at the gazetteer record level in parallel.

Support for User-Specific Gazetteers?

Should we enable people to upload their own gazetteers, for exclusive use within their documents? This could be an interesting feature, but has significant architectural repercussions. How would we model it in ElasticSearch? A separate index for user gazetteers (with an "owner" field on each place) and a multi-index query (to cover "standard" and custom gazetteers)? Additionally: no conflation with the standard gazetteers?
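
One way to picture the multi-index idea is the rough sketch below. It only builds the request path and query body; the index names (`gazetteer`, `user_gazetteer`) and the field names (`names`, `owner`) are assumptions made for illustration, not Recogito's actual mapping. The filter keeps places that either belong to the current user or have no owner at all (i.e. come from a standard gazetteer).

```python
def build_place_search(user, text,
                       standard_index="gazetteer",
                       user_index="user_gazetteer"):
    """Return (path, body) for a search across both gazetteer indices.

    All index and field names are hypothetical placeholders.
    """
    # Elasticsearch accepts a comma-separated list of index names in the path.
    path = f"/{standard_index},{user_index}/_search"
    body = {
        "query": {
            "bool": {
                "must": [{"match": {"names": text}}],
                "filter": [{
                    "bool": {
                        "should": [
                            # Places uploaded by this user ...
                            {"term": {"owner": user}},
                            # ... or places without an owner (standard gazetteers).
                            {"bool": {"must_not": [{"exists": {"field": "owner"}}]}},
                        ],
                        "minimum_should_match": 1,
                    }
                }],
            }
        }
    }
    return path, body
```

Keeping user places in a separate index would also make the "no conflation" question easier to answer, since standard records are never rewritten by a user upload.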

Account settings page

  • Change password
  • Reset password (I'm not too happy that we're storing E-Mail addresses. But we probably need to for the single use case of password reset. Are there perhaps alternatives? Or can we at least encrypt the E-Mail address somehow?)
  • Delete account (What do we do with associated docs? Offer two options a la Wordpress: delete or transfer to other user?)
  • Extra user metadata properties? E.g. set a real/full name, personal homepage link? (Both optional.)

Comments/notes on annotation bodies

What about comments on a specific body of an annotation? This relates directly to the use case by Damien Bove, who wants to add notes to specific transcription bodies ("Transcription according to...").

In terms of the data model, I added a "note" string field to the body section in the annotation schema mapping. Is this enough? How should we treat that in the UI?

Marking as 'Person' - implement placeholder functionality

Currently, the quick mode for tagging as Person is implemented - but the editor popup does not have a dedicated 'Person' section yet. Likewise, marking a selection as person in the editor popup does not have any functionality yet.

Implement at least placeholder functionality, so that it's at least consistently possible to mark a selection as Person, even if there's no resolution to authority URI yet.

Clean up JSON API methods

The Place- and Annotation API controller methods need to be cleaned up.

  • Some methods trigger an async action and then return immediately, rather than mapping the resulting future to the response. This is not just unnecessary, but also means the response carries no information about whether the action was successful.
  • Write methods currently check whether the current user == the owner of the document. We should separate the permission check into the base class, so that things are DRY and we can later extend the check to user == owner OR user has write permissions.
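
The two points above can be illustrated with a small, language-neutral sketch. It uses Python's asyncio rather than the project's Scala/Play code, and every name in it (`BaseApiController`, `require_write_access`, `InMemoryStore`, `upsert`) is hypothetical. The write method awaits the storage result before building the response, and the permission check lives in the base class so it can later be extended in one place.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    body: str


class PermissionDenied(Exception):
    """Raised when the current user may not modify the document."""


class BaseApiController:
    """Holds the shared permission check so write methods stay DRY."""

    def require_write_access(self, user, document):
        # Today: only the owner may write. A later version could extend this
        # single check to "owner OR user has write permissions".
        if user != document["owner"]:
            raise PermissionDenied(user)


class InMemoryStore:
    """Stand-in for an asynchronous storage backend (illustration only)."""

    def __init__(self, fail=False):
        self.fail = fail
        self.items = []

    async def upsert(self, annotation):
        await asyncio.sleep(0)  # simulate asynchronous I/O
        if self.fail:
            return False
        self.items.append(annotation)
        return True


class AnnotationApiController(BaseApiController):
    def __init__(self, store):
        self.store = store

    async def create_annotation(self, user, document, annotation):
        try:
            self.require_write_access(user, document)
        except PermissionDenied:
            return Response(403, "forbidden")
        # Await the result instead of returning immediately, so the response
        # reflects whether the write actually succeeded.
        stored = await self.store.upsert(annotation)
        return Response(200, "ok") if stored else Response(500, "store failed")
```

In the Scala code base the equivalent would presumably be mapping the future returned by the store onto the HTTP result, rather than discarding it.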

'Terms and Conditions' page

On the splash page, we link to our 'Terms and Conditions' (and say you're agreeing to them if you sign up). So now all we need to do is draft them...

'Hide annotations' toggle

As noted by Pau and Günther: we need a 'Hide Annotations' quick toggle to remove all annotations from the view.

'All Changes Saved' indicator à la Google Drive

There should be an indicator in the header that confirms whether changes are saved, or displays an error message otherwise, similar to how it's done in Google Docs.

Suitable location for this would be next to the main navigation icons. (If space permits on narrow screen devices/tablets?) Otherwise somewhere in the sidebar?

'About' page

Should we have a more formal 'About' page for Recogito? Most information will be on the splash page anyway. But perhaps contact info, acknowledgements (such as Johan for the map & gazetteer, Tom for Pleiades) etc.?

Basic 'Document Settings' page

Populate the 'Document Settings' area with basic functionality:

  • Document metadata editing
  • Document-part metadata editing
  • Document delete

Editable shapes

Shapes/selection should be movable (resizable) while in edit mode, after creation.

'Change Geo-resolution' popup dialog

Design & implement the popup dialog for gazetteer search & changing geo-resolution. In general, there's a difference between:

  • changing between different gazetteer URIs representing the same place vs.
  • changing the mapping to a different place altogether.

Make editor popup sensitive to screen space

Currently, the editor popup always opens at bottom left of the selection. Make the popup sensitive to available screenspace, so that it doesn't open outside the visible viewport. In addition to the bottom-left configuration, we also need:

  • top-left (when selection at bottom of screen)
  • bottom-right (when selection on right side of screen)
  • top-right (when selection at bottom right of screen)
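
A minimal sketch of the placement decision, assuming "bottom-left" means the popup hangs below the selection with left edges aligned, and assuming the bottom-right corner case flips both axes (i.e. opens top-right). The `Rect` type and the function name are made up for illustration; the real editor would read these values from the DOM.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: float
    top: float
    width: float
    height: float

    @property
    def bottom(self):
        return self.top + self.height


def choose_placement(selection, popup_width, popup_height,
                     viewport_width, viewport_height):
    """Pick where the popup opens so that it stays inside the viewport."""
    # Vertical axis: open below the selection unless that would overflow.
    fits_below = selection.bottom + popup_height <= viewport_height
    # Horizontal axis: align left edges (popup extends to the right)
    # unless that would overflow the right edge of the viewport.
    fits_rightwards = selection.left + popup_width <= viewport_width

    vertical = "bottom" if fits_below else "top"
    horizontal = "left" if fits_rightwards else "right"
    return f"{vertical}-{horizontal}"
```

Treating the two axes independently keeps the logic to two comparisons and covers all four configurations without special cases.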

ElasticSearch sanity checks

General issue for all ElasticSearch requests: add sanity checks where necessary. E.g. if a boolean query is built from a list of search terms (or URIs), check that the size of the list does not exceed the maximum number of clauses ElasticSearch allows in a boolean query, in order to make sure that weird gazetteer records don't break the import.

Note: this is probably most relevant in the PlaceStore
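
As a rough sketch of such a check: instead of building one oversized boolean query, the list can be de-duplicated and split into batches that each stay under the limit. The limit of 1024 used here is an assumption (it matches the commonly documented ElasticSearch default and should be aligned with the actual cluster setting), and the field name `uri` is a placeholder.

```python
# Assumption: the cluster uses a boolean clause limit of 1024.
MAX_BOOL_CLAUSES = 1024


def build_uri_queries(uris, field="uri", max_clauses=MAX_BOOL_CLAUSES):
    """Split a list of URIs into bool queries that respect the clause limit."""
    if max_clauses < 1:
        raise ValueError("max_clauses must be at least 1")

    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(uris))

    queries = []
    for start in range(0, len(unique), max_clauses):
        chunk = unique[start:start + max_clauses]
        queries.append({
            "query": {
                "bool": {
                    "should": [{"term": {field: uri}} for uri in chunk]
                }
            }
        })
    return queries
```

The caller would then run the batches one after another and merge the hits. Rejecting the record outright when the list is too long would be the simpler alternative.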

Basic 'Map View' page

Implement a basic map view page, just so we can see something. This task is a bigger chunk of work: to implement this in a scalable way, we need a way of retrieving all places for one document in one go. That means the entire PlaceLink functionality needs to be implemented first.

Multi-type documents and parallel editing

What about documents consisting of multiple representations and/or a mix of text and images? E.g. an image with multiple layers (from multispectral imaging); or a manuscript of which there's a scan and a transcription from a Latin version, and a scan from a translation.

The assumption is that these could be either represented as one document with multiple parts (the multispectral image case), or as multiple documents (potentially with multiple parts - the manuscript case) for which annotations need to be linked (somehow).

Should this be modeled as a relation between annotations? Or as another type of annotation body (type HYPERLINK with a URI)? OA would probably model this as an additional annotation altogether, with two annotations as target. But that's probably more complicated than need be, and won't work well with ElasticSearch. Disadvantage (?): we won't know whether it's an internal link in the same doc or not, unless we parse the URI.

Additional question: how do we realize that in the UI? (Multi-window environment? Parallel navigation? Would that work on tablets at all?)

Annotation Versioning

Implement annotation versioning. Version history probably just needs to keep a record of all previous versions of an annotation. The modified annotation is stored as a serialized (non-indexed?) JSON string.

For search/restore, all we need to know is who created each version, and when. Scenarios we aim to cover are:

  • Revert a specific annotation to the state at a specific point in time
  • Revert all annotations on a specific document (or filepart) to the state at a specific point in time
  • On a specific document (or filepart), roll back all edits made by a specific user
  • Any other scenarios?
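
The scenarios above could be covered by a history structure along the following lines. This is an in-memory sketch only: the class and field names are hypothetical, timestamps are plain numbers, and the "roll back a user" method is a deliberate simplification that returns the latest version of each annotation not written by that user (it does not untangle later edits that built on that user's changes).

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    annotation_id: str
    document_id: str
    modified_by: str
    modified_at: float  # e.g. epoch seconds
    payload: str        # serialized annotation JSON, not meant to be indexed


class VersionHistory:
    def __init__(self):
        self._versions = []

    def record(self, annotation, document_id, user, timestamp):
        """Store one version; only who and when are kept as searchable fields."""
        self._versions.append(
            Version(annotation["id"], document_id, user, timestamp,
                    json.dumps(annotation)))

    def state_at(self, annotation_id, timestamp):
        """Latest version of one annotation at or before the timestamp."""
        candidates = [v for v in self._versions
                      if v.annotation_id == annotation_id
                      and v.modified_at <= timestamp]
        if not candidates:
            return None
        return json.loads(max(candidates, key=lambda v: v.modified_at).payload)

    def document_state_at(self, document_id, timestamp):
        """State of all annotations on a document at a point in time."""
        ids = {v.annotation_id for v in self._versions
               if v.document_id == document_id}
        states = {i: self.state_at(i, timestamp) for i in ids}
        return {i: s for i, s in states.items() if s is not None}

    def latest_excluding_user(self, document_id, user):
        """Latest version of each annotation that was not written by `user`."""
        latest = {}
        for v in sorted(self._versions, key=lambda v: v.modified_at):
            if v.document_id == document_id and v.modified_by != user:
                latest[v.annotation_id] = json.loads(v.payload)
        return latest
```

Annotations that only ever existed through the excluded user's edits simply drop out of the result, which a real rollback would have to translate into deletions.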

'Currently in use' warning

Should we display a warning in text/image-annotation mode if a document has been edited by a different user in the last X minutes? We won't be able to add Google-Drive-like live collaboration. This would at least be a reasonable precaution to avoid interference between concurrent users.
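
The check itself could be as small as the sketch below. It assumes edit events are available as (username, timestamp) pairs, e.g. from the annotation history; the function name and the 5-minute default for X are placeholders.

```python
from datetime import datetime, timedelta


def other_recent_editors(edits, current_user, now, window_minutes=5):
    """Return the other users who edited the document within the window.

    `edits` is an iterable of (username, datetime) pairs.
    """
    cutoff = now - timedelta(minutes=window_minutes)
    return sorted({user for user, edited_at in edits
                   if user != current_user and cutoff <= edited_at <= now})
```

A non-empty result would trigger the warning, and the returned names could be shown in it.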

'Message Board' section for each document

Add a comment/talk section for the entire document (cf. Communication between annotators forum topic).

In terms of UI, this can live at the same level as the other areas (annotation, map, stats, settings). In fact, its icon can probably take the place of the current 'Sharing Options' icon, whereas sharing options can be rolled into general document settings.
