andoc

A collaborative web tool to enrich content.

$ pip install lxml simplejson cherrypy jinja2 redis
$ python andoc.py
$ open 'http://localhost:8080'

Please use a recent version of Chrome or Firefox.

Idea

The idea of andoc is the enrichment and analysis of a wide range of content.

Like Wikipedia or Etherpad/Gobby, it is a collaborative tool where many users can work on the same content at the same time. However, andoc is not about creating content; it aims to enrich existing data with a specific set of metadata.

In a second step, andoc analyzes the collected metadata and provides the user with dynamic visualisations to access and navigate the content.

This is especially helpful with larger sets of data.

The Model

The central piece of metadata in andoc is the "event". An event usually consists of a place, a time, and the agents (or persons) present at the event.

Therefore one aspect of andoc is to identify these elements in the existing data.
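To make the model concrete, here is a minimal sketch of how an event could be represented. The class name `Event`, its fields, and the `is_complete` helper are assumptions made for illustration; they are not taken from andoc's code. The sketch needs Python 3.7 or later for `dataclasses`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical container for one event: a place, a time and agents."""
    name: str
    place: Optional[str] = None   # where the event happens, if known
    time: Optional[str] = None    # raw date expression, e.g. "Tuesday"
    agents: List[str] = field(default_factory=list)  # persons present

    def is_complete(self) -> bool:
        # An event is fully described once place, time and agents are known.
        return bool(self.place and self.time and self.agents)


# The place and time of the ball have not been identified yet.
ball = Event(name="Ashe ball", agents=["Caroline"])
print(ball.is_complete())
```

Keeping the fields optional reflects that enrichment is incremental: users may tag an agent long before anyone identifies the place or time.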

Example

Let's take a snippet from a mail conversation:

We left Warren at Dean Gate, in our way home last night, and he is
now on his road to town. He left his love, &c., to you, and I will
deliver it when we meet. Henry goes to Harden to-day in his way to
his Master's degree. We shall feel the loss of these two most
agreeable young men exceedingly, and shall have nothing to console
us till the arrival of the Coopers on Tuesday. As they will stay
here till the Monday following, perhaps Caroline will go to the
Ashe ball with me, though I dare say she will not.

and enrich the content:

In the markup below, [p] marks a person, [d] a date, [l] a location and [e] an event.

We left <Warren[p]> at <Dean Gate[l]>, in our way home <last night[d]>, and he
is now on his road to <town[l]>. He left his love, &c., to you, and I will
deliver it when we meet. <Henry[p]> goes to <Harden[l]> <to-day[d]> in his way
to his Master's degree.  We shall feel the loss of these two most agreeable
young men exceedingly, and shall have nothing to console us till the arrival of
the <Coopers[p]> on <Tuesday[d]>. As they will stay here till the <Monday[d]>
following, perhaps <Caroline[p]> will go to the <Ashe ball[e]> with me, though
I dare say she will not.

You can try this with the current prototype as well, using its web user interface. For example, it lets you see the events and their associated relations on a timeline.

Andoc would then know about the existence of:

Agents:

  • Warren
  • Henry
  • Coopers
  • Caroline

Places:

  • Dean Gate
  • Town
  • Harden

Dates:

  • last night
  • to-day
  • Monday
  • Tuesday

Events:

  • Ashe ball
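The lists above can be derived mechanically from the inline markup. The following is a minimal sketch, assuming the `<text[tag]>` notation used in the example; the function names `extract_entities` and `strip_markup` are hypothetical, and this is not necessarily how andoc parses its annotations.

```python
import re
from collections import defaultdict

# Tag letters as used in the example markup above.
TAGS = {"p": "agents", "l": "places", "d": "dates", "e": "events"}

# Matches <some text[x]> where x is one of the tag letters.
ANNOTATION = re.compile(r"<([^<>\[\]]+)\[([pdle])\]>")


def extract_entities(text):
    """Collect annotated spans by kind, in order of first appearance."""
    found = defaultdict(list)
    for label, tag in ANNOTATION.findall(text):
        label = " ".join(label.split())  # collapse line breaks and extra spaces
        kind = TAGS[tag]
        if label not in found[kind]:     # skip repeated mentions
            found[kind].append(label)
    return dict(found)


def strip_markup(text):
    """Return the text with annotations removed, leaving the original words."""
    return ANNOTATION.sub(r"\1", text)


snippet = "<Henry[p]> goes to <Harden[l]> <to-day[d]> in his way to his Master's degree."
print(extract_entities(snippet))
print(strip_markup(snippet))
```

A regular expression is enough here because the annotations in the example do not nest; nested or overlapping annotations would need a proper parser.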

In the actual interface the user should be provided with additional tools, so that "Monday" or "to-day" can be resolved, in the context of this document, to an actual calendar date.
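As a rough illustration of such a tool, the sketch below resolves a few relative expressions against the date a document was written. The function `resolve_date`, the rule that a bare weekday means its next occurrence, and the reference date in the usage lines are all assumptions for illustration; the source does not state when the letter was written.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_date(expression, written_on):
    """Map a relative date expression to a calendar date, or None if unknown."""
    text = expression.strip().lower()
    if text in ("to-day", "today"):
        return written_on
    if text == "last night":
        return written_on - timedelta(days=1)
    if text in WEEKDAYS:
        # Assumption: a bare weekday means its next occurrence
        # strictly after the day of writing.
        days_ahead = (WEEKDAYS.index(text) - written_on.weekday() - 1) % 7 + 1
        return written_on + timedelta(days=days_ahead)
    return None  # leave anything else for the user to resolve by hand


# Hypothetical reference date, chosen only to demonstrate the function.
written_on = date(2024, 1, 4)
print(resolve_date("Tuesday", written_on))
print(resolve_date("to-day", written_on))
```

A rule this simple has clear limits: in the example letter, "the Monday following" refers to the Monday after the Coopers' arrival, not the next Monday after the day of writing, so the user still needs a way to correct the result.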

Since some of the steps can be done with the help of natural language processing, andoc aims to provide automatic processing of the data as well.

Analysis

The effort to enrich the documents should lead to a direct improvement for the users:

  • Provide additional information about persons, places and events from sources like Wikipedia alongside the data. Context matters.

  • Visualisation of semantic relations, social networks, related data and events.

  • Grouping of related data based on event, place or person.

  • Timeline of events and data.

  • Geographical presentation (map) of events and data.

All those presentations should be updated constantly as the enrichment process progresses.
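Two of the analyses listed above, grouping by entity and a simple social network, can be sketched with a few lines of code. This assumes each document has already been reduced to lists of entities per kind; the document ids and the function names `group_by_entity` and `co_occurrences` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def group_by_entity(documents, kind):
    """Map each entity of the given kind to the documents that mention it."""
    groups = defaultdict(list)
    for doc_id, entities in documents.items():
        for name in entities.get(kind, []):
            groups[name].append(doc_id)
    return dict(groups)


def co_occurrences(documents, kind="agents"):
    """Count how often two entities appear in the same document.

    The counts can serve as edge weights in a social network graph.
    """
    pairs = Counter()
    for entities in documents.values():
        names = sorted(set(entities.get(kind, [])))
        for a, b in combinations(names, 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical input: two enriched documents keyed by made-up ids.
documents = {
    "letter-1": {"agents": ["Warren", "Henry"], "places": ["Harden"]},
    "letter-2": {"agents": ["Henry", "Caroline"], "places": ["Harden"]},
}
print(group_by_entity(documents, "places"))
print(co_occurrences(documents))
```

Because both functions recompute everything from the current annotations, they can be rerun whenever the enrichment changes; a larger data set would likely call for incremental updates instead.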

Questions? Ideas?

Contact me on twitter @endpnt

Copyleft

GPLv3, see COPYING
