geodata's Introduction

lobid-geo-enrichment

Overview

Microservice for geo information enrichment.

This microservice obtains geo data from Nominatim and Wikidata. To avoid reloading the same data repeatedly, geo data is stored, once obtained, in an Elasticsearch index that should be located on the local machine or in the local area network.

Currently, the service provides information for

  • Latitude
  • Longitude
  • Post code

The application is built using the Play Framework and Elasticsearch.
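
The store-once behaviour described in this overview can be sketched roughly as follows. This is only an illustration under stated assumptions, not the service's actual code: `GeoNode`, `GeoCache` and `remoteLookup` are hypothetical names, and an in-memory map stands in for the Elasticsearch index.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical value type holding the three fields the service provides.
final class GeoNode {
    final double latitude;
    final double longitude;
    final String postcode;

    GeoNode(double latitude, double longitude, String postcode) {
        this.latitude = latitude;
        this.longitude = longitude;
        this.postcode = postcode;
    }
}

// Hypothetical cache: an in-memory map stands in for the Elasticsearch index.
final class GeoCache {
    private final Map<String, GeoNode> index = new HashMap<>();
    private final Function<String, GeoNode> remoteLookup; // stand-in for Nominatim/Wikidata
    private int remoteCalls = 0;

    GeoCache(Function<String, GeoNode> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    // Return the stored node if present; otherwise query the remote source and store the answer.
    GeoNode get(String address) {
        GeoNode cached = index.get(address);
        if (cached != null) {
            return cached;
        }
        remoteCalls++;
        GeoNode fetched = remoteLookup.apply(address);
        if (fetched != null) {
            index.put(address, fetched);
        }
        return fetched;
    }

    // Number of times the remote source was actually queried.
    int remoteCalls() {
        return remoteCalls;
    }
}
```

A second `get` for the same address is answered from the map, so the remote source is queried only once per address.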

Setup

Create and change into a directory where you want to store the project, e.g.:

mkdir ~/git ; cd ~/git

Get the project from GitHub:

git clone https://github.com/hbz/geodata.git

Download activator into your home directory in order to launch the Play app:

cd ~ ; wget http://downloads.typesafe.com/typesafe-activator/1.3.10/typesafe-activator-1.3.10-minimal.zip

unzip typesafe-activator-1.3.10-minimal.zip

Start the app:

cd ~/git/geodata

~/activator-1.3.10-minimal/bin/activator "start 7401"

When startup is complete (Listening for HTTP on /0.0.0.0:7401), exit with Ctrl+D. Output will be logged to target/universal/stage/logs/application.log.
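
Once the app is listening, requests can be sent to it. The sketch below builds a request URL in Java. The path pattern `/geodata/<field>/<street>/<city>/<country>` is an assumption inferred from the log line quoted in the issue list (`/geodata/long/J%205/Mannheim/DE`), not a documented API, and `GeodataUrl` is a hypothetical helper name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds a request URL for the running service.
final class GeodataUrl {

    // Assumed path pattern: /geodata/<field>/<street>/<city>/<country>
    static String build(String baseUrl, String field, String street, String city, String country) {
        return baseUrl + "/geodata/" + segment(field) + "/" + segment(street)
                + "/" + segment(city) + "/" + segment(country);
    }

    private static String segment(String raw) {
        try {
            // URLEncoder produces form encoding and writes spaces as "+";
            // inside a URL path a space has to be "%20" instead.
            return URLEncoder.encode(raw, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For example, `GeodataUrl.build("http://localhost:7401", "long", "J 5", "Mannheim", "DE")` yields a URL with the same path as the one in the quoted log line.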

Index

The service runs with an embedded Elasticsearch index. The index is created when the application starts, but only if it does not yet exist. It is filled incrementally, based on the queries the application receives. Because the Nominatim web service restricts API use to one request per second (see http://wiki.openstreetmap.org/wiki/Nominatim_usage_policy), rebuilding the index takes a considerable amount of time. If you still wish to rebuild it, delete the data folder in the project root directory and restart the application.
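
To illustrate the one-request-per-second limit, here is a minimal throttling sketch. It is not taken from the project: `RequestThrottle` is a hypothetical class, and the clock is injected only so that the timing logic can be checked without sleeping.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: spaces consecutive requests at least minIntervalMillis apart.
final class RequestThrottle {
    private final long minIntervalMillis;
    private final LongSupplier clock; // supplies the current time in milliseconds
    private long nextAllowedMillis = Long.MIN_VALUE;

    RequestThrottle(long minIntervalMillis, LongSupplier clock) {
        this.minIntervalMillis = minIntervalMillis;
        this.clock = clock;
    }

    // Reserves the next request slot and returns how long (in ms) the caller must wait for it.
    synchronized long reserve() {
        long now = clock.getAsLong();
        long start = Math.max(now, nextAllowedMillis);
        nextAllowedMillis = start + minIntervalMillis;
        return start - now;
    }

    // Blocks the calling thread until its reserved slot is reached.
    void acquire() throws InterruptedException {
        long delay = reserve();
        if (delay > 0) {
            Thread.sleep(delay);
        }
    }
}
```

With `new RequestThrottle(1000, System::currentTimeMillis)`, calling `acquire()` before each Nominatim request would keep requests one second apart. It also shows why filling the index from scratch is slow: n uncached lookups need at least about n seconds.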

Eclipse

If you'd like to import the project into Eclipse, use the activator command eclipse to prepare the project:

  • Change into the project directory, e.g. cd ~/git/geodata
  • Run ~/activator-1.3.10-minimal/bin/activator eclipse
  • Import the project into your Eclipse workspace

geodata's People

Contributors

dr0i, fsteeg, philboeselager, sbritter

Forkers

vovoma

geodata's Issues

Use embedded Elasticsearch index

The current geodata index size is only 3.5 MB. To ease deployment and increase self-containedness, we should use an embedded Elasticsearch index as we do in lobid-organisations. Since filling up the geodata index takes time, the index should not be recreated on app start (unlike lobid-organisations).

Store empty Nominatim results

Empty Nominatim results should be stored to prevent Nominatim from being queried repeatedly for the same data. Currently this happens because "lat", "long" and "postcode" are requested in separate, consecutive calls from geodata.
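
One possible shape for this is a cache that remembers misses as well as hits. The sketch below is an assumption-laden illustration, not the project's implementation: `NegativeCache` and `remote` are hypothetical names, and a map stands in for the index.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: stores "no result" answers as well as real ones.
final class NegativeCache {
    private final Map<String, Optional<String>> stored = new HashMap<>();
    private final Function<String, Optional<String>> remote; // stand-in for a Nominatim query
    private int remoteCalls = 0;

    NegativeCache(Function<String, Optional<String>> remote) {
        this.remote = remote;
    }

    Optional<String> lookup(String query) {
        Optional<String> known = stored.get(query);
        if (known != null) {
            return known; // may be Optional.empty(), i.e. a remembered miss
        }
        remoteCalls++;
        Optional<String> result = remote.apply(query);
        stored.put(query, result); // store the answer even when it is empty
        return result;
    }

    // Number of times the remote source was actually queried.
    int remoteCalls() {
        return remoteCalls;
    }
}
```

With this approach, follow-up calls for the same address do not reach the remote service again, even when the first answer was empty.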

Service throws error if postcode cannot be found

An error is thrown if the postcode cannot be found in JSON coming from Nominatim.

Sample error output:

[error] play - Cannot invoke the action, eventually got an error: org.json.JSONException: JSONObject["postcode"] not found.
[error] application - 

! @711781k9f - Internal server error, for (GET) [/geodata/long/J%205/Mannheim/DE] ->

play.api.Application$$anon$1: Execution exception[[JSONException: JSONObject["postcode"] not found.]]
    at play.api.Application$class.handleError(Application.scala:296) ~[play_2.11-2.3.4.jar:2.3.4]
    at play.api.DefaultApplication.handleError(Application.scala:402) [play_2.11-2.3.4.jar:2.3.4]
    at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$3$$anonfun$applyOrElse$4.apply(PlayDefaultUpstreamHandler.scala:320) [play_2.11-2.3.4.jar:2.3.4]
    at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$3$$anonfun$applyOrElse$4.apply(PlayDefaultUpstreamHandler.scala:320) [play_2.11-2.3.4.jar:2.3.4]
    at scala.Option.map(Option.scala:145) [scala-library-2.11.1.jar:na]
Caused by: org.json.JSONException: JSONObject["postcode"] not found.
    at org.json.JSONObject.get(JSONObject.java:476) ~[json-20141113.jar:na]
    at controllers.geo.NominatimQuery.getPostcode(NominatimQuery.java:47) ~[classes/:na]
    at controllers.geo.NominatimQuery.createGeoNode(NominatimQuery.java:59) ~[classes/:na]
    at controllers.geo.GeoInformator.getFirstGeoNode(GeoInformator.java:95) ~[classes/:na]
    at controllers.geo.GeoInformator.getLatLong(GeoInformator.java:82) ~[classes/:na]
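
One way to avoid this exception is to treat the postcode as optional instead of required. The following sketch uses a plain `Map` as a stand-in for the parsed Nominatim result so that it runs without extra libraries. `PostcodeReader` is a hypothetical name and the code is not the project's `NominatimQuery`. With org.json itself, `JSONObject.has` or `JSONObject.optString` allow a comparable check.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: reads the postcode without failing when the key is absent.
final class PostcodeReader {

    // Returns the trimmed postcode, or Optional.empty() if it is missing or blank.
    static Optional<String> postcode(Map<String, String> result) {
        String value = result.get("postcode");
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }
}
```

A caller can then still return latitude and longitude when only the postcode is missing.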

Prevent automatic deletion of existing index on startup

Currently, a new empty index is created on every application startup, which can lead to unwanted loss of data.
Preferably, a new empty index should only be created if no such index exists yet.

Also, running the tests should not delete production data.

Report locations not listed

All requests to this microservice for which no result could be delivered should be listed. This would make it possible to notify the owners of the requested organisations.

Make geodata accessible from the web

In order to use geodata in integration tests (like in hbz/lobid-organisations), this microservice should be accessible from the web under a constant URL. Therefore, find a suitable (sub-)domain and make geodata accessible that way.
