This project is a fork of zerebubuth/planet-dump-ng.

Experimental next version of the planet dump tool for OpenStreetMap.

License: BSD 2-Clause "Simplified" License

Planet Dump (Next Generation)

Tool for converting an OpenStreetMap database dump into planet files.

Because it operates on a database dump rather than a running server, the extraction from PostgreSQL dump file to planet file(s) is completely independent of the database server, and can be done on a disconnected machine without putting any load on any database.

The previous version of this tool required the database server to keep a consistent transaction context open for the duration of the dump, which would usually be several days. This created problems as the long-running transaction could get cancelled, meaning the planet dump would have to be started again from scratch.

Building

Before building the code, you will need:

  • A C++ compiler (GCC 4.7 recommended),
  • libxml2 (version 2.6.31 recommended),
  • The Boost libraries (version 1.49 recommended),
  • libosmpbf (version 1.3.0 recommended),
  • libprotobuf and libprotobuf-lite (version 2.4.1 recommended)

To install these on Ubuntu, you can just type:

sudo apt-get install build-essential automake autoconf \
  libxml2-dev libboost-dev libboost-program-options-dev \
  libboost-date-time-dev libboost-filesystem-dev \
  libboost-thread-dev libboost-iostreams-dev \
  libosmpbf-dev osmpbf-bin libprotobuf-dev pkg-config

After that, it should just be a matter of running:

./autogen.sh
./configure
make

If you run into any issues, please file a bug on the GitHub issues page for this project, giving as much detail as you can about the error and the environment in which it occurred.

Running

The planet dump program has a decent built-in usage description, which you can read by running:

planet-dump-ng --help

One thing to note is that the program will create on-disk databases in the current working directory, so it is wise to run the program somewhere with plenty of fast disk space. Existing files may interfere with the operation of the program, so it's best to run it in its own, clean directory.
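
For example, a full run on a clean scratch directory might look like the sketch below. This is only an illustrative guess at an invocation: the file names are placeholders, and the flags shown (-f for the input dump, -C/-x/-X/-p/-P for the changeset, planet and history outputs) are assumptions rather than a documented interface, so treat the output of --help as the authoritative reference.

mkdir planet-run && cd planet-run
planet-dump-ng -f planet-latest.dump \
  -C changesets.osm.bz2 \
  -x planet.osm.bz2 -p planet.osm.pbf \
  -X history.osm.bz2 -P history.osm.pbf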

Architecture

This started out with the aim of being easy to change in response to schema changes in the API. However, somehow the templates escaped and began to multiply. Sadly, the code is now much less readable than I would like, but on the bright side it is a contender for the Most Egregiously Templated Code award.

Simplifying somewhat, the code consists of two basic parts: the part which reads the PostgreSQL dump, and the part which writes XML and/or PBF.

The part which reads the PostgreSQL dump operates by launching "pg_restore" as a sub-process and parsing its output (in quite a naive way) to get the row data. The part which writes the XML and/or PBF then does a join between the top level elements like nodes, ways and relations and their "inners" - things like tags, way nodes and relation members.
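
As an illustration of that join, here is a minimal, self-contained sketch (the type names and data are invented for this example; the real code is heavily templated and streams from on-disk databases rather than in-memory vectors). Because both the elements and their "inners" are sorted by (id, version), the inners can be attached in a single two-pointer merge pass:

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct Node { int64_t id; int32_t version; };                   // top-level element
struct Tag  { int64_t id; int32_t version; std::string k, v; }; // one of its "inners"

int main() {
  // Both streams arrive sorted by (id, version).
  std::vector<Node> nodes = {{1, 1}, {1, 2}, {2, 1}};
  std::vector<Tag>  tags  = {{1, 2, "amenity", "pub"}, {2, 1, "name", "A"}};

  std::size_t t = 0;
  for (const Node &n : nodes) {
    std::cout << "node " << n.id << " v" << n.version << "\n";
    // Advance through the tag stream while it matches this (id, version).
    while (t < tags.size() && tags[t].id == n.id && tags[t].version == n.version) {
      std::cout << "  tag " << tags[t].k << "=" << tags[t].v << "\n";
      ++t;
    }
  }
  return 0;
}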

So that the system can output both a regular planet file and a history planet file in the same run, everything is generated from the history tables. This means a minor adjustment to how the "current" planet is written: a filter drops any non-current version of an element, and any current version which is deleted.
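
The filter itself amounts to a simple predicate. The sketch below is an invented illustration of that rule, not the project's actual code (the real implementation works on templated element types):

#include <cassert>
#include <cstdint>

struct Element {
  int64_t id;
  int32_t version;
  bool visible; // false for deleted versions
};

// Keep an element in the "current" planet only if it is the latest
// version of its id and that version is not deleted.
bool keep_in_current_planet(const Element &e, int32_t latest_version) {
  return e.version == latest_version && e.visible;
}

int main() {
  assert( keep_in_current_planet({7, 3, true }, 3)); // current and visible: kept
  assert(!keep_in_current_planet({7, 2, true }, 3)); // superseded version: dropped
  assert(!keep_in_current_planet({7, 3, false}, 3)); // current but deleted: dropped
  return 0;
}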

History

This evolved, by a somewhat roundabout route, from an attempt to create a new planet dump which read the absolute minimum from the database: that is, changesets, changeset tags, and just the IDs and versions of the current tables for nodes, ways and relations. The remaining information could be filled in at any time from the history tables because, with the minor exception of redactions, the nodes, ways and relations tables are append-only.

Dumping the IDs and versions would still take time, so it seemed worth looking at "pg_dump" to see how it could be done most efficiently. While looking at "pg_dump", it became clear that what was really needed was just the dump itself - a dump which is produced regularly for backup purposes anyway.
