This project forked from zombodb/zombodb

Postgres Extension that allows creating an index backed by Elasticsearch

ZomboDB

ZomboDB is a Postgres extension that enables efficient full-text searching via the creation of indexes (CREATE INDEX) backed by Elasticsearch.

Features

  • transaction-safe full-text queries
  • managed & used via standard Postgres SQL
  • works with tables of any structure
  • automatically creates Elasticsearch Mappings supporting most datatypes, including arrays
  • nested objects for flexible schemaless sub-documents
  • custom full-text query language supporting nearly all of Elasticsearch's search features, including
    • boolean operations
    • proximity (in and out of order)
    • phrases
    • wildcards
    • fuzzy terms/phrases
    • "more like this"
    • regular expressions
    • inline scripts
    • range queries
  • query results expansion and index linking
  • extremely fast indexing
  • record count estimation
  • high-performance hit highlighting
  • access to Elasticsearch's full set of aggregations
  • use whatever method you currently use for talking to Postgres (JDBC, DBI, libpq, etc)
  • fairly extensive test suite (NB: in the process of being converted from the closed-source version)

Because ZomboDB is a Postgres index type, it "just works" with SELECT, COPY, INSERT, UPDATE, DELETE, and VACUUM statements.

Not to suggest that these things are impossible, but there's a small set of non-features too:

  • no scoring
  • indexes are not crash-safe/recoverable
  • interoperability with various Postgres replication schemes is unknown
  • pg_get_indexdef() doesn't correctly quote index options, which makes restoring backups annoying (fixing this would require a patch to Postgres)
  • Postgres HOT updates not supported
  • only supports Postgres query plans that choose IndexScans or BitmapIndexScans (the latter is also dependent on sufficient work_mem to avoid Recheck conditions)

History

The name is an homage to zombo.com and its long history of continuous self-affirmation.

Development began in 2013 at Technology Concepts & Design, Inc as a closed-source effort to provide transaction-safe full-text searching on top of Postgres tables. While Postgres' "tsearch" features are useful, they're not necessarily adequate for 200-column-wide tables with 100M rows, each containing large text content.

Initially designed on top of Postgres' Foreign Data Wrapper API, ZomboDB quickly evolved into an index type (Access Method) so that queries are MVCC-safe and standard SQL can be used to query and manage indexes.

Elasticsearch was chosen as the backing search index because of its horizontal scaling abilities, performance, and general ease of use.

Two years later, it's been used in production systems for quite some time and is now open-source.

How to Use It

Usage is simple. This is only a brief overview; see the various documentation files for more detailed information.

Install the extension:

CREATE EXTENSION zombodb;

Create a table:

CREATE TABLE books (
	book_id serial8 NOT NULL PRIMARY KEY,
	author varchar(128),
	publication_date date,
	title phrase,     -- 'phrase' is a DOMAIN provided by ZomboDB
	content fulltext  -- 'fulltext' is a DOMAIN provided by ZomboDB
);

-- insert some data
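
As a minimal sketch of the "insert some data" step, the statement below adds two rows to the books table defined above. The values are illustrative sample data (an assumption for this example, not something shipped with ZomboDB), and book_id is omitted so that its serial8 default assigns the key.

```sql
-- Illustrative sample rows (assumed data, not part of ZomboDB).
-- book_id is left out so its serial8 default generates the key.
INSERT INTO books (author, publication_date, title, content)
VALUES
    ('J. D. Salinger', '1951-07-16', 'The Catcher in the Rye',
     'Sample text that mentions the Ossenburger Memorial Wing.'),
    ('Jane Doe', '2001-01-01', 'A Placeholder Title',
     'Placeholder body text for a second row.');
```

Plain string literals are used for the title and content columns; they are stored in the 'phrase' and 'fulltext' DOMAIN columns like any other text value.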

Index it:

CREATE INDEX idxbooks ON books 
                   USING zombodb (zdb(books)) 
                   WITH (url='http://localhost:9200', shards=5, replicas=1);

Query it:

SELECT * FROM books WHERE zdb(books) ==> 'title:(catcher w/3 rye) 
                                            and content:"Ossenburger Memorial Wing" 
                                             or author:Salinger*';
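
Because ZomboDB only supports query plans that choose IndexScans or BitmapIndexScans, it can be worth checking which plan Postgres picks for a given query. The following is a minimal sketch using the standard Postgres EXPLAIN and SET commands; the query text and the 64MB value are assumptions chosen for illustration, not recommended settings.

```sql
-- Show the plan Postgres chooses for a ZomboDB query on the books table.
EXPLAIN
SELECT * FROM books WHERE zdb(books) ==> 'author:Salinger*';

-- If a bitmap scan is chosen, raising work_mem for the current session
-- may help avoid Recheck conditions (the value here is an arbitrary example).
SET work_mem = '64MB';
```

SET changes work_mem only for the current session, so it is a low-risk way to experiment before changing any server-wide configuration.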

What you need

Product         Version
Postgres        9.3
Elasticsearch   1.5.2+ (not 2.0)
Java JDK        1.7.0_51+
libCurl         7.37.1+
Apache Maven    3.0.5

You'll also need a Postgres-compatible build environment to build ZomboDB's Postgres extension, along with a Java 7-compatible build environment to build the Elasticsearch plugin.

NOTE: ZomboDB has only been tested on Linux and OS X. Windows support is unknown (but likely easy to whip into shape if necessary).

Credit and Thanks

Credit goes to Technology Concepts & Design, Inc, its management, and its development and quality assurance teams, not only for their work during the early development days but also for their ongoing support now that ZomboDB is open-source.

Contact Information

  • Eric Ridge
  • Email: [email protected]
  • Twitter: @zombodb or @eeeebbbbrrrr
  • via GitHub Issues and Pull Requests ;)

License

Portions Copyright 2013-2015 Technology Concepts & Design, Inc.
Portions Copyright 2015 ZomboDB, LLC

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
