cldb.asd

A fast, atomic, consistent, isolated memory based database in Common Lisp.

Requirements

CLDB currently only works in Clozure Common Lisp, as this is what we use internally. Supporting other Lisp implementations might be straightforward. It is mainly the mmap implementation which would need replacing. From some very quick performance testing, Clozure CL seemed to perform better than I could get SBCL to perform, though I don’t know why and haven’t looked into it.

This package also depends on:-

vip-clim-core
anaphors
and trivial-utf-8

Apart from trivial-utf-8 the above can all be found at https://github.com/Virtual-Insurance-Products and have various dependencies themselves - probably most of the packages there.

Postgres Mirroring

At VIP we use CLDB to mirror (cache in memory) some tables from our Postgres database. The code for doing that mirroring is not here yet. I’m hoping to extract something to at least illustrate this soon…

Caveats and Limitations

There is no garbage collection of any kind for the persistent heap. There is also no way to manually delete things yet. Also, columns must be allocated with a fixed size. I had intended to then allow an overflow array in case they fill up. This would have some minor performance implication when it happens. This is not implemented yet though.

As an alternative to filling up a column causing an overflow column to be created, it would be straightforward to just re-allocate the column and then clear the original column by zeroing it out. When the heap is snapshot this would leave holes in the file, and the memory used by the holes would be very small. In effect, because of the layer of abstraction above direct memory access, rather than compacting the heap to clear things out, one can just leave big holes in it with almost no overhead. This is not yet implemented though.

To deal with this, it’s fine to allocate columns which are far bigger than required. As the allocated array will initially be empty, it will not consume actual disk space or space in RAM. HFS+ doesn’t support files with holes though, so snapshots on HFS+ will end up very large this way.

I also have not yet implemented strings larger than ~64kb. It wouldn’t be difficult to do so, it just wasn’t a priority.

Motivation

In an OLTP type application which makes a large number of database queries during a single transaction, a database running in a separate process and accessed over a socket is a performance bottleneck.

Modern computer systems allow a great many databases (including ours) to fit in main memory. Loading the whole database into process address space can eliminate a huge amount of overhead.

For example, a common use case is simply looking up values of a column when we know the id of a database row. Some performance comparisons:-

(in-package :vip)
(with-database
    (time (dotimes (i 1000) (dquery "select creation_date from quote where id=12345"))))
;; ~250ms
;; prepared queries get this down to about 160ms from a quick test

;; with cldb...
(cldb:with-database (base changed)
  (time (dotimes (i 1000)
          (let ((column (cldb::find-column base changed "quote" "creation_date")))
            (cldb::column-value base changed column 12345)))))
;; 14ms if we do column lookup each time

(cldb:with-database (base changed)
  (let ((column (cldb::find-column base changed "quote" "creation_date")))
    (time (dotimes (i 1000)
            (cldb::column-value base changed column 12345)))
    (cldb::column-value base changed column 12345)))
;; typcially 0.6ms

Of course, by simply using CL’s provided data structures (arrays, CLOS instances, structs etc) we lose the ACID features provided by an RDBMS such as PostgresQL, including automatic persistence to durable storage. Another problem is that if we try to store millions of attributes in memory naively, the lisp garbage collector becomes a bottleneck.

One possible solution to some of these problems is to use persistent data structures (such as trees with relatively high branching factors) as is done in Clojure. These can give atomicity and isolation, but suffer from a lot of GC overhead, and although access times are better, they are still considerably slower than simple array access.

CLDB aims to provide a database of sorts which:-

Is much faster than Postgres and persistent data structures as in Clojure for simple queries (id -> value and value -> id lookup); with those queries being composable
Fully indexes almost all columns (only disabled for booleans)
Has atomic updates - grab a mutable reference to the database to change it, but no other process will see changes until the new copy is swapped in
Only supports writing by one thread at a time if you want a single, consistent database
Allows quickly saving the whole database to disk, and this will either completely succeed or completely fail (it writes to another file and atomically swaps it in)
Loads databases from disk very quickly via mmap

To achieve this it uses a block based copy-on-write approach to implement a persistent heap. This has a small amount of overhead compared to direct array access, and generates a small number of objects for the garbage collector to manage (changed blocks) even in a database of millions of values.

This persistent heap can then be used to store other data structures, such as arrays, strings etc, which are automatically atomic and isolated due to residing in a copy-on-write heap.

At VIP we use this is as a caching layer in front of a PostgresQL database. By cunning use of triggers on some tables (set up automatically when a table is registered) we can ‘mirror’ a postgres table into the CLDB in-memory database. When the postgres transaction completes successfully, a single CLDB update thread will transfer the changes made in Postgres into the CLDB copy and then atomically swap the ‘current’ version of the database in the lisp image.

In this way we keep the advantages of PostgresQL in terms of reliability and ACIDity, and gain orders of magnitude of performance for many queries.

This in memory CLDB copy of the database is periodically ‘snapshot’ to disk. An interesting consequence of the way this is implemented is that it makes no difference whether the snapshoting happens in the same Lisp image or a different one, so this is handled by a separate process.

When the Lisp system starts up, it loads the CLDB database using mmap and then looks to see what transactions in the Postgres database are missing from the CLDB database since it was saved. These transactions are loaded in before the system starts processing requests. This is quick.

Design

Persistent Heap

The lowest layer is a persistable heap of objects. Most simply this would be an array of 64 bit integers which are used to store various kinds of object. By storing them in an array like this, the lisp garbage collector only has to check one object (the array) to see if it is still referenced. This makes ‘objects’ in the persistent heap invisible to the normal Lisp GC, and means we have to implement our own garbage collector if we want to collect things. No garbage collector has been implemented at present.

Representing the heap in this way gives very fast access (read and write) and we can very quickly write the whole heap out to disk. Reading the heap in from disk is far quicker, as we can just use mmap.

In order to build an ACID (or at least ACI - we just use PostgresQL at VIP to get the ‘D’) database on top of this heap, the API for changing things in the heap implements a copy-on-write (COW) strategy. The heap is therefore represented as:-

A flat array of unsigned 64 bit integers, which can be loaded via mmap
An array of changed blocks - each of which is an array of 4096 unsigned 64 bit integers.

This representation still gives a fairly small number of objects for the Lisp GC to consider for any ‘reasonable’ size of database.

The functions for accessing data FROM the heap (defined in persistent-heap.lisp) take the base vector and changed blocks array as two separate parameters. All the objects created in the heap (eg cons cells and arrays) are returned as ‘heap pointers’ - fixnums which are tagged offsets into the heap. The tag is used to identify the object type.

Functions for writing objects INTO the heap all take a writable heap, which is a single object containing a bit more information.

Column Database

On top of the heap a column database is implemented. Columns are stored (mainly) as arrays mapping id (row index) to value for that row combined with indexes for all the values mapping value -> row. The ID for each row is implicit and is simply it’s index into the array. Each column’s values are dynamically typed, so columns can contain a mixture of different types of values. The hash table index of those works on the ‘pobject’ representation, which is a single fixnum in 64 bit lisps. As all strings are interned in the heap, they are also just a single machine word for index lookup operations.

In our Postgres database we have used numeric IDs as primary keys in many tables, so these are used as the row index in CLDB. The arrays will not be completely allocated in memory, so creating excessively large arrays is fine although when saving they will add to the file size. Provided the file system supports it, the file will be sparse (containing empty blocks) and so won’t take up too much disk space either.

Getting Started

(in-package :cldb)
    
;; 1. Create a new database and snapshot it
(snapshot-database "/tmp/snap.db" (make-database))

;; 2. Open database from a file as the top level database
(open-database "/tmp/snap.db")

;; 3. Create a new column in the database in a transaction
(with-database-transaction (w)
  ;; We have to either make columns with up to 65536 rows OR columns with 1+2**n rows up to some limit
  ;; this is just over 1M rows.
  ;; The resultant file size 
  (make-simple-column w #x100001 "table" "column"))

;; 4. Put some data into the column
;; The columns can store any value we can encode into the pheap
(with-database (base changed)
  ;; The find-column function takes base vector and changed block list as it is a 'read' function
  (let ((column (find-column base changed "table" "column")))
    ;; make a writable snapshot
    (with-database-transaction (writable)
      (loop for i from 0 to 10
         do (setf (column-value writable column i) (* i i)))
      (setf (column-value writable column 11)
            "This is a string")
      )
    ))

;; passing atomic creates a temporary file to save and then moves the temporary over the previous once done
(snapshot-database "/tmp/snap.db" *current-database* :atomic)

;; close the database
(close-database)

;; open it again
(open-database "/tmp/snap.db")

;; read the data
(with-database (base changed)
  (let ((column (find-column base changed "table" "column")))
    (loop for i from 0 to 11
         collect (column-value base changed column i))))

;; Change the data
(with-database (base changed)
  (let ((col (find-column base changed "table" "column")))
    (with-database-transaction (w)
      (setf (column-value w col 0) t
            (column-value w col 1) nil
            ;; general symbols are not presently supported...
            ;; (column-value w col 2) 'hello
            ;; though all strings are interned
            (column-value w col 2) "hello"
            ;; (and small strings are encoded in a single 64 bit word)
            ;; rationals are stored as rationals
            (column-value w col 3) 1/3
            ))))

;; column-index-lookup returns a function which will look up row IDs from a column value
(with-database (base changed)
  (let ((results nil))
    (funcall (funcall (column-index-lookup (find-column base changed "table" "column"))
                      base changed
                      ;; this is the value we are looking for:-
                      "hello")
             (lambda (row)
               (push row results)))
    (reverse results)))

;; ...which can also be written as:-
(query (index-lookup "table" "column")
       (collect "hello"))
;; (see below for the query interface)

As much as possible lookups are only performed once. column-index-lookup takes the table and column name and returns a function to lookup column rows from values. When looking up different values in the same table and column this avoids the need to find the column itself in the heap repeatedly.

Having found the column we can pass base, changed and a value to the resultant function and it will lookup interned strings in the heap, and otherwise convert the lisp object into a ‘pobject’ - which is represented as a fixnum. All lisp objects which can be dealt with here will be converted to fixnums. Note: columns can’t meaningfully store list values, cons cells or arrays, though the heap can. These are used in the heap to build columns, including their indexes (as hash tables).

Having resolved the value to a pobject we then get an iterator function which repeatedly calls a function passed to it with the row indexes which match the value.

Queries

To provide a convenient interface to access information from this database there is a library of functions which can be composed together and a macro called cldb:query.

The following will find all 5 legged mammals by doing an index lookup to get all mammals then checking for the leg count being 5 and will return t if any are found:-

(cldb:query (cldb:index-lookup "animal" "type") ; find all animal of some type
            (cldb:column-equal "animal" "legs" 5) ; with 5 legs
            (cldb:exists "mammal"))

The following does the same, but by first doing an index lookup for all 5 legged animals and then checking to see whether they are a mammal. It is likely to be quicker as there are probably more mammals than 5 legged animals.

(cldb:query (cldb:index-lookup "animal" "legs")
            (cldb:column-equal "animal" "type" "mammal")
            (cldb:exists 5))

The query macro threads the parameters for base vector and changed blocks through the nested calls and uses compose to combine the query functions. Note cldb:compose is not the standard compose function. It composes CQFs (composable query functions). The above macro expands to:-

(WITH-DATABASE
  (#:BASE-VECTOR91268932 #:CHANGED-BLOCKS91268933)
  (EXISTS #:BASE-VECTOR91268932
          #:CHANGED-BLOCKS91268933
          (COMPOSE (INDEX-LOOKUP #:BASE-VECTOR91268932
                                 #:CHANGED-BLOCKS91268933
                                 "animal"
                                 "legs")
                   (COLUMN-EQUAL #:BASE-VECTOR91268932
                                 #:CHANGED-BLOCKS91268933
                                 "animal"
                                 "type"
                                 "mammal"))
          5))

It is also possible to omit the terminal clause:-

(cldb:query (cldb:index-lookup "animal" "legs")
            (cldb:column-equal "animal" "type" "mammal"))

Which gives

(WITH-DATABASE
  (#:BASE-VECTOR91296885 #:CHANGED-BLOCKS91296886)
  (COMPOSE (INDEX-LOOKUP #:BASE-VECTOR91296885
                         #:CHANGED-BLOCKS91296886
                         "animal"
                         "legs")
           (COLUMN-EQUAL #:BASE-VECTOR91296885
                         #:CHANGED-BLOCKS91296886
                         "animal"
                         "type"
                         "mammal")))

The result of this could then be passed to exists, collect or collect-1. exists is defined as:-

(defun exists (b c cqf value)
  (funcall (funcall cqf b c value)
           (lambda (x)
             (declare (ignore x))
             (return-from exists t))))

Metaclass

We can also do the following:-

;; as the CLDB metaclass won't create columns in the databaes we must do that manually for now
;; (in our system the columns come from the postgres database)
(with-database-transaction (w)
  (loop for slot in '("name" "full_name" "password_hash" "salt")
     do (make-simple-column w #x100001 "user_account" slot)))


(defclass user-account ()
  ((name :reader name :type string)
   (full-name :reader full-name :type string)
   (password-hash :reader password-hash :type string)
   (salt :reader salt :type string))
  (:metaclass cldb-class))

;; (cldb:get-instance 'user-account 1)


(defclass post ()
  ((user-account :type user-account :initarg :user-account :reader user-account)
   (message :type string :initarg :message :reader message)
   (date :type integer :initarg :date :reader date))
  (:metaclass cldb-class))

The values for the slots are read directly from the persistent heap and not cached in the object in any way. At VIP this is used as a way to access objects mirrored from a postgres database, so we have not, as yet, made a way to create or mutate these objects. This would certainly be possible, but would (of course) require a reference to a writeable heap. There is currently no dynamically bound writeable heap - all the heap modification functions take it as an explicit parameter by design.

Queries can also be performed at the metaclass level and the query optimzier (invoked from the query macro) will attempt to simplify any lookups avoiding generating CLOS objects only to access a single slot value from them and will just generate appropriate index-lookup and column-value compositions where possible.

Future Work

In order to achieve durability in an in memory database with periodic snapshots the obvious solution is to simply log all updates. This is the key missing part needed to make CLDB a viable database on its own. Since, at VIP, we were already using PostgresQL as our main RDBMS and have no near term plans to change that, we have used PostgresQL to handle the durability. This works very well from a reliability standpoint, though the way it achieves this is far more complicated than a simple logging system and bottlenecks write performance significantly.

As noted in the Metaclass section above, no object creation or mutation for the metaclass has been implemented either. At VIP the mutation is all handled through Postgres (in various ways) and so the CLDB objects only serve for reading data. To extend this to be more of a full, independent, database would require implementing that, along with other write methods.

cl-kuthirgal / cldb Goto Github PK

cldb's Introduction

cldb.asd

Requirements

Postgres Mirroring

Caveats and Limitations

Motivation

Design

Persistent Heap

Column Database

Getting Started

Queries

Metaclass

Future Work

cldb's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent