clj-3df

This is alpha-quality software, not feature-complete, and not yet ready for use in production services.

3DF is best thought of as a pub/sub system in which subscriptions can be arbitrary Datalog expressions. Subscribers register queries with the broker, and data sources (such as Kafka, Datomic, or any other source-of-truth) publish new data to it. All subscriber queries affected by incoming data will be notified with a diff, describing how their results have changed. The Datalog implementation is modeled after Datomic's query language and aims to support the same set of features.

3DF does this efficiently, thanks to being built on top of differential dataflows. In particular, Differential Dataflow will only compute changes, rather than execute a computation from scratch.

This repository contains the Clojure client for 3DF. The broker is written in Rust and can be found in the Declarative Differential Dataflows repository.

How it works

The 3DF client compiles Datalog expressions into an intermediate representation that can be synthesised into a differential dataflow on the server. This dataflow is then registered and executed across any number of workers. Whenever query results change due to new data entering the system, the server pushes the necessary changes to the client via a WebSocket connection.

For example, suppose a subscriber creates the following subscription:

(exec! conn
  (query db "user inbox" 
    '[:find ?msg ?content
      :where 
      [?msg :msg/recipient "[email protected]"]
      [?msg :msg/content ?content]]))

and a new message arrives in the system:

[{:msg/recipient "[email protected]"
  :msg/content    "Hello!"}]

Then the server will push the following results to the subscriber:

[[[<msg-id> "Hello!"] +1]]

If, at some later point in time, this message were retracted:

[[:db/retractEntity <msg-id>]]

the server would again notify the subscriber, this time indicating the retraction:

[[[<msg-id> "Hello!"] -1]]

This guarantees that subscribers maintaining any form of functionally derived information will always have a consistent view of the data.
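As a rough illustration of how a subscriber might keep such a view up to date, the sketch below folds pushed diffs into a map from result tuple to multiplicity. The function name `apply-diffs` and the `:msg-1` stand-in for `<msg-id>` are hypothetical and not part of clj-3df; the only assumption taken from the text above is the `[tuple diff]` shape of each pushed change.

```clojure
(defn apply-diffs
  "Folds a batch of [tuple diff] pairs into a map of tuple -> multiplicity.
  Tuples whose multiplicity reaches zero are dropped from the map."
  [results diffs]
  (reduce (fn [acc [tuple diff]]
            (let [n (+ (get acc tuple 0) diff)]
              (if (zero? n)
                (dissoc acc tuple)
                (assoc acc tuple n))))
          results
          diffs))

;; :msg-1 is a hypothetical stand-in for the <msg-id> placeholder.
(def after-insert  (apply-diffs {} [[[:msg-1 "Hello!"] +1]]))
(def after-retract (apply-diffs after-insert [[[:msg-1 "Hello!"] -1]]))
```

Because the view is updated only from diffs, the subscriber never has to re-run the query to stay in sync with the server.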

Query Language Features

  • Implicit joins and unions, and / or operators
  • Stratified negation
  • Parameterized queries
  • Rules, self-referential / mutually recursive rules
  • Aggregation (min, max, count, etc.)
  • Grouping via :with
  • Basic predicates (<=, <, >, >=, =, not=)
  • As-of queries
  • More find specifications (e.g. collection, scalar)
  • Pull queries
  • Queries across many heterogeneous data sources

Please also have a look at the open issues to get a sense for what we're working on.

Non-Features

3DF is neither concerned with durability nor with consistency in the ACID sense. It is intended to be used in combination with a consistent, durable source-of-truth such as Datomic or Kafka.

Consequently, 3DF will accept whatever tuples it is supplied with. For example, whereas in Datomic two subsequent transactions on an empty database

(d/transact conn [[:db/add 123 :user/balance 1000]])
...
(d/transact conn [[:db/add 123 :user/balance 1500]])

would result in the following sets of datoms being added into the database:

[[123 :user/balance 1000 <tx1> true]]
...
[[123 :user/balance 1000 <tx2> false]
 [123 :user/balance 1500 <tx2> true]]

3DF by itself does not take any previously supplied information into account when processing a transaction; it simply accepts the tuples as given. Again, 3DF is intended to be fed data from a system like Datomic, which ensures that transactions produce consistent tuples.
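To make the difference concrete, here is a minimal sketch of the bookkeeping an upstream source would have to perform for a cardinality-one attribute before handing tuples to 3DF. The function `expand-add`, the nested-map state, and the `:tx1` / `:tx2` keywords standing in for `<tx1>` / `<tx2>` are all hypothetical; this is not how Datomic or 3DF is implemented. Treating an unchanged value as a no-op is a choice made for this sketch.

```clojure
(defn expand-add
  "Given the current entity -> attribute -> value state, a transaction id
  and an [e a v] assertion, returns the datoms that keep a cardinality-one
  attribute consistent: a retraction of the previous value (if there is
  one and it differs), followed by the new assertion."
  [state tx [e a v]]
  (let [old (get-in state [e a])]
    (cond
      (= old v)  []                     ; value unchanged, nothing to emit
      (nil? old) [[e a v tx true]]      ; first value for this attribute
      :else      [[e a old tx false]    ; retract the previous value
                  [e a v tx true]])))   ; assert the new one

(def first-tx  (expand-add {} :tx1 [123 :user/balance 1000]))
(def second-tx (expand-add {123 {:user/balance 1000}} :tx2 [123 :user/balance 1500]))
```

Here `first-tx` holds a single assertion, while `second-tx` holds a retraction of 1000 followed by an assertion of 1500, mirroring the two sets of datoms listed above.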

License

Copyright © 2018 Nikolas Göbel

Licensed under Eclipse Public License (see LICENSE).

clj-3df's People

Contributors

comnik, li1, spacegangster

clj-3df's Issues

retraction of an inexistent value of cardinality/one causes server panic

If an attribute has cardinality one and the value being retracted isn't there, the retraction causes the Declarative server to panic. Not sure if it's a bug, though.

reproduction below

; in core namespace
  (def schema
    {:loan/epoch (merge
                  (of-type :Number)
                  (input-semantics :db.semantics.cardinality/one)
                  (tx-time))})

; connect, submit schema then
  (exec! conn
         (transact db
                   [[:db/retract #uuid "1550c322-9111-44f8-a939-c53cee0d9774"
                     :loan/epoch 4]]))

Panic

 server git:(a1323c6) ✗ RUST_BACKTRACE=1 cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/declarative-server`

thread 'worker thread 0' panicked at 'Received a retraction of a new key on a CardinalityOne attribute', /Users/spacegangster/clj/declarative-dataflow/src/operators/mod.rs:76:29

And now I think I don't understand what cardinality means in this context.

Multiple aggregations

Currently we nest multiple aggregations in the same :find clause. This leads to problems in the backend, as already-aggregated results become the input collections of the higher aggregation operators.

ClojureScript support

ClojureScript compatibility was left behind a couple of iterations ago. The major obstacle is the WebSocket connection, which is currently implemented via the aleph library. On the web we can simply use WebSockets directly, but it still has to be done...

Datahike support?

Hi folks!

I remember hearing a while back that there was going to be an effort to add Datahike support to clj-3df. It doesn't seem like anything has come of that, but please let me know if that's still on the radar.

Thanks!

Improve query verification during parsing

Some aspects of a query can be checked locally during parse-time, e.g. that constants are of the correct type, attributes and rules exist, rule and query names don't clash with existing ones, and so on...
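One of these checks, that attributes exist, could look roughly like the sketch below. It assumes a vector-style query whose `:where` section comes last and a schema represented as a map keyed by attribute; `where-clauses`, `unknown-attributes`, and the example schema are hypothetical names for illustration, not existing clj-3df functions.

```clojure
(defn where-clauses
  "Returns the clauses following :where in a vector-style query.
  Assumes :where is the last section of the query."
  [query]
  (->> query
       (drop-while #(not= :where %))
       rest))

(defn unknown-attributes
  "Returns the set of attributes used in [e a v] data patterns that are
  not keys of `schema`. Predicates, rule calls and other clause types
  are ignored."
  [schema query]
  (->> (where-clauses query)
       (filter #(and (vector? %) (keyword? (second %))))
       (map second)
       (remove #(contains? schema %))
       set))

;; Placeholder schema: only the keys matter for this check.
(def example-schema {:msg/recipient {} :msg/content {}})

(def typo-check
  (unknown-attributes example-schema
                      '[:find ?msg ?content
                        :where
                        [?msg :msg/recipient "someone"]
                        [?msg :msg/contnet ?content]]))
```

In this example `typo-check` contains only the misspelled `:msg/contnet`, which could be reported to the user before the query is ever sent to the server.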

FnExpr symbols and implicit clause order

Quoting:
At the very least, result-sym should be part of the bound symbols. But we also have to think about whether the output binding is supposed to replace the input binding, or whether they co-exist, etc...

Pagination

Is pagination of queries on your radar at all?

Add support for aggregations in clauses

It is currently not possible to use aggregations within rules, because they can only appear within the find specification. It should be possible to use aggregations like transforms:

[[(total ?user ?sum)
  [?user :user/purchase ?purchase]
  [?purchase :purchase/amount ?amount]
  [(sum ?amount) ?sum]]]

How to run clj-3df?

Hi guys! This looks promising, but I can't figure out how to use it. I cloned the repo, am running a differential dataflow server. Updated Leiningen to fetch some newer deps. When I try to run any of the clj-3df examples, I get an error like this:

~/Projects/clj-3df  master ?  clj -m lww
[MIDDLEWARE] running
Exception in thread "main" java.lang.ClassCastException: clojure.core.async.impl.channels.ManyToManyChannel cannot be cast to manifold.bus.IEventBus
	at lww$_main.invokeStatic(lww.clj:52)
	at lww$_main.invoke(lww.clj:50)
	at clojure.lang.AFn.applyToHelper(AFn.java:152)
	at clojure.lang.AFn.applyTo(AFn.java:144)
	at clojure.lang.Var.applyTo(Var.java:702)
	at clojure.core$apply.invokeStatic(core.clj:657)
	at clojure.main$main_opt.invokeStatic(main.clj:317)
	at clojure.main$main_opt.invoke(main.clj:313)
	at clojure.main$main.invokeStatic(main.clj:424)
	at clojure.main$main.doInvoke(main.clj:387)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at clojure.lang.Var.applyTo(Var.java:702)
	at clojure.main.main(main.java:37)

Or like this:

~/Projects/clj-3df  master ?  clj -m runner lww
Exception in thread "main" java.lang.RuntimeException: No such var: df/debug-conn, compiling:(runner.clj:10:18)
	at clojure.lang.Compiler.analyze(Compiler.java:6792)
	at clojure.lang.Compiler.analyze(Compiler.java:6729)
	at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3813)

Roadmap?

I know it's hard to predict, but do you have even a rough estimate of when a semi-stable version will be out? I love it and want to use it in a project. If there is no estimate, is there anything else I could use with Datomic for similar functionality?

Constant bindings

We should add support for constant bindings in ...

  • predicates
  • function expressions
  • rule expressions

For predicates and function expressions, support is rather straightforward, but for rules we can spend some more time thinking about how to do it efficiently. A first idea was to turn constant bindings to rule inputs into predicates that are then pushed down as far as possible into the rule body itself, to avoid unnecessary materialization of tuples.
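The first half of that idea, turning constant arguments into predicates, might be sketched as follows. The function `constants->predicates` and the `lives-in` rule are hypothetical. The sketch only prepends equality predicates to the rule body; it does not push them further down, and it does not rename variables between the call site and the rule definition.

```clojure
(defn constants->predicates
  "Takes a rule definition [(name ?p1 ?p2 ...) clause ...] and the
  arguments of one invocation. For every argument that is a constant
  rather than a symbol, prepends an equality predicate on the matching
  rule parameter to the rule body. Returns the rewritten body."
  [[[_ & params] & body] args]
  (let [preds (for [[param arg] (map vector params args)
                    :when (not (symbol? arg))]
                [(list '= param arg)])]
    (into (vec preds) body)))

(def rewritten-body
  (constants->predicates
   '[(lives-in ?user ?country)
     [?user :user/country ?country]]
   '[?u "DE"]))
```

For the invocation `(lives-in ?u "DE")`, `rewritten-body` starts with `[(= ?country "DE")]`, followed by the original body clause.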

Ordering of vars in pred- and fn-expr

The backend expects in the case of fn-expr: vars, plan, function

  • Needs front-end update
  • Predicates are currently wrong in the backend as they take: vars, predicate, plan

UUID serialization for entity IDs

At this point it's not possible to use UUIDs as entity IDs in transactions, most likely because of incorrect serialization.

E.g.

(df/transact db [{:db/id   (java.util.UUID/randomUUID)}])

will cause the following error in 3df:

ERROR 2019-05-28T15:18:01Z: declarative_server::networking: [IO] Error { category: "df.error.category/incorrect", message: "unknown variant `\"1d2c0533-6e84-4f2a-bedd-d1389d7ec47d\"`, expected one of `Aid`, `String`, `Bool`, `Number`, `Rational32`, `Eid`, `Instant`, `Uuid` at line 1 column 228" }

Support for pull query

I see that support for pull queries is on the README query feature map, but I didn't see an issue for it, so I thought I'd start one. Of all the pending query features, this one to me feels like the biggest hole in my day to day usage of Datomic & DataScript.

There's potentially a bit of a discussion here about interface and scoping. There seems to be a clear target for

  • individual pull queries
  • pull queries on a set collection of ids (pull-many basically)
  • pull queries within a datalog query like [:find (pull ?e [...]) :where ...]

In all of these cases, it would seem reasonable to send diffs corresponding to the relevant [e a v] triples. What this misses compared to a conventional pull is obviously "where in the nested structure is this relevant". It would probably be fine to ignore this, but it is interesting to consider that with some kind of Reagent-like API, you could return reactions which resolve to maps, which themselves might point to nested reactions.

The problem I see is what if you have a query like [:find (pull ?e [...]) (pull ?d [...]) :where ...], which is effectively a relationship between pull structures. This is legal in either Datomic or DataScript, but I'm not sure how you would interpret it here, because here you don't just have a collection of facts, you have a relation between collections of facts. So maybe this just isn't supported. Or maybe you can come up with some clever indexing scheme that pairs the pull diffs with a concept of where they are in the outer relation. What's interesting is that if we again consider the Reagent model, this again fits quite nicely into the idea of returning a reaction of nested reactions.

Again, thanks for the great work!

default clauses for datalog

To prevent a situation where queries fail simply due to missing data, it might make sense to introduce default values (similar to get-else in Datomic).

Add CloseInput support

Declarative dataflow now provides a CloseInput request, which we should support here as well.

Multi-arity rules

Currently, only fixed-arity rules are supported. It would be nice to support rule definitions with different arities, to be fully compatible with Datomic in this regard.
