apa512 / clj-rethinkdb
License: Eclipse Public License 1.0
The "view source" links in the API reference docs don't point to the right line numbers.
For example, this is the link for query/replace:
https://github.com/apa512/clj-rethinkdb/blob/master/src/rethinkdb/query.clj#L110
Actual line number is 137, not 110.
Currently the testing code checks for the existence of a table, drops it if it exists, then creates a new one. It would be much faster if we generated tables with different UUIDs and assigned each test one of them to use.
There are two ways this could work:
In the setup-each function, create a new table with a UUID name, and bind it to a var to be used by the test. http://stackoverflow.com/questions/31735423/how-to-pass-a-value-from-a-fixture-to-a-test-with-clojure-test/31736107#31736107 looks helpful for this.
This could be a good PR for someone who wants to get their feet wet in the codebase.
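A minimal sketch of the fixture approach described above. The dynamic var `*table-name*` and the helper `unique-table-name` are hypothetical names, not part of the library, and the actual table creation is left as a comment because it depends on the driver.

```clojure
(require '[clojure.string :as str]
         '[clojure.test :refer [use-fixtures]])

;; Hypothetical dynamic var that each test reads its table name from.
(def ^:dynamic *table-name* nil)

(defn unique-table-name
  "Builds a table name from a random UUID. Dashes are replaced with
  underscores, assuming table names only allow letters, digits and
  underscores."
  []
  (str "test_" (str/replace (str (java.util.UUID/randomUUID)) "-" "_")))

(defn with-unique-table
  "clojure.test fixture: binds *table-name* for the duration of one test."
  [test-fn]
  (binding [*table-name* (unique-table-name)]
    ;; Assumption: the table would be created here with the driver,
    ;; and optionally dropped after (test-fn) returns.
    (test-fn)))

(use-fixtures :each with-unique-table)
```

Because every test gets its own table, no test has to check for or drop a table left behind by another one.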
To avoid more bugs like #52 it could be good to turn the query vector into a query map. It could look something like:
{:token 5
 :query ...
 :json-query ... ; turned into raw JSON arrays
 :query-type
 :global-optargs}
This would make it easier to put assertions in various places to ensure that queries are being constructed correctly.
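As a rough illustration of the kind of assertion a query map would allow, here is a sketch in plain Clojure. The function names and the specific checks (an integer token, a map of optargs) are assumptions for the example, not the library's actual contract.

```clojure
(defn valid-query-map?
  "Checks the shape of a query map. The exact rules here are assumptions."
  [q]
  (and (map? q)
       (integer? (:token q))
       (contains? q :query)
       (contains? q :query-type)
       (map? (get q :global-optargs {}))))

(defn send-query!
  "Hypothetical send step. The precondition rejects malformed queries
  before anything is written to the socket."
  [q]
  {:pre [(valid-query-map? q)]}
  ;; A real implementation would serialize and send q here.
  q)
```

With a positional vector, a swapped or missing element is hard to detect; with named keys, a check like this can fail early and say which part is wrong.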
Now that Clojure 1.7 is officially released, what are people's thoughts about moving this library to Clojure 1.7 in the future? Is anyone using Clojure 1.6 that isn't planning/able to upgrade? I'm particularly interested in hearing from anyone who can't upgrade.
The main motivation for requiring 1.7 is so that it would be possible to move to cljc files and generate queries on the client. Transducers may also be useful in the future, but not immediately.
How about creating a gitter.im room for this repository? For the moment the only method to talk between developers is to file an issue, which is rather heavyweight. Gitter rooms work well, they tend to have a limited audience and are easy to maintain.
I'm trying to implement code similar to this example from the RethinkDB docs:
r.table("employees").eq_join("company_id", r.table("companies"))
    # Copy the field right.id into right.c_id
    .map( r.row.merge({
        "right": {
            "c_id": r.row["right"]["id"]
        }
    }))
    # Remove the field right.id
    .without({"right": {"id": True}})
    .zip()
.run()
And I'm having trouble with r.row["right"]["id"].
In what way should I call r/get-field? (r/get-field row "right" "id") and (r/get-field row "right.id") didn't work.
I was inserting a lot of documents into RethinkDB and everything was working, but suddenly I got the following error:
Caused by: java.lang.AssertionError: Assert failed: (= token recvd-token)
at rethinkdb.net$read_response.invoke(net.clj:29)
at rethinkdb.net$send_query.invoke(net.clj:42)
at rethinkdb.net$send_start_query.invoke(net.clj:55)
at rethinkdb.query$run.invoke(query.clj:378)
I got this twice and don't have good reproduction steps. I'm just curious whether you know how that may happen.
This should work already by requiring macros when requiring rethinkdb.query, but I haven't tested it properly yet.
Relates to #59.
There is a difference between arrays and streams that are returned. order-by with a non-indexed field will return an array which has been eagerly loaded. Streams are lazily transformed. This can be a bit confusing for users, and should be documented somewhere.
Relates to:
rethinkdb/rethinkdb#4530
rethinkdb/rethinkdb#2303
rethinkdb/rethinkdb#4115
I see the following when running this in 1.7:
WARNING: update already refers to: #'clojure.core/update in namespace: rethinkdb.query, being replaced by: #'rethinkdb.query/update
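The warning appears because Clojure 1.7 added clojure.core/update, which clashes with a var of the same name defined in the namespace. One common way to silence it is to exclude the core var in the ns form. The sketch below uses a hypothetical namespace name and a stand-in body for update.

```clojure
(ns example.query-sketch
  ;; Exclude the core var so defining our own `update` does not warn.
  (:refer-clojure :exclude [update]))

(defn update
  "Stand-in for the real query term; it only returns a data description."
  [sel obj]
  {:term :UPDATE :args [sel obj]})
```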
This may be a feature, rather than a bug, but it tripped me up (I guess because I converted code written with Revise, which does convert them to vectors). It seems to me that vectors would be more useful as they map more closely to js (and therefore RethinkDB) vectors.
I'd also prefer to avoid having to convert them in client code especially if they're nested in documents, although I understand if you prefer keeping them as seqs (especially since not all users will need this and it would therefore incur an unneeded cost).
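Until the driver's behavior is decided, client code can convert nested seqs itself. This is a small sketch using clojure.walk; the function name seqs->vectors is made up for the example.

```clojure
(require '[clojure.walk :as walk])

(defn seqs->vectors
  "Walks a document and turns every seq (lazy or list) into a vector,
  including seqs nested inside maps and other collections."
  [doc]
  (walk/postwalk (fn [x] (if (seq? x) (vec x) x)) doc))

;; Usage: (seqs->vectors {:tags (map inc [1 2])})
```

The walk visits every node, which is the unneeded cost mentioned above for users who are happy with seqs.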
In developing #55 we need to make a decision whether to use Manifold or core.async for our asynchronous API. We probably want to be able to provide a core.async API anyway as I imagine this will be the most common case, however there are good reasons to use Manifold, namely:
[It] can be used as a translation layer between libraries which use similar but incompatible abstractions.
What are people's thoughts on this? I'm not familiar with Manifold. One potential downside is that we gain an extra dependency on Riddley. I'm not sure how safe this is; I've had issues with Potemkin in the past.
@apa512 what's the versioning scheme for this library? It could be good if we followed either the RethinkDB protocol versions, or RethinkDB database versions. Semver is also fine too. What are your thoughts?
http://rethinkdb.com/api/javascript/default/
I need this mostly for when I'm looking for object properties that don't exist
In V0_4 of the spec, the Protocol buffer names for any and all have changed to or and and respectively. It looks like clj-rethinkdb has started using the V0_4 version of the spec (which should be backwards compatible with previous versions), however because the names have changed we will now be looking up an enum which no longer exists.
It looks like rethinkdb.query/any and rethinkdb.query/all aren't part of the tests so this wasn't picked up when the protobuf dependency changed.
One way of solving this is to deprecate any and all, and create new or and and functions which use the new term name, and then get any and all to call or and and. I'm happy to put together a PR to help on this.
git diff v1.16.0 v2.0.0 -- src/rdb_protocol/ql2.proto
diff --git a/src/rdb_protocol/ql2.proto b/src/rdb_protocol/ql2.proto
index 113ea7d..3ee6831 100644
--- a/src/rdb_protocol/ql2.proto
+++ b/src/rdb_protocol/ql2.proto
<SNIP...>
@@ -520,11 +546,9 @@ message Term {
// statement).
BRANCH = 65; // BOOL, Top, Top -> Top
// Returns true if any of its arguments returns true (short-circuits).
- // (Like `or` in most languages.)
- ANY = 66; // BOOL... -> BOOL
+ OR = 66; // BOOL... -> BOOL
// Returns true if all of its arguments return true (short-circuits).
- // (Like `and` in most languages.)
- ALL = 67; // BOOL... -> BOOL
+ AND = 67; // BOOL... -> BOOL
// Calls its Function with each entry in the sequence
// and executes the array of terms that Function returns.
FOR_EACH = 68; // Sequence, Function(1) -> OBJECT
<SNIP...>
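The deprecation route proposed above could look roughly like this. The namespace name and the term helper are hypothetical stand-ins for the driver's internals; only the OR and AND term names come from the diff.

```clojure
(ns example.bool-terms-sketch
  ;; `or` and `and` are core macros, so they must be excluded
  ;; before functions with those names can be defined here.
  (:refer-clojure :exclude [or and]))

(defn- term
  "Hypothetical stand-in for the driver's term constructor."
  [term-name args]
  {:term term-name :args (vec args)})

(defn or
  "Builds an OR term (named ANY before the V0_4 protocol)."
  [& args]
  (term :OR args))

(defn and
  "Builds an AND term (named ALL before the V0_4 protocol)."
  [& args]
  (term :AND args))

;; Old names kept for existing callers; they delegate to the new ones.
(defn ^:deprecated any [& args] (apply or args))
(defn ^:deprecated all [& args] (apply and args))
```

Existing code that calls any or all keeps working, while the term that goes over the wire uses the name that exists in the new enum.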
I'm interested in people's thoughts about modifying the structure of the Connection record.
From #50:
There are two things we could do to make this kind of bug not happen in the future:
:db probably shouldn't be mutable.
I was looking at the code and saw that in rethinkdb.types a bare import is used rather than putting it in the ns form. I'm aware there's some funkiness going on with the Protocol Buffers so I wondered if this was deliberate or just an oversight?
The RethinkDB drivers allow the user to specify a default DB context to execute queries in. It would be good to allow this, and probably worth adding a timeout option as well for connections.
Create a new connection to the database server. Accepts the following options:
- host: the host to connect to (default localhost).
- port: the port to connect on (default 28015).
- db: the default database (default test).
- authKey: the authentication key (default none).
- timeout: timeout period in seconds for the connection to be opened (default 20).
If the connection cannot be established, a RqlDriverError will be passed to the callback instead of a connection.
When nesting fn's, the name bindings of the outer fn get overwritten by the name bindings of the inner fn's and cannot be accessed in the inner fn's.
Nesting works as expected in the official javascript driver.
The test table looks like this:
{:id "40cd4261-aa72-4427-877b-da45abe4142b" :field [{:name "foo"}]}
The javascript query looks like this:
r.table('test').get('40cd4261-aa72-4427-877b-da45abe4142b')('field')
  .map(function(x){
    return {a: x('name'),
            b: r.expr(["x", "y", "z"]).map(function(y){
              return x
            })
    }
  })
And its output:
[
  {
    "a": "foo" ,
    "b": [
      {
        "id": "afc9cb64-af3d-4cae-8993-fc5d35ebd6ee" ,
        "name": "foo" ,
      },
      {
        "id": "afc9cb64-af3d-4cae-8993-fc5d35ebd6ee" ,
        "name": "foo" ,
      },
      {
        "id": "afc9cb64-af3d-4cae-8993-fc5d35ebd6ee" ,
        "name": "foo" ,
      }
    ]
  }
]
However, when I try to run this in Revise, referencing x inside the nested lambda does not work.
The Clojure query:
(-> (r/db :test)
    (r/table :test)
    (r/get "40cd4261-aa72-4427-877b-da45abe4142b")
    (r/get-field :field)
    (r/map
      (r/fn [x]
        {:a (r/get-field x :name)
         :b (-> ["x" "y" "z"]
                (r/map
                  (r/fn [y]
                    x)))})))
And its output:
[{:a "foo"
  :b ["x" "y" "z"]}]
That is, the inner fn clobbers the binding of the outer one.
I reported the exact same issue on the Revise driver some months ago (issue #14), see the discussion there for more details. Basically what is happening is that RethinkDB variables are unique integer IDs, but the fn code https://github.com/apa512/clj-rethinkdb/blob/master/src/rethinkdb/query.clj#L9-L16 assigns integers starting at 0 for each fn - therefore both x and y in my code above are assigned 0 and RethinkDB treats them as the same variable.
In my own Revise code, I wrote my own lambda function which keeps track of the latest ID in a global atom, so each variable binding is given a unique integer. The "real" solution would be to have something like rethinkdb.query/run walk the AST before serializing to JSON and assign IDs then, i.e. unique to each query, not globally. This would also allow it to reuse IDs that go out of scope (I don't know if this will help RethinkDB memory usage or not, I'm not sure how it handles variables internally, but it seems like a good idea to me).
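The global-atom workaround described above can be sketched in a few lines of plain Clojure. The names var-counter, fresh-var-ids and lambda-term are hypothetical; a real implementation would build the driver's function term instead of returning a map.

```clojure
;; One counter shared by every lambda, so IDs never repeat.
(def var-counter (atom 0))

(defn fresh-var-ids
  "Returns n integer IDs that have not been handed out before."
  [n]
  (vec (repeatedly n #(swap! var-counter inc))))

(defn lambda-term
  "Hypothetical stand-in for building a function term: pairs each
  argument name with a fresh ID."
  [arg-names]
  (zipmap arg-names (fresh-var-ids (count arg-names))))

;; An outer fn binding x and an inner fn binding y now get different
;; IDs, so the inner binding can no longer shadow the outer one.
```

The per-query alternative would reset or scope the counter inside run, at the cost of an extra walk over the query before serialization.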
Hi!
Wondering why this seemingly arbitrary call to Thread/sleep is here: https://github.com/apa512/clj-rethinkdb/blob/master/src/rethinkdb/net.clj#L20.
Could @apa512 clarify this?
Thanks
RethinkDB has awesome documentation with function level documentation, extensive code samples, and topic level docs. It could be good to use their examples and documentation, but port it to the Clojure driver.
It could be cool to use something like https://github.com/cljsinfo/cljs-api-docs#manual-docs to tangle the docstrings with manual docs.
Thoughts?
The code, vaguely:
;; Listen to stream of table changes
(defn get-table-changes
  "Get changes on a table"
  [db-conn name]
  (-> (r/db "db")
      (r/table name)
      (r/changes)
      (r/run db-conn)))

(defn push-table-changes-to-pub
  "Push changes from a lazy seq to a channel"
  [chan table-name]
  (let [db-conn (common/generate-new-db-connection)
        seq-atom (atom (get-table-changes db-conn table-name))]
    (loop [[f & r] @seq-atom]
      (when-not (nil? f) (println "SEND:" f) (async/>! chan f))
      (when-not (nil? @seq-atom) (recur r)))))
(you can ignore the fact that it's wrapped in an atom)
After a while listening, I get a stack overflow error, which is seemingly coming from here:
https://github.com/apa512/clj-rethinkdb/blob/master/src/rethinkdb/net.clj#L49-L54
Seems that the send gets into an infinite loop redoing send_query and eventually blows the stack. While it does make sense that it's doing send_query over and over again, maybe using a loop recur would be better (I'm assuming this is becoming an issue at all because of lack of TCO)
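To illustrate the loop/recur suggestion: a self-recursive call adds a stack frame per iteration on the JVM, while recur reuses the current frame. The sketch below is not the driver's code; fetch-batch! is a hypothetical function that returns the next batch of results or nil when there is nothing more.

```clojure
(defn drain
  "Calls fetch-batch! repeatedly until it returns nil, collecting all
  results. recur keeps the stack depth constant no matter how many
  batches arrive."
  [fetch-batch!]
  (loop [acc []]
    (if-let [batch (fetch-batch!)]
      (recur (into acc batch))
      acc)))
```

For an endless changefeed the same shape applies, except that each batch would be handed to a consumer instead of being accumulated.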
I'm not sure if that is actually an issue, so I'm just posting it for consideration. I am trying to insert data that contains (pre-generated) UUIDs. This produces the error:
Unhandled java.lang.Exception
Don't know how to write JSON of class java.util.UUID
json.clj: 385 clojure.data.json/write-generic
[the rest of the backtrace omitted for clarity]
Now, on one hand perhaps it was the intention to only handle strings (and perhaps numeric types); on the other hand, UUIDs are pretty ubiquitous, and RethinkDB uses them internally as well. Should this be handled, or should I just convert my UUIDs to strings before trying to insert data?
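As a client-side workaround, UUIDs can be converted to strings before the document reaches the JSON writer. This is a sketch using clojure.walk; stringify-uuids is a made-up name and not part of the driver.

```clojure
(require '[clojure.walk :as walk])

(defn stringify-uuids
  "Replaces every java.util.UUID in a (possibly nested) document with
  its string form, leaving all other values untouched."
  [doc]
  (walk/postwalk
    (fn [x] (if (instance? java.util.UUID x) (str x) x))
    doc))

;; Usage: (stringify-uuids {:id (java.util.UUID/randomUUID)})
```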
I am trying to do an eq-join:
(-> (r/db db)
    (r/table "parts")
    (r/eq-join :id (db-table "stock") {:index :part-id})
    (r/run conn))
...but this produces a #<Cursor rethinkdb.net.Cursor@2c9b666e>, which is not what I'd expect after reading http://www.rethinkdb.com/docs/table-joins/
Adding an (r/zip) just before (r/run conn) produces the expected result.
I am not sure if this is intended or not. It certainly is unexpected.
More generally, I could not get a number of things to work:
I think clj-rethinkdb could really benefit from more examples. I could actually help with writing documentation, but for the moment I can't figure out how many things should work. Perhaps someone could extend the basic example so that it does more things from the RethinkDB tutorials?
#55 brought up the issue of handling errors asynchronously.
Ideally, all errors (we need to investigate what errors can occur) would be put on an error channel. The error data should include:
To be able to "recover" from the error, two things need to exist:
It should be clearly documented what happens to the connection's state (token, result channel) if the error is silently ignored.
Hi @apa512
Would it be possible to get commit rights for this project so I'm able to help more and get pull requests merged faster?
Thanks, Daniel.
When you try to pprint a Connection, the following error occurs:
IllegalArgumentException Multiple methods in multimethod 'simple-dispatch' match dispatch value: class rethinkdb.core.Connection -> interface clojure.lang.IDeref and interface clojure.lang.IPersistentMap, and neither is preferred
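The error means the pretty printer's dispatch cannot choose between its map and IDeref methods for a type that implements both. One way to resolve such a tie is prefer-method. The sketch below reproduces the situation with a made-up record, DemoConn, rather than the library's Connection.

```clojure
(require '[clojure.pprint :as pp])

;; A record is a persistent map; implementing IDeref as well makes
;; two simple-dispatch methods applicable to it.
(defrecord DemoConn [host port]
  clojure.lang.IDeref
  (deref [_] {:host host :port port}))

;; Tell the multimethod to print such values as maps.
(prefer-method pp/simple-dispatch
               clojure.lang.IPersistentMap
               clojure.lang.IDeref)

(pp/pprint (->DemoConn "localhost" 28015))
```

A library could ship this preference itself, or give the type its own simple-dispatch method instead.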
There may be the potential for performance speedups by writing the network code with Netty. Using this approach, along with handling byte streams rather than doing lots of string manipulation should speed things up.
for 383967a
Rather than create a cursor protocol, what do you think of using java.io.Closeable?
I tried to implement a wrapper for connections that would allow you to use with-open on them, but unfortunately the IAtom interface was only introduced in Clojure 1.7.0.
(defn conn-atom
  "Returns an atom-like wrapper around a connection that's closeable."
  [conn-map]
  (let [a (atom conn-map)]
    (reify
      Closeable
      (close [_] (close a))
      IDeref (deref [_] @a)
      IAtom
      (swap [_ f] (.swap a f))
      (swap [_ f x] (.swap a f x))
      (swap [_ f x y] (.swap a f x y))
      (swap [_ f x y more] (.swap a f x y more))
      (reset [_ new] (.reset a new))
      (compareAndSet [_ old new]
        (.compareAndSet a old new)))))
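A smaller variant that avoids IAtom, and so does not need Clojure 1.7, is to implement only Closeable and IDeref. This is a sketch under assumptions: closeable-conn and its close-fn argument are hypothetical, and state changes would have to go through some other function rather than swap! on the wrapper.

```clojure
(import '(java.io Closeable))

(defn closeable-conn
  "Wraps conn-map so it can be used with with-open. close-fn is called
  with the current connection state when the wrapper is closed."
  [conn-map close-fn]
  (let [state (atom conn-map)]
    (reify
      Closeable
      (close [_] (close-fn @state))
      clojure.lang.IDeref
      (deref [_] @state))))

;; Usage:
;; (with-open [c (closeable-conn {:host "localhost"} (fn [_] nil))]
;;   (:host @c))
```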
We should speed up the time to detect connection failures using TCP keepalive.
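For reference, keepalive is a per-socket option on the JVM. A minimal sketch, assuming the driver creates its own java.net.Socket (keepalive-socket is a made-up helper):

```clojure
(import '(java.net Socket))

(defn keepalive-socket
  "Creates an unconnected socket with SO_KEEPALIVE enabled. The caller
  still has to connect it."
  []
  (doto (Socket.)
    (.setKeepAlive true)))
```

How quickly a dead peer is detected still depends on the operating system's keepalive timers, which often default to long intervals, so this option alone may not be enough.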
I can't find an implementation of the without command (http://www.rethinkdb.com/api/javascript/without/). Is it missing for now?
I have a query:
r.db('that_app__development')
  .table('cards')
  .filter(function (row) {
    return row('created_at').add(row('duration')).gt(r.now());
  })
That works correctly in RethinkDB's Data Explorer. However, when I convert it to Clojure syntax:
(-> (r/db database-name)
    (r/table "cards")
    (r/filter (r/fn [row]
                (r/gt (r/add (r/get-field row "created_at") (r/get-field row "duration")) (r/now))))
    (r/run db))
All results get filtered out. Is there anything outstanding in the library that would:
a. Prevent date times from being added together.
b. Prevent date times from being compared using logic operators (the r/gt).
Apologies if this is a dumb error on my part.
Currently, if an exception is thrown in the send-loop (for example if a query is badly formatted), the send-loop will crash. It should probably get similar error handling to the recv-loop.
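The general shape of such error handling is a try/catch inside the loop body, so that one bad query is reported and the loop moves on. The sketch below is not the driver's send-loop; run-send-loop, send! and on-error are hypothetical names.

```clojure
(defn run-send-loop
  "Applies send! to each query in order. An exception from one query is
  passed to on-error together with the query, and the loop continues
  with the next query instead of dying."
  [send! on-error queries]
  (doseq [q queries]
    (try
      (send! q)
      (catch Exception e
        (on-error q e)))))
```

In an asynchronous design, on-error could put the exception on the result channel of the query that caused it.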
Edit: The following pertains to [rethinkdb "0.9.39"]
I'm thinking this might be user error, but when I call a query like this pseudo-code:
(let [n 20]
  (-> (r/db db)
      (r/table "table")
      (r/eq-join "other-table" join-fn)
      (r/map map-fn)
      ;; ...
      ;; ...
      (r/limit n)
      (r/run conn)))
where n is sufficiently large (seemingly for all n > 10, and usually for n < 10), I end up with a rethinkdb.net.Cursor on which I'm not able to call distinct-by. Instead, I get an error like:
UnsupportedOperationException nth not supported on this type: Cursor clojure.lang.RT.nthFrom (RT.java:874)
Can someone help me understand how I can get an ISeqable collection out of a Cursor?
Or, help me adjust my queries so I get Clojure collections instead of Cursors?
Thanks!
The order-by query term uses rethinkdb.query-builder/parse-term, which in turn requires that we handle parse-arg, which has platform-specific code, and rethinkdb.types, which uses protocol buffers. None of this is insurmountable, but there are a few hairy things to deal with. RethinkDB are looking at switching away from Protocol Buffers, so I'm not too inclined to spend a lot of time going down this path unless someone legitimately needs the order-by query term in ClojureScript.
I'm also not quite sure how to write the checks for time and UUID. It seems that whatever instances of UUIDs and dates we were checking would need to be used by library consumers too.
I'm going to put this on hammock time for now, but happy to look at it if it's important to someone.
Relates to #59.
I actually doubt this is any fault of your library, even though it's reporting the error in one of your files.
I'm getting the following error:
Unsupported major.minor version 51.0, compiling:(query_builder.clj:1:1)
when deploying to a Dokku container.
Are you aware of anything in your library that might cause a dependency on a newer / older version of Java?
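For context, "Unsupported major.minor version 51.0" is the JVM reporting that a class file was compiled for a newer Java release than the one running it. For class files from Java 5 onward, the release number is the major version minus 44, so 51 corresponds to Java 7. A small sketch (the helper name is made up):

```clojure
(defn class-major->java-release
  "Maps a class-file major version to a Java release number.
  Valid for major versions 49 (Java 5) and above."
  [major]
  (- major 44))

(class-major->java-release 51) ;; => 7

;; The version of the running JVM, to compare against:
(System/getProperty "java.version")
```

If the container reports a Java 6 runtime, some class on the classpath built for Java 7 would explain the error.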
What are the plans for supporting v04 of the API and RethinkDB 2.0?
It would be good to build performance benchmarks to catch any performance regressions.
It looks like when I call get-all with an index specified and include a key in the list that doesn't have any results in the database, I sometimes get a Cursor back and sometimes get a vector.
If I only ask for keys that exist in the index, I always get a vector.
Here's some code to recreate the issue:
(require '[rethinkdb.core :as rc])
(require '[rethinkdb.query :as r])

(defn recreate-bug []
  (let [db-name (-> (str (java.util.UUID/randomUUID))
                    (clojure.string/replace "-" ""))]
    (with-open [c (rc/connect)]
      (r/run (r/db-create db-name) c)
      (r/run (-> (r/db db-name)
                 (r/table-create "example")) c)
      (let [table (-> (r/db db-name)
                      (r/table "example"))]
        (-> (r/index-create table "by-race-id" (r/fn [row] (r/get-field row :race-id)))
            (r/run c))
        (-> (r/insert table {:test "document" :race-id "id"})
            (r/run c))
        (doseq [_ (range 50)]
          (println
            (-> (r/get-all table ["id" "cake"] {:index "by-race-id"})
                (r/run c)
                type)))))))
This produces output like this:
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
clojure.lang.PersistentVector
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
clojure.lang.PersistentVector
clojure.lang.PersistentVector
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
Is it possible to run async queries with a callback as a second arg? (Of course it's not implemented yet, just trying to figure out how I might hook in to this.)
Happy to help implement. Thanks!
Is it safe to assume clj-rethinkdb doesn't support RethinkDB 2.x.x yet?
On RethinkDB 2.0.3 with [rethinkdb "0.9.40"] I get the following:
user=> (require '[rethinkdb.query :as r])
user=> clojure.core/eval core.clj: 3081
...
user/eval7750 form-init8720683755420544646.clj: 1
...
clojure.core/require core.clj: 5832
clojure.core/apply core.clj: 632
...
clojure.core/load-libs core.clj: 5749
clojure.core/apply core.clj: 632
...
clojure.core/load-lib core.clj: 5710
clojure.core/load-lib/fn core.clj: 5711
clojure.core/load-one core.clj: 5671
...
clojure.core/load core.clj: 5865
clojure.core/load/fn core.clj: 5866
...
rethinkdb.query/eval7754 query.clj: 1
rethinkdb.query/eval7754/loading--auto-- query.clj: 1
...
clojure.core/require core.clj: 5832
clojure.core/apply core.clj: 632
...
clojure.core/load-libs core.clj: 5749
clojure.core/apply core.clj: 632
...
clojure.core/load-lib core.clj: 5710
clojure.core/load-lib/fn core.clj: 5711
clojure.core/load-one core.clj: 5671
...
clojure.core/load core.clj: 5865
clojure.core/load/fn core.clj: 5866
...
java.lang.Class.forName Class.java: 270
java.lang.Class.forName0 Class.java
java.lang.ClassLoader.loadClass ClassLoader.java: 357
...
java.lang.ClassLoader.loadClass ClassLoader.java: 424
...
java.net.URLClassLoader.findClass URLClassLoader.java: 354
java.security.AccessController.doPrivileged AccessController.java
java.net.URLClassLoader$1.run URLClassLoader.java: 355
java.net.URLClassLoader$1.run URLClassLoader.java: 366
java.lang.ClassNotFoundException: rethinkdb.net
clojure.lang.Compiler$CompilerException: java.lang.ClassNotFoundException: rethinkdb.net, compiling:(rethinkdb/net.clj:63:19)
If this is the case, I'm happy to help with a PR for an update. Just want to confirm that's why I'm seeing what I'm seeing.
Cheers,
Sean
Currently, the error messages when connections fail are not very descriptive of the problem. It would be good to improve this.
Cheshire is much faster and better supported than clojure.data.json. It would be good to move to it.
If the RethinkDB server socket is down or disconnected, the assert here: https://github.com/apa512/clj-rethinkdb/blob/master/src/rethinkdb/net.clj#L45 throws before a java.net.ConnectException.
I found this a bit confusing in production as I wasn't sure if the db had been disconnected or what.
I thought I'd just implement them and send you a pull request, but after a quick look I'm not so sure. I think it would be better to have all constants as constants, not functions, so that we can write
(r/between [1 r/maxval] {:index "my-index"})
instead of:
(r/between [1 (r/maxval)] {:index "my-index"})
It would be very useful if a connection pool could be created which can be connected to multiple servers in a cluster. Additionally, if a connection is disconnected due to a socket error, there should be an option to have the pool automatically attempt to reconnect.
I'm trying to work out how to use the rethinkdb.query/do function without much luck.
Consider the following query
r.branch(r.table('foo').getAll('bar', {index: 'name'}).isEmpty(),
  r.table("foo").insert({name: "bar"}, {returnChanges: true}).do(function(x) {
    return x;
  }),
  {"exists": true}
)
I've tried something equivalent with clj-rethinkdb but haven't had much success
(r/branch (-> db
              (r/table "foo")
              (r/get-all [(:name data)] {:index "name"})
              r/is-empty)
          (-> db
              (r/table "foo")
              (r/insert data {:return-changes true})
              (r/do (fn [x] x)))
          {:exists true})
Obviously I would like to do additional processing in the do block, but even getting this example to work has not been possible.
How can I use (r/do) properly?
The following code snippet works:
(with-open [c (connect)]
  (r/run (r/get-all
           (r/table (r/db "mydb") "mytable")
           ["id1"]) c)) ;;also works with "id2"
The following code snippet throws a NPE:
(with-open [c (connect)]
  (r/run (r/get-all
           (r/table (r/db "mydb") "mytable")
           ["id1" "id2"]) c))
In order to get it to work with multiple ids, I have to coerce it to an array:
(with-open [c (connect)]
  (r/run (r/coerce-to
           (r/get-all
             (r/table (r/db "racehub") "test")
             ["id1" "id2"])
           "ARRAY") c))
I suspect this is because I'm running RethinkDB 2.0.1.