Support batch operations (neography issue, closed)

maxdemarzi commented on September 2, 2024
Support batch operations

from neography.

Comments (20)

maxdemarzi commented on September 2, 2024

I agree it needs to be done; the question is how we implement this cleanly. Everything needs to go into one HTTP request without breaking backward compatibility. Typhoeus is an alternative way of batching (minus the transaction, of course), but it doesn't work on JRuby and I want to keep neography platform agnostic if possible. It might be easier to implement at the lower level (the straight REST wrapper) first...

dnagir commented on September 2, 2024

I don't think parallel requests are a "real" alternative. They're kind of a workaround.
I look at them as "fire and forget" for cases where you don't care about the result (that's probably a whole new feature :) )

Anyway, my thought about the implementation was to:

  1. Always return proxy objects.
  2. Do not execute HTTP requests when we're in a batch mode.
  3. Do not block, unless absolutely necessary

In pseudo-code, create_node could be implemented this way:

def create_node *args
  request_data = {:method => 'POST', :data => {...}}
  response = process_request(request_data) unless batching?
  response = batch_request(request_data) if batching?
  wrap_result_with_node response
end

class Node
  def method_missing(name, *whatever)
    return wait_until_ready!(name, *whatever)
  end
end

# Then:
n = db.create_node "name" => "me" # This posts but returns straight away without waiting
n.name # We still can use the object as we have the data locally

n.load 123 # Does the GET, but doesn't block
n.name # The data is not available locally, so we'll block here and wait until it comes in

# But:
n, n1 = nil, nil
db.batch do
  n = db.create_node "name" => "me" # This just adds the request to the current batch without posting
  n.name # We have this locally, so no problem

  n1.load 123
  n1.name # Here we probably need to throw as the value will never be available and would block forever

  # But we can easily reference the nodes/rels within the batch:
  db.create_relationship 'cant_think', n, n # This will use the batch number - {jobId} syntax of the API
  # And of course it won't send any requests; it'll just add new data to the batch
end

# Here it will execute the batch, but won't block
n1.name # And now, if the batch hasn't completed yet, it blocks and waits for the result

All this is somewhat idealistic and just came from top of my head.

But at the very minimum something like this would work:

db.batch [n1 = Node.new, n2 = Node.new, Relationship.new("kung-fu", n1, n2), existing_node_would_update, existing_node.batch_delete]
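
For readers following along, the "queue instead of send while batching" idea can be sketched as below. This is only a minimal illustration under stated assumptions: `BatchingClient`, `dispatch`, `send_request`, and the `/batch` and `/node` paths are hypothetical names, not neography's API, and the HTTP call is replaced by an in-memory log.

```ruby
# Hypothetical sketch; none of these names come from neography.
class BatchingClient
  attr_reader :sent

  def initialize
    @queue = nil # nil means "not in batch mode"
    @sent  = []  # stands in for real HTTP requests
  end

  def batching?
    !@queue.nil?
  end

  # Collect every request issued inside the block, then send them as one call.
  def batch
    @queue = []
    yield self
    jobs = @queue
    @queue = nil
    send_request(:method => 'POST', :to => '/batch', :body => jobs)
  ensure
    @queue = nil # leave batch mode even if the block raises
  end

  def create_node(properties = {})
    dispatch(:method => 'POST', :to => '/node', :body => properties)
  end

  private

  # Queue the request in batch mode, otherwise send it straight away.
  def dispatch(request)
    return send_request(request) unless batching?

    job = request.merge(:id => @queue.size)
    @queue << job
    job
  end

  def send_request(request)
    @sent << request
    request
  end
end

db = BatchingClient.new
db.batch do |b|
  b.create_node("name" => "me")
  b.create_node("name" => "you")
end
puts db.sent.size
```

The sketch leaves out the proxy objects and the non-blocking behaviour described above; it only shows how the same `create_node` call can either send or enqueue depending on the mode.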

Sorry for the long comment. Just thought it may be useful if I'll put my thoughts down on "paper".

dnagir commented on September 2, 2024

Probably doing everything through the batch would be another option. Not sure about the drawbacks though.

dnagir commented on September 2, 2024

FYI,

It seems there's nothing wrong with always using batching: http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-REST-API-always-batch-tp3550568p3550568.html

maxdemarzi commented on September 2, 2024

Jake kind of put a downer on this idea... http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-REST-API-always-batch-tp3550568p3551368.html

  1. There are limits to how large the results can be from a batch API call.
    The batch API supports streaming deserialization of its input, but output
    is still created as a giant string in memory. It is reasonably easy to hit
    the limits of how large the batch API results can be when you start
    batching requests.
  2. We don't want it to be the primary API. We have lots of ideas for how we
    can improve the current REST API (add full transactional support, massively
    improve throughput and latency), and so we would rather go down that road.
    The batch API is a pragmatic solution to a big problem, but I don't think
    it is a good long-term path to take.
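
One way to stay under the result-size limit described in point 1 is to split a long command list into several smaller batch calls. The following is a rough sketch under assumptions: `batch_in_slices` and `EchoClient` are hypothetical names, `client` is any object with a neography-style `batch(*commands)` method, and the slice size of 500 is arbitrary. Note that `{n}` references to earlier jobs would not work across slices.

```ruby
# Hypothetical helper; not part of neography.
# Sends the commands in groups of `slice_size` and concatenates the results.
def batch_in_slices(client, commands, slice_size = 500)
  commands.each_slice(slice_size).flat_map do |slice|
    client.batch(*slice)
  end
end

# Stand-in client so the sketch runs without a Neo4j server.
class EchoClient
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Counts calls and echoes each command back as a fake result.
  def batch(*commands)
    @calls += 1
    commands.map { |command| { "body" => command } }
  end
end

client   = EchoClient.new
commands = (1..5).map { |i| [:create_node, { "n" => i }] }
results  = batch_in_slices(client, commands, 2)
puts "#{client.calls} calls, #{results.size} results"
```

Each slice is its own request, so this trades the single transaction of one batch for smaller responses.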

Why don't we go simpler?

Figure out the most commonly used batch operations, and just implement those.

  • CRUD +Index a node
  • CRUD +Index a relationship
  • ???

dnagir commented on September 2, 2024

Maybe for now just the basic way of batching will do the job, and we can update it later as more use cases come in?

Maybe just something like:

db.batch [
  :relationship => {:id => 1, :body => {same params as for relationship} },
  :node => {:id => 1, :body => {same params as for node} },
  :index => {:id => 2, :body => {same params as for index} }
]

marbemac commented on September 2, 2024

Hi guys, I'm very interested in this. Any updates on batch operations with neography? Do either of you have it working (with any of the methods mentioned in this thread)? Thanks!

maxdemarzi commented on September 2, 2024

I can whip something up this week. Do you just need inserts or what do you want it to do?

marbemac commented on September 2, 2024

That would be fantastic. The main thing I'm using it for is creating nodes and relationships. Here's how it works:

  • User submits a post that mentions a few topics (posts and topics are nodes).
  • Create the node for the post
  • Add the node to an index
  • Create 'created' relationship between user and the post
  • Add this relationship to an index
  • Create relationships between the post and every topic it mentions (each topic is already a node)
  • Add these new relationships to an index
  • Update (read and then update) a weight property on relationships between the user and the mentioned topics (indicates that the user is more 'interested' in the topics they are mentioning).

So basically creating and reading nodes, relationships, and node/relationship indexes (deleting would be nice also but is not necessary for this particular function).

Let me know if you have any questions or if I can help! Thanks again.

maxdemarzi commented on September 2, 2024

Ok... I'm going to implement this very naively and then we can talk about it and refactor it.

maxdemarzi commented on September 2, 2024

OK, it's in version 0.0.19. Take a look at the specs => https://github.com/maxdemarzi/neography/blob/master/spec/integration/rest_batch_spec.rb and the implementation starting at line 355 of https://github.com/maxdemarzi/neography/blob/master/lib/neography/rest.rb

marbemac commented on September 2, 2024

Great thanks a lot! I'll check it out over the next few days.

marbemac commented on September 2, 2024

@maxdemarzi ok it looks great. I am having an issue with create_relationship though, even though everything seems correct. I added get_node_by_index to my local neography branch and am trying to do the following batch operation:

self.neo.batch [:get_node_by_index, 'users', "uuid", node1_id], [:get_node_by_index, 'topics', "uuid", node2_id], [:create_relationship, "follow", "{0}", "{1}"], [:add_relationship_to_index, "users", "follow", "#{node1_id}-#{node2_id}", "{2}"]

get_node_by_index works fine, but the relationship is never created. Below is the body sent in the batch request. Everything looks fine to me, but I might be missing something. Maybe this is a neo4j issue??

[{"id":0,"method":"GET","to":"/index/node/users/uuid/4eb9cda1cddc7f4068000042"},{"id":1,"method":"GET","to":"/index/node/topics/uuid/4f0a2fc1cddc7f18f900031a"},{"id":2,"method":"POST","to":"{0}/relationships","body":{"to":"{1}","type":"follow","data":null}},{"id":3,"method":"POST","to":"/index/relationship/users","body":{"uri":"{2}","key":"follow","value":"4eb9cda1cddc7f4068000042-4f0a2fc1cddc7f18f900031a"}}]
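
For readers following along, the mapping from the command tuples to the job descriptions above can be sketched roughly as follows. This is an illustration only: `batch_jobs` is a hypothetical helper whose output mirrors the request body quoted above, not neography's actual implementation, and the index values are shortened placeholders.

```ruby
require 'json'

# Hypothetical helper; the mapping mirrors the request body quoted above.
# Each command becomes one job whose "id" is its position in the batch.
def batch_jobs(*commands)
  commands.each_with_index.map do |(name, *args), id|
    job =
      case name
      when :get_node_by_index
        index, key, value = args
        { "method" => "GET", "to" => "/index/node/#{index}/#{key}/#{value}" }
      when :create_relationship
        type, from, to = args
        { "method" => "POST", "to" => "#{from}/relationships",
          "body" => { "to" => to, "type" => type, "data" => nil } }
      when :add_relationship_to_index
        index, key, value, relationship = args
        { "method" => "POST", "to" => "/index/relationship/#{index}",
          "body" => { "uri" => relationship, "key" => key, "value" => value } }
      else
        raise ArgumentError, "unsupported batch command: #{name.inspect}"
      end
    { "id" => id }.merge(job)
  end
end

jobs = batch_jobs(
  [:get_node_by_index, "users", "uuid", "a1"],
  [:get_node_by_index, "topics", "uuid", "b2"],
  [:create_relationship, "follow", "{0}", "{1}"],
  [:add_relationship_to_index, "users", "follow", "a1-b2", "{2}"]
)
puts JSON.generate(jobs)
```

The `{0}`, `{1}`, and `{2}` strings are passed through untouched; resolving them is left to the server.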

maxdemarzi commented on September 2, 2024

I wonder if it doesn't create the relationship because we could end up with multiple nodes with these index requests, so it wouldn't know to apply it to one or all, or all to all (SQL CROSS JOIN). Let me look up the docs.

maxdemarzi commented on September 2, 2024

Hum... I think you can only reference things you Create, not Get.
http://docs.neo4j.org/chunked/snapshot/rest-api-batch-ops.html

"18.13.2. Refer to items created earlier in the same batch job
The batch operation API allows you to refer to the URI returned from a created resource in subsequent job descriptions, within the same batch call."

I think you'll have to do this in two calls. Get the 2 nodes in a batch, then create the relationship.

marbemac commented on September 2, 2024

Ah ok that's too bad. Ok I'll continue giving this a go and submit a pull request for a couple of additions shortly. Thanks.

marbemac commented on September 2, 2024

@maxdemarzi do you know how to get the node id of a node retrieved through an index? I can't seem to figure out how to do it from the docs, and it's not immediately apparent in the response data. I'm trying to get the two nodes via an index and then pass the node ids of those two nodes in to the create relationship call.

maxdemarzi commented on September 2, 2024

You should get an array with a hash:

[{"extensions"=>{}, "paged_traverse"=>"http://localhost:7474/db/data/node/1153/paged/traverse/{returnType}{?pageSize,leaseTime}", "self"=>"http://localhost:7474/db/data/node/1153", "property"=>"http://localhost:7474/db/data/node/1153/properties/{key}", "data"=>{}, "incoming_typed_relationships"=>"http://localhost:7474/db/data/node/1153/relationships/in/{-list|&|types}", "indexed"=>"http://localhost:7474/db/data/index/node/test_node_index/jpapfx/yvmvnwpa/1153", "outgoing_typed_relationships"=>"http://localhost:7474/db/data/node/1153/relationships/out/{-list|&|types}", "incoming_relationships"=>"http://localhost:7474/db/data/node/1153/relationships/in", "all_relationships"=>"http://localhost:7474/db/data/node/1153/relationships/all", "create_relationship"=>"http://localhost:7474/db/data/node/1153/relationships", "traverse"=>"http://localhost:7474/db/data/node/1153/traverse/{returnType}", "properties"=>"http://localhost:7474/db/data/node/1153/properties", "all_typed_relationships"=>"http://localhost:7474/db/data/node/1153/relationships/all/{-list|&|types}", "outgoing_relationships"=>"http://localhost:7474/db/data/node/1153/relationships/out"}]

Then you can just grab the id from the self URL:

result.first["self"].split('/').last
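
A slightly more defensive version of that one-liner might look like the sketch below. `node_id_from_self` is a hypothetical helper name, not part of neography; it assumes the hash has a "self" key holding a plain URL, as in the response shown above.

```ruby
require 'uri'

# Hypothetical helper; not part of neography.
# Extracts the numeric node id from the "self" URL of a node hash.
def node_id_from_self(node)
  path = URI.parse(node.fetch("self")).path # e.g. "/db/data/node/1153"
  Integer(path.split('/').last)
end

result = [{ "self" => "http://localhost:7474/db/data/node/1153", "data" => {} }]
puts node_id_from_self(result.first)
```

Using `fetch` and `Integer` means a missing "self" key or a non-numeric last segment raises an error instead of silently returning nil or a string.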

maxdemarzi commented on September 2, 2024

Marc,

Any update on those additions you wanted?

Thanks,
Max

maxdemarzi commented on September 2, 2024

Closing.
