neo4j-contrib / neo4j_doc_manager

This project forked from hannelita/neo4j_doc_manager

Doc manager for Neo4j

License: Apache License 2.0

Python 100.00%
mongodb mongodb-support neo4j importer oplog

neo4j_doc_manager's Introduction

Overview

The Neo4j Doc Manager takes MongoDB documents and makes it easy to query them for relationships by making them available in a Neo4j graph structure, following the format specified by Mongo Connector. It is intended for live one-way synchronization from MongoDB to Neo4j, where you have both databases running and take advantage of each database's strengths in your application (polyglot persistence).

Note

The software in this repository is provided AS IS, with no guarantees of any kind.

Installing

You must have Python installed in order to use this project. Python 3 is recommended.

First, install neo4j_doc_manager with pip:

pip install neo4j-doc-manager

(You might need sudo privileges).

Refer to this document for more information if you experience any difficulties installing with pip.

Using Neo4j Doc Manager

Ensure that you have a Neo4j instance up and running. If authentication is enabled for Neo4j (version 2.2+), be sure to set the NEO4J_AUTH environment variable to your user and password:

export NEO4J_AUTH=user:password
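
To make the expected format concrete, here is a minimal sketch of how a user:password value could be split. It is not the doc manager's actual parsing code; parse_neo4j_auth is a hypothetical helper introduced only for illustration.

```python
import os


def parse_neo4j_auth(value):
    """Split a 'user:password' string into a (user, password) tuple.

    Hypothetical helper for illustration. It splits on the first colon
    only, so a password that itself contains ':' is kept intact.
    """
    user, separator, password = value.partition(":")
    if not separator or not user:
        raise ValueError("NEO4J_AUTH must look like 'user:password'")
    return user, password


if __name__ == "__main__":
    raw = os.environ.get("NEO4J_AUTH")
    if raw is None:
        print("NEO4J_AUTH is not set")
    else:
        user, _password = parse_neo4j_auth(raw)
        print("Neo4j user:", user)
```

Splitting on the first colon is a design choice of this sketch; whether the doc manager treats colons in passwords the same way is not confirmed here.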

Ensure that MongoDB is running as a replica set. To initiate a replica set, start mongod with:

mongod --replSet myDevReplSet

Then open the mongo shell and run:

rs.initiate()

Please refer to Mongo Connector FAQ for more information.

Start the mongo-connector service with the following command:

mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager

-m provides the MongoDB endpoint.
-t provides the Neo4j endpoint. Be sure to specify the protocol (http).
-d specifies Neo4j Doc Manager as the doc manager.

Data synchronization

With mongo-connector running with neo4j_doc_manager, any document inserted into MongoDB is converted into a graph structure and immediately inserted into Neo4j as well. Neo4j Doc Manager turns keys into graph nodes; nested values on each key become properties.

To see this in action, consider the following document:

{
  "session": {
    "title": "12 Years of Spring: An Open Source Journey",
    "abstract": "Spring emerged as a core open source project in early 2003 and evolved to a broad portfolio of open source projects up until 2015."
  },
  "topics":  ["keynote", "spring"],
  "room": "Auditorium",
  "timeslot": "Wed 29th, 09:30-10:30",
  "speaker": {
    "name": "Juergen Hoeller",
    "bio": "Juergen Hoeller is co-founder of the Spring Framework open source project.",
    "twitter": "https://twitter.com/springjuergen",
    "picture": "http://www.springio.net/wp-content/uploads/2014/11/juergen_hoeller-220x220.jpeg"
  }
}

Insert the following document into mongo using the mongo-shell:

db.talks.insert(  { "session": { "title": "12 Years of Spring: An Open Source Journey", "abstract": "Spring emerged as a core open source project in early 2003 and evolved to a broad portfolio of open source projects up until 2015." }, "topics":  ["keynote", "spring"], "room": "Auditorium", "timeslot": "Wed 29th, 09:30-10:30", "speaker": { "name": "Juergen Hoeller", "bio": "Juergen Hoeller is co-founder of the Spring Framework open source project.", "twitter": "https://twitter.com/springjuergen", "picture": "http://www.springio.net/wp-content/uploads/2014/11/juergen_hoeller-220x220.jpeg" } } );

This document will be converted to a graph structure and immediately inserted into Neo4j.

Refer to this document for more information and examples.
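
The conversion described in this section can be sketched in plain Python. This is an illustration under stated assumptions, not the doc manager's code: document_to_graph is a hypothetical function, and the mapping it uses (root node labelled with the collection name, one node per key holding a nested document, scalars and lists stored as properties) is an assumed reading of the description above.

```python
def document_to_graph(collection, doc):
    """Return (nodes, relationships) for one MongoDB-style document.

    Assumed mapping, for illustration only:
      * the document itself becomes a root node labelled with the collection
      * each key holding a nested document becomes its own node
      * scalar and list values become properties of the enclosing node
    """
    nodes = []
    relationships = []

    def visit(label, value):
        properties = {}
        children = []
        for key, item in value.items():
            if isinstance(item, dict):
                children.append((key, item))
            else:
                properties[key] = item
        nodes.append({"label": label, "properties": properties})
        for key, item in children:
            # Link the enclosing node to the node created for this key.
            relationships.append((label, key))
            visit(key, item)

    visit(collection, doc)
    return nodes, relationships


talk = {
    "session": {"title": "12 Years of Spring: An Open Source Journey"},
    "topics": ["keynote", "spring"],
    "room": "Auditorium",
    "speaker": {"name": "Juergen Hoeller"},
}

nodes, relationships = document_to_graph("talks", talk)
for node in nodes:
    print(node["label"], sorted(node["properties"]))
print(relationships)
```

With the shortened talk document used here, the sketch yields a talks node plus session and speaker nodes, each linked from talks.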

neo4j_doc_manager's People

Contributors

hannelita, jexp, johnymontana, neo4j-oss-build, ryguyrg

neo4j_doc_manager's Issues

Specify mongo_connector as dependency in setup.py

Current installation process as described in README is to install mongo-connector with pip before installing neo4j_doc_manager. Is there a reason mongo-connector is not included as a dependency in setup.py install_requires? This would remove the requirement to first install mongo-connector with pip.

bulk upsert fails if oplog.timestamp is deleted

In the official documentation, it says that deleting the timestamp should be possible. But if I start neo4j doc manager once, stop it, delete the timestamp, and start it again, I get the following error:

mongo-connector -m $MONGODB -t $NEO4JDB -d $NEO4JDOCMANAGER


 2016-07-08 07:30:53,976 [CRITICAL] mongo_connector.oplog_manager:625 - Exception during collection dump
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/site-packages/mongo_connector/util.py", line 32, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.4/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 89, in bulk_upsert
    tx.commit()
  File "/usr/local/lib/python3.4/site-packages/py2neo/cypher/core.py", line 333, in commit
    return self.post(self.__commit or self.__begin_commit)
  File "/usr/local/lib/python3.4/site-packages/py2neo/cypher/core.py", line 288, in post
    raise self.error_class.hydrate(error)
py2neo.cypher.error.schema.ConstraintViolation: Node 0 already exists with label Test and property "_id"=[577ccc5e39a414a3d7d17171]

transaction block size during bulk upsert

How does bulk upsert in neo4j_doc_manager make sure that a transaction block doesn't contain more than 1000 transactions? I'm using Neo4j with MongoDB, and during the dump process the bulk upsert of neo4j_doc_manager keeps pushing transactions and tries to commit far more than 1000 of them, which ultimately raises a server timeout or bad status exception.
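
One way to bound the size of each commit is to split the document stream into fixed-size batches. The sketch below shows only that batching idea; batched and bulk_upsert_in_batches are hypothetical names, the commit callable is a placeholder for whatever opens and commits a transaction, and the default of 1000 is taken from the question above rather than from any documented limit.

```python
from itertools import islice


def batched(iterable, size=1000):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def bulk_upsert_in_batches(docs, commit, size=1000):
    """Call `commit(batch)` once per batch and return the number of commits.

    `commit` is a placeholder: in a real doc manager it would open a
    transaction, add the work for every document in the batch, and commit.
    """
    commits = 0
    for batch in batched(docs, size):
        commit(batch)
        commits += 1
    return commits


sizes = []
total_commits = bulk_upsert_in_batches(range(2500), lambda b: sizes.append(len(b)))
print(total_commits, sizes)
```

Because batched consumes the input lazily, the whole collection dump never has to be held in memory at once.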

Heroku deploy help

This is just a general help question. I'm trying to deploy this onto Heroku but am having no success. I'm pretty sure the app is fine because I tested the same settings locally and it worked. However, once I deploy it to Heroku I get this:

2017-09-13T18:48:01.306884+00:00 heroku[worker.1]: State changed from starting to up
2017-09-13T18:48:04.125965+00:00 app[worker.1]: Logging to /app/mongo-connector.log.
2017-09-13T18:48:07.626384+00:00 heroku[worker.1]: State changed from up to crashed
2017-09-13T18:48:07.610283+00:00 heroku[worker.1]: Process exited with status 0

I'm not sure whether this is related to the app or to Heroku. I think I'm just missing a small configuration setting. I'm hoping someone can help me with this.

cannot create a pipeline between MongoDB and NEO4j

After typing "mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager" I get the following error:
Exception in thread Thread-2:
Traceback (most recent call last):
File "/Users/gauravvashisth/anaconda/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/Users/gauravvashisth/anaconda/lib/python3.5/site-packages/mongo_connector/util.py", line 104, in wrapped
func(*args, **kwargs)
File "/Users/gauravvashisth/anaconda/lib/python3.5/site-packages/mongo_connector/oplog_manager.py", line 267, in run
docman.upsert(doc, ns, timestamp)
File "/Users/gauravvashisth/anaconda/lib/python3.5/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 66, in upsert
tx.commit()
File "/Users/gauravvashisth/anaconda/lib/python3.5/site-packages/py2neo/cypher/core.py", line 306, in commit
return self.post(self.__commit or self.__begin_commit)
File "/Users/gauravvashisth/anaconda/lib/python3.5/site-packages/py2neo/cypher/core.py", line 261, in post
raise self.error_class.hydrate(error)
File "/Users/gauravvashisth/anaconda/lib/python3.5/site-packages/py2neo/cypher/error/core.py", line 54, in hydrate
error_cls = getattr(error_module, title)
AttributeError: module 'py2neo.cypher.error.schema' has no attribute 'ConstraintValidationFailed'

I have installed py2neo 2.0.7

Error in sync with neo4j while updating mongodb

Hi,

I'm encountering this error while I'm updating fields in mongo.

Traceback (most recent call last):
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/site-packages/mongo_connector/util.py", line 107, in wrapped
    func(*args, **kwargs)
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/site-packages/mongo_connector/oplog_manager.py", line 298, in run
    entry["o2"]["_id"], entry["o"], ns, timestamp
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 81, in update
    tx.commit()
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/site-packages/py2neo/cypher/core.py", line 333, in commit
    return self.post(self.__commit or self.__begin_commit)
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/site-packages/py2neo/cypher/core.py", line 288, in post
    raise self.error_class.hydrate(error)
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/site-packages/py2neo/cypher/error/core.py", line 54, in hydrate
    error_cls = getattr(error_module, title)
AttributeError: module 'py2neo.cypher.error.statement' has no attribute 'SyntaxError'

mongo-connector waits indefinitely after "Logging to mongo-connector.log."

Hi,
I have installed Neo4j Doc Manager as described in the documentation. When I try to sync my MongoDB data using the command below, it waits indefinitely:

Python35-32>mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager
Logging to mongo-connector.log.

The content of mongo-connector.log is as follows:

2016-02-26 19:10:11,809 [ERROR] mongo_connector.doc_managers.neo4j_doc_manager:70 - Bulk

The content of oplog.timestamp is as follows:

["Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, replicaset='myDevReplSet'), 'local'), 'oplog.rs')", 6255589333701492738]

Please help ASAP..
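
For anyone inspecting a checkpoint like the one above, the large integer can be decoded into a readable time. This is a sketch that assumes the usual BSON timestamp layout (high 32 bits are seconds since the Unix epoch, low 32 bits are an ordinal within that second); split_oplog_long and read_checkpoint are hypothetical helpers, and the collection name in the sample string is shortened for readability.

```python
import json
from datetime import datetime, timezone


def split_oplog_long(value):
    """Split a 64-bit oplog timestamp into (seconds, increment).

    Assumed layout: high 32 bits = seconds since the Unix epoch,
    low 32 bits = an ordinal within that second.
    """
    return value >> 32, value & 0xFFFFFFFF


def read_checkpoint(text):
    """Parse the JSON content of oplog.timestamp into (name, seconds, increment)."""
    name, value = json.loads(text)
    seconds, increment = split_oplog_long(value)
    return name, seconds, increment


# The first element is shortened here; the integer is the one reported above.
content = '["oplog.rs", 6255589333701492738]'
_name, seconds, increment = read_checkpoint(content)
print(datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat(), increment)
```

If the decoded time is close to the moment the connector was last started, the checkpoint file itself is probably fine and the cause of the hang lies elsewhere.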

Create relationship when subdocument contains _id property

Currently a naming convention of xxx_id (where xxx is a collection name) is used to identify relationships across collections. For example, consider a document in the customers collection:

{
   "_id": ObjectId(1234),
   "name": "Bob Loblaw",
   "purchases": [
      {
         "products_id": ObjectId(xxx),
         "quantity": 2,
         "date": "03/23/2015"
      }
   ]
}

This would create a relationship to a node representing a document from the products collection because we have a property key of products_id.

However, we should also support creating relationships from subdocuments where the key _id is used. For example, we should observe the same behavior with this document:

{
   "_id": ObjectId(1234),
   "name": "Bob Loblaw",
   "purchases": [
      {
         "_id": ObjectId(xxx),
         "quantity": 2,
         "date": "03/23/2015"
      }
   ]
}
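
The two cases can be sketched as a small key-matching routine. This is not the doc manager's implementation. The functions referenced_collection and find_references are hypothetical, and the rule used for a bare _id key (point at the collection named by the enclosing key) is purely an assumption, since the proposal above does not say which collection such a key should resolve to.

```python
def referenced_collection(key, parent_key=None):
    """Return the collection name a key points at, or None.

    * 'products_id' -> 'products' (the existing xxx_id convention)
    * '_id' inside a subdocument -> the enclosing key (assumed rule,
      chosen only to make the sketch concrete)
    """
    if key == "_id":
        return parent_key
    if key.endswith("_id") and len(key) > len("_id"):
        return key[: -len("_id")]
    return None


def find_references(doc):
    """List (collection, id) pairs found in the subdocuments of `doc`."""
    found = []
    for parent_key, value in doc.items():
        items = value if isinstance(value, list) else [value]
        for item in items:
            if not isinstance(item, dict):
                continue  # top-level scalars such as the document's own _id
            for key, ref in item.items():
                target = referenced_collection(key, parent_key)
                if target is not None:
                    found.append((target, ref))
    return found


by_convention = {"_id": "1234", "purchases": [{"products_id": "xxx", "quantity": 2}]}
by_bare_id = {"_id": "1234", "purchases": [{"_id": "xxx", "quantity": 2}]}
print(find_references(by_convention))
print(find_references(by_bare_id))
```

Plain strings stand in for ObjectId values so the example runs without pymongo.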

Prevent automatic sync with Mongo upon startup

This is a really great tool, and it potentially has a tremendous future. I am having a problem though, and I am not sure what is going on. Whenever I startup the doc manager, it starts mass inserting data into neo4j automatically. I have run a series of insert commands in mongo in the past, but nothing is currently getting inserted into mongo. I even deleted everything from mongo, and it still inserts all of this data into neo4j upon startup. Is it intended that the doc manager should auto sync the neo4j database with the existing mongo database (even if there is nothing in the mongo database)? If so, can this functionality be disabled? It is really annoying and I will not be able to use this tool unless there is some way to turn off this feature, or if I can delete insertion history from the mongo database. Thanks for your help!

py2neo 5.0b1 and 4.1.1: is there a quick fix, or should I go back to an older py2neo version? Doc manager connection issues (customer waiting)

leveridge@leveridge-PowerEdge-R710:~$ py2neo version
5.0b1
leveridge@leveridge-PowerEdge-R710:~$ mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager
Traceback (most recent call last):
  File "/home/leveridge/.local/lib/python3.6/site-packages/mongo_connector/connector.py", line 1098, in import_dm_by_path
    module = __import__(package, fromlist=(package,))
  File "/home/leveridge/.local/lib/python3.6/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 16, in <module>
    from py2neo import Graph, authenticate
  File "/home/leveridge/.local/lib/python3.6/site-packages/py2neo/__init__.py", line 19, in <module>
    from py2neo.data import *
  File "/home/leveridge/.local/lib/python3.6/site-packages/py2neo/data/__init__.py", line 25, in <module>
    from py2neo.collections import is_collection, SetView, PropertyDict
ImportError: cannot import name 'PropertyDict'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/leveridge/.local/bin/mongo-connector", line 11, in <module>
sys.exit(main())
File "/home/leveridge/.local/lib/python3.6/site-packages/mongo_connector/util.py", line 107, in wrapped
func(*args, **kwargs)
File "/home/leveridge/.local/lib/python3.6/site-packages/mongo_connector/connector.py", line 1409, in main
conf.parse_args()
File "/home/leveridge/.local/lib/python3.6/site-packages/mongo_connector/config.py", line 125, in parse_args
option, dict((k, values.get(k)) for k in option.cli_names)
File "/home/leveridge/.local/lib/python3.6/site-packages/mongo_connector/connector.py", line 1122, in apply_doc_managers
DocManager = import_dm_by_name(dm["docManager"])
File "/home/leveridge/.local/lib/python3.6/site-packages/mongo_connector/connector.py", line 1089, in import_dm_by_name
return import_dm_by_path(full_name)
File "/home/leveridge/.local/lib/python3.6/site-packages/mongo_connector/connector.py", line 1109, in import_dm_by_path
"vailable doc managers. ImportError:\n%s" % (package, exc)
mongo_connector.errors.InvalidConfiguration: Could not import mongo_connector.doc_managers.neo4j_doc_manager. It could be that this doc manager has been moved out of this project and is maintained elsewhere. Make sure that you have the doc manager installed alongside mongo-connector. Check the README for a list of available doc managers. ImportError:
cannot import name 'PropertyDict'

Allow attributes to be `$unset`/assigned directly in updates

In a MongoDB update specification, there can be three different ways to update a field within a document:

  1. Using the $set operator, e.g. {"$set": {"field": "value"}} (sets "field" to "value")
  2. Using the $unset operator, e.g. {"$unset": {"field": true}} (causes this field to be removed)
  3. Direct assignment, e.g. {"field": "value"} (basically the same as (1) above).

It looks like the update method might only handle the first case.

Honestly, I haven't worked with neo4j before, so I don't know what the syntax is for updating a document; maybe there isn't even a notion of un-setting attributes. If that's the case, then ignore this.

Otherwise, you might want to look into grabbing values from $unset and checking for direct assignments contained in update specs pulled from the MongoDB oplog. This can get a bit hairy to do correctly, so most DocManagers packaged with mongo-connector use the apply_update method for this purpose (this method assumes the document is retrieved from the remote end as a Python dict). Here's what Solr does (document is retrieved from the remote end as a flattened Python dict): https://github.com/mongodb-labs/mongo-connector/blob/master/mongo_connector/doc_managers/solr_doc_manager.py#L191-L222
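
As a rough picture of what such a method has to do, here is a minimal pure-Python sketch. It is not mongo-connector's apply_update: the function below is a simplified stand-in that ignores dotted paths and every operator other than $set and $unset, and it treats a spec without operators as a replacement document that keeps the original _id, which is an assumption of this sketch.

```python
import copy


def apply_update(doc, update_spec):
    """Return a new document with a MongoDB-style update spec applied.

    Simplified stand-in: handles top-level $set and $unset, and treats an
    operator-free spec as a replacement document. Dotted paths and other
    operators are deliberately left out.
    """
    if "$set" not in update_spec and "$unset" not in update_spec:
        # Direct assignment: the spec becomes the document, keeping _id.
        replaced = copy.deepcopy(update_spec)
        if "_id" in doc:
            replaced.setdefault("_id", doc["_id"])
        return replaced

    updated = copy.deepcopy(doc)
    for field, value in update_spec.get("$set", {}).items():
        updated[field] = copy.deepcopy(value)
    for field in update_spec.get("$unset", {}):
        updated.pop(field, None)
    return updated


original = {"_id": 1, "name": "Bob", "age": 30}
print(apply_update(original, {"$set": {"age": 31}}))
print(apply_update(original, {"$unset": {"name": True}}))
print(apply_update(original, {"name": "Alice"}))
```

Working on a deep copy keeps the caller's document untouched, which makes the function easy to test against the kinds of update scenarios linked below.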

Another strategy is to grab the latest version of the document from MongoDB, using its _id. This doesn't really have the same effect as applying the update, since the current state of the document in MongoDB may not be the expected state of the document given how much oplog has been applied. But perhaps this can be considered an approximation that's "good enough."

You can borrow some test cases from here to inspect what this DocManager does in various scenarios with updates: https://github.com/mongodb-labs/mongo-connector/blob/master/tests/test_elastic.py#L164-L205

Should I use pip or pip3 to install mongo-connector? I have Python 2.7 and 3.6 on Ubuntu 18.04

When I install with pip, it says it successfully installed mongo connector 2.7. I am hoping it is using Python 3.6 and not 2.7.

Please advise if any details are missing from your instructions on the Python version and environment setup (paths) needed to ensure that only Python 3 libraries are used. I think 3.6 uses a lot of Python 2 libraries anyway, though.

Thoughts?
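
One general way to remove the ambiguity is to invoke pip through a specific interpreter instead of relying on whichever pip is first on the PATH. The sketch below only builds and prints such a command; pip_install_command is a hypothetical helper, and nothing is installed unless you run the printed command yourself.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Python version:", sys.version.split()[0])
    command = pip_install_command("neo4j-doc-manager")
    print("Would run:", " ".join(command))
```

Running this script with python3 prints a command that targets the Python 3 interpreter, so the package lands in that interpreter's site-packages.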

AttributeError: 'Graph' object has no attribute 'cypher'

I am running neo4j_doc_manager with
mongo-connector -m 192.168.1.188:27017 -t http://127.0.0.1:7474/db/uadb -d neo4j_doc_manager
and this error is thrown:

2016-03-23 17:50:35,161 [ERROR] mongo_connector.doc_managers.neo4j_doc_manager:70 - Bulk
2016-03-23 17:50:35,161 [ERROR] mongo_connector.doc_managers.neo4j_doc_manager:70 - Bulk
2016-03-23 17:50:35,161 [ERROR] mongo_connector.doc_managers.neo4j_doc_manager:70 - Bulk
2016-03-23 17:50:35,161 [ERROR] mongo_connector.doc_managers.neo4j_doc_manager:70 - Bulk
2016-03-23 18:00:58,046 [ERROR] mongo_connector.util:87 - Fatal Exception
Traceback (most recent call last):
  File "e:\pythonworkspace\venv\neo4j.venv\lib\site-packages\mongo_connector\util.py", line 85, in wrapped
    func(*args, **kwargs)
  File "e:\pythonworkspace\venv\neo4j.venv\lib\site-packages\mongo_connector\oplog_manager.py", line 268, in ru
    ns, timestamp)
  File "e:\pythonworkspace\venv\neo4j.venv\lib\site-packages\mongo_connector\doc_managers\neo4j_doc_manager.py", line 74, in update
    tx = self.graph.cypher.begin()
AttributeError: 'Graph' object has no attribute 'cypher'

Is there any other library I have to install?
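
The traceback shows the doc manager calling graph.cypher.begin(), and other reports in this tracker ran py2neo 2.0.7 and 2.0.8. A plausible but unconfirmed reading is that the installed py2neo comes from a different release line. The sketch below checks for that; major_minor and looks_compatible are hypothetical helpers, and the expected release line (2, 0) is an assumption drawn from those reports, not a documented requirement.

```python
def major_minor(version):
    """Extract (major, minor) from strings such as '2.0.8' or '5.0b1'."""
    parts = []
    for chunk in version.split(".")[:2]:
        digits = ""
        for char in chunk:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits or 0))
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def looks_compatible(version, expected=(2, 0)):
    """True if `version` is in the release line this sketch assumes."""
    return major_minor(version) == expected


if __name__ == "__main__":
    from importlib import metadata  # available in Python 3.8+

    try:
        installed = metadata.version("py2neo")
    except metadata.PackageNotFoundError:
        print("py2neo is not installed")
    else:
        print(installed, "compatible:", looks_compatible(installed))
```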

Add license

What license should this be released under?

Expand on AS-IS / alpha state language in README

Unicode support in python 2.7 - UnicodeEncodeError during insert

2015-10-08 15:10:24,110 [CRITICAL] mongo_connector.oplog_manager:543 - Exception during collection dump
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/mongo_connector/oplog_manager.py", line 495, in do_dump
    upsert_all(dm)
  File "/Library/Python/2.7/site-packages/mongo_connector/oplog_manager.py", line 479, in upsert_all
    dm.bulk_upsert(docs_to_dump(namespace), mapped_ns, long_ts)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/neo4j_doc_manager.py", line 78, in bulk_upsert
    builder = NodesAndRelationshipsBuilder(doc, doc_type, doc_id)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 17, in __init__
    self.build_nodes_query(doc_type, doc, doc_id)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 30, in build_nodes_query
    self.build_nodes_query(key, document[key], id)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 29, in build_nodes_query
    self.build_relationships_query(doc_type, key, id, id)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 82, in build_relationships_query
    statement = "MATCH (a:`{main_type}`), (b:`{node_type}`) WHERE a._id={{doc_id}} AND b._id ={{explicit_id}} CREATE (a)-[r:`{relationship_type}`]->(b)".format(main_type=main_type, node_type=node_type, relationship_type=relationship_type)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xeb' in position 16: ordinal not in range(128)

Unicode handling is built into Python 3.4, so this error only occurs with Python 2.7.

To reproduce, use Python 2.7 with mongo-connector running, then use mongoimport to import this collection dump.

Indexing the nodes without me wanting it

The problem is that the document I first push to MongoDB contains an array, e.g.:
'Person' :[{name: 'A', age: 'B'},{name: 'C', age: 'D'}]

It looks fine in MongoDB, but in my Neo4j graph the nodes are not named 'Person'.
Instead they are indexed, so their names are 'Person0' and 'Person1'.

I've tried to find a way to fix this but can't find any. Is this a built-in feature, or am I able to change it in any way?

neo4j_doc_manager crashed while creating constraint

I checked the log from when neo4j_doc_manager crashed and found something odd: when it creates a constraint, it reconnects and then times out. I don't know whether the Cypher is causing the problem or something else. Please help, thanks!

Could not import mongo_connector.doc_managers.neo4j_doc_manager.

Hi,
when I run the following command:
mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager

displays the following error:
No handlers could be found for logger "mongo_connector.util"
Traceback (most recent call last):
File "/usr/local/bin/mongo-connector", line 9, in <module>
load_entry_point('mongo-connector==2.3', 'console_scripts', 'mongo-connector')()
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3-py2.7.egg/mongo_connector/util.py", line 85, in wrapped
func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3-py2.7.egg/mongo_connector/connector.py", line 1041, in main
conf.parse_args()
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3-py2.7.egg/mongo_connector/config.py", line 118, in parse_args
option, dict((k, values.get(k)) for k in option.cli_names))
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3-py2.7.egg/mongo_connector/connector.py", line 824, in apply_doc_managers
module = import_dm_by_name(dm['docManager'])
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3-py2.7.egg/mongo_connector/connector.py", line 814, in import_dm_by_name
"vailable doc managers." % full_name)
mongo_connector.errors.InvalidConfiguration: Could not import mongo_connector.doc_managers.neo4j_doc_manager. It could be that this doc manager has been moved out of this project and is maintained elsewhere. Make sure that you have the doc manager installed alongside mongo-connector. Check the README for a list of available doc managers.

InvalidSyntax: Invalid input '{': expected whitespace, comment or a label name

I am using mongo-connector to do the initial bulk_upsert operation between MongoDB and Neo4J. At some point while querying with py2neo, the InvalidSyntax exception is occurring due to which nothing is being inserted into graph database. I believe the issue lies somewhere in the DocManager during syntax translations (which is why I'm raising the issue here). I am running py2neo v2.0.8 and Neo4J v2.3.1.
Here is the detailed stack trace:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "//anaconda/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "//anaconda/lib/python2.7/site-packages/mongo_connector/util.py", line 85, in wrapped
    func(*args, **kwargs)
  File "//anaconda/lib/python2.7/site-packages/mongo_connector/oplog_manager.py", line 256, in run
    docman.upsert(doc, ns, timestamp)
  File "//anaconda/lib/python2.7/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 66, in upsert
    tx.commit()
  File "//anaconda/lib/python2.7/site-packages/py2neo/cypher/core.py", line 333, in commit
    return self.post(self.__commit or self.__begin_commit)
  File "//anaconda/lib/python2.7/site-packages/py2neo/cypher/core.py", line 288, in post
    raise self.error_class.hydrate(error)
InvalidSyntax: Invalid input '{': expected whitespace, comment or a label name (line 1, column 20 (offset: 19))
"MERGE (d:Document: { _id: {parameters}._id})"

This is the first time I'm raising an issue on Git so please go easy on me :)

Cannot pass --replSet when starting MongoDB 4.4 with sudo systemctl start mongod (invalid flag)

I also tried setting replication.replSetName to "rs0" in the mongod.conf file. MongoDB started, but I could not even list the databases in the mongo shell, although rs0 seemed to be there.

Everything else is installed and Neo4j Desktop is running; I am just not sure whether MongoDB is happy. We cannot reproduce your commands because we have to use sudo systemctl start mongod to start MongoDB.

Ubuntu 18.04

Any clues? Thanks a million.

Handle ObjectId property value (and arrays of ObjectIds) properly

Originally reported on StackOverflow here:

2016-08-02 18:43:07,881 [ERROR] mongo_connector.oplog_manager:282 - Unable to process oplog document {u'h': 82402292097390737L, u'ts': Timestamp(1470143587, 1), u'o': {u'deviceName': u'iphone', u'countryCode': u'+91', u'degreeIds': [ObjectId('56f22b6b3ec80d233fb0f45d')], u'userType': u'DOCTOR', u'practiceIds': [ObjectId('576566288006599059170496')], u'noShowCount': 0, u'timezone': u'asia/kolkata', u'categoryIds': [], u'originalProfilePicture': u'http://bucketname.s3.amazonaws.com/doctor/profilePicture/Profile_57a0986720820b161c932287.png?timestamp=1470142568822', u'patientsRated': 0, u'collegeIds': [ObjectId('56f22b573ec80d233fb0f45c')], u'loginCount': 1, u'clinics': [], u'blockedByDoctors': [], u'phoneNumber': u'1113040410', u'blockedDoctors': [], u'doctorRefNum': 1600001722L, u'deviceType': u'IOS', u'totalRating': 0, u'email': u'[email protected]', u'registrationDate': datetime.datetime(2016, 8, 2, 12, 56, 8, 765000), u'thumbProfilePicture': u'http://bucketname.s3.amazonaws.com/doctor/profilePicture/thumb/Thumb_Profile_57a0986720820b161c932287.png?timestamp=1470142568822', u'profileStatus': u'PROFILE_DETAILS', u'consultationCharges': 200, u'usersRated': 0, u'testIds': [], u'servingLocations': [ObjectId('56f22bcaf9a8f73a3f9e99aa'), ObjectId('576647ff7777044e7387d4b5')], u'fullName': u'ER. FULL NAME', u'specialityIds': [ObjectId('5765662880065990591703ef'), ObjectId('576566288006599059170416'), ObjectId('57656628800659905917041d')], u'_id': ObjectId('57a0986720820b161c932287')}, u't': 36L, u'v': 2, u'ns': u'clinic_world_local.doctors', u'o2': {u'_id': ObjectId('57a0986720820b161c932287')}, u'op': u'u'}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/mongo_connector/oplog_manager.py", line 268, in run
    ns, timestamp)
  File "/usr/local/lib/python2.7/site-packages/mongo_connector/util.py", line 38, in wrapped
    reraise(new_type, exc_value, exc_tb)
  File "/usr/local/lib/python2.7/site-packages/mongo_connector/util.py", line 32, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 101, in update
    tx.commit()
  File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/core.py", line 306, in commit
    return self.post(self.__commit or self.__begin_commit)
  File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/core.py", line 248, in post
    rs = resource.post({"statements": self.statements})
  File "/usr/local/lib/python2.7/site-packages/py2neo/core.py", line 307, in post
    response = self.__base.post(body, headers, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/py2neo/packages/httpstream/http.py", line 983, in post
    rq = Request("POST", self.uri, body, headers)
  File "/usr/local/lib/python2.7/site-packages/py2neo/packages/httpstream/http.py", line 382, in __init__
    self.__body = json.dumps(body, cls=JSONEncoder, separators=",:")
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 251, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/lib/python2.7/site-packages/py2neo/packages/httpstream/jsonencoder.py", line 37, in default
    return json.JSONEncoder.default(self, obj)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
Neo4jOperationFailed: ObjectId('5765662880065990591703ef') is not JSON serializable

ObjectId values stored as properties are not handled properly by neo4j_doc_manager. When an ObjectId property value is encountered, neo4j_doc_manager should:

  1. convert the ObjectId to its string representation and store as a property on the Node in Neo4j
  2. create a relationship to the node referenced by the ObjectId
  3. do the above for arrays of ObjectIds
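
Steps 1 and 3 can be sketched as a recursive conversion that runs before the document is serialized. This is an illustration, not the doc manager's code: stringify_ids is a hypothetical helper, and FakeObjectId is a stand-in for bson.ObjectId so the example runs without pymongo installed. Step 2 (creating the relationship) is not covered here.

```python
import json


class FakeObjectId:
    """Stand-in for bson.ObjectId so the sketch runs without pymongo."""

    def __init__(self, hex_string):
        self._hex = hex_string

    def __str__(self):
        return self._hex


def stringify_ids(value, id_types=(FakeObjectId,)):
    """Recursively replace id objects (also inside lists) with strings."""
    if isinstance(value, id_types):
        return str(value)
    if isinstance(value, dict):
        return {key: stringify_ids(item, id_types) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [stringify_ids(item, id_types) for item in value]
    return value


doc = {
    "_id": FakeObjectId("57a0986720820b161c932287"),
    "degreeIds": [FakeObjectId("56f22b6b3ec80d233fb0f45d")],
    "userType": "DOCTOR",
}
clean = stringify_ids(doc)
print(json.dumps(clean, sort_keys=True))
```

With pymongo available, passing id_types=(bson.ObjectId,) should apply the same conversion to real ObjectId values.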

Neo4jOperationFailed: Object of type 'ObjectId' is not JSON serializable

Hi guys,

I get this error while mongo-connector is duplicating data. It works for almost all my data, but it throws an error for this one. Do you have any idea how to fix it?

2017-01-09 11:38:52,049 [ALWAYS] mongo_connector.connector:52 - Starting mongo-connector version: 2.5.0
2017-01-09 11:38:52,050 [ALWAYS] mongo_connector.connector:52 - Python version: 3.6.0 (default, Dec 28 2016, 23:09:0
[GCC 4.9.2]
2017-01-09 11:38:52,051 [ALWAYS] mongo_connector.connector:52 - Platform: Linux-4.4.0-57-generic-x86_64-with-debian-
2017-01-09 11:38:52,051 [ALWAYS] mongo_connector.connector:52 - pymongo version: 3.4.0
2017-01-09 11:38:52,643 [ALWAYS] mongo_connector.connector:52 - Source MongoDB version: 3.4.1
2017-01-09 11:38:52,643 [ALWAYS] mongo_connector.connector:52 - Target DocManager: mongo_connector.doc_managers.neo4
2017-01-09 11:45:13,278 [ERROR] mongo_connector.oplog_manager:288 - Unable to process oplog document {'ts': Timestam
c25876641d4c705a94')}, 'o': {'$set': {'friends.accepted': [{'dateAccepted': datetime.datetime(2017, 1, 9, 11, , 45, 13, 833000), '_id': ObjectId('587377c95876641d4c705a97'), 'users_id': ObjectId('587377c95876641d4c705a95')}]}}}
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/mongo_connector/util.py", line 33, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 101, in upda
    tx.commit()
  File "/usr/local/lib/python3.6/site-packages/py2neo/cypher/core.py", line 333, in commit
    return self.post(self.__commit or self.__begin_commit)
  File "/usr/local/lib/python3.6/site-packages/py2neo/cypher/core.py", line 275, in post
    rs = resource.post({"statements": self.statements})
  File "/usr/local/lib/python3.6/site-packages/py2neo/core.py", line 307, in post
    response = self.__base.post(body, headers, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/py2neo/packages/httpstream/http.py", line 983, in post
    rq = Request("POST", self.uri, body, headers)
  File "/usr/local/lib/python3.6/site-packages/py2neo/packages/httpstream/http.py", line 382, in __init__
    self.__body = json.dumps(body, cls=JSONEncoder, separators=",:")
  File "/usr/local/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/lib/python3.6/site-packages/py2neo/packages/httpstream/jsonencoder.py", line 37, in default
    return json.JSONEncoder.default(self, obj)
  File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'ObjectId' is not JSON serializable
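The last frame shows `json.dumps` rejecting an `ObjectId` nested inside the `$set` payload. One possible workaround, sketched here under assumptions, is to convert such values to strings before they reach the JSON encoder. The helper name `to_json_safe` and the `FakeObjectId` stand-in are hypothetical and are not part of mongo-connector, py2neo, or this doc manager.

```python
import datetime
import json


def to_json_safe(value):
    """Recursively replace values that json.dumps cannot encode."""
    if isinstance(value, dict):
        return {str(k): to_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(v) for v in value]
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    # Anything else (for example bson.ObjectId) falls back to its string form.
    return str(value)


class FakeObjectId(object):
    """Stand-in for bson.ObjectId so the sketch runs without pymongo."""

    def __init__(self, hex_string):
        self.hex_string = hex_string

    def __str__(self):
        return self.hex_string


update = {"friends.accepted": [{"users_id": FakeObjectId("587377c95876641d4c705a95")}]}
print(json.dumps(to_json_safe(update)))
```

Where such a conversion would belong in the doc manager (for example, before the Cypher parameters are built) is a design decision this sketch does not settle.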

neo_doc_manager fails with py2neo 1.6

I have Python 2.6 and can't upgrade it because of my work environment, so I must stay on 2.6.
I'm trying to work with neo4j_doc_manager, which requires installing py2neo.
On Python 2.6 I can't install the latest version of py2neo, so I installed py2neo 1.6.

When I try to make a connection:
mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager

I get this error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-6AsNjN/py2neo/setup.py", line 29, in <module>
from py2neo import __author__, __email__, __license__, __package__, __version__
File "py2neo/__init__.py", line 27, in <module>
from py2neo.core import *
File "py2neo/core.py", line 1414
new_inst.__stale.update({"labels", "properties"})
^
SyntaxError: invalid syntax

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-6AsNjN/py2ne

Is there a solution for this?
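The line quoted in the traceback uses a set literal, `{"labels", "properties"}`. Set literals were added in Python 2.7, so a Python 2.6 interpreter cannot parse that file at all, which would explain the `SyntaxError` during `setup.py egg_info`. The snippet below only illustrates the two spellings; the variable names are made up, and it does not make that py2neo release work on 2.6.

```python
# Set literal syntax, as used on the failing line; Python 2.6 cannot parse it.
stale_literal = {"labels", "properties"}

# Equivalent form that also parses on Python 2.6.
stale_portable = set(["labels", "properties"])

print(stale_literal == stale_portable)
```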

Implement bulk indexing or leave it out

The bulk_upsert method doesn't currently do anything. If the DocManager doesn't support bulk inserts, this method should be left unimplemented so that the default bulk_upsert method is used (it just performs upserts serially). This method is always called during a collection dump, so I suspect that this DocManager currently won't synchronize documents that are already in MongoDB before it starts tailing the oplog, though I haven't actually run this code.

Maybe this method is still a WIP?

There are also a few other methods that should be implemented, but aren't yet (maybe they're all WIPs):

  • search
  • commit
  • get_last_doc

https://github.com/neo4j-contrib/neo4j_doc_manager/blob/master/mongo_connector/doc_managers/neo4j_doc_manager.py#L100-L108

Sorry to annoy you if you're already working on implementing these; just don't want this to slip through the cracks!
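To make the concern concrete, here is a minimal sketch of the two situations described above. `DocManagerBaseSketch` is a hypothetical stand-in for the base class that doc managers extend, written only from the description in this issue (a default `bulk_upsert` that upserts serially); the other class names are invented for illustration.

```python
class DocManagerBaseSketch(object):
    """Hypothetical stand-in for the base class a doc manager extends."""

    def upsert(self, doc, namespace, timestamp):
        raise NotImplementedError

    def bulk_upsert(self, docs, namespace, timestamp):
        # Serial fallback: one upsert per document.
        for doc in docs:
            self.upsert(doc, namespace, timestamp)


class NoOpBulkManager(DocManagerBaseSketch):
    def __init__(self):
        self.written = []

    def upsert(self, doc, namespace, timestamp):
        self.written.append(doc["_id"])

    def bulk_upsert(self, docs, namespace, timestamp):
        # An empty override silently drops everything sent during a dump.
        pass


class InheritingManager(DocManagerBaseSketch):
    def __init__(self):
        self.written = []

    def upsert(self, doc, namespace, timestamp):
        self.written.append(doc["_id"])


docs = [{"_id": 1}, {"_id": 2}]
noop, inheriting = NoOpBulkManager(), InheritingManager()
noop.bulk_upsert(docs, "catalog.course", 0)
inheriting.bulk_upsert(docs, "catalog.course", 0)
print(noop.written, inheriting.written)
```

Only the manager that leaves `bulk_upsert` alone ends up writing the documents, which is the behaviour the issue asks for.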

AttributeError on document insertion

When importing documents mongo connector crashed with this error:

mongo-connector -v -m localhost:27017 -t http://localhost:7474/db/data -n catalog.course,catalog.category,catalog.university,catalog.instructor,catalog.student,catalog.course_taken -d neo4j_doc_manager
Logging to mongo-connector.log.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/Library/Python/2.7/site-packages/mongo_connector/util.py", line 85, in wrapped
    func(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/mongo_connector/oplog_manager.py", line 256, in run
    docman.upsert(doc, ns, timestamp)
  File "/Users/lyonwj/neotechnology/devtmp/forks/neo4j_doc_manager/mongo_connector/doc_managers/neo4j_doc_manager.py", line 60, in upsert
    builder = NodesAndRelationshipsBuilder(doc, doc_type, doc_id)
  File "/Users/lyonwj/neotechnology/devtmp/forks/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 17, in __init__
    self.build_nodes_query(doc_type, doc, doc_id)
  File "/Users/lyonwj/neotechnology/devtmp/forks/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 30, in build_nodes_query
    self.build_nodes_query(key, document[key], id)
  File "/Users/lyonwj/neotechnology/devtmp/forks/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 35, in build_nodes_query
    self.build_nodes_query(json_key, json, id)
  File "/Users/lyonwj/neotechnology/devtmp/forks/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 22, in build_nodes_query
    for key in document.keys():
AttributeError: 'NoneType' object has no attribute 'keys'

Steps to reproduce:

mongoimport -d catalog -c course --host=127.0.0.1 < course.json
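The traceback ends in `document.keys()` being called on `None`, so one plausible cause is a null field or a null array element in the imported JSON (the contents of course.json are not shown, so this is a guess). The sketch below is a hypothetical simplification of a recursive walk, not the project's `build_nodes_query`; it shows a type guard that skips anything that is not a dict.

```python
def collect_node_types(node_type, document, found=None):
    """Hypothetical simplification of a recursive walk over a nested document."""
    if found is None:
        found = []
    if not isinstance(document, dict):
        # None (a JSON null) or any scalar: nothing to descend into.
        return found
    found.append(node_type)
    for key, value in document.items():
        if isinstance(value, dict):
            collect_node_types(key, value, found)
        elif isinstance(value, list):
            for item in value:
                collect_node_types(key, item, found)
    return found


course = {"title": "Graphs", "instructor": None, "sections": [{"room": "A1"}, None]}
print(collect_node_types("course", course))
```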

Setting environment variable doesn't work to authenticate

I've run the command:
mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager

Here is the log:

No handlers could be found for logger "mongo_connector.util"
Traceback (most recent call last):
  File "/usr/local/bin/mongo-connector", line 9, in <module>
    load_entry_point('neo4j-doc-manager==1.0.0.dev11', 'console_scripts', 'mongo-connector')()
  File "/Library/Python/2.7/site-packages/mongo_connector/util.py", line 90, in wrapped
    func(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/mongo_connector/connector.py", line 1059, in main
    conf.parse_args()
  File "/Library/Python/2.7/site-packages/mongo_connector/config.py", line 118, in parse_args
    option, dict((k, values.get(k)) for k in option.cli_names))
  File "/Library/Python/2.7/site-packages/mongo_connector/connector.py", line 854, in apply_doc_managers
    dm_instances.append(DocManager(target_url, **kwargs))
  File "/Library/Python/2.7/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 38, in __init__
    self.graph = Graph(url)
  File "/Library/Python/2.7/site-packages/py2neo/database/__init__.py", line 327, in __new__
    use_bolt = version_tuple(inst.__remote__.get().content["neo4j_version"]) >= (3,)
  File "/Library/Python/2.7/site-packages/py2neo/database/http.py", line 157, in get
    raise Unauthorized(self.uri.string)
py2neo.database.status.Unauthorized: http://localhost:7474/db/data/

I've set the NEO4J_AUTH env variable. I can read it.

~> env | grep NEO4J
NEO4J_AUTH=neo4j:neo4j

I couldn't find a call to py2neo's authenticate(url, usr, pass) in the source code, or anything that reads the NEO4J_AUTH variable. Is there something I've missed?
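For reference, reading the variable in the `user:password` form that the README describes could look roughly like the sketch below. The helper name `read_neo4j_auth` is hypothetical, and how the resulting pair is handed to py2neo depends on the py2neo version, so that step is left out.

```python
import os


def read_neo4j_auth(environ=None):
    """Return (user, password) from NEO4J_AUTH, or None if unset or malformed."""
    environ = os.environ if environ is None else environ
    raw = environ.get("NEO4J_AUTH", "")
    # Split on the first colon only, so passwords may contain colons.
    user, sep, password = raw.partition(":")
    if not sep or not user:
        return None
    return user, password


print(read_neo4j_auth({"NEO4J_AUTH": "neo4j:neo4j"}))
```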

Data model for deep level nested documents

Using this document as an example (from this dataset in a collection called products):

{ "_id" : { "$oid" : "507d95d5719dbef170f15c00" }, 
    "name" : "Phone Service Family Plan", 
    "type" : "service", 
    "monthly_price" : 90, 
    "limits" : { 
        "voice" : { 
            "units" : "minutes", "n" : 1200, "over_rate" : 0.05 
        }, 
        "data" : { 
            "n" : "unlimited", 
            "over_rate" : 0 
        }, 
        "sms" : { 
            "n" : "unlimited", 
            "over_rate" : 0 
        } 
    }, 
    "sales_tax" : true, 
    "term_years" : 2 
}

I would intuitively expect that document to be converted to a property graph with a structure and content like this:

[Screenshot 2015-09-24 18 59 20: expected property graph]

Top level properties (_id, name, type, monthly_price, sales_tax and term_years) are set on the root level document (:Document:Product). The limits subdocument is extracted into a node :limits. Note that this subdocument has no properties other than subdocuments, so these subdocuments become nodes (voice, data, and sms) each with a relationship to the :limits subdocument node, not to the root level :products node.

However, what I actually get using the neo4j-doc-manager is this:

[Screenshot 2015-09-25 09 23 05: graph actually produced by neo4j-doc-manager]

Issues for discussion:

  • Are others able to duplicate this?
  • What is the data model that we expect from this document?
  • There appear to be missing properties on the root document node (name and type).
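As a starting point for that discussion, the expected model described above (scalar values stay as properties on their own node, each subdocument becomes a child node related to its direct parent) can be written down as a small sketch. The function name, the node and relationship representation, and the use of the key as the relationship name are all assumptions made for illustration, and the sample document is abridged.

```python
def document_to_graph(label, document, parent=None, nodes=None, rels=None):
    """Hypothetical mapping: scalars become properties, subdocuments child nodes."""
    nodes = [] if nodes is None else nodes
    rels = [] if rels is None else rels
    props = {k: v for k, v in document.items() if not isinstance(v, dict)}
    node_id = len(nodes)
    nodes.append({"id": node_id, "label": label, "properties": props})
    if parent is not None:
        # Relate the child to its direct parent, not to the root document.
        rels.append((parent, label, node_id))
    for key, value in document.items():
        if isinstance(value, dict):
            document_to_graph(key, value, node_id, nodes, rels)
    return nodes, rels


product = {
    "name": "Phone Service Family Plan",
    "type": "service",
    "limits": {
        "voice": {"units": "minutes", "n": 1200, "over_rate": 0.05},
        "data": {"n": "unlimited", "over_rate": 0},
    },
}
nodes, rels = document_to_graph("products", product)
for node in nodes:
    print(node["label"], node["properties"])
print(rels)
```

In this sketch `name` and `type` remain on the root node, and `voice` and `data` hang off `limits`, which matches the structure the issue says it expected.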

IndexError when inserting documents

When using mongoimport to insert this sample data, an IndexError occurs and only 8434 documents (out of 25359 documents that were successfully inserted into MongoDB) are imported to Neo4j.

Stacktrace:

bash-3.2$ mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager
Logging to mongo-connector.log.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/Library/Python/2.7/site-packages/mongo_connector/util.py", line 85, in wrapped
    func(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/mongo_connector/oplog_manager.py", line 256, in run
    docman.upsert(doc, ns, timestamp)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/neo4j_doc_manager.py", line 59, in upsert
    builder = NodesAndRelationshipsBuilder(doc, doc_type, doc_id)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 12, in __init__
    self.build_nodes_query(doc_type, doc, doc_id)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 22, in build_nodes_query
    self.build_nodes_query(key, document[key], id)
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 23, in build_nodes_query
    elif self.is_json_array(document[key]):
  File "/Users/lyonwj/neotechnology/devtmp/neo4j_doc_manager/mongo_connector/doc_managers/nodes_and_relationships_builder.py", line 46, in is_json_array
    return ((type(doc_key) is list) and (type(doc_key[0]) is dict))
IndexError: list index out of range

mongo-connector.log:

2015-09-11 10:07:46,855 [ERROR] mongo_connector.util:87 - Fatal Exception
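The final frame shows `is_json_array` indexing `doc_key[0]` without checking that the list has any elements, so an empty array in a document raises `IndexError`. A guarded version might look like the sketch below. Note that it deliberately differs from the original in one respect: it requires every element to be a dict rather than only the first, which is an assumption about the intended behaviour.

```python
def is_json_array(value):
    """True for a non-empty list whose elements are all dicts."""
    return isinstance(value, list) and len(value) > 0 and all(
        isinstance(item, dict) for item in value
    )


print(is_json_array([]), is_json_array([{"a": 1}]), is_json_array(["x"]))
```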

Cannot import name authenticate

I'm trying to use mongo-connector with Neo4j:

File "/Users/signorini/Jobs/www/maestro/mongo-connector/lib/python3.6/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 16, in <module>
from py2neo import Graph, authenticate
ImportError: cannot import name 'authenticate'

py2neo == 4.0.0
mongo-connector==2.5.1
more-itertools==4.2.0
neo4j-doc-manager==1.0.0.dev11
neo4j-driver==1.6.0

I searched and found some answers, but this is a newer version with the same problem.

Way to prevent replication until specific condition is met

Hey guys,

This is not a technical issue but rather something I'm trying to figure out. Is there any way to hold back replication from MongoDB to Neo4j until a specific condition is satisfied? For instance, I have some data that will be in a "pending" status, and I want to insert it into Neo4j only once it has been approved.

If this is not possible, is there anything you can suggest to make this happen?

Thanks a bunch 👍
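One way to think about this, sketched under assumptions, is a thin wrapper that checks a predicate before forwarding a document to the real doc manager. Everything here is hypothetical: `FilteringDocManager`, `is_approved`, and the `status` field are invented names, and only the `upsert(doc, namespace, timestamp)` call shape is taken from the tracebacks in the issues above.

```python
def is_approved(doc):
    """Hypothetical rule: only documents with status 'approved' are replicated."""
    return doc.get("status") == "approved"


class FilteringDocManager(object):
    """Hypothetical wrapper around any object exposing upsert(doc, ns, ts)."""

    def __init__(self, target, predicate=is_approved):
        self.target = target
        self.predicate = predicate

    def upsert(self, doc, namespace, timestamp):
        # Forward the document only when the condition holds.
        if self.predicate(doc):
            self.target.upsert(doc, namespace, timestamp)


class RecordingTarget(object):
    """Test double that records which documents were forwarded."""

    def __init__(self):
        self.seen = []

    def upsert(self, doc, namespace, timestamp):
        self.seen.append(doc["_id"])


target = RecordingTarget()
manager = FilteringDocManager(target)
manager.upsert({"_id": 1, "status": "pending"}, "db.items", 0)
manager.upsert({"_id": 2, "status": "approved"}, "db.items", 0)
print(target.seen)
```

A real solution would also have to handle the later update that flips a document from pending to approved, since that change arrives as an update operation rather than as a fresh insert; the sketch does not cover that case.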

Mongo connector with py2neo v5 connects fine from Mongo 4.4.1 to the doc manager, but...

See attached: mongo-connector starts successfully, but when I go to the mongo shell and enter:

config 0.000GB
local 0.000GB
test 0.000GB
rs0:PRIMARY> db.data.insert( { "session": { "title": "12 Years of Spring: An Open Source Journey", "abstract": "Spring emerged as a core open source project in early 2003 and evolved to a broad portfolio of open source projects up until 2015." }, "topics": ["keynote", "spring"], "room": "Auditorium", "timeslot": "Wed 29th, 09:30-10:30", "speaker": { "name": "Juergen Hoeller", "bio": "Juergen Hoeller is co-founder of the Spring Framework open source project.", "twitter": "https://twitter.com/springjuergen", "picture": "http://www.springio.net/wp-content/uploads/2014/11/juergen_hoeller-220x220.jpeg" } } );
WriteResult({ "nInserted" : 1 })
rs0:PRIMARY> show dbs
admin 0.000GB
config 0.000GB

Then I go back to my browser and there is nothing; no graph shows up. What do we need to do to see this graph?
[Screenshot from 2020-10-27 16-06-53]
[Screenshot from 2020-10-27 16-02-56]

Inserting an array of document-ids does not get replicated to neo4j

I tried something like
db.items.insertOne({name:"item05",description:"desc05",items_id:["57719752029ffbfb3c4cd0af","577197a8029ffbfb3c4cd0b0"]})

which gets inserted into MongoDB but doesn't find its way to Neo4j.

db.items.insertOne({name:"item05",description:"desc05",items_id:"57719752029ffbfb3c4cd0af"})
works fine. (MongoDB 3.2, Neo4j 2.3.3, latest neo4j_doc_manager without the 3.0 updates.)

Are lists/arrays of ids not supported or is there another way?
