ml-link-prediction-notebooks's Issues

IndexError in `apply_triangles_features()`

In [cell 32](https://github.com/neo4j-graph-analytics/ml-link-prediction-notebooks/blob/main/04_Predictions.ipynb), I get an IndexError:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_21220/3422983529.py in <module>
----> 1 training_df = apply_triangles_features(training_df, "trianglesTrain", "coefficientTrain")
      2 test_df = apply_triangles_features(test_df, "trianglesTest", "coefficientTest")

~\AppData\Local\Temp/ipykernel_21220/2050145394.py in apply_triangles_features(data, triangles_prop, coefficient_prop)
     17     "coefficientProp": coefficient_prop
     18     }
---> 19     features = graph.run(query, params).to_data_frame()
     20     return pd.merge(data, features, on = ["node1", "node2"])

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\database.py in run(self, cypher, parameters, **kwparameters)
    403         :return:
    404         """
--> 405         return self.auto().run(cypher, parameters, **kwparameters)
    406 
    407     def evaluate(self, cypher, parameters=None, **kwparameters):

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\database.py in run(self, cypher, parameters, **kwparameters)
    976                 result = self._connector.run(self.ref, cypher, parameters)
    977             else:
--> 978                 result = self._connector.auto_run(cypher, parameters,
    979                                                   graph_name=self.graph.name,
    980                                                   readonly=self.readonly)

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\client\__init__.py in auto_run(self, cypher, parameters, pull, graph_name, readonly)
   1341             if pull != 0:
   1342                 try:
-> 1343                     cx.pull(result, n=pull)
   1344                 except TypeError:
   1345                     # If the RUN fails, so will the PULL, due to

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\client\bolt.py in pull(self, result, n, capacity)
    941         result.append(response, final=(n == -1))
    942         try:
--> 943             self._sync(response)
    944         except BrokenWireError as error:
    945             result.transaction.mark_broken()

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\client\bolt.py in _sync(self, *responses)
    745         self.send()
    746         for response in responses:
--> 747             self._wait(response)
    748 
    749     def _audit(self, task):

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\client\bolt.py in _wait(self, response)
    740         """
    741         while not response.full() and not response.done():
--> 742             self._fetch()
    743 
    744     def _sync(self, *responses):

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\client\bolt.py in _fetch(self)
    715         failed state into an exception.
    716         """
--> 717         tag, fields = self.read_message()
    718         if tag == 0x70:
    719             self._responses.popleft().set_success(**fields[0])

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\client\bolt.py in read_message(self)
    642 
    643     def read_message(self):
--> 644         tag, fields = self._reader.read_message()
    645         if tag == 0x71:
    646             # If a RECORD is received, check for more records

D:\Anaconda3\envs\data_science\lib\site-packages\py2neo\client\bolt.py in read_message(self)
     94                 chunks.append(self.wire.read(size))
     95         message = b"".join(chunks)
---> 96         _, n = divmod(message[0], 0x10)
     97         try:
     98             unpacker = UnpackStream(message, offset=2)

IndexError: index out of range

Versions

| Component | Version |
| --- | --- |
| Neo4j | 4.3.1 |
| GDS | 1.6.1 |
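
A note on the last frame of the traceback: `message[0]` raises this exact error when `message` is an empty bytes object, so the reader apparently assembled a message with no payload. The sketch below reproduces only that one line in isolation. `field_count` and `safe_field_count` are hypothetical names used for illustration, not py2neo functions, and the sketch does not explain why the message arrived empty.

```python
from typing import Optional


def field_count(message: bytes) -> int:
    """Mimic the failing line: take the low nibble of the first byte."""
    _, n = divmod(message[0], 0x10)
    return n


def safe_field_count(message: bytes) -> Optional[int]:
    """Hypothetical guard: return None instead of raising on an empty message."""
    if not message:
        return None
    return field_count(message)


# A non-empty message works; the first byte 0xB1 has low nibble 1.
print(field_count(b"\xb1\x70"))

# An empty message reproduces the error type seen in the traceback.
try:
    field_count(b"")
except IndexError as error:
    print("IndexError:", error)

# The guarded variant reports the empty message without raising.
print(safe_field_count(b""))
```

If this reading is right, the problem sits in the data received over the Bolt connection rather than in the query text passed to `apply_triangles_features()`.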

IllegalArgumentException in 03_Recommendations_Part2

The cell titled

Next, use full text search and Personalized PageRank to find interesting articles for different authors:

results in the following error:

ClientError: [Procedure.ProcedureCallFailed] Failed to invoke procedure `gds.pageRank.stream`: Caused by: java.lang.IllegalArgumentException: Source nodes do not exist in the in-memory graph: ['105328', '118756', ... ]

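I believe this happens because, in the notebook's query shown below, Personalized PageRank is given source nodes that are not in the node set of the anonymous projection. The projection's nodes come only from the full-text search on $searchTerm, whereas sourceNodes collects the author's articles and the articles they cite.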

query = """
MATCH (a:Author {name: $author})<-[:AUTHOR]-(article)-[:CITED]->(other)
WITH a, collect(article) + collect(other) AS sourceNodes
CALL gds.pageRank.stream({
  nodeQuery: 'CALL db.index.fulltext.queryNodes("articles", $searchTerm)
   YIELD node, score
   RETURN id(node) as id',
  relationshipQuery: 'MATCH (a1:Article)-[:CITED]->(a2:Article) 
   RETURN id(a1) as source,id(a2) as target', 
  sourceNodes: sourceNodes,
  validateRelationships:false,
  parameters: {searchTerm: $searchTerm}})
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS n, score
WHERE not(exists((a)<-[:AUTHOR]-(n))) AND score > 0
RETURN n.title as article, score, [(n)-[:AUTHOR]->(author) | author.name][..5] AS authors
order by score desc limit 10
"""

I was able to obtain the same results as pictured in the cell's original output by slightly altering the query as follows:

query = """
MATCH (a:Author {name: $author})<-[:AUTHOR]-(article)-[:CITED]->(other)
WITH a, collect(article) + collect(other) AS sourceNodes
CALL db.index.fulltext.queryNodes("articles", $searchTerm)
   YIELD node, score
WITH a, sourceNodes, collect(id(node)) AS ids
CALL gds.pageRank.stream({
  nodeQuery: 'UNWIND $ids AS id 
  RETURN id',
  relationshipQuery: 'MATCH (a1:Article)-[:CITED]->(a2:Article) 
   RETURN id(a1) as source,id(a2) as target', 
  sourceNodes: [article IN sourceNodes WHERE id(article) IN ids | article],
  validateRelationships:false,
  parameters: {ids: ids, searchTerm: $searchTerm}
 })
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS n, score
WHERE not(exists((a)<-[:AUTHOR]-(n))) AND score > 0
RETURN n.title as article, score, [(n)-[:AUTHOR]->(author) | author.name][..5] AS authors
order by score desc limit 10
"""

The query behaves the same, but only the entries of sourceNodes that are present in the anonymous projection are passed as sources to the PageRank algorithm.
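
The filtering step in the altered query can be sketched in plain Python. This is a minimal illustration, assuming node ids are plain integers; `filter_source_nodes` is a hypothetical helper written for this example and is not part of GDS or py2neo.

```python
from typing import Iterable, List


def filter_source_nodes(source_ids: Iterable[int],
                        projected_ids: Iterable[int]) -> List[int]:
    """Keep only the source ids that also appear in the projected node set.

    Mirrors the Cypher list comprehension
    [article IN sourceNodes WHERE id(article) IN ids | article].
    """
    projected = set(projected_ids)  # set lookup keeps the filter linear
    return [node_id for node_id in source_ids if node_id in projected]


# Made-up ids for illustration only.
source_ids = [1, 2, 3, 4]      # the author's articles plus cited articles
projected_ids = [2, 4, 5]      # nodes matched by the full-text search
kept = filter_source_nodes(source_ids, projected_ids)
print(kept)  # [2, 4]
```

Sources outside the projection are dropped silently, so an author whose articles never match the search term ends up with an empty source list.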

I'm using Neo4j Desktop with the following versions:

| Product | Version |
| --- | --- |
| Neo4j | 4.3.1 |
| APOC | 4.3.0.4 |
| GDS | 1.6.1 |

Thanks for the great course and I hope you find this useful!
