opencypher / morpheus

Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.

License: Apache License 2.0

Scala 100.00%
apache-spark apache2 big-data cypher graph scala

morpheus's Introduction

The Cypher Property Graph Query Language

This repository holds the specification of Cypher, a declarative property graph query language. Its purpose is to be central to the process of evolving the specification and standardisation of Cypher as a graph query language.

Overview of the process

Changes to openCypher are made through consensus in the openCypher Implementers Group (oCIG). The process for proposing changes, voting on proposals and measuring consensus is described in this set of slides.

Refer to the Cypher Improvement Process document for more details on CIPs, CIRs, their structure and lifecycle.

The structure of this repository

  • Cypher Improvement Proposals (CIP), /cip

    • Contains a list of accepted CIP documents.

  • Cypher grammar, /grammar

    • Contains the Cypher grammar specification, in XML source format.

    • A more readily consumable form of the grammar is generated as output from the build and can be found here:

      • Railroad diagrams

      • EBNF

      • ANTLR4 Grammar

  • Cypher Technology Compatibility Kit (TCK), /tck

    • Contains a set of Cucumber features that define Cypher behaviour, and documentation on how to use it.

  • openCypher developer tools, /tools

    • Contains code that tests the integrity of the repository, generates release artifacts, and aids implementers of openCypher.

Building

This repository uses a Maven build and supports cross building for Scala 2.12 and Scala 2.13:

  • For Scala 2.12, use mvn -U clean install -P scala-212

  • For Scala 2.13, use mvn -U clean install -P scala-213

Contact us

There are several ways to get in touch with the openCypher project and its participants:

  • Are you interested in implementing openCypher for your platform, but you have general questions and want to reach out to other community members with similar interests? Post to our Google Groups mailing list: https://groups.google.com/forum/#!forum/opencypher

  • For specific feature requests or bug reports, please open an issue on this repository.

  • Do you have a particular contribution in mind, and concrete ideas on how to implement it? Open a pull request.

© Copyright 2015-2017 Neo Technology, Inc.

Feedback

Any feedback you provide to Neo Technology, Inc. through this repository shall be deemed to be non-confidential. You grant Neo Technology, Inc. a perpetual, irrevocable, worldwide, royalty-free license to use, reproduce, modify, publicly perform, publicly display and distribute such feedback on an unrestricted basis.

License

The openCypher project is licensed under the Apache License 2.0.

morpheus's People

Contributors

boggle, conker84, darthmax, florentind, freeclimbing, grewalr, hannesmiller, jjaderberg, mats-sx, mengxr, moxious, pstutz, s1ck, soerenreichardt, tobias-johansson

morpheus's Issues

Extract node label info from predicates

The Neo4j frontend rewrites node label information into WHERE predicates. We need CTNode to carry the correct label information, because SparkCypherGraph.nodes(var, nodeType) uses it to load only the necessary subset of nodes (scans).
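
One way to picture the extraction is the minimal sketch below. It assumes a heavily simplified predicate model: `Predicate`, `HasLabel`, `Other` and `extractLabels` are hypothetical names made up for illustration and are not the Neo4j frontend AST or the Morpheus API. The idea is only to split the WHERE predicates into per-variable label sets (which could then feed the node type) and the predicates that remain.

```scala
object LabelExtraction {
  // Hypothetical, simplified predicate model; NOT the Neo4j frontend AST.
  sealed trait Predicate
  final case class HasLabel(variable: String, label: String) extends Predicate
  final case class Other(description: String) extends Predicate

  // Splits predicates into (labels per variable, predicates that are not label checks).
  def extractLabels(predicates: Seq[Predicate]): (Map[String, Set[String]], Seq[Predicate]) = {
    val labels: Map[String, Set[String]] = predicates
      .collect { case HasLabel(variable, label) => variable -> label }
      .groupBy(_._1)
      .map { case (variable, pairs) => variable -> pairs.map(_._2).toSet }

    val remaining = predicates.filter {
      case _: HasLabel => false
      case _           => true
    }

    (labels, remaining)
  }
}
```

With this shape, the label set collected for a variable is what a node scan would need in order to restrict itself to matching nodes, while the remaining predicates stay in the filter.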

Query builder tests should also make assertions about graphs

Define graph schema with case classes that implement traits

Spark can extract Dataset schemas from case classes. If we additionally expect these classes to extend a Node/Relationship interface, then we might use this as a convenient way to define a graph schema. These case class definitions could also be used to define the graph schema for an existing Dataset/DataFrame (provided that it has the necessary ID columns).

This is what the readme example might look like:

case class Person(id: Long, name: String) extends Node
case class Friendship(id: Long, source: Long, target: Long) extends Relationship

val persons = Seq(Person(0, "Alice"), Person(1, "Bob")) 
val friendships = Seq(Friendship(0, 0, 1), Friendship(1, 1, 0))
val graph = CAPSGraph.create(NodeScan(persons), RelationshipScan(friendships))

This is what the conversion might look like for object instances. Turning an existing Dataset + case class combination into a node/relationship scan would probably not be much different.

NodeScan/RelationshipScan.apply[T <: PropertyGraph.Node : TypeTag](
    nodes: Seq[T]
)(implicit caps: CAPSSession): NodeScan = {
  /**
    * Turn the nodes into a Dataset (also offer an API for `Dataset` + case class type that extends node/rel).
    * The Node/Relationship traits identify and guarantee the existence of ID columns that encode the graph structure.
    * Get the relationship type and primary node label from the node/relationship class names
    * (e.g. relationshipTag.getSimpleName.toUpperCase).
    * Turn `extraLabels` (if they exist) into additional node labels. All other fields are turned into properties with the field name as the property key.
    */
}

If a case class implements Node, then a sequence of it or a matching Dataset can be turned into a NodeScan. Same for Relationship/RelationshipScan.

sealed trait Entity extends Product {
  def id: Long
}

trait Node extends Entity {
  def extraLabels: Seq[String] = Seq.empty
}

trait Relationship extends Entity {
  def source: Long

  def target: Long
}

What do you think?
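
As a rough illustration of the mapping described in this proposal, the sketch below derives labels, a relationship type and property maps from case class instances. It restates the traits so that it is self-contained, and it assumes Scala 2.13 because it relies on `Product.productElementNames`. The helper names (`labels`, `relType`, `properties`, `structuralFields`) are hypothetical and not part of any existing API.

```scala
object EntityMapping {
  sealed trait Entity extends Product { def id: Long }
  trait Node extends Entity { def extraLabels: Seq[String] = Seq.empty }
  trait Relationship extends Entity { def source: Long; def target: Long }

  final case class Person(id: Long, name: String) extends Node
  final case class Friendship(id: Long, source: Long, target: Long) extends Relationship

  // Fields that encode graph structure and therefore do not become properties.
  private val structuralFields = Set("id", "source", "target", "extraLabels")

  // Primary label from the class name, plus any extra labels.
  def labels(node: Node): Set[String] = Set(node.productPrefix) ++ node.extraLabels

  // Relationship type from the upper-cased class name.
  def relType(rel: Relationship): String = rel.productPrefix.toUpperCase

  // All remaining constructor fields become properties keyed by field name (Scala 2.13+).
  def properties(entity: Entity): Map[String, Any] =
    entity.productElementNames
      .zip(entity.productIterator)
      .filterNot { case (name, _) => structuralFields(name) }
      .toMap
}
```

Here `EntityMapping.labels(Person(0, "Alice"))` yields the single label `Person`, and the only property of that node is `name`. On Scala 2.12 the field names would have to come from reflection or a type class instead.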

Matching for unknown label does not work

Given the following test:

  test("Match on unknown label") {
    // Given
    val given = TestGraph(
      """
        |(:Person {firstName: "Alice", lastName: "Foo"})
      """.stripMargin)

    // When
    val result = given.cypher(
      """
        |MATCH (a:Animal)
        |RETURN a.name
      """.stripMargin)

    // Then
    result.records.toMaps should equal(Bag())

    result.graph shouldMatch given.graph
  }

I get the following error:

Did not find slot for a:Animal :: BOOLEAN
org.opencypher.spark.api.exception.SparkCypherException: Did not find slot for a:Animal :: BOOLEAN

Id generation

Currently, we accept ids of external graphs and hope they never collide with one another. Additionally, the graph projection capabilities need to generate fresh ids for new entities, and they must also not collide with any other ids that may have been imported.

The only way we know of to get uniqueness guarantees from Spark for id generation is the function monotonically_increasing_id(), which divides the id space by partition so that each partition generates its own unique numbers, given some assumptions about the size of each partition and the number of partitions.

That function could only be reliably used if we also map all incoming graph entity ids to values generated from it -- which is costly. There seems to be no great way of solving this.
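
To make the collision risk concrete, the sketch below mirrors the bit layout that the Spark documentation describes for the current implementation of monotonically_increasing_id(): the partition index in the upper 31 bits and the record number within the partition in the lower 33 bits. It does not call Spark; `IdLayout` and its members are hypothetical helpers written only to illustrate the arithmetic.

```scala
object IdLayout {
  // Documented layout: upper 31 bits = partition index, lower 33 bits = row number.
  val RowBits: Int = 33
  val MaxRowsPerPartition: Long = 1L << RowBits

  // Composes an id the way the documented layout describes.
  def monotonicId(partition: Int, row: Long): Long = {
    require(partition >= 0, "partition index must be non-negative")
    require(row >= 0 && row < MaxRowsPerPartition, "row number must fit in 33 bits")
    (partition.toLong << RowBits) | row
  }

  // Recovers the two components from an id.
  def partitionOf(id: Long): Int = (id >>> RowBits).toInt
  def rowOf(id: Long): Long = id & (MaxRowsPerPartition - 1)
}
```

Under this layout the first row of partition 1 gets the id 8589934592 (2^33). An imported graph that already uses that value for one of its entities would collide with it, which is why generated ids are only safe if the imported ids are remapped as well.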

Support for Python API

A Python API would open this library to a significant group of developers who are not in the Java/Scala camp. With many of us doing data science in Python, having only Scala support seems to be a limiting factor. Is there a chance that Python will be supported as well?

Plan RelationshipScan in Expand operators

Currently, relationships are loaded from the input graph. This requires code duplication in the Expand operators. We should leverage the existing RelationshipScan instead.

CypherGraph should handle AllGiven and AnyGiven

Currently, CypherGraph.nodes and CypherGraph.relationships expect CTNode and CTRelationship arguments, which are handled as AND (for nodes) and OR (for rels) internally. To be more generic and also support the OR case for nodes (and the AND case for rels), the methods should expect Elements[Label] and Elements[RelType] and match internally.
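
A minimal sketch of what such matching could look like is shown below. The names `Elements`, `AllGiven` and `AnyGiven` come from this issue, but their shape here and the `matches` helper are assumptions made for illustration, not the actual Morpheus types.

```scala
object ElementMatching {
  // Assumed shape: a set of elements combined with either AND or OR semantics.
  sealed trait Elements[T] { def elements: Set[T] }
  final case class AllGiven[T](elements: Set[T]) extends Elements[T] // AND
  final case class AnyGiven[T](elements: Set[T]) extends Elements[T] // OR

  // Checks whether an entity carrying `present` satisfies the requirement.
  def matches[T](required: Elements[T], present: Set[T]): Boolean = required match {
    case AllGiven(elems) => elems.subsetOf(present)
    case AnyGiven(elems) => elems.exists(present)
  }
}
```

In this sketch an empty AllGiven matches everything and an empty AnyGiven matches nothing. An implementation may prefer to treat an empty AnyGiven as a wildcard, for example for relationship patterns without a type.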
