ftsrg / ingraph
Incremental view maintenance for openCypher graph queries.
Home Page: http://docs.inf.mit.bme.hu/ingraph/
License: Eclipse Public License 1.0
e.g. input -> output
For certain query parts, using the same name does not mean that they are the same variables. So separate VariableFactory instances need to be used for:
- each singleQuery (i.e. those that are unioned together) (see also: #9)
- each subquery (i.e. a query part ending with WITH or RETURN)
Note: for the subqueries, the variables present in the WITH clause are chained forward and are available in the subsequent subqueries.
Once the Travis builds are fixed, switch .travis.yml/notifications/on_failure to always.
It has been silenced for now in d5d8e4b.
Let's gather some use cases for multisets/bags.
SQL keeps duplicates, because:
- Duplicates are important for aggregate queries.
- In general, bags can be more “efficient” than sets.
(source)
Xtext 2.11 beta will be released on 18 October, with the final version planned in January. https://blogs.itemis.com/en/xtext-2.11-release-plan-changed
2.11 fixes some issues with the web editor, see https://bugs.eclipse.org/bugs/show_bug.cgi?id=495851
Xtext's web UI supports the Ace editor, which looks quite nice.
Not all pairs of expressions can be operands of comparison operators. Some type checking needs to be done when compiling the Cypher query to relational algebra expressions.
These are legal:
nodeVar1 = nodeVar2
nodeVar1 != nodeVar2
nodeVar1.numericAttribute < 5
However, these are not, and a compile-time error should be emitted:
nodeVar1 < 5
nodeVar1 < nodeVar2
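The rule above can be sketched as a small compile-time check. This is only an illustration under simplifying assumptions: the two static types (NODE, NUMBER), the exception class and the function check_comparison are hypothetical names and are not part of ingraph.

```python
from enum import Enum


class ExprType(Enum):
    # Hypothetical static types, for illustration only.
    NODE = "node"
    NUMBER = "number"


EQUALITY_OPS = {"=", "!=", "<>"}
ORDERING_OPS = {"<", ">", "<=", ">="}


class CypherCompileError(Exception):
    """Raised when a comparison is not well-typed."""


def check_comparison(left, op, right):
    """Raise CypherCompileError unless `left op right` is a legal comparison."""
    if op in EQUALITY_OPS:
        # Assumption: equality is allowed between operands of the same type.
        if left != right:
            raise CypherCompileError(f"cannot compare {left.value} {op} {right.value}")
    elif op in ORDERING_OPS:
        # Assumption: ordering is only defined for numeric operands.
        if left != ExprType.NUMBER or right != ExprType.NUMBER:
            raise CypherCompileError(f"cannot order {left.value} {op} {right.value}")
    else:
        raise CypherCompileError(f"unknown comparison operator {op}")
```

With this sketch, nodeVar1 = nodeVar2 and nodeVar1.numericAttribute < 5 pass the check, while nodeVar1 < 5 and nodeVar1 < nodeVar2 raise an error.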
Add operator for unwinding collections.
We should support single queries that have an OPTIONAL MATCH clause as their first MATCH clause.
E.g. the following should return a single null row, provided there is no vertex having the label NON_EXISTENT_LABEL.
OPTIONAL MATCH (n:NON_EXISTENT_LABEL)
RETURN n
Note: this implies that all MATCH clauses in this single query are OPTIONAL MATCH clauses.
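One possible way to obtain the single null row (an assumption for illustration, not necessarily how ingraph compiles it) is to treat a leading OPTIONAL MATCH as a left outer join whose left input is a relation containing exactly one empty tuple. The function names below are hypothetical.

```python
def left_outer_join(left, right, right_vars):
    """Left outer join of two lists of dict tuples on their shared keys."""
    result = []
    for l in left:
        shared_match = [
            r for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())
        ]
        if shared_match:
            result.extend({**l, **r} for r in shared_match)
        else:
            # No partner: pad the right-hand variables with None (null).
            result.append({**l, **{v: None for v in right_vars}})
    return result


def leading_optional_match(matched, variables):
    """Evaluate an OPTIONAL MATCH that is the first clause of a query."""
    unit = [{}]  # a relation with one tuple and no attributes
    return left_outer_join(unit, matched, variables)
```

If the pattern has no matches, leading_optional_match([], ["n"]) yields one row in which n is null; otherwise the matches are returned unchanged.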
- add Union as BetaOperator to EMF model
- handle Union in cyp AST -> relalg compilation
- use different VariableFactories for each singleQuery (i.e. those that are unioned together)
- make distinction between UNION ALL/UNION
match (n:A)
return n
union
match (n:D)
return n
- add requirement: variable aliases need to be matched (see also: #16), otherwise an exception is thrown in Neo4j, see:
CREATE (n1:A{name:"World"}) create (n2:B{whatever:"Universe"});
MATCH (a:A) RETURN a AS a
UNION
MATCH (b:B) RETURN b AS a;
Neo4j: All sub queries in an UNION must have the same column names
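The two requirements (matching column names, and the UNION ALL/UNION distinction) can be sketched as follows. This is a simplified illustration: rows are plain tuples, the column names are compared in order, and the function name union is hypothetical.

```python
def union(left_cols, left_rows, right_cols, right_rows, all_rows=False):
    """Combine two query results; all_rows=True corresponds to UNION ALL."""
    # Simplifying assumption: names must agree position by position.
    if list(left_cols) != list(right_cols):
        raise ValueError("All sub queries in an UNION must have the same column names")
    rows = list(left_rows) + list(right_rows)
    if all_rows:
        return rows  # bag semantics: duplicates are kept
    # UNION: set semantics, keeping the first occurrence of each row.
    seen = set()
    distinct = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            distinct.append(row)
    return distinct
```

For example, unioning the rows [(1,), (2,)] and [(2,), (3,)] under column a gives three rows with UNION and four rows with UNION ALL.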
Synopsis: AllDifferent( list_of_edge_variables ), e.g. AllDifferent([_e1, _e2, _e3])
Query plans should implement tuple semantics, i.e. instead of using attribute names, they should use tuples and indices. This is required by ire.
See also: #10
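A minimal sketch of what tuple semantics could look like: names are resolved to positions once, when the plan is built, and the operators then work on plain tuples. The helper names build_schema and project are hypothetical and only serve the illustration.

```python
def build_schema(variables):
    """Map each variable name to its position in the tuple."""
    return {name: index for index, name in enumerate(variables)}


def project(rows, schema, wanted):
    """Projection that resolves names to indices once, then uses indices only."""
    indices = [schema[name] for name in wanted]
    return [tuple(row[i] for i in indices) for row in rows]
```

For instance, with the schema built from ["n", "m", "e"], projecting to ["e", "n"] reads positions 2 and 0 of every tuple.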
ensureLabel should indicate (possibly in the EMF model) when a labelset is already set for a variable and the new one is incompatible.
filter(false) condition. The query optimizer can replace these branches with an empty data source of the specified schema (list of variables).
@jmarton do we need cypxmi models in the ingraph-cypxmi/ dir anymore?
We should definitely support labels() and type(). I am not sure about properties() and keys().
Currently, if ANTLR cannot parse a query, it prints warning messages to the standard error. It (or we) should throw an exception instead.
Check this query:
MATCH ()
RETURN *
In this case, our parser generates a variable and marks it as "don't care". I don't think this is an appropriate name: we should call these variables "anonymous" variables.
collect()
We should use Docker containers and Akka remoting to support distributed query processing.
Key challenges:
Think http://spray.io/
To facilitate range edge queries, AllDifferent operator should accept edge-variables and edge list variables. On the resultset the flattened union should be tested for pairwise inequality.
See also: #11
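The flattening step can be sketched as below. This is an illustration under the assumption that edges are represented by hashable identifiers; the function name all_different is hypothetical.

```python
def all_different(*edge_values):
    """Return True if no edge occurs twice across the given edges and edge lists."""
    flattened = []
    for value in edge_values:
        if isinstance(value, (list, tuple)):
            flattened.extend(value)  # edge list variable
        else:
            flattened.append(value)  # single edge variable
    # Pairwise inequality holds exactly when there are no duplicates.
    return len(flattened) == len(set(flattened))
```

A row binding _e1 to a single edge and _e2 to an edge list passes only if the edge of _e1 does not also appear in the list of _e2 and the list itself has no repeated edge.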
VertexVariable
EdgeVariable
Cypher2RelalgUtil.ensureLabel
)
VertexVariable
EdgeVariable
Effective semantics of the labelset constraints:
- Labels of a VertexVariable are and-ed together (in the relalg model, this is the semantics of VertexLabelSet).
- Occurrences of a VertexVariable with possibly different label sets enforce a constraint that a matching variable needs to match the union of the label sets.
- Labels of an EdgeVariable are or-ed together (in the relalg model, this is the semantics of EdgeLabelSet).
- Occurrences of an EdgeVariable with possibly different label sets enforce a constraint that a matching variable needs to match the intersection of the label sets.
- For EdgeVariables, a clear distinction is to be made between the cases when there are no labelset constraints given for an edge variable and when their intersection is the empty set (this is handled in LabelSetStatus).
We currently use Operator multiple times in the relalg model:
- Operator, UnaryOperator, etc. are relational algebraic operators
- ArithmeticComparisonOperator, UnaryArithmeticOperator, etc. are enums
Sometimes this makes "programming by Ctrl+Space" a bit more difficult :-)
Idea: add a Type suffix to enums, e.g. ArithmeticComparisonOperatorType, UnaryArithmeticOperatorType, ...
I tried to parse integer and double numbers. Integers were straightforward to implement (tests).
However, I had issues with parsing "exponent decimal" real numbers. I documented some strange behaviour in a minimal working example.
In accordance with #56, we should throw an exception if a query uses SKIP/LIMIT without ORDER BY.
Before we fix the issues with sorting (e.g. parsing ASC/DESC orders), we should update to the latest version of Slizaa:
slizaa/slizaa-opencypher-xtext@f9e935d
Currently, we use the terminology "mutual attributes" in the code (LaTeX/Xcore) and "common attributes" in the submitted paper. We should unify this.
This is a tough one, so let's break it down to simpler ones.
- relalg.xcore
- ExpressionList grammar entity
- UNWIND operator in ire: #54
- collect() function in ire: ftsrg/ire#2
First, we should aim at simple queries:
MATCH (n)
RETURN COLLECT(n) AS nodes
WITH [1,2,3] AS x
UNWIND x AS y
RETURN y
WITH [1,2,3] AS x
UNWIND x AS y
RETURN COLLECT(y) AS ys
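The intended behaviour of the last two queries can be sketched on rows represented as dictionaries. This is only an illustration with hypothetical function names; collect is shown without grouping keys.

```python
def unwind(rows, list_var, element_var):
    """UNWIND: emit one output row per element of the list bound to list_var."""
    return [
        {**row, element_var: item}
        for row in rows
        for item in row[list_var]
    ]


def collect(rows, var, alias):
    """COLLECT without grouping keys: fold all rows into one row holding a list."""
    # Like other Cypher aggregations, collect() skips null values.
    return [{alias: [row[var] for row in rows if row[var] is not None]}]


# WITH [1,2,3] AS x UNWIND x AS y RETURN COLLECT(y) AS ys
with_rows = [{"x": [1, 2, 3]}]
unwound = unwind(with_rows, "x", "y")
collected = collect(unwound, "y", "ys")
```

Here unwound contains three rows (one per list element) and collected contains a single row again.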
We should consider how to handle minHops and maxHops in Cypher queries: https://neo4j.com/docs/developer-manual/current/cypher/#_variable_length_relationships
-[:TYPE*minHops..maxHops]->
The default values are minHops = 1 and maxHops = infinity.
Possible cases (projections [π operator] omitted):
- maxHops is limited: we should unfold the expression to a union of joins.
  - minHops is 0 -> e:E*0..2 can be defined as r ∪ r ⨝ (E ∪ (E ⨝ E))
  - minHops is larger than 0 -> e:E*1..3 can be defined as r ⨝ (E ∪ (E ⨝ E) ∪ (E ⨝ E ⨝ E))
- maxHops is infinity (default):
  - minHops is 0 -> e:E*0.. can be defined as r ∪ r ⨝ E+
  - minHops is 1 (default): we should simply use transitive closure -> e:E*1.. can be defined as r ⨝ E+
  - minHops is larger than 1: we should unfold minHops - 1 hops explicitly and use transitive closure for the rest -> e:E*2.. can be defined as r ⨝ E ⨝ E+
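The bounded case (limited maxHops) can be sketched as a loop that performs one join per hop and unions the intermediate results. The sketch makes several simplifying assumptions: edges are (source, target) pairs, only the endpoints of each path are kept, and Cypher's requirement that a path must not reuse an edge is ignored. The function names are hypothetical.

```python
def compose(paths, edges):
    """Join step: extend every (start, end) path with one more edge."""
    return {(s, t2) for (s, t) in paths for (t1, t2) in edges if t == t1}


def bounded_paths(start_vertices, edges, min_hops, max_hops):
    """Unfold E*min_hops..max_hops into a union of joins."""
    current = {(v, v) for v in start_vertices}  # zero hops: r itself
    result = set(current) if min_hops == 0 else set()
    for hops in range(1, max_hops + 1):
        current = compose(current, edges)  # r joined with E, `hops` times
        if hops >= min_hops:
            result |= current
    return result
```

On the chain a -> b -> c -> d, starting from a, E*0..2 reaches a, b and c, while E*1..3 reaches b, c and d.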
ORDER BY
https://neo4j.com/docs/developer-manual/3.0/cypher/#query-order
A sub-clause following RETURN or WITH, specifying that the output should be sorted in particular way.
The user of the library should specify a list of variable-direction pairs (e.g. person-ASC, city-DESC) for the ordering.
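A minimal sketch of sorting by such a list of variable-direction pairs, assuming rows are dictionaries and the directions are given as the strings "ASC" and "DESC". The function name order_by is hypothetical.

```python
def order_by(rows, ordering):
    """Sort dict rows by a list of (variable, direction) pairs."""
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant key
    # first and the most significant key last gives the combined order.
    for variable, direction in reversed(ordering):
        result.sort(key=lambda row: row[variable], reverse=(direction == "DESC"))
    return result
```

For example, order_by(rows, [("person", "ASC"), ("city", "DESC")]) sorts by person ascending and breaks ties by city descending.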
(Related ticket: #51)
Use this issue to keep track of supported Cypher constructs.
CREATE
MERGE
SET
DELETE
DETACH DELETE
REMOVE
CALL ... YIELD
We can safely ignore these for now.
ON CREATE
ON MATCH
STARTS WITH
CONTAINS
ENDS WITH
MATCH
WHERE
MATCH, without OPTIONAL
OPTIONAL MATCH
MATCH clause of a single query
MATCH clause of a single query, see #40
RETURN
RETURN *
RETURN DISTINCT
ORDER BY (handles variables and expressions as well as name resolution for aliased RETURN items)
SKIP (SKIP and LIMIT only implemented for constants)
LIMIT
UNWIND
WITH
WHERE #124
UNION
=
<> (also !=, though non-standard)
<
>
<=
>=
+
-
/
*
^
%
NOT
AND
OR
XOR
<variable> IS [NOT] NULL
<expression> IS [NOT] NULL
. (property access)
[] (subscript) (Note: we don't support p['foo'], which is syntactic sugar for p.foo)
$
avg()
collect()
count()
max()
min()
sum()
stdDev()
stdDevP()
We can safely ignore these for now:
exists()
coalesce()
endNode()
head()
length()
last()
properties()
size()
startNode()
type()
toFloat()
relationships()
tail()
keys()
labels()
nodes()
range()
abs()
ceil()
floor()
rand()
round()
sign()
e()
exp()
log()
log10()
sqrt()
acos()
asin()
atan()
atan2()
cos()
cot()
degrees()
pi()
radians()
sin()
tan()
left()
lTrim()
trim()
replace()
reverse()
right()
rTrim()
split()
substring()
toLower()
toString()
toUpper()
percentileCont()
percentileDisc()
toBoolean()
toInteger() (called toInt())
We currently have two enums for IS_NULL, IS_NOT_NULL: https://github.com/FTSRG/ingraph/blob/44e456d7f772b4cac446e9df8fcb7bd43a596b77/ingraph-relalg-xcore/src/relalg.xcore#L356
Currently, this condition
WHERE n="emfsucks"
matches if n is "\"emfsucks\""
There are a few hundred tests that we have imported from several sources. Our project, however, is a work in progress and passes only a subset of the tests. We would like to mark the tests that are known to work at a given point in time (test set 1). We require these tests to pass for future commits on Travis so that we notice if something breaks.
To achieve this, we need to
Note: test set 1 is intended to expand only, i.e. to form a set-series where each element is a superset of any previous elements.
(Related ticket: #35)
Use this issue to keep track of supported Cypher constructs.
MATCH
RETURN (Production node)
UNWIND
OPTIONAL MATCH (left outer join)
WITH
UNION
CREATE
MERGE
SET
DELETE
DETACH DELETE
REMOVE
CALL ... YIELD
WHERE #43
ORDER BY
SKIP
LIMIT
ON CREATE
ON MATCH
DISTINCT
STARTS WITH
CONTAINS
ENDS WITH
=
<>
<
>
<=
>=
+
-
/
*
^
%
NOT
AND
OR
XOR
IS NULL
IS NOT NULL
. (property access)
[] (subscript)
$ (moved to #85)
toFloat()
head()
length()
last()
size()
coalesce()
startNode() (moved to #175)
endNode() (moved to #175)
properties() (see #53)
type() (see #53)
toBoolean()
toInteger() (called toInt())
Implement a SKIP...LIMIT operator, which works on ordered expressions: https://neo4j.com/docs/developer-manual/current/cypher/#_skip_and_limit
We only support SKIP and LIMIT if they are preceded by an ORDER BY expression:
https://neo4j.com/docs/developer-manual/3.0/cypher/#query-order
A sub-clause following RETURN or WITH, specifying that the output should be sorted in particular way.
The user of the library should specify a list of variable-direction pairs (e.g. person-ASC, city-DESC) for the ordering.
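The restriction that SKIP and LIMIT are only accepted on ordered input can be sketched as follows. This is an illustration only: the ordered flag, the exception class and the function name are hypothetical, and skip and limit are assumed to be constants.

```python
class UnorderedSkipLimitError(Exception):
    """Raised when SKIP/LIMIT is used without a preceding ORDER BY."""


def skip_limit(rows, skip=0, limit=None, ordered=False):
    """Apply SKIP and LIMIT to rows that were already sorted by ORDER BY."""
    if not ordered:
        raise UnorderedSkipLimitError("SKIP/LIMIT requires a preceding ORDER BY")
    end = None if limit is None else skip + limit
    return list(rows)[skip:end]
```

For example, skip_limit(sorted_rows, skip=1, limit=2, ordered=True) drops the first row and returns the next two, while calling it with ordered=False raises an error.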