cleishm / libcypher-parser
Cypher Parser Library
License: Apache License 2.0
I found that libcypher-parser does not fully match the openCypher 9 syntax.
Where can I find the grammar file that libcypher-parser follows?
Thanks!
Hi
Can you please explain the reasoning behind representing a number (integer or float) as a string that holds only non-negative values, so that negative values become expressions applying a unary minus operator to a non-negative number?
float-string =
< [0-9]+ '.'? [0-9]* [eE] [-+]? [0-9] sym-part* >
{ strbuf_append_block(); }
| < [0-9]* '.' [0-9] sym-part* > { strbuf_append_block(); }
integer-string = < [0-9] sym-part* > { strbuf_append_block(); }
Your answer is appreciated.
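For readers who need signed values on the consumer side, here is a minimal sketch of folding a unary minus over a non-negative integer literal. The node types and the function name are hypothetical, simplified stand-ins, not libcypher-parser's real API, and only plain decimal digit strings are handled. One thing the sketch illustrates is that keeping the sign outside the literal still lets the most negative 64-bit value be represented, since its magnitude alone would overflow a signed parse.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified node shapes; NOT libcypher-parser's real types. */
enum node_kind { NODE_INTEGER, NODE_UNARY_MINUS };

struct node {
    enum node_kind kind;
    const char *digits;         /* NODE_INTEGER: non-negative decimal digits */
    const struct node *operand; /* NODE_UNARY_MINUS: the negated child */
};

/* Evaluate an integer literal that may be wrapped in unary minus nodes.
 * Returns true and stores the value on success; returns false on overflow,
 * malformed digits, or an unexpected node shape. */
bool fold_integer(const struct node *n, long long *out)
{
    bool negative = false;

    /* Peel off any chain of unary minus operators, tracking the sign. */
    while (n != NULL && n->kind == NODE_UNARY_MINUS) {
        negative = !negative;
        n = n->operand;
    }
    if (n == NULL || n->kind != NODE_INTEGER || n->digits == NULL)
        return false;

    /* Reject signs and whitespace, which strtoull would otherwise accept. */
    if (n->digits[0] < '0' || n->digits[0] > '9')
        return false;

    errno = 0;
    char *end = NULL;
    unsigned long long magnitude = strtoull(n->digits, &end, 10);
    if (errno != 0 || *end != '\0')
        return false;

    const unsigned long long min_magnitude = (unsigned long long)LLONG_MAX + 1ULL;
    if (negative) {
        if (magnitude > min_magnitude)
            return false;
        /* LLONG_MIN has no positive counterpart, so handle it explicitly. */
        *out = (magnitude == min_magnitude) ? LLONG_MIN : -(long long)magnitude;
    } else {
        if (magnitude > (unsigned long long)LLONG_MAX)
            return false;
        *out = (long long)magnitude;
    }
    return true;
}
```

With this shape, an expression such as a unary minus over the literal 1 folds to -1, while the bare literal 9223372036854775808 is rejected unless it is negated.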
Hi @cleishm,
It seems that some APOC procedures are not supported. The following example query throws an error due to the parentheses after apoc.convert.toBoolean.
Query:
MATCH (:NodeType)-[rel:relationship]->(:NodeType2)
UNWIND rel.list_data as data
WITH data, split(data, " ")[-1] as flag
WHERE apoc.convert.toBoolean(flag)
RETURN data
Note here: rel.list_data is of the format ["some_value flag", "some_value flag",...], where the flag string is being converted into a boolean via the APOC procedure.
This gives the output:
@0 5..188 statement body=@1
@1 5..188 > query clauses=[@2, @12, @17, @35]
@2 5..60 > > MATCH pattern=@3
@3 11..55 > > > pattern paths=[@4]
@4 11..55 > > > > pattern path (@5)-[@7]-(@10)
@5 11..22 > > > > > node pattern (:@6)
@6 12..21 > > > > > > label :`NodeType`
@7 22..43 > > > > > rel pattern -[@8:@9]->
@8 24..27 > > > > > > identifier `rel`
@9 27..40 > > > > > > rel type :`relationship`
@10 43..55 > > > > > node pattern (:@11)
@11 44..54 > > > > > > label :`NodeType2`
@12 60..93 > > UNWIND expression=@13, alias=@16
@13 67..81 > > > property @14.@15
@14 67..70 > > > > identifier `rel`
@15 71..80 > > > > prop name `list_data`
@16 84..88 > > > identifier `data`
@17 93..165 > > WITH projections=[@18, @20], WHERE=@29
@18 98..102 > > > projection expression=@19
@19 98..102 > > > > identifier `data`
@20 104..137 > > > projection expression=@21, alias=@28
@21 104..125 > > > > subscript @22[@26]
@22 104..120 > > > > > apply @23(@24, @25)
@23 104..109 > > > > > > function name `split`
@24 110..114 > > > > > > identifier `data`
@25 116..119 > > > > > > string " "
@26 121..123 > > > > > unary operator - @27
@27 122..123 > > > > > > integer 1
@28 128..132 > > > > identifier `flag`
@29 143..165 > > > property @30.@33
@30 143..155 > > > > property @31.@32
@31 143..147 > > > > > identifier `apoc`
@32 148..155 > > > > > prop name `convert`
@33 156..165 > > > > prop name `toBoolean`
@34 165..175 > > error >>(flag)\n <<
@35 176..188 > > RETURN projections=[@36]
@36 183..188 > > > projection expression=@37
@37 183..187 > > > > identifier `data`
This is a valid use of the APOC procedure, and the query performs as expected. Is there any current support for APOC procedures, or are there plans to include this in a future release?
Thanks in advance!
This seems similar to issue #19, which has been closed. What was the outcome of that?
Similarly, putting backticks around apoc.convert.toBoolean to escape it removes the parser error, but the query then does not work in Neo4j.
On Arch Linux, the system's current default GCC won't build the code because of a warning that is treated as an error. Personally, I'm against releasing code to end users with -Werror: it makes sense on the author's testing grid, but not across the myriad configurations of end users. Warnings can vary per compiler and per compiler version. Sorry, that's the reason I'm reporting the occurrence without disclosing the error. Had it been kept as a warning, I'd probably have reported and disclosed it instead.
Hi,
Is ASC/DESC not implemented yet? I use your parser in my project and have tested some queries.
Here is one of the test queries: MATCH (n:N),(m:M) WHERE n.num = m.num RETURN n.num ASC
and here are the last three lines of the parser output:
@23 47..50 > > > > > prop name `num`
@24 45..51 > > > > identifier `n.num`
@25 51..55 > > error >>ASC\n<<
Will this feature be included in an upcoming release?
-Daizy
Hi
There might be an issue with operator precedence between the "IN" operator and the subscript operator.
In this openCypher TCK scenario, the query is:
RETURN 3 IN [[1, 2, 3]][0] AS r
It should return true.
The AST generated by libcypher-parser is as follows:
I believe that the "IN" operator should be higher in the tree, while the subscript should be much lower.
For your information, I made the linter available in ALE, so anyone who uses ALE with Vim or Neovim and has cypher-lint in their path will benefit from instant lint feedback while editing Cypher source files.
The last tag was in April of 2019 and there are quite a few things added since then. Can we get a release published? I'm specifically looking to use the fix for #19
More specifically, I am referring to getting it published to add-apt-repository ppa:cleishm/neo4j
Neo4j has supported CALL {} (subquery) since v3.5. It is an amazing feature.
https://neo4j.com/docs/cypher-manual/4.1/clauses/call-subquery/
Procedure calls cannot currently be followed by WHERE conditions:
$ echo "CALL db.labels() YIELD label WHERE label = 'fruit' RETURN label" | ./src/cypher-lint -a
<stdin>:1:31: Invalid input 'H': expected WITH
CALL db.labels() YIELD label WHERE label = 'fruit' RETURN label
Interpolating a WITH projection allows us to make this combination:
$ echo "CALL db.labels() YIELD label WITH label WHERE label = 'fruit' RETURN label" | ./src/cypher-lint -a
@0 0..75 statement body=@1
@1 0..75 > query clauses=[@2, @6, @12]
@2 0..29 > > CALL name=@3, YIELD=[@4]
@3 5..14 > > > proc name `db.labels`
@4 23..29 > > > projection expression=@5
@5 23..28 > > > > identifier `label`
@6 29..62 > > WITH projections=[@7], WHERE=@9
@7 34..40 > > > projection expression=@8
@8 34..39 > > > > identifier `label`
@9 46..62 > > > binary operator @10 = @11
@10 46..51 > > > > identifier `label`
@11 54..61 > > > > string "fruit"
@12 62..75 > > RETURN projections=[@13]
@13 69..75 > > > projection expression=@14
@14 69..74 > > > > identifier `label`
The CALL...WHERE construction is supported in Neo4j.
$ cypher-lint -a < scripts/upload_t10_data.cypher
<stdin>:2:19: Invalid input 'u': expected '=' or CREATE CONSTRAINT ON
CREATE CONSTRAINT uniqueT10 IF NOT EXISTS ON (n:T10)
^
<stdin>:5:19: Invalid input 'u': expected '=' or CREATE CONSTRAINT ON
CREATE CONSTRAINT uniqueMonth IF NOT EXISTS ON (m:Month)
^
@0 2..33 line_comment // Make sure we have unique nodes
@1 34..145 error >>CREATE CONSTRAINT uniqueT10 IF NOT EXISTS ON (n:T10)\n ASSERT (n.date, n.name, n.result, n.m50) IS NODE KEY<<
@2 149..175 line_comment // Constraint for Month/Year
@3 176..275 error >>CREATE CONSTRAINT uniqueMonth IF NOT EXISTS ON (m:Month)\n ASSERT (m.month, m.year) IS NODE KEY<<
@4 280..327 line_comment // Needs apoc.import.file.enabled=true in config
According to the Cypher refcard, the following constraint format is acceptable (and works as expected):
// Make sure we have unique nodes
CREATE CONSTRAINT uniqueT10 IF NOT EXISTS ON (n:T10)
ASSERT (n.date, n.name, n.result, n.m50) IS NODE KEY;
// Constraint for Month/Year
CREATE CONSTRAINT uniqueMonth IF NOT EXISTS ON (m:Month)
ASSERT (m.month, m.year) IS NODE KEY;
I'm using RedisGraph and implementing some custom graph algorithms, and was hoping I'd be able to pass a serialized binary blob into my functions, so that I receive my inputs pre-structured and avoid having to parse them.
Something like... CALL algo.abc(binaryBlob)
This of course isn't possible because Cypher relies on string parsing. But maybe you have some other suggestion for me to accomplish this?
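One possible workaround, offered as a suggestion rather than anything the library provides, is to encode the blob as text so it can travel as an ordinary string argument and be decoded inside the procedure. The sketch below uses hexadecimal; the function names are hypothetical, and Base64 would be a more compact alternative.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encode len bytes as a NUL-terminated lowercase hex string.
 * The caller frees the result; returns NULL on allocation failure. */
char *hex_encode(const unsigned char *data, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    if (len > (SIZE_MAX - 1) / 2)
        return NULL;
    char *out = malloc(len * 2 + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        out[2 * i] = digits[data[i] >> 4];
        out[2 * i + 1] = digits[data[i] & 0x0F];
    }
    out[len * 2] = '\0';
    return out;
}

/* Map one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a hex string into out (capacity cap bytes).
 * Returns the number of bytes written, or -1 if the input is malformed
 * or does not fit. */
long hex_decode(const char *hex, unsigned char *out, size_t cap)
{
    size_t n = strlen(hex);
    if (n % 2 != 0 || n / 2 > cap)
        return -1;
    for (size_t i = 0; i < n / 2; i++) {
        int hi = hex_value(hex[2 * i]);
        int lo = hex_value(hex[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return (long)(n / 2);
}
```

The encoded text can then be passed as a quoted string or a query parameter, at the cost of doubling the payload size.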
Creating a file point.cypher with the contents:
create index on :Point(latitude, longitude);
and running
$ cypher-lint point.cypher
gets the following result:
point.cypher:1:32: Invalid input ',': expected ')'
create index on :Point(latitude, longitude);
^
I expected this statement to lint correctly, per the spec here: https://neo4j.com/docs/developer-manual/current/cypher/schema/index/#create-a-composite-index
Are composite index statements not supported by the linter?
Is it possible to take the parser output and reconstruct the query using that output?
In my use case I would like to allow my users to enter a query, but I would like to modify it on their behalf before finally querying the DB. Is this doable using this library, or am I barking up the wrong tree?
Thank you!
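As a rough illustration of one way to approach this, the AST dumps in this thread annotate every node with a start..end offset into the input, so a query can be modified by splicing replacement text into the original string instead of regenerating it from the tree. The helper below is a hypothetical sketch that uses only the C standard library and assumes the offsets are half-open byte ranges; how the offsets are obtained from the parser is left out.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of query in which the half-open byte range
 * [start, end) is replaced by replacement. The caller frees the result.
 * Returns NULL if the range is invalid or allocation fails. */
char *splice_range(const char *query, size_t start, size_t end,
                   const char *replacement)
{
    size_t query_len = strlen(query);
    if (start > end || end > query_len)
        return NULL;

    size_t repl_len = strlen(replacement);
    size_t tail_len = query_len - end;

    char *out = malloc(start + repl_len + tail_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, query, start);                   /* text before the range */
    memcpy(out + start, replacement, repl_len);  /* the new text */
    /* Text after the range, including the terminating NUL. */
    memcpy(out + start + repl_len, query + end, tail_len + 1);
    return out;
}
```

When applying several edits to one query, working from the highest offset to the lowest keeps the remaining offsets valid.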
Can I assume all of the ast_ files are generated from a grammar syntax and if so, is it the grammar from http://www.opencypher.org/resources?
Functions like duration.between are incorrectly rejected because they are parsed as property accesses instead of function calls.
root@36c23cede604:/tmp# cat t.c
MATCH (a) WHERE duration.between(a, b) < 4 RETURN a;
root@36c23cede604:/tmp# cypher-lint --ast t.c
t.c:1:33: Invalid input '(': expected '.', AND, OR, XOR, NOT, '=', '<>', '+', '-', '*', '/', '%', '^', IN, '=~', CONTAINS, STARTS WITH, ENDS WITH, '<=', '>=', '<', '>', IS NULL, IS NOT NULL, '[', '{', a label, ';' or a clause
MATCH (a) WHERE duration.between(a, b) < 4 RETURN a;
^
t.c:
@0 0..52 statement body=@1
@1 0..52 > query clauses=[@2, @11]
@2 0..32 > > MATCH pattern=@3, where=@7
@3 6..9 > > > pattern paths=[@4]
@4 6..9 > > > > pattern path (@5)
@5 6..9 > > > > > node pattern (@6)
@6 7..8 > > > > > > identifier `a`
@7 16..32 > > > property @8.@9
@8 16..24 > > > > identifier `duration`
@9 25..32 > > > > prop name `between`
@10 32..42 > > error >>(a, b) < 4<<
@11 43..51 > > RETURN projections=[@12]
@12 50..51 > > > projection expression=@13
@13 50..51 > > > > identifier `a`
The expected behavior would be to parse this as a function application with duration.between as the function name.
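To make the expected behavior concrete, here is a minimal sketch of the lookahead involved: a dotted name counts as a function name only when it is immediately followed by an opening parenthesis. This is a standalone illustration, not the library's grammar; the function name is hypothetical, and it ignores backtick-quoted and non-ASCII identifiers.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

static bool is_ident_start(unsigned char c) { return isalpha(c) || c == '_'; }
static bool is_ident_part(unsigned char c) { return isalnum(c) || c == '_'; }

/* If s begins with a (possibly dotted) name that is followed by '(',
 * return the length of that name; otherwise return 0. Spaces and tabs
 * between the name and '(' are allowed. */
size_t qualified_call_name_length(const char *s)
{
    size_t i = 0;

    for (;;) {
        /* Each segment must be a well-formed identifier. */
        if (!is_ident_start((unsigned char)s[i]))
            return 0;
        while (is_ident_part((unsigned char)s[i]))
            i++;
        if (s[i] != '.')
            break;
        i++; /* consume '.' and expect another segment */
    }

    size_t name_len = i;
    while (s[i] == ' ' || s[i] == '\t')
        i++;

    /* Only a following '(' makes this a call rather than a property access. */
    return (s[i] == '(') ? name_len : 0;
}
```

For the input duration.between(a, b) < 4 this reports a 16-character function name, while n.num = 1 yields 0 and would be left to property-access parsing.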
libcypher-parser/lib/test/check_libcypher-parser_suite.c:1:10: fatal error: check.h: No such file or directory
1 | #include <check.h>
| ^~~~~~~~~
https://stackoverflow.com/questions/63697460/installed-check-for-c-but-check-h-not-found
I saw this post. Do I need to add any cmake arguments to get this working? I installed check by running sudo apt install check.
I have this in the CMakeLists.txt:
SET(CMAKE_C_FLAGS "-lpthread -lX11 -ldrm -lcheck -lm -lrt -lsubunit")
Thank you in advance!
For
MATCH p=((anna)-[:FriendOf*]->(bob))
RETURN p
cypher-lint gives:
<stdin>:1:10: Invalid input '(': expected an identifier, a label, '{', a parameter or ')'
MATCH p=((anna)-[:FriendOf*]->(bob))
^
but Neo4j seems to accept it alright. I'm running the latest release, not master, so I'm sorry if this has been fixed already!
#26 addresses this, I think?
I've posted a question on Stack Overflow, and I'm cross-posting it here since I could not use the libcypher-parser tag.
https://stackoverflow.com/questions/54573401/does-libcypher-parser-has-support-for-regex-operator
A more succinct example:
echo "RETURN word =~ '.*'" | cypher-lint -a
<stdin>:1:14: Invalid input '~': expected NOT, '+', '-', TRUE, FALSE, NULL, "...string...", a float, an integer, '[', a parameter, '{', CASE, FILTER, EXTRACT, REDUCE, ALL, ANY, NONE, SINGLE, shortestPath, allShortestPaths, '(', a function name or an identifier
RETURN word =~ '.*'
^
@0 0..20 statement body=@1
@1 0..20 > query clauses=[@2]
@2 0..12 > > RETURN projections=[@3]
@3 7..12 > > > projection expression=@4
@4 7..11 > > > > identifier `word`
@5 12..20 > > error >>=~ '.*'\n<<
I can see references to a CYPHER_OP_REGEX in the code, but I cannot find why it's not parsing as expected.
What is the recommended way to set parameters in libcypher-parser?
":param x => [1,2];"
results in:
@0 0..17 command name=@1, args=[@2, @3, @4]
@1 1..6 > string "param"
@2 7..8 > string "x"
@3 9..11 > string "=>"
@4 12..17 > string "[1,2]"
Looking at the client-arg-string definition in the source, it seems that arguments are always interpreted as simple strings. Would it be difficult to parse parameter values as atoms instead, or does that pose issues I haven't considered?
Hi @cleishm,
Thanks very much for the annotation API! I've started migrating our anonymous identifiers to leverage it, and it's working very smoothly so far.
Now that RedisGraph is synced with master here, we're running afoul of a safety check you introduced in 8c76cc5. Specifically, the cypher_ast_query children check:
libcypher-parser/lib/src/ast_query.c
Lines 51 to 52 in 776d96e
To differentiate between scopes in multi-part queries, we invoke this constructor to make sub-ASTs that only represent a slice of clauses (punctuated by WITH and RETURN clauses). With appropriate shame, I admit that this approach involves a bit of dodgy casting:
cypher_astnode_t *clauses[n];
for (uint i = 0; i < n; i++) {
    clauses[i] = (cypher_astnode_t *)cypher_ast_query_get_clause(master_ast->root, i + start_offset);
}
struct cypher_input_range range = {};
ast->root = cypher_ast_query(NULL, 0, (cypher_astnode_t *const *)clauses, n, NULL, 0, range);
This now produces the error:
ast_query.c:52: cypher_ast_query: Assertion 'nchildren >= nclauses' failed.
We can avoid this by also passing the same variables as children, but this causes a lot of unnecessary allocations.
Do you have any advice for how we could construct ephemeral sub-ASTs to represent a slice of clauses in a more orthodox way?
Thank you!
Neo4j 4 offers new syntax for database management that I suppose libcypher-parser doesn’t know about yet.
Running libcypher-parser 0.6.2 via @majensen’s fork majensen/libneo4j-client@7bd6dfd:
neo4j> :status
Connected to 'neo4j://neo4j@localhost:7687' (insecure) [Neo4j/4.2.1]
neo4j> :dbname system
db set
neo4j> SHOW DEFAULT DATABASE;
"name","address","role","requestedStatus","currentStatus","error"
"neo4j","localhost:7687","standalone","online","online",""
<interactive>:1:2: error: Invalid input 'H': expected 't/T' or 'e/E'
SHOW DEFAULT DATABASE
▲
Cypher has some restrictions in its grammar file regarding clause sequences. For example, in a single-part query, a reading clause cannot follow a writing clause, and no clause can follow RETURN.
However, libcypher-parser allows these constructions and builds ASTs for queries such as:
CREATE (a) MATCH (a)
RETURN 1 DELETE a
RETURN 1 RETURN 1
Shouldn't these constructions trigger parser errors?
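Until the parser enforces this, a client could apply the two rules named above as a pass over the clause list. The following is a minimal sketch under stated assumptions: the clause classification enum and function name are hypothetical (not part of libcypher-parser), WITH is assumed to start a new query part and so resets the writing state, and UNION is not handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clause classification; a real client would derive this
 * from the AST node type of each clause. */
enum clause_kind {
    CLAUSE_READING, /* e.g. MATCH, UNWIND */
    CLAUSE_WRITING, /* e.g. CREATE, DELETE, SET */
    CLAUSE_WITH,
    CLAUSE_RETURN
};

/* Return the index of the first clause that violates the ordering rules,
 * or n if the whole sequence is acceptable. */
size_t first_invalid_clause(const enum clause_kind *clauses, size_t n)
{
    bool seen_writing = false;
    bool seen_return = false;

    for (size_t i = 0; i < n; i++) {
        /* Rule: no clause can follow RETURN. */
        if (seen_return)
            return i;

        switch (clauses[i]) {
        case CLAUSE_READING:
            /* Rule: a reading clause cannot follow a writing clause. */
            if (seen_writing)
                return i;
            break;
        case CLAUSE_WRITING:
            seen_writing = true;
            break;
        case CLAUSE_WITH:
            /* Assumption: WITH begins a new part, so reads are allowed again. */
            seen_writing = false;
            break;
        case CLAUSE_RETURN:
            seen_return = true;
            break;
        }
    }
    return n;
}
```

Applied to the three example queries above, the check flags the second clause in each case.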
This works:
MATCH (a:Node) RETURN abc("args1") AS a;
output:
@0 0..40 statement body=@1
@1 0..40 > query clauses=[@2, @8]
@2 0..15 > > MATCH pattern=@3
@3 6..14 > > > pattern paths=[@4]
@4 6..14 > > > > pattern path (@5)
@5 6..14 > > > > > node pattern (@6:@7)
@6 7..8 > > > > > > identifier `a`
@7 8..13 > > > > > > label :`Node`
@8 15..39 > > RETURN projections=[@9]
@9 22..39 > > > projection expression=@10, alias=@13
@10 22..34 > > > > apply @11(@12)
@11 22..25 > > > > > function name `abc`
@12 26..33 > > > > > string "args1"
@13 38..39 > > > > identifier `a`
But this doesn't:
MATCH (a:Node) RETURN abc.fn("args1") AS a;
output:
<stdin>:1:29: Invalid input '(': expected '.', AND, OR, XOR, NOT, '=~', '=', '<>', '+', '-', '*', '/', '%', '^', IN, CONTAINS, STARTS WITH, ENDS WITH, '<=', '>=', '<', '>', IS NULL, IS NOT NULL, '[', '{', a label, AS, ',', ORDER BY, SKIP, LIMIT, ';' or a clause
MATCH (a:Node) RETURN abc.fn("args1") AS a;
^
@0 0..43 statement body=@1
@1 0..43 > query clauses=[@2, @8]
@2 0..15 > > MATCH pattern=@3
@3 6..14 > > > pattern paths=[@4]
@4 6..14 > > > > pattern path (@5)
@5 6..14 > > > > > node pattern (@6:@7)
@6 7..8 > > > > > > identifier `a`
@7 8..13 > > > > > > label :`Node`
@8 15..28 > > RETURN projections=[@9]
@9 22..28 > > > projection expression=@10, alias=@13
@10 22..28 > > > > property @11.@12
@11 22..25 > > > > > identifier `abc`
@12 26..28 > > > > > prop name `fn`
@13 22..28 > > > > identifier `abc.fn`
@14 28..42 > > error >>("args1") AS a<<
I think apply should have higher priority than property.
I have seen a lot of incorrect behaviors when parsing queries from LDBC and the TCK.
Could you expose the version numbers as #define macros? (example: https://github.com/python/cpython/blob/master/Include/patchlevel.h)
This way users would be able to write code that works with different versions of the library.
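A sketch of what such macros could look like and how client code might branch on them follows. All macro names here are hypothetical (the library does not necessarily define anything like them), and the version values are purely illustrative.

```c
/* Hypothetical version macros, in the style of CPython's patchlevel.h.
 * These names are illustrative and are NOT defined by libcypher-parser. */
#define EXAMPLE_VERSION_MAJOR 0
#define EXAMPLE_VERSION_MINOR 6
#define EXAMPLE_VERSION_PATCH 2

/* Pack a version triple into one comparable integer. Assumes minor and
 * patch each stay below 100. */
#define EXAMPLE_VERSION_NUMBER(major, minor, patch) \
    (((major) * 10000) + ((minor) * 100) + (patch))

#define EXAMPLE_VERSION \
    EXAMPLE_VERSION_NUMBER(EXAMPLE_VERSION_MAJOR, EXAMPLE_VERSION_MINOR, \
                           EXAMPLE_VERSION_PATCH)

/* Client code can then select an implementation at compile time. */
#if EXAMPLE_VERSION >= EXAMPLE_VERSION_NUMBER(0, 6, 0)
#define EXAMPLE_HAS_NEWER_API 1
#else
#define EXAMPLE_HAS_NEWER_API 0
#endif

/* Report which branch was compiled in. */
int example_has_newer_api(void)
{
    return EXAMPLE_HAS_NEWER_API;
}
```

Because the comparison happens in the preprocessor, code paths that reference functions missing from an older release are never compiled against it.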
Great job! Thank you so much for sharing the cypher-parser. Is it possible to enable fulltext?