andremm / lua-parser
A Lua 5.3 parser written with LPegLabel
License: MIT License
I think using a table representation for booleans, like this:
{
false,
tag = 'Boolean',
}
would be more consistent with Lua's type system and with other tags, such as Number or String.
I have a question: the AST is mind-blowing to loop over 😂
Printed as a string, we have this:
[09:01 ob ]$ ob
{ Set{ {
Id "main" }, { Function{ { }, {
Set{ { Id "select" }, {
Call{ Index{
Id "gg", String "choice" },
Table{ String "test1",
String "test2" } } } }, If{
Op{ "eq", Id "select",
Number "1" }, { Call{
Index{ Id "test",
String "show" } } } }, If{
Op{ "eq", Id "select",
Number "2" }, { Call{
Index{ Id "test",
String "show2" } } } } } } } }, Set{ {
Id "a" }, { Number "1" } },
Set{ { Id "b" }, {
String "3(£+£+£+£+" } }, Set{ {
Id "test" }, { Table{
Pair{ String "show",
Function{ { }, { Call{
Index{ Id "gg",
String "alert" }, String "find me" } } } },
Pair{ String "show2",
Function{ { }, { Call{
Index{ Id "gg",
String "alert" }, String "yay me" } } } } } } },
Call{ `Id "main" } }
And I have this as a table:
{ { { { "main",
pos = 10,
tag = "Id"
} }, { { {}, { { { { "select",
pos = 21,
tag = "Id"
},
pos = 21,
tag = "VarList"
}, { { { { "gg",
pos = 30,
tag = "Id"
}, { "choice",
pos = 33,
tag = "String"
},
pos = 30,
tag = "Index"
}, { { "test1",
pos = 51,
tag = "String"
}, { "test2",
pos = 69,
tag = "String"
},
pos = 40,
tag = "Table"
},
pos = 30,
tag = "Call"
},
pos = 30,
tag = "ExpList"
},
pos = 21,
tag = "Set"
}, { { "eq", { "select",
pos = 91,
tag = "Id"
}, { 1,
pos = 101,
tag = "Number"
},
pos = 91,
tag = "Op"
}, { { { { "test",
pos = 108,
tag = "Id"
}, { "show",
pos = 113,
tag = "String"
},
pos = 108,
tag = "Index"
},
pos = 108,
tag = "Call"
},
pos = 108,
tag = "Block"
},
pos = 88,
tag = "If"
}, { { "eq", { "select",
pos = 131,
tag = "Id"
}, { 2,
pos = 141,
tag = "Number"
},
pos = 131,
tag = "Op"
}, { { { { "test",
pos = 148,
tag = "Id"
}, { "show2",
pos = 153,
tag = "String"
},
pos = 148,
tag = "Index"
},
pos = 148,
tag = "Call"
},
pos = 148,
tag = "Block"
},
pos = 128,
tag = "If"
},
pos = 21,
tag = "Block"
},
pos = 14,
tag = "Function"
} },
pos = 1,
tag = "Set"
}, { { { "a",
pos = 177,
tag = "Id"
},
pos = 177,
tag = "VarList"
}, { { 1,
pos = 181,
tag = "Number"
},
pos = 181,
tag = "ExpList"
},
pos = 177,
tag = "Set"
}, { { { "b",
pos = 187,
tag = "Id"
},
pos = 187,
tag = "VarList"
}, { { "3(£+£+£+£+",
pos = 191,
tag = "String"
},
pos = 191,
tag = "ExpList"
},
pos = 187,
tag = "Set"
}, { { { "test",
pos = 212,
tag = "Id"
},
pos = 212,
tag = "VarList"
}, { { { { "show",
pos = 229,
tag = "String"
}, { {}, { { { { "gg",
pos = 259,
tag = "Id"
}, { "alert",
pos = 262,
tag = "String"
},
pos = 259,
tag = "Index"
}, { "find me",
pos = 268,
tag = "String"
},
pos = 259,
tag = "Call"
},
pos = 259,
tag = "Block"
},
pos = 244,
tag = "Function"
},
pos = 229,
tag = "Pair"
}, { { "show2",
pos = 300,
tag = "String"
}, { {}, { { { { "gg",
pos = 331,
tag = "Id"
}, { "alert",
pos = 334,
tag = "String"
},
pos = 331,
tag = "Index"
}, { "yay me",
pos = 340,
tag = "String"
},
pos = 331,
tag = "Call"
},
pos = 331,
tag = "Block"
},
pos = 316,
tag = "Function"
},
pos = 300,
tag = "Pair"
},
pos = 219,
tag = "Table"
},
pos = 219,
tag = "ExpList"
},
pos = 212,
tag = "Set"
}, { { "main",
pos = 372,
tag = "Id"
},
pos = 372,
tag = "Call"
},
pos = 1,
tag = "Block"
}
How can I loop over all of this?
I want to get all:
function names
variable names
The script looks like this:
function main() select = gg.choice({ "test1", "test2" }) if select == 1 then test.show() end if select == 2 then test.show2() end end a = 1 b = "3(£+£+£+£+" test = { show = function() gg.alert("find me") end, show2 = function() gg.alert("yay me") end } main()
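One possible way to loop over such a tree is a small recursive walker. The sketch below is not part of lua-parser; `walk` and `collect_names` are hypothetical helper names. It assumes only the node shape shown in the table dump above: each node is a table with a `tag` field and its children in the array part, and `Set`, `Local` and `Localrec` nodes hold a list of targets in `[1]` and a list of values in `[2]`.

```lua
-- Visit every table node in the AST, parents before children.
-- Non-table entries (operator names, identifier strings, numbers) are skipped.
local function walk(node, visit)
  if type(node) ~= "table" then return end
  visit(node)
  for _, child in ipairs(node) do
    walk(child, visit)
  end
end

-- Collect assigned variable names and the names of functions that are
-- either assigned to a plain identifier or stored under a string key
-- in a table constructor.
local function collect_names(ast)
  local variables, functions = {}, {}
  walk(ast, function(node)
    local tag = node.tag
    if tag == "Set" or tag == "Local" or tag == "Localrec" then
      local targets, values = node[1], node[2] or {}
      for i, target in ipairs(targets) do
        if target.tag == "Id" then
          variables[#variables + 1] = target[1]
          if values[i] and values[i].tag == "Function" then
            functions[#functions + 1] = target[1]
          end
        end
      end
    elseif tag == "Pair" then
      local key, value = node[1], node[2]
      if key.tag == "String" and value.tag == "Function" then
        functions[#functions + 1] = key[1]
      end
    end
  end)
  return variables, functions
end
```

With an `ast` returned by `parser.parse`, calling `collect_names(ast)` on the script above should report `main`, `select`, `a`, `b` and `test` as variables, and `main`, `show` and `show2` as functions. Assignments to indexed targets such as `t.x = 1` are ignored by this sketch.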
Your example produces the same syntax tree for two very different code snippets:
a=1>=4
a=1<=4
Both produce:
{ `Set{ { `Id "a" }, { `Op{ "le", `Number "1", `Number "4" } } } }
Hello,
I enjoy your lua-parser!
I started to test it and it failed, because I'm on a quite old Debian with LPeg 0.10.
I decided to check with LuLPeg (LPeg 0.12).
All tests passed with lua5.1, luajit (5.1) and lua5.2!
I then discovered that the tests usually pass, but sometimes fail.
Always at the assert on line 598.
I added some debug prints and got different error messages, but most of the time it raises the funny message: syntax error, unexpected 'do', expecting 'do', ...
I only got failures with lua5.1 (not lua5.2).
EDIT: I also got the same error with luajit (2.0.3, compat 5.1).
The full logs:
tst2005/lua-parser$ while lua5.1 test.lua; do sleep 5;done
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
assert(r==e) r: string test.lua:1:1: syntax error, unexpected 'while', expecting 'return', '(', 'Name', 'goto', 'break', '::', 'local', 'function', 'repeat', 'for', 'do', '[', 'if', ';'
e: { `While{ `Number "1", { `Break } } }
string s:
lua5.1: ERROR: assert fail at line ~598
stack traceback:
[C]: in function 'error'
test.lua:645: in main chunk
[C]: ?
tst2005/lua-parser$ while lua5.1 test.lua; do sleep 5;done
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
assert(r==e) r: string test.lua:1:9: syntax error, unexpected 'do', expecting 'do', 'or', 'and', '>', '<', '>=', '<=', '==', '~=', '|', '~', '&', '>>', '<<', '..', '-', '+', '%', '/', '//', '*', '^'
e: { `While{ `Number "1", { `Break } } }
string s:
lua5.1: ERROR: assert fail at line ~598
stack traceback:
[C]: in function 'error'
test.lua:645: in main chunk
[C]: ?
tst2005/lua-parser$ while lua5.1 test.lua; do sleep 5;done
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
OK
> testing lexer...
> testing parser...
assert(r==e) r: string test.lua:1:9: syntax error, unexpected 'do', expecting 'do', 'or', 'and', '>', '<', '>=', '<=', '==', '~=', '|', '~', '&', '>>', '<<', '..', '-', '+', '%', '/', '//', '*', '^'
e: { `While{ `Number "1", { `Break } } }
string s:
lua5.1: ERROR: assert fail at line ~598
stack traceback:
[C]: in function 'error'
test.lua:645: in main chunk
[C]: ?
tst2005/lua-parser$
Create a directory in /tmp and cd into it.
pip install hererocks
hererocks . -r^ --lua=5.3
./bin/luarocks install lua-parser
Save the example from the README as parse.lua:
local parser = require "lua-parser.parser"
local pp = require "lua-parser.pp"
if #arg ~= 1 then
print("Usage: parse.lua <string>")
os.exit(1)
end
local ast, error_msg = parser.parse(arg[1], "example.lua")
if not ast then
print(error_msg)
os.exit(1)
end
pp.print(ast)
os.exit(0)
./bin/lua parse.lua "for i=1, 10 do print(i)"
example.lua:1:24: syntax error, expected 'end' to close the for loop
./bin/lua: ./share/lua/5.3/lua-parser/parser.lua:478: attempt to get length of a number value (local 'sfail')
stack traceback:
./share/lua/5.3/lua-parser/parser.lua:478: in function 'lua-parser.parser.parse'
parse.lua:9: in main chunk
[C]: in ?
When given the string
"\\a"
the fix_str() function turns it into
"\\\a"
instead of the correct
"\a"
The same happens for all escaped characters (\b, \n, \r, etc.).
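A minimal sketch of one way to avoid this kind of double processing is to resolve escapes in a single left-to-right pass, so that an escaped backslash is consumed before the character after it is looked at. This is not the library's `fix_str`; `unescape` and `simple_escapes` are hypothetical names, and only the single-character escapes are handled (no `\ddd`, `\x`, `\z` or `\u{...}`).

```lua
-- Single-character escapes and the characters they stand for.
local simple_escapes = {
  a = "\a", b = "\b", f = "\f", n = "\n", r = "\r", t = "\t", v = "\v",
  ["\\"] = "\\", ['"'] = '"', ["'"] = "'",
}

-- Replace each backslash + character pair exactly once, left to right.
-- Unknown escapes are left untouched, because gsub keeps the original
-- match when the replacement function returns nil.
local function unescape(s)
  return (s:gsub("\\(.)", function(c)
    return simple_escapes[c]
  end))
end
```

Because each pair is consumed once, the source text `\\a` becomes a backslash followed by the letter `a`, while `\a` alone becomes the bell character.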
Hi Andre, there may be something wrong in binaryOp in lua-parser/parser.lua:
local function binaryOp (e1, op, e2)
if not op then
return e1
end
local node = { tag = "Op", pos = e1.pos, [1] = op, [2] = e1, [3] = e2 }
if op == "ne" then
node[1] = "eq"
node = unaryOp("not", node)
elseif op == "gt" then
node[1], node[2], node[3] = "lt", e2, e1 --<-- problem
elseif op == "ge" then
node[1], node[2], node[3] = "le", e2, e1 --<-- problem
end
return node
end
$ cat t.lua
gl_f_ct = 0
function f()
if gl_f_ct <= 0 then
gl_f_ct=1
return 1000
end
return -1000
end
print( f("1st call") > f("2nd call") ) --> in lua-parser's ast: lt f("2nd call") f("1st call") | wrong
gl_f_ct = 0
print( f("1st call") < f("2nd call") ) --> in lua-parser's ast: lt f("1st call") f("2nd call") | right
$ luajit -bl t.lua
-- BYTECODE -- t.lua:3-9
0001 GGET 0 0 ; "gl_f_ct"
0002 KSHORT 1 0
0003 ISGT 0 1
0004 JMP 0 => 0009
0005 KSHORT 0 1
0006 GSET 0 0 ; "gl_f_ct"
0007 KSHORT 0 1000
0008 RET1 0 2
0009 => KSHORT 0 -1000
0010 RET1 0 2
-- BYTECODE -- t.lua:0-15
0001 KSHORT 0 0
0002 GSET 0 0 ; "gl_f_ct"
0003 FNEW 0 1 ; t.lua:3
0004 GSET 0 2 ; "f"
0005 GGET 0 3 ; "print"
0006 GGET 1 2 ; "f"
0007 KSTR 2 4 ; "1st call" <-- right
0008 CALL 1 2 2
0009 GGET 2 2 ; "f"
0010 KSTR 3 5 ; "2nd call" <-- right
0011 CALL 2 2 2
0012 ISLT 2 1
0013 JMP 1 => 0016
0014 KPRI 1 1
0015 JMP 2 => 0017
0016 => KPRI 1 2
0017 => CALL 0 1 2
0018 KSHORT 0 0
0019 GSET 0 0 ; "gl_f_ct"
0020 GGET 0 3 ; "print"
0021 GGET 1 2 ; "f"
0022 KSTR 2 4 ; "1st call" <-- right
0023 CALL 1 2 2
0024 GGET 2 2 ; "f"
0025 KSTR 3 5 ; "2nd call" <-- right
0026 CALL 2 2 2
0027 ISLT 1 2
0028 JMP 1 => 0031
0029 KPRI 1 1
0030 JMP 2 => 0032
0031 => KPRI 1 2
0032 => CALL 0 1 2
0033 RET0 0 1
$ lua parse_str.lua 'print( f("1st call") > f("2nd call") ) ; print( f("1st call") < f("2nd call") )'
{
[tag] = Block
[pos] = 1
[1] = {
[tag] = Call
[pos] = 1
[1] = {
[tag] = Id
[pos] = 1
[1] = print
}
[2] = {
[tag] = Op
[pos] = 8
[1] = lt
[2] = {
[tag] = Call
[pos] = 24
[1] = {
[tag] = Id
[pos] = 24
[1] = f
}
[2] = {
[tag] = String
[pos] = 26
[1] = 2nd call <-- wrong
}
}
[3] = {
[tag] = Call
[pos] = 8
[1] = {
[tag] = Id
[pos] = 8
[1] = f
}
[2] = {
[tag] = String
[pos] = 10
[1] = 1st call <-- wrong
}
}
}
}
[2] = {
[tag] = Call
[pos] = 42
[1] = {
[tag] = Id
[pos] = 42
[1] = print
}
[2] = {
[tag] = Op
[pos] = 49
[1] = lt
[2] = {
[tag] = Call
[pos] = 49
[1] = {
[tag] = Id
[pos] = 49
[1] = f
}
[2] = {
[tag] = String
[pos] = 51
[1] = 1st call <-- right
}
}
[3] = {
[tag] = Call
[pos] = 65
[1] = {
[tag] = Id
[pos] = 65
[1] = f
}
[2] = {
[tag] = String
[pos] = 67
[1] = 2nd call <-- right
}
}
}
}
}
I think the AST shouldn't be shaped too much by the constraints of a later IR phase. There are two possible solutions:
Add 'gt' and 'ge' to the opid (which is used in PR11).
Use the transformation below to solve it lazily (though it would add some unnecessary complexity):
( f1 > f2 ) == ( not ( f1 <= f2 ) )
( f1 >= f2 ) == ( not ( f1 < f2 ) )
Thanks a lot and all best :)
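The first option could look roughly like the sketch below, which keeps `gt` and `ge` as their own operators so the operands stay in source order. It is an illustration, not the project's actual fix; the `unaryOp` helper here is an assumed stand-in, defined only to make the snippet self-contained.

```lua
-- Assumed stand-in for the parser's unaryOp helper.
local function unaryOp(op, e)
  return { tag = "Op", pos = e.pos, [1] = op, [2] = e }
end

-- Variant of binaryOp that never swaps e1 and e2, so the AST preserves
-- the left-to-right evaluation order of the source.
local function binaryOp(e1, op, e2)
  if not op then
    return e1
  end
  local node = { tag = "Op", pos = e1.pos, [1] = op, [2] = e1, [3] = e2 }
  if op == "ne" then
    -- a ~= b is still rewritten as not (a == b); operand order is unchanged.
    node[1] = "eq"
    node = unaryOp("not", node)
  end
  return node
end
```

Any consumer of the AST (pretty printer, later IR phases) would then have to accept the two extra operator names.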
Without this commit, the command line below simply throws an error.
$ cat parse.lua
local parser = require "lua-parser.parser"
local pp = require "lua-parser.pp"
if #arg ~= 1 then
print("Usage: parse.lua <string>")
os.exit(1)
end
local ast, error_msg = parser.parse(arg[1], "example.lua")
if not ast then
print(error_msg)
os.exit(1)
end
pp.dump(ast)
os.exit(0)
$ lua parse.lua "x=2;local function x() end"
lua: /usr/share/lua/5.1/lua-parser/pp.lua:314: bad argument #3 to 'format' (string expected, got nil)
stack traceback:
[C]: in function 'format'
/usr/share/lua/5.1/lua-parser/pp.lua:314: in function 'dump'
/usr/share/lua/5.1/lua-parser/pp.lua:319: in function 'dump'
/usr/share/lua/5.1/lua-parser/pp.lua:319: in function 'dump'
parse_str.lua:54: in main chunk
[C]: ?
Expected output after this commit:
{
[tag] = Block
[pos] = 1
[1] = {
[tag] = Set
[pos] = 1
[1] = {
[tag] = VarList
[pos] = 1
[1] = {
[tag] = Id
[pos] = 1
[1] = x
}
}
[2] = {
[tag] = ExpList
[pos] = 3
[1] = {
[tag] = Number
[pos] = 3
[1] = 2
}
}
}
[2] = {
[tag] = Localrec
[pos] = 11
[1] = {
[1] = {
[tag] = Id
[pos] = 20
[1] = x
}
}
[2] = {
[1] = {
[tag] = Function
[pos] = 21
[1] = {
}
[2] = {
[tag] = Block
[pos] = 24
}
}
}
}
}
I don't know whether this is just a small implementation problem in pp.dump or is actually due to an inconsistency in the AST generated by the parser.
Thanks a lot and all best to you. lua-parser is a pretty cool project 👍
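For comparison, a dump routine can be written so that it tolerates nodes without `tag` or `pos`, such as the two inner lists of the `Localrec` node in the expected output above. The following is a hypothetical sketch that mimics that output format; it is not the code in pp.lua.

```lua
-- Append the printed fields of `node` to `lines`, one per line.
-- Missing tag/pos fields are simply skipped instead of being formatted.
local function dump_lines(node, indent, lines)
  local inner = indent .. "  "
  if node.tag ~= nil then
    lines[#lines + 1] = inner .. "[tag] = " .. tostring(node.tag)
  end
  if node.pos ~= nil then
    lines[#lines + 1] = inner .. "[pos] = " .. tostring(node.pos)
  end
  for i, child in ipairs(node) do
    if type(child) == "table" then
      lines[#lines + 1] = inner .. "[" .. i .. "] = {"
      dump_lines(child, inner, lines)
      lines[#lines + 1] = inner .. "}"
    else
      lines[#lines + 1] = inner .. "[" .. i .. "] = " .. tostring(child)
    end
  end
end

-- Return the whole dump as a single string.
local function dump(ast)
  local lines = { "{" }
  dump_lines(ast, "", lines)
  lines[#lines + 1] = "}"
  return table.concat(lines, "\n")
end
```

Calling `print(dump(ast))` on the tree for `x=2;local function x() end` should then print the untagged lists as plain `{ ... }` groups rather than raising an error.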
lp = require 'lua-parser.parser'
lp.parse('::xx::; goto xxx')
parser.lua:75: bad argument #2 to 'format' (string expected, got nil)
lua5.3: /usr/local/share/lua/5.3/lua-parser/parser.lua:478: attempt to get length of a number value (local 'sfail')
stack traceback:
/usr/local/share/lua/5.3/lua-parser/parser.lua:478: in function 'lua-parser.parser.parse'
<my code>
[C]: in ?
require 'luarocks.loader'
local parser = require 'lua-parser.parser'
local source = "while true do"
local ast, err = parser.parse(source, "sourcefile.lua")
Seems like this happens with any invalid syntax.
luarocks 3.0.4
lua-parser 1.0.0-1
LPegLabel 1.5.0-1
Lua 5.3.3
This issue does not occur with lua-parser 0.1.1-1. The parse function returns normally and the error message is sourcefile.lua:1:13: syntax error, unexpected 'EOF', expecting 'end', 'return', '(', 'Name', 'goto', 'break', '::', 'local', 'function', 'repeat', 'for', 'do', 'while', 'if', ';'
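Until the crash is fixed, a caller can guard against it with `pcall`. The wrapper below is a sketch of such a workaround, not a fix for the underlying bug; `safe_parse` is a hypothetical name, and the parse function is passed in as an argument so the snippet does not depend on a particular lua-parser version.

```lua
-- Call parse(source, filename) under pcall.
-- Returns the AST on success, or nil plus a message when the parser
-- either reports a syntax error or raises a Lua error internally.
local function safe_parse(parse, source, filename)
  local ok, ast, err = pcall(parse, source, filename)
  if not ok then
    -- On failure, pcall puts the raised error in the second return value.
    return nil, "parser crashed: " .. tostring(ast)
  end
  return ast, err
end
```

For the case above this would be used as `safe_parse(parser.parse, source, "sourcefile.lua")`. The original syntax error message is lost when the parser raises, so this only keeps the calling program alive.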
I've run into a problem reproducing the original Lua source code from the AST. There is a pos field in each AST node, but would it be possible to add optional code context? For example:
I have the following original code:
if a == 3 and b then end
and the corresponding AST for it:
{ { { "and", { "eq", { "a",
pos = 4,
tag = "Id"
}, { 3,
pos = 9,
tag = "Number"
},
pos = 4,
tag = "Op"
}, { "b",
pos = 15,
tag = "Id"
},
pos = 4,
tag = "Op"
}, {
pos = 22,
tag = "Block"
},
pos = 1,
tag = "If"
},
pos = 1,
tag = "Block"
}
I want to restore the original code of the if condition. On the one hand, I believe I can do it via a breadth-first search on the AST, but some information can be missing: some binary comparison operators, such as >= and <=, are transformed in the AST in a non-reversible way, Lua comments are dropped, and so on.
So, as a result, there are a bunch of questions:
not a == b, etc)
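As long as the original source string is still available, the existing `pos` fields may already be enough to recover the condition text by slicing the source instead of regenerating it from the tree. The sketch below makes several assumptions: `pos` is the 1-based offset of a node's first character (which matches the AST above), the `Block` node's `pos` points at the first token after `then`, and there is no comment between the condition and the block. `if_condition_source` is a hypothetical helper name.

```lua
-- Return the source text of the first condition of an If node.
-- src is the original source string that was parsed.
local function if_condition_source(src, if_node)
  local cond, block = if_node[1], if_node[2]
  -- Everything from the start of the condition up to the block start.
  local text = src:sub(cond.pos, block.pos - 1)
  -- Drop the trailing `then` keyword and the whitespace around it.
  text = text:gsub("%s*then%s*$", "")
  return text
end
```

For `if a == 3 and b then end` with the AST above (condition at position 4, block at position 22), this returns the text `a == 3 and b`, including the operators exactly as they were written.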