Comments (16)
```
def test ()
print hello

def another ()
print hello 2
```
What is the input? The above input text is illegal Python2 source code. (Try it in https://onecompiler.com/python2/42j94gnxp.) `def test()` does not end in a colon, and `print hello` is not indented within the definition of `test()`.
Further, we cannot tell whether you are using `\n`, `\r\n`, or `\n\r` newline character sequences. It's only possible to know which if you attach a .txt file. In lieu of that, please edit the above comment with the input nested in a triple-backtick quoted block. See https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-code.
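One quick way to tell which newline convention a file uses is to count the raw byte sequences. A small sketch (plain Python, run here on an in-memory sample rather than an attached file):

```python
# Count raw newline byte sequences to see which convention a file uses.
# Read real files in binary mode ("rb") so nothing gets translated.
data = b"def test ()\r\nprint hello\r\n"  # stand-in for the attached file's bytes

crlf = data.count(b"\r\n")
lf_only = data.count(b"\n") - crlf   # bare \n not part of a \r\n pair
cr_only = data.count(b"\r") - crlf   # bare \r not part of a \r\n pair
print(crlf, lf_only, cr_only)  # -> 2 0 0
```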
from grammars-v4.
Thank you kaby76 for suggesting to put the input in a triple-backtick quoted block.
Adding on to Himanshu's issue: in the code snippet below, function test() starts at line 1 and ends at line 3 at the print statement, but the ANTLR Python 2.7.18 grammar reports the end line of the test() function as the start of the next function, greet(), which is at line 5.

```
def test():
    xxx=1
    print xxx

def greet():
    print 'Hello World'

greet();
```
The DEDENT token is placed on line 5 because that is where it is detected.
Also try Python's tokenizer:

```
python -m tokenize test.py -e
```

It also places the DEDENT token on line 5.
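The same check can be scripted with Python 3's stdlib `tokenize` module. (Tokenization happens before parsing, so the Python 2-only `print xxx` statement still tokenizes fine.) A small sketch:

```python
import io
import tokenize

# The example input, with the blank lines the original file contained.
src = (
    "def test():\n"
    "    xxx=1\n"
    "    print xxx\n"
    "\n"
    "def greet():\n"
    "    print 'Hello World'\n"
    "\n"
    "greet();\n"
)

dedent_lines = [
    tok.start[0]
    for tok in tokenize.generate_tokens(io.StringIO(src).readline)
    if tok.type == tokenize.DEDENT
]
print(dedent_lines)  # the first DEDENT lands on line 5, not line 3
```

So Python's own tokenizer agrees with the ANTLR grammar here.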
I agree, I'm not sure what the problem is here.
Input:
```
def test():
    xxx=1
    print xxx

def greet():
    print 'Hello World'

greet();
```
Or in file: xxx.txt.
The parse tree is:
( file_input
( stmt
( compound_stmt
( funcdef
( DEF
( text:'def' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( Attribute WS Value ' ' chnl:HIDDEN
)
( NAME
( text:'test' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( parameters
( LPAR
( text:'(' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( RPAR
( text:')' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) )
( COLON
( text:':' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( suite
( NEWLINE
( text:'\r\n' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( Attribute WS Value ' ' chnl:HIDDEN
)
( INDENT
( text:'<INDENT>' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( stmt
( simple_stmt
( small_stmt
( expr_stmt
( testlist
( test
( or_test
( and_test
( not_test
( comparison
( expr
( xor_expr
( and_expr
( shift_expr
( arith_expr
( term
( factor
( power
( atom
( NAME
( text:'xxx' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) )
( EQUAL
( text:'=' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( testlist
( test
( or_test
( and_test
( not_test
( comparison
( expr
( xor_expr
( and_expr
( shift_expr
( arith_expr
( term
( factor
( power
( atom
( NUMBER
( text:'1' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) )
( NEWLINE
( text:'\r\n' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) )
( stmt
( simple_stmt
( small_stmt
( print_stmt
( Attribute WS Value ' ' chnl:HIDDEN
)
( PRINT
( text:'print' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( test
( or_test
( and_test
( not_test
( comparison
( expr
( xor_expr
( and_expr
( shift_expr
( arith_expr
( term
( factor
( power
( atom
( Attribute WS Value ' ' chnl:HIDDEN
)
( NAME
( text:'xxx' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) )
( Attribute NEWLINE Value '\r\n' chnl:HIDDEN
)
( NEWLINE
( text:'\r\n' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) )
( DEDENT
( text:'<DEDENT>' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) ) ) )
( stmt
( compound_stmt
( funcdef
( DEF
( text:'def' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( Attribute WS Value ' ' chnl:HIDDEN
)
( NAME
( text:'greet' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( parameters
( LPAR
( text:'(' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( RPAR
( text:')' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) )
( COLON
( text:':' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( suite
( NEWLINE
( text:'\r\n' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( Attribute WS Value ' ' chnl:HIDDEN
)
( INDENT
( text:'<INDENT>' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( stmt
( simple_stmt
( small_stmt
( print_stmt
( PRINT
( text:'print' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( test
( or_test
( and_test
( not_test
( comparison
( expr
( xor_expr
( and_expr
( shift_expr
( arith_expr
( term
( factor
( power
( atom
( Attribute WS Value ' ' chnl:HIDDEN
)
( STRING
( text:''Hello World'' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) )
( Attribute NEWLINE Value '\r\n' chnl:HIDDEN
)
( Attribute WS Value ' ' chnl:HIDDEN
)
( NEWLINE
( text:'\r\n' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) )
( DEDENT
( text:'<DEDENT>' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) ) ) )
( stmt
( simple_stmt
( small_stmt
( expr_stmt
( testlist
( test
( or_test
( and_test
( not_test
( comparison
( expr
( xor_expr
( and_expr
( shift_expr
( arith_expr
( term
( factor
( power
( atom
( NAME
( text:'greet' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) )
( trailer
( LPAR
( text:'(' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( RPAR
( text:')' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) )
( SEMI
( text:';' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) )
( NEWLINE
( text:'<NEWLINE>' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) ) )
( EOF
( text:'' tt:0 chnl:DEFAULT_TOKEN_CHANNEL
) ) )
The tokens are:
[@0,0:2='def',<9>,1:0]
[@1,3:3=' ',<84>,channel=1,1:3]
[@2,4:7='test',<79>,1:4]
[@3,8:8='(',<34>,1:8]
[@4,9:9=')',<37>,1:9]
[@5,10:10=':',<40>,1:10]
[@6,11:12='\r\n',<82>,1:11]
[@7,13:16=' ',<84>,channel=1,2:0]
[@8,17:16='<INDENT>',<1>,2:4]
[@9,17:19='xxx',<79>,2:4]
[@10,20:20='=',<51>,2:7]
[@11,21:21='1',<80>,2:8]
[@12,22:23='\r\n',<82>,2:9]
[@13,24:27=' ',<84>,channel=1,3:0]
[@14,28:32='print',<27>,3:4]
[@15,33:33=' ',<84>,channel=1,3:9]
[@16,34:36='xxx',<79>,3:10]
[@17,37:38='\r\n',<82>,channel=1,3:13]
[@18,39:40='\r\n',<82>,4:0]
[@19,41:40='<DEDENT>',<2>,5:0]
[@20,41:43='def',<9>,5:0]
[@21,44:44=' ',<84>,channel=1,5:3]
[@22,45:49='greet',<79>,5:4]
[@23,50:50='(',<34>,5:9]
[@24,51:51=')',<37>,5:10]
[@25,52:52=':',<40>,5:11]
[@26,53:54='\r\n',<82>,5:12]
[@27,55:56=' ',<84>,channel=1,6:0]
[@28,57:56='<INDENT>',<1>,6:2]
[@29,57:61='print',<27>,6:2]
[@30,62:62=' ',<84>,channel=1,6:7]
[@31,63:75=''Hello World'',<81>,6:8]
[@32,76:77='\r\n',<82>,channel=1,6:21]
[@33,78:79=' ',<84>,channel=1,7:0]
[@34,80:81='\r\n',<82>,7:2]
[@35,82:81='<DEDENT>',<2>,8:0]
[@36,82:86='greet',<79>,8:0]
[@37,87:87='(',<34>,8:5]
[@38,88:88=')',<37>,8:6]
[@39,89:89=';',<42>,8:7]
[@40,90:89='<NEWLINE>',<82>,8:8]
[@41,90:89='<EOF>',<-1>,8:8]
According to the official Python2 grammar, https://docs.python.org/2.7/reference/grammar.html, a funcdef is `funcdef: 'def' NAME parameters ':' suite`. It extends from the first character 'd' of `def` all the way to the last character of `DEDENT`, since `suite` is defined as `suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT`.
If you want to get the interval for the statements within function "test()", then you have to get the last char of the 2nd `stmt`. The query below says there are two statements in function "test()":
$ trparse xxx.txt | trquery grep ' //stmt/compound_stmt/funcdef[NAME/text() = "test"]/suite/stmt' | trtext -c
CSharp 0 xxx.txt success 0.0428541
2
07/05-12:35:54 ~/issues/g4-new-csharp/python/python2_7_18/Generated-CSharp-0
$ trparse -l xxx.txt | trquery grep ' //stmt/compound_stmt/funcdef[NAME/text() = "test"]/suite/stmt[1]' | trcaret
CSharp 0 xxx.txt success 0.0425021
L2: xxx=1
^
07/05-12:36:00 ~/issues/g4-new-csharp/python/python2_7_18/Generated-CSharp-0
$ trparse -l xxx.txt | trquery grep ' //stmt/compound_stmt/funcdef[NAME/text() = "test"]/suite/stmt[2]' | trcaret
CSharp 0 xxx.txt success 0.0426761
L3: print xxx
^
The only thing that would be nice to change is the text for the INDENT and DEDENT tokens. They are `<INDENT>` and `<DEDENT>` respectively. But that text is inconsistent with the computed length of the token, which is end index - start index + 1 = 0. So, for the first INDENT token, `[@8,17:16='<INDENT>',<1>,2:4]`, the token's text is eight characters long while the token itself spans zero characters.
The "problem" is on the trtext side of things. trtext reconstructs the text of the input by concatenating the text of the leaves of the parse tree. So, I see `<INDENT>` and `<DEDENT>` sprinkled in the reconstructed text. I can easily remove these from the tree using `trquery delete`.
Thanks for bringing this to my attention.
I really forgot about that.
In other words, the token stream must ensure that the original source code can be restored.
And this is not possible with `<INDENT>` and `<DEDENT>` token text.
I will fix it in all PythonLexerBase ports:
- Java
- C#
- Python
- JavaScript
- TypeScript
- Go
- Dart
- CPP
Thanks kaby76 and RobEin for checking on this issue. Awaiting your update on whether it is fixed in PythonLexerBase for Java.
On second thought, no repair is needed after all.
The rule for restoring the original source code from the token stream is very simple: just take out the INDENT and DEDENT tokens.
Python's tokenizer works differently: there, the INDENT and DEDENT tokens must be kept in order to restore the original code.
I'm still wondering whether there's any advantage to this, but probably not.
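For comparison, Python's stdlib shows that second behavior directly: `tokenize.untokenize`, given the full token tuples (INDENT/DEDENT included), is documented to reproduce the input source exactly. A quick check:

```python
import io
import tokenize

src = "def test():\n    xxx = 1\n    print(xxx)\n"
toks = list(tokenize.generate_tokens(io.StringIO(src).readline))

# With full (type, string, start, end, line) tuples, untokenize
# round-trips the input exactly, including the indentation.
print(tokenize.untokenize(toks) == src)  # -> True
```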
The rule is very simple to restore the original source code by the token stream[:] You just have to take out the INDENT and DEDENT tokens.
...
The INDENT and DEDENT tokens must be inserted there to restore the original code.

I don't understand; these two statements seem inconsistent. The first says that the INDENT and DEDENT tokens need to be deleted from the parse tree in order to reconstruct the source. The second says that they cannot be deleted because they are essential to reconstruct the source.
Currently, I have to delete the INDENT and DEDENT tokens to reconstruct the text, because if I don't, I get `<INDENT>` and `<DEDENT>` strings sprinkled in the reconstructed text, e.g.:
07/07-08:05:20 ~/issues/g4-current/python/python2_7_18/Generated-CSharp-0
$ trparse ../examples/atexit.py | trtext
CSharp 0 ../examples/atexit.py success 0.0601923
"""
atexit.py - allow programmer to define multiple exit functions to be executed
upon normal program termination.
One public function, register, is defined.
"""
__all__ = ["register"]
import sys
_exithandlers = []
def _run_exitfuncs():
<INDENT>"""run any registered exit functions
_exithandlers is traversed in reverse order so functions are executed
last in, first out.
"""
exc_info = None
while _exithandlers:
<INDENT>func, targs, kargs = _exithandlers.pop()
try:
<INDENT>func(*targs, **kargs)
<DEDENT>except SystemExit:
<INDENT>exc_info = sys.exc_info()
<DEDENT>except:
<INDENT>import traceback
print >> sys.stderr, "Error in atexit._run_exitfuncs:"
traceback.print_exc()
exc_info = sys.exc_info()
<DEDENT><DEDENT>if exc_info is not None:
<INDENT>raise exc_info[0], exc_info[1], exc_info[2]
<DEDENT><DEDENT>def register(func, *targs, **kargs):
<INDENT>"""register a function to be executed upon normal program termination
func - function to be called at exit
targs - optional arguments to pass to func
kargs - optional keyword arguments to pass to func
func is returned to facilitate usage as a decorator.
"""
_exithandlers.append((func, targs, kargs))
return func
<DEDENT>if hasattr(sys, "exitfunc"):
# Assume it's another registered exit function - append it to our list
<INDENT>register(sys.exitfunc)
<DEDENT>sys.exitfunc = _run_exitfuncs
if __name__ == "__main__":
<INDENT>def x1():
<INDENT>print "running x1"
<DEDENT>def x2(n):
<INDENT>print "running x2(%r)" % (n,)
<DEDENT>def x3(n, kwd=None):
<INDENT>print "running x3(%r, kwd=%r)" % (n, kwd)
<DEDENT>register(x1)
register(x2, 12)
register(x3, 5, "bar")
register(x3, "no kwd args")
<DEDENT>
07/07-08:05:40 ~/issues/g4-current/python/python2_7_18/Generated-CSharp-0
$
Text reconstruction in Trash follows the basic concept that has existed in CS since the 1960s: the input text is simply the concatenation of the text of the frontier of the parse tree. The text of the INDENT and DEDENT tokens is `<INDENT>` and `<DEDENT>`. This is why I need to either erase the text (which I currently cannot do with Trash), or delete the tokens from the parse tree, e.g.:
07/07-07:59:04 ~/issues/g4-current/python/python2_7_18/Generated-CSharp-0
$ trparse !$ | trquery 'delete //(DEDENT | INDENT)' | trtext
trparse ../examples/atexit.py | trquery 'delete //(DEDENT | INDENT)' | trtext
CSharp 0 ../examples/atexit.py success 0.0612294
"""
atexit.py - allow programmer to define multiple exit functions to be executed
upon normal program termination.
One public function, register, is defined.
"""
__all__ = ["register"]
import sys
_exithandlers = []
def _run_exitfuncs():
"""run any registered exit functions
_exithandlers is traversed in reverse order so functions are executed
last in, first out.
"""
exc_info = None
while _exithandlers:
func, targs, kargs = _exithandlers.pop()
try:
func(*targs, **kargs)
except SystemExit:
exc_info = sys.exc_info()
except:
import traceback
print >> sys.stderr, "Error in atexit._run_exitfuncs:"
traceback.print_exc()
exc_info = sys.exc_info()
if exc_info is not None:
raise exc_info[0], exc_info[1], exc_info[2]
def register(func, *targs, **kargs):
"""register a function to be executed upon normal program termination
func - function to be called at exit
targs - optional arguments to pass to func
kargs - optional keyword arguments to pass to func
func is returned to facilitate usage as a decorator.
"""
_exithandlers.append((func, targs, kargs))
return func
if hasattr(sys, "exitfunc"):
# Assume it's another registered exit function - append it to our list
register(sys.exitfunc)
sys.exitfunc = _run_exitfuncs
if __name__ == "__main__":
def x1():
print "running x1"
def x2(n):
print "running x2(%r)" % (n,)
def x3(n, kwd=None):
print "running x3(%r, kwd=%r)" % (n, kwd)
register(x1)
register(x2, 12)
register(x3, 5, "bar")
register(x3, "no kwd args")
07/07-07:59:38 ~/issues/g4-current/python/python2_7_18/Generated-CSharp-0
$ trparse ../examples/atexit.py | trquery 'delete //(DEDENT | INDENT)' | trtext > save
CSharp 0 ../examples/atexit.py success 0.0600218
07/07-07:59:48 ~/issues/g4-current/python/python2_7_18/Generated-CSharp-0
$ diff save ../examples/atexit.py
66d65
<
07/07-07:59:57 ~/issues/g4-current/python/python2_7_18/Generated-CSharp-0
NB: trtext outputs an extra newline character because it calls `Console.WriteLine()` instead of `Console.Write()`. It has to do this because dotnet programs don't work perfectly with a Cygwin/MSYS shell. Instead, one should use trsponge to perform the reconstruction and output.
The second statement says that they cannot be deleted because they are essential to reconstruct the source.
The second statement was about the original Python tokenizer.
... Trash follows the basic concept that existed in CS since the 1960's: the input text is simply the concatenation of the text of the frontier of the parse tree ...
Now I understand what the problem is.
I didn't know this recommendation.
I can imagine two alternatives in this case:

- Solution 1: The text of the INDENT/DEDENT tokens would contain the indentation, similar to Python's tokenizer. Currently, the indentation text is stored in the WS tokens before the INDENT/DEDENT tokens. This is problematic because it may cause compatibility problems with older applications that use the PythonLexerBase class.
- Solution 2: This is simpler and less likely to cause compatibility issues: the INDENT/DEDENT tokens would store an empty string. Currently, the text property of INDENT tokens is consistently `"<INDENT>"`, and similarly that of DEDENT tokens is `"<DEDENT>"`. If these became empty strings, then the text properties of the tokens would only need to be concatenated to restore the original source code. This would be equivalent to deleting the INDENT/DEDENT tokens.

I recommend the second solution.
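A minimal sketch of why Solution 2 round-trips (plain Python with hypothetical per-token texts, not the actual PythonLexerBase token objects): once INDENT/DEDENT carry empty text, concatenating every token's text, hidden-channel WS and NEWLINE included, reproduces the input exactly.

```python
source = "def test():\n    xxx=1\n"

# Hypothetical per-token text for the input above, in stream order.
# WS ("    ") and NEWLINE ("\n") are hidden-channel but keep their text.
with_marker_text = ["def", " ", "test", "(", ")", ":", "\n",
                    "    ", "<INDENT>", "xxx", "=", "1", "\n", "<DEDENT>"]
with_empty_text  = ["def", " ", "test", "(", ")", ":", "\n",
                    "    ", "", "xxx", "=", "1", "\n", ""]

print("".join(with_marker_text) == source)  # marker text breaks the round trip
print("".join(with_empty_text) == source)   # empty text restores the source
```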
I didn't understand. Can you explain what has to be changed? Do I need to change any grammar files?
We are trying to parse a Python 2.x file using Java. When I tried to print `FuncdefContext.suite.getText()` of the test() function for this example:

```
def test():
    xxx=1
    print xxx

def greet():
    print 'Hello World'

greet();
```

Output:

```
<INDENT>xxx=1
printxxx
<DEDENT>
```

and the end line for this test() function is 5. Can you tell me what should be done here to get the correct end line?
`tree.getText()` doesn't reconstruct the text of the input. It never does, for virtually every ANTLR grammar! This is because ANTLR parse trees don't contain all the tokens of the input, such as comments and whitespace, nor do they contain strings that are "skipped." Grammars that define lexer rules with `-> skip` or `-> channel(HIDDEN)` cause input strings to not be tokenized, or to be tokenized with the channel property set to 1. The leaves in the parse tree don't contain these tokens. For python2_7_18, the DEDENT and INDENT tokens contain the strings `<DEDENT>` and `<INDENT>` as text, and these tokens are part of the ANTLR parse tree. This is why you see `tree.getText()` contain strings for the DEDENT and INDENT tokens. The "approved" way to get the text from an ANTLR parse tree is to query the input char stream directly, using the parse tree to get the bounds of the indices of the text. See https://stackoverflow.com/a/55852474/4779853 or antlr/antlr4#1302.
Trash doesn't represent the parse tree like ANTLR. It incorporates the entire input, including whitespace and comments. It's done this way so that the tree is fully serializable, with no loss of text, and fully editable. The way ANTLR splits the parse tree from the token stream and char stream is unnatural, and difficult/slow to serialize and edit.
Hi, thanks for your response. I understand that you have suggested how to get the text from the ANTLR parse tree.
Our use case is to parse an input Python file and identify the start line and end line of each class, function, statement, comment, etc. in the file, and while doing so we are facing an issue fetching the end line from the function and statement contexts (for and while loops, ...).
- Can you help me understand how this end line can be fetched correctly, or is there a workaround you would like to suggest?
- Also, does ANTLR python 2.7.18 grammars support python 2.6 version too?
The easiest solution would be to just delete the INDENT and DEDENT leaves, then get the Interval for the sub-tree. But the ANTLR runtime doesn't have tree editing.
Instead, do this:
- Get the Interval of the node for the funcdef or stmt. The Interval is the start and end indices of the tokens for that sub-tree (i.e., not the start and end within the character buffer).
- Write a loop starting at the ending token index. Working backwards, skip all INDENT and DEDENT tokens until you find something that is not an INDENT or DEDENT. Do not back up further than the starting token index. We now have the end token index of the `funcdef` or `stmt`.
- Get the end token from its end token index.
- Get the end character index from the end token.
- Write a loop that starts at the end character index and looks at the character buffer. Stop looping when you find a character that is not a newline; that is the character index of the last non-newline for the funcdef or stmt.
- You can now return 1 + the character index of the last non-newline for the funcdef or stmt.
In C#:

```csharp
var funcdefs = new Antlr4.Runtime.Tree.Xpath.XPath(parser, "//funcdef").Evaluate(tree);
var funcdef = funcdefs.FirstOrDefault();
var token_interval = funcdef.SourceInterval;
int end_token_index = token_interval.b;
// Scan backwards past INDENT/DEDENT and hidden-channel tokens.
for (; end_token_index >= token_interval.a; --end_token_index)
{
    if (tokens.Get(end_token_index).Type != PythonParser.INDENT
        && tokens.Get(end_token_index).Type != PythonParser.DEDENT
        && tokens.Get(end_token_index).Type != PythonParser.WS
        && tokens.Get(end_token_index).Type != PythonParser.NEWLINE
        && tokens.Get(end_token_index).Channel == 0)
    {
        break;
    }
}
var start_token = tokens.Get(token_interval.a);
var end_token = tokens.Get(end_token_index);
var start_char_index = start_token.StartIndex;
var end_char_index = end_token.StopIndex;
System.Console.WriteLine("funcdef text:");
System.Console.WriteLine(str.GetText(new Interval(start_char_index, end_char_index)));
```
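The same backward scan can be sketched in plain Python over a hypothetical list of (type, line) pairs (not the ANTLR runtime API), just to isolate the logic:

```python
# Token-type tags for the sketch; values are arbitrary.
INDENT, DEDENT, NEWLINE, OTHER = range(4)

# Hypothetical token stream for test()'s funcdef; as in the dump above,
# the trailing DEDENT is reported on line 5.
tokens = [
    (OTHER, 1), (OTHER, 1), (OTHER, 1), (OTHER, 1), (NEWLINE, 1),  # def test ( ) :
    (INDENT, 2), (OTHER, 2), (OTHER, 2), (OTHER, 2), (NEWLINE, 2),  # xxx = 1
    (OTHER, 3), (OTHER, 3), (NEWLINE, 3),                           # print xxx
    (DEDENT, 5),
]

a, b = 0, len(tokens) - 1  # the funcdef's source interval
end = b
# Walk backwards, skipping synthetic/trailing tokens.
while end > a and tokens[end][0] in (INDENT, DEDENT, NEWLINE):
    end -= 1
print("end line:", tokens[end][1])  # -> end line: 3
```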
[D]oes ANTLR python 2.7.18 grammars support python 2.6 version too?
I would think so, but don't quote me.
Hi, thanks for your response. We will check the suggestion you have provided, as we have built in Java.
Also, in our case we are using a custom listener class to identify the end lines of each class, function, statement, etc. by overriding the base listener enter and exit methods.
For example:

```java
@Override
public void enterFuncdef(FuncdefContext ctx) {
    int start = ctx.getStart().getLine();
    int stop = ctx.getStop().getLine();
}
```

Would you suggest how we can handle the end lines correctly here?
[W]e are using custom listener class to identify the endlines for each class, function, statement, etc. by overriding the base listener enter and exit methods.
For Example:
`@Override public void enterFuncdef(FuncdefContext ctx) { int start = ctx.getStart().getLine(); int stop = ctx.getStop().getLine(); }`
Would you like to suggest if we can handle the endlines correctly here?
Not quite. Try this.
MyListener.java:

```java
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.misc.*;

public class MyListener extends PythonParserBaseListener {
    CommonTokenStream tokens_;
    CharStream str_;

    public MyListener(CommonTokenStream tokens, CharStream str) {
        tokens_ = tokens;
        str_ = str;
    }

    @Override public void enterFuncdef(PythonParser.FuncdefContext ctx) {
        var start = ctx.getStart().getLine();
        var token_interval = ctx.getSourceInterval();
        var end_token_index = token_interval.b;
        var tokens = this.tokens_;
        var str = this.str_;
        // Scan backwards past INDENT/DEDENT and hidden-channel tokens.
        for (; end_token_index >= token_interval.a; --end_token_index) {
            if (tokens.get(end_token_index).getType() != PythonParser.INDENT
                && tokens.get(end_token_index).getType() != PythonParser.DEDENT
                && tokens.get(end_token_index).getType() != PythonParser.WS
                && tokens.get(end_token_index).getType() != PythonParser.NEWLINE
                && tokens.get(end_token_index).getChannel() == 0) {
                break;
            }
        }
        var start_token = tokens.get(token_interval.a);
        var end_token = tokens.get(end_token_index);
        var start_char_index = start_token.getStartIndex();
        var end_char_index = end_token.getStopIndex();
        var stop_line_number = end_token.getLine();
        System.out.println("stop = " + stop_line_number);
        System.out.println("funcdef text:");
        System.out.println(str.getText(new Interval(start_char_index, end_char_index)));
    }
}
```