oranoran / antlr4-autosuggest
Java auto-suggest engine for ANTLR4 grammars
License: Apache License 2.0
Issue copied from an identical issue opened on antlr4-autosuggest-js by @debashish2014.
Hi,
I am using the following Test.G4, which is a very simple grammar for variable declaration.
grammar Test;
file: (varDecl)+ EOF;
varDecl
: type ID '=' NUMBER ';'
;
type: 'float' | 'int' | 'decimal' ; // user-defined types
ID : LETTER (LETTER | [0-9])* ;
NUMBER: DIGIT+;
fragment LETTER : [a-zA-Z] ;
fragment DIGIT : [0-9];
SPACES
: [ \u000B\t\r\n] -> channel(HIDDEN)
;
Ideally, if a user types "int a", the expected suggestion is '=', but no suggestion is found. Is anything wrong here?
The example in the README uses new AutoComplete( but on master I think we must use new AutoSuggester( instead.
Hi,
I was testing the Java suggester you uploaded and I found an important issue.
This is the grammar I am using:
grammar Grammar;
rul : 'If the user ' condition;
condition
: '(' simpleCondition bracket
| simpleCondition
;
simpleCondition
: 'is vegetarian'
;
bracket : ')';
When the input sentence is "If the user is vegetarian", the suggestions should be empty but the token ')' is suggested instead. It seems incorrect suggestions happen when the parser rule (in this case simpleCondition) is used in other rules and the element that follows it is another parser rule.
Do you know how to fix this?
Thank you.
Hi,
The repo seems to be inactive, which is a shame, because the implementation is interesting.
Do you plan to bring it back to life?
Is it possible to publish the package on Maven?
Copied from JavaScript project issue reported by @WiseBird.
The root cause is that the parser ATN for this grammar contains a loop where all the transitions are epsilon transitions (states 12, 13 and 14). Therefore as the ATN is being traversed, this loop is iterated indefinitely without consuming any tokens. If tokens were being consumed, there would be no problems because the end of the input would be reached.
The fix here is to identify such wholly-epsilon loops, and not pursue their exploration in the ATN. This is done by remembering, for each parser state, at which token it was last visited. In case a parser state gets visited again with no tokens having been consumed since the last visit, this means it's an epsilon-transitions loop and the search must be backtracked.
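The idea described above can be illustrated with a minimal sketch. It does not use the real ANTLR ATN classes or the project's actual code: the EpsilonLoopGuard class, its Edge type, the state numbers and the sample graph are all hypothetical, chosen only to show how remembering the token index of the last visit stops an all-epsilon loop.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy state graph (an assumption for illustration, not the ANTLR ATN API). */
final class EpsilonLoopGuard {

    /** An edge to another state; a null token marks an epsilon transition. */
    static final class Edge {
        final int target;
        final String token;
        Edge(int target, String token) { this.target = target; this.token = token; }
    }

    private final Map<Integer, List<Edge>> edges = new HashMap<>();
    // For each state on the current path: the token index at which it was last visited.
    private final Map<Integer, Integer> lastVisitedAt = new HashMap<>();
    private final Set<String> suggestions = new TreeSet<>();
    private List<String> input = List.of();

    void addEdge(int from, int to, String token) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, token));
    }

    /** Returns the tokens that could follow the given input, starting from startState. */
    Set<String> suggest(int startState, List<String> tokens) {
        input = tokens;
        suggestions.clear();
        lastVisitedAt.clear();
        explore(startState, 0);
        return new TreeSet<>(suggestions);
    }

    private void explore(int state, int tokenIndex) {
        Integer previous = lastVisitedAt.get(state);
        if (previous != null && previous == tokenIndex) {
            return; // same state, no token consumed since last visit: epsilon loop, backtrack
        }
        lastVisitedAt.put(state, tokenIndex);
        try {
            for (Edge e : edges.getOrDefault(state, List.of())) {
                if (e.token == null) {
                    explore(e.target, tokenIndex);          // epsilon: consumes nothing
                } else if (tokenIndex == input.size()) {
                    suggestions.add(e.token);               // end of input: this token is a suggestion
                } else if (input.get(tokenIndex).equals(e.token)) {
                    explore(e.target, tokenIndex + 1);      // token matches: consume it
                }
            }
        } finally {
            // Restore the previous marker so other paths may still reach this state.
            if (previous == null) lastVisitedAt.remove(state);
            else lastVisitedAt.put(state, previous);
        }
    }

    /** Hypothetical graph with an all-epsilon loop 1 -> 2 -> 3 -> 1. */
    static EpsilonLoopGuard sample() {
        EpsilonLoopGuard g = new EpsilonLoopGuard();
        g.addEdge(0, 1, "action");
        g.addEdge(1, 2, null);
        g.addEdge(2, 3, null);
        g.addEdge(3, 1, null);
        g.addEdge(2, 4, "AND");
        g.addEdge(4, 5, "action");
        return g;
    }

    public static void main(String[] args) {
        EpsilonLoopGuard g = sample();
        System.out.println(g.suggest(0, List.of("action")));        // prints [AND]
        System.out.println(g.suggest(0, List.of("action", "AND"))); // prints [action]
    }
}
```

Without the early return in explore, the traversal of the sample graph would recurse through states 1, 2 and 3 until the stack overflowed. Restoring the marker in the finally block is a design choice of this sketch: it keeps the guard scoped to the current path rather than to the whole search.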
clause
: clause AND clause
| action
;
action : 'action' ;
AND : 'AND' ;
with "action AND" as the input.
Debug info:
TOKENS FOUND IN FIRST PASS:
[@-1,0:5='action',<1>,1:0]
[@-1,7:9='AND',<2>,1:7]
UNTOKENIZED:
Parser rule names: clause, action
State: 0 (type: RuleStartState)
State: 4 (type: BasicState)
State: 5 (type: BasicState)
State: 2 (type: RuleStartState)
State: 15 (type: BasicState)
State: 16 (type: BasicState)
State: 3 (type: RuleStopState)
State: 6 (type: BasicState)
State: 12 (type: StarLoopEntryState)
State: 10 (type: StarBlockStartState)
State: 7 (type: BasicState)
State: 8 (type: BasicState)
State: 9 (type: BasicState)
Suggesting tokens for rule numbers: 1
SUGGEST: tokenSoFar= remainingText= lexerState=1
SUGGEST: tokenSoFar= remainingText= lexerState=5
NONMATCHING LEXER TOKEN: a remaining=
State: 13 (type: LoopEndState)
State: 1 (type: RuleStopState)
State: 11 (type: BlockEndState)
State: 14 (type: StarLoopbackState)
State: 12 (type: StarLoopEntryState)
State: 10 (type: StarBlockStartState)
State: 7 (type: BasicState)
State: 8 (type: BasicState)
State: 9 (type: BasicState)
Suggesting tokens for rule numbers: 1
SUGGEST: tokenSoFar= remainingText= lexerState=1
SUGGEST: tokenSoFar= remainingText= lexerState=5
NONMATCHING LEXER TOKEN: a remaining=
State: 13 (type: LoopEndState)
State: 1 (type: RuleStopState)
State: 11 (type: BlockEndState)
State: 14 (type: StarLoopbackState)
State: 12 (type: StarLoopEntryState)
State: 10 (type: StarBlockStartState)
State: 7 (type: BasicState)
State: 8 (type: BasicState)
State: 9 (type: BasicState)
...
Copied from JavaScript project issue reported by @debashish2014.
The root cause appears to be the grammar line: expr: expr logical_exp expr
I am using the following grammar file for testing. If the user types 'SHOW EMPLOYEE ', the suggestions should be 'FOR' and 'WHERE'; however, the autosuggest function goes into an infinite loop and eventually causes a stack overflow.
grammar autocomplete;
query
: query_stmt EOF
;
query_stmt
: start_keyword entity_name ( filter_name expr )?
;
expr
: column_name
| expr operator_exp literal_value
| expr logical_exp expr
;
operator_exp
:OPERATOR
;
logical_exp
:K_AND
;
entity_name
: any_name
;
column_name
: any_name
;
any_name
: IDENTIFIER
| STRING_LITERAL
;
filter_name
: K_FOR
| K_WHERE
;
start_keyword
: K_SHOW
| K_SELECT
;
literal_value
: NUMERIC_LITERAL
| IDENTIFIER
| STRING_LITERAL
;
K_SHOW : S H O W;
K_SELECT : S E L E C T;
K_AND : A N D;
K_FOR : F O R;
K_WHERE : W H E R E;
OPERATOR
: ('=' | '!=' | '>=' | '<=' )
;
IDENTIFIER
: '"' (~'"' | '""')* '"'
| '`' (~'`' | '``')* '`'
| '[' ~']'* ']'
| [a-zA-Z_] [a-zA-Z_0-9]*
;
STRING_LITERAL
: '\'' ( ~'\'' | '\'\'' )* '\''
;
NUMERIC_LITERAL
: DIGIT+ ( '.' DIGIT* )? ( E [-+]? DIGIT+ )?
| '.' DIGIT+ ( E [-+]? DIGIT+ )?
;
SPACES
: [ \u000B\t\r\n] -> channel(HIDDEN)
;
UNEXPECTED_CHAR
: .
;
fragment DIGIT : [0-9];
fragment A : [aA];
fragment B : [bB];
fragment C : [cC];
fragment D : [dD];
fragment E : [eE];
fragment F : [fF];
fragment G : [gG];
fragment H : [hH];
fragment I : [iI];
fragment J : [jJ];
fragment K : [kK];
fragment L : [lL];
fragment M : [mM];
fragment N : [nN];
fragment O : [oO];
fragment P : [pP];
fragment Q : [qQ];
fragment R : [rR];
fragment S : [sS];
fragment T : [tT];
fragment U : [uU];
fragment V : [vV];
fragment W : [wW];
fragment X : [xX];
fragment Y : [yY];
fragment Z : [zZ];
Consider the following input:
a = un
in a grammar where un can be a constant, but can also be the first two letters of the union token.
At this point, autocomplete suggests nothing, maybe because this potential constant completes the script.
Would it not be better, when the script does not end with whitespace, to ignore the last token in the list, run the program, and then filter the suggestions by the getText() of the ignored token?
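The proposal above could be sketched roughly as follows. This is only an illustration of the idea, not the library's API: the PartialTokenFilter class and its suggestForPartialInput method are hypothetical, the suggester is passed in as a plain function, and splitting on the last whitespace stands in for dropping the last lexer token and reading its getText().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper; the suggester function stands in for the real engine. */
final class PartialTokenFilter {

    static List<String> suggestForPartialInput(String input,
                                               Function<String, List<String>> suggester) {
        // Input that is empty or ends with whitespace has no partial token to set aside.
        if (input.isEmpty() || Character.isWhitespace(input.charAt(input.length() - 1))) {
            return suggester.apply(input);
        }
        // Find where the trailing partial token starts (assumption: tokens are whitespace-separated).
        int start = input.length();
        while (start > 0 && !Character.isWhitespace(input.charAt(start - 1))) {
            start--;
        }
        String head = input.substring(0, start);   // e.g. "a = "
        String partial = input.substring(start);   // e.g. "un"

        // Ask for suggestions as if the partial token had not been typed,
        // then keep only the ones the partial token could still become.
        List<String> filtered = new ArrayList<>();
        for (String candidate : suggester.apply(head)) {
            if (candidate.startsWith(partial)) {
                filtered.add(candidate);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Stand-in suggester that always offers the same two keywords.
        Function<String, List<String>> fake = text -> List.of("union", "intersect");
        System.out.println(suggestForPartialInput("a = un", fake)); // prints [union]
    }
}
```

A real implementation would probably work on the token list rather than on raw whitespace, and might merge these filtered results with whatever the engine suggests for the unmodified input.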