Comments (4)
Hi @Beliefuture !
Thanks for your interest in Grammarinator!
HTMLGenerator produces empty output when every quantified component of the starting htmlDocument rule decides to stop generation at the first iteration. Since each iteration has a 0.5 chance of stopping (and a 0.5 chance of continuing the loop), and since htmlDocument has 6 quantified components, empty output occurs with a probability of 0.5^6 (about 1.6%).
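The arithmetic above can be checked with a short sketch. It only models the two numbers stated in the reply (a 0.5 stop chance and 6 quantified components) and assumes the quantifiers decide independently; it does not call Grammarinator, and the function names are made up for illustration.

```python
import random

STOP_PROBABILITY = 0.5      # chance that a quantifier stops at its first iteration (from the reply)
QUANTIFIED_COMPONENTS = 6   # number of quantified components in htmlDocument (from the reply)


def empty_output_probability(stop_p=STOP_PROBABILITY, n=QUANTIFIED_COMPONENTS):
    """Exact probability that all n independent quantifiers produce nothing."""
    return stop_p ** n


def simulate(trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability (assumes independence)."""
    rng = random.Random(seed)
    empty = 0
    for _ in range(trials):
        # The output is empty only if every quantifier stops immediately.
        if all(rng.random() < STOP_PROBABILITY for _ in range(QUANTIFIED_COMPONENTS)):
            empty += 1
    return empty / trials


if __name__ == "__main__":
    print(empty_output_probability())  # 0.015625
    print(simulate())                  # an estimate close to the exact value
```

In other words, roughly one generated HTML file in 64 is expected to be empty under these assumptions.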
As for the invalid output: these outputs may look like invalid HTML (and some of them indeed are), but they fulfill all the requirements defined by the grammar. The grammar has no information about tag names, attribute names, or attribute values. It knows nothing about spaces between tokens, nor about the semantics of style, script, or xml tags, and so on. This is simply because these grammars are parser grammars: they are responsible only for checking the syntax of an input, and all further checks are usually implemented manually. Similarly, if these grammars are used to generate output, the additional information has to be defined manually, either by editing the grammar itself with rule rewrites, custom predicates, or actions (probably losing the ability to use the grammar for parsing), or by implementing custom generator subclasses and/or models, listeners, serializers, etc. HTMLCustomGenerator is a basic example of such a custom generator.
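To illustrate the subclassing idea in isolation, here is a toy sketch. The classes ToyHTMLGenerator and ToyHTMLCustomGenerator are hypothetical stand-ins written for this example; they are not Grammarinator's API and do not mirror the real HTMLCustomGenerator. The base class knows only syntax, as a parser grammar would, and the subclass adds the missing knowledge about tag names.

```python
import random


class ToyHTMLGenerator:
    """Hypothetical grammar-like generator: knows syntax only, no semantics."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def tag_name(self):
        # A parser grammar only says "a tag name is a run of name characters".
        letters = "abcdefghijklmnopqrstuvwxyz"
        length = self.rng.randint(1, 8)
        return "".join(self.rng.choice(letters) for _ in range(length))

    def element(self):
        # Opening and closing names are drawn independently, so they rarely match.
        return f"<{self.tag_name()}></{self.tag_name()}>"


class ToyHTMLCustomGenerator(ToyHTMLGenerator):
    """Hypothetical subclass that adds information the grammar lacks."""

    KNOWN_TAGS = ("div", "span", "p", "a", "ul", "li")

    def tag_name(self):
        # Restrict names to tags that actually exist.
        return self.rng.choice(self.KNOWN_TAGS)

    def element(self):
        # Reuse one name so the closing tag matches the opening tag.
        name = self.tag_name()
        return f"<{name}></{name}>"


if __name__ == "__main__":
    print(ToyHTMLGenerator(seed=1).element())        # syntactically fine, semantically meaningless
    print(ToyHTMLCustomGenerator(seed=1).element())  # a matched pair of known tags
```

The point of the design is that the grammar-derived part stays untouched, while the overrides carry the domain knowledge.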
Regarding the PostgreSQL issue, are you sure you commented out the superClass options in both the lexer and the parser grammars and then regenerated the generator? Another way to control the superclass of the produced generator is to override the superClass option from the CLI, like this:
grammarinator-process -DsuperClass=Generator ...
Getting only empty output from PostgreSQL is weird. Although stmtmulti is completely quantified, it should produce empty output at most 50% of the time. Could you paste the command that resulted in only empty output?
Hi @renatahodovan !
Thanks for your detailed explanation, and sorry for the late response.
- For the empty output of HTMLGenerator: is there a way to customize the rules so that the quantified components are forced not to stop generation at the first (or a later) iteration? Also, how can I control the complexity (i.e., the number of tokens) of the generated files, for example by specifying the value of the -d parameter? I have found that the generated files are mostly short, with only a few tokens.
- The generated files in the cases shown above contain messy characters (e.g., 𧞢䯸, 왘𥍾𤏖, 𗉊𬱦, ...). How does the tool populate these values, and how can I specify the set of allowed values to make the generated files more reasonable?
- For the class issue with PostgreSQL: I have checked the files, and I am sorry that I forgot to comment out the superclass option in the PostgreSQLLexer.g4 file.
For the empty-output issue with PostgreSQL: I generated ten cases again for testing, and only three of them were non-empty (≈0.3). Given your explanation that the default probability of empty output is 0.5, I think this is not an issue. Does the tool currently use a random strategy to generate test cases? Could I set preferences so that it generates the specific clauses or expressions I want?
Again, I would like to know how to specify the literal values in the generated SQL so that the results are more reasonable as test cases (they look strange). Here is one example:
CREATE OPERATOR CLASS RIGHT . U&"V" . OVERLAPS . HEADER . XMLFOREST DEFAULT FOR TYPE U&"X" % ROWTYPE USING ROWTYPE FAMILY OPEN AS STORAGE INTERVAL ( 9 ) [ 96 ] , STORAGE SETOF BIT VARYING ARRAY , OPERATOR 626 * ( COALESCE % TYPE , NONE ) FOR SEARCH , STORAGE :"" % ROWTYPE , STORAGE INTEGER ARRAY [ 7 ] , STORAGE SETOF TRANSLATE ARRAY [ 53 ]
- Another minor issue: I encountered the following messages when generating the PostgreSQLGenerator file. I think they may be attributable to the ANTLR source grammar file, but I am not sure whether they will prevent the tool from functioning properly.
The farthest rule from 'root' is 'a_expr_typecast' (25 steps).
150 rule(s) unreachable from 'root': 'Dollar', 'DOT_DOT', 'OperatorEndingWithPlusMinus', 'OperatorCharacterNotAllowPlusMinusAtEnd', 'OperatorCharacterAllowPlusMinusAtEnd', 'WHILE', 'FOREACH', 'LOOP', 'InvalidQuotedIdentifier', 'InvalidUnterminatedQuotedIdentifier', 'UnterminatedUnicodeQuotedIdentifier', 'InvalidUnicodeQuotedIdentifier', 'InvalidUnterminatedUnicodeQuotedIdentifier', 'BeginEscapeStringConstant', 'InvalidBinaryStringConstant', 'InvalidUnterminatedBinaryStringConstant', 'InvalidHexadecimalStringConstant', 'InvalidUnterminatedHexadecimalStringConstant', 'NumericFail', 'Whitespace', 'Newline', 'LineComment', 'BlockComment', 'UnterminatedBlockComment', 'ErrorCharacter', 'UnterminatedEscapeStringConstant', 'InvalidEscapeStringConstant', 'InvalidUnterminatedEscapeStringConstant', 'InvalidEscapeStringText', 'AfterEscapeStringConstantMode_Whitespace', 'AfterEscapeStringConstantMode_Newline', 'AfterEscapeStringConstantMode_NotContinued', 'AfterEscapeStringConstantWithNewlineMode_Whitespace', 'AfterEscapeStringConstantWithNewlineMode_Newline', 'AfterEscapeStringConstantWithNewlineMode_Continued', 'AfterEscapeStringConstantWithNewlineMode_NotContinued', 'plsqlroot', 'pl_function', 'comp_options', 'comp_option', 'sharp', 'option_value', 'opt_semi', 'pl_block', 'decl_sect', 'decl_start', 'decl_stmts', 'label_decl', 'decl_stmt', 'decl_statement', 'opt_scrollable', 'decl_cursor_query', 'decl_cursor_args', 'decl_cursor_arglist', 'decl_cursor_arg', 'decl_is_for', 'decl_aliasitem', 'decl_varname', 'decl_const', 'decl_datatype', 'decl_collate', 'decl_notnull', 'decl_defval', 'decl_defkey', 'assign_operator', 'proc_sect', 'proc_stmt', 'stmt_perform', 'stmt_call', 'opt_expr_list', 'stmt_assign', 'stmt_getdiag', 'getdiag_area_opt', 'getdiag_list', 'getdiag_list_item', 'getdiag_item', 'getdiag_target', 'assign_var', 'stmt_if', 'stmt_elsifs', 'stmt_else', 'stmt_case', 'opt_expr_until_when', 'case_when_list', 'case_when', 'opt_case_else', 'stmt_loop', 'stmt_while', 'stmt_for', 'for_control', 'opt_for_using_expression', 'opt_cursor_parameters', 'opt_reverse', 'opt_by_expression', 'for_variable', 'stmt_foreach_a', 'foreach_slice', 'stmt_exit', 'exit_type', 'stmt_return', 'opt_return_result', 'stmt_raise', 'opt_stmt_raise_level', 'opt_raise_list', 'opt_raise_using', 'opt_raise_using_elem', 'opt_raise_using_elem_list', 'stmt_assert', 'opt_stmt_assert_message', 'loop_body', 'stmt_execsql', 'stmt_dynexecute', 'opt_execute_using', 'opt_execute_using_list', 'opt_execute_into', 'stmt_open', 'opt_open_bound_list_item', 'opt_open_bound_list', 'opt_open_using', 'opt_scroll_option', 'opt_scroll_option_no', 'stmt_fetch', 'opt_cursor_from', 'opt_fetch_direction', 'stmt_move', 'stmt_close', 'stmt_null', 'stmt_commit', 'stmt_rollback', 'plsql_opt_transaction_chain', 'stmt_set', 'cursor_variable', 'exception_sect', 'proc_exceptions', 'proc_exception', 'proc_conditions', 'proc_condition', 'opt_block_label', 'opt_loop_label', 'opt_label', 'opt_exitcond', 'any_identifier', 'sql_expression', 'expr_until_then', 'expr_until_semi', 'expr_until_rightbracket', 'expr_until_loop', 'make_execsql_stmt', 'opt_returning_clause_into', 'c_expr_c_expr_expr'
Please leave a message if you have any questions :)
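As a quick sanity check on the ten-sample PostgreSQL test mentioned above (seven empty outputs out of ten), the sketch below computes how likely at least that many empty outputs would be, assuming each output is independently empty with probability 0.5. That probability is the upper bound quoted earlier in the thread, and the independence assumption and the function name are mine, so treat the result as a rough indication only.

```python
from math import comb


def prob_at_least(k, n, p):
    """Binomial tail: probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # At least 7 empty outputs in 10 runs, each empty with probability 0.5.
    print(prob_at_least(7, 10, 0.5))  # 0.171875
```

Under these assumptions, seven or more empty outputs in ten runs would occur about 17% of the time, so the observation is not unusual on its own.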
In addition, I have found that the SQL statements generated for PostgreSQL are typically incomplete and not executable. Do they fail to obey the grammar rules strictly?
SELECT INTERSECT ALL SELECT ; SELECT INTERSECT DISTINCT SELECT SELECT INTERSECT DISTINCT SELECT UNION SELECT INTERSECT ALL SELECT INTERSECT DISTINCT SELECT INTERSECT DISTINCT SELECT UNION DISTINCT SELECT FOR READ ONLY ;
SELECT INTERSECT SELECT EXCEPT SELECT ; SELECT EXCEPT ALL SELECT INTERSECT ALL SELECT
Maybe the incomplete queries can be attributed to truncation caused by the -d parameter?
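The hypothesis above can be explored with a toy model. The sketch below is not Grammarinator's algorithm: the two-rule grammar, the depth costs, and the function names are invented for illustration. It only shows how a depth budget can make a generator skip optional parts, which yields statements such as SELECT INTERSECT SELECT that still follow the toy grammar.

```python
import random

SET_OPS = ("UNION", "INTERSECT", "EXCEPT")


def gen_expr(rng, depth):
    """expr: atom | expr '+' expr (recursion allowed only while depth remains)."""
    if depth > 1 and rng.random() < 0.5:
        return f"{gen_expr(rng, depth - 1)} + {gen_expr(rng, depth - 1)}"
    return rng.choice(["a", "b", "1"])


def gen_select(rng, depth):
    """select: 'SELECT' target_list? (set_op select)? with a depth budget."""
    parts = ["SELECT"]
    # The optional target list is assumed to need two more levels (target_list -> expr).
    if depth >= 3 and rng.random() < 0.5:
        parts.append(gen_expr(rng, depth - 2))
    # The optional set operation needs one more level for the nested select.
    if depth >= 2 and rng.random() < 0.5:
        parts.append(rng.choice(SET_OPS))
        parts.append(gen_select(rng, depth - 1))
    return " ".join(parts)


if __name__ == "__main__":
    for depth in (1, 2, 6):
        rng = random.Random(depth)
        print(depth, "->", gen_select(rng, depth))
```

With a budget of 2, the toy generator can never afford a target list, so every statement consists only of SELECT keywords and set operators; larger budgets allow expressions to appear.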