b3b00 / csly

a C# embeddable lexer and parser generator (.Net core)

License: MIT License

C# 99.93% Dockerfile 0.03% Shell 0.04%
dot-net csharp parser lexer-generator parser-generator lexer expression-parser recursive-descent-parser grammar-rules mathematical-parser


csly's People

Contributors

aaaabcv, aldosa, arlm, b3b00, cp3088, dependabot[bot], dnyanu76, dotted, endlesstravel, fossabot, heku, jlittorin, magne, olduh29, terac


csly's Issues

additional syntax parser backend

add an alternative backend syntax parser beside the actual LL recursive descent parser,
maybe an LR table-driven parser (yacc/bison like)

Thread-Safety

Hello b3b00,

If we need to parse a lot of strings in a multithreaded environment (all strings having the same syntax), do you recommend using the same Parser instance everywhere?
In other words, are Lexer and Parser thread-safe?

Thanks
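Until thread-safety is confirmed by the maintainer, a defensive pattern is to give each thread its own parser instance via ThreadLocal, so no state is shared even if Parser/Lexer turn out not to be thread-safe. A sketch only: ExpressionParser, ExpressionToken and the "expression" root rule are placeholder names standing in for the asker's types.

```csharp
// Defensive sketch (assumes the csly builder API shown elsewhere in this tracker):
// one parser instance per thread via System.Threading.ThreadLocal.
private static readonly ThreadLocal<Parser<ExpressionToken, int>> ThreadParser =
    new ThreadLocal<Parser<ExpressionToken, int>>(() =>
    {
        var builder = new ParserBuilder<ExpressionToken, int>();
        return builder.BuildParser(new ExpressionParser(),
                                   ParserType.LL_RECURSIVE_DESCENT,
                                   "expression").Result;
    });

public static ParseResult<ExpressionToken, int> ParseSafely(string source)
    => ThreadParser.Value.Parse(source);
```

Building once per thread trades a little startup cost per thread for the guarantee that no parser state is shared across threads.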

nested groups fails to parse

Language => Expr: (STRING (COMMA [d] STRING)*)?
Error => "unexpected "(" (LPAREN) at line 1, column 15.expecting IDENTIFIER, unexpected "(" (LPAREN) at line 1, column 15.expecting DISCARD, unexpected "(" (LPAREN) at line 1, column 15.expecting RPAREN, ..." (the same three messages repeated many times)

when I change language to this: Expr: (STRING COMMA [d] STRING)?
no problem occurs

Am I doing something wrong, or are nested groups not supported?
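If nested groups are indeed unsupported, a workaround is to pull the inner group out into its own named non-terminal, so only one level of grouping remains (StringList is an invented rule name, untested against csly):

```
StringList : STRING (COMMA [d] STRING)*
Expr : StringList?
```

The visitor for Expr would then receive an option over the StringList result instead of a nested group.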

EBNF Parser - Production rule matching issues?

Hello,

I have been really enjoying csly, but seem to have run into an issue. I have rules for function calls and variable assignments, among other statements. Here are the grammar paths for the two:

tvscript : stmt+
stmt : global_stmt
global_stmt : global_stmt2
global_stmt2 : LBEG [d] global_stmt_content ( COMMA [d] global_stmt_content )* ( COMMA )? LEND [d]

global_stmt_content : fun_call
global_stmt_content : var_assign

--------------------------------------------------------

fun_call : id LPAR [d] fun_actual_args? RPAR [d]
fun_actual_args : pos_args ( COMMA [d] kw_args )?
pos_args : arith_expr ( COMMA [d] arith_expr )*
...
literal : STR_LITERAL

--------------------------------------------------------

var_assign : id ASSIGN [d] arith_expr
...
cmp_expr : add_expr ( GT add_expr )*
...
literal : INT_LITERAL

---------------------------------------------------------

Using these two inputs, one at a time:

|B|study('Preprocessor example')|E|
|B|a := b > 100|E|

The first line tries to become an assignment, and the other tries to become a function call...
When I comment out the production rule for function calls, assignments begin to work perfectly, and vice versa.

I have been looking through the source, but cannot find why it is choosing the incorrect rule when both are enabled.
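Since both alternatives of global_stmt_content start with an id, an LL recursive descent parser may commit to whichever rule it tries first. One possible workaround (stmt_tail is an invented name, untested) is to left-factor the common prefix so the decision happens after the id, on LPAR versus ASSIGN:

```
global_stmt_content : id stmt_tail
stmt_tail : LPAR [d] fun_actual_args? RPAR [d]
stmt_tail : ASSIGN [d] arith_expr
```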

Thanks,
CP3088

parser guided lexing

guide the lexer with the expected tokens from the parser: do not try to recognize tokens that will not match the grammar
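A minimal sketch of the idea (conceptual only, not csly's actual API): the parser hands the lexer the set of token kinds it can accept next, and only those candidates are attempted at the current position.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class GuidedLexer
{
    // Hypothetical token definitions; \G anchors each match at 'position'.
    static readonly Dictionary<string, Regex> Definitions = new Dictionary<string, Regex>
    {
        ["INT"]        = new Regex(@"\G\d+"),
        ["IDENTIFIER"] = new Regex(@"\G[a-zA-Z_]\w*"),
        ["LPAREN"]     = new Regex(@"\G\("),
    };

    // Only the token kinds in 'expected' are attempted at 'position'.
    public static (string kind, string value)? NextToken(
        string source, int position, ISet<string> expected)
    {
        foreach (var kind in expected)
        {
            var m = Definitions[kind].Match(source, position);
            if (m.Success) return (kind, m.Value);
        }
        return null; // no expected token matches: a syntax error, caught early
    }
}
```

The payoff is twofold: fewer wasted recognition attempts, and lexical errors reported in terms of what the grammar actually expected.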

how to use value option

[Production("ComparisonExpr: StringConcatExpr (GeneralComp StringConcatExpr)?")]
public string ComparisonExpr(string stringConcatExpr, ValueOption<Group<string, string>> option)

[Production("StringConcatExpr: STRING")]
public string StringConcatExpr(Token<TokenEnum> value)

[Production("GeneralComp: GENERALCOMPARATOR")]
public string GeneralComp(Token<TokenEnum> token)

I have the grammar above. In ComparisonExpr I have a group clause which consists of non-terminals. What should I use as the parameters of the ComparisonExpr method to get the value of the group in the option parameter?

Also, are (TERMINAL NON_TERMINAL) clauses supported?
For example: StringConcatExpr (COMMA StringConcatExpr)?
What should be the function signature for this?
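A guess at the missing signature, based on the ValueOption<Group<...>> signatures that appear elsewhere in this tracker: the first type parameter of Group is the token enum and the second is the parser's output type, even when the group contains only non-terminals. Unverified against the asker's csly version:

```csharp
[Production("ComparisonExpr: StringConcatExpr (GeneralComp StringConcatExpr)?")]
public string ComparisonExpr(string stringConcatExpr,
                             ValueOption<Group<TokenEnum, string>> option)
{
    // results of the non-terminals inside the group arrive as the
    // OUT type (string here); body simplified for illustration
    return stringConcatExpr;
}
```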

Continuous Integration

Deploy CI on AppVeyor or Travis CI

  • build for frameworks netcoreapp1.0 or 1.1 and net45

  • deploy on demand to nuget.org for both frameworks

Refactor EBNFRecursiveDescentSyntaxParser.Parse(IList, Rule, int, string)

I've selected EBNFRecursiveDescentSyntaxParser.Parse(IList, Rule, int, string) for refactoring, which is a unit of 103 lines of code and 23 branch points. Addressing this will make our codebase more maintainable and improve Better Code Hub's Write Simple Units of Code guideline rating! 👍

Here's the gist of this guideline:

  • Definition 📖
    Limit the number of branch points (if, for, while, etc.) per unit to 4.
  • Why
    Keeping the number of branch points low makes units easier to modify and test.
  • How 🔧
    Split complex units with a high number of branch points into smaller and simpler ones.

You can find more info about this guideline in Building Maintainable Software. 📖
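As a generic illustration of the guideline (not code from csly): decision clusters move into small helpers so each unit stays under the branch-point budget.

```csharp
// One branchy unit split into three small ones, each easy to test in isolation.
static string TokenKind(char c)
{
    if (char.IsDigit(c)) return "INT";
    if (char.IsLetter(c)) return LetterKind(c);
    return SugarKind(c);
}

static string LetterKind(char c) => char.IsUpper(c) ? "VARIABLE" : "ATOM";

static string SugarKind(char c) => c == '(' ? "LPAREN" : c == ')' ? "RPAREN" : "SUGAR";
```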


ℹ️ To know how many other refactoring candidates need addressing to get a guideline compliant, select some by clicking on the 🔲 next to them. The risk profile below the candidates signals (✅) when it's enough! 🏁


Good luck and happy coding! :shipit: ✨ 💯

EBNF Parser : End of Stream issues.

Hello,

I am currently making a project in C# for the PineScript language with cSly.
There are many grammar rules for this loosely typed functional language, but I seem to get this error:

unexpected end of stream. at line 0, column 693. at sly.parser.generator.EBNFParserBuilder2.<>c__DisplayClass2_0.`

The problem seems to revolve around these Production rules:

[Production("fun_def_singleline : id fun_head ARROW [d] fun_body_singleline")]
[Production("fun_def_multiline : id fun_head ARROW [d] LEND? [d] local_stmts_multiline")]
[Production("fun_head : LPAR [d] fun_head_params? RPAR [d]")]
[Production("fun_head_params : id ( COMMA [d] id )*")]

I am still working on the class structures, so I am returning nulls for now. This doesn't seem to be the issue, as the exception occurs during the parser's initialization...

When I remove the rules temporarily, this becomes the new error.

unexpected end of stream. at line 0, column 1077.expecting IDENTIFIER, LPAREN, unexpected end of stream. at line 0, column 1077.

What might cause end of stream?

Generic Lexer : bad lexing error reporting

Generic lexer does not report lexing errors.

example for following lexer:

public enum Issue114
{
    [Lexeme(GenericToken.SugarToken, "//")]
    First = 1,

    [Lexeme(GenericToken.SugarToken, "/*")]
    Second = 2
}

lexing "/&" should:

  • throw an "unexpected char '&' at line 1 column 2" lexer exception.
  • Instead it throws an InvalidOperationException at FSMLexer line 120 (accessing the last item of an empty list)

lexing "// /&" should:

  • throw an "unexpected char '&' at line 1 column 5" lexer exception.
  • instead it succeeds with a single "//" token

The error is probably in FSMLexer.Run line 264

Branch https://github.com/b3b00/csly/tree/bugfix/%23114-bad-lexing-error-reporting has a unit test (GenericLexerTests.TestIssue114) for this issue

Fix may or may not be easy depending on possible regressions (check the existing unit tests).

ValueOption warnings

Hello again,

I'm getting following warnings from BuildResult it seems related to ValueOption:

"non terminal [GROUP-COMPARATOR-StringConcatExpr] is never used."
"non terminal [GROUP-TO[d]-AdditiveExpr] is never used."
"non terminal [GROUP-Argument-ExtraArgumentList] is never used."

Also,

I have the following rule:

[Production("RangeExpr: AdditiveExpr (TO [d] AdditiveExpr)?")]
public object RangeExpr(object additiveExpr, ValueOption<Group<XPathToken, object>> optionalGroup)

When I delete the optional group clause it works without problem, but when I add the rule, the program throws the following exception. Note that the parsed string does not contain anything related to the optional group clause.

OUTCH Object of type 'sly.parser.parser.ValueOption`1[System.Object]' cannot be converted to type 
'sly.parser.parser.ValueOption`1[sly.parser.parser.Group`2[netSchematron.XPathToken,System.Object]]'. 
calling RangeExpr__AdditiveExpr_GROUP-TO[d]-AdditiveExpr? =>  RangeExpr

We have talked about this exception before in this issue; however, I cannot seem to identify the lexer problem this time.

I think the warnings and the exception are connected.

benchmark performance for generic lexer before System.Span use

after a quick test I found that the GenericLexer from 2.2.5.1 seems faster (×10) and uses less memory (×5).
This must be further investigated.

It looks strange, as the usage of Span<> is there to use less memory. Maybe the tweaking in FSMLexer introduced a performance regression.

If this proves real, a rollback is necessary.

Possible leading tokens

// other rules lead to UnaryExpr from root
[Production("UnaryExpr: PLUS* ValueExpr")]
// other rules leading to IntegerLiteral from ValueExpr
[Production("IntegerLiteral: INTEGER")]

In such a grammar, since PLUS can occur zero or more times, in the case where PLUS occurs zero times the possible leading tokens of ValueExpr should be added to the rules prior to ValueExpr. Am I right? Because right now I am unable to parse "2 + 2", as the possible leading tokens do not contain the INTEGER token.
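The asker's reasoning matches the standard FIRST-set rule: when a prefix clause is nullable (PLUS* here), the FIRST set of what follows must be folded in. A standalone sketch of that computation (illustrative, not csly's implementation):

```csharp
using System.Collections.Generic;

static class FirstSets
{
    // FIRST of a clause sequence: union the FIRST of each clause in turn,
    // and keep going while the clause is nullable (e.g. X* or X?).
    public static HashSet<string> First(
        IEnumerable<(HashSet<string> first, bool nullable)> clauses)
    {
        var result = new HashSet<string>();
        foreach (var (first, nullable) in clauses)
        {
            result.UnionWith(first);
            if (!nullable) break; // a mandatory clause stops the propagation
        }
        return result;
    }
}
```

For UnaryExpr: PLUS* ValueExpr this yields FIRST(PLUS*) ∪ FIRST(ValueExpr), so INTEGER would be a legal leading token, matching the asker's expectation.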

add AST building as beaver parser generator

beaver provides a compact way to produce an AST.
Provide CSLY with a similar mechanism.
Maybe it needs externalisation to a grammar file (as beaver does); if so, the grammar MUST be parsed with CSLY (eat your own dog food principle).
It must not add a build step to produce the parser, as the absence of a build step is the leading CSLY feature.

Possible leading tokens is missing

Hello again,

I have not been able to identify the last problem we talked about, the missing possible leading tokens.

I did a little bit of debugging; here is what I gathered from it:

this is a valid XPath expression (tested) -----(2+1)

when I try to parse it, everything goes fine up to the token 2.
After that it reports an error that there are no possible leading tokens such as INTEGER.

some of the rules

// root
[Production("XPath: Expr")]
public string XPath(string expr)

// Expr
[Production("Expr: ExprSingle")]
public string Expr(string exprSingle)

// other rules that lead to ArrowExpr
[Production("ArrowExpr: UnaryExpr")]
public string ArrowExpr(string unaryExpr)

[Production("UnaryExpr: MinusOrPlusExpr* ValueExpr")]
public string UnaryExpr(List<string> uMinusPlus, string valueExpr)

[Production("ValueExpr: SimpleMapExpr")]
public string ValueExpr(string simpleMapExpr)

it continues...

So I decided to connect ArrowExpr to ValueExpr, so my language will not support unary expressions.
I was able to parse 1-2-(3+4) without problem.

Then I added UnaryExpr again and tried to parse 1-2-(3+4); an error occurred. I did another debug session; the error is the same.

Then I tried 1-2-3+4 the error is the same.

Then I tried 1 the error is the same.

Then I connected ArrowExpr to ValueExpr again, and every expression that I tried before worked without problem.

So, I again debugged the working version, and saw that PossibleLeadingTokens contained INTEGER.

My guess is that when I use UnaryExpr, since there is a MinusOrPlusExpr, the token types that lead to ValueExpr are not taken into account; however, since MinusOrPlusExpr can occur 0 times, the token types that lead to ValueExpr should be taken into account.

syntax tree visitor context

For now, there is no simple and safe way to share a context when visiting a syntax tree.
When building a parser, a parser instance is used.

ExpressionParser parserInstance = new ExpressionParser();

Parser<ExpressionToken,int> Parser = ParserBuilder.BuildParser<ExpressionToken,int>(parserInstance,
                                                                            ParserType.LL_RECURSIVE_DESCENT, "expression");

This instance may be used to hold context during the tree visit, but this instance is unique and so not thread-safe.

  • first solution :
    Allow creating a new instance each time a source is parsed.

ExpressionParser someNewParserInstance = new ExpressionParser();
ParseResult<ExpressionToken> r = Parser.Parse(expression, someNewParserInstance);

instead of just

ParseResult<ExpressionToken> r = Parser.Parse(expression);
  • second
    extend visitor methods to accept some context object (maybe type-parameterized) as a parameter.

    [Production("primary: LPAREN expression RPAREN")]
    public int Group(object discardedLParen, int groupValue, object discardedRParen)
    {
        return groupValue;
    }

may then be :

    [Production("primary: LPAREN expression RPAREN")]
    public int Group(object discardedLParen, int groupValue, object discardedRParen, SomeContextClass someContext)
    {
        return groupValue;
    }

⚠️ Try not to break the existing API.

System.ArgumentOutOfRangeException

I got this exception during syntax checking of json files containing errors.

"Index was out of range. Must be non-negative and less than the size of the collection."


    [Fact]
    public void TestErrorMissingClosingBracket()
    {
        EbnfJsonGenericParser jsonParser = new EbnfJsonGenericParser();
        ParserBuilder<JsonTokenGeneric, JSon> builder = new ParserBuilder<JsonTokenGeneric, JSon>();
        Parser = builder.BuildParser(jsonParser, ParserType.EBNF_LL_RECURSIVE_DESCENT, "root").Result;

        ParseResult<JsonTokenGeneric, JSon> r = Parser.Parse("{");
        Assert.True(r.IsError);
    }

After a first glance at the code, the position is not checked when accessing tokens past the end of the list.

Code coverage

  • add opencover (or other if exists)
  • integrate opencover with appveyor
  • publish opencover coverage to codecov

EBNF Parser - Issues chaining rules for a param list

Happy Holidays,

I have almost completed a parser using csly. There is one issue I have been facing however.
The scripting language I am mimicking has the ability, like most languages do, to mix default and non-default parameters in function definitions. The key difference is that function calls can mix them as well, assuming the normal non-defaults come first of course.

My function calls work perfectly with only default params, and the same with only non-defaults. However, when I try to mix the two like this:

|B|plot(c, title='Out')|E|

I get "unexpected end of stream. at line 0, column 0"

My production rules are as follows:

[Production("test : LBEG [d] fun_call LEND [d]")]
public ScriptAST TestStatement(ScriptAST statement)
{
    /* ... */
}

[Production("fun_call : id LPAR [d] ( fun_actual_args )? RPAR [d]")]
public ScriptAST FuncCall(Identifier id, ValueOption<Group<ScriptToken, ScriptAST>> args)
{
    /* ... */
}

[Production("fun_actual_args : kw_args")]
public ScriptAST FuncActualArgsA(ScriptAST kwArgs)
{
    /* ... */
}

[Production("fun_actual_args : pos_args ( COMMA [d] kw_args )?")]
public ScriptAST FuncActualArgsB(Arguments posArgs, ValueOption<Group<ScriptToken, ScriptAST>> kwArgs)
{
    /* ... */
}

[Production("pos_args : arith_expr ( COMMA [d] arith_expr )*")]
public ScriptAST PosArguments(ScriptAST initial, List<Group<ScriptToken, ScriptAST>> subsequent)
{
    /* ... */
}

[Production("kw_args : kw_arg ( COMMA [d] kw_arg )*")]
public ScriptAST KwArguments(Argument initial, List<Group<ScriptToken, ScriptAST>> subsequent)
{
    /* ... */
}

[Production("kw_arg : id DEFINE [d] arith_expr")]
public ScriptAST KeywordArg(Identifier id, ScriptAST expression)
{
    /* ... */
}

Both of these work fine...

|B|study("Name or something", overlay)|E|
|B|test(close, 123, open)|E|

But neither of these...

|B|test2(a, b, c=100)|E|
|B|plotshape(data, style=shape.xcross)|E|

I have been debugging this for some time and suspect I need to find another way for csly to accept this behavior.
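A restructuring that sometimes removes this kind of prefix conflict is to parse every argument with a single rule and make the keyword part optional, deciding positional versus keyword in the visitor (fun_arg is an invented name; untested against csly):

```
fun_actual_args : fun_arg ( COMMA [d] fun_arg )*
fun_arg : arith_expr ( DEFINE [d] arith_expr )?
```

The visitor would then reject a DEFINE whose left side is not a bare id, and enforce "positionals before keywords" as a semantic check rather than in the grammar.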

Regards,
CP3088

Generic Lexer : add callback mechanism to discriminate generic tokens (as identifier)

GenericLexer generates... very generic tokens.
An identifier can only be an identifier, for example.
Some languages may want to discriminate such tokens, so a callback mechanism at the end of token recognition may be added to introduce discrimination.

use case example :
prolog discriminates atoms and variables by a leading uppercase char for variables. A callback on identifier tokens may allow producing different tokens (ATOM, VARIABLE).

the best solution would be something like:

enum SomeToken {
    [Lexeme(GenericToken.Identifier)]
    [Callback(LexerCallback.IdentifierCallback)]
    IDENTIFIER = 20,
    /*
    some other tokens
    */
}

public class LexerCallback {
    public static Token<SomeToken> IdentifierCallback(Token<SomeToken> token) {
        // do some magic here to modify your token
        return token;
    }
}

not sure it's quite possible though because of the generics.

Refactor FSMLexer.Run(string, int)

I've selected FSMLexer.Run(string, int) for refactoring, which is a unit of 95 lines of code. Addressing this will make our codebase more maintainable and improve Better Code Hub's Write Short Units of Code guideline rating! 👍

Here's the gist of this guideline:

  • Definition 📖
    Limit the length of code units to 15 lines of code.
  • Why
    Small units are easier to analyse, test and reuse.
  • How 🔧
    When writing new units, don't let them grow above 15 lines of code. When a unit grows beyond this, split it in smaller units of no longer than 15 lines.

You can find more info about this guideline in Building Maintainable Software. 📖



Documentation

Improve documentation.

  • split in different pages
    • lexer
    • parser
    • expression parsing
  • under the hood
    • inspiration
    • ebnf using bnf

add EBNF groups support

Allow use of the EBNF group notation:

rule : item ( SEPARATOR item )*

groups must support multipliers (* and +, not ?)

such a group returns a List<Either<Token<In>, Out>> to the visitor. When multiplied the value is
List<List<Either<Token<In>, Out>>>

under the hood, the parser generator creates a rule that matches the clause sequence between the parentheses. The calling rule is then modified, replacing the group with the newly generated non-terminal.
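Concretely, under that transformation a rule like the one above would end up as something like the following, with a generated non-terminal standing in for the parenthesized sequence (the GROUP-… naming here mirrors the generated names reported in the ValueOption warnings issue elsewhere in this tracker):

```
rule : item GROUP-SEPARATOR-item*
GROUP-SEPARATOR-item : SEPARATOR item
```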

lazy lexing

recognize tokens as parsing goes on.
Do not lex the whole source prior to parsing; wait for the parser to ask for it. Memorize tokens to allow backtracking without re-lexing the source.
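A sketch of the memoization half of the idea, independent of csly's actual types: tokens are produced on demand and remembered, so the parser can backtrack to any earlier index without re-lexing.

```csharp
using System;
using System.Collections.Generic;

// Tokens are lexed lazily, one at a time, and memorized in a list so any
// already-seen index can be revisited for free during backtracking.
class LazyTokenStream<T> where T : class
{
    private readonly Func<int, T> _lexTokenAt; // lexes token number i, null at end of source
    private readonly List<T> _memo = new List<T>();

    public LazyTokenStream(Func<int, T> lexTokenAt) => _lexTokenAt = lexTokenAt;

    public T TokenAt(int index)
    {
        // lex forward only as far as the parser has actually asked
        while (_memo.Count <= index)
        {
            var token = _lexTokenAt(_memo.Count);
            if (token == null) return null; // end of source
            _memo.Add(token);
        }
        return _memo[index];
    }
}
```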

move to netstandard 2.0

Configure projects to use netstandard 2.0 instead of netcoreapp for better support across .net versions

| alternate choice operator

only for items of the same type: terminals or non-terminals; mixing both is forbidden. This is only to keep visitor arguments simple.

  • for each choice generate a new rule? Beware cartesian products when many alternates are in the same rule.

  • or maybe simpler: add a new clause type, just like option or group
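The cartesian-product worry is easy to see with a hypothetical rule containing two alternate choices; expanding every combination into its own rule multiplies the alternatives:

```
rule : (A | B) thing (C | D)

rule : A thing C
rule : A thing D
rule : B thing C
rule : B thing D
```

With n choice clauses of k alternatives each, this generates k^n rules, which is why a dedicated clause type (like option or group) may scale better.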

Question: Sprache to CSLY

Currently I have set up Sprache to follow a simple set of rules:

Query             = QueryFilter ;

QueryFilter       = StringValue Operator StringValue ;

Operator          = EqualOperator 
				  | ":!="
				  | ":<"
				  | ":<="
				  | ":>"
				  | ":>="
				  ;

EqualOperator     = ":=" ;

StringValue       = '"' String '"' ;

So in C# I have the following query:

string query = "StartTime:>=\"2015-01-01T06:00:00Z\" EndTime:<=\"2015-03-25T11:59:00Z\" SamplingMode:=\"Raw\"";

My question is how would I set up CSLY to follow these rules?

At the end of the day I need to perform query validation and inform users of where the error is and so on...
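Not an authoritative answer, but a plausible starting point for the lexer side in csly, using the generic lexer's String and SugarToken lexemes (token names invented; the grammar side would then be roughly QueryFilter : IDENTIFIER Operator STRING, with a Query : QueryFilter+ root):

```csharp
public enum QueryToken
{
    [Lexeme(GenericToken.String)]
    STRING = 1,

    // longer sugar tokens listed alongside their prefixes (:<= vs :<, etc.)
    [Lexeme(GenericToken.SugarToken, ":=")]  EQ = 2,
    [Lexeme(GenericToken.SugarToken, ":!=")] NEQ = 3,
    [Lexeme(GenericToken.SugarToken, ":<=")] LTE = 4,
    [Lexeme(GenericToken.SugarToken, ":<")]  LT = 5,
    [Lexeme(GenericToken.SugarToken, ":>=")] GTE = 6,
    [Lexeme(GenericToken.SugarToken, ":>")]  GT = 7,

    [Lexeme(GenericToken.Identifier)]
    IDENTIFIER = 8
}
```

For the validation requirement, csly's ParseResult carries the errors with line/column positions, which is what would be surfaced to users.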

zero string allocation for regex lexer

investigate if a zero-string-allocation regex lexer is possible.
At least, is it possible to reduce string allocations?
This would use System.Span and/or System.Memory.
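For reference, the kind of change this implies: token values become slices over the original source rather than freshly allocated strings (generic illustration, not csly code):

```csharp
using System;

static class SpanScan
{
    // Returns the digits at the start of 'source' as a slice over the original
    // buffer: no new string is allocated unless the caller materializes one.
    public static ReadOnlySpan<char> ScanInt(ReadOnlySpan<char> source)
    {
        int i = 0;
        while (i < source.Length && char.IsDigit(source[i])) i++;
        return source.Slice(0, i);
    }
}
```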

Refactor RecursiveDescentSyntaxParser.ParseNonTerminal(IList, NonTerminalClause, int)

I've selected RecursiveDescentSyntaxParser.ParseNonTerminal(IList, NonTerminalClause, int) for refactoring, which is a unit of 77 lines of code and 17 branch points. Addressing this will make our codebase more maintainable and improve Better Code Hub's Write Simple Units of Code guideline rating! 👍

Here's the gist of this guideline:

  • Definition 📖
    Limit the number of branch points (if, for, while, etc.) per unit to 4.
  • Why
    Keeping the number of branch points low makes units easier to modify and test.
  • How 🔧
    Split complex units with a high number of branch points into smaller and simpler ones.

You can find more info about this guideline in Building Maintainable Software. 📖



Lexical Error '#'

I am having an issue with a generic lexeme.

Input: #F4C587
Error: Lexical Error : Unrecognized symbol '#'
Lexeme: [Lexeme(GenericToken.Identifier, IdentifierType.Custom, "#", "0-9a-fA-F")] COLOR_LITERAL = 42,

Is this type of character not allowed?

Regards,
CP3088

The FSMLexer does not properly backtrack

If the FSMLexer contains two lexemes where one starts with the prefix of the other and is at least two characters longer than the prefix, the lexer will return the prefix token correctly, but will fail to backtrack to where the token ended.

Given the following lexer:

public enum Backtrack
{
    [Lexeme(GenericToken.SugarToken, ".")]
    A = 1,
    [Lexeme(GenericToken.SugarToken, "-")]
    B,
    [Lexeme(GenericToken.SugarToken, "-+")]
    C,
    [Lexeme(GenericToken.SugarToken, "---")]
    D,
}

If we parse the string --+, we expect the token stream BC0. Instead we will get an error Lexical Error : Unrecognized symbol '+' at (line 0, column 2).

Worse, if we parse the string --., we expect the token stream BBA0. Instead we get the token stream BA0, with no indication that the lexer failed.

This also means that the expected behavior in #106 is wrong. Instead of producing an INT token, it should fail. However, if the lexer also contains a lexeme for . (period), it should succeed with INT and PERIOD tokens.

The FSMLexer does not properly terminate token stream

Given the following lexer:

public enum Premature
{
    [Lexeme(GenericToken.SugarToken, "++")]
    A = 1,
    [Lexeme(GenericToken.SugarToken, "-")]
    B,
    [Lexeme(GenericToken.SugarToken, "---")]
    C,
}

If we parse the string +, we expect an error Lexical Error : Unrecognized symbol '+' at (line 0, column 0). Instead we get the token stream 0. That is, end of token stream and no error.

If we parse the string --, we expect the token stream BB0. Instead we get the token stream B0, with no indication that the lexer skipped a token.

EBNF - Recursive Parsing -> Exponential Performance Losses

Hello,

I have noticed an issue when it comes to parsing code.
From the example below, you can see an exponential performance impact:

ms	Code
-------------------------------------------------------------------------------------------------
218	|B|xATRTrailingStop = iff(close > val, not val, val)|E|
8077	|B|xATRTrailingStop = iff(close > val, iff(close < val, 0, 1), val)|E|
23659	|B|xATRTrailingStop = iff(close > val, iff(close < val, 0, 1), iff(close < val, 0, 1))|E|

The line of code which brought this to my attention was:

xATRTrailingStop = iff(close > nz(xATRTrailingStop[1], 0) and close[1] > nz(xATRTrailingStop[1], 0), max(nz(xATRTrailingStop[1]), close - nLoss),
                    iff(close < nz(xATRTrailingStop[1], 0) and close[1] < nz(xATRTrailingStop[1], 0), min(nz(xATRTrailingStop[1]), close + nLoss), 
                        iff(close > nz(xATRTrailingStop[1], 0), close - nLoss, close + nLoss)))

Which still had not finished on an 8-core CPU while I slept.
To provide full example, I will need to send you code privately.

This issue could likely be reproduced easily, but I have 66 production rules and 49 tokens...
I would love to help in any way I can to improve this.

Regards,
CP3088

Fix - Error list with no elements

Good afternoon,

RecursiveDescentSyntaxParser.cs at line 434 will sometimes throw a System.InvalidOperationException when the error list is empty. Not sure if that in itself is part of a greater issue, but here is the fix for now.

greaterIndex = errors.Select(e => e.UnexpectedToken.PositionInTokenFlow).Max();

Should become

greaterIndex = errors.Count > 0 ? errors.Select(e => e.UnexpectedToken.PositionInTokenFlow).Max() : 0;

I am not a GitHub pro, so I will just offer the solution here, haha.

Regards,
CP3088

Graphviz Issue

Sometimes when creating the graphviz output, there will be arrows without a destination.

My grammar is very complex, so that is most likely the issue. However, to prevent this from happening to anyone else, I made an edit in sly/parser/generator/visitor/GraphVizEBNFSyntaxTreeVisitor.cs

Line 133 in private DotArrow Visit(SyntaxNode node)

children.ForEach(c =>
{
    if (c != null) // Prevent arrows with null destinations
    {
        var edge = new DotArrow(result, c) {
            // Set all available properties
            ArrowHeadShape = "none"
        };    
        Graph.Add(edge);
    }
});    

generic lexer : "1." lexing fails

when using GenericLexer with following configuration :

        [Lexeme(GenericToken.Int)]
        Integer = 5,
        
        [Lexeme(GenericToken.Double)]
        Double = 6,

the lexer fails on "1.".
It throws an exception in GenericLexer.transcode.
The expected behavior should be returning an Int token.

lexer performance

the actual lexer implementation is the bottleneck of the library.
performance may be improved by building a single finite state machine for all token scanning.
