Giter Site home page Giter Site logo

Strange behaviour for `choice` about funcj HOT 4 CLOSED

typemeta avatar typemeta commented on May 27, 2024
Strange behaviour for `choice`

from funcj.

Comments (4)

jon-hanson avatar jon-hanson commented on May 27, 2024 1

The issue is that funcj.parser is a framework for LL(1) parsers, meaning parsers for grammars that can be parsed by looking only 1 character ahead at each stage. In your grammar the parser needs to look 5 characters ahead to determine that the token isn't "true" or "false".

You can usually work around this by deferring the decision as to which parse rule gets applied until you have read sufficient tokens. If I simplify your example for brevity and pretend that a boolean is either "aa" or "bb", then the following illustrates this approach:

private static Either<Boolean, String> left(boolean value) {
    return Either.left(value);
}

private static Either<Boolean, String> right(String value) {
    return Either.right(value);
}

static Parser<Chr, Boolean> aa = string("aa").map(F.konst(true));
static Parser<Chr, Boolean> bb = string("bb").map(F.konst(false));
static Parser<Chr, Boolean> bool = choice(aa, bb);
static Parser<Chr, String> word = alpha.many1().map(Chr::listToString);

static Parser<Chr, Either<Boolean, String>> trueOrWord =
        chr('a').andR(Combinators.any())
                .map(c -> c.equals('a') ? left(true) : right("a" + c));

static Parser<Chr, Either<Boolean, String>> falseOrWord =
        chr('b').andR(Combinators.any())
                .map(c -> c.equals('b') ? left(false) : right("a" + c));

static Parser<Chr, Either<Boolean, String>> boolOrWord =
        choice(trueOrWord, falseOrWord, word.map(Either::right));

public static void main(String[] args) {
    final var parser = boolOrWord;
    final var bool = Input.of("aa");
    final var anyWord = Input.of("zz");
    final var startsLikeBool = Input.of("ab");

    final var parsedTrue = parser.parse(bool).getOrThrow();
    System.out.println(parsedTrue.left());
    System.out.println("assert: " + (parsedTrue.left() == true));

    final var parsedWord = parser.parse(anyWord).getOrThrow();
    System.out.println(parsedWord.right());
    System.out.println("assert: " + parsedWord.right().equals("zz"));

    final var parsedLikeBool = parser.parse(startsLikeBool).getOrThrow();
    assert parsedLikeBool.right().equals("ab");
    System.out.println(parsedLikeBool.right());
    System.out.println("assert: " + parsedLikeBool.right().equals("ab"));
}

from funcj.

vbrandl avatar vbrandl commented on May 27, 2024

So in general a grammar like mine, which is not uniquely decidable, cannot be parsed by a LL(1) parser (my parsers and compilers course was 3 years ago so I don't remember the details...)

Requiring words that are not boolean values to be wrapped in "" might fix the ambiguity, too?

from funcj.

jon-hanson avatar jon-hanson commented on May 27, 2024

You grammar in its current form is not an LL(1) grammar. The wikipedia page includes some strategies for resolving this. Alternatively, as you suggest, you can change the language syntax to make it more amenable to be being expressed in an LL(1) grammar, though most languages would lean towards putting the arbitrary string in quotes and leaving true and false as unquoted keywords. Another approach might be to tokenise the input first, and then have the parser operate on word tokens.

from funcj.

vbrandl avatar vbrandl commented on May 27, 2024

Thanks for the refreshing on some theoretical concepts :)

I think for now I'll take the lazy solution and do something like

alpha.many1().map(Chr::listToString).map(s -> {
  if (s.equals("true"))
    return Either.left(true);
  else if (s.equals("false"))
    return Either.left(false);
  else
    return Either.right(s);
})

But I might fix that in later™

from funcj.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.