typemeta / funcj

Assorted functional-oriented data structures and algorithms for Java.

License: MIT License

Java 100.00%

java functional-programming applicative monads codec parser-combinators json

Introduction

funcj is a collection of functional-oriented data structures, algorithms and libraries.

At present the project consists of the following sub-libraries:

core - primarily data and control structures.
parser - a combinator parser framework.
json - a parser and data model for JSON data.
codec - a framework for serialising Java data into streams.

funcj's Issues

Request: Change visibility of various parser fields in `JsonCombParser` to public

The JsonCombParser class has various predefined parsers for parsing various JSON types/primitives such as string, boolean, number, digit, hexDigit etc. These definitions are very useful. However, currently they cannot be used externally as their visibility scope is package.

My request is to make them visible with public scope so that consumers of this library can avoid code duplication.

Please let me know if you have any questions/suggestions. Thanks!

Make Codec implementations static inner classes

Elides the reference to the outer class.

Codec support for configurable field naming

Add Codec support for pluggable field naming - how class field member names are translated into, for example, JSON object keys and XML element names.
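A minimal sketch of what a pluggable naming strategy could look like, assuming it is modelled as a plain function from the Java field name to the encoded key. `FieldNaming`, `IDENTITY` and `SNAKE_CASE` are hypothetical names introduced for illustration; they are not part of the funcj codec API.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch: FieldNaming is not part of funcj's codec API.
final class FieldNaming {
    private FieldNaming() {}

    // Leaves the Java field name unchanged.
    static final UnaryOperator<String> IDENTITY = name -> name;

    // Converts lowerCamelCase to snake_case by prefixing each upper-case
    // letter (other than the first character) with an underscore.
    static final UnaryOperator<String> SNAKE_CASE = name -> {
        final StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); ++i) {
            final char c = name.charAt(i);
            if (Character.isUpperCase(c)) {
                if (i > 0) sb.append('_');
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    };
}
```

A codec could then apply the configured strategy once per field when it writes a JSON object key or an XML element name, and use the same strategy to find the key again when decoding.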

Example parsers with specific Token data structure vs Chr

Hi,

First of all, I would like to say how amazing this library looks. I have been working with ANTLR for years and am trying to move to a more functional approach to parsing. I'd also like to use a pure Java DSL rather than a parser generator, as it helps so much with the developer experience (IDE integration, no pre-compile phase to generate code, etc.).

I spent a few hours trying to develop a minimal parser for a small language, and the largest issue I have found so far is the inherent LL(1) parsing limitation. Following one of your earlier comments, I am trying to come up with a pre-analysis phase that generates a stream of tokens (rather than individual chars; a token would be a string plus metadata such as whether it is a reserved word, a user-defined symbol, etc.), hoping this will help with the LL(1) parsing issues.

As I am developing this, I wonder if you have any resources or documentation that explain how to work with a specific Token data structure instead of Chr? At first glance the library seems pretty generic, so that would mean Input<Token>, Parser<Token, SomeType>, etc., but I would really love to go through examples if any are available.

Also would like to know if you are planning on extending this library or making it more known, that's a gem!

thanks for this library and the great work again

`choice` and `or` combinators do not work correctly with `string`

Hello. First, an amazing library, I am a big fan. Thanks for your work.

I have an issue when I try to choose between different string options. Consider the following example:

    public Parser<Chr, String> day() {
        return choice(
                string("today"),
                string("tomorrow")
        );
    }

// tests

    @Test
    public void testDay() {
        Result<Chr, String> res = day().parse(Input.of("tomorrow"));
        String result = res.getOrThrow();
        assertEquals(result, "tomorrow");
    }

Getting the error here:

java.lang.RuntimeException: Failure at position 2, expected=t

	at org.typemeta.funcj.parser.Result$FailureOnExpected.getOrThrow(Result.java:253)
// blabla java stack trace

I would expect the test to go green here, as the parser should fail when comparing against "today" and fall back to "tomorrow". What I think is happening is that the parser fails on the first option and stops there without trying the next one. Is there a workaround or fix for this problem?

UPD
In fact, the test passed for parser choice(string("yesterday"), string("tomorrow")). Now I think it has something to do with the first characters of variants.
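One reading of this report, which is an assumption rather than something confirmed in the thread, is that the choice is made from the next input character alone, so two alternatives that both start with "t" cannot be told apart. The toy model below is not funcj code; `Ll1Toy`, `P`, `canStart` and `seq` are hypothetical names. It reproduces the reported behaviour with a one-character-lookahead choice and shows that factoring out the common prefix "to" avoids it.

```java
import java.util.Optional;

// Toy model of a predictive (one-symbol lookahead) parser. This is an
// assumption made for illustration, not funcj's implementation.
final class Ll1Toy {
    interface P {
        boolean canStart(char c);                    // can this parser begin with c?
        Optional<Integer> parse(String in, int pos); // end position on success
    }

    // Matches the non-empty literal s.
    static P string(String s) {
        return new P() {
            public boolean canStart(char c) { return s.charAt(0) == c; }
            public Optional<Integer> parse(String in, int pos) {
                return in.startsWith(s, pos) ? Optional.of(pos + s.length()) : Optional.empty();
            }
        };
    }

    // Commits to the first alternative whose first character matches; never backtracks.
    static P choice(P... ps) {
        return new P() {
            public boolean canStart(char c) {
                for (P p : ps) if (p.canStart(c)) return true;
                return false;
            }
            public Optional<Integer> parse(String in, int pos) {
                if (pos >= in.length()) return Optional.empty();
                for (P p : ps) if (p.canStart(in.charAt(pos))) return p.parse(in, pos);
                return Optional.empty();
            }
        };
    }

    // Runs a, then runs b from the position where a stopped.
    static P seq(P a, P b) {
        return new P() {
            public boolean canStart(char c) { return a.canStart(c); }
            public Optional<Integer> parse(String in, int pos) {
                return a.parse(in, pos).flatMap(next -> b.parse(in, next));
            }
        };
    }
}
```

In this model `choice(string("today"), string("tomorrow"))` rejects "tomorrow", while `seq(string("to"), choice(string("day"), string("morrow")))` accepts both inputs, because the remaining alternatives start with different characters.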

Allow dynamic type tags to be disabled

IList::flatmap does not assign result properly

A call to flatmap always results in a cast error, since the return value of addAll is not used.

funcj/core/src/main/java/org/typemeta/funcj/data/IList.java

Line 656 in 2470727

r.addAll(f.apply(n.head()));

Additionally, shouldn't flatmap return an IList? flatmap may result in an empty list.
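The pattern behind this report can be sketched with a toy immutable list. `ConsList` below is a hypothetical stand-in, not funcj's `IList`: because `add` returns a new list, its result has to be assigned back to the accumulator, and a `flatMap` over an empty list naturally yields an empty list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy immutable cons list; a stand-in for illustration, not funcj's IList.
final class ConsList<T> {
    final T head;           // unused when the list is empty
    final ConsList<T> tail; // null marks the empty list

    private ConsList(T head, ConsList<T> tail) { this.head = head; this.tail = tail; }

    static <T> ConsList<T> empty() { return new ConsList<>(null, null); }

    boolean isEmpty() { return tail == null; }

    // Prepends value and returns a NEW list; the receiver is left untouched.
    ConsList<T> add(T value) { return new ConsList<>(value, this); }

    ConsList<T> reverse() {
        ConsList<T> r = empty();
        for (ConsList<T> n = this; !n.isEmpty(); n = n.tail) r = r.add(n.head);
        return r;
    }

    <U> ConsList<U> flatMap(Function<T, ConsList<U>> f) {
        ConsList<U> r = empty(); // accumulated in reverse order
        for (ConsList<T> n = this; !n.isEmpty(); n = n.tail) {
            for (ConsList<U> m = f.apply(n.head); !m.isEmpty(); m = m.tail) {
                r = r.add(m.head); // the result of add must be assigned back
            }
        }
        return r.reverse();
    }

    List<T> toList() {
        final List<T> out = new ArrayList<>();
        for (ConsList<T> n = this; !n.isEmpty(); n = n.tail) out.add(n.head);
        return out;
    }
}
```

Writing `r.add(m.head);` without the assignment would compile but leave `r` empty, which is the same kind of slip the issue describes for `addAll`.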

`NullPointerException` when calling `new Ref(parser)` with non-null parser

Looks like this is checking the wrong thing:

funcj/parser/src/main/java/org/typemeta/funcj/parser/Ref.java

Lines 50 to 52 in 3f511c7

    
    Ref(Parser<I, A> p) {
        this.impl = Objects.requireNonNull(impl);
    }
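The snippet null-checks the field `impl`, which has not been assigned yet, instead of the constructor argument `p`. A minimal sketch of the intended check follows; `RefSketch` is a hypothetical, simplified stand-in with a single type parameter, not the real `Ref` class.

```java
import java.util.Objects;

// Hypothetical stand-in for the Ref constructor, simplified to one type parameter.
final class RefSketch<P> {
    private final P impl;

    RefSketch(P p) {
        // Validate the constructor argument p, not the still-unassigned field impl.
        this.impl = Objects.requireNonNull(p);
    }

    P get() { return impl; }
}
```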

Strange behaviour for `choice`

The choice combinator behaves strangely for some parsers:

I want to parse either boolean values or words (e.g. identifiers). Each parser works fine on its own, but when the boolean and word parsers are combined, the combined parser fails for inputs that start with the same character as one of the boolean values (true/false).

In the attached example, the first two inputs can be parsed just fine but when parsing and getting the third result, the following exception occurs:

Exception in thread "main" java.lang.RuntimeException: Failure at position 2, expected=t
        at org.typemeta.funcj.parser.Result$FailureOnExpected.getOrThrow(Result.java:253)

I would expect the parser to try parsing the input as tr (for true) then find a mismatch, try f (for false) find another mismatch and in the end parse the input as a word.

static Parser<Chr, Boolean> trueParser() {
    return string("true").map(F.konst(true));
}

static Parser<Chr, Boolean> falseParser() {
    return string("false").map(F.konst(false));
}

static Parser<Chr, Boolean> bool() {
    return choice(trueParser(), falseParser());
}

static Parser<Chr, String> word() {
    return alpha.many1().map(Chr::listToString);
}

static Parser<Chr, Either<Boolean, String>> boolOrWord() {
    return choice(bool().map(Either::left), word().map(Either::right));
}

public static void main(String[] args) {
    final var parser = boolOrWord();
    final var bool = Input.of("true");
    final var anyWord = Input.of("anyword");
    final var startsLikeBool = Input.of("trDEFINITELYSOMETHINGELSE");

    final var parsedTrue = parser.parse(bool).getOrThrow();
    System.out.println(parsedTrue.left());
    System.out.println("assert: " + (parsedTrue.left() == true));

    final var parsedWord = parser.parse(anyWord).getOrThrow();
    System.out.println(parsedWord.right());
    System.out.println("assert: " + parsedWord.right().equals("anyword"));

    final var parsedLikeBool = parser.parse(startsLikeBool).getOrThrow();
    assert parsedLikeBool.right().equals("trDEFINITELYSOMETHINGELSE");
    System.out.println(parsedLikeBool.right());
    System.out.println("assert: " + parsedLikeBool.right().equals("trDEFINITELYSOMETHINGELSE"));
}

Question: choice not taking alternate paths

Hello! Thanks for your nice library.

I have a question, when I execute this:

final Parser<Chr, String> p = choice(string("128").or(string("16")).or(string("1")));
System.out.println(p.parse(Input.of("128")));
System.out.println(p.parse(Input.of("16")));
System.out.println(p.parse(Input.of("1")));

the output is:

Success{value=128, next=StringInput{3,data="EOF"}
FailureOnExpected{input=StringInput{1,data="6", expected=1}
FailureOnExpected{input=StringInput{1,data="EOF", expected=1}

I expected it to pass all three, because the choice should try each option in turn. It seems to fail because the options all start with "1", and only tries the first one? What am I doing wrong?
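The behaviour expected in this question is that of an ordered choice with backtracking. The sketch below is a toy model of that semantics over plain strings; `BacktrackingChoice` is a hypothetical name and nothing here is funcj's API or a description of how funcj implements `choice`.

```java
import java.util.Optional;

// Toy model, not funcj's API: a parser maps (input, position) to the
// end position of the match, or to empty on failure.
interface BacktrackingChoice {
    Optional<Integer> parse(String in, int pos);

    // Matches the literal s at the current position.
    static BacktrackingChoice string(String s) {
        return (in, pos) -> in.startsWith(s, pos)
                ? Optional.of(pos + s.length())
                : Optional.empty();
    }

    // Ordered choice with backtracking: every alternative is tried from the
    // same starting position, and the first one that succeeds wins.
    static BacktrackingChoice choice(BacktrackingChoice... ps) {
        return (in, pos) -> {
            for (BacktrackingChoice p : ps) {
                final Optional<Integer> r = p.parse(in, pos);
                if (r.isPresent()) return r;
            }
            return Optional.empty();
        };
    }
}
```

With these semantics the order of alternatives still matters: listing "1" before "128" would make the input "128" match only its first character, so longer alternatives should come first.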

Allow data to be serialised even if a ctor can't be found

Handle Collections singletonMap

Currently fails due to lack of default constructor.

Unify CodecCore implementation APIs (i.e. JsonCodecCore etc)

Should have consistent encode/decode entry points (in addition to bespoke methods).

funcj.codec - add support for type aliases

E.g. allow "X" to be used as an alias for class "a.b.c.d.X" when the type name is encoded (and decoded).
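One possible shape for this, sketched under the assumption that aliases are a simple two-way lookup consulted whenever a type name is written or read. `TypeAliases` is a hypothetical class, not an existing funcj.codec type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; TypeAliases is not an existing funcj.codec class.
final class TypeAliases {
    private final Map<String, String> aliasToName = new HashMap<>();
    private final Map<String, String> nameToAlias = new HashMap<>();

    // Registers alias as the encoded form of the fully qualified class name.
    TypeAliases register(String alias, String className) {
        aliasToName.put(alias, className);
        nameToAlias.put(className, alias);
        return this;
    }

    // Name to write when encoding; falls back to the class name itself.
    String encode(String className) {
        return nameToAlias.getOrDefault(className, className);
    }

    // Class name to resolve when decoding; unknown input is treated as a class name.
    String decode(String encoded) {
        return aliasToName.getOrDefault(encoded, encoded);
    }
}
```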

`sepBy1` should return `NonEmpty<T>`

Currently the signature of Parser::sepBy1 is

<SEP> Parser<I,IList<A>> sepBy1(Parser<I,SEP> sep)

Since sepBy1 ensures to match at least once, shouldn't it return a NonEmpty list? I propose to change the signature to

<SEP> Parser<I,IList.NonEmpty<A>> sepBy1(Parser<I,SEP> sep)

This would mirror the types of many and many1.

Equivalent inverse parser or manyTill parser?

How would I implement a parser that consumes and returns all characters until another Parser succeeds?
I'm trying to figure out how to parse a comment string between a comment char and the end of the line.
I've also looked at the Haskell Parsec documentation and found this, which might meet my need: https://hackage.haskell.org/package/parsec-3.1.11/docs/src/Text.Parsec.Combinator.html#manyTill
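The idea behind `manyTill` can be sketched over a plain string, independently of funcj. In this toy version the terminator is a single-character predicate rather than a full parser, and `ManyTillSketch` and `Match` are hypothetical names. Like Parsec's `manyTill`, it consumes the terminator and fails if the terminator never appears.

```java
import java.util.Optional;
import java.util.function.IntPredicate;

// Toy sketch over a String, not funcj's API: collects characters until a
// terminating character is seen.
final class ManyTillSketch {
    // Text gathered before the terminator, plus the position just past it.
    static final class Match {
        final String text;
        final int next;
        Match(String text, int next) { this.text = text; this.next = next; }
    }

    // Consumes characters from pos until isEnd accepts one. The terminator is
    // consumed but not included in the text. Returns empty if the input runs
    // out before a terminator is found.
    static Optional<Match> manyTill(String in, int pos, IntPredicate isEnd) {
        final StringBuilder sb = new StringBuilder();
        for (int i = pos; i < in.length(); ++i) {
            final char c = in.charAt(i);
            if (isEnd.test(c)) return Optional.of(new Match(sb.toString(), i + 1));
            sb.append(c);
        }
        return Optional.empty();
    }
}
```

For a line comment the call would start just after the comment character and use `c -> c == '\n'` as the terminator. A comment on the last line of a file without a trailing newline needs separate handling, since this sketch treats end of input as a failure.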

Need help with my grammar definition

Hi Jon,

I am trying to define a grammar for parsing a string that has something like a path structure, I have made some progress, but now I am stuck and would really appreciate any pointers on what I am doing wrong and how I can fix it.

I am trying to parse a kind of path which has segments. The rules are as follows:

A segment can be a normal, quoted, wildcard or wildcard_suffix segment
A path consists of one or more segments separated by a period .
normal segment consists of 0-9 a-z A-Z and _ characters
quoted segment is the same syntax as a JSON string
wildcard segment consists of a single * character
wildcard_suffix segment consists of a quoted segment (aka JSON string) followed by a * character
wildcard segment and wildcard_suffix segment can only appear as the last segment of a path.

Examples:
single segment paths: a, "*", "a", *
multi segment paths: a.b, a."b"*, a.*

I am able to parse single segment paths all right, but I am getting stuck trying to parse multi segment paths.

// singleSegmentPath works fine for parsing paths with only one segment
   private static final Parser<Chr, Path> singleSegmentPath =
      choice(
         quotedSegment,
         wildcardSuffixSegment,
         wildcardSegment,
         normalSegment
      ).map(Model2::path);

// parser is for parsing both single and multi segment paths, but does not work
   private static final Parser<Chr, Path> parser =
      choice(normalSegment, quotedSegment)
         .andL(period).many()
         .and(choice(
            quotedSegment,
            wildcardSuffixSegment,
            wildcardSegment,
            normalSegment
         ))
         .map(segments -> segment -> Model2.path(segments, segment));

Please let me know if you need more information.

thank you!

Calling many() on an uninitialized reference throws exception but it shouldn't.

There is a check in many() that throws an exception if this accepts the empty symbol, as reported by the acceptsEmpty method. acceptsEmpty is not implemented on uninitialized Refs, which prevents the many combinator from being used on refs:

Ref<Chr, Chr> r = Parser.ref();
Parser<Chr, IList<Chr>> rs = r.many();

I think the check should be postponed to when the ref is set.

Codec: registerAllowedPackage support for package wildcards

Extend the CodecConfigBuilder.registerAllowedPackage to support wildcards for packages. Provide an allowAllPackages method. Perhaps add a configureAsWriterOnly method as a shortcut to allow all packages and call failOnNoTypeConstructor(false).
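A sketch of how wildcard matching for allowed packages might work. `PackageFilter` is a hypothetical helper, not funcj.codec code, and the pattern syntax (a trailing `.*` matching a package and all of its sub-packages, and a bare `*` matching everything) is an assumption made for illustration.

```java
import java.util.List;

// Hypothetical sketch of wildcard matching for allowed packages.
final class PackageFilter {
    private PackageFilter() {}

    // A pattern is an exact package name ("a.b"), a prefix wildcard ("a.b.*",
    // matching a.b and every sub-package), or "*" for all packages.
    static boolean matches(String pattern, String packageName) {
        if (pattern.equals("*")) return true;
        if (pattern.endsWith(".*")) {
            final String base = pattern.substring(0, pattern.length() - 2);
            return packageName.equals(base) || packageName.startsWith(base + ".");
        }
        return pattern.equals(packageName);
    }

    // True if any registered pattern admits the package.
    static boolean isAllowed(List<String> patterns, String packageName) {
        return patterns.stream().anyMatch(p -> matches(p, packageName));
    }
}
```

Comparing against `base + "."` rather than `base` alone keeps `org.example.*` from also admitting an unrelated package such as `org.examples`.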

Partial parsing fails

Consider this test:

@Test
public void partialParse() {
    Text.string("ab").many().apply(Input.of("ababa")).getOrThrow();
}

I'm trying to extract the matching substring "abab" and leave a remaining "a" in the Input (hence the use of the apply method and not the parse method). However, this code fails with:

java.lang.RuntimeException: Failure at position 5, expected=a
	at org.typemeta.funcj.parser.Result$FailureOnExpected.getOrThrow(Result.java:252)
...

Replacing the final a with c results in success

@Test
public void partialParse() {
    Text.string("ab").many().apply(Input.of("ababc")).getOrThrow();
}

I suspect that the root cause is in:

    default Parser<I, IList<A>> many() {
        if (acceptsEmpty().apply()) {
            throw new RuntimeException("Cannot construct a many parser from one that accepts empty");
        }
        // We use an iterative implementation, in favour of a more concise recursive solution,
        // for performance, and to avoid StackOverflowExceptions.
        return new ParserImpl<I, IList<A>>(LTRUE, this.firstSet()) {
            @Override
            public Result<I, IList<A>> apply(Input<I> in, SymSet<I> follow) {
                IList<A> acc = IList.of();
                final SymSet<I> follow2 = follow.union(Parser.this.firstSet().apply());
                while (true) {
                    if (!in.isEof()) {
                        final I i = in.get();
                        if (Parser.this.firstSet().apply().matches(i)) {
                            final Result<I, A> r = Parser.this.apply(in, follow2);
                            if (r.isSuccess()) {
                                final Result.Success<I, A> succ = (Result.Success<I, A>) r;
                                acc = acc.add(succ.value());
                                in = succ.next();
                                continue;
                            } else {
                                return ((Result.Failure<I, A>)r).cast();
                            }
                        }
                    }
                    return Result.success(acc.reverse(), in);
                }
            }
        };
    }

IIUC, this code keeps repeating the parsing step as long as Parser.this.firstSet().apply().matches(i) holds. In the example, the last iteration fails because "a" doesn't match the entire pattern "ab". What is more, this causes the whole parse to return a failure. I would have expected it to return the results of the previous successful iterations (excluding the last, failed attempt)?

Maybe I'm completely off-base here. If so, could you give an example of a partial parse?
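The behaviour expected here can be stated as a small string-based sketch, independent of funcj: repeat a literal match and stop at the first position where the whole literal no longer matches, without treating a partial trailing match as a failure. `ManySketch` is a hypothetical name, and the empty-token check only loosely mirrors the acceptsEmpty guard quoted above.

```java
// Toy sketch, not funcj's many(): repeats a literal match and stops
// cleanly at the first mismatch.
final class ManySketch {
    private ManySketch() {}

    // Returns the position after the last complete occurrence of token,
    // scanning from pos. A partial trailing match (such as the "a" of "ab")
    // is neither consumed nor reported as a failure.
    static int many(String token, String in, int pos) {
        if (token.isEmpty()) {
            // An empty token would match forever without consuming input.
            throw new IllegalArgumentException("token must not be empty");
        }
        int next = pos;
        while (in.startsWith(token, next)) {
            next += token.length();
        }
        return next;
    }
}
```

Both "ababa" and "ababc" stop at position 4 under this definition, leaving the final character unconsumed.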

        var r = Parser.choice(
                              Text.string("AB"),
                              Text.string("AC"))
                      .parse(Input.of("AC"));
       System.out.println(r.isSuccess()); // false, although expect to be true

Shouldn't it try the second option ('AC') when the first one fails?

"@type": "java.util.Collections$SingletonList"
"@value": [ ... ]

how would you go about capturing a block comment?

I am trying to wrap my mind around how to parse the contents of something like the following cases:

/* content */
/**/
/***/

I tried the following:

string("/*")
                .andR(Combinators.<Chr>any().many())
                .andL(string("*/"))

choice(
        chr('*').andR(notChr('/')).map(v -> "*" + v),
        notChr('*').map(String::valueOf)
)
        .many()
        .between(string("/*"), string("*/"))

But no luck.

Thank you for such a great library!
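For comparison, the three cases listed above can be handled by a hand-written scanner that searches for the closing delimiter instead of deciding character by character. This sketch is independent of funcj's combinators, and `BlockComment` is a hypothetical helper.

```java
import java.util.Optional;

// Hand-written scanner sketch, independent of funcj's combinators.
final class BlockComment {
    private BlockComment() {}

    // If a block comment starts at pos, returns its body, i.e. the text
    // between the opening and closing delimiters. Returns empty if there is
    // no opener at pos or the comment is never closed.
    static Optional<String> body(String in, int pos) {
        if (!in.startsWith("/*", pos)) return Optional.empty();
        // Search after the opener so that its '*' cannot be reused by the closer.
        final int close = in.indexOf("*/", pos + 2);
        return close < 0
                ? Optional.empty()
                : Optional.of(in.substring(pos + 2, close));
    }
}
```

The `/***/` case is the awkward one for a character-level rule such as "a `*` followed by anything other than `/`": the middle `*` is content, but the `*` after it belongs to the closing delimiter, which takes two characters of lookahead to see.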
