haircomb

A TypeScript parser combinator library heavily inspired by Pidgin.

usage

This library provides a set of low-level, atom-level parsers that can be composed into chains to form more complex operations. It includes a complete enough set of atoms to handle most common and uncommon grammar use cases, but it does not include any higher-level parsers besides the included samples, which are themselves very limited.

Parsers are used in two phases. The first is the composition phase, where you combine parsers into chains using the provided operators, or create your own. The second is the execution phase, where an input state is passed to the parse method of the chain to produce a result.

All atoms are grouped into two broad categories: parsers and operators. Parsers are simple atoms that extract values from an input state. Operators are atoms that transform the result of other atoms and return their own result to the next parser in the chain.

Parsers also come in two forms: classes and functions. Class parsers inherit from the Parser<TToken, T> base class, which provides the infrastructure that all other parsers build on. They are simple classes whose goal is to take values out of the input state to produce their results. Function parsers build a DSL around class parsers to make them easier to compose and chain together. They are typically defined as higher-order functions, compatible with the pipeline operator proposal from TC39.
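To make the class/function split concrete, here is a minimal, self-contained sketch. It does not use haircomb's actual classes: `Outcome`, `PredicateParser`, `MapParser`, `satisfy` and `map` are hypothetical names invented for this illustration, and the `parse(input, at)` signature is an assumption rather than the library's real one.

```typescript
// Hypothetical, simplified stand-ins: haircomb's real types are richer.
type Outcome<T> =
    | { successful: true; result: T; next: number }
    | { successful: false; error: string };

abstract class Parser<TToken, T> {
    abstract parse(input: readonly TToken[], at: number): Outcome<T>;
}

// Class parser: takes one token out of the input when it satisfies a predicate.
class PredicateParser<TToken> extends Parser<TToken, TToken> {
    private readonly predicate: (token: TToken) => boolean;

    constructor(predicate: (token: TToken) => boolean) {
        super();
        this.predicate = predicate;
    }

    parse(input: readonly TToken[], at: number): Outcome<TToken> {
        if (at < input.length && this.predicate(input[at])) {
            return { successful: true, result: input[at], next: at + 1 };
        }
        return { successful: false, error: `unexpected token at index ${at}` };
    }
}

// Class operator: transforms the result of another parser.
class MapParser<TToken, T, U> extends Parser<TToken, U> {
    private readonly inner: Parser<TToken, T>;
    private readonly project: (value: T) => U;

    constructor(inner: Parser<TToken, T>, project: (value: T) => U) {
        super();
        this.inner = inner;
        this.project = project;
    }

    parse(input: readonly TToken[], at: number): Outcome<U> {
        const outcome = this.inner.parse(input, at);
        return outcome.successful
            ? { successful: true, result: this.project(outcome.result), next: outcome.next }
            : outcome;
    }
}

// Function parsers: thin wrappers that hide the constructor calls.
const satisfy = <TToken>(predicate: (token: TToken) => boolean): Parser<TToken, TToken> =>
    new PredicateParser(predicate);

// Higher-order function: takes a projection and returns a function awaiting a parser.
const map = <T, U>(project: (value: T) => U) =>
    <TToken>(parser: Parser<TToken, T>): Parser<TToken, U> =>
        new MapParser(parser, project);

// Composition happens here; nothing is parsed until parse(...) is called.
const upperA = map((c: string) => c.toUpperCase())(satisfy<string>(c => c === "a"));
```

The function layer adds no behaviour of its own. It only shortens the construction of class parsers, which is the role the DSL plays in this library.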

As an interim solution, this library also includes a pipe(...) operator that can be used to compose parsers.

parsers

To configure a parser like the TokenSequenceParser<TToken, T> to parse a sequence of a single token like a character out of a string, you could use the following snippet of code:

const lowerCaseA = new TokenSequenceParser<Char, Char>([ "a" ]);

But a much simpler way of accomplishing this is to use the provided DSL function, which takes care of the boilerplate for you:

const lowerCaseA = ofChar("a");

As you can see, even though there are over 25 parser functions, those parsers are very bare-bones and cannot accomplish much on their own. This is why you need operators to combine them into chains.

operators

You can use operators to define a chain of parsers to accomplish more complex tasks. For instance, you could use the following incredibly complicated snippet of code to parse a simple quoted string:

const quote = new TokenSequenceParser([ '"' ]);
const quotedString = new Map3Parser<Char, string, string, string, string>(
    quote,
    // Content: one or more non-quote characters joined into a string,
    // or an empty string when there are none.
    new OneOfParser<Char, string>(
        new Map1Parser<Char, Char[], string>(
            acc => acc.join(""),
            new ChainAtLeastOnceLParser<Char, string>(
                new CandidateTokenParser<Char, Char>(c => c !== '"'),
                () => [] as Char[],
                (acc, c) => { acc.push(c); return acc; }
            )
        ),
        new ReturnParser("")
    ),
    quote,
    // Keep only the content between the two quotes.
    (leftQuote, content, rightQuote) => content
);

This chain starts by looking for a quote character, followed by one or more non-quote characters (or an empty string when there are none), and ends at another quote character. In the end, it returns the content of the string. Or, you could simply use the provided DSL, which builds on top of these parsers to provide a much friendlier syntax:

const quote = ofChar('"');
const quotedString = pipe(
    manyString(),
    between(quote)
)(candidate(c => c !== '"'));

Here you can see how the pipe operator enables you to compose multiple parsers together, returning a higher-order function ready to accept another parser as its input. You can also see how easily parsers can be reused multiple times, as they are all stateless components.
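The mechanics of that composition can be sketched without the library. The block below is a toy model, not haircomb's implementation: parsers are plain functions over a string, and `pipe`, `candidate`, `ofChar`, `manyString` and `between` are simplified re-creations whose signatures and behaviour (for example, `manyString` accepting zero matches) are assumptions made for illustration.

```typescript
// Toy model: a parser is a function from (input, position) to a match or null.
type Match<T> = { value: T; next: number };
type Parser<T> = (input: string, at: number) => Match<T> | null;
type Operator<A, B> = (parser: Parser<A>) => Parser<B>;

// Composes operators left to right and returns a function awaiting a parser.
function pipe<A, B>(op1: Operator<A, B>): Operator<A, B>;
function pipe<A, B, C>(op1: Operator<A, B>, op2: Operator<B, C>): Operator<A, C>;
function pipe(...operators: Operator<any, any>[]): Operator<any, any> {
    return parser => operators.reduce((current, operator) => operator(current), parser);
}

// Matches a single character satisfying the predicate.
const candidate = (predicate: (c: string) => boolean): Parser<string> =>
    (input, at) =>
        at < input.length && predicate(input.charAt(at))
            ? { value: input.charAt(at), next: at + 1 }
            : null;

const ofChar = (expected: string): Parser<string> => candidate(c => c === expected);

// Repeats a parser zero or more times and concatenates the results.
const manyString = (): Operator<string, string> => parser => (input, at) => {
    let value = "";
    let next = at;
    for (;;) {
        const match = parser(input, next);
        if (match === null) break;
        value += match.value;
        next = match.next;
    }
    return { value, next };
};

// Requires the same edge parser on both sides and keeps only the inner value.
const between = (edge: Parser<string>): Operator<string, string> => parser => (input, at) => {
    const open = edge(input, at);
    if (open === null) return null;
    const inner = parser(input, open.next);
    if (inner === null) return null;
    const close = edge(input, inner.next);
    return close === null ? null : { value: inner.value, next: close.next };
};

const quote = ofChar('"');
const quotedString = pipe(
    manyString(),
    between(quote)
)(candidate(c => c !== '"'));
```

Because every parser here is a pure function with no internal state, `quote` can safely appear on both sides of `between`, which mirrors the statelessness point made above.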

execution

Once your parsing chain is configured, you can use the parse(...) method to attempt to produce an AST from an input value. The structure of this AST is defined by how you build the chain of parsers: you create it by using the map function to project the raw result of a parser into a more complex object. For a concrete example of how this is done, take a look at the included JsonParser sample, which produces a structure of JsonObject, JsonMember, JsonArray, and JsonString.

A typical execution would look like this:

const result = parse('"a quoted string"').with(quotedString);

This returns a ParseResult<TToken, T>. The result either represents a success, in which case the successful property is set to true and the result property holds a meaningful value, or an error, in which case the successful property is set to false and the error property contains a description of the error.
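One way to consume such a result is to branch on the successful property. The sketch below assumes a simplified shape: the `ParseResult<T>` union (with TToken omitted), the string-typed `error`, and the `unwrap` helper are all hypothetical and only mirror the three property names mentioned above. The library's actual type may differ.

```typescript
// Assumed shape, for illustration only; TToken is omitted for brevity.
type ParseResult<T> =
    | { successful: true; result: T }
    | { successful: false; error: string };

// Returns the parsed value, or throws with the error description.
function unwrap<T>(outcome: ParseResult<T>): T {
    if (outcome.successful) {
        return outcome.result;
    }
    throw new Error(`Parse error: ${outcome.error}`);
}

// Hand-built values standing in for what a parse(...) call might return.
const ok: ParseResult<string> = { successful: true, result: "a quoted string" };
const failed: ParseResult<string> = { successful: false, error: "unexpected end of input" };
```

Modelling the two cases as a union keyed on `successful` lets the compiler reject any access to `result` before the flag has been checked.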

haircomb's Issues

JSON sample should handle whitespace better

Given the current JSON sample parser
When it parses objects and arrays containing whitespace like [\t ] or { \t\n}
Or when it parses root objects and arrays surrounded by whitespace like \n\t{...} \t
Then it should successfully parse the JSON input.
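One possible direction for this issue is a wrapper that skips whitespace around another parser. The sketch below is not taken from the JSON sample: it uses a toy function-based parser model, and `skipWhitespace`, `padded`, `literal` and `emptyArray` are hypothetical names. The whitespace set is the one JSON defines (space, tab, line feed, carriage return).

```typescript
// Toy model: a parser is a function from (input, position) to a match or null.
type Match<T> = { value: T; next: number };
type Parser<T> = (input: string, at: number) => Match<T> | null;

// Advances past any JSON whitespace characters and returns the new position.
const skipWhitespace = (input: string, at: number): number => {
    let next = at;
    while (next < input.length && " \t\r\n".includes(input.charAt(next))) {
        next++;
    }
    return next;
};

// Wraps a parser so that whitespace before and after its match is ignored.
const padded = <T>(parser: Parser<T>): Parser<T> => (input, at) => {
    const match = parser(input, skipWhitespace(input, at));
    return match === null
        ? null
        : { value: match.value, next: skipWhitespace(input, match.next) };
};

// Matches an exact piece of text.
const literal = (text: string): Parser<string> => (input, at) =>
    input.startsWith(text, at) ? { value: text, next: at + text.length } : null;

// Empty array whose brackets may be surrounded or separated by whitespace.
const emptyArray: Parser<string[]> = (input, at) => {
    const open = padded(literal("["))(input, at);
    if (open === null) return null;
    const close = padded(literal("]"))(input, open.next);
    return close === null ? null : { value: [], next: close.next };
};
```

Applying the wrapper to every structural token covers both cases in the scenario: whitespace inside brackets and whitespace around the root value.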

Should the pipe operator be a higher-order function?

The current version of the pipe operator returns a function that takes a parser instead of returning a parser directly.

export function pipe<TToken, T>(...): (parser: Parser<TToken, T>) => Parser<TToken, T>;
// instead of
export function pipe<TToken, T>(parser: Parser<TToken, T>, ...): Parser<TToken, T>;

Which one makes more sense?
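To compare the two shapes side by side, here is a toy sketch in which either form can be written in terms of the other. It is not the library's code: `pipeCurried`, `pipeDirect`, and the sample parser and operators are hypothetical, and parsers are reduced to plain functions over a string.

```typescript
// Toy model: a parser maps an input string to a value or null.
type Parser<T> = (input: string) => T | null;
type Operator<T> = (parser: Parser<T>) => Parser<T>;

// Curried form (the current behaviour described above): returns a function awaiting a parser.
const pipeCurried = <T>(...operators: Operator<T>[]) =>
    (parser: Parser<T>): Parser<T> =>
        operators.reduce((current, operator) => operator(current), parser);

// Direct form (the alternative): takes the parser first and returns a parser.
const pipeDirect = <T>(parser: Parser<T>, ...operators: Operator<T>[]): Parser<T> =>
    pipeCurried(...operators)(parser);

// Sample parser and operators used only to exercise both forms.
const firstChar: Parser<string> = input => (input.length > 0 ? input.charAt(0) : null);
const upper: Operator<string> = parser => input => {
    const value = parser(input);
    return value === null ? null : value.toUpperCase();
};
const exclaim: Operator<string> = parser => input => {
    const value = parser(input);
    return value === null ? null : value + "!";
};

const viaCurried = pipeCurried(upper, exclaim)(firstChar);
const viaDirect = pipeDirect(firstChar, upper, exclaim);
```

The curried form yields a reusable chain of operators that can be applied to several parsers, while the direct form reads like an ordinary call. Since each is a one-line wrapper over the other, offering both is also an option.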

Failed parse sessions do not provide meaningful expected results

Given a parser like the JSON parser sample
When a parse session fails
Then the expected result that is part of the final error should render the entire Expected tree
But the current version seems to badly render some cases:

AssertionError: Parse error.
Token did not match expected predicate or token list.
Unexpected: " "
Expected:
    <Expected: root JObject>
    Due to: 
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
        <Expected tokens: <{, }>>
    <Expected: root JArray>
    Due to: 
        <Expected tokens: <[, ]>>
        <Expected tokens: <[, ]>>
        <Expected tokens: <[, ]>>
        <Expected tokens: <[, ]>>
        <Expected tokens: <[, ]>>
        <Expected tokens: <[, ]>>
        <Expected tokens: <[, ]>>
at line 1, column 1: expected false to be true
Expected :true
Actual   :false

    at expectJson (lib\parse.specs.ts:13:45)
    at Context.<anonymous> (lib\parse.specs.ts:26:19)
