Giter Site home page Giter Site logo

spartanz / parserz Goto Github PK

View Code? Open in Web Editor NEW
70.0 70.0 7.0 318 KB

A purely-functional library for creating both parsers, pretty-printers, and grammar definitions from a single, type-safe specification of a grammar

License: Apache License 2.0

Scala 100.00%
functional-programming invertible parser-combinators scala

parserz's People

Contributors

danielyli avatar gitter-badger avatar ryanonsrc avatar sergei-shabanau avatar xuwei-k avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

parserz's Issues

V.0.2

Use free encoding for library internals

Explore removing nested tuples and eithers

Given we have the following combinators

def ~ [SI1 <: SI, SO1 >: SO, E1 >: E, B](that: Grammar[SI1, SO1, E1, B]): Grammar[SI1, SO1, E1, A /\ B]
def | [SI1 <: SI, SO1 >: SO, E1 >: E, B](that: Grammar[SI1, SO1, E1, B]): Grammar[SI1, SO1, E1, A \/ B]

that produce nested structures when chained, e.g.

val g1: Grammar[SI1, SO1, E, ((A, B), C)] = a ~ b ~ c

Explore the concept of building a parser/printer that skips creation of nested structures when executed.

Idea is to use a witness type with 2 parameters

type G[_, _]

where 1st is a phantom type to capture what is produced by using parser combinators and 2nd is some representation of the 1st.

For example

sealed trait Equiv[A, Repr]
object Equiv {
  def caseClass1[Z, A](f: A => Z, g: Z => Option[A]): Equiv[A, Z] = ???
  def caseClass2[Z, A, B](f: (A, B) => Z, g: Z => Option[(A, B)]): Equiv[(A, B), Z] = ???
  def caseClass3[Z, A, B, C](f: (A, B, C) => Z, g: Z => Option[(A, B, C)]): Equiv[((A, B), C), Z] = ???
}

This way a

final case class Person(name: String, age: Int, address: String)

can be isomorphic to

((String, Int), String)

via means of Equiv.

def to[Z](implicit equiv: Equiv[A, Z]): Grammar[SI, SO, E, Z]

can be used to describe an intent for isomorphic transformation.
Existing program in terms of grammars should be kept to be further optimized into an efficient code that is not doing extra allocations.

`alt` for `A | A`

Handling a | b | c is cumbersome due to nested disjunction handling, e.g.

case Left(Left(a)) => 

and requires code changes if combination also changes, e.g. becomes a | b | c | d.

There must be a way to remove the clutter from handling function.

Composition: use one codec after another

I'm trying to understand the use case and how it can be expressed.

We have

case class Codec[I, A](eq: Equiv[I, (I, A)])

which consumes a bit of I to produce A and returns (remainder of I) + A.

We already have product operation to combine two codecs

def ~ [B](that: Codec[I, B]): Codec[I, (A, B)]

which consumes a bit of I to produce A and then consumes a bit of remainder of I to produce B and returns (what is yet remained of I) + (A, B).

I've built (see the difference in that type parameters)

def >> [B](that: Codec[A, B]): Codec[I, (A, B)]

which consumes a bit of I to produce A and then consumes a bit of A to produce B and returns (the remainder of I) + (the remainder of A) + B.

Questions:

  1. Is the use case real?
  2. The signature is confusing, i.e. going from I to (A, B) in both cases, and I believe product is the logical assumption here. Can def >> be expressed differently?
  3. Obviously Codec isn't a monad. Does def >> has anything to do with monadic composition?

Explicit error type parameter

Define parsers as

sealed trait Parsing[F[_], G[_], E]

and require ApplicativeError[F, E] instead of Applicative[F]

Hard question: how it affects parsers that rely on "optional" effect?

Do not use remainder of input

These are current signatures of parser and printer

def parser[S, E, A](grammar: Grammar[S, S, E, A]): (S, Input) => (S, E \/ (Input, A))
def printer[S, E, A](grammar: Grammar[S, S, E, A]): (S, (Input, A)) => (S, E \/ Input)

They assume chipping off the input (and appending to current output) at every consumption step defined by a user, for example

  val spacing: G[Unit] = consumePure(
    cs => (cs.dropWhile(c => c == ' ' || c == '\n' || c == '\r'), ()),
    { case (cs, _) => ' ' :: cs }
  )

Because it's defined by user it may be arbitrary and inefficient. Because it's mandatory, it is also a source of inefficiency due to allocations that must happen.

The idea is to prototype a solution that allows inner optimizations of accessing the next consumed piece of input, for example using random access, without needing to have user-defined functions and requiring allocations.
This means instead of user-defined functions we have to allow to describe how consumption is supposed to happen. The complexity arises from the fact that Input is currently user-defined type and "description" would require to have implementation of consumption in the library itself, so we may need to constraint the Input to only certain types, i.e. specialize on certain types of Input.

Hide state from user API

Currently state is exposed in grammars

sealed abstract class Grammar[-SI, +SO, +E, A]

This results in state container (a Tuple) being allocated and returned at each step of execution

case Grammar.GADT.Produce(a)      => (s: S, i: Input) => (s, Right((i, a)))

which is a huge performance overhead.

What if state is only returned at the very end?
So instead of

def parser[S, E, A](grammar: Grammar[S, S, E, A]): (S, Input) => (S, E \/ (Input, A))

we have

def parser[E, A](grammar: Grammar[E, A]): Input => (State, E \/ (Input, A))

Derivation of Equiv instances

Given the idea described in #75, will it be possible to use Magnolia to derive instances like

final case class Person(name: String, age: Int, address: String)

implicit val personEquiv: Equiv[((String, Int), String), Person] = 
  Equiv.caseClass(Person.apply(_, _, _), Person.unapply)

via a concise syntax

implicit val personEquiv: Equiv[((String, Int), String), Person] = equiv[Person]

?

`InSet` combinator

To describe things that are parts of a predefined set, like chars or strings. To remove boilerplate. It can play nice with bnf rendering too.

Using `Option` as effect

It is not possible to use Option because there is no ApplicativeError instance for it.

TODO: Use Unit \/ A in the example.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.