cl-peg-yapp

Stands for "yet another PEG parser".

This is a hand-rolled implementation of a PEG parser generator from first principles, written for my own edification. The main objective is parser generation: given a PEG grammar as input, it produces a generated parser that can scan and parse a valid sample of the language.

The end-user API is heavily inspired by cl-peg, with the difference that it is hopefully more readable, thanks to liberal use of macros and no reliance on CLOS. Tests are also included in-line.

It can also expose and pretty-print the resulting node tree, with the PEG expression definitions represented as a unified struct called a match. The current implementation also uses packrat caching, since my use cases lean heavily towards trading memory for speed.
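To make the packrat idea concrete, here is a minimal sketch of memoizing a rule's result per (rule, position) pair, which is the memory-for-speed tradeoff mentioned above. Every name in it (`*cache*`, `*calls*`, `with-packrat`, `parse-digits`) is hypothetical and chosen for illustration; it is not taken from cl-peg-yapp and may differ from how the library caches internally.

```lisp
;; Minimal packrat sketch: memoize rule results per (rule . position).
;; All names here are hypothetical and not part of cl-peg-yapp.

(defvar *cache* (make-hash-table :test #'equal)
  "Maps (rule-name . position) to the cached end position, or :fail.")

(defvar *calls* 0
  "Counts how many times a rule body actually runs (cache misses).")

(defmacro with-packrat ((rule pos) &body body)
  "Run BODY once per (RULE . POS); later calls reuse the stored result."
  (let ((key (gensym "KEY")) (hit (gensym "HIT")) (found (gensym "FOUND")))
    `(let ((,key (cons ,rule ,pos)))
       (multiple-value-bind (,hit ,found) (gethash ,key *cache*)
         (if ,found
             ,hit
             (setf (gethash ,key *cache*)
                   (progn (incf *calls*) ,@body)))))))

(defun parse-digits (input pos)
  "Digits <- [0-9]+ ; returns the end position of the match, or :fail."
  (with-packrat (:digits pos)
    (let ((end (or (position-if-not #'digit-char-p input :start pos)
                   (length input))))
      (if (> end pos) end :fail))))
```

Calling `(parse-digits "123abc" 0)` twice runs the rule body only once; the second call is answered from the hash table. The cost is one table entry per rule and position tried, failures included, and the table has to be cleared (for example with `clrhash`) before parsing a different input.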

Please note the flavor of PEG implemented is very basic and opinionated. The main caveats are:

  • Expression definitions are always CamelCase, starting with an uppercase letter.
  • Unicode literals (e.g. u10EA44) can be parsed, but other literals (binary, etc.) are not implemented.
  • Definition arrows are single-dash, but both a Unicode arrow and a compound arrow are accepted.
  • String literals ('0', "0"), escaped chars (via \) and wildcard chars (.) are supported.

I'm considering making the definitions more flexible so anyone can bring their own PEG definitions, but for now it's not a priority. However, the current implementation is tested against a flavor of the venerable C-based markdown PEG in 'leg', and that seems to work fine.

Usage

The external API is a bit clunky, but the tradeoff is that it gives you more insight into your grammar. Consider this example:

  • You have a markdown PEG definition, say like this one
  • You'd like to load the definition and use it to parse some markdown

You would accomplish that with this code:

;; load the system
(load "~/quicklisp/setup.lisp")
(ql:quickload "cl-peg-yapp")

;; load your grammar (defparameter declares the variable before use)
(defparameter *my-grammar*
  (cl-peg-yapp:parse-grammar #p"src/tests/grammars/markdown.peg"))

;; here you'll see the node structure of your grammar, e.g.
;; #(M :KIND :PATTERN :START 0000 :END 20261 matched str: >>>|Doc <-       BOM? Block* `Newline`  `Newline` Block <-     BlankLine* (BlockQuote `Newline`               /  Verbatim `Newline`               /  Note `Newline`               /  Reference `Newline`               /  HorizontalRule `Newline`               /  Heading `Newline`               /  OrderedList `Newline` ...  

;; now you can check that your nodes are correct, or parse your target
;; file via a generated parser function:

(defparameter *my-grammar-parser* (cl-peg-yapp:generate *my-grammar*))
(funcall *my-grammar-parser* #p"my-file.md")

;; you'll also get a node tree structure representing the parsed input:
;;
;; #(M :KIND :DOC :START 0000 :END 0043 matched str: >>>|## Hey  `Newline`  `Newline` This is me, [my link](example.com)|<<<:CHILDREN
;;    #(M :KIND :BLOCK :START 0000 :END 0008 matched str: >>>|## Hey  `Newline` |<<<:CHILDREN
;;       #(M :KIND :HEADING :START 0000 :END 0008 matched str: >>>|## Hey  `Newline` |<<<:CHILDREN
;;          #(M :KIND :ATXHEADING :START 0000 :END 0008 matched str: >>>|## Hey  `Newline` |<<<:CHILDREN
;;             #(M :KIND :ATXSTART :START 0000 :END 0002 matched str: >>>|##|<<<)
;;             #(M :KIND :SP :START 0002 :END 0003 matched str: >>>| |<<<:CHILDREN
;;                #(M :KIND :SPACECHAR :START 0002 :END 0003 matched str: >>>| |<<<))
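Once you have a tree like the one printed above, a typical next step is to walk it. The sketch below uses a hypothetical stand-in struct named `node` with `kind`, `start`, `end` and `children` slots, mirroring the fields visible in the printed output. The real match struct and its accessors in cl-peg-yapp may be named differently, so treat this as an illustration of the traversal only.

```lisp
;; Hypothetical stand-in for a match node; the actual struct in
;; cl-peg-yapp may use different names and slots.
(defstruct (node (:constructor make-node (kind start end &optional children)))
  kind start end children)

(defun collect-kinds (node)
  "Return the kinds of NODE and all its descendants, depth-first."
  (cons (node-kind node)
        (mapcan #'collect-kinds (node-children node))))

(defun matched-string (node input)
  "Return the slice of INPUT covered by NODE."
  (subseq input (node-start node) (node-end node)))

;; A hand-built copy of the :ATXHEADING subtree shown in the output above.
(defparameter *input* (format nil "## Hey ~%"))

(defparameter *tree*
  (make-node :atxheading 0 8
             (list (make-node :atxstart 0 2)
                   (make-node :sp 2 3
                              (list (make-node :spacechar 2 3))))))
```

With these definitions, `(collect-kinds *tree*)` lists the node kinds in depth-first order, and `(matched-string (first (node-children *tree*)) *input*)` recovers the text matched by the first child.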

Testing

There's a considerable number of tests available for the project. To run them:

;; load 5am
(ql:quickload "fiveam")

;; load the system
(ql:quickload "cl-peg-yapp")

;; run all the tests
(5am:run! 'cl-peg-yapp:peg-suite)

Tests are organized into suites, which are defined in test-suite.lisp.

Contributors

maxarturo