lcsb-biocore / pikaparser.jl

Pure Julia implementation of pika parser.

Home Page: https://lcsb-biocore.github.io/PikaParser.jl

License: Apache License 2.0

Julia 100.00%

julia parser-library

pikaparser.jl's Issues

complete scanning of `canMatchZeroChars`

Apparently, there are convoluted cases in which deriving canMatchZeroChars in topological order gives wrong results.

Change the derivation to a flood-fill algorithm.
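As a rough illustration of what a flood-fill derivation could look like, here is a minimal sketch on a toy grammar. The clause types (`Terminal`, `Epsilon`, `Seq`, `Choice`, `ZeroOrMore`, `OneOrMore`) and the function `can_match_zero` are hypothetical names made up for this example and are not PikaParser's internal data structures. The idea is to start from an empty set of clauses known to match zero characters and to re-check the parents of every clause that gets added, so the result does not depend on any ordering of the rules and cycles are handled naturally.

```julia
# Toy clause representation (hypothetical, not PikaParser's internal types).
abstract type Clause end
struct Terminal <: Clause end                  # consumes at least one token
struct Epsilon <: Clause end                   # matches the empty input
struct Seq <: Clause; children::Vector{Symbol}; end
struct Choice <: Clause; children::Vector{Symbol}; end
struct ZeroOrMore <: Clause; child::Symbol; end
struct OneOrMore <: Clause; child::Symbol; end

# Rules referenced by a clause.
children(c::Union{Seq,Choice}) = c.children
children(c::Union{ZeroOrMore,OneOrMore}) = [c.child]
children(::Clause) = Symbol[]

# Can the clause match zero characters, given the rules already known to do so?
nullable_now(::Terminal, known) = false
nullable_now(::Epsilon, known) = true
nullable_now(c::Seq, known) = all(in(known), c.children)
nullable_now(c::Choice, known) = any(in(known), c.children)
nullable_now(::ZeroOrMore, known) = true
nullable_now(c::OneOrMore, known) = c.child in known

function can_match_zero(grammar::Dict{Symbol,Clause})
    # Reverse edges: for each rule, the rules that refer to it.
    parents = Dict(name => Symbol[] for name in keys(grammar))
    for (name, clause) in grammar, child in children(clause)
        push!(parents[child], name)
    end
    known = Set{Symbol}()
    queue = collect(keys(grammar))
    while !isempty(queue)
        name = pop!(queue)
        name in known && continue
        if nullable_now(grammar[name], known)
            push!(known, name)
            append!(queue, parents[name])   # flood outward to dependents
        end
    end
    return known
end

# :a and :b refer to each other, which a single ordered pass may mishandle.
grammar = Dict{Symbol,Clause}(
    :a => Seq([:b, :c]),
    :b => Choice([:a, :e]),
    :c => ZeroOrMore(:t),
    :d => OneOrMore(:t),
    :e => Epsilon(),
    :t => Terminal(),
)
zero_rules = can_match_zero(grammar)
```

Because the set only ever grows, the loop terminates, and every rule is re-examined whenever one of its children changes status.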

support SOF/EOF marks

...because it's much handier than checking whether the match really consumed the whole input

Extra clauses to simplify frontend

Counted repetition, similar to (something){123} in regex. Possibly doable by using the option index in ZeroOrMore (or OneOrMore) to count the matched repetitions.
A sepBy alternative (can be generalized to variants such as alternated / cycled / ...)
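One way to picture both requests is as frontend helpers that expand into clauses that already exist. The sketch below uses plain tuples as a stand-in for clauses; `seq`, `many`, `sep_by1` and `repeat_exactly` are hypothetical names for this illustration only, and the naive expansion of counted repetition shown here is a different technique from the option-index idea mentioned above.

```julia
# Tuples stand in for grammar clauses (an assumption made for illustration).
seq(xs...) = (:seq, xs...)
many(x) = (:many, x)            # zero or more repetitions

# One or more items separated by sep:  item (sep item)*
sep_by1(item, sep) = seq(item, many(seq(sep, item)))

# Exactly n repetitions, expanded naively into a sequence of n copies.
repeat_exactly(item, n::Integer) = seq(ntuple(_ -> item, n)...)

list_rule = sep_by1(:number, :comma)
triple_rule = repeat_exactly(:digit, 3)
```

The naive expansion grows linearly with the count, which is presumably why the issue suggests counting repetitions inside the repetition clause instead.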

Precedence

Hi, I was trying to parse regexes with this parser and couldn't get the | operator to work.

For abc|de I get

sequence(
    expr(char()),
    expr(char()),
    expr(either(sequence(expr(char())), pipe(), sequence(expr(char()), expr(char())))),
)

This happens because I don't know how to specify that | should take the longest possible sequences on its left and right.

The relevant grammar parts are:

    :sequence => P.one_or_more(:expr),
    :expr => P.first(
        :either, :negative_lookahead, :positive_lookahead, :positive_lookbehind,
        :negative_lookbehind, :zero_or_more, :one_or_more, :repetition,
        :repetition_at_least, :repetition_from_to, :maybe, :noncapturing_group,
        :capturing_group, :not_set, :set, :specialized_char, :dot,
        :normalized_char, :char, :_begin, :_end,
    ),
    :either => P.seq(:sequence, :pipe, :sequence),

I saw that https://github.com/lukehutch/pikaparser has precedence integers for clauses, but I didn't see anything similar here. Is that a missing piece of the puzzle?
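A common way to express "| binds more loosely than concatenation" without explicit precedence numbers is to layer the rules, so that the loosest operator sits at the top and its operands are whole sequences. The sketch below is a hand-written recursive-descent parser, not PikaParser code, and it handles only plain characters and `|`; the function names and the tuple-shaped output are assumptions made for the illustration. It shows the layering only: alternation is made of sequences, and sequences are made of atoms.

```julia
# Layered rules (hypothetical illustration, not PikaParser):
#   alternation := sequence ('|' sequence)*    loosest-binding rule on top
#   sequence    := atom+
#   atom        := any character except '|'

function parse_sequence(chars, i)
    atoms = Char[]
    while i <= length(chars) && chars[i] != '|'
        push!(atoms, chars[i])
        i += 1
    end
    isempty(atoms) && error("expected at least one character at position $i")
    return (:sequence, atoms), i
end

function parse_alternation(input::AbstractString)
    chars = collect(input)
    branch, i = parse_sequence(chars, 1)
    branches = [branch]
    while i <= length(chars)          # chars[i] is necessarily '|' here
        branch, i = parse_sequence(chars, i + 1)
        push!(branches, branch)
    end
    # A single branch is just a sequence; several branches form an alternation.
    return length(branches) == 1 ? branches[1] : (:either, branches)
end

tree = parse_alternation("abc|de")
```

In grammar terms this corresponds to making the alternation rule the entry point and letting it refer to the sequence rule, instead of listing the alternation among the alternatives of a single expression rule. Whether that restructuring alone is enough for the full regex grammar above has not been checked here.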

codecov token for CI

@laurentheirendt, can you please add the codecov token to this repo (just as with the other repos)?

We've got a nice 99% coverage so better report it. :D

[Feature request] Accept Regex expressions in `Scan`

Proposal

I often find that some clauses are more easily parsed with a regex than with PikaParser clauses. The current workaround is to use a Scan, in a way similar to:

rules = Dict(
    ...,
    :id => PikaParser.scan() do x
        matched = match(r"^[a-z][a-zA-Z0-9_]*", x)
        isnothing(matched) && return 0
        length(matched.match)
    end,
    ...,
)

It would be great if we could just pass the regex to scan.

Unsolved issues

Only regexes of the form r"^..." should be accepted. If the ^ anchor is not present, the regex will search for the pattern throughout the whole input instead of matching only at its start.
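One possible way around the anchoring problem is to add the anchor automatically instead of rejecting unanchored regexes. The sketch below is only an assumption about how that could be done: `anchored` and `match_length` are hypothetical helpers, not part of PikaParser, and the input is assumed to be a `String` or `SubString{String}`. It wraps the user's pattern in a non-capturing group prefixed with `^`, keeps the original regex flags, and returns 0 for "no match", following the convention of the snippet above.

```julia
# Hypothetical helper: rebuild the regex so it can only match at the very
# start of the input, preserving the original compile and match options.
anchored(re::Regex) =
    Regex("^(?:" * re.pattern * ")", re.compile_options, re.match_options)

# Length (in characters) of the match at the start of the input, or 0 if none.
function match_length(re::Regex, input::Union{String,SubString{String}})
    m = match(anchored(re), input)
    return m === nothing ? 0 : length(m.match)
end

id = r"[a-z][a-zA-Z0-9_]*"            # note: no leading ^ needed
len_hit = match_length(id, "foo_1 + x")
len_miss = match_length(id, "1foo")   # "foo" occurs later, but not at the start
```

Wrapping the pattern in `(?:...)` keeps a top-level `|` in the user's regex from escaping the anchor. Patterns that rely on extended-mode comments reaching the end of the pattern would need extra care.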

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!
