
Comments (12)

Valloric avatar Valloric commented on July 17, 2024

One more thing I just noticed: if I change the \t in the Spacechar rule to, say, x so that Spacechar <- " " / "x", I get the same wrong output. It doesn't matter what character I use instead of \t; it seems the problem is in /, that is the ordered or.
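
For readers less familiar with PEGs: the ordered choice `/` tries its alternatives left to right and commits to the first one that matches. The sketch below is a hand-written illustration of that rule for Spacechar <- " " / "\t", using only the standard library. It is not Pegged's implementation, and the names Match, literal and spacechar are made up for the example.

```d
import std.algorithm.searching : startsWith;

/// Result of one match attempt: success flag and the position after the match.
struct Match
{
    bool ok;
    size_t end;
}

/// Try to match the literal `lit` at position `pos` of `input`.
Match literal(string input, size_t pos, string lit)
{
    if (input[pos .. $].startsWith(lit))
        return Match(true, pos + lit.length);
    return Match(false, pos);
}

/// Ordered choice over two literals: the first alternative that matches wins,
/// and later alternatives are not tried.
Match spacechar(string input, size_t pos)
{
    foreach (lit; [" ", "\t"])
    {
        auto m = literal(input, pos, lit);
        if (m.ok)
            return m;
    }
    return Match(false, pos);
}

void main()
{
    assert(spacechar(" x", 0) == Match(true, 1));  // first alternative
    assert(spacechar("\tx", 0) == Match(true, 1)); // second alternative
    assert(!spacechar("x", 0).ok);                 // neither matches
}
```

With a correct ordered choice, swapping "\t" for any other single character should change only which inputs match, not the shape of the result.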

from pegged.

PhilippeSigaud avatar PhilippeSigaud commented on July 17, 2024

On Mon, Oct 1, 2012 at 5:46 AM, Val Markovic wrote:

[1] Which is just terrible BTW. I know, it's not your fault, the original
peg-markdown grammar has the bugs, I checked. I'm improving it so that it's
correct and uses the very nice Pegged extensions and I'll pull-request the
new grammar once I'm done.

Thanks a lot, and that's a pull request I find most exciting! I hope
parameterized rules will help in dealing with the dozens of HTML rules. I
recently found a bug in them (param rules), so I'll try to correct it
quickly.

I corrected some bugs in the C and D grammars (some are left in D, though), and
Markdown was next on my list. Apart from the bugs, I find the parse tree
delivered by this grammar to be strangely constructed, due to the way the
grammar was written.

The next step will be to use it to parse the docs themselves. Then, writing
a tree-walking function that delivers LaTeX or raw text (or inserts
examples) from a .md file will be easy to code.
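
As a rough idea of what such a tree-walking function could look like, here is a minimal sketch. It assumes a simplified node type (name, matches, children) loosely shaped like the trees printed in this thread; the Node struct and the rule names Doc.Header and Doc.Text are hypothetical and do not come from the Markdown grammar.

```d
import std.array : join;

/// Hypothetical minimal parse-tree node (not Pegged's own ParseTree type).
struct Node
{
    string name;       // rule that produced the node
    string[] matches;  // matched text fragments
    Node[] children;   // sub-nodes
}

/// Walk the tree depth-first and emit LaTeX for the node kinds we recognise.
/// Unknown nodes are transparent: we just concatenate what their children emit.
string toLatex(Node node)
{
    switch (node.name)
    {
        case "Doc.Header":
            return "\\section{" ~ node.matches.join ~ "}\n";
        case "Doc.Text":
            return node.matches.join ~ "\n";
        default:
            string result;
            foreach (child; node.children)
                result ~= toLatex(child);
            return result;
    }
}

void main()
{
    auto doc = Node("Doc", null, [
        Node("Doc.Header", ["Intro"], null),
        Node("Doc.Text", ["Hello ", "world"], null),
    ]);
    assert(toLatex(doc) == "\\section{Intro}\nHello world\n");
}
```

A raw-text walker would have the same structure with different strings in the two cases.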


PhilippeSigaud avatar PhilippeSigaud commented on July 17, 2024

On Mon, Oct 1, 2012 at 6:13 AM, Val Markovic wrote:

One more thing I just noticed: if I change the \t in the Spacechar rule
to, say, x so that Spacechar <- " " / "x", I get the same wrong output.
It doesn't matter what character I use instead of \t; it seems the
problem is in /, that is the ordered or.

Maybe there is a repetition somewhere, like (Spacechar*)+, which can
loop indefinitely?
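
To spell out why a pattern like (Spacechar*)+ is dangerous in a naive PEG engine: the inner Spacechar* always succeeds, even when it consumes nothing, so an outer + that only stops on failure never stops. The sketch below is a generic hand-written illustration, not Pegged code. The function names are invented, and the progress check is an assumption about how an engine could guard against this.

```d
/// Like `Spacechar*` restricted to spaces: always succeeds, possibly consuming
/// nothing. Returns the position after the run of spaces.
size_t zeroOrMoreSpaces(string input, size_t pos)
{
    while (pos < input.length && input[pos] == ' ')
        ++pos;
    return pos;
}

/// Like `(Spacechar*)+`: repeat the inner match while it "succeeds".
/// The inner match never fails, so only the progress check ends the loop.
/// `iterations` reports how many times the inner match was run.
size_t oneOrMoreOfStar(string input, size_t pos, out size_t iterations)
{
    iterations = 0;
    while (true)
    {
        immutable next = zeroOrMoreSpaces(input, pos);
        ++iterations;
        if (next == pos) // nothing consumed: without this check we loop forever
            break;
        pos = next;
    }
    return pos;
}

void main()
{
    size_t n;
    assert(oneOrMoreOfStar("   foo", 0, n) == 3);
    assert(n == 2); // one run consuming the spaces, one empty run that stops the loop
}
```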


Valloric avatar Valloric commented on July 17, 2024

Maybe there is a repetition somewhere, like (Spacechar*)+, which can
loop indefinitely?

I don't see the repetition in the test case I posted. With this test case
alone I'm experiencing the problem. Are you sure it's not a bug somewhere
in Pegged? Again, this test case is self-contained; the bug is either here
or in Pegged, and I don't see it here.

WRT markdown to LaTeX... I'm writing a markdown converter in D using Pegged
and the example grammar as a basis. I've already made many changes to it;
the final library will provide a ConvertMarkdown(input, output_type)
function that takes in a string of markdown text and the output format
desired (HTML, LaTeX etc) and returns the processed string. v1.0 will
include only HTML output, but a LaTeX output type will be easy to add once I get
everything in place for the HTML. Also, I plan to add many, many test cases
for this library; there are already several different sets of markdown test
cases https://github.com/trentm/python-markdown2/wiki/Testing-Notes that
the various markdown converters out there are using and I intend to use as
many of the tests as I can.

And yes, the example grammar builds not only a tree that's very peculiar,
but also incorrect for even very basic cases. But again, I'm fixing the
problems as I find them.

I'll gladly upstream the grammar when I'm done with it.


PhilippeSigaud avatar PhilippeSigaud commented on July 17, 2024

On Mon, Oct 1, 2012 at 7:04 PM, Val Markovic wrote:

Maybe there is a repetition somewhere, like (Spacechar*)+, which can
loop indefinitely?

I don't see the repetition in the test case I posted. With this test case
alone I'm experiencing the problem. Are you sure it's not a bug somewhere
in Pegged? Again, this test case is self-contained; the bug is either here
or in Pegged, and I don't see it here.

Callumenator found a bug in the keyword function (which I just
corrected). Maybe that was it? Could you send me the hanging grammar, please?
(philippe.sigaud and the google mail).

WRT markdown to LaTeX... I'm writing a markdown converter in D using Pegged
and the example grammar as a basis.

That's mightily cool.

I've already made many changes to it;
the final library will provide a ConvertMarkdown(input, output_type)
function that takes in a string of markdown text and the output format
desired (HTML, LaTeX etc) and returns the processed string.

Would output_type be an enum, or a type?
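
To make the two options concrete, here is a small sketch of each. Everything in it is hypothetical (the OutputType enum, the writer structs and both function names); it only renders a single header and makes no claim about how the actual converter is or will be designed.

```d
/// Option 1: an enum value chosen at run time selects the output format.
enum OutputType { html, latex }

string renderHeader(string title, OutputType type)
{
    final switch (type) // final switch: the compiler checks every member is handled
    {
        case OutputType.html:
            return "<h1>" ~ title ~ "</h1>";
        case OutputType.latex:
            return "\\section{" ~ title ~ "}";
    }
}

/// Option 2: a type chosen at compile time selects the output format.
struct HtmlWriter
{
    static string header(string title) { return "<h1>" ~ title ~ "</h1>"; }
}

struct LatexWriter
{
    static string header(string title) { return "\\section{" ~ title ~ "}"; }
}

string renderHeaderAs(Writer)(string title)
{
    return Writer.header(title);
}

void main()
{
    assert(renderHeader("Intro", OutputType.html) == "<h1>Intro</h1>");
    assert(renderHeaderAs!LatexWriter("Intro") == "\\section{Intro}");
}
```

The enum keeps a single function and lets callers pick the format from configuration at run time. The type-based version lets users add a new output format without touching the library, at the price of a template parameter.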

v1.0 will
include only HTML output, but a LaTeX style will be easy to add once I get
everything in place for the HTML.

Yes. The markup done by markdown is quite simple (headers, links, lists...).

Also, I plan to add many, many test cases
for this library; there are already several different sets of markdown test
cases https://github.com/trentm/python-markdown2/wiki/Testing-Notes that
the various markdown converters out there are using and I intend to use as
many of the tests as I can.

That's a pretty good idea.

And yes, the example grammar builds not only a tree that's very peculiar,
but also incorrect for even very basic cases. But again, I'm fixing the
problems as I find them

What I find strange is that it's supposed to be used by peg-markdown, which
in turn is used by MultiMarkDown. So I don't get how they do that...

I'll gladly upstream the grammar when I'm done with it.

Thanks a lot! Don't forget to credit yourself in the attribution for the
grammar and the whole converter.

That might be the basis of a somewhat more general converter (adding a basic
HTML grammar and a very basic LaTeX grammar to translate docs). Then add
Ddoc and we are good.


Valloric avatar Valloric commented on July 17, 2024

Callumenator found a bug in the keyword function (which I just
corrected). Maybe that was it? Could you send me the hanging grammar, please?
(philippe.sigaud and the google mail).

Um... read my bug report again. All the information is there. :)

WRT bug in keyword function... I suggest you compile the test case with the
latest Pegged source and try it out.

What I find strange is that it's supposed to be used by peg-markdown, which
in turn is used by MultiMarkDown. So I don't get how they do that...

I find that strange too, but there you go.

Thanks a lot! Don't forget to credit yourself in the attribution for the
grammar and the whole converter.

I'm making the converter a separate library which I'm going to host here on
GitHub.


PhilippeSigaud avatar PhilippeSigaud commented on July 17, 2024

Then this is indeed the bug in keywords (triggered when a rule offers only string literals as alternatives, as in your Spacechar rule).

This was corrected a few minutes ago by another commit and works now. I used your original example.

Output:

Test  [0, 12]["foo", " ", "bar", " ", "baz "]
 +-Test.Inlines  [0, 12]["foo", " ", "bar", " ", "baz "]
    +-Test.Inline  [0, 3]["foo"]
    |  +-Test.String  [0, 3]["foo"]
    +-Test.Inline  [3, 4][" "]
    |  +-Test.Spaces  [3, 4][" "]
    +-Test.Inline  [4, 7]["bar"]
    |  +-Test.String  [4, 7]["bar"]
    +-Test.Inline  [7, 8][" "]
    |  +-Test.Spaces  [7, 8][" "]
    +-Test.Inline  [8, 12]["baz "]
       +-Test.String  [8, 12]["baz "]

["foo", " ", "bar", " ", "baz "]


Valloric avatar Valloric commented on July 17, 2024

Awesome, thanks for fixing it!

Is the hang-on-leading-space problem also fixed? Again, same test case but change the input to " foo bar baz " (not at my workstation so can't check myself, sorry).


PhilippeSigaud avatar PhilippeSigaud commented on July 17, 2024

import std.stdio : writeln;

// Test is the parser generated from the grammar in the bug report.
void main()
{
  auto tree = Test(" foo bar baz ");
  writeln( tree );
  writeln( tree.matches );
}

Gives

Test  [0, 13][" ", "foo", " ", "bar", " ", "baz "]
 +-Test.Inlines  [0, 13][" ", "foo", " ", "bar", " ", "baz "]
    +-Test.Inline  [0, 1][" "]
    |  +-Test.Spaces  [0, 1][" "]
    +-Test.Inline  [1, 4]["foo"]
    |  +-Test.String  [1, 4]["foo"]
    +-Test.Inline  [4, 5][" "]
    |  +-Test.Spaces  [4, 5][" "]
    +-Test.Inline  [5, 8]["bar"]
    |  +-Test.String  [5, 8]["bar"]
    +-Test.Inline  [8, 9][" "]
    |  +-Test.Spaces  [8, 9][" "]
    +-Test.Inline  [9, 13]["baz "]
       +-Test.String  [9, 13]["baz "]

[" ", "foo", " ", "bar", " ", "baz "]

So it does not hang and handles the leading space correctly, but there is a bug with the trailing one (look at the baz node: it spans 4 chars, including the space). Using more than one trailing space gives strange results.

OK, back to pegged.peg.keywords, it's still buggy.


PhilippeSigaud avatar PhilippeSigaud commented on July 17, 2024

OK, I found it and corrected it. Dammit, quite a few bugs for such a short template.

import std.stdio : writeln;

// Test is the parser generated from the grammar in the bug report.
void main()
{
    auto tree = Test(" foo bar baz   ");
    writeln( tree );
    writeln( tree.matches );
}

Now correctly gives:

Test  [0, 15][" ", "foo", " ", "bar", " ", "baz", "   "]
 +-Test.Inlines  [0, 15][" ", "foo", " ", "bar", " ", "baz", "   "]
    +-Test.Inline  [0, 1][" "]
    |  +-Test.Spaces  [0, 1][" "]
    +-Test.Inline  [1, 4]["foo"]
    |  +-Test.String  [1, 4]["foo"]
    +-Test.Inline  [4, 5][" "]
    |  +-Test.Spaces  [4, 5][" "]
    +-Test.Inline  [5, 8]["bar"]
    |  +-Test.String  [5, 8]["bar"]
    +-Test.Inline  [8, 9][" "]
    |  +-Test.Spaces  [8, 9][" "]
    +-Test.Inline  [9, 12]["baz"]
    |  +-Test.String  [9, 12]["baz"]
    +-Test.Inline  [12, 15]["   "]
       +-Test.Spaces  [12, 15]["   "]

[" ", "foo", " ", "bar", " ", "baz", "   "]

And the trailing spaces are parsed OK.


Valloric avatar Valloric commented on July 17, 2024

Great, thanks again!

On a related note, I've found it useful to add test cases to the test suite
of whatever project I was working on when I found and fixed a bug. The new
test would then test for the absence of the bug I just fixed, to make sure
that the same problem does not occur in the future.

Personally, I've found this workflow to be incredibly useful.
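
For this particular bug, such a regression test could look roughly like the sketch below, written with D's built-in unittest blocks. To keep it self-contained, the function inlines is a hand-written stand-in for the parser (an assumption made for this example); a real test would call the Pegged-generated Test parser and compare tree.matches instead. The first expected value is the output reported earlier in this thread; the second follows from the same splitting rule.

```d
/// Stand-in for the parser under test (hypothetical helper for this sketch).
/// Splits the input into alternating runs of spaces/tabs and other text.
string[] inlines(string input)
{
    string[] result;
    size_t start = 0;
    while (start < input.length)
    {
        immutable isSpace = input[start] == ' ' || input[start] == '\t';
        size_t end = start;
        // Extend the run while characters are of the same kind as the first one.
        while (end < input.length
               && (input[end] == ' ' || input[end] == '\t') == isSpace)
            ++end;
        result ~= input[start .. end];
        start = end;
    }
    return result;
}

// Regression: trailing spaces used to be fused into the last String node.
unittest
{
    assert(inlines(" foo bar baz   ")
           == [" ", "foo", " ", "bar", " ", "baz", "   "]);
}

// Regression: a leading space must neither hang the parser nor be dropped.
unittest
{
    assert(inlines(" foo bar baz ")
           == [" ", "foo", " ", "bar", " ", "baz", " "]);
}

void main() {}
```

The unittest blocks only run when the program is compiled with unit tests enabled, for example with dmd -unittest -run followed by the file name.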


Valloric avatar Valloric commented on July 17, 2024

Yup, just verified, everything works now. Thanks again!

