Comments (12)
One more thing I just noticed: if I change the \t in the Spacechar rule
to, say, x so that Spacechar <- " " / "x", I get the same wrong output.
It doesn't matter what character I use instead of \t; it seems the
problem is in /, that is, the ordered or.
from pegged.
On Mon, Oct 1, 2012 at 5:46 AM, Val Markovic wrote:
> [1] Which is just terrible BTW. I know, it's not your fault, the original
> peg-markdown grammar has the bugs, I checked. I'm improving it so that it's
> correct and uses the very nice Pegged extensions and I'll pull-request the
> new grammar once I'm done.
Thanks a lot, and that's a pull request I find most exciting! I hope
parameterized rules will help in dealing with the dozens of HTML rules. I
recently found a bug in them (param rules), so I'll try to correct it
quickly.
I corrected some bugs in the C and D grammars (some are left in D, though),
and Markdown was next on my list. Apart from the bugs, I find the parse tree
delivered by this grammar to be strangely constructed, due to the way the
grammar was written.
The next step will be to use it to parse the docs themselves. Then, writing
a tree-walking function that delivers LaTeX or raw text (or inserts
examples) from a .md file will be easy to code.
from pegged.
On Mon, Oct 1, 2012 at 6:13 AM, Val Markovic wrote:
> One more thing I just noticed: if I change the \t in the Spacechar rule
> to, say, x so that Spacechar <- " " / "x", I get the same wrong output.
> It doesn't matter what character I use instead of \t; it seems the
> problem is in /, that is the ordered or.
Maybe there is a repetition somewhere, like (Spacechar*)+, which can
loop indefinitely?
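To make the suspected failure mode concrete: a repetition whose body can succeed without consuming any input never produces the failure that would end the outer loop. The sketch below is plain D with hand-written matchers, not Pegged's implementation; `spacechar`, `spacecharStar`, `naivePlus` and `guardedPlus` are hypothetical names, and the iteration cap exists only so that the demonstration terminates.

```d
import std.stdio;

/// Spacechar <- " " / "\t": advances pos on success.
bool spacechar(string s, ref size_t pos)
{
    if (pos < s.length && (s[pos] == ' ' || s[pos] == '\t'))
    {
        ++pos;
        return true;
    }
    return false;
}

/// Spacechar*: always succeeds, possibly without consuming anything.
bool spacecharStar(string s, ref size_t pos)
{
    while (spacechar(s, pos)) {}
    return true;
}

/// (Spacechar*)+ written naively: repeat the inner rule until it fails.
/// The inner rule never fails, so only maxIterations stops the loop.
/// Returns the number of iterations performed.
size_t naivePlus(string s, size_t pos, size_t maxIterations)
{
    size_t iterations = 0;
    while (iterations < maxIterations && spacecharStar(s, pos))
        ++iterations;
    return iterations;
}

/// Same loop with a progress guard: stop once an iteration consumes nothing.
size_t guardedPlus(string s, size_t pos)
{
    size_t iterations = 0;
    while (true)
    {
        immutable before = pos;
        spacecharStar(s, pos);
        ++iterations;
        if (pos == before)
            break; // no input consumed, so repeating cannot help
    }
    return iterations;
}

void main()
{
    writeln(naivePlus("  foo", 0, 1000)); // 1000: stopped only by the cap
    writeln(guardedPlus("  foo", 0));     // 2: one real match, one empty match
}
```

Without the cap, `naivePlus` would spin forever on any input, which is the kind of hang a grammar containing `(Spacechar*)+` could cause.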
from pegged.
> Maybe there is a repetition somewhere, like (Spacechar*)+, which can
> loop indefinitely?
I don't see the repetition in the test case I posted. With this test case
alone I'm experiencing the problem. Are you sure it's not a bug somewhere
in Pegged? Again, this test case is self-contained; the bug is either here
or in Pegged, and I don't see it here.
WRT markdown to LaTeX... I'm writing a markdown converter in D using Pegged
and the example grammar as a basis. I've already made many changes to it;
the final library will provide a ConvertMarkdown(input, output_type)
function that takes in a string of markdown text and the output format
desired (HTML, LaTeX etc) and returns the processed string. v1.0 will
include only HTML output, but a LaTeX output type will be easy to add once
I get everything in place for the HTML. Also, I plan to add many, many test
cases for this library; there are already several different sets of markdown
test cases (https://github.com/trentm/python-markdown2/wiki/Testing-Notes)
that the various markdown converters out there are using, and I intend to
use as many of the tests as I can.
And yes, the example grammar builds not only a tree that's very peculiar,
but also incorrect for even very basic cases. But again, I'm fixing the
problems as I find them.
I'll gladly upstream the grammar when I'm done with it.
from pegged.
On Mon, Oct 1, 2012 at 7:04 PM, Val Markovic wrote:
>> Maybe there is a repetition somewhere, like (Spacechar*)+, which can
>> loop indefinitely?
> I don't see the repetition in the test case I posted. With this test case
> alone I'm experiencing the problem. Are you sure it's not a bug somewhere
> in Pegged? Again, this test case is self-contained; the bug is either here
> or in Pegged, and I don't see it here.
Callumenator found a bug in the keywords function (which I just
corrected). Maybe that was it? Could you send me the hanging grammar, please?
(philippe.sigaud and the google mail).
> WRT markdown to LaTeX... I'm writing a markdown converter in D using Pegged
> and the example grammar as a basis.
That's mightily cool.
> I've already made many changes to it;
> the final library will provide a ConvertMarkdown(input, output_type)
> function that takes in a string of markdown text and the output format
> desired (HTML, LaTeX etc) and returns the processed string.
Would output_type be an enum, or a type?
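For readers weighing the same question, here is a minimal sketch of the two options in D. Everything in it is hypothetical: the thread only names `ConvertMarkdown(input, output_type)`, so `OutputType`, `Html`, `Latex`, `convertWithEnum` and `convertWithType` are illustrative names, and the "conversion" is a toy that wraps a single paragraph.

```d
import std.stdio;

/// Option 1: the output format is a runtime value.
enum OutputType { html, latex }

string convertWithEnum(string paragraph, OutputType outputType)
{
    // final switch makes the compiler reject a missing enum member.
    final switch (outputType)
    {
        case OutputType.html:  return "<p>" ~ paragraph ~ "</p>";
        case OutputType.latex: return paragraph ~ "\n\n";
    }
}

/// Option 2: the output format is a type chosen at compile time.
struct Html
{
    static string paragraph(string text) { return "<p>" ~ text ~ "</p>"; }
}

struct Latex
{
    static string paragraph(string text) { return text ~ "\n\n"; }
}

string convertWithType(Backend)(string text)
{
    return Backend.paragraph(text);
}

void main()
{
    writeln(convertWithEnum("hello", OutputType.html)); // <p>hello</p>
    writeln(convertWithType!Html("hello"));             // <p>hello</p>
}
```

The enum lets callers pick the format at run time (for example from a command-line flag). The type parameter lets third parties add a backend without touching the library, at the cost of fixing the choice at compile time.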
> v1.0 will
> include only HTML output, but a LaTeX style will be easy to add once I get
> everything in place for the HTML.
Yes. The markup done by markdown is quite simple (headers, links, lists...).
> Also, I plan to add many, many test cases
> for this library; there's already several different sets of markdown test
> cases https://github.com/trentm/python-markdown2/wiki/Testing-Notes that
> the various markdown converters out there are using and I intend to use as
> many of the tests I can.
That's a pretty good idea.
> And yes, the example grammar builds not only a tree that's very peculiar,
> but also incorrect for even very basic cases. But again, I'm fixing the
> problems as I find them
What I find strange is that it's supposed to be used by peg-markdown, which
in turn is used by MultiMarkdown. So I don't get how they do that...
> I'll gladly upstream the grammar when I'm done with it.
Thanks a lot! Don't forget to credit yourself in the grammar's attribution
and in the whole converter.
That might be the basis of a somewhat more general converter (adding a basic
HTML grammar and a very basic LaTeX grammar to the MD grammar, to translate
docs). Then add Ddoc and we are good.
from pegged.
> Callumenator found a bug in the keywords function (which I just
> corrected). Maybe that was it? Could you send me the hanging grammar, please?
> (philippe.sigaud and the google mail).
Um... read my bug report again. All the information is there. :)
WRT the bug in the keywords function... I suggest you compile the test case
with the latest Pegged source and try it out.
> What I find strange is that it's supposed to be used by peg-markdown, which
> in turn is used by MultiMarkdown. So I don't get how they do that...
I find that strange too, but there you go.
> Thanks a lot! Don't forget to credit yourself in the grammar's attribution
> and in the whole converter.
I'm making the converter a separate library which I'm going to host here on
GitHub.
from pegged.
Then this is indeed the bug in keywords (triggered by a choice whose
alternatives are all string literals, as in your Spacechar rule).
This was corrected a few minutes ago by another commit and works now. I used your original example.
Output:
    Test [0, 12]["foo", " ", "bar", " ", "baz "]
     +-Test.Inlines [0, 12]["foo", " ", "bar", " ", "baz "]
        +-Test.Inline [0, 3]["foo"]
        |  +-Test.String [0, 3]["foo"]
        +-Test.Inline [3, 4][" "]
        |  +-Test.Spaces [3, 4][" "]
        +-Test.Inline [4, 7]["bar"]
        |  +-Test.String [4, 7]["bar"]
        +-Test.Inline [7, 8][" "]
        |  +-Test.Spaces [7, 8][" "]
        +-Test.Inline [8, 12]["baz "]
           +-Test.String [8, 12]["baz "]
    ["foo", " ", "bar", " ", "baz "]
from pegged.
Awesome, thanks for fixing it!
Is the hang-on-leading-space problem also fixed? Again, same test case but change the input to " foo bar baz " (not at my workstation so can't check myself, sorry).
from pegged.
    import std.stdio; // writeln

    // Test is the grammar from the original bug report.
    void main()
    {
        auto tree = Test(" foo bar baz ");
        writeln( tree );         // the full parse tree
        writeln( tree.matches ); // the matched strings only
    }
Gives
    Test [0, 13][" ", "foo", " ", "bar", " ", "baz "]
     +-Test.Inlines [0, 13][" ", "foo", " ", "bar", " ", "baz "]
        +-Test.Inline [0, 1][" "]
        |  +-Test.Spaces [0, 1][" "]
        +-Test.Inline [1, 4]["foo"]
        |  +-Test.String [1, 4]["foo"]
        +-Test.Inline [4, 5][" "]
        |  +-Test.Spaces [4, 5][" "]
        +-Test.Inline [5, 8]["bar"]
        |  +-Test.String [5, 8]["bar"]
        +-Test.Inline [8, 9][" "]
        |  +-Test.Spaces [8, 9][" "]
        +-Test.Inline [9, 13]["baz "]
           +-Test.String [9, 13]["baz "]
    [" ", "foo", " ", "bar", " ", "baz "]
So it does not hang and the leading space is handled correctly, but there is a bug on the last one (look at the baz node: it spans 4 chars, including the trailing space). And using more than one trailing space gives strange results.
OK, back to pegged.peg.keywords, it's still buggy.
from pegged.
OK, I found it and corrected it. Dammit, quite a few bugs for such a short template.
    import std.stdio; // writeln

    // Same grammar as before; the input now ends with three spaces.
    void main()
    {
        auto tree = Test(" foo bar baz   ");
        writeln( tree );
        writeln( tree.matches );
    }
Now correctly gives:
    Test [0, 15][" ", "foo", " ", "bar", " ", "baz", "   "]
     +-Test.Inlines [0, 15][" ", "foo", " ", "bar", " ", "baz", "   "]
        +-Test.Inline [0, 1][" "]
        |  +-Test.Spaces [0, 1][" "]
        +-Test.Inline [1, 4]["foo"]
        |  +-Test.String [1, 4]["foo"]
        +-Test.Inline [4, 5][" "]
        |  +-Test.Spaces [4, 5][" "]
        +-Test.Inline [5, 8]["bar"]
        |  +-Test.String [5, 8]["bar"]
        +-Test.Inline [8, 9][" "]
        |  +-Test.Spaces [8, 9][" "]
        +-Test.Inline [9, 12]["baz"]
        |  +-Test.String [9, 12]["baz"]
        +-Test.Inline [12, 15]["   "]
           +-Test.Spaces [12, 15]["   "]
    [" ", "foo", " ", "bar", " ", "baz", "   "]
And the trailing spaces are parsed OK.
from pegged.
Great, thanks again!
On a related note, I've found it useful to add test cases to the test suite
of whatever project I was working on when I found and fixed a bug. The new
test would then test for the absence of the bug I just fixed, to make sure
that the same problem does not occur in the future.
Personally, I've found this workflow to be incredibly useful.
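Applied to this bug, such a regression test might look like the sketch below. It assumes Pegged is available as a dependency and is built with unit tests enabled. The grammar is a guessed reconstruction, since the original bug report is not reproduced in this thread; only the rule names and the expected matches come from the output above.

```d
import pegged.grammar;

// Guessed reconstruction of the test grammar. The rule names and the
// Spacechar rule appear in the thread; the other rule bodies are assumptions.
mixin(grammar(`
Test:
    Inlines   <- Inline+
    Inline    <- String / Spaces
    String    <~ (!Spacechar .)+
    Spaces    <~ Spacechar+
    Spacechar <- " " / "\t"
`));

unittest
{
    // A trailing space used to be swallowed into the last String node.
    assert(Test("foo bar baz ").matches
           == ["foo", " ", "bar", " ", "baz", " "]);

    // A leading space used to make the parser hang, and several trailing
    // spaces gave strange results.
    auto tree = Test(" foo bar baz   ");
    assert(tree.successful);
    assert(tree.matches == [" ", "foo", " ", "bar", " ", "baz", "   "]);
}
```

Keeping the exact inputs from the report in the test means a later change to the literal-matching code would fail immediately if it reintroduced either symptom.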
from pegged.
Yup, just verified, everything works now. Thanks again!
from pegged.