Comments (6)

ChristianMurphy commented on May 28, 2024

@Zambonifofex there's a distinction here between implementation and spec.
This spec focuses on completeness and correctness and, as @wooorm mentioned, the CommonMark spec that it builds on has several features that are at odds with incremental parsing, including reference links, as you mention.

At the implementation level, parsers may subtly diverge from the spec to support faster parsing and avoid excessive backtracking or lookahead.
For example, this incremental CommonMark parser, https://github.com/ikatyang/tree-sitter-markdown, notes:

Note: This grammar is based on the assumption that link label matchings will never fail since reference links can come before their reference definitions, which causes it hard to do incremental parsing without this assumption.

from common-markup-state-machine.

ChristianMurphy commented on May 28, 2024

There was some discussion of incremental parsing at micromark/micromark#8 (comment)
@wooorm is working on a new version of the Micromark parser and may be able to comment better on whether incremental parsing is still in scope.

This repo is mostly spec, not implementation, it sounds like you are interested in an implementation of an incremental markdown parser?
If so, it may make sense to continue this at https://github.com/micromark/micromark

zamfofex commented on May 28, 2024

This repo is mostly spec, not implementation, it sounds like you are interested in an implementation of an incremental markdown parser?

I imagined that a change in the spec might have been necessary to accommodate incremental parser implementations, or at least to facilitate them somehow.

If you think this fits better on the micromark repo, you might be able to transfer this issue there depending on your permissions on both repositories.

wooorm commented on May 28, 2024

CommonMark (CM) compliance is at odds with incremental parsing, as something on the first line of the document could affect the last line, or something at the start of a paragraph could affect the end. The thread Christian linked to explains this in more detail.

I imagined that a change in the spec might have been necessary to accommodate incremental parser implementations, or at least to facilitate them somehow.

That would make it no longer compatible with commonmark though 🤔

zamfofex commented on May 28, 2024

[…] something on the first line of the document could affect the last line.

That’s true, but the cases you had mentioned are kinda edge cases in my opinion. Very rarely will people have documents that look like those, and in those circumstances, it’d make sense and be fine to have to reparse as much as necessary.

[…] something at the start of a paragraph could affect the end.

In my opinion, incremental parsing only really needs to be done at the block level. For the inline level, it feels a bit unnecessary/overkill.
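
As a rough illustration of what block-level reuse could look like, here is a minimal Python sketch. It assumes blocks can be found by splitting on blank lines, which is not how CommonMark defines blocks (fenced code, lists, and block quotes can contain blank lines), and it ignores cross-block dependencies such as reference links. `split_blocks`, `BlockCache`, and the `parse_block` callback are hypothetical names, not part of any existing parser.

```python
from typing import Callable


def split_blocks(text: str) -> list[str]:
    """Naively split text into blocks at blank lines (not CommonMark-accurate)."""
    blocks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the block collected so far.
            blocks.append("\n".join(current))
            current = []
    if current:
        blocks.append("\n".join(current))
    return blocks


class BlockCache:
    """Reuses parse results for blocks whose text did not change (hypothetical helper)."""

    def __init__(self, parse_block: Callable[[str], object]) -> None:
        self.parse_block = parse_block
        self.cache: dict[str, object] = {}

    def parse(self, text: str) -> list[object]:
        fresh: dict[str, object] = {}
        results: list[object] = []
        for block in split_blocks(text):
            if block in fresh:
                parsed = fresh[block]
            elif block in self.cache:
                parsed = self.cache[block]  # unchanged block: reuse old result
            else:
                parsed = self.parse_block(block)  # new or edited block
            fresh[block] = parsed
            results.append(parsed)
        self.cache = fresh  # drop entries for blocks that no longer exist
        return results
```

With this sketch, editing one paragraph and calling `parse` again invokes `parse_block` only for the edited paragraph; everything else comes from the cache.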


The actually problematic thing (which I didn’t see mentioned in that thread) is related to reference links. Ideally, if someone adds a new reference to the document, a paragraph would only need to be reparsed if it contains potential links whose target would have been that reference (if it had been present earlier, that is).

The problem is that it’s not clear to me whether it’s easy to parse potential link references beforehand that way.
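
One way this idea might be approached is to index, per block, every bracketed span that could be a link label, so that adding a definition only invalidates the blocks that mention its label. The Python sketch below is an assumption-laden illustration, not how micromark or any other parser works: the regular expression only approximates link labels (no escaped or nested brackets), `normalize` approximates CommonMark label matching by case-folding and collapsing whitespace, and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Rough approximation of a link label: bracketed text with no nested brackets.
# Assumption: backslash-escaped brackets and code spans are not handled.
LABEL = re.compile(r"\[([^\[\]]+)\]")


def normalize(label: str) -> str:
    """Approximate label matching: case-fold, trim, collapse whitespace."""
    return " ".join(label.casefold().split())


def potential_labels(block: str) -> set[str]:
    """Collect every normalized bracketed span that might be a link label."""
    return {normalize(m.group(1)) for m in LABEL.finditer(block)}


def build_index(blocks: list[str]) -> dict[str, set[int]]:
    """Map each potential label to the indices of the blocks mentioning it."""
    index: dict[str, set[int]] = defaultdict(set)
    for i, block in enumerate(blocks):
        for label in potential_labels(block):
            index[label].add(i)
    return index


def blocks_to_reparse(index: dict[str, set[int]], new_label: str) -> set[int]:
    """Blocks that could be affected when a definition for new_label is added."""
    return set(index.get(normalize(new_label), set()))
```

For example, with the blocks `"See [Foo] and [bar][baz]."`, `"Nothing here."`, and `"Another [foo  ] mention."`, adding a definition for `FOO` would flag only the first and third blocks. The index over-approximates: a flagged block may turn out to contain no real link once it is reparsed, but an unflagged block cannot gain one from the new definition under these simplified rules.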

wooorm commented on May 28, 2024

Going to close this because the conversation seems resolved.

I don’t think I’ll document incremental parsing here, as it deviates from CommonMark, and I’m not interested in doing that.

If people want to skip this case and implement it differently in their parsers, I’m fine with that, and I find it quite understandable!
