Comments (6)
@Zambonifofex there's a distinction here between implementation and spec.
This spec focuses on completeness and correctness and, as @wooorm mentioned, the CommonMark spec that it builds on has several features that are at odds with incremental parsing, including reference links, as you mention.
At the implementation level, parsers may subtly diverge from the spec to parse faster by avoiding excessive backtracking or lookahead.
For example, this incremental CommonMark parser, https://github.com/ikatyang/tree-sitter-markdown, notes:
Note: This grammar is based on the assumption that link label matchings will never fail since reference links can come before their reference definitions, which causes it hard to do incremental parsing without this assumption.
from common-markup-state-machine.
There was some discussion of incremental parsing at micromark/micromark#8 (comment)
@wooorm is working on a new version of the micromark parser, so he may be better placed to comment on whether incremental parsing is still in scope.
This repo is mostly spec, not implementation, it sounds like you are interested in an implementation of an incremental markdown parser?
If so, it may make sense to continue this at https://github.com/micromark/micromark
This repo is mostly spec, not implementation, it sounds like you are interested in an implementation of an incremental markdown parser?
I imagined that a change in the spec might have been necessary to accommodate incremental parser implementations, or at least to facilitate them somehow.
If you think this fits better on the micromark repo, you might be able to transfer this issue there depending on your permissions on both repositories.
CommonMark compliance is at odds with incremental parsing: something on the first line of the document could affect the last line, or something at the start of a paragraph could affect the end. The thread Christian linked to explains this in more detail.
I imagined that a change in the spec might have been necessary to accommodate incremental parser implementations, or at least to facilitate them somehow.
That would make it no longer compatible with commonmark though 🤔
[…] something on the first line of the document could affect the last line.
That’s true, but the cases you had mentioned are kinda edge cases in my opinion. Very rarely will people have documents that look like those, and in those circumstances, it’d make sense and be fine to have to reparse as much as necessary.
[…] something at the start of a paragraph could affect the end.
In my opinion, incremental parsing only really needs to be done at the block level. For the inline level, it feels a bit unnecessary/overkill.
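To make the block-level idea concrete, here is a minimal sketch, not anything taken from this spec or from micromark. It assumes the document can be cut into chunks at blank lines, which is a deliberate simplification: in CommonMark, fenced code, lists, and block quotes can span blank lines. `split_chunks`, `ChunkCache`, and `parse_chunk` are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, List


def split_chunks(text: str) -> List[str]:
    """Split a document into chunks separated by blank lines.

    Assumption: a blank line always ends a block. Real CommonMark blocks
    (fenced code, lists, block quotes) do not obey this rule.
    """
    chunks: List[str] = []
    current: List[str] = []
    for line in text.split("\n"):
        if line.strip() == "":
            if current:
                chunks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


class ChunkCache:
    """Reparse only the chunks whose text changed since the last call."""

    def __init__(self, parse_chunk: Callable[[str], Any]) -> None:
        self.parse_chunk = parse_chunk  # hypothetical per-chunk parser
        self.cache: Dict[str, Any] = {}
        self.parse_calls = 0  # counts real parses, for demonstration

    def parse(self, text: str) -> List[Any]:
        fresh: Dict[str, Any] = {}
        results: List[Any] = []
        for chunk in split_chunks(text):
            if chunk in self.cache:
                result = self.cache[chunk]  # unchanged since last parse
            elif chunk in fresh:
                result = fresh[chunk]  # repeated within this document
            else:
                result = self.parse_chunk(chunk)
                self.parse_calls += 1
            fresh[chunk] = result
            results.append(result)
        self.cache = fresh  # forget chunks that no longer exist
        return results


# Usage: str.upper stands in for a real inline parser.
doc_cache = ChunkCache(str.upper)
first = doc_cache.parse("a\n\nb\n\nc")
second = doc_cache.parse("a\n\nB2\n\nc")
```

In this sketch, editing the middle chunk triggers one extra call to `parse_chunk` rather than three. It says nothing about reference definitions, which can still invalidate chunks whose text did not change.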
The actually problematic thing (which I didn’t see mentioned in that thread) is related to reference links. Ideally, if someone adds a new reference to the document, a paragraph would only need to be reparsed if it contains potential links whose target would have been that reference (if it had been present earlier, that is).
The problem is that it’s not clear to me whether it’s easy to parse potential link references beforehand that way.
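One way to explore this is to index, per paragraph, every bracketed run that could become a link label. When a definition is added, only paragraphs listed under that label need reparsing. The following is a rough sketch under loud assumptions: the regular expression ignores backslash escapes, code spans, and nested brackets, and the label normalization only approximates CommonMark's (collapse whitespace, case fold). All function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Assumption: a candidate label is any bracketed run without nested brackets.
# Escapes, code spans, and raw HTML are not handled here.
LABEL = re.compile(r"\[([^\[\]]+)\]")


def normalize_label(label: str) -> str:
    """Approximate CommonMark label matching: trim, collapse whitespace, case fold."""
    return " ".join(label.split()).casefold()


def candidate_labels(paragraph: str) -> Set[str]:
    """Collect normalized labels that might resolve to a reference definition."""
    return {normalize_label(m) for m in LABEL.findall(paragraph) if m.strip()}


def build_index(paragraphs: List[str]) -> Dict[str, Set[int]]:
    """Map each candidate label to the indices of paragraphs that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for i, paragraph in enumerate(paragraphs):
        for label in candidate_labels(paragraph):
            index[label].add(i)
    return index


def paragraphs_to_reparse(index: Dict[str, Set[int]], new_label: str) -> List[int]:
    """Paragraphs affected when a definition for `new_label` is added."""
    return sorted(index.get(normalize_label(new_label), set()))


paragraphs = [
    "See [Foo] and [bar][baz].",
    "Nothing to link here.",
    "Another [FOO  ] mention.",
]
index = build_index(paragraphs)
affected = paragraphs_to_reparse(index, "foo")
```

Here `affected` holds the first and third paragraphs, and the second is left alone. Whether such a pre-scan can be made exact, given escapes and nesting, is the open question raised above.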
Going to close this because the conversation seems resolved.
I don’t think I’ll document incremental parsing here, as it deviates from CommonMark, and I’m not interested in doing that.
If people want to skip this case and implement it differently in their parsers, I’m fine with that and I find it quite understandable!