
:trophy: A markdown parser written in Go. Easy to extend, standard(CommonMark) compliant, well structured.

License: MIT License


goldmark's Introduction

goldmark

https://pkg.go.dev/github.com/yuin/goldmark https://github.com/yuin/goldmark/actions?query=workflow:test https://coveralls.io/github/yuin/goldmark https://goreportcard.com/report/github.com/yuin/goldmark

A Markdown parser written in Go. Easy to extend, standards-compliant, well-structured.

goldmark is compliant with CommonMark 0.31.2.

Motivation

I needed a Markdown parser for Go that satisfies the following requirements:

  • Easy to extend.
    • Markdown is less expressive than other lightweight markup languages such as reStructuredText.
    • Extensions to the Markdown syntax already exist, e.g. PHP Markdown Extra and GitHub Flavored Markdown.
  • Standards-compliant.
    • Markdown has many dialects.
    • GitHub-Flavored Markdown is widely used and is based upon CommonMark, effectively mooting the question of whether or not CommonMark is an ideal specification.
      • CommonMark is complicated and hard to implement.
  • Well-structured.
    • AST-based; preserves source position of nodes.
  • Written in pure Go.

golang-commonmark may be a good choice, but it seems to be a copy of markdown-it.

blackfriday.v2 is a fast and widely-used implementation, but is not CommonMark-compliant and cannot be extended from outside of the package, since its AST uses structs instead of interfaces.

Furthermore, its behavior differs from other implementations in some cases, especially regarding lists: Deep nested lists don't output correctly #329, List block cannot have a second line #244, etc.

This behavior sometimes causes problems. If you migrate your Markdown text from GitHub to blackfriday-based wikis, many lists will immediately be broken.

As mentioned above, CommonMark is complicated and hard to implement, so Markdown parsers based on CommonMark are few and far between.

Features

  • Standards-compliant. goldmark is fully compliant with the latest CommonMark specification.
  • Extensible. Do you want to add a @username mention syntax to Markdown? You can easily do so in goldmark. You can add your AST nodes, parsers for block-level elements, parsers for inline-level elements, transformers for paragraphs, transformers for the whole AST structure, and renderers.
  • Performance. goldmark's performance is on par with that of cmark, the CommonMark reference implementation written in C.
  • Robust. goldmark is tested with go test --fuzz.
  • Built-in extensions. goldmark ships with common extensions like tables, strikethrough, task lists, and definition lists.
  • Depends only on standard libraries.

Installation

$ go get github.com/yuin/goldmark

Usage

Import packages:

import (
    "bytes"
    "github.com/yuin/goldmark"
)

Convert Markdown documents with the CommonMark-compliant mode:

var buf bytes.Buffer
if err := goldmark.Convert(source, &buf); err != nil {
  panic(err)
}

With options

var buf bytes.Buffer
if err := goldmark.Convert(source, &buf, parser.WithContext(ctx)); err != nil {
  panic(err)
}
Functional option Type Description
parser.WithContext A parser.Context Context for the parsing phase.

Context options

Functional option Type Description
parser.WithIDs A parser.IDs IDs lets you change the logic related to element ids (e.g. automatic heading id generation).

Custom parser and renderer

import (
    "bytes"
    "github.com/yuin/goldmark"
    "github.com/yuin/goldmark/extension"
    "github.com/yuin/goldmark/parser"
    "github.com/yuin/goldmark/renderer/html"
)

md := goldmark.New(
          goldmark.WithExtensions(extension.GFM),
          goldmark.WithParserOptions(
              parser.WithAutoHeadingID(),
          ),
          goldmark.WithRendererOptions(
              html.WithHardWraps(),
              html.WithXHTML(),
          ),
      )
var buf bytes.Buffer
if err := md.Convert(source, &buf); err != nil {
    panic(err)
}
Functional option Type Description
goldmark.WithParser parser.Parser This option must be passed before goldmark.WithParserOptions and goldmark.WithExtensions
goldmark.WithRenderer renderer.Renderer This option must be passed before goldmark.WithRendererOptions and goldmark.WithExtensions
goldmark.WithParserOptions ...parser.Option
goldmark.WithRendererOptions ...renderer.Option
goldmark.WithExtensions ...goldmark.Extender

Parser and Renderer options

Parser options

Functional option Type Description
parser.WithBlockParsers A util.PrioritizedSlice whose elements are parser.BlockParser Parsers for parsing block level elements.
parser.WithInlineParsers A util.PrioritizedSlice whose elements are parser.InlineParser Parsers for parsing inline level elements.
parser.WithParagraphTransformers A util.PrioritizedSlice whose elements are parser.ParagraphTransformer Transformers for transforming paragraph nodes.
parser.WithASTTransformers A util.PrioritizedSlice whose elements are parser.ASTTransformer Transformers for transforming an AST.
parser.WithAutoHeadingID - Enables auto heading ids.
parser.WithAttribute - Enables custom attributes. Currently only headings support attributes.

HTML Renderer options

Functional option Type Description
html.WithWriter html.Writer html.Writer for writing contents to an io.Writer.
html.WithHardWraps - Render newlines as <br>.
html.WithXHTML - Render as XHTML.
html.WithUnsafe - By default, goldmark does not render raw HTML or potentially dangerous links. With this option, goldmark renders such content as written.

Built-in extensions

Attributes

The parser.WithAttribute option allows you to define attributes on some elements.

Currently only headings support attributes.

Attributes are being discussed in the CommonMark forum. This syntax may change in the future.

Headings

## heading ## {#id .className attrName=attrValue class="class1 class2"}

## heading {#id .className attrName=attrValue class="class1 class2"}

heading {#id .className attrName=attrValue}
============

Table extension

The Table extension implements Table(extension), as defined in the GitHub Flavored Markdown Spec.

The spec is written for XHTML, so it uses some attributes that are deprecated in HTML5.

You can override the alignment rendering method via options.

Functional option Type Description
extension.WithTableCellAlignMethod extension.TableCellAlignMethod Option that indicates how table cells are aligned.

Typographer extension

The Typographer extension translates plain ASCII punctuation characters into typographic-punctuation HTML entities.

Default substitutions are:

Punctuation Default entity
' &lsquo;, &rsquo;
" &ldquo;, &rdquo;
-- &ndash;
--- &mdash;
... &hellip;
<< &laquo;
>> &raquo;

You can override the default substitutions via extension.WithTypographicSubstitutions:

markdown := goldmark.New(
    goldmark.WithExtensions(
        extension.NewTypographer(
            extension.WithTypographicSubstitutions(extension.TypographicSubstitutions{
                extension.LeftSingleQuote:  []byte("&sbquo;"),
                extension.RightSingleQuote: nil, // nil disables a substitution
            }),
        ),
    ),
)

Linkify extension

The Linkify extension implements Autolinks(extension), as defined in GitHub Flavored Markdown Spec.

Since the spec does not define details about URLs, there are numerous ambiguous cases.

You can override autolinking patterns via options.

Functional option Type Description
extension.WithLinkifyAllowedProtocols [][]byte | []string List of allowed protocols such as []string{ "http:" }
extension.WithLinkifyURLRegexp *regexp.Regexp Regexp that defines URLs, including protocols
extension.WithLinkifyWWWRegexp *regexp.Regexp Regexp that defines URLs starting with www. This pattern corresponds to the extended www autolink.
extension.WithLinkifyEmailRegexp *regexp.Regexp Regexp that defines email addresses.

Example, using xurls:

import "mvdan.cc/xurls/v2"

markdown := goldmark.New(
    goldmark.WithRendererOptions(
        html.WithXHTML(),
        html.WithUnsafe(),
    ),
    goldmark.WithExtensions(
        extension.NewLinkify(
            extension.WithLinkifyAllowedProtocols([]string{
                "http:",
                "https:",
            }),
            extension.WithLinkifyURLRegexp(
                xurls.Strict(),
            ),
        ),
    ),
)

Footnotes extension

The Footnote extension implements PHP Markdown Extra: Footnotes.

This extension has some options:

Functional option Type Description
extension.WithFootnoteIDPrefix []byte | string a prefix for the id attributes.
extension.WithFootnoteIDPrefixFunction func(gast.Node) []byte a function that determines the id attribute for given Node.
extension.WithFootnoteLinkTitle []byte | string an optional title attribute for footnote links.
extension.WithFootnoteBacklinkTitle []byte | string an optional title attribute for footnote backlinks.
extension.WithFootnoteLinkClass []byte | string a class for footnote links. This defaults to footnote-ref.
extension.WithFootnoteBacklinkClass []byte | string a class for footnote backlinks. This defaults to footnote-backref.
extension.WithFootnoteBacklinkHTML []byte | string the HTML content for footnote backlinks. This defaults to &#x21a9;&#xfe0e;.

Some options can have special substitutions. Occurrences of “^^” in the string will be replaced by the corresponding footnote number in the HTML output. Occurrences of “%%” will be replaced by a number for the reference (footnotes can have multiple references).
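
To make the two placeholders concrete, here is a minimal sketch of the substitution described above. The helper `expandPlaceholders` is hypothetical and written only for this illustration; it is not part of goldmark and does not claim to reproduce how the extension performs the replacement internally.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandPlaceholders is a hypothetical helper (not a goldmark API).
// It replaces "^^" with the footnote number and "%%" with the
// reference number, mirroring the rule stated in the text.
func expandPlaceholders(s string, footnote, ref int) string {
	r := strings.NewReplacer(
		"^^", strconv.Itoa(footnote),
		"%%", strconv.Itoa(ref),
	)
	return r.Replace(s)
}

func main() {
	// A string like this could be given to an option such as
	// extension.WithFootnoteBacklinkTitle.
	title := "Back to reference %% of footnote ^^"
	fmt.Println(expandPlaceholders(title, 3, 2))
}
```

With footnote number 3 and reference number 2, the title above becomes "Back to reference 2 of footnote 3".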

extension.WithFootnoteIDPrefix and extension.WithFootnoteIDPrefixFunction are useful if you have multiple Markdown documents displayed inside one HTML document and want to keep their footnote ids from clashing with each other.

extension.WithFootnoteIDPrefix sets a fixed id prefix, so you may write code like the following:

for _, path := range files {
    source := readAll(path)
    prefix := getPrefix(path)

    markdown := goldmark.New(
        goldmark.WithExtensions(
            extension.NewFootnote(
                // Use the per-document prefix so that ids from
                // different documents do not clash.
                extension.WithFootnoteIDPrefix(prefix),
            ),
        ),
    )
    var b bytes.Buffer
    err := markdown.Convert(source, &b)
    if err != nil {
        t.Error(err.Error())
    }
}

extension.WithFootnoteIDPrefixFunction determines an id prefix by calling the given function, so you may write code like the following:

markdown := goldmark.New(
    goldmark.WithExtensions(
        extension.NewFootnote(
            extension.WithFootnoteIDPrefixFunction(func(n gast.Node) []byte {
                // Look up the prefix stored in the document metadata.
                v, ok := n.OwnerDocument().Meta()["footnote-prefix"]
                if ok {
                    return util.StringToReadOnlyBytes(v.(string))
                }
                return nil
            }),
        ),
    ),
)

for _, path := range files {
    source := readAll(path)
    var b bytes.Buffer

    doc := markdown.Parser().Parse(text.NewReader(source))
    doc.Meta()["footnote-prefix"] = getPrefix(path)
    // Handle the render error instead of leaving it unused.
    if err := markdown.Renderer().Render(&b, source, doc); err != nil {
        panic(err)
    }
}

You can use goldmark-meta to define an id prefix in the Markdown document:

---
title: document title
slug: article1
footnote-prefix: article1
---

# My article

CJK extension

CommonMark gives compatibility a high priority, and the original Markdown was designed by Westerners, so CommonMark lacks consideration for languages such as Chinese, Japanese, and Korean (CJK).

This extension provides additional options for CJK users.

Functional option Type Description
extension.WithEastAsianLineBreaks ...extension.EastAsianLineBreaksStyle Soft line breaks are rendered as a newline, which some Asian users will see as an unnecessary space. With this option, soft line breaks between East Asian wide characters are ignored. This defaults to EastAsianLineBreaksStyleSimple.
extension.WithEscapedSpace - Emphasis that starts with East Asian punctuation is not interpreted as emphasis unless it is surrounded by spaces (as defined in the CommonMark spec). With this option, you can avoid this inconvenient behavior by putting escaped spaces, which are not rendered, around the emphasis, e.g. 太郎は\ **「こんにちわ」**\ といった.

Styles of Line Breaking

Style Description
EastAsianLineBreaksStyleSimple Soft line breaks are ignored if both sides of the break are East Asian wide characters. This behavior is the same as east_asian_line_breaks in Pandoc.
EastAsianLineBreaksCSS3Draft This option implements the CSS Text Level 3 Segment Break Transformation Rules with some enhancements.
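
The rule given for EastAsianLineBreaksStyleSimple can be sketched in a few lines of standard-library Go. This is only an approximation for illustration, not goldmark's implementation: `isWide` and `joinSoftBreaks` are hypothetical names, and `isWide` is a rough stand-in that covers Han, Hiragana, Katakana, CJK punctuation, and fullwidth forms rather than the full Unicode East Asian Width property.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// isWide is a rough approximation of "East Asian wide" (an assumption
// of this sketch): Han, Hiragana, Katakana, CJK symbols and punctuation
// (U+3000-U+303F), and fullwidth forms (U+FF01-U+FF60).
func isWide(r rune) bool {
	return unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana) ||
		(r >= 0x3000 && r <= 0x303F) ||
		(r >= 0xFF01 && r <= 0xFF60)
}

// joinSoftBreaks drops a line break only when the characters on both
// sides of it are wide; every other break is kept as "\n".
func joinSoftBreaks(lines []string) string {
	var b strings.Builder
	for i, line := range lines {
		if i > 0 {
			prev, _ := utf8.DecodeLastRuneInString(lines[i-1])
			next, _ := utf8.DecodeRuneInString(line)
			if !(isWide(prev) && isWide(next)) {
				b.WriteByte('\n')
			}
		}
		b.WriteString(line)
	}
	return b.String()
}

func main() {
	lines := []string{
		"私はプログラマーです。",
		"東京の会社に勤めています。",
		"GoでWebアプリケーションを開発しています。",
	}
	fmt.Printf("%q\n", joinSoftBreaks(lines))
}
```

In this sketch the break between the first two lines is dropped because both neighbours are wide, while the break before the line starting with "Go" is kept because "G" is not a wide character.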

Example of EastAsianLineBreaksStyleSimple

Input Markdown:

私はプログラマーです。
東京の会社に勤めています。
GoでWebアプリケーションを開発しています。

Output:

<p>私はプログラマーです。東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。</p>

Example of EastAsianLineBreaksCSS3Draft

Input Markdown:

私はプログラマーです。
東京の会社に勤めています。
GoでWebアプリケーションを開発しています。

Output:

<p>私はプログラマーです。東京の会社に勤めています。GoでWebアプリケーションを開発しています。</p>

Security

By default, goldmark does not render raw HTML or potentially dangerous URLs. If you need more control over untrusted content, it is recommended that you use an HTML sanitizer such as bluemonday.

Benchmark

You can run this benchmark in the _benchmark directory.

against other golang libraries

blackfriday v2 seems to be the fastest, but as it is not CommonMark compliant, its performance cannot be directly compared to that of the CommonMark-compliant libraries.

goldmark, meanwhile, builds a clean, extensible AST structure, achieves full compliance with CommonMark, and consumes less memory, all while being reasonably fast.

  • MBP 2019 13″(i5, 16GB), Go1.17
BenchmarkMarkdown/Blackfriday-v2-8                   302           3743747 ns/op         3290445 B/op      20050 allocs/op
BenchmarkMarkdown/GoldMark-8                         280           4200974 ns/op         2559738 B/op      13435 allocs/op
BenchmarkMarkdown/CommonMark-8                       226           5283686 ns/op         2702490 B/op      20792 allocs/op
BenchmarkMarkdown/Lute-8                              12          92652857 ns/op        10602649 B/op      40555 allocs/op
BenchmarkMarkdown/GoMarkdown-8                        13          81380167 ns/op         2245002 B/op      22889 allocs/op

against cmark (CommonMark reference implementation written in C)

  • MBP 2019 13″(i5, 16GB), Go1.17
----------- cmark -----------
file: _data.md
iteration: 50
average: 0.0044073057 sec
------- goldmark -------
file: _data.md
iteration: 50
average: 0.0041611990 sec

As you can see, goldmark's performance is on par with cmark's.

Extensions

List of extensions

Loading extensions at runtime

goldmark-dynamic allows you to write a goldmark extension in Lua and load it at runtime without re-compilation.

Please refer to goldmark-dynamic for details.

goldmark internals (for extension developers)

Overview

goldmark's Markdown processing is outlined in the diagram below.

            <Markdown in []byte, parser.Context>
                           |
                           V
            +-------- parser.Parser ---------------------------
            | 1. Parse block elements into AST
            |   1. If a parsed block is a paragraph, apply
            |      parser.ParagraphTransformer
            | 2. Traverse AST and parse blocks.
            |   1. Process delimiters(emphasis) at the end of
            |      block parsing
            | 3. Apply parser.ASTTransformers to AST
                           |
                           V
                      <ast.Node>
                           |
                           V
            +------- renderer.Renderer ------------------------
            | 1. Traverse AST and apply the renderer.NodeRenderer
            |    corresponding to the node type

                           |
                           V
                        <Output>

Parsing

Markdown documents are read through the text.Reader interface.

AST nodes do not hold concrete text. Instead, they hold segment information about the document, represented by text.Segment.

text.Segment has 3 attributes: Start, Stop, and Padding.
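
As a rough illustration of how a node can refer to source bytes by offsets instead of copying text, here is a minimal sketch. The `segment` type and its `value` method are hypothetical simplifications written for this example, not goldmark's text.Segment; the field names and the treatment of padding as spaces prepended to the value are assumptions of this model.

```go
package main

import (
	"fmt"
	"strings"
)

// segment is a simplified, hypothetical model of a source segment.
type segment struct {
	start   int // byte offset where the segment begins (inclusive)
	stop    int // byte offset where the segment ends (exclusive)
	padding int // number of spaces to prepend when reading the value
}

// value resolves the segment against the original source bytes.
func (s segment) value(source []byte) string {
	return strings.Repeat(" ", s.padding) + string(source[s.start:s.stop])
}

func main() {
	source := []byte("# Title\n\nbody text\n")

	// The heading text occupies bytes 2..7 of the source.
	heading := segment{start: 2, stop: 7}
	fmt.Printf("%q\n", heading.value(source))

	// The same bytes, read with two spaces of padding.
	padded := segment{start: 2, stop: 7, padding: 2}
	fmt.Printf("%q\n", padded.value(source))
}
```

Because a node in this model stores only offsets, the source slice has to be available whenever text is needed, which is consistent with the source being passed to the renderer alongside the AST.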

(TBC)

TODO

See extension directory for examples of extensions.

Summary:

  1. Define AST Node as a struct in which ast.BaseBlock or ast.BaseInline is embedded.
  2. Write a parser that implements parser.BlockParser or parser.InlineParser.
  3. Write a renderer that implements renderer.NodeRenderer.
  4. Define your goldmark extension that implements goldmark.Extender.

Donation

BTC: 1NEDSyUmo4SMTDP83JJQSWi1MvQUGGNMZB

License

MIT

Author

Yusuke Inuzuka

goldmark's People

Contributors

88250, abhinav, aletson, anthonyfok, camdencheek, cipherboy, dertseha, eltociear, helfper, henry0312, hivehand, hochhaus, jchenry, jlauinger, jmooring, jschaf, jsteuer, kissaki, litao-byted, mdigger, mironovalexey, moorereason, movsb, paperprototype, stephenafamo, tangiel, vincentbernat, yilei, yuin, zzwx


goldmark's Issues

How to apply microtypographic rules to Markdown?

  1. What version of goldmark are you using? : v1.11.1 (Hugo 0.60.1)
  2. What version of Go are you using? : go1.13.4
  3. What operating system and processor architecture are you using? darwin/amd64 (macOS 10.15.1)
  4. What did you do? : Wrote Markdown in French
  5. What did you expect to see? : French typographic rules applied (like inserting a non-breaking space before a question mark)
  6. What did you see instead? : No French typographic rules
  7. (Feature request only): Why you can not implement it as an extension?: Not a Go programmer

How should French typographic rules be applied: through an extension, or is it something that depends on the Go language itself, or on another Go package? Something like SmartyPants, but with more rules specific to a language.

For instance, languages like PHP have libs to handle this: https://github.com/jolicode/JoliTypo

Some of the French typographic rules are listed by the Grammalecte Firefox extension:

Comma breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {
   var s1 = []byte("https://github.com#sun,mon")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com#sun">https://github.com#sun</a>,mon</p>

With the github.com parser, I get this result:

<p><a href="https://github.com#sun,mon">https://github.com#sun,mon</a></p>

Example:

https://github.com#sun,mon

Support for inline footnotes

Please answer the following before submitting your issue:

  1. What version of goldmark are you using? : v1.1.10 via Hugo v0.60.1
  2. What version of Go are you using? : go1.13.4
  3. What operating system and processor architecture are you using? : darwin/amd64

This is a feature request. Both Pandoc and Black Friday support a different version of footnotes than the one currently supported by Goldmark. Pandoc refers to them as inline footnotes. The syntax looks like this:

This is a sentence.^[This is footnote one.] This is also a sentence.^[This will become footnote two.]

Would it be possible for Goldmark to support this kind of footnote?

Linkify does not work after Chinese characters

goldmark does not linkify the following links:

搜索引擎链接https://www.google.com

OR

搜索引擎链接:https://www.google.com
  1. What version of goldmark are you using? : v1.1.11
  2. What version of Go are you using? : 1.12
  3. What operating system and processor architecture are you using? : Hugo v0.60.1 on macOS
  4. What did you do? : Put a link after Chinese characters
  5. What did you expect to see? : The link should be automatically created
  6. What did you see instead? : The link was not automatically created
  7. (Feature request only): Why you can not implement it as an extension?: not applicable

Linkify bug

package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/debug"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/extension"
)

func main() {

	convert(`
Go to [http://www.example.com](www.example.com) or http://www.example.com.
`)
}

func convert(src string) {
	markdown := goldmark.New(
		goldmark.WithExtensions(extension.Linkify),
	)

	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Prints:

<p>Go to <a href="www.example.com"><a href="http://www.example.com">http://www.example.com</a></a> or <a href="http://www.example.com">http://www.example.com</a>.</p>

Add some kind of "non-rendering render hook"

In working on adding this to Hugo, I wanted to implement ToC in a general way that we could possibly also use for other things; e.g. a "content map" with byte slice pointers (start/stop) into the rendered content.

I experimented by creating an extension:

https://github.com/bep/hugo/blob/goldmark2/markup/goldmark/contentmap.go#L35

But that doesn't work, as I notice that you pick up the first renderer for a given node kind.

Note that for the ToC thing (which is what Hugo has today), I can traverse the AST and build the ToC from that, but it would be really useful if I could somehow register the rendered start/stop position for the different blocks, so people could do things like:

  • Split content over multiple pages
  • Insert ads/bylines etc.
  • ...

Again, thanks for this library, it's really easy to use.

Apostrophes in contractions are not converted to right single quote

For the following text:

I'm going to see my mother. She's very nice.

Currently, ' is not converted to &rsquo; in contractions when the typography extension is enabled, although SmartyPants does convert it. I would expect the output to be:

I&rsquo;m going to see my mother. She&rsquo;s very nice.

Heading attribute panics

The source below is taken from the README.

package main

import (
	"bytes"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/parser"
)

func main() {
	md := goldmark.New(
		goldmark.WithParserOptions(
			parser.WithAttribute(),
		),
	)
	source := []byte(`
## heading {#id .className attrName=attrValue class="class1 class2"}
`)
	var buf bytes.Buffer
	if err := md.Convert(source, &buf); err != nil {
		panic(err)
	}
}

Panics:

panic: interface conversion: interface {} is [][]uint8, not []uint8

goroutine 1 [running]:
github.com/yuin/goldmark/renderer/html.(*Renderer).RenderAttributes(0xc00009cae0, 0x12262c0, 0xc000119d80, 0x1227860, 0xc0001b2000)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/html/html.go:513 +0x227
github.com/yuin/goldmark/renderer/html.(*Renderer).renderHeading(0xc00009cae0, 0x12262c0, 0xc000119d80, 0xc000184140, 0x49, 0x49, 0x1227860, 0xc0001b2000, 0x2001, 0x8, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/html/html.go:208 +0x11f
github.com/yuin/goldmark/renderer.(*renderer).Render.func2(0x1227860, 0xc0001b2000, 0x1, 0x0, 0x12262c0, 0xc000119d80)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/renderer.go:167 +0x108
github.com/yuin/goldmark/ast.Walk(0x1227860, 0xc0001b2000, 0xc000175e30, 0x3, 0x0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/ast/ast.go:433 +0x43
github.com/yuin/goldmark/ast.Walk(0x12273e0, 0xc00011c780, 0xc000175e30, 0xc0001b8000, 0x0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/ast/ast.go:439 +0x149
github.com/yuin/goldmark/renderer.(*renderer).Render(0xc0001245f0, 0x12243c0, 0xc0000909f0, 0xc000184140, 0x49, 0x49, 0x12273e0, 0xc00011c780, 0xc000124501, 0xc0000909f0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/renderer.go:162 +0x13c
github.com/yuin/goldmark.(*markdown).Convert(0xc000119a00, 0xc000184140, 0x49, 0x49, 0x12243c0, 0xc0000909f0, 0x0, 0x0, 0x0, 0xc000064058, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/markdown.go:117 +0xe3

An infinite loop in ASTTransformer

func (*test) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
  walk(node)
}

func walk(node ast.Node) {
  for n := node.FirstChild(); n != nil; n = node.NextSibling() {
    walk(n)
  }
}
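
In the snippet above, the loop advances with node.NextSibling() (the sibling of the parent) rather than n.NextSibling(). That value does not change between iterations, so the loop cannot terminate whenever it is non-nil. The sketch below shows the intended traversal on a hypothetical stand-in tree type; `node`, `walk`, and `sampleTree` are illustrative names, not goldmark's ast.Node API.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a tree node with first-child
// and next-sibling links; it is not goldmark's ast.Node.
type node struct {
	name        string
	firstChild  *node
	nextSibling *node
}

// walk visits every descendant of n in document order. The loop
// advances with c.nextSibling (the current child), not n.nextSibling.
func walk(n *node, visit func(*node)) {
	for c := n.firstChild; c != nil; c = c.nextSibling {
		visit(c)
		walk(c, visit)
	}
}

// sampleTree builds doc -> [p1 -> [t1, t2], p2 -> [t3]].
func sampleTree() *node {
	t2 := &node{name: "t2"}
	t1 := &node{name: "t1", nextSibling: t2}
	t3 := &node{name: "t3"}
	p2 := &node{name: "p2", firstChild: t3}
	p1 := &node{name: "p1", firstChild: t1, nextSibling: p2}
	return &node{name: "doc", firstChild: p1}
}

func main() {
	walk(sampleTree(), func(n *node) { fmt.Println(n.name) })
}
```

Applied to the original code, the equivalent change would be to write n = n.NextSibling() in the loop's post statement.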

Footnotes numbering should always be sequential

Running Hugo 0.60.1 with Goldmark 1.1.8. I'm basing this off the PHP Markdown Extra spec since CommonMark doesn't support footnotes, so please bear with me.

From my reading of the PHP Markdown Extra spec for footnotes, they should always be parsed and numbered in sequential order in the document. What I've noticed in Goldmark is that they're numbered using the text in the footnote name, or that this sequential renumbering step is skipped; I'm not entirely sure which.

  1. What did you do? : Added footnotes to text, e.g.:
This[^3] is[^1] text with footnotes[^2].

[^1]: Footnote one
[^2]: Footnote two
[^3]: Footnote three
  1. What did you expect to see? :

This1 is2 text with footnotes3
1: Footnote three
2: Footnote one
3: Footnote two

  1. What did you see instead? :

This3 is1 text with footnotes2
1: Footnote one
2: Footnote two
3: Footnote three

(I apologize, my first example did not reflect the actual problem. I have updated it.)
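
The numbering the reporter expects, where footnotes are numbered by the order of their first reference rather than by label, can be sketched as follows. This is only an illustration of the expectation stated in this issue; `refPattern` and `numberByFirstReference` are hypothetical names, and the code is not taken from goldmark or from PHP Markdown Extra.

```go
package main

import (
	"fmt"
	"regexp"
)

// refPattern matches footnote references such as [^1] or [^note].
var refPattern = regexp.MustCompile(`\[\^([^\]\s]+)\]`)

// numberByFirstReference assigns 1, 2, 3, ... to footnote labels in
// the order in which they are first referenced in body. The body is
// expected to contain only references, not footnote definitions.
func numberByFirstReference(body string) map[string]int {
	numbers := make(map[string]int)
	for _, m := range refPattern.FindAllStringSubmatch(body, -1) {
		label := m[1]
		if _, seen := numbers[label]; !seen {
			numbers[label] = len(numbers) + 1
		}
	}
	return numbers
}

func main() {
	body := "This[^3] is[^1] text with footnotes[^2]."
	n := numberByFirstReference(body)
	// Labels 3, 1 and 2 are referenced in that order.
	fmt.Println(n["3"], n["1"], n["2"])
}
```

For the example text, label 3 is numbered 1, label 1 is numbered 2, and label 2 is numbered 3, which matches the "expected" listing above.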

Rather a question

Thank you for great work on this library! I've been looking for this clean implementation that works out-of-box without any patches.

Now, for a project I'm working on, I need to hack the default renderer a little. Could you please point me to where I would plug in custom rendering of YouTube auto-links? They arrive as *ast.AutoLink nodes, and I'm trying to rewrite their rendering so that they appear in <div>s with an <img> of the YouTube preview picture and an <a> leading to the video. That's the idea.

So far I've been able to declare a custom type:

// CustomGoldmarkRenderer renders specific markdown documents containing video links
type CustomGoldmarkRenderer struct {
	defaultRenderer renderer.Renderer
	file            *[]byte
}

which I then make implement the Renderer interface:

func (c CustomGoldmarkRenderer) Render(w io.Writer, source []byte, n ast.Node) error {
	ast.Walk(n, func(n ast.Node, entering bool) (status ast.WalkStatus, err error) {
		switch t := n.(type) {
		case *ast.AutoLink:
			url := string(t.URL(*c.file))
			matches := youTubeLinkRegex.FindAllStringSubmatch(url, -1)
			if len(matches) == 0 {
				// Or try a short link
				matches = youTubeShortLinkRegex.FindAllStringSubmatch(url, -1)
			}
			if len(matches) > 0 {
				videoID := matches[0][1] // Group 1 stands for the first (...) block
				if entering {
					fmt.Fprintf(w, `
					<div class="py-2 col-12 col-xl-3 col-lg-3 col-md-4 mb-2">
						<a href="%s" target="_blank" class="d-block h-180">
							<img class="img-fluid img-thumbnail rounded" src="%s" alt="%s"/>
						</a>
						%s`,
						url,
						"https://img.youtube.com/vi/"+videoID+"/mqdefault.jpg",
						"title",
						"titleHTML")
					return ast.WalkSkipChildren, nil
				} else {
					fmt.Fprintf(w, `</div>`)
				}
			}
		}
		return ast.WalkContinue, nil
	})
	return c.defaultRenderer.Render(w, source, n)
}

so that I can plug my custom renderer into md := goldmark.New(...) as follows:

	md.SetRenderer(CustomGoldmarkRenderer{
		defaultRenderer: md.Renderer(),
		file:            &file,  // Passing original source so that it becomes available in parsing function
	})

	var buf bytes.Buffer
	if err := md.Convert(file, &buf); err != nil {
		panic(err)
	}

Now, of course, what I get is just the additional <div>s rendered by my walker, and then (unsurprisingly) the ordinary rendering is appended to the io.Writer when return c.defaultRenderer.Render(w, source, n) comes into play.

Being an amateur Go coder, I can't figure out how to render the rest of the nodes in the default way while doing my own rendering node by node in my custom ast.Walk(...) call, since return c.defaultRenderer.Render(w, source, n) seems to be called just once, for the Document node, and doesn't help me with individual nodes at all.

So, would you be so kind as to hint at where I'm going wrong and which direction I should take instead?

Typographic elements in heading are excluded from the automatically generated heading IDs

Hello,

For background and related discussion, please see the following post in Hugo forum.

https://discourse.gohugo.io/t/difference-in-auto-generated-heading-anchor-names-between-previous-versions-and-v0-60-x/22076

Please answer the following before submitting your issue:

  1. What version of goldmark are you using? : 1.1.8 (included in Hugo 0.60.1)
  2. What version of Go are you using? : 1.11.2 (but shouldn't matter as the test is done with pre-built Hugo)
  3. What operating system and processor architecture are you using? : macOS 10.13.6, Intel Core i5
  4. What did you do? : Upgrade Hugo from 0.54.0 to 0.60.1 to check the basic functionality
  5. What did you expect to see? : Non-alphanumeric typographic elements (hyphen, period, underscore, etc.) in headings are transformed into hyphens in the auto heading IDs (e.g. for the headings "Command-Gen-Instance" and "v1.0.0 (Apr 21, 2019)", the results are command-gen-instance and v1-0-0-apr-21-2019)
  6. What did you see instead? : Non-alphanumeric typographic elements in headings are excluded from the auto heading IDs (e.g. for the example above, the results are commandgeninstance and v100-april-21-2019)

Many thanks for your work with Goldmark.
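
The behavior expected in this report, where each run of non-alphanumeric characters becomes a single hyphen, can be sketched with the standard library alone. `hyphenatedID` is a hypothetical helper written for this illustration; it is neither goldmark's nor Hugo's ID generator, and an actual implementation could plausibly be plugged in through the parser.IDs hook mentioned in the README.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// hyphenatedID lowercases s and collapses each run of characters that
// are not letters or digits into a single hyphen. Hyphens are never
// emitted at the start or end of the result.
func hyphenatedID(s string) string {
	var b strings.Builder
	pendingHyphen := false
	for _, r := range strings.ToLower(s) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			if pendingHyphen && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingHyphen = false
			b.WriteRune(r)
		} else {
			pendingHyphen = true
		}
	}
	return b.String()
}

func main() {
	fmt.Println(hyphenatedID("Command-Gen-Instance"))
	fmt.Println(hyphenatedID("v1.0.0 (Apr 21, 2019)"))
}
```

For the two headings quoted in the report, this sketch yields command-gen-instance and v1-0-0-apr-21-2019.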

table extension: Merged table columns

Tested with Goldmark 1.16.

package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/extension"
)

func main() {

	convert(`Foo|Bar
---|---
` + "`" + `Yoyo` + "`" + `|Dyne`)
}

func convert(src string) {

	markdown := goldmark.New(
		goldmark.WithExtensions(
			extension.Table,
		),
	)
	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Produces

<table>
<thead>
<tr>
<th>Foo</th>
<th>Bar</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>Yoyo</code>|Dyne</td>
<td></td>
</tr>
</tbody>
</table>

The "try it" on https://commonmark.org/help/tutorial/02-emphasis.html renders it correctly, https://spec.commonmark.org/dingus/ renders nothing.

gohugoio/hugo#6641

"!" will always start a new text element.

Given:

This is a line! Yes.

And this is another!

I get:

    Paragraph {
        RawText: "This is a line! Yes."
        HasBlankPreviousLines: false
        Text: "This is a line"
        Text: "! Yes."
    }
    Paragraph {
        RawText: "And this is another!"
        HasBlankPreviousLines: false
        Text: "And this is another"
        Text: "!"
    }

Expected:

    Paragraph {
        RawText: "This is a line! Yes."
        HasBlankPreviousLines: false
        Text: "This is a line! Yes."
    }
    Paragraph {
        RawText: "And this is another!"
        HasBlankPreviousLines: false
        Text: "And this is another!"
    }

Fenced code block with carriage returns causes a panic error

Hey!

I am trying to "markdownify" input coming from an HTML textarea, and it contains carriage returns.

Using a fenced code block with carriage returns causes the whole program to panic with a slice bounds out of range error.

Here is an example:

package main

import (
	"bytes"
	"fmt"
	"html/template"

	"github.com/yuin/goldmark"
)

func main() {
	var buf bytes.Buffer
	if err := goldmark.Convert([]byte("lol\r\n\r\n```\r\nok\r\n```\r\n\r\nyes"), &buf); err != nil {
		panic(err)
	}
	fmt.Printf("%v", template.HTML(buf.String()))
}

panic: runtime error: slice bounds out of range

goroutine 1 [running]:
github.com/yuin/goldmark/parser.(*fencedCodeBlockParser).Open(0x8336e0, 0x6b0340, 0xc000102600, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0, 0x0, 0x0, 0x10)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/fcode_block.go:51 +0x419
github.com/yuin/goldmark/parser.(*parser).openBlocks(0xc000172000, 0x6b0340, 0xc000102600, 0x1, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0, 0x3)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:849 +0x27e
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc000172000, 0x6b0340, 0xc000102600, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:923 +0x1ce
github.com/yuin/goldmark/parser.(*parser).Parse(0xc000172000, 0x6afa80, 0xc00009a7e0, 0x0, 0x0, 0x0, 0x20, 0x62e2c0)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:771 +0x157
github.com/yuin/goldmark.(*markdown).Convert(0xc00017a000, 0xc0000b6a00, 0x1a, 0x1a, 0x6ab8a0, 0xc00007ad20, 0x0, 0x0, 0x0, 0x0, ...)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/markdown.go:116 +0x94
github.com/yuin/goldmark.Convert(...)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/markdown.go:31
main.main()
        /home/thomas/Proj/blobstash/cmd/lol/p.go:13 +0xbc
exit status 2

Thanks!
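
One possible workaround, assuming the panic is triggered by the carriage returns as the report suggests, is to normalize line endings before handing the textarea input to Convert. The helper name normalizeNewlines is hypothetical; the sketch uses only the standard library.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeNewlines is a hypothetical pre-processing helper.
// It rewrites CRLF pairs first, then any remaining lone CR, to LF.
func normalizeNewlines(src []byte) []byte {
	src = bytes.ReplaceAll(src, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(src, []byte("\r"), []byte("\n"))
}

func main() {
	// Same input as in the report above.
	in := []byte("lol\r\n\r\n```\r\nok\r\n```\r\n\r\nyes")
	fmt.Printf("%q\n", normalizeNewlines(in))
}
```

The normalized bytes can then be passed to Convert in place of the raw input. Note that this changes the source, so any source positions reported by the parser refer to the normalized text.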

BlockParser logics

Thank you for the excellent markdown processor. This is really very impressive.

Could you please help me to understand the logic of BlockParser?

For example, I want to implement the behaviour of (nesting) lists/blockquotes with the help of markers, i.e.

%START%

Content 1

%START%

Content 2

%FINISH%

Content 3

%FINISH%

should produce

<START>
Content 1
<START>
Content 2
</FINISH>
Content 3
</FINISH>

But when I write something like this, I am a little bit confused. It interrupts the parsing along with the first parser.Close call. But blockquote works well and there could be multiple parser.Close calls during the nested blockquotes parsing cycle.
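
Independently of goldmark's BlockParser interface, the nesting that the markers are meant to express can be tracked with a depth counter. The sketch below is a stand-alone illustration of that bookkeeping, not a BlockParser implementation; convertMarkers is a hypothetical function, and it emits the tags exactly as written in the expected output above.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// convertMarkers is a hypothetical stand-alone function, not a goldmark
// BlockParser. It replaces %START% and %FINISH% marker lines with the
// tags from the expected output, drops blank lines, and reports
// unbalanced markers.
func convertMarkers(src string) (string, error) {
	var out []string
	depth := 0 // number of %START% markers that are still open
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch line {
		case "":
			continue
		case "%START%":
			depth++
			out = append(out, "<START>")
		case "%FINISH%":
			if depth == 0 {
				return "", errors.New("%FINISH% without matching %START%")
			}
			depth--
			out = append(out, "</FINISH>")
		default:
			out = append(out, line)
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	if depth != 0 {
		return "", errors.New("%START% without matching %FINISH%")
	}
	return strings.Join(out, "\n"), nil
}

func main() {
	src := "%START%\n\nContent 1\n\n%START%\n\nContent 2\n\n%FINISH%\n\nContent 3\n\n%FINISH%\n"
	out, err := convertMarkers(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The point of the counter is that closing an inner block must leave the outer block open, which is the behaviour the question describes for nested blockquotes.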

Colon inside ** breaks "boldness"

package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/yuin/goldmark"
)

func main() {
	content := `**Bold:**Regular`

	markdown := goldmark.New()

	var buf bytes.Buffer
	err := markdown.Convert([]byte(content), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Prints:

<p>**Bold:**Regular</p>
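
This output appears to follow from CommonMark's delimiter-run rules rather than from the colon as such: as I read the spec, a `**` run can close strong emphasis only if it is right-flanking, and a run that is preceded by punctuation and followed by a letter is not. The sketch below encodes the two flanking tests; the function names are hypothetical, the punctuation check is an approximation built on the unicode package, and none of this is goldmark's actual code.

```go
package main

import (
	"fmt"
	"unicode"
)

// isSpace treats the start or end of a line (passed as 0) as whitespace.
func isSpace(r rune) bool { return r == 0 || unicode.IsSpace(r) }

// isPunct approximates the spec's punctuation class (an assumption).
func isPunct(r rune) bool { return unicode.IsPunct(r) || unicode.IsSymbol(r) }

// leftFlanking reports whether a delimiter run located between the
// characters before and after could open emphasis.
func leftFlanking(before, after rune) bool {
	if isSpace(after) {
		return false
	}
	if !isPunct(after) {
		return true
	}
	return isSpace(before) || isPunct(before)
}

// rightFlanking reports whether the same run could close emphasis.
func rightFlanking(before, after rune) bool {
	if isSpace(before) {
		return false
	}
	if !isPunct(before) {
		return true
	}
	return isSpace(after) || isPunct(after)
}

func main() {
	// Second "**" in "**Bold:**Regular": preceded by ':' and followed by 'R'.
	fmt.Println("can close:", rightFlanking(':', 'R'))
	// Second "**" in "**Bold**:Regular": preceded by 'd' and followed by ':'.
	fmt.Println("can close:", rightFlanking('d', ':'))
}
```

Under these rules, moving the colon outside the markers (`**Bold**:Regular`) or adding a space after the closing `**` lets the run close the emphasis.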

goldmark can't emphasize a specific Chinese character

After processing the Markdown text **「刻舟求剑」** with goldmark, the result is still **「刻舟求剑」**, but the expected result is 「刻舟求剑」 rendered with emphasis.

  1. What version of goldmark are you using? : v1.1.11
  2. What version of Go are you using? : 1.12
  3. What operating system and processor architecture are you using? : Hugo v0.60.1 on macOS
  4. What did you do? : Create a markdown file with **
  5. What did you expect to see? : The words should be emphasized
  6. What did you see instead? : The words were not emphasized
  7. (Feature request only): Why you can not implement it as an extension?: not applicable

Pandoc Markdown Compatibility?

Goldmark's CommonMark compatibility is amazing and, with attribute support, the mathjax extension, and the metadata extension, it covers what I feel are the most popular parts of Pandoc Markdown. It seems very possible that Goldmark could eventually replace external pandoc dependencies in many Go applications. I would very much like to contribute toward that goal, and I'm aware of others who would as well.

To that end I am seeking some design direction and consensus about how to move forward.

An extension for each feature seems most reasonable, but I opened this issue to check whether a full AST Transformer might be a better approach. Personally, I prefer the modularity of one extension per feature (particularly for Pandoc's unique Simplified Tables) and value the compositional design that Pandoc uses for its internals.

Which design direction is most recommended for such work? Several extensions or a single Transformer? I'm almost sure the answer is extensions but am asking anyway to avoid something I may have missed.

Thank you. (If there is a better place to have this discussion please let me know.)

Footnote parsing error

test![^1]

[^1]: footnote
<p>test![^1]</p>

This happens if an exclamation mark is placed before the footnote link.

Apostrophe breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {
   var s1 = []byte("https://github.com/sunday's")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com/sunday">https://github.com/sunday</a>'s</p>

With the github.com parser, I get this result:

<p><a href="https://github.com/sunday's">https://github.com/sunday's</a></p>

Example:

https://github.com/sunday's

API question

Ref this interface:

// A Markdown interface offers functions to convert Markdown text to
// a desired format.
type Markdown interface {
	// Convert interprets a UTF-8 bytes source in Markdown and write rendered
	// contents to a writer w.
	Convert(source []byte, writer io.Writer, opts ...parser.ParseOption) error

	// Parser returns a Parser that will be used for conversion.
	Parser() parser.Parser

	// SetParser sets a Parser to this object.
	SetParser(parser.Parser)

	// Renderer returns a Renderer that will be used for conversion.
	Renderer() renderer.Renderer

	// SetRenderer sets a Renderer to this object.
	SetRenderer(renderer.Renderer)
}

With the above, I can create a Markdown with a custom parser and renderer (I'm not sure what the setters are for) and then run Convert to do the job.

A big win (ref. your benchmarks) of this strict separation between parsing and rendering is that you can parse once and render to every format you need. I don't see how that is possible with the current API.
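
The shape being asked for is "build the tree once, then hand the same tree to several renderers". The sketch below shows that shape with a hypothetical toy parser and two toy renderers written against the standard library only; it is not goldmark code. With the interface quoted above, the equivalent steps would presumably go through Parser() and Renderer() separately instead of Convert.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// block is a hypothetical, minimal AST node: a heading or a paragraph.
type block struct {
	heading bool
	text    string
}

// parse is a toy parser standing in for a real Markdown parser:
// "# " lines become headings, other non-blank lines become paragraphs.
func parse(src string) []block {
	var doc []block
	for _, line := range strings.Split(src, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, "# "):
			doc = append(doc, block{heading: true, text: strings.TrimPrefix(line, "# ")})
		default:
			doc = append(doc, block{text: line})
		}
	}
	return doc
}

// renderHTML writes the tree as HTML (toy code: no escaping).
func renderHTML(w io.Writer, doc []block) error {
	for _, b := range doc {
		tag := "p"
		if b.heading {
			tag = "h1"
		}
		if _, err := fmt.Fprintf(w, "<%s>%s</%s>\n", tag, b.text, tag); err != nil {
			return err
		}
	}
	return nil
}

// renderText writes the same tree as plain text with upper-case headings.
func renderText(w io.Writer, doc []block) error {
	for _, b := range doc {
		text := b.text
		if b.heading {
			text = strings.ToUpper(text)
		}
		if _, err := fmt.Fprintln(w, text); err != nil {
			return err
		}
	}
	return nil
}

// renderAll parses src once and renders the result in both formats.
func renderAll(src string) (html, plain string, err error) {
	doc := parse(src) // the only parse
	var h, p bytes.Buffer
	if err = renderHTML(&h, doc); err != nil {
		return "", "", err
	}
	if err = renderText(&p, doc); err != nil {
		return "", "", err
	}
	return h.String(), p.String(), nil
}

func main() {
	html, plain, err := renderAll("# Title\n\nBody text")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(html, plain)
}
```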

Rendering of external links in safe mode

I've now merged in Goldmark as the default Markdown handler in Hugo and it works great.

I have set unsafe=false as the default, and that works mostly as expected.

But the rendering of external links comes as a surprise to most people, I think.

[Google Search!](https://google.com/)

=>

[Google Search!](https://google.com/)

So, the security motivation behind the above is maybe to prevent fake linking? But when the end result is that most people configure it to be unsafe just to get proper links, I think the net effect is less security.

gohugoio/hugoThemesSite#67

Panic in auto-id

package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/debug"

	"github.com/yuin/goldmark/parser"

	"github.com/yuin/goldmark"
)

func main() {

	convert(`#
# FOO`)
}

func convert(src string) {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("Panic:\n", string(debug.Stack()))
		}
	}()

	markdown := goldmark.New(
		goldmark.WithParserOptions(
			parser.WithAutoHeadingID(),
		),
	)
	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}
}

github.com/yuin/goldmark/text.(*Segments).At(...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/text/segment.go:182
github.com/yuin/goldmark/parser.generateAutoHeadingID(0xc0001ba000, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/atx_heading.go:190 +0x219
github.com/yuin/goldmark/parser.(*atxHeadingParser).Close(0xc00012120e, 0x122a0c0, 0xc0001ba000, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/atx_heading.go:173 +0xba
github.com/yuin/goldmark/parser.(*parser).closeBlocks(0xc0001b1500, 0x0, 0x0, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:845 +0x162
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001b1500, 0x1229c40, 0xc000126780, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:1058 +0x753
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001b1500, 0x1229380, 0xc0001aa7e0, 0x0, 0x0, 0x0, 0x8, 0x8)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:818 +0x148
github.com/yuin/goldmark.(*markdown).Convert(0xc000123a40, 0xc000121230, 0x7, 0x8, 0x1226bc0, 0xc00009a9f0, 0x0, 0x0, 0x0, 0xc0001ae000, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/markdown.go:116 +0x94
main.convert(0x11f960a, 0x7)
	/Users/bep/dev/go/bep/temp/main.go:33 +0x1d5
main.main()
	/Users/bep/dev/go/bep/temp/main.go:16 +0x36

Unwanted paragraph closing tag in html template tag

First of all, thanks a lot for the work on goldmark. I just tried it with the new release and it works great. Though, there is one minor imperfection:

The unsafe option is turned on and there is html code inside a paragraph, like this:

This is **Bold** <span>Component</span><template>
<div>Name</div>
</template>  **Bold** as well.

This will render as:

<p>This is <strong>Bold</strong> <span>Component</span><template></p>
<div>Name</div>
</template>  **Bold** as well.

Notice how the closing tag </p> is set too early. If I remove the line break after <template> it works as expected:

This is **Bold** <span>Component</span><template> <div>Name</div>
</template>  **Bold** as well.
<p>This is <strong>Bold</strong> <span>Component</span><template> <div>Name</div>
</template>  <strong>Bold</strong> as well.</p>

Of course I can just move the div up, but there are other divs in my template as well (they also call </p> too early) and therefore this one line will become quite long and hard to maintain. Basically, the template tag and everything inside should not call for the automatic setting of </p>.

Even though I use Hugo to render, I think this is a goldmark related issue.

Last backtick appears to escape in fenced code blocks

Hi there,

Thanks for spending the time to make this! This is super useful, and the extensibility is a great feature not easily found elsewhere. I had one issue, I'm not sure if this is a bug or a side effect, but here it goes. In fenced code blocks, it appears that the last backtick escapes.

So, for example:

    ```
    function lorem(ipsum, dolor = 1) {
      const sit = ipsum == null ? 0 : ipsum.sit;
      dolor = sit - amet(dolor);
      return sit ? consectetur(ipsum, 0, dolor < 0 ? 0 : dolor) : [];
    }

    function adipiscing(...elit) {
      if (!elit.sit) {
        return [];
      }
    
      const sed = elit[0];
      return eiusmod.tempor(sed) ? sed : [sed];
    }

    function incididunt(ipsum, ut = 1) {
      ut = labore.et(amet(ut), 0);
      const sit = ipsum == null ? 0 : ipsum.sit;

      if (!sit || ut < 1) {
        return [];
      }

      let dolore = 0;
      let magna = 0;
      const aliqua = new eiusmod(labore.ut(sit / ut));

      while (dolore < sit) {
        aliqua[magna++] = consectetur(ipsum, dolore, (dolore += ut));
      }
    
      return aliqua;
    }
    ```

Ends up being rendered as:
——————————————————————————————

function lorem(ipsum, dolor = 1) {
  const sit = ipsum == null ? 0 : ipsum.sit;
  dolor = sit - amet(dolor);
  return sit ? consectetur(ipsum, 0, dolor < 0 ? 0 : dolor) : [];
}

function adipiscing(...elit) {
  if (!elit.sit) {
    return [];
  }

  const sed = elit[0];
  return eiusmod.tempor(sed) ? sed : [sed];
}

function incididunt(ipsum, ut = 1) {
  ut = labore.et(amet(ut), 0);
  const sit = ipsum == null ? 0 : ipsum.sit;

  if (!sit || ut < 1) {
    return [];
  }

  let dolore = 0;
  let magna = 0;
  const aliqua = new eiusmod(labore.ut(sit / ut));

  while (dolore < sit) {
    aliqua[magna++] = consectetur(ipsum, dolore, (dolore += ut));
  }

  return aliqua;
}

`
——————————————————————————————
^ superfluous last backtick

This is the actual code fragment that is generated by above:

<pre style="color:#93a1a1;background-color:#002b36"><span style="color:#268bd2">function</span> lorem(ipsum, dolor <span style="color:#719e07">=</span> <span style="color:#2aa198">1</span>) {
  <span style="color:#268bd2">const</span> sit <span style="color:#719e07">=</span> ipsum <span style="color:#719e07">==</span> <span style="color:#cb4b16">null</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> ipsum.sit;
  dolor <span style="color:#719e07">=</span> sit <span style="color:#719e07">-</span> amet(dolor);
  <span style="color:#719e07">return</span> sit <span style="color:#719e07">?</span> consectetur(ipsum, <span style="color:#2aa198">0</span>, dolor <span style="color:#719e07">&lt;</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> dolor) <span style="color:#719e07">:</span> [];
}

<span style="color:#268bd2">function</span> adipiscing(...elit) {
  <span style="color:#719e07">if</span> (<span style="color:#719e07">!</span>elit.sit) {
    <span style="color:#719e07">return</span> [];
  }

  <span style="color:#268bd2">const</span> sed <span style="color:#719e07">=</span> elit[<span style="color:#2aa198">0</span>];
  <span style="color:#719e07">return</span> eiusmod.tempor(sed) <span style="color:#719e07">?</span> sed <span style="color:#719e07">:</span> [sed];
}

<span style="color:#268bd2">function</span> incididunt(ipsum, ut <span style="color:#719e07">=</span> <span style="color:#2aa198">1</span>) {
  ut <span style="color:#719e07">=</span> labore.et(amet(ut), <span style="color:#2aa198">0</span>);
  <span style="color:#268bd2">const</span> sit <span style="color:#719e07">=</span> ipsum <span style="color:#719e07">==</span> <span style="color:#cb4b16">null</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> ipsum.sit;

  <span style="color:#719e07">if</span> (<span style="color:#719e07">!</span>sit <span style="color:#719e07">||</span> ut <span style="color:#719e07">&lt;</span> <span style="color:#2aa198">1</span>) {
    <span style="color:#719e07">return</span> [];
  }

  <span style="color:#268bd2">let</span> dolore <span style="color:#719e07">=</span> <span style="color:#2aa198">0</span>;
  <span style="color:#268bd2">let</span> magna <span style="color:#719e07">=</span> <span style="color:#2aa198">0</span>;
  <span style="color:#268bd2">const</span> aliqua <span style="color:#719e07">=</span> <span style="color:#719e07">new</span> eiusmod(labore.ut(sit <span style="color:#719e07">/</span> ut));

  <span style="color:#719e07">while</span> (dolore <span style="color:#719e07">&lt;</span> sit) {
    aliqua[magna<span style="color:#719e07">++</span>] <span style="color:#719e07">=</span> consectetur(ipsum, dolore, (dolore <span style="color:#719e07">+=</span> ut));
  }

  <span style="color:#719e07">return</span> aliqua;
}
</pre><p>`</p>

Thanks for taking a look!

How to remove all nodes with NodeType in ASTTransformer?

func (*testTransformer) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
    processNodes(node)
}

func processNodes(n ast.Node) {
    if n.Kind() == ast.KindHeading {
        if p := n.Parent(); p != nil {
            p.RemoveChild(p, n)
        }
        return
    }
    for c := n.FirstChild(); c != nil; c = n.NextSibling() {
        processNodes(c)
    }
}

Source markdown:

# Header 1

text

## Header 2

text

Result:

<p>text</p>
<h2>Header 2</h2>
<p>text</p>
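
Two things in the loop above would explain why only the first heading disappears. The loop advances with n.NextSibling(), which is the sibling of the parent rather than of the current child, so it stops after one child. Even with c.NextSibling(), the sibling would need to be read before the child is removed, assuming removal clears the removed node's sibling links. The sketch below shows the "remember the next sibling first" pattern on a hypothetical minimal tree type, not on goldmark's ast.Node.

```go
package main

import "fmt"

// node is a hypothetical stand-in for an AST node with sibling links.
type node struct {
	kind                            string
	parent, first, last, next, prev *node
}

// appendChild adds c as the last child of n.
func (n *node) appendChild(c *node) {
	c.parent, c.prev = n, n.last
	if n.last != nil {
		n.last.next = c
	} else {
		n.first = c
	}
	n.last = c
}

// removeChild unlinks c from n and clears c's own links, which is why
// callers must read c.next before calling it.
func (n *node) removeChild(c *node) {
	if c.prev != nil {
		c.prev.next = c.next
	} else {
		n.first = c.next
	}
	if c.next != nil {
		c.next.prev = c.prev
	} else {
		n.last = c.prev
	}
	c.parent, c.prev, c.next = nil, nil, nil
}

// removeKind deletes every descendant of n whose kind matches.
func removeKind(n *node, kind string) {
	for c := n.first; c != nil; {
		next := c.next // remember the sibling before c is unlinked
		if c.kind == kind {
			n.removeChild(c)
		} else {
			removeKind(c, kind)
		}
		c = next
	}
}

// childKinds lists the kinds of n's direct children in order.
func (n *node) childKinds() []string {
	var kinds []string
	for c := n.first; c != nil; c = c.next {
		kinds = append(kinds, c.kind)
	}
	return kinds
}

func main() {
	doc := &node{kind: "document"}
	for _, k := range []string{"heading", "paragraph", "heading", "paragraph"} {
		doc.appendChild(&node{kind: k})
	}
	removeKind(doc, "heading")
	fmt.Println(doc.childKinds())
}
```

Translated to the transformer above, the same loop shape would use FirstChild, NextSibling, and RemoveChild in place of the toy fields and methods.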

Hard line breaks not rendered in files with Windows-style line endings

Hello

Member of the Hugo team here. Currently testing Goldmark as the new default in Hugo 0.60.0 DEV.

Apparently hard line breaks as specified in CommonMark 0.29 are not rendered by Goldmark for markdown files with Windows-style line endings.

In a collaborative project that I maintain files can be edited by other team members on Windows.
Typically we use two spaces for a line break.

But I only managed to render the line break after using dos2unix to convert the line endings from DOS to UNIX like so: dos2unix some-file.md.

cc: @bep

Release Notes

Please add Release Notes to your releases!

You may borrow the script/tools that I use for this: tools/release.sh

This will help a lot for folks watching releases and not all commits :)

Consider adding a context (data holder) to Render

This is a follow up to #37

So, setting state on the nodes in the AST and then using that while rendering works, but ...

  • It makes for some fairly clumsy and verbose code
  • It breaks the separation of concerns (adding rendering code to the parser)

What I'm now doing instead is something like:

	w := renderContext{
		BufWriter: bufio.NewWriter(buf),
		renderContextData: renderContextDataHolder{
			rctx: ctx,
			dctx: c.ctx,
		},
	}

	if err := c.md.Renderer().Render(w, ctx.Src, doc); err != nil {
		return nil, err
	}

This works great, and I don't mind doing it like this (this is entirely internal), but the downside is that it may stop working in the future if you decide to wrap the writer or something.
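
The trick and its fragility can be shown in isolation. In the sketch below, a hypothetical renderContext type embeds a *bufio.Writer and carries extra data; a hypothetical renderLink function receives only an io.Writer and recovers the data with a type assertion. None of these names come from goldmark. The second call wraps the writer once more, which is the scenario the concern above describes.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// renderContext is a hypothetical writer that carries per-document data.
type renderContext struct {
	*bufio.Writer
	baseURL string // example of state needed while rendering
}

// renderLink is a hypothetical node renderer. It only sees an io.Writer
// and recovers the extra data with a type assertion when possible.
func renderLink(w io.Writer, href string) error {
	if rc, ok := w.(*renderContext); ok {
		href = rc.baseURL + href
	}
	_, err := fmt.Fprintf(w, `<a href="%s">`, href)
	return err
}

// demo renders one link, optionally wrapping the context writer in
// another buffered writer first, and returns what was written.
func demo(wrap bool) string {
	var buf bytes.Buffer
	rc := &renderContext{Writer: bufio.NewWriter(&buf), baseURL: "/docs/"}

	var w io.Writer = rc
	var outer *bufio.Writer
	if wrap {
		outer = bufio.NewWriter(rc) // hides the concrete type from renderLink
		w = outer
	}
	if err := renderLink(w, "page"); err != nil {
		return err.Error()
	}
	if outer != nil {
		outer.Flush()
	}
	rc.Flush()
	return buf.String()
}

func main() {
	fmt.Println(demo(false)) // context data is found
	fmt.Println(demo(true))  // context data is lost behind the wrapper
}
```

An explicit context argument on the render call would avoid depending on the concrete writer type, which is what the issue title asks to consider.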

Support footnote return links

goldmark v1.1.7
with Hugo 0.60.0

As mentioned in gohugoio/hugo/issues/6551, Goldmark does not seem to support footnote return links, although they are supported by PHP Markdown Extra.

It would be great if Goldmark supported them.

Thank you very much in advance.

Markdown:

That's some text with a footnote.[^1]

[^1]: And that's the footnote.

Output:

…
<section class="footnotes" role="doc-endnotes"><hr><ol><li id="fn:1" role="doc-endnote"><p>And that's the footnote.</p></li></ol></section>

Rendering "class" attribute

Hi @yuin
I'm trying to append class="..." to all img tags and am wondering whether something like this would make sense to add (of course it's simply a hard-coded example for the "class" attribute only):

zzwx-forks@11441f5

This way users wouldn't have to completely rewrite the render function when something as simple as adding a class is needed, and they wouldn't risk breaking their code when the library gets updated.

This is my use case:

case *ast.Image:
  if entering {
    n.SetAttributeString("class", "img-fluid")
  }

Fuzz crasher in parser/attribute.go:102

Please answer the following before submitting your issue:

  1. What version of goldmark are you using? : v1.1.10
  2. What version of Go are you using? : go1.13.4
  3. What operating system and processor architecture are you using? : linux/amd64
  4. What did you do? : Merge #54 and then make fuzz
  5. What did you expect to see? : boredom
  6. What did you see instead? :
sh$ cat fuzz/crashers/db0b78ba444c6efd83f1d4f6f74faab82aaf3cb5.quoted
        "{\n-"

sh$ cat fuzz/crashers/db0b78ba444c6efd83f1d4f6f74faab82aaf3cb5.output
panic: runtime error: index out of range [0] with length 0

goroutine 1 [running]:
github.com/yuin/goldmark/parser.parseAttribute(0x688e40, 0xc0001bd650, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f33d7475000)
        /go/src/github.com/yuin/goldmark/parser/attribute.go:102 +0xa9b
github.com/yuin/goldmark/parser.ParseAttributes(0x688e40, 0xc0001bd650, 0x0, 0x1, 0x0, 0x1)
        /go/src/github.com/yuin/goldmark/parser/attribute.go:61 +0x1e2
github.com/yuin/goldmark/parser.parseLastLineAttributes(0x689b80, 0xc00016a090, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/atx_heading.go:229 +0x429
github.com/yuin/goldmark/parser.(*setextHeadingParser).Close(0xc000117c70, 0x689b80, 0xc00016a090, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/setext_headings.go:107 +0x56e
github.com/yuin/goldmark/parser.(*parser).closeBlocks(0xc000197500, 0x0, 0x0, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/parser.go:845 +0x199
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc000197500, 0x689700, 0xc00001e980, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1023 +0xc12
github.com/yuin/goldmark/parser.(*parser).Parse(0xc000197500, 0x688e40, 0xc0001bd500, 0x0, 0x0, 0x0, 0x30, 0x633700)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc000119e80, 0x7f33d7475000, 0x3, 0x3, 0x685fa0, 0xc00007cff0, 0x0, 0x0, 0x0, 0x9, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7f33d7475000, 0x3, 0x3, 0x4)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:34 +0x43c
go-fuzz-dep.Main(0xc00026ff48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52

Meta

It seems that there should be one more built-in goldmark extension: a YAML metadata block. Then real GFM would be fully supported.

Greater-than sign breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {   
   var s1 = []byte("https://github.com?q=stars:>1")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com?q=stars:">https://github.com?q=stars:</a>&gt;1</p>

With the github.com parser, I get this result:

<p><a href="https://github.com?q=stars:%3E1">https://github.com?q=stars:&gt;1</a></p>

Example:

https://github.com?q=stars:>1

question: Passing state to a rendering extension

I'm in the process of creating some link/image extensions that would allow for link resolution/image resize etc.

For that to work, I need to pass on some document state to the custom link renderer. But I don't see how.

The Parse method can take a context, but I don't see a similar way to pass a struct via Render. I could create a new goldmark.Markdown for each document, but that sounds wasteful.

Fuzz crash on "[^000]:0\t[^]:"

Please answer the following before submitting your issue:

  1. What version of goldmark are you using? : v1.1.9
  2. What version of Go are you using? : go1.13.4
  3. What operating system and processor architecture are you using? : linux/amd64
  4. What did you do? : make fuzz
  5. What did you expect to see? : boredom
  6. What did you see instead? :
sh$ cat fuzz/crashers/374f2bf4f9cd8bb2d4737a8bcb30f74ea5ef9e10.quoted
        "[^000]:0\t[^]:"
sh$ cat fuzz/crashers/374f2bf4f9cd8bb2d4737a8bcb30f74ea5ef9e10.output
panic: runtime error: slice bounds out of range [:14] with capacity 13

goroutine 1 [running]:
github.com/yuin/goldmark/text.(*Segment).Value(0xc00026edc0, 0x7f2e89c6c000, 0xd, 0xd, 0x7f2e89c6c009, 0x0, 0x0)
        /go/src/github.com/yuin/goldmark/text/segment.go:44 +0x33f
github.com/yuin/goldmark/text.(*reader).Value(0xc0001bd7a0, 0xe, 0xe, 0x0, 0x0, 0xd, 0x3)
        /go/src/github.com/yuin/goldmark/text/reader.go:106 +0x62
github.com/yuin/goldmark/extension.(*footnoteBlockParser).Open(0x7e2060, 0x689480, 0xc000125f40, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880, 0x1, 0xc000125f40, 0x8)
        /go/src/github.com/yuin/goldmark/extension/footnote.go:55 +0x294
github.com/yuin/goldmark/parser.(*parser).openBlocks(0xc0001d4000, 0x689480, 0xc000125f40, 0x0, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880, 0x2)
        /go/src/github.com/yuin/goldmark/parser/parser.go:908 +0x481
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001d4000, 0x688040, 0xc00001ef80, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1008 +0x218
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001d4000, 0x687780, 0xc0001bd7a0, 0x0, 0x0, 0x0, 0x30, 0x632200)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc0001c88c0, 0x7f2e89c6c000, 0xd, 0xd, 0x6849e0, 0xc00007d9b0, 0x0, 0x0, 0x0, 0x24f90bed, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7f2e89c6c000, 0xd, 0xd, 0x3)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:23 +0x269
go-fuzz-dep.Main(0xc00026ff48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
exit status 2

Fuzz crash on ">*\t>\n> \t0\n>\t\t0\n>0"

Please answer the following before submitting your issue:

  1. What version of goldmark are you using? : v1.1.9
  2. What version of Go are you using? : go1.13.4
  3. What operating system and processor architecture are you using? : linux/amd64
  4. What did you do? : make fuzz
  5. What did you expect to see? : boredom
  6. What did you see instead? :
sh$ cat fuzz/crashers/db13717bee8cb87337140ed44b4f9bc01214e3fb.quoted
        ">*\t>\n> \t0\n>\t\t0\n>0"
sh$ cat fuzz/crashers/db13717bee8cb87337140ed44b4f9bc01214e3fb.output
panic: interface conversion: ast.Node is *ast.CodeBlock, not *ast.ListItem

goroutine 1 [running]:
github.com/yuin/goldmark/parser.lastOffset(0x688820, 0xc000190630, 0x1)
        /go/src/github.com/yuin/goldmark/parser/list.go:102 +0xfd
github.com/yuin/goldmark/parser.(*listParser).Continue(0x7e2060, 0x688820, 0xc000190630, 0x687780, 0xc0001e97a0, 0x687840, 0xc0001e9880, 0xa)
        /go/src/github.com/yuin/goldmark/parser/list.go:192 +0x24e
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001f2000, 0x688040, 0xc00001ec00, 0x687780, 0xc0001e97a0, 0x687840, 0xc0001e9880)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1032 +0x558
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001f2000, 0x687780, 0xc0001e97a0, 0x0, 0x0, 0x0, 0x30, 0x632200)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc000078b00, 0x7ff83a71d000, 0x11, 0x11, 0x6849e0, 0xc0000959b0, 0x0, 0x0, 0x0, 0x3a2334ec, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7ff83a71d000, 0x11, 0x11, 0x3)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:23 +0x269
go-fuzz-dep.Main(0xc00029bf48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
exit status 2
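
The crasher inputs quoted in the fuzz reports above lend themselves to a small regression loop. The sketch below turns a panic into an error so that one crashing input does not abort the loop. The names guard and fakeConvert are hypothetical, and fakeConvert is only a placeholder with a simulated bug; in a real check it would be replaced by a function that calls the converter under test.

```go
package main

import "fmt"

// guard runs convert on input and reports a panic as an error.
func guard(convert func([]byte) error, input []byte) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic on %q: %v", input, r)
		}
	}()
	return convert(input)
}

func main() {
	// Inputs copied from the crasher files quoted in the reports.
	inputs := []string{
		"{\n-",
		"[^000]:0\t[^]:",
		">*\t>\n> \t0\n>\t\t0\n>0",
	}

	// fakeConvert is a placeholder, not a real converter. It simulates an
	// "index out of range" bug for inputs that start with '{'.
	fakeConvert := func(b []byte) error {
		if len(b) > 0 && b[0] == '{' {
			var empty []byte
			_ = empty[0]
		}
		return nil
	}

	for _, in := range inputs {
		if err := guard(fakeConvert, []byte(in)); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Printf("ok: %q\n", in)
	}
}
```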
