goldmark

A Markdown parser written in Go. Easy to extend, standards-compliant, well-structured.

goldmark is compliant with CommonMark 0.31.2.

goldmark playground: Try goldmark online. This playground is built with WASM (5-10 MB).

Motivation

I needed a Markdown parser for Go that satisfies the following requirements:

Easy to extend.
- Markdown is less expressive than other lightweight markup languages such as reStructuredText.
- We have extensions to the Markdown syntax, e.g. PHP Markdown Extra, GitHub Flavored Markdown.
Standards-compliant.
- Markdown has many dialects.
- GitHub-Flavored Markdown is widely used and is based upon CommonMark, effectively mooting the question of whether or not CommonMark is an ideal specification.
  - CommonMark is complicated and hard to implement.
Well-structured.
- AST-based; preserves source position of nodes.
Written in pure Go.

golang-commonmark may be a good choice, but it seems to be a copy of markdown-it.

blackfriday.v2 is a fast and widely-used implementation, but is not CommonMark-compliant and cannot be extended from outside of the package, since its AST uses structs instead of interfaces.

Furthermore, its behavior differs from other implementations in some cases, especially regarding lists: Deep nested lists don't output correctly #329, List block cannot have a second line #244, etc.

This behavior sometimes causes problems. If you migrate your Markdown text from GitHub to blackfriday-based wikis, many lists will immediately be broken.

As mentioned above, CommonMark is complicated and hard to implement, so Markdown parsers based on CommonMark are few and far between.

Features

Standards-compliant. goldmark is fully compliant with the latest CommonMark specification.
Extensible. Do you want to add a @username mention syntax to Markdown? You can easily do so in goldmark. You can add your own AST nodes, parsers for block-level elements, parsers for inline-level elements, transformers for paragraphs, transformers for the whole AST structure, and renderers.
Performance. goldmark's performance is on par with that of cmark, the CommonMark reference implementation written in C.
Robust. goldmark is tested with go test --fuzz.
Built-in extensions. goldmark ships with common extensions like tables, strikethrough, task lists, and definition lists.
Depends only on standard libraries.

Installation

$ go get github.com/yuin/goldmark

Usage

Import packages:

import (
    "bytes"
    "github.com/yuin/goldmark"
)

Convert Markdown documents with the CommonMark-compliant mode:

var buf bytes.Buffer
if err := goldmark.Convert(source, &buf); err != nil {
  panic(err)
}

With options

var buf bytes.Buffer
if err := goldmark.Convert(source, &buf, parser.WithContext(ctx)); err != nil {
  panic(err)
}

Functional option	Type	Description
`parser.WithContext`	A `parser.Context`	Context for the parsing phase.

Context options

Functional option	Type	Description
`parser.WithIDs`	A `parser.IDs`	`IDs` allows you to change the logic related to element IDs (e.g. auto heading ID generation).

Custom parser and renderer

import (
    "bytes"
    "github.com/yuin/goldmark"
    "github.com/yuin/goldmark/extension"
    "github.com/yuin/goldmark/parser"
    "github.com/yuin/goldmark/renderer/html"
)

md := goldmark.New(
          goldmark.WithExtensions(extension.GFM),
          goldmark.WithParserOptions(
              parser.WithAutoHeadingID(),
          ),
          goldmark.WithRendererOptions(
              html.WithHardWraps(),
              html.WithXHTML(),
          ),
      )
var buf bytes.Buffer
if err := md.Convert(source, &buf); err != nil {
    panic(err)
}

Functional option	Type	Description
`goldmark.WithParser`	`parser.Parser`	This option must be passed before `goldmark.WithParserOptions` and `goldmark.WithExtensions`
`goldmark.WithRenderer`	`renderer.Renderer`	This option must be passed before `goldmark.WithRendererOptions` and `goldmark.WithExtensions`
`goldmark.WithParserOptions`	`...parser.Option`
`goldmark.WithRendererOptions`	`...renderer.Option`
`goldmark.WithExtensions`	`...goldmark.Extender`

Parser and Renderer options

Parser options

Functional option	Type	Description
`parser.WithBlockParsers`	A `util.PrioritizedSlice` whose elements are `parser.BlockParser`	Parsers for parsing block level elements.
`parser.WithInlineParsers`	A `util.PrioritizedSlice` whose elements are `parser.InlineParser`	Parsers for parsing inline level elements.
`parser.WithParagraphTransformers`	A `util.PrioritizedSlice` whose elements are `parser.ParagraphTransformer`	Transformers for transforming paragraph nodes.
`parser.WithASTTransformers`	A `util.PrioritizedSlice` whose elements are `parser.ASTTransformer`	Transformers for transforming an AST.
`parser.WithAutoHeadingID`	`-`	Enables auto heading ids.
`parser.WithAttribute`	`-`	Enables custom attributes. Currently only headings support attributes.

HTML Renderer options

Functional option	Type	Description
`html.WithWriter`	`html.Writer`	`html.Writer` for writing contents to an `io.Writer`.
`html.WithHardWraps`	`-`	Render newlines as `<br>`.
`html.WithXHTML`	`-`	Render as XHTML.
`html.WithUnsafe`	`-`	By default, goldmark does not render raw HTML or potentially dangerous links. With this option, goldmark renders such content as written.

Built-in extensions

extension.Table
- GitHub Flavored Markdown: Tables
extension.Strikethrough
- GitHub Flavored Markdown: Strikethrough
extension.Linkify
- GitHub Flavored Markdown: Autolinks
extension.TaskList
- GitHub Flavored Markdown: Task list items
extension.GFM
- This extension enables Table, Strikethrough, Linkify and TaskList.
- This extension does not filter tags defined in 6.11: Disallowed Raw HTML (extension). If you need to filter HTML tags, see Security.
- If you need to parse GitHub emojis, you can use the goldmark-emoji extension.
extension.DefinitionList
- PHP Markdown Extra: Definition lists
extension.Footnote
- PHP Markdown Extra: Footnotes
extension.Typographer
- This extension substitutes punctuation with typographic entities, like smartypants.
extension.CJK
- This extension is a shortcut for CJK-related functionality.

Attributes

The parser.WithAttribute option allows you to define attributes on some elements.

Currently only headings support attributes.

Attributes are being discussed in the CommonMark forum. This syntax may possibly change in the future.

Headings

## heading ## {#id .className attrName=attrValue class="class1 class2"}

## heading {#id .className attrName=attrValue class="class1 class2"}

heading {#id .className attrName=attrValue}
============

Table extension

The Table extension implements Table (extension), as defined in the GitHub Flavored Markdown Spec.

The spec is defined for XHTML, so it uses some attributes that are deprecated in HTML5.

You can override the alignment rendering method via options.

Functional option	Type	Description
`extension.WithTableCellAlignMethod`	`extension.TableCellAlignMethod`	Indicates how table cells are aligned.

Typographer extension

The Typographer extension translates plain ASCII punctuation characters into typographic-punctuation HTML entities.

Default substitutions are:

Punctuation	Default entity
`'`	`&lsquo;`, `&rsquo;`
`"`	`&ldquo;`, `&rdquo;`
`--`	`&ndash;`
`---`	`&mdash;`
`...`	`&hellip;`
`<<`	`&laquo;`
`>>`	`&raquo;`

You can override the default substitutions via extension.WithTypographicSubstitutions:

markdown := goldmark.New(
    goldmark.WithExtensions(
        extension.NewTypographer(
            extension.WithTypographicSubstitutions(extension.TypographicSubstitutions{
                extension.LeftSingleQuote:  []byte("&sbquo;"),
                extension.RightSingleQuote: nil, // nil disables a substitution
            }),
        ),
    ),
)

Linkify extension

The Linkify extension implements Autolinks (extension), as defined in the GitHub Flavored Markdown Spec.

Since the spec does not define details about URLs, there are numerous ambiguous cases.

You can override autolinking patterns via options.

Functional option	Type	Description
`extension.WithLinkifyAllowedProtocols`	`[][]byte | []string`	List of allowed protocols such as `[]string{ "http:" }`
`extension.WithLinkifyURLRegexp`	`*regexp.Regexp`	Regexp that defines URLs, including protocols
`extension.WithLinkifyWWWRegexp`	`*regexp.Regexp`	Regexp that defines URLs starting with `www.`. This pattern corresponds to the extended www autolink
`extension.WithLinkifyEmailRegexp`	`*regexp.Regexp`	Regexp that defines email addresses

Example, using xurls:

import "mvdan.cc/xurls/v2"

markdown := goldmark.New(
    goldmark.WithRendererOptions(
        html.WithXHTML(),
        html.WithUnsafe(),
    ),
    goldmark.WithExtensions(
        extension.NewLinkify(
            extension.WithLinkifyAllowedProtocols([]string{
                "http:",
                "https:",
            }),
            extension.WithLinkifyURLRegexp(
                xurls.Strict(),
            ),
        ),
    ),
)

Footnotes extension

The Footnote extension implements PHP Markdown Extra: Footnotes.

This extension has some options:

Functional option	Type	Description
`extension.WithFootnoteIDPrefix`	`[]byte | string`	A prefix for the id attributes.
`extension.WithFootnoteIDPrefixFunction`	`func(gast.Node) []byte`	A function that determines the id attribute prefix for a given Node.
`extension.WithFootnoteLinkTitle`	`[]byte | string`	An optional title attribute for footnote links.
`extension.WithFootnoteBacklinkTitle`	`[]byte | string`	An optional title attribute for footnote backlinks.
`extension.WithFootnoteLinkClass`	`[]byte | string`	A class for footnote links. This defaults to `footnote-ref`.
`extension.WithFootnoteBacklinkClass`	`[]byte | string`	A class for footnote backlinks. This defaults to `footnote-backref`.
`extension.WithFootnoteBacklinkHTML`	`[]byte | string`	The HTML content of footnote backlinks. This defaults to `↩︎`.

Some options can have special substitutions. Occurrences of “^^” in the string will be replaced by the corresponding footnote number in the HTML output. Occurrences of “%%” will be replaced by a number for the reference (footnotes can have multiple references).
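The substitution rule can be pictured with a small standalone sketch. This is not goldmark's implementation; `expandFootnotePlaceholders` and its parameters are hypothetical names that only illustrate how "^^" and "%%" might be expanded in a title string.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expandFootnotePlaceholders is a hypothetical helper (not part of goldmark).
// It replaces "^^" with the footnote number and "%%" with the reference number.
func expandFootnotePlaceholders(template string, footnoteNumber, referenceNumber int) string {
	r := strings.NewReplacer(
		"^^", strconv.Itoa(footnoteNumber),
		"%%", strconv.Itoa(referenceNumber),
	)
	return r.Replace(template)
}

func main() {
	// A backlink title for the first reference to footnote 2.
	fmt.Println(expandFootnotePlaceholders("Back to reference %% of footnote ^^", 2, 1))
}
```

A title such as "Back to reference %% of footnote ^^" would therefore read differently for every backlink, which is useful when one footnote is referenced several times.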

extension.WithFootnoteIDPrefix and extension.WithFootnoteIDPrefixFunction are useful if you display multiple Markdown documents inside one HTML document and need to keep their footnote ids from clashing with each other.

extension.WithFootnoteIDPrefix sets a fixed id prefix, so you may write code like the following:

for _, path := range files {
    source := readAll(path)
    prefix := getPrefix(path)

    markdown := goldmark.New(
        goldmark.WithExtensions(
            extension.NewFootnote(
                // Use the per-file prefix so ids from different files do not clash.
                extension.WithFootnoteIDPrefix(prefix),
            ),
        ),
    )
    var b bytes.Buffer
    err := markdown.Convert(source, &b)
    if err != nil {
        t.Error(err.Error())
    }
}

extension.WithFootnoteIDPrefixFunction determines an id prefix by calling the given function, so you may write code like the following:

markdown := goldmark.New(
    goldmark.WithExtensions(
        extension.NewFootnote(
            extension.WithFootnoteIDPrefixFunction(func(n gast.Node) []byte {
                // Look up the prefix stored in the document's metadata.
                v, ok := n.OwnerDocument().Meta()["footnote-prefix"]
                if ok {
                    return util.StringToReadOnlyBytes(v.(string))
                }
                return nil
            }),
        ),
    ),
)

for _, path := range files {
    source := readAll(path)
    var b bytes.Buffer

    doc := markdown.Parser().Parse(text.NewReader(source))
    // Store the per-file prefix before rendering.
    doc.OwnerDocument().Meta()["footnote-prefix"] = getPrefix(path)
    err := markdown.Renderer().Render(&b, source, doc)
    if err != nil {
        t.Error(err.Error())
    }
}

You can use goldmark-meta to define an id prefix in the Markdown document:

---
title: document title
slug: article1
footnote-prefix: article1
---

# My article

CJK extension

CommonMark gives high priority to compatibility, and the original Markdown was designed by Westerners, so CommonMark gives little consideration to CJK (Chinese, Japanese, Korean) languages.

This extension provides additional options for CJK users.

Functional option	Type	Description
`extension.WithEastAsianLineBreaks`	`...extension.EastAsianLineBreaksStyle`	Soft line breaks are rendered as a newline, which some East Asian readers see as an unnecessary space. With this option, soft line breaks between East Asian wide characters are ignored. This defaults to `EastAsianLineBreaksStyleSimple`.
`extension.WithEscapedSpace`	`-`	Without spaces around it, an emphasis that starts with East Asian punctuation is not interpreted as an emphasis (as defined in the CommonMark spec). With this option, you can avoid this inconvenient behavior by putting 'not rendered' spaces around an emphasis, as in `太郎は\ 「こんにちわ」\ といった`.

Styles of Line Breaking

Style	Description
`EastAsianLineBreaksStyleSimple`	Soft line breaks are ignored if both sides of the break are East Asian wide characters. This behavior is the same as `east_asian_line_breaks` in Pandoc.
`EastAsianLineBreaksCSS3Draft`	This option implements the CSS Text Level 3 Segment Break Transformation Rules with some enhancements.

Example of `EastAsianLineBreaksStyleSimple`

Input Markdown:

私はプログラマーです。
東京の会社に勤めています。
GoでWebアプリケーションを開発しています。

Output:

<p>私はプログラマーです。東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。</p>

Example of `EastAsianLineBreaksCSS3Draft`

Input Markdown:

私はプログラマーです。
東京の会社に勤めています。
GoでWebアプリケーションを開発しています。

Output:

<p>私はプログラマーです。東京の会社に勤めています。GoでWebアプリケーションを開発しています。</p>

Security

By default, goldmark does not render raw HTML or potentially-dangerous URLs. If you need to gain more control over untrusted contents, it is recommended that you use an HTML sanitizer such as bluemonday.

Benchmark

You can run this benchmark in the _benchmark directory.

against other golang libraries

blackfriday v2 seems to be the fastest, but as it is not CommonMark compliant, its performance cannot be directly compared to that of the CommonMark-compliant libraries.

goldmark, meanwhile, builds a clean, extensible AST structure, achieves full compliance with CommonMark, and consumes less memory, all while being reasonably fast.

MBP 2019 13″(i5, 16GB), Go1.17

BenchmarkMarkdown/Blackfriday-v2-8                   302           3743747 ns/op         3290445 B/op      20050 allocs/op
BenchmarkMarkdown/GoldMark-8                         280           4200974 ns/op         2559738 B/op      13435 allocs/op
BenchmarkMarkdown/CommonMark-8                       226           5283686 ns/op         2702490 B/op      20792 allocs/op
BenchmarkMarkdown/Lute-8                              12          92652857 ns/op        10602649 B/op      40555 allocs/op
BenchmarkMarkdown/GoMarkdown-8                        13          81380167 ns/op         2245002 B/op      22889 allocs/op

against cmark (CommonMark reference implementation written in C)

MBP 2019 13″(i5, 16GB), Go1.17

----------- cmark -----------
file: _data.md
iteration: 50
average: 0.0044073057 sec
------- goldmark -------
file: _data.md
iteration: 50
average: 0.0041611990 sec

As you can see, goldmark's performance is on par with cmark's.

Extensions

List of extensions

goldmark-meta: A YAML metadata extension for the goldmark Markdown parser.
goldmark-highlighting: A syntax-highlighting extension for the goldmark markdown parser.
goldmark-emoji: An emoji extension for the goldmark Markdown parser.
goldmark-mathjax: Mathjax support for the goldmark markdown parser
goldmark-pdf: A PDF renderer that can be passed to goldmark.WithRenderer().
goldmark-hashtag: Adds support for #hashtag-based tagging to goldmark.
goldmark-wikilink: Adds support for [[wiki]]-style links to goldmark.
goldmark-anchor: Adds anchors (permalinks) next to all headers in a document.
goldmark-figure: Adds support for rendering paragraphs starting with an image to <figure> elements.
goldmark-frontmatter: Adds support for YAML, TOML, and custom front matter to documents.
goldmark-toc: Adds support for generating tables-of-contents for goldmark documents.
goldmark-mermaid: Adds support for rendering Mermaid diagrams in goldmark documents.
goldmark-pikchr: Adds support for rendering Pikchr diagrams in goldmark documents.
goldmark-embed: Adds support for rendering embeds from YouTube links.
goldmark-latex: A $\LaTeX$ renderer that can be passed to goldmark.WithRenderer().
goldmark-fences: Support for pandoc-style fenced divs in goldmark.
goldmark-d2: Adds support for D2 diagrams.
goldmark-katex: Adds support for KaTeX math and equations.
goldmark-img64: Adds support for embedding images into the document as DataURL (base64 encoded).
goldmark-enclave: Adds support for embedding youtube/bilibili video, X's oembed tweet, tradingview's chart, quail's widget into the document.
goldmark-wiki-table: Adds support for embedding Wiki Tables.
goldmark-tgmd: A Telegram markdown renderer that can be passed to goldmark.WithRenderer().

Loading extensions at runtime

goldmark-dynamic allows you to write a goldmark extension in Lua and load it at runtime without re-compilation.

Please refer to goldmark-dynamic for details.

goldmark internal(for extension developers)

Overview

goldmark's Markdown processing is outlined in the diagram below.

            <Markdown in []byte, parser.Context>
                           |
                           V
            +-------- parser.Parser ---------------------------
            | 1. Parse block elements into AST
            |   1. If a parsed block is a paragraph, apply 
            |      parser.ParagraphTransformer
            | 2. Traverse AST and parse inline elements of each block.
            |   1. Process delimiters(emphasis) at the end of
            |      block parsing
            | 3. Apply parser.ASTTransformers to AST
                           |
                           V
                      <ast.Node>
                           |
                           V
            +------- renderer.Renderer ------------------------
            | 1. Traverse AST and apply renderer.NodeRenderer
            |    corresponding to the node type

                           |
                           V
                        <Output>

Parsing

Markdown documents are read through the text.Reader interface.

AST nodes do not hold concrete text. Instead, they hold segment information for the document, represented by text.Segment.

text.Segment has 3 attributes: Start, Stop, and Padding.
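The idea of a node holding offsets rather than text can be sketched without goldmark itself. The `segment` type below is a hypothetical stand-in written for this illustration; its field names and the way padding is treated as leading spaces are assumptions of the sketch, not a copy of text.Segment.

```go
package main

import "fmt"

// segment is a hypothetical stand-in for text.Segment: it stores offsets
// into the source buffer instead of a copy of the text.
type segment struct {
	Start   int // offset of the first byte of the segment
	Stop    int // offset just past the last byte of the segment
	Padding int // assumed here to mean spaces to put before the text
}

// value materializes the segment's text from the source buffer.
func (s segment) value(source []byte) []byte {
	out := make([]byte, 0, s.Padding+s.Stop-s.Start)
	for i := 0; i < s.Padding; i++ {
		out = append(out, ' ')
	}
	return append(out, source[s.Start:s.Stop]...)
}

func main() {
	source := []byte("# Hello world")
	seg := segment{Start: 2, Stop: 7, Padding: 1}
	fmt.Printf("%q\n", seg.value(source))
}
```

Because only offsets are stored, the source buffer has to stay available for as long as the AST is used, which is why rendering takes the source alongside the node.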

(TBC)

TODO

See extension directory for examples of extensions.

Summary:

Define AST Node as a struct in which ast.BaseBlock or ast.BaseInline is embedded.
Write a parser that implements parser.BlockParser or parser.InlineParser.
Write a renderer that implements renderer.NodeRenderer.
Define your goldmark extension that implements goldmark.Extender.

Donation

BTC: 1NEDSyUmo4SMTDP83JJQSWi1MvQUGGNMZB

License

MIT

Author

Yusuke Inuzuka


goldmark's Issues

Specially designed list_item may cause goldmark to loop infinitely

Sample:

*[TAB]A
[space][space][space][space]B

Header attributes

It seems that parser.WithAttribute() processing does not always work well.

https://github.com/mironovalexey/gm-test/tree/master/hattrs

https://github.com/mironovalexey/gm-test/blob/master/hattrs/test.md

slice bounds out of range on Windows

After 4536e57
This file crashes: https://github.com/gohugoio/hugo/blob/master/hugolib/testdata/what-is-markdown.md

How to apply microtypographic rules to Markdown?

What version of goldmark are you using? : v1.11.1 (Hugo 0.60.1)
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? darwin/amd64 (macOS 10.15.1)
What did you do? : Write Markdown in french language
What did you expect to see? : french typographic rules applied (like inserting a non-breakable space before a question mark)
What did you see instead? : no french typographic rules
(Feature request only): Why you can not implement it as an extension?: Not a Go programmer

How should French typographic rules be applied: through an extension, or is it something that depends on the Go language itself, or on another Go package? Something like SmartyPants, but with more language-specific rules.

For instance, languages like PHP have libs to handle this https://github.com/jolicode/JoliTypo

Some of the French typographic rules are listed by the Grammalecte Firefox extension:

Comma breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {
   var s1 = []byte("https://github.com#sun,mon")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com#sun">https://github.com#sun</a>,mon</p>

With github.com parser, I get this result:

<p><a href="https://github.com#sun,mon">https://github.com#sun,mon</a></p>

Example:

https://github.com#sun,mon
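For reference, the GitHub Flavored Markdown spec excludes the punctuation characters `?`, `!`, `.`, `,`, `:`, `*`, `_` and `~` from an extended autolink only when they trail it; they may appear in the interior of the link. The sketch below illustrates just that one rule. `splitTrailingPunctuation` is a hypothetical helper, not goldmark's code, and it ignores the spec's additional rules for parentheses and entity references.

```go
package main

import (
	"fmt"
	"strings"
)

// trailingPunctuation lists the characters the GFM spec drops from the
// end of an extended autolink.
const trailingPunctuation = "?!.,:*_~"

// splitTrailingPunctuation is a hypothetical helper: it separates a link
// candidate into the link itself and any trailing punctuation.
func splitTrailingPunctuation(candidate string) (link, rest string) {
	link = strings.TrimRight(candidate, trailingPunctuation)
	return link, candidate[len(link):]
}

func main() {
	for _, c := range []string{"https://github.com#sun,mon", "https://github.com#sun,"} {
		link, rest := splitTrailingPunctuation(c)
		fmt.Printf("link=%q rest=%q\n", link, rest)
	}
}
```

Under this rule the comma in `https://github.com#sun,mon` stays inside the link, which matches the result reported for the github.com parser.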

Support for inline footnotes

goldmark is fully compliant with CommonMark. Before submitting an issue, you must read the CommonMark spec and confirm that your output differs from the CommonMark online demo.

Please answer the following before submitting your issue:

What version of goldmark are you using? : v1.1.10 via Hugo v0.60.1
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : darwin/amd64

This is a feature request. Both Pandoc and Black Friday support a different version of footnotes than the one currently supported by Goldmark. Pandoc refers to them as inline footnotes. The syntax looks like this:

This is a sentence.^[This is footnote one.] This is also a sentence.^[This will become footnote two.]

Would it be possible for Goldmark to support this kind of footnote?

Benchmarks

See https://github.com/bep/markdown-benchmarks

I borrowed your tests and tried to make them as similar as possible + added some more.

Feel free to grab the code if you want.

GoldMark is doing well.

Linkify does not work after Chinese characters

goldmark does not linkify following links:

搜索引擎链接https://www.google.com

搜索引擎链接：https://www.google.com

What version of goldmark are you using? : v1.1.11
What version of Go are you using? : 1.12
What operating system and processor architecture are you using? : Hugo v0.60.1 on macOS
What did you do? : Put a link after Chinese characters
What did you expect to see? : The link should be automatically created
What did you see instead? : The link was not automatically created
(Feature request only): Why you can not implement it as an extension?: not applicable

Linkify bug

package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/debug"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/extension"
)

func main() {

	convert(`
Go to [http://www.example.com](www.example.com) or http://www.example.com.
`)
}

func convert(src string) {
	markdown := goldmark.New(
		goldmark.WithExtensions(extension.Linkify),
	)

	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Prints:

<p>Go to <a href="www.example.com"><a href="http://www.example.com">http://www.example.com</a></a> or <a href="http://www.example.com">http://www.example.com</a>.</p>

Single line is treated as a paragraph

Test:
https://github.com/mironovalexey/gm-test/blob/master/line/run.go

Result:

<p>Single <code>line</code></p>

Add some kind of "non-rendering render hook"

In working on adding this to Hugo, I wanted to implement ToC in a general way that we could possibly also use for other things; e.g. a "content map" with byte slice pointers (start/stop) into the rendered content.

I experimented by creating an extension:

https://github.com/bep/hugo/blob/goldmark2/markup/goldmark/contentmap.go#L35

But that doesn't work, as I notice that you pick up the first renderer for a given node kind.

Note that for the ToC thing (which is what Hugo has today), I can traverse the AST and build the ToC from that, but it would be really useful if I could somehow register the rendered start/stop position for the different blocks, so people could do things like:

Split content over multiple pages
Insert ads/bylines etc.
...

Again, thanks for this library, it's really easy to use.

Apostrophes in contractions are not converted to right single quote

For the following text:

I'm going to see my mother. She's very nice.

Currently, ' is not converted to ’ in contractions when the Typographer extension is enabled, whereas smartypants does convert it. I would expect the output to be:

I&rsquo;m going to see my mother. She&rsquo;s very nice.
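The requested behavior can be approximated with a simple rule: an apostrophe with a letter on both sides is treated as part of a contraction. The sketch below shows that rule only. `convertContractionApostrophes` is a hypothetical function, not part of the Typographer extension, and it deliberately leaves all other quote handling alone.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// convertContractionApostrophes is a hypothetical helper: it replaces an
// apostrophe that sits between two letters with the &rsquo; entity.
func convertContractionApostrophes(text string) string {
	runes := []rune(text)
	var b strings.Builder
	for i, r := range runes {
		if r == '\'' && i > 0 && i+1 < len(runes) &&
			unicode.IsLetter(runes[i-1]) && unicode.IsLetter(runes[i+1]) {
			b.WriteString("&rsquo;")
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(convertContractionApostrophes("I'm going to see my mother. She's very nice."))
}
```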

Heading attribute panics

The source below is taken from the README.

package main

import (
	"bytes"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/parser"
)

func main() {
	md := goldmark.New(
		goldmark.WithParserOptions(
			parser.WithAttribute(),
		),
	)
	source := []byte(`
## heading {#id .className attrName=attrValue class="class1 class2"}
`)
	var buf bytes.Buffer
	if err := md.Convert(source, &buf); err != nil {
		panic(err)
	}
}

Panics:

panic: interface conversion: interface {} is [][]uint8, not []uint8

goroutine 1 [running]:
github.com/yuin/goldmark/renderer/html.(*Renderer).RenderAttributes(0xc00009cae0, 0x12262c0, 0xc000119d80, 0x1227860, 0xc0001b2000)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/html/html.go:513 +0x227
github.com/yuin/goldmark/renderer/html.(*Renderer).renderHeading(0xc00009cae0, 0x12262c0, 0xc000119d80, 0xc000184140, 0x49, 0x49, 0x1227860, 0xc0001b2000, 0x2001, 0x8, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/html/html.go:208 +0x11f
github.com/yuin/goldmark/renderer.(*renderer).Render.func2(0x1227860, 0xc0001b2000, 0x1, 0x0, 0x12262c0, 0xc000119d80)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/renderer.go:167 +0x108
github.com/yuin/goldmark/ast.Walk(0x1227860, 0xc0001b2000, 0xc000175e30, 0x3, 0x0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/ast/ast.go:433 +0x43
github.com/yuin/goldmark/ast.Walk(0x12273e0, 0xc00011c780, 0xc000175e30, 0xc0001b8000, 0x0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/ast/ast.go:439 +0x149
github.com/yuin/goldmark/renderer.(*renderer).Render(0xc0001245f0, 0x12243c0, 0xc0000909f0, 0xc000184140, 0x49, 0x49, 0x12273e0, 0xc00011c780, 0xc000124501, 0xc0000909f0)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/renderer/renderer.go:162 +0x13c
github.com/yuin/goldmark.(*markdown).Convert(0xc000119a00, 0xc000184140, 0x49, 0x49, 0x12243c0, 0xc0000909f0, 0x0, 0x0, 0x0, 0xc000064058, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/markdown.go:117 +0xe3

An infinite loop in ASTTransformer

func (*test) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
  walk(node)
}

func walk(node ast.Node) {
  for n := node.FirstChild(); n != nil; n = node.NextSibling() {
    walk(n)
  }
}
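In the snippet above, the loop's post statement reads `node.NextSibling()` rather than `n.NextSibling()`. `n` is therefore reassigned to the parent's own sibling on every iteration and never advances along the children, so the loop cannot end whenever that sibling is non-nil. The standalone sketch below uses a hypothetical minimal `node` type (not goldmark's ast.Node) to show the traversal with the iterator advanced correctly.

```go
package main

import "fmt"

// node is a hypothetical minimal tree node with the same first-child /
// next-sibling shape used in the snippet above.
type node struct {
	name        string
	firstChild  *node
	nextSibling *node
}

// walk visits every descendant of parent in document order.
// The iterator n advances with n.nextSibling, not parent.nextSibling.
func walk(parent *node, visit func(*node)) {
	for n := parent.firstChild; n != nil; n = n.nextSibling {
		visit(n)
		walk(n, visit)
	}
}

func main() {
	// doc -> [a -> [a1, a2], b]
	a2 := &node{name: "a2"}
	a1 := &node{name: "a1", nextSibling: a2}
	b := &node{name: "b"}
	a := &node{name: "a", firstChild: a1, nextSibling: b}
	doc := &node{name: "doc", firstChild: a}

	var visited []string
	walk(doc, func(n *node) { visited = append(visited, n.name) })
	fmt.Println(visited)
}
```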

Footnotes numbering should always be sequential

Running Hugo 0.60.1 with Goldmark 1.1.8. I'm basing this off the PHP Markdown Extra spec since CommonMark doesn't support footnotes, so please bear with me.

From my reading of the PHP Markdown Extra spec for footnotes, they should always be parsed and numbered in sequential order in the document. What I've noticed in Goldmark is that they're numbered using the text in the footnote name, or that this sequential renumbering step is skipped - I'm not entirely sure which.

What did you do? : Added footnotes to text, e.g.:

This[^3] is[^1] text with footnotes[^2].

[^1]: Footnote one
[^2]: Footnote two
[^3]: Footnote three

What did you expect to see? :

This¹ is² text with footnotes³
¹: Footnote three
²: Footnote one
³: Footnote two

What did you see instead? :

This³ is¹ text with footnotes²
¹: Footnote one
²: Footnote two
³: Footnote three

(I apologize, my first example did not reflect the actual problem. I have updated it.)
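The expected numbering can be stated as "number each footnote by the position of its first reference". The sketch below illustrates that rule on raw text. It is not how goldmark or PHP Markdown Extra is implemented; `orderByFirstReference` and `referencePattern` are hypothetical names, and the label syntax is simplified.

```go
package main

import (
	"fmt"
	"regexp"
)

// referencePattern matches "[^label]". Definitions ("[^label]:") match too
// and are filtered out below, since Go's regexp has no lookahead.
var referencePattern = regexp.MustCompile(`\[\^([^\]\s]+)\]`)

// orderByFirstReference is a hypothetical helper: it returns footnote labels
// in the order in which they are first referenced in src.
func orderByFirstReference(src string) []string {
	seen := map[string]bool{}
	var order []string
	for _, m := range referencePattern.FindAllStringSubmatchIndex(src, -1) {
		end := m[1]
		if end < len(src) && src[end] == ':' {
			continue // a definition, not a reference
		}
		label := src[m[2]:m[3]]
		if !seen[label] {
			seen[label] = true
			order = append(order, label)
		}
	}
	return order
}

func main() {
	src := "This[^3] is[^1] text with footnotes[^2].\n\n" +
		"[^1]: Footnote one\n[^2]: Footnote two\n[^3]: Footnote three\n"
	for i, label := range orderByFirstReference(src) {
		fmt.Printf("footnote %d is [^%s]\n", i+1, label)
	}
}
```

With the example from this report, the label `3` is referenced first and so becomes footnote 1, followed by `1` and then `2`.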

Rather a question

Thank you for great work on this library! I've been looking for this clean implementation that works out-of-box without any patches.

Now, for a project I'm working on, I need to hack the default renderer a little. Would you please point me to where I could plug in my custom rendering of YouTube auto-links? Basically, I have them as *ast.AutoLink nodes. I'm trying to rewrite their rendering so that they appear in <div>s with the <img> of the YouTube preview picture and an <a> leading to the video. That's the idea.

So far I've been able to declare a custom type:

// CustomGoldmarkRenderer renders specific markdown documents containing video links
type CustomGoldmarkRenderer struct {
	defaultRenderer renderer.Renderer
	file            *[]byte
}

which I then make implement the Renderer interface:

func (c CustomGoldmarkRenderer) Render(w io.Writer, source []byte, n ast.Node) error {
	ast.Walk(n, func(n ast.Node, entering bool) (status ast.WalkStatus, err error) {
		switch t := n.(type) {
		case *ast.AutoLink:
			url := string(t.URL(*c.file))
			matches := youTubeLinkRegex.FindAllStringSubmatch(url, -1)
			if len(matches) == 0 {
				// Or try a short link
				matches = youTubeShortLinkRegex.FindAllStringSubmatch(url, -1)
			}
			if len(matches) > 0 {
				videoID := matches[0][1] // Group 1 stands for the first (...) block
				if entering {
					fmt.Fprintf(w, `
					<div class="py-2 col-12 col-xl-3 col-lg-3 col-md-4 mb-2">
						<a href="%s" target="_blank" class="d-block h-180">
							<img class="img-fluid img-thumbnail rounded" src="%s" alt="%s"/>
						</a>
						%s`,
						url,
						"https://img.youtube.com/vi/"+videoID+"/mqdefault.jpg",
						"title",
						"titleHTML")
					return ast.WalkSkipChildren, nil
				} else {
					fmt.Fprintf(w, `</div>`)
				}
			}
		}
		return ast.WalkContinue, nil
	})
	return c.defaultRenderer.Render(w, source, n)
}

so that I'm able to plug my custom renderer into the md := goldmark.New(...) as follows:

	md.SetRenderer(CustomGoldmarkRenderer{
		defaultRenderer: md.Renderer(),
		file:            &file,  // Passing original source so that it becomes available in parsing function
	})

	var buf bytes.Buffer
	if err := md.Convert(file, &buf); err != nil {
		panic(err)
	}

Now of course what I get is simply my additional rendering of <div>s that I do with my walker, and then (with no surprise) the ordinary rendering is being appended to the io.Writer when return c.defaultRenderer.Render(w, source, n) comes into play.

Being an amateur coder in Go I can't figure out how to render the rest of the nodes with the default way while I do the rendering in my custom ast.Walk(...) call, node by node, since return c.defaultRenderer.Render(w, source, n) seems to be called just once for the Document node and doesn't really help me with individual nodes at all.

So, would you be so kind as to point out where I went wrong and which direction I should take instead?

Typographic elements in heading are excluded from the automatically generated heading IDs

Hello,

For background and related discussion, please see the following post in Hugo forum.

https://discourse.gohugo.io/t/difference-in-auto-generated-heading-anchor-names-between-previous-versions-and-v0-60-x/22076

Please answer the following before submitting your issue:

What version of goldmark are you using? : 1.1.8 (included in Hugo 0.60.1)
What version of Go are you using? : 1.11.2 (but shouldn't matter as the test is done with pre-built Hugo)
What operating system and processor architecture are you using? : macOS 10.13.6, Intel Core i5
What did you do? : Upgrade Hugo from 0.54.0 to 0.60.1 to check the basic functionality
What did you expect to see? : Non-alphanumeric typographic elements (hyphen, period, underscore, etc.) in headings are transformed into hyphens in the auto heading IDs (e.g. for the headings "Command-Gen-Instance" and "v1.0.0 (Apr 21, 2019)", the results are command-gen-instance and v1-0-0-apr-21-2019)
What did you see instead? : Non-alphanumeric typographic elements in headings are excluded from the auto heading IDs (e.g. for the example above, the results are commandgeninstance and v100-april-21-2019)
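
The expected behaviour can be written down as a small function. This is only a sketch of the rule described in this report (lowercase ASCII letters and digits are kept, and every run of other characters becomes a single hyphen); the name hyphenatedID is hypothetical and this is not goldmark's or Hugo's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// hyphenatedID (hypothetical name) keeps ASCII letters and digits and
// replaces each run of other characters with one hyphen. Leading and
// trailing runs are dropped.
func hyphenatedID(heading string) string {
	var b strings.Builder
	pendingHyphen := false
	for _, r := range strings.ToLower(heading) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			if pendingHyphen && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingHyphen = false
			b.WriteRune(r)
		} else {
			// Remember the separator; emit it only if more text follows.
			pendingHyphen = true
		}
	}
	return b.String()
}

func main() {
	fmt.Println(hyphenatedID("Command-Gen-Instance"))
	fmt.Println(hyphenatedID("v1.0.0 (Apr 21, 2019)"))
}
```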

Many thanks for your work with Goldmark.

provide Katex support

@KaTeX

table extension: Merged table columns

Tested with Goldmark 1.16.

package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/yuin/goldmark"
	"github.com/yuin/goldmark/extension"
)

func main() {

	convert(`Foo|Bar
---|---
` + "`" + `Yoyo` + "`" + `|Dyne`)
}

func convert(src string) {

	markdown := goldmark.New(
		goldmark.WithExtensions(
			extension.Table,
		),
	)
	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Produces

<table>
<thead>
<tr>
<th>Foo</th>
<th>Bar</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>Yoyo</code>|Dyne</td>
<td></td>
</tr>
</tbody>
</table>

The "try it" on https://commonmark.org/help/tutorial/02-emphasis.html renders it correctly; https://spec.commonmark.org/dingus/ renders nothing.

gohugoio/hugo#6641

"!" will always start a new text element.

Given:

This is a line! Yes.

And this is another!

Actual:

    Paragraph {
        RawText: "This is a line! Yes."
        HasBlankPreviousLines: false
        Text: "This is a line"
        Text: "! Yes."
    }
    Paragraph {
        RawText: "And this is another!"
        HasBlankPreviousLines: false
        Text: "And this is another"
        Text: "!"
    }

Expected:

    Paragraph {
        RawText: "This is a line! Yes."
        HasBlankPreviousLines: false
        Text: "This is a line! Yes."
    }
    Paragraph {
        RawText: "And this is another!"
        HasBlankPreviousLines: false
        Text: "And this is another!"
    }

Fenced code block with carriage returns causes a panic error

Hey!

I am trying to "markdownify" input coming from an HTML textarea, and it contains carriage returns.

Using a fenced code block with carriage returns causes the whole program to panic with a slice bounds out of range error.

Here is an example:

package main

import (
	"bytes"
	"fmt"
	"html/template"

	"github.com/yuin/goldmark"
)

func main() {
	var buf bytes.Buffer
	if err := goldmark.Convert([]byte("lol\r\n\r\n```\r\nok\r\n```\r\n\r\nyes"), &buf); err != nil {
		panic(err)
	}
	fmt.Printf("%v", template.HTML(buf.String()))
}

panic: runtime error: slice bounds out of range

goroutine 1 [running]:
github.com/yuin/goldmark/parser.(*fencedCodeBlockParser).Open(0x8336e0, 0x6b0340, 0xc000102600, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0, 0x0, 0x0, 0x10)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/fcode_block.go:51 +0x419
github.com/yuin/goldmark/parser.(*parser).openBlocks(0xc000172000, 0x6b0340, 0xc000102600, 0x1, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0, 0x3)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:849 +0x27e
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc000172000, 0x6b0340, 0xc000102600, 0x6afa80, 0xc00009a7e0, 0x6af880, 0xc00009a8a0)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:923 +0x1ce
github.com/yuin/goldmark/parser.(*parser).Parse(0xc000172000, 0x6afa80, 0xc00009a7e0, 0x0, 0x0, 0x0, 0x20, 0x62e2c0)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/parser/parser.go:771 +0x157
github.com/yuin/goldmark.(*markdown).Convert(0xc00017a000, 0xc0000b6a00, 0x1a, 0x1a, 0x6ab8a0, 0xc00007ad20, 0x0, 0x0, 0x0, 0x0, ...)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/markdown.go:116 +0x94
github.com/yuin/goldmark.Convert(...)
        /home/thomas/Proj/blobstash/vendor/github.com/yuin/goldmark/markdown.go:31
main.main()
        /home/thomas/Proj/blobstash/cmd/lol/p.go:13 +0xbc
exit status 2

Thanks!

BlockParser logics

Thank you for the excellent markdown processor. This is really very impressive.

Could you please help me understand the logic of BlockParser?

For example, I want to implement nesting behaviour like that of lists/blockquotes with the help of markers, i.e.

%START%

Content 1

%START%

Content2

%FINISH%

Content 3

%FINISH%

should produce

<START>
Content 1
<START>
Content 2
</FINISH>
Content 3
</FINISH>

But when I write something like this, I am a little bit confused. It interrupts the parsing along with the first parser.Close call. But blockquote works well and there could be multiple parser.Close calls during the nested blockquotes parsing cycle.

Colon inside ** breaks "boldness"

package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/yuin/goldmark"
)

func main() {
	content := `**Bold:**Regular`

	markdown := goldmark.New()

	var buf bytes.Buffer
	err := markdown.Convert([]byte(content), &buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(buf.String())
}

Prints:

<p>**Bold:**Regular</p>

No EOL at the end of file breaks processing

Test case: https://github.com/mironovalexey/gm-test (eof package).

goldmark can't emphasize specific Chinese characters

After using goldmark to process the markdown text **「刻舟求剑」**, the result is still **「刻舟求剑」**, but the expected result is 「刻舟求剑」 rendered with emphasis.

What version of goldmark are you using? : v1.1.11
What version of Go are you using? : 1.12
What operating system and processor architecture are you using? : Hugo v0.60.1 on macOS
What did you do? : Create a markdown file with **
What did you expect to see? : The words should be emphasized
What did you see instead? : The words were not emphasized
(Feature request only): Why you can not implement it as an extension?: not applicable

Pandoc Markdown Compatibility?

Goldmark's CommonMark compatibility is amazing and with attribute support, the mathjax extension, and the metadata extension covers what I feel are the most popular parts of Pandoc Markdown. It seems very possible that Goldmark could eventually replace external pandoc dependencies in many Go applications today. I would very much like to contribute toward that goal and I'm aware of others who would be also.

To that end I am seeking some design direction and consensus about how to move forward.

Extensions for each feature seems most reasonable. But I opened this issue to make sure a full AST Transformer might not be a better approach. Personally I prefer the modularity of an extension for each --- particularly Pandoc's unique Simplified Tables --- and find the composition design valuable that Pandoc has used for its internals.

Which design direction is most recommended for such work? Several extensions or a single Transformer? I'm almost sure the answer is extensions but am asking anyway to avoid something I may have missed.

Thank you. (If there is a better place to have this discussion please let me know.)

Footnote parsing error

test![^1]

[^1]: footnote

<p>test![^1]</p>

This happens if an exclamation mark is placed before the footnote link.

Apostrophe breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {
   var s1 = []byte("https://github.com/sunday's")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com/sunday">https://github.com/sunday</a>'s</p>

With the github.com parser, I get this result:

<p><a href="https://github.com/sunday's">https://github.com/sunday's</a></p>

Example:

https://github.com/sunday's

API question

Ref this interface:

// A Markdown interface offers functions to convert Markdown text to
// a desired format.
type Markdown interface {
	// Convert interprets a UTF-8 bytes source in Markdown and write rendered
	// contents to a writer w.
	Convert(source []byte, writer io.Writer, opts ...parser.ParseOption) error

	// Parser returns a Parser that will be used for conversion.
	Parser() parser.Parser

	// SetParser sets a Parser to this object.
	SetParser(parser.Parser)

	// Renderer returns a Renderer that will be used for conversion.
	Renderer() renderer.Renderer

	// SetRenderer sets a Renderer to this object.
	SetRenderer(renderer.Renderer)
}

With the above, I can create a Markdown with a custom parser and renderer (I'm not sure what the setters are for) and then run Convert to do the job.

A big win (ref. your benchmarks) when you have this strict separation between parse and render, is to parse once and render to every format you need. I don't see how that is possible with the current API?

Extended unicode characters discarded from auto heading IDs

The Goldmark 1.1.8 implementation only takes one-byte code points (ASCII) into account while generating auto heading IDs, simply discarding extended Latin characters (2 bytes) and other international characters (3 bytes).

https://github.com/yuin/goldmark/blob/master/parser/parser.go#L83-L85

In multilingual sites, this causes imperfect heading IDs to be generated.
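
The difference between byte-wise and rune-wise processing can be shown with a simplified model. This sketch is not the code linked above; asciiOnlyID and unicodeID are hypothetical names, and the rule (keep letters and digits, turn spaces into hyphens) is reduced to the minimum needed to show the effect.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// asciiOnlyID walks the heading byte by byte, so every byte of a
// multi-byte UTF-8 sequence is silently dropped.
func asciiOnlyID(heading string) string {
	var b strings.Builder
	for i := 0; i < len(heading); i++ {
		c := heading[i]
		switch {
		case c >= 'A' && c <= 'Z':
			b.WriteByte(c + 'a' - 'A')
		case (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'):
			b.WriteByte(c)
		case c == ' ':
			b.WriteByte('-')
		}
	}
	return b.String()
}

// unicodeID walks the heading rune by rune and keeps any Unicode
// letter or digit.
func unicodeID(heading string) string {
	var b strings.Builder
	for _, r := range heading {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			b.WriteRune(unicode.ToLower(r))
		case unicode.IsSpace(r):
			b.WriteByte('-')
		}
	}
	return b.String()
}

func main() {
	for _, h := range []string{"D\u00e9but \u00e9t\u00e9", "日本語 見出し"} {
		fmt.Printf("%q -> ascii %q, unicode %q\n", h, asciiOnlyID(h), unicodeID(h))
	}
}
```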

Rendering of external links in safe mode

I've now merged in Goldmark as the default Markdown handler in Hugo and it works great.

I have set unsafe=false as the default, and that works mostly as expected.

But the rendering of external links comes as a surprise to most people, I think.

[Google Search!](https://google.com/)

[Google Search!](https://google.com/)

So, the security motivation behind the above is maybe to prevent fake linking? But when the end result is that most people configure it to be unsafe just to get proper links, I think that makes the net security worse.

gohugoio/hugoThemesSite#67

Panic in auto-id

package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/debug"

	"github.com/yuin/goldmark/parser"

	"github.com/yuin/goldmark"
)

func main() {

	convert(`#
# FOO`)
}

func convert(src string) {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("Panic:\n", string(debug.Stack()))
		}
	}()

	markdown := goldmark.New(
		goldmark.WithParserOptions(
			parser.WithAutoHeadingID(),
		),
	)
	var buf bytes.Buffer
	err := markdown.Convert([]byte(src), &buf)
	if err != nil {
		log.Fatal(err)
	}
}

github.com/yuin/goldmark/text.(*Segments).At(...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/text/segment.go:182
github.com/yuin/goldmark/parser.generateAutoHeadingID(0xc0001ba000, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/atx_heading.go:190 +0x219
github.com/yuin/goldmark/parser.(*atxHeadingParser).Close(0xc00012120e, 0x122a0c0, 0xc0001ba000, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/atx_heading.go:173 +0xba
github.com/yuin/goldmark/parser.(*parser).closeBlocks(0xc0001b1500, 0x0, 0x0, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:845 +0x162
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001b1500, 0x1229c40, 0xc000126780, 0x1229380, 0xc0001aa7e0, 0x1229440, 0xc0001aa850)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:1058 +0x753
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001b1500, 0x1229380, 0xc0001aa7e0, 0x0, 0x0, 0x0, 0x8, 0x8)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/parser/parser.go:818 +0x148
github.com/yuin/goldmark.(*markdown).Convert(0xc000123a40, 0xc000121230, 0x7, 0x8, 0x1226bc0, 0xc00009a9f0, 0x0, 0x0, 0x0, 0xc0001ae000, ...)
	/Users/bep/go/pkg/mod/github.com/yuin/[email protected]/markdown.go:116 +0x94
main.convert(0x11f960a, 0x7)
	/Users/bep/dev/go/bep/temp/main.go:33 +0x1d5
main.main()
	/Users/bep/dev/go/bep/temp/main.go:16 +0x36

Unwanted paragraph closing tag in html template tag

First of all, thanks a lot for the work on goldmark. I just tried it with the new release and it works great. Though, there is one minor imperfection:

The unsafe option is turned on and there is html code inside a paragraph, like this:

This is **Bold** <span>Component</span><template>
<div>Name</div>
</template>  **Bold** as well.

This will render as:

<p>This is <strong>Bold</strong> <span>Component</span><template></p>
<div>Name</div>
</template>  **Bold** as well.

Notice how the closing tag </p> is set too early. If I remove the line break after <template> it works as expected:

This is **Bold** <span>Component</span><template> <div>Name</div>
</template>  **Bold** as well.

<p>This is <strong>Bold</strong> <span>Component</span><template> <div>Name</div>
</template>  <strong>Bold</strong> as well.</p>

Of course I can just move the div up, but there are other divs in my template as well (they also call </p> too early) and therefore this one line will become quite long and hard to maintain. Basically, the template tag and everything inside should not call for the automatic setting of </p>.

Even though I use Hugo to render, I think this is a goldmark related issue.

Last backtick appears to escape in fenced code blocks

Hi there,

Thanks for spending the time to make this! This is super useful, and the extensibility is a great feature not easily found elsewhere. I had one issue, I'm not sure if this is a bug or a side effect, but here it goes. In fenced code blocks, it appears that the last backtick escapes.

So, for example:

    ```
    function lorem(ipsum, dolor = 1) {
      const sit = ipsum == null ? 0 : ipsum.sit;
      dolor = sit - amet(dolor);
      return sit ? consectetur(ipsum, 0, dolor < 0 ? 0 : dolor) : [];
    }

    function adipiscing(...elit) {
      if (!elit.sit) {
        return [];
      }
    
      const sed = elit[0];
      return eiusmod.tempor(sed) ? sed : [sed];
    }

    function incididunt(ipsum, ut = 1) {
      ut = labore.et(amet(ut), 0);
      const sit = ipsum == null ? 0 : ipsum.sit;

      if (!sit || ut < 1) {
        return [];
      }

      let dolore = 0;
      let magna = 0;
      const aliqua = new eiusmod(labore.ut(sit / ut));

      while (dolore < sit) {
        aliqua[magna++] = consectetur(ipsum, dolore, (dolore += ut));
      }
    
      return aliqua;
    }
    ```

Ends up being rendered as:
——————————————————————————————

function lorem(ipsum, dolor = 1) {
  const sit = ipsum == null ? 0 : ipsum.sit;
  dolor = sit - amet(dolor);
  return sit ? consectetur(ipsum, 0, dolor < 0 ? 0 : dolor) : [];
}

function adipiscing(...elit) {
  if (!elit.sit) {
    return [];
  }

  const sed = elit[0];
  return eiusmod.tempor(sed) ? sed : [sed];
}

function incididunt(ipsum, ut = 1) {
  ut = labore.et(amet(ut), 0);
  const sit = ipsum == null ? 0 : ipsum.sit;

  if (!sit || ut < 1) {
    return [];
  }

  let dolore = 0;
  let magna = 0;
  const aliqua = new eiusmod(labore.ut(sit / ut));

  while (dolore < sit) {
    aliqua[magna++] = consectetur(ipsum, dolore, (dolore += ut));
  }

  return aliqua;
}

`
——————————————————————————————
^ superfluous last backtick

This is the actual code fragment that is generated by above:

<pre style="color:#93a1a1;background-color:#002b36"><span style="color:#268bd2">function</span> lorem(ipsum, dolor <span style="color:#719e07">=</span> <span style="color:#2aa198">1</span>) {
  <span style="color:#268bd2">const</span> sit <span style="color:#719e07">=</span> ipsum <span style="color:#719e07">==</span> <span style="color:#cb4b16">null</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> ipsum.sit;
  dolor <span style="color:#719e07">=</span> sit <span style="color:#719e07">-</span> amet(dolor);
  <span style="color:#719e07">return</span> sit <span style="color:#719e07">?</span> consectetur(ipsum, <span style="color:#2aa198">0</span>, dolor <span style="color:#719e07">&lt;</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> dolor) <span style="color:#719e07">:</span> [];
}

<span style="color:#268bd2">function</span> adipiscing(...elit) {
  <span style="color:#719e07">if</span> (<span style="color:#719e07">!</span>elit.sit) {
    <span style="color:#719e07">return</span> [];
  }

  <span style="color:#268bd2">const</span> sed <span style="color:#719e07">=</span> elit[<span style="color:#2aa198">0</span>];
  <span style="color:#719e07">return</span> eiusmod.tempor(sed) <span style="color:#719e07">?</span> sed <span style="color:#719e07">:</span> [sed];
}

<span style="color:#268bd2">function</span> incididunt(ipsum, ut <span style="color:#719e07">=</span> <span style="color:#2aa198">1</span>) {
  ut <span style="color:#719e07">=</span> labore.et(amet(ut), <span style="color:#2aa198">0</span>);
  <span style="color:#268bd2">const</span> sit <span style="color:#719e07">=</span> ipsum <span style="color:#719e07">==</span> <span style="color:#cb4b16">null</span> <span style="color:#719e07">?</span> <span style="color:#2aa198">0</span> <span style="color:#719e07">:</span> ipsum.sit;

  <span style="color:#719e07">if</span> (<span style="color:#719e07">!</span>sit <span style="color:#719e07">||</span> ut <span style="color:#719e07">&lt;</span> <span style="color:#2aa198">1</span>) {
    <span style="color:#719e07">return</span> [];
  }

  <span style="color:#268bd2">let</span> dolore <span style="color:#719e07">=</span> <span style="color:#2aa198">0</span>;
  <span style="color:#268bd2">let</span> magna <span style="color:#719e07">=</span> <span style="color:#2aa198">0</span>;
  <span style="color:#268bd2">const</span> aliqua <span style="color:#719e07">=</span> <span style="color:#719e07">new</span> eiusmod(labore.ut(sit <span style="color:#719e07">/</span> ut));

  <span style="color:#719e07">while</span> (dolore <span style="color:#719e07">&lt;</span> sit) {
    aliqua[magna<span style="color:#719e07">++</span>] <span style="color:#719e07">=</span> consectetur(ipsum, dolore, (dolore <span style="color:#719e07">+=</span> ut));
  }

  <span style="color:#719e07">return</span> aliqua;
}
</pre><p>`</p>

Thanks for taking a look!

Autolinks

Other parsers allow for bare autolinks. For example:

http://example.com

returns:

<a href="http://example.com">http://example.com</a>

https://github.github.com/gfm#autolinks-extension-

How to remove all nodes with NodeType in ASTTransformer?

func (*testTransformer) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
    processNodes(node)
}

func processNodes(n ast.Node) {
    if n.Kind() == ast.KindHeading {
        if p := n.Parent(); p != nil {
            p.RemoveChild(p, n)
        }
        return
    }
    for c := n.FirstChild(); c != nil; c = n.NextSibling() {
        processNodes(c)
    }
}

Source markdown:

# Header 1

text

## Header 2

text

Result:

<p>text</p>
<h2>Header 2</h2>
<p>text</p>

Hard line breaks not rendered in files with Windows-style line endings

Hello

Member of the Hugo team here. Currently testing Goldmark as the new default in Hugo 0.60.0 DEV.

Apparently, hard line breaks as specified in CommonMark 0.29 are not rendered by Goldmark for Markdown files with Windows-style line endings.

In a collaborative project that I maintain files can be edited by other team members on Windows.
Typically we use two spaces for a line break.

But I only managed to render the line break after using dos2unix to convert the line endings from DOS to UNIX like so: dos2unix some-file.md.
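
The same conversion can be done in Go before the source reaches the Markdown converter. This is a workaround sketch only, not a fix in the parser; the function name normalizeNewlines is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeNewlines (hypothetical name) rewrites Windows (CRLF) and
// lone CR line endings to LF, similar to what dos2unix does for a file.
func normalizeNewlines(src []byte) []byte {
	src = bytes.ReplaceAll(src, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(src, []byte("\r"), []byte("\n"))
}

func main() {
	// Two trailing spaces followed by CRLF: a hard line break on Windows.
	src := []byte("first line  \r\nsecond line\r\n")
	fmt.Printf("%q\n", normalizeNewlines(src))
}
```

The normalized bytes can then be passed to Convert in place of the original input.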

cc: @bep

Release Notes

Please add Release Notes to your releases!

You may borrow the script I use for this: tools/release.sh

This will help a lot for folks watching releases and not all commits :)

New lines within span-level elements

When a span-level element contains new lines, its content is not treated as Markdown.

Test case: https://github.com/mironovalexey/gm-test (html package).

https://github.com/mironovalexey/gm-test/blob/master/html/test.md

Consider adding a context (data holder) to Render

This is a follow up to #37

So, setting state on the nodes in the AST and then using that while rendering works, but:

- It makes for some fairly clumsy and verbose code.
- It breaks the separation of concerns (adding rendering code to the parser).

What I'm now doing instead is something ala:

	w := renderContext{
		BufWriter: bufio.NewWriter(buf),
		renderContextData: renderContextDataHolder{
			rctx: ctx,
			dctx: c.ctx,
		},
	}

	if err := c.md.Renderer().Render(w, ctx.Src, doc); err != nil {
		return nil, err
	}

This works great, and I don't mind doing it like this (this is entirely internal), but the downside is that it may stop working in the future if you decide to wrap the writer or something.

Support footnote return links

goldmark v1.1.7
with Hugo 0.60.0

As mentioned in gohugoio/hugo/issues/6551, Goldmark does not seem to support footnote return links, although they are supported by PHP Markdown Extra.

It would be great if Goldmark supported them.

Thank you very much in advance.

Markdown:

That's some text with a footnote.[^1]

[^1]: And that's the footnote.

Output:

…
<section class="footnotes" role="doc-endnotes"><hr><ol><li id="fn:1" role="doc-endnote"><p>And that's the footnote.</p></li></ol></section>

Rendering "class" attribute

Hi @yuin
I'm trying to append class="..." to all img tags and wondering if something like this would make sense to add (of course it's simply a hard-coded example for "class" attribute only) :

zzwx-forks@11441f5

This way users wouldn't have to completely rewrite the render function when something as simple as adding a class is needed, and they wouldn't risk breaking their code when the library gets updated.

This is my use case:

case *ast.Image:
  if entering {
    n.SetAttributeString("class", "img-fluid")
  }

Fuzz crasher in parser/attribute.go:102

Please answer the following before submitting your issue:

What version of goldmark are you using? : v1.1.10
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : linux/amd64
What did you do? : Merge #54 and then make fuzz
What did you expect to see? : boredom
What did you see instead? :

sh$ cat fuzz/crashers/db0b78ba444c6efd83f1d4f6f74faab82aaf3cb5.quoted
        "{\n-"

sh$ cat fuzz/crashers/db0b78ba444c6efd83f1d4f6f74faab82aaf3cb5.output
panic: runtime error: index out of range [0] with length 0

goroutine 1 [running]:
github.com/yuin/goldmark/parser.parseAttribute(0x688e40, 0xc0001bd650, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f33d7475000)
        /go/src/github.com/yuin/goldmark/parser/attribute.go:102 +0xa9b
github.com/yuin/goldmark/parser.ParseAttributes(0x688e40, 0xc0001bd650, 0x0, 0x1, 0x0, 0x1)
        /go/src/github.com/yuin/goldmark/parser/attribute.go:61 +0x1e2
github.com/yuin/goldmark/parser.parseLastLineAttributes(0x689b80, 0xc00016a090, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/atx_heading.go:229 +0x429
github.com/yuin/goldmark/parser.(*setextHeadingParser).Close(0xc000117c70, 0x689b80, 0xc00016a090, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/setext_headings.go:107 +0x56e
github.com/yuin/goldmark/parser.(*parser).closeBlocks(0xc000197500, 0x0, 0x0, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/parser.go:845 +0x199
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc000197500, 0x689700, 0xc00001e980, 0x688e40, 0xc0001bd500, 0x688f00, 0xc0001bd5e0)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1023 +0xc12
github.com/yuin/goldmark/parser.(*parser).Parse(0xc000197500, 0x688e40, 0xc0001bd500, 0x0, 0x0, 0x0, 0x30, 0x633700)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc000119e80, 0x7f33d7475000, 0x3, 0x3, 0x685fa0, 0xc00007cff0, 0x0, 0x0, 0x0, 0x9, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7f33d7475000, 0x3, 0x3, 0x4)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:34 +0x43c
go-fuzz-dep.Main(0xc00026ff48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52

Greater-than sign breaks Linkify

Using this file:

package main
import (
   "bytes"
   gm "github.com/yuin/goldmark"
   "github.com/yuin/goldmark/extension"
)
func main() {   
   var s1 = []byte("https://github.com?q=stars:>1")
   var s2 bytes.Buffer
   gm.New(gm.WithExtensions(extension.Linkify)).Convert(s1, &s2)
   print(s2.String())
}

I get this result:

<p><a href="https://github.com?q=stars:">https://github.com?q=stars:</a>&gt;1</p>

With the github.com parser, I get this result:

<p><a href="https://github.com?q=stars:%3E1">https://github.com?q=stars:&gt;1</a></p>

Example:

https://github.com?q=stars:>1

question: Passing state to a rendering extension

I'm in the process of creating some link/image extensions that would allow for link resolution/image resize etc.

For that to work, I need to pass on some document state to the custom link renderer. But I don't see how.

The Parse method can take a context, but I don't see a similar way to pass a struct via Render. I could create a new goldmark.Markdown for each document, but that sounds wasteful.

Remove the 1.12.x tests or fix the library to conform to unsigned shifts change since 1.13

According to https://github.com/yuin/goldmark/blob/master/.github/workflows/test.yaml,
the GitHub Actions workflow runs the tests on Go 1.12.x. That only makes sense if the library is compatible with 1.12, which, going by go.mod, it is not.

On the other hand, the only thing preventing compatibility with 1.12 is the use of signed shift counts. Would it be worth converting the loops that cause trouble to use uint instead?
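
For reference, Go 1.13 lifted the requirement that shift counts be unsigned, so a signed count only compiles on 1.13 and later. The sketch below shows the kind of conversion meant here; the function powersOfTwo is a made-up example, not code from goldmark.

```go
package main

import "fmt"

// powersOfTwo returns the first n powers of two. The explicit uint
// conversion of the shift count keeps the loop valid on Go versions
// before 1.13; writing 1<<i with a signed i needs Go 1.13 or later.
func powersOfTwo(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, 1<<uint(i))
	}
	return out
}

func main() {
	fmt.Println(powersOfTwo(4))
}
```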

Fuzz crash on "[^000]:0\t[^]:"

Please answer the following before submitting your issue:

What version of goldmark are you using? : v1.1.9
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : linux/amd64
What did you do? : make fuzz
What did you expect to see? : boredom
What did you see instead? :

sh$ cat fuzz/crashers/374f2bf4f9cd8bb2d4737a8bcb30f74ea5ef9e10.quoted
        "[^000]:0\t[^]:"

sh$ cat fuzz/crashers/374f2bf4f9cd8bb2d4737a8bcb30f74ea5ef9e10.output
panic: runtime error: slice bounds out of range [:14] with capacity 13

goroutine 1 [running]:
github.com/yuin/goldmark/text.(*Segment).Value(0xc00026edc0, 0x7f2e89c6c000, 0xd, 0xd, 0x7f2e89c6c009, 0x0, 0x0)
        /go/src/github.com/yuin/goldmark/text/segment.go:44 +0x33f
github.com/yuin/goldmark/text.(*reader).Value(0xc0001bd7a0, 0xe, 0xe, 0x0, 0x0, 0xd, 0x3)
        /go/src/github.com/yuin/goldmark/text/reader.go:106 +0x62
github.com/yuin/goldmark/extension.(*footnoteBlockParser).Open(0x7e2060, 0x689480, 0xc000125f40, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880, 0x1, 0xc000125f40, 0x8)
        /go/src/github.com/yuin/goldmark/extension/footnote.go:55 +0x294
github.com/yuin/goldmark/parser.(*parser).openBlocks(0xc0001d4000, 0x689480, 0xc000125f40, 0x0, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880, 0x2)
        /go/src/github.com/yuin/goldmark/parser/parser.go:908 +0x481
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001d4000, 0x688040, 0xc00001ef80, 0x687780, 0xc0001bd7a0, 0x687840, 0xc0001bd880)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1008 +0x218
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001d4000, 0x687780, 0xc0001bd7a0, 0x0, 0x0, 0x0, 0x30, 0x632200)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc0001c88c0, 0x7f2e89c6c000, 0xd, 0xd, 0x6849e0, 0xc00007d9b0, 0x0, 0x0, 0x0, 0x24f90bed, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7f2e89c6c000, 0xd, 0xd, 0x3)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:23 +0x269
go-fuzz-dep.Main(0xc00026ff48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
exit status 2

HTML comments can break the processing

https://github.com/mironovalexey/gm-test/tree/master/hcomments

https://github.com/mironovalexey/gm-test/blob/master/hcomments/test.md

Any line between --- and --- breaks the processing.

Fuzz crash on ">*\t>\n> \t0\n>\t\t0\n>0"

Please answer the following before submitting your issue:

What version of goldmark are you using? : v1.1.9
What version of Go are you using? : go1.13.4
What operating system and processor architecture are you using? : linux/amd64
What did you do? : make fuzz
What did you expect to see? : boredom
What did you see instead? :

sh$ cat fuzz/crashers/db13717bee8cb87337140ed44b4f9bc01214e3fb.quoted
        ">*\t>\n> \t0\n>\t\t0\n>0"

sh$ cat fuzz/crashers/db13717bee8cb87337140ed44b4f9bc01214e3fb.output
panic: interface conversion: ast.Node is *ast.CodeBlock, not *ast.ListItem

goroutine 1 [running]:
github.com/yuin/goldmark/parser.lastOffset(0x688820, 0xc000190630, 0x1)
        /go/src/github.com/yuin/goldmark/parser/list.go:102 +0xfd
github.com/yuin/goldmark/parser.(*listParser).Continue(0x7e2060, 0x688820, 0xc000190630, 0x687780, 0xc0001e97a0, 0x687840, 0xc0001e9880, 0xa)
        /go/src/github.com/yuin/goldmark/parser/list.go:192 +0x24e
github.com/yuin/goldmark/parser.(*parser).parseBlocks(0xc0001f2000, 0x688040, 0xc00001ec00, 0x687780, 0xc0001e97a0, 0x687840, 0xc0001e9880)
        /go/src/github.com/yuin/goldmark/parser/parser.go:1032 +0x558
github.com/yuin/goldmark/parser.(*parser).Parse(0xc0001f2000, 0x687780, 0xc0001e97a0, 0x0, 0x0, 0x0, 0x30, 0x632200)
        /go/src/github.com/yuin/goldmark/parser/parser.go:818 +0x1eb
github.com/yuin/goldmark.(*markdown).Convert(0xc000078b00, 0x7ff83a71d000, 0x11, 0x11, 0x6849e0, 0xc0000959b0, 0x0, 0x0, 0x0, 0x3a2334ec, ...)
        /go/src/github.com/yuin/goldmark/markdown.go:116 +0xac
github.com/yuin/goldmark/fuzz.Fuzz(0x7ff83a71d000, 0x11, 0x11, 0x3)
        /go/src/github.com/yuin/goldmark/fuzz/fuzz.go:23 +0x269
go-fuzz-dep.Main(0xc00029bf48, 0x1, 0x1)
        go-fuzz-dep/main.go:36 +0x1ad
main.main()
        github.com/yuin/goldmark/fuzz/go.fuzz.main/main.go:15 +0x52
exit status 2
