python-markdown / markdown
A Python implementation of John Gruber’s Markdown with Extension support.
Home Page: https://python-markdown.github.io/
License: BSD 3-Clause "New" or "Revised" License
If an extension (e.g. inlinepattern) adds a comment to the tree, then markdown fails with an exception (TypeError) while subsequently running the PrettifyTreeprocessor.
The offending code is in treeprocessors.py in the _prettifyETree(self, elem) method.
The code attempts to verify that the comment is block level by calling markdown.isBlockLevel(e.tag) on the comment. However e.tag evaluates to a function for a comment and not to a string that the isBlockLevel function is expecting, causing the TypeError exception to be raised.
The code might need to explicitly check for comments and ignore them in this processor.
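A minimal sketch of such a guard, written against the standard library's ElementTree rather than Python-Markdown's own code. The tag set and the helper name `is_block_level` are illustrative assumptions; the point is only that a comment node's `tag` is the `Comment` factory function, not a string.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of block-level tags (an assumption, not the library's list).
BLOCK_LEVEL_TAGS = {"p", "div", "ul", "ol", "li", "blockquote", "pre", "table"}


def is_block_level(tag):
    """Return True only for string tags that name a block-level element."""
    # Comments and processing instructions carry a function as their tag,
    # so anything that is not a string is treated as "not block level".
    if not isinstance(tag, str):
        return False
    return tag.lower() in BLOCK_LEVEL_TAGS


root = ET.Element("div")
root.append(ET.Comment("added by an extension"))
results = [is_block_level(child.tag) for child in root]
```

With a check like this, the prettifier could skip comment nodes instead of raising a TypeError.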
Any backslashes \ get removed during markdown processing. Backslashes in code blocks are displayed properly, but those in the markdown body are not.
For example, if you parse "C:\Program Files" the markdown processor will return "C:Program Files".
I'm trying to get syntax highlighting to work with fenced code, but it's not cooperating.
Here are the contents of the input file, codetest.md:
~~~~{.r}
# r code
c(1,2,3)
~~~~
And here's what happens when I try to run it. Apparently there's some sort of problem with pygments? This is running on Ubuntu 11.10, python 2.7.2.
Traceback (most recent call last):
File "/home/winston/.local/bin/markdown_py", line 5, in <module>
pkg_resources.run_script('Markdown==2.1.0', 'markdown_py')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 467, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1200, in run_script
execfile(script_filename, namespace, namespace)
File "/home/winston/.local/lib/python2.7/site-packages/Markdown-2.1.0-py2.7.egg/EGG-INFO/scripts/markdown_py", line 34, in <module>
run()
File "/home/winston/.local/lib/python2.7/site-packages/Markdown-2.1.0-py2.7.egg/markdown/__main__.py", line 81, in run
markdown.markdownFromFile(**options)
File "/home/winston/.local/lib/python2.7/site-packages/Markdown-2.1.0-py2.7.egg/markdown/__init__.py", line 416, in markdownFromFile
kwargs.get('encoding', None))
File "/home/winston/.local/lib/python2.7/site-packages/Markdown-2.1.0-py2.7.egg/markdown/__init__.py", line 346, in convertFile
html = self.convert(text)
File "/home/winston/.local/lib/python2.7/site-packages/Markdown-2.1.0-py2.7.egg/markdown/__init__.py", line 280, in convert
self.lines = prep.run(self.lines)
File "/home/winston/.local/lib/python2.7/site-packages/Markdown-2.1.0-py2.7.egg/markdown/extensions/fenced_code.py", line 128, in run
code = highliter.hilite()
File "/home/winston/.local/lib/python2.7/site-packages/Markdown-2.1.0-py2.7.egg/markdown/extensions/codehilite.py", line 99, in hilite
noclasses=self.noclasses)
File "/usr/lib/python2.7/dist-packages/pygments/formatters/html.py", line 347, in __init__
self.noclasses = get_bool_opt(options, 'noclasses', False)
File "/usr/lib/python2.7/dist-packages/pygments/util.py", line 58, in get_bool_opt
string, optname))
pygments.util.OptionError: Invalid type [False, 'Use inline styles instead of CSS classes - Default false'] for option noclasses; use 1/0, yes/no, true/false, on/off
It works fine if I use just codehilite (with indented code), or if I use fenced_code without codehilite.
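The OptionError message suggests that a whole `[value, description]` config entry reached Pygments where only the value was expected. A hedged sketch of unwrapping such an entry before passing it on; the helper name `config_value` is hypothetical and this is not the library's actual fix.

```python
def config_value(entry):
    """Return the value part of a [value, description] config entry."""
    # Assumption: config entries are two-item lists or tuples of
    # (value, human-readable description); plain values pass through.
    if isinstance(entry, (list, tuple)) and len(entry) == 2:
        return entry[0]
    return entry


# The entry shown in the traceback above.
noclasses_entry = [False, "Use inline styles instead of CSS classes - Default false"]
noclasses = config_value(noclasses_entry)
```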
This issue is copied from Ticket 64 of our old bug tracker. It has been copied as-is:
Nested lists do not nest. I've tried:
* Item 1
  * Item A
  * Item B
I get a flat list.
Tried it here too:
http://babelmark.bobtfish.net/?markdown=*+Item+1%0D%0A++*+Item+A%0D%0A++*+Item+B&compare=on&src=4&dest=4
Comments
By Waylan 7/1/10
Actually nested lists work fine when you indent with 4 spaces (I changed the title to better fit the actual situation). I realize that the Perl implementation works with 2 spaces of indent, but the fact is the syntax rules make no mention of nested lists whatsoever, and all other types of blocks require 4 spaces, so Python-Markdown is consistent and requires 4 spaces for all types of nested content in lists (at least the first line of each block must be indented 4 spaces).
Unless someone can convince me otherwise, I'm considering this a bug in the perl implementation (and all other implementations that have copied its behavior). This will be marked wontfix in a few days. Please take any discussions on the matter to the mailing list.
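For documents already written with 2-space nesting, one possible workaround is to double the indentation of nested list items before conversion. This is only a sketch under the assumption that the whole document uses 2-space nesting consistently; the function name is hypothetical and it does not touch continuation lines.

```python
import re

# Leading spaces followed by a list marker (*, +, - or "1.") and a space.
NESTED_ITEM_RE = re.compile(r"^( +)(?:[*+-]|\d+\.) ")


def reindent_nested_lists(text):
    """Double the leading indentation of indented list items."""
    lines = []
    for line in text.split("\n"):
        match = NESTED_ITEM_RE.match(line)
        if match:
            indent = match.group(1)
            line = " " * (2 * len(indent)) + line[len(indent):]
        lines.append(line)
    return "\n".join(lines)


source = "* Item 1\n  * Item A\n  * Item B"
converted = reindent_nested_lists(source)
```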
This issue is a copy of Ticket 8 in our old bug reporting system. The text has been copied as-is:
This was recently brought up on the Markdown discussion list: there are various characters that are allowed in email addresses that none of the markdown implementations support. Interestingly, Python-Markdown appears to support the most at this time, but there is room for improvement. Perhaps we should add a test case, as all the following addresses are valid:
<[email protected]> <[email protected]> <[email protected]> <[email protected]> <abc+mailbox/[email protected]> <!#$%&'*+-/=?^_`.{|}[email protected]> (all of these characters are allowed) <"abc@def"@example.com> (anything goes inside quotation marks) <"Fred Bloggs"@example.com>
It appears to me that we only have issues with the last three. Although the second to last one may be right. I'm not sure how we should treat the quotes. The examples come from Wikipedia.
When using the nl2br extension I have found the following bug. If you create a markdown list and then try to write a snippet of code, the code does not get put into <pre> or <code> tags.
I believe the cause of this to be that the nl2br is not escaping out of the list causing the following syntax.
<ul>
<li>
<p>Helloworld</p>
<p>My code snippet</p>
</li>
</ul>
According to http://www.freewisdom.org/projects/python-markdown/Using_as_a_Module , I should be able to pass in extension arguments as follows:
import markdown
md = markdown.Markdown(extensions=['toc'], extension_configs= {'toc' : ('anchorlink', True)},)
However, this produces the following traceback:
ValueError Traceback (most recent call last)
/home/wilfred/bleeding_edge/Python-Markdown/<ipython-input-19-31b461a39945> in <module>()
----> 1 md = markdown.Markdown(extensions=['toc'], extension_configs= {'toc' : ('anchorlink', True)},)
/home/wilfred/bleeding_edge/Python-Markdown/markdown/__init__.py in __init__(self, *args, **kwargs)
132 self.htmlStash = util.HtmlStash()
133 self.registerExtensions(extensions=kwargs.get('extensions', []),
--> 134 configs=kwargs.get('extension_configs', {}))
135 self.set_output_format(kwargs.get('output_format', 'xhtml1'))
136 self.reset()
/home/wilfred/bleeding_edge/Python-Markdown/markdown/__init__.py in registerExtensions(self, extensions, configs)
158 for ext in extensions:
159 if isinstance(ext, basestring):
--> 160 ext = self.build_extension(ext, configs.get(ext, []))
161 if isinstance(ext, Extension):
162 # might raise NotImplementedError, but that's the extension author's problem
/home/wilfred/bleeding_edge/Python-Markdown/markdown/__init__.py in build_extension(self, ext_name, configs)
177
178 # Parse extensions config params (ignore the order)
--> 179 configs = dict(configs)
180 pos = ext_name.find("(") # find the first "("
181 if pos > 0:
ValueError: dictionary update sequence element #0 has length 10; 2 is required
The call that's dying is dict(('anchorlink', True)). Would it make sense for extension_configs to be a dict of dicts? e.g.:
markdown.Markdown(extensions=['toc'], extension_configs= {'toc' : {'anchorlink': True}},)
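To illustrate why the `dict(...)` call fails and how both shapes could be accepted, here is a small sketch. The helper name `normalize_config` is hypothetical and not part of Python-Markdown; it simply accepts a dict, a sequence of pairs, or a single `(key, value)` pair.

```python
def normalize_config(configs):
    """Turn the accepted config shapes into a plain dict."""
    if isinstance(configs, dict):
        return dict(configs)
    # A bare ('key', value) pair: dict() would try to unpack the string
    # 'key' itself as a (key, value) pair and raise ValueError.
    if (isinstance(configs, tuple) and len(configs) == 2
            and isinstance(configs[0], str)):
        return {configs[0]: configs[1]}
    # Otherwise assume a sequence of (key, value) pairs.
    return dict(configs)


from_pair = normalize_config(("anchorlink", True))
from_pairs = normalize_config([("anchorlink", True)])
from_dict = normalize_config({"anchorlink": True})
```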
The alternative syntax works fine:
markdown.Markdown(extensions=['toc(anchorlink=1)'])
although according to http://www.freewisdom.org/projects/python-markdown/Table_of_Contents I should be able to pass booleans:
md = markdown.Markdown(extensions=['toc(anchorlink=True)'])
but this dies slightly later (I'm not 100% sure if this is the same issue, apologies if not) with:
In [33]: md = markdown.Markdown(extensions=['toc(anchorlink=True)'])
In [34]: md.convert("[TOC]\n\n# foo\n\n## bar")
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/home/wilfred/bleeding_edge/Python-Markdown/<ipython-input-34-aafe6026c2af> in <module>()
----> 1 md.convert("[TOC]\n\n# foo\n\n## bar")
/home/wilfred/bleeding_edge/Python-Markdown/markdown/__init__.pyc in convert(self, source)
285 # Run the tree-processors
286 for treeprocessor in self.treeprocessors.values():
--> 287 newRoot = treeprocessor.run(root)
288 if newRoot:
289 root = newRoot
/home/wilfred/bleeding_edge/Python-Markdown/markdown/extensions/toc.pyc in run(self, doc)
102 link.attrib["href"] = '#' + id
103
--> 104 if int(self.config["anchorlink"]):
105 anchor = etree.Element("a")
106 anchor.text = c.text
ValueError: invalid literal for int() with base 10: 'True'
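The last traceback comes from `int('True')`. A tolerant boolean parser would accept both spellings; the sketch below uses a hypothetical helper name, and the list of accepted words is an assumption.

```python
def parse_bool(value):
    """Interpret common textual spellings of a boolean option."""
    if isinstance(value, str):
        text = value.strip().lower()
        if text in ("1", "true", "yes", "on"):
            return True
        if text in ("0", "false", "no", "off", ""):
            return False
        raise ValueError("not a boolean: %r" % value)
    # Non-string values (bool, int) fall back to normal truthiness.
    return bool(value)
```

A check such as `if parse_bool(self.config["anchorlink"]):` would then work for `anchorlink=1` and `anchorlink=True` alike.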
This issue has been copied from Ticket 84 of our old bug tracker and was reported by "Eugen" (no contact info provided). It has been copied as-is:
Suppose you have got this markdown structure:
[TOC]
...
# [Wikipedia](http://en.wikipedia.org/)
...
I.e., one of the headers is a link to a website.
Then, the generated TOC link to the Section "Wikipedia" does not link to the section inside the document but also to "http://en.wikipedia.org/". I don't know if this is intended behaviour but I think it is irritating.
This may in fact be what the "spec" (Markdown.pl) tells us is correct behaviour, but it strikes me as odd:
>>> import markdown
>>> markdown.version
'2.0.3'
>>> print markdown.markdown('* * *')
<ul>
<li>
<ul>
<li>*</li>
</ul>
</li>
</ul>
Perhaps a more sensible translation would involve a single list item (containing two asterisks) inside a single unordered list.
I'm of the opinion that defining correct translations for "invalid" input is just as important as defining correct translations for valid Markdown.
HeaderId extension doesn't work with Setext-style headers. The following code:
import markdown
text = """
Setext-style header {#id1}
===================
Setext-style header {#id2}
-------------------
### Atx-style header ### {#id3}
"""
print markdown.markdown(text, ['extra'])
will output:
<h1>Setext-style header {#id1}</h1>
<h2>Setext-style header {#id2}</h2>
<h3 id="id3">Atx-style header</h3>
but it should output:
<h1 id="id1">Setext-style header</h1>
<h2 id="id2">Setext-style header</h2>
<h3 id="id3">Atx-style header</h3>
Version used is 2.0.3.
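Since the `{#id}` suffix is independent of the header style, the id could be split off the header text with the same pattern for Setext and Atx headers. A minimal sketch, assuming the id consists of word characters, hyphens, colons and dots; the names are illustrative, not the extension's own.

```python
import re

# Header text followed by a trailing {#some-id} marker.
HEADER_ID_RE = re.compile(r"^(?P<text>.*?)[ \t]*\{#(?P<id>[-\w:.]+)\}[ \t]*$")


def split_header_id(line):
    """Return (text, id) for a header line, with id None if absent."""
    match = HEADER_ID_RE.match(line)
    if match:
        return match.group("text"), match.group("id")
    return line, None


with_id = split_header_id("Setext-style header {#id1}")
without_id = split_header_id("Setext-style header")
```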
PHP Markdown Extra includes modifications to underscore emphasis, but Python Markdown doesn't support them.
According to the Markdown Extra syntax, underscores in the middle of a word don't generate an emphasis. There are 2 cases:
With Markdown Extra, the following:
The file name is "my__text__file.txt".
will be displayed as the following:
<p>The file name is "my__text__file.txt".</p>
However, the following Python Markdown code:
import markdown
text = "The file name is \"my__text__file.txt\"."
print markdown.markdown(text, ['extra'])
will output:
<p>The file name is "my<strong>text</strong>file.txt".</p>
Setting the constant SMART_EMPHASIS to False is useful for following the official Markdown syntax. However, this constant should affect only standard Markdown, yet it also affects Markdown Extra. According to the Markdown Extra syntax, the following:
The file name is "my_text_file.txt".
will be displayed as the following:
<p>The file name is "my_text_file.txt".</p>
However, the following Python Markdown code (with SMART_EMPHASIS = False):
import markdown
text = "The file name is \"my_text_file.txt\"."
print markdown.markdown(text, ['extra'])
will output:
<p>The file name is "my<em>text</em>file.txt".</p>
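The Markdown Extra rule described above (underscores inside a word do not open or close emphasis) can be approximated with lookarounds. This is a simplified sketch for single-underscore emphasis only, not the pattern used by either implementation.

```python
import re

# An underscore opens emphasis only if it is not preceded by a word
# character, and closes it only if it is not followed by one.
SMART_EM_RE = re.compile(r"(?<!\w)_(?!\s)(.+?)(?<!\s)_(?!\w)")


def smart_emphasis(text):
    """Wrap _word_ spans in <em>, ignoring intra-word underscores."""
    return SMART_EM_RE.sub(r"<em>\1</em>", text)


file_name = smart_emphasis('The file name is "my_text_file.txt".')
sentence = smart_emphasis("Lorem _a_ ipsum.")
```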
If there's an invalid URL following a change in indentation of the form:
### bar
# foo
[a][(b)
I get the following crash:
In [1]: from markdown import markdown
In [2]: s = """### bar
...:
...: # foo
...:
...: [a][(b)"""
In [3]: markdown(s, extensions=['toc'])
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/home/wilfred/work/potatopedia/<ipython-input-3-5be88cfe066b> in <module>()
----> 1 markdown(s, extensions=['toc'])
/home/wilfred/work/potatopedia/markdown/__init__.py in markdown(text, *args, **kwargs)
375 """
376 md = Markdown(*args, **kwargs)
--> 377 return md.convert(text)
378
379
/home/wilfred/work/potatopedia/markdown/__init__.py in convert(self, source)
281 # Run the tree-processors
282 for treeprocessor in self.treeprocessors.values():
--> 283 newRoot = treeprocessor.run(root)
284 if newRoot:
285 root = newRoot
/home/wilfred/work/potatopedia/markdown/extensions/toc.py in run(self, doc)
106 c.append(anchor)
107
--> 108 list_stack[-1].append(last_li)
109
110 class TocExtension(markdown.Extension):
IndexError: list index out of range
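One possible reading of the traceback is that the list stack is emptied when a later header (`# foo`) has a lower level than the first one (`### bar`). Under that assumption, a nesting routine that never pops the root avoids the IndexError. The data structures here are hypothetical and much simpler than the extension's ElementTree code.

```python
def build_toc(headers):
    """Nest (level, title) pairs without ever popping the root entry."""
    root = {"level": 0, "title": None, "children": []}
    stack = [root]
    for level, title in headers:
        node = {"level": level, "title": title, "children": []}
        # Climb back up, but always keep the root on the stack.
        while len(stack) > 1 and stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


toc = build_toc([(3, "bar"), (1, "foo")])
titles = [entry["title"] for entry in toc]
```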
Many thanks for Python-Markdown.
When a link is written like this:
[](http://example.com/)
the HTML is generated as:
<a href="http://example.com/" />
By way of comparison GitHub generates it as:
<a href="http://example.com/"></a>
And renders it as:
Interestingly, WebKit seems to handle the <a href="" /> form really badly in the rendering and element inspector. It leaves the link open to the start of the next anchor tag and adds multiple extra tags in the inspector.
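For comparison, the standard library's ElementTree (Python 3.4 and later) can be asked not to collapse empty elements. This sketch uses the stdlib serializer, which is an assumption here since Python-Markdown ships its own serializer.

```python
import xml.etree.ElementTree as ET

link = ET.Element("a", {"href": "http://example.com/"})

# Default serialization collapses the empty element.
collapsed = ET.tostring(link, encoding="unicode")

# short_empty_elements=False forces an explicit closing tag.
expanded = ET.tostring(link, encoding="unicode", short_empty_elements=False)
```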
The current footnotes implementation allows no more than about 1000 footnotes per document. With a few more, you get a very long traceback ending with:
(… the line below repeated many many times …)
File "/usr/lib/pymodules/python2.6/markdown/extensions/footnotes.py", line 177, in _handleFootnoteDefinitions
more_plain = self._handleFootnoteDefinitions(theRest)
File "/usr/lib/pymodules/python2.6/markdown/extensions/footnotes.py", line 176, in _handleFootnoteDefinitions
+ "\n".join(detabbed))
File "/usr/lib/pymodules/python2.6/markdown/extensions/footnotes.py", line 98, in setFootnote
self.footnotes[id] = text
File "/usr/lib/pymodules/python2.6/markdown/odict.py", line 31, in __setitem__
super(OrderedDict, self).__setitem__(key, value)
RuntimeError: maximum recursion depth exceeded while calling a Python object
Save the script below as foot_demo.py:
COUNT = 990 # On my machine this is the smallest value which breaks, your values may change
for i in xrange(1, COUNT):
    print "Something[^%d]\n" % i
for i in xrange(1, COUNT):
    print "[^%d]: Another thing\n" % i
then run:
python foot_demo.py > foot_demo.txt
markdown -x footnotes foot_demo.txt
The _handleFootnoteDefinitions method finds the first footnote definition, then calls itself recursively on the remaining text. This requires a very deep stack when there are many footnotes, and it is also fairly inefficient.
Just rewrite the method so it loops instead of recursing.
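A rough sketch of the loop-based approach, written for Python 3. It is deliberately simplified: it handles only single-line definitions, and the regex and function name are assumptions rather than the extension's real code.

```python
import re

# A footnote definition such as "[^1]: Another thing".
DEF_RE = re.compile(r"^[ ]{0,3}\[\^([^\]]+)\]:[ \t]*(.*)$")


def extract_footnotes(text):
    """Collect footnote definitions in one pass, without recursion."""
    footnotes = {}
    plain = []
    for line in text.split("\n"):
        match = DEF_RE.match(line)
        if match:
            footnotes[match.group(1)] = match.group(2)
        else:
            plain.append(line)
    return "\n".join(plain), footnotes


# Far more definitions than the default recursion limit would allow.
source = "\n".join("[^%d]: Another thing\n" % i for i in range(1, 5001))
remaining, found = extract_footnotes(source)
```

Because the work is done in a single loop, the number of footnotes is bounded by memory rather than by the interpreter's recursion limit.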
PS If original author has no time to work on the code, I can try
working on the patch – but I am not sure whether I understand all
ideas in the code properly.
See this discussion on the mailing list for details.
the code →→code (where → is a tab) is expanded into this:
code
why aren’t my tabs retained?
Say the following Python Markdown code (tested with the branch on github):
import markdown
text = "Lorem _a_ ipsum."
print markdown.markdown(text)
It will output:
<p>Lorem _a_ ipsum.</p>
while it should generate an emphasis, as the following:
<p>Lorem <em>a</em> ipsum.</p>
In Markdown, we can create a horizontal rule with 3 or more proper symbols (hyphens, underscores or asterisks) with 2 spaces maximum between each symbol. The following examples produce 3 horizontal rules with both Markdown.pl and PHP Markdown:
- -- -
** * **
_ _ _
but Python Markdown doesn't create any horizontal rule:
<ul>
<li>-- -</li>
</ul>
<p><strong> * </strong></p>
<p>_ _ _</p>
For information, the regex that I use to highlight a horizontal rule in gedit is the following:
^[ ]{0,3} # Maximum 3 spaces at the beginning of the line.
(
(-+[ ]{0,2}){3,} | # 3 or more hyphens, with 2 spaces maximum between each hyphen.
(_+[ ]{0,2}){3,} | # Idem, but with underscores.
(\*+[ ]{0,2}){3,} # Idem, but with asterisks.
)
[ \t]*$ # Optional trailing spaces or tabs.
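The same regex can be tried directly in Python with `re.VERBOSE`, which keeps the comments in place. This is a direct transcription of the gedit pattern above, not the pattern Python-Markdown uses; note that nested quantifiers like these may backtrack heavily on long non-matching lines.

```python
import re

HR_RE = re.compile(r"""
    ^[ ]{0,3}                # Maximum 3 spaces at the beginning of the line.
    (
        (-+[ ]{0,2}){3,}  |  # 3 or more hyphens, at most 2 spaces between.
        (_+[ ]{0,2}){3,}  |  # Same, with underscores.
        (\*+[ ]{0,2}){3,}    # Same, with asterisks.
    )
    [ \t]*$                  # Optional trailing spaces or tabs.
""", re.VERBOSE)


def is_horizontal_rule(line):
    """Return True if the line matches the horizontal-rule pattern."""
    return bool(HR_RE.match(line))


checks = [is_horizontal_rule(s) for s in ("- -- -", "** * **", "_ _ _")]
```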
The Header Id extension doesn't work properly on "underlined" headers
Example:
Header 1 {#header1}
========
Header 2 {#header2}
--------
Becomes:
<h1>Header 1 {#header1}</h1>
<h2>Header 2 {#header2}</h2>
Expected:
<h1 id="header1">Header 1</h1>
<h2 id="header2">Header 2</h2>
A quick hack to support non-ascii headings :
=== diff -u toc.py.old toc.py >>> ===
--- toc.py.old 2011-11-20 20:03:03.000000000 +1100
+++ toc.py 2011-11-20 20:04:22.000000000 +1100
@@ -76,7 +76,7 @@
# Do not override pre-existing ids
if not "id" in c.attrib:
id = self.config["slugify"][0](c.text)
- if id in used_ids:
+ if ( id == '' ) or ( id in used_ids ):
ctr = 1
while "%s_%d" % (id, ctr) in used_ids:
ctr += 1
=== <<< ===
Basically the slugify() method makes an empty slug for non-ascii headings, so we then name them "_%d" % (heading_occurrence).
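The logic of the patch, extracted into a standalone function for illustration. The function name and the use of a set for `used_ids` are assumptions; the counter loop mirrors the diff.

```python
def unique_id(slug, used_ids):
    """Return a unique id, numbering empty or already-used slugs."""
    if slug == "" or slug in used_ids:
        ctr = 1
        while "%s_%d" % (slug, ctr) in used_ids:
            ctr += 1
        slug = "%s_%d" % (slug, ctr)
    used_ids.add(slug)
    return slug


used = set()
# Two empty slugs stand in for two non-ascii headings.
ids = [unique_id(s, used) for s in ["intro", "", "", "intro"]]
```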
Create a file m.py with the following content:
import markdown
text = """
Lorem ipsum | Lorem ipsum | Lorem ipsum
----------------------------------- | :---------: | :---------:
Lorem ipsum dolor sit amet inceptos | Lorem ipsum | Lorem ipsum
"""
print markdown.markdown(text)
Run the file:
$ python m.py
Actual result: Python-Markdown freezes. If we interrupt the process, we have the following output:
$ python m.py
^CTraceback (most recent call last):
File "m.py", line 9, in <module>
print markdown.markdown(text)
File "/home/nom/.local/lib/python2.7/site-packages/markdown/__init__.py", line 386, in markdown
return md.convert(text)
File "/home/nom/.local/lib/python2.7/site-packages/markdown/__init__.py", line 283, in convert
root = self.parser.parseDocument(self.lines).getroot()
File "/home/nom/.local/lib/python2.7/site-packages/markdown/blockparser.py", line 62, in parseDocument
self.parseChunk(self.root, '\n'.join(lines))
File "/home/nom/.local/lib/python2.7/site-packages/markdown/blockparser.py", line 77, in parseChunk
self.parseBlocks(parent, text.split('\n\n'))
File "/home/nom/.local/lib/python2.7/site-packages/markdown/blockparser.py", line 93, in parseBlocks
if processor.test(parent, blocks[0]):
File "/home/nom/.local/lib/python2.7/site-packages/markdown/blockprocessors.py", line 470, in test
return bool(self.SEARCH_RE.search(block))
KeyboardInterrupt
$
Expected result: no freeze. Both PHP Markdown (no Extra) and Markdown.pl output the following:
<p>Lorem ipsum | Lorem ipsum | Lorem ipsum
----------------------------------- | :---------: | :---------:
Lorem ipsum dolor sit amet inceptos | Lorem ipsum | Lorem ipsum</p>
More information:
The bug doesn't occur with extension Extra:
print markdown.markdown(text, extensions=['extra'])
Tested with waylan-Python-Markdown-2.1.0.beta-0-ge8cdb0b.zip.
Create a file n.py with the following content:
import markdown
text = """
Lorem ipsum
Lorem ipsum dolor sit amet inceptos | Lorem ipsum | Lorem ipsum
----------------------------------- | :---------: | :---------:
Lorem ipsum | Lorem ipsum | Lorem ipsum
"""
print markdown.markdown(text, extensions=['extra'])
Run the file:
$ python n.py
Actual result: Python-Markdown freezes. If we interrupt the process, we have the following output:
$ python n.py
^CTraceback (most recent call last):
File "n.py", line 10, in <module>
print markdown.markdown(text, extensions=['extra'])
File "/home/nom/.local/lib/python2.7/site-packages/markdown/__init__.py", line 386, in markdown
return md.convert(text)
File "/home/nom/.local/lib/python2.7/site-packages/markdown/__init__.py", line 283, in convert
root = self.parser.parseDocument(self.lines).getroot()
File "/home/nom/.local/lib/python2.7/site-packages/markdown/blockparser.py", line 62, in parseDocument
self.parseChunk(self.root, '\n'.join(lines))
File "/home/nom/.local/lib/python2.7/site-packages/markdown/blockparser.py", line 77, in parseChunk
self.parseBlocks(parent, text.split('\n\n'))
File "/home/nom/.local/lib/python2.7/site-packages/markdown/blockparser.py", line 93, in parseBlocks
if processor.test(parent, blocks[0]):
File "/home/nom/.local/lib/python2.7/site-packages/markdown/blockprocessors.py", line 470, in test
return bool(self.SEARCH_RE.search(block))
KeyboardInterrupt
$
Expected result: no freeze.
More information:
The following code doesn't cause Python-Markdown to freeze:
import markdown
text = """
Lorem ipsum
Lorem ipsum dolor sit amet inceptos | Lorem ipsum | Lorem ipsum
----------------------------------- | :---------: | :---------:
Lorem ipsum | Lorem ipsum | Lorem ipsum
"""
print markdown.markdown(text, extensions=['extra'])
Tested with waylan-Python-Markdown-2.1.0.beta-0-ge8cdb0b.zip.
Ran into this one when upgrading from 1.7 to 2.1 ... when using Setext-style headings, you now need a minimum of 3x = or - characters to make the above line into a heading.
Heading doesn't work
--
Heading does work
---
Code to reproduce:
>>> import markdown
>>> m = markdown.Markdown()
>>> m.convert("Heading doesn't work\n--\n")
u'<p>Heading doesn't work\n--</p>'
>>> m.convert("Heading does work\n---\n")
u'<h2>Heading does work</h2>'
According to the spec, "Any number of underlining =’s or -’s will work." Perl and Showdown both support two characters.
The offending regex is markdown/blockprocessors.py#L434.
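A sketch of a Setext pattern that accepts any number of underline characters, as the spec wording suggests. It is an illustration only and not the regex at the line referenced above.

```python
import re

# One line of text, then a line made only of '=' or only of '-'.
SETEXT_RE = re.compile(r"^[^\n]+\n(=+|-+)[ ]*(?:\n|$)")


def setext_level(block):
    """Return 1 for '=' underlines, 2 for '-' underlines, else None."""
    match = SETEXT_RE.match(block)
    if not match:
        return None
    return 1 if match.group(1).startswith("=") else 2


two_dashes = setext_level("Heading doesn't work\n--\n")
three_dashes = setext_level("Heading does work\n---\n")
```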
As per this discussion on the markdown list, we should support fenced code blocks inside lists and blockquotes. Currently, they only work at the document root.
While we're at it, we might add support for github's syntax as an alternative??
Example:
[a b c](/a_b_c)
Becomes:
<p><a href="/a�klzzwxh:0000�b�klzzwxh:0001�c">a b c</a></p>
I'm getting a NameError when trying to parse an email address like this:
markdown.markdown("<[email protected]>")
Here's the complete traceback:
>>> import markdown
>>> markdown.markdown("<[email protected]>")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.2/dist-packages/markdown/__init__.py", line 598, in markdown
return md.convert(text)
File "/usr/local/lib/python3.2/dist-packages/markdown/__init__.py", line 395, in convert
newRoot = treeprocessor.run(root)
File "/usr/local/lib/python3.2/dist-packages/markdown/treeprocessors.py", line 271, in run
text), child)
File "/usr/local/lib/python3.2/dist-packages/markdown/treeprocessors.py", line 95, in __handleInline
data, patternIndex, startIndex)
File "/usr/local/lib/python3.2/dist-packages/markdown/treeprocessors.py", line 219, in __applyPattern
node = pattern.handleMatch(match)
File "/usr/local/lib/python3.2/dist-packages/markdown/inlinepatterns.py", line 363, in handleMatch
letters = [codepoint2name(ord(letter)) for letter in email]
File "/usr/local/lib/python3.2/dist-packages/markdown/inlinepatterns.py", line 363, in <listcomp>
letters = [codepoint2name(ord(letter)) for letter in email]
File "/usr/local/lib/python3.2/dist-packages/markdown/inlinepatterns.py", line 357, in codepoint2name
entity = html.entities.codepoint2name.get(code)
NameError: global name 'html' is not defined
Links of the same format, <url>, work well.
Hope this helps to solve the problem. In the meantime, any workarounds would be appreciated.
This is Python Markdown 2.0.3 running on Python 3.2
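The NameError indicates that `html` was never imported in that module on Python 3. A minimal sketch of the same lookup with an explicit import; the function name and the numeric fallback are assumptions based on the traceback, not the library's exact code.

```python
from html.entities import codepoint2name


def codepoint_to_entity(code):
    """Return a named HTML entity if one exists, else a numeric one."""
    name = codepoint2name.get(code)
    if name:
        return "&%s;" % name
    return "&#%d;" % code


ampersand = codepoint_to_entity(ord("&"))
letter = codepoint_to_entity(ord("a"))
```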
This is a feature request. It'd be nice if there was a built-in (batteries included) extension to implement SmartyPants quoting by turning on a simple extension.
I notice that someone is already using SmartyPants with Markdown for Python, though not as an extension:
http://byrneswoder.com/blog/one-secret-to-generating-clean-html-from-text/
While it's nice to have safe mode, as a website owner, you sometimes want to limit the abilities of users even more. For example, on blog articles posted by users, you want all the markdown options, but in comments, you only want links, bold and italic text, and inline code snippets. A way to do that would be nice.
What I propose, is that you can use the markdown function like this:
markdown("**bold** and _italic_",allowed_tags=['b','a','i','u'],safe_mode='escape')
In docs/README and docs/README.html, the link to the bug tracker is outdated. It's http://www.freewisdom.org/projects/python-markdown/Tickets instead of https://github.com/waylan/Python-Markdown/issues/.
It would be great to see markdown-urlize as part of the included extensions, since it's very common functionality to add to markdown. Would this be possible? The author seems up for it.
PHP Markdown allows one to use markdown inside HTML blocks - simply by adding markdown=1 attribute to appropriate HTML block.
It would be nice if python markdown also allowed for such a feature.
I faced the problem after migrating my blog from PHP to Python but seems I am not the only one. It is useful in cases like
<div class="blahblah" markdown="1">
Some *normal* markdown [text][] here.
</div>
or
<blockquote markdown="1">
Some *markdown text*
<pre name="code" class="python">
# code block which is to stay inside blockquote
</pre>
Yet another *markdown text*
</blockquote>
It's not possible to get a literal [TOC] in the output when using the (excellent) table of contents extension. It would be nice to be able to do
`[TOC]`
or
[TOC]
and just get the literal [TOC] in the output.
Say the following code:
import markdown
text = """
Paragraphe
<img src="/exemple1.png" alt="Texte alternatif" />
Paragraphe
"""
print markdown.markdown(text)
Python Markdown doesn't put the image into a paragraph:
<p>Paragraphe</p>
<img src="/exemple1.png" alt="Texte alternatif" />
<p>Paragraphe</p>
while PHP Markdown and Markdown.pl do:
<p>Paragraphe</p>
<p><img src="/exemple1.png" alt="Texte alternatif" /></p>
<p>Paragraphe</p>
See Maruku for details.
because _I_ wouldn't kill him, the _bunshin_ would
should come out as
<p>because <em>I</em> wouldn't kill him, the <em>bunshin</em> would</p>
(according to the official spec and Dingus) but in 2.0.3 it comes out as
<p>because <em>I_ wouldn't kill him, the _bunshin</em> would</p>
The following markdown:
* ### Promo Item 1 ####
Promotext line 1 Lorem ipsum dolor sit amet, consectetur adipisicing elit.
* ### Promo Item 2 ###
Promotext line 2 Lorem ipsum dolor sit amet, consectetur adipisicing elit.
Produces the following markup:
Promotext line 1 Lorem ipsum dolor sit amet, consectetur adipisicing elit.
Promotext line 2 Lorem ipsum dolor sit amet, consectetur adipisicing elit.
The first h3 is after the p... it should be the other way around. I have confirmed that this works properly with the original Markdown.pl parser.
This is a small feature request.
I recommend including the "Nlbr" extension as part of the Python-Markdown package, as described here:
http://deathofagremmie.com/2011/05/09/a-newline-to-break-python-markdown-extension/
Do not turn it on by default, because obviously this changes the semantics of markdown substantially. Still, it's a common change to markdown (used by Github, for example), and it's easy to implement. Making it easy to invoke, when desired, would be very nice.
Thanks!
input = """<div>
</div>
## Heading"""
expected output
u'<div>\n\n</div>\n<h2>Heading</h2>'
actual output
u'<div>\n\n</div>\n## Heading'
Hi, just to report a possible bug I found:
In [3]: markdown.markdown('[q=go:GO\\:0000307](/query?q=go:GO\\:0000307)')
Out[3]: u'<p><a href="/query?q=go:GO\x02klzzwxh:0001\x030000307">q=go:GO:0000307</a></p>'
I also tried from the original markdown website: http://daringfireball.net/projects/markdown/dingus. The same input generates
<p><a href="/query?q=go:GO\:0000307">q=go:GO\:0000307</a></p>
correctly.
I’d like the option to disable syntax guessing. Something like codehilite(guess_syntax=False).
I feel like I’m starting 80% (a completely made up number) of my blocks with “:::text” to avoid weird colors here and there. So I’m adding markup to make it not do something, where I’d prefer to just add it where I want something to happen.
I could attempt this myself and submit a pull request if you prefer. Thanks.
Built and installed markdown 2.1.0 using Python 3.2.1, on Fedora 16 (markdown_py has been renamed to /usr/bin/markdown_py-3.2).
Now:
% /bin/echo -e "### Heading\n\ntest" | markdown_py-3.2
Traceback (most recent call last):
File "/usr/bin/markdown_py-3.2", line 34, in <module>
run()
File "/usr/lib/python3.2/site-packages/markdown/__main__.py", line 81, in run
markdown.markdownFromFile(**options)
File "/usr/lib/python3.2/site-packages/markdown/__init__.py", line 416, in markdownFromFile
kwargs.get('encoding', None))
File "/usr/lib/python3.2/site-packages/markdown/__init__.py", line 341, in convertFile
text = input_file.read()
File "/usr/lib64/python3.2/codecs.py", line 480, in read
data = self.bytebuffer + newdata
TypeError: can't concat bytes to str
Specifying -e UTF-8 does not help.
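The TypeError suggests a text/bytes mismatch when reading standard input under Python 3. A hedged sketch of a reader that accepts either kind of stream; the helper name is hypothetical and this is not how markdown_py reads its input.

```python
import io


def read_text(stream, encoding="utf-8"):
    """Read a whole stream and always return str."""
    data = stream.read()
    # Binary streams (e.g. sys.stdin.buffer) yield bytes; decode them.
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


from_bytes = read_text(io.BytesIO("### Heading\n\ntest".encode("utf-8")))
from_text = read_text(io.StringIO("### Heading\n\ntest"))
```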
* ### Promo Item 1 #### Duis aute irure dolor in _reprehenderit_ in voluptate velit esse cillum dolore eu fugiat nulla pariatur. [Learn more](http://www.naz.edu) * ### Promo Item 2 #### Duis aute irure dolor in reprehenderit in **voluptate** velit esse cillum dolore eu fugiat nulla pariatur. [Learn more.](http://www.naz.edu 'Yay') * ### Promo Item 3 #### Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. [Learn more.](http://www.naz.edu 'Yay') * ### Promo Item 4 #### Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. [Learn more.](http://www.naz.edu 'Yay')
Renders as:
<ul> <li> <h3>Promo Item 1</h3> Duis aute irure dolor in _reprehenderit_ in voluptate velit esse cillum dolore eu fugiat nulla pariatur. [Learn more](http://www.naz.edu)</li> <li> <h3>Promo Item 2</h3> Duis aute irure dolor in reprehenderit in **voluptate** velit esse cillum dolore eu fugiat nulla pariatur. [Learn more.](http://www.naz.edu 'Yay')</li> <li> <h3>Promo Item 3</h3> Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. [Learn more.](http://www.naz.edu 'Yay')</li> <li> <h3>Promo Item 4</h3> Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. [Learn more.](http://www.naz.edu 'Yay')</li> </ul>
This issue is copied from Ticket 89 of our old bug tracker. It is copied as-is:
We're getting anonymous "We've got a problem header!" emails from our website; turns out it's markdown being royally insane.
In addition I've recently had to dive into the source code and found some calls to sys.exit(). Libraries should never ever call this. We don't want our python interpreter dying on us randomly just because some text failed to render. Instead it should throw exceptions, which can be caught by the caller if necessary.
I found that markdown hardly ever throws exceptions, instead it has a crazy log wrapper which basically just serves to hide where problems are coming from.
I've made a few logging changes in my fork on github - here's a comparison view
It's a backwards incompatible change since I removed a lot of the logging code which custom extensions may be using. (replaced with exceptions which are much more useful). So maybe pull this one for 2.1 rather than 2.0.4.
Python Markdown fails to process the following code (note the trailing spaces for manual line breaks) according to the Markdown syntax:
[link text 1]
[link text 2][]
[link text 3][link label 3]
[link text 4] [link label 4]
[link text 1]: url1
[link text 2]: url2
[link label 3]: url3
[link label 4]: url4
It outputs:
<p><a href="url2">link text 1</a>[]<br />
[link text 3]<a href="url3">link label 3</a><br />
[link text 4] <a href="url4">link label 4</a></p>
while Markdown.pl and PHP Markdown create 4 links:
<p><a href="url">link text 1</a> <br />
<a href="url">link text 2</a> <br />
<a href="url">link text 3</a> <br />
<a href="url">link text 4</a></p>
More information: if we add something between the first link and its manual line break, Python Markdown successfully creates 4 links. Example:
This little modification to the previous code (word lorem added):
[link text 1] lorem
[link text 2][]
[link text 3][link label 3]
[link text 4] [link label 4]
[link text 1]: url1
[link text 2]: url2
[link label 3]: url3
[link label 4]: url4
will create 4 links:
<p><a href="url1">link text 1</a> lorem<br />
<a href="url2">link text 2</a><br />
<a href="url3">link text 3</a><br />
<a href="url4">link text 4</a></p>
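The expected behaviour for the four link forms can be sketched with a small resolver. This is not Python-Markdown's implementation; resolve_reference_links and both regexes are assumptions made for illustration. The key point is that at most one space, and never a line break, is allowed between the link text and the label, so a bare [link text 1] followed by a line break is not glued to the next line.

```python
import html
import re

# "[label]: url" definitions, one per line.
REF_DEF = re.compile(r'^\[([^\]]+)\]:[ \t]*(\S+)[ \t]*$', re.MULTILINE)
# "[text]", "[text][]", "[text][label]" or "[text] [label]".
# Only a single optional space may separate the two bracket pairs.
REF_USE = re.compile(r'\[([^\]]+)\](?: ?\[([^\]]*)\])?')


def resolve_reference_links(source):
    """Replace reference-style links with <a> tags (illustrative only)."""
    refs = {m.group(1).lower(): m.group(2) for m in REF_DEF.finditer(source)}
    body = REF_DEF.sub('', source)

    def replace(match):
        text, label = match.group(1), match.group(2)
        # An empty or missing label means the link text is the label.
        url = refs.get((label or text).lower())
        if url is None:
            return match.group(0)  # unknown label: leave the text untouched
        return '<a href="%s">%s</a>' % (html.escape(url), text)

    return REF_USE.sub(replace, body)
```

Running it on the first example from this report yields one link per line, which is the result the reporter expected from Markdown.pl and PHP Markdown.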
Python-Markdown does not work as delivered on Cygwin. It installs, but attempting to run it produces this error:
Traceback (most recent call last):
File "/usr/bin/markdown", line 44, in <module>
from markdown import COMMAND_LINE_LOGGING_LEVEL
File "/usr/bin/markdown.py", line 44, in <module>
ImportError: cannot import name COMMAND_LINE_LOGGING_LEVEL
The problem is that a Cygwin system is actually a Windows system underneath, so it requires the Windows patch. HOWEVER, sys.platform reports 'cygwin', not 'win32', so the patch isn't actually run. Just change the detection code so that 'cygwin' also enables the windows workaround, and all is well.
Here's a patch, please add it!
diff -u markdown.py.old markdown.py
--- markdown.py.old 2011-05-26 17:27:46.000000000 -0400
+++ markdown.py 2011-05-26 18:28:43.014140700 -0400
@@ -30,7 +30,7 @@
"""
import sys, os
-if sys.platform == 'win32':
+if sys.platform in ['win32', 'cygwin']:
# We have to remove the Scripts dir from path on windows.
# If we don't, it will try to import itself rather than markdown lib.
# This appears to *not* be a problem on *nix systems, only Windows.
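For context, the workaround the patch enables can be sketched like this. The real script's body is not shown in this report, so strip_script_dir and its arguments are hypothetical; the sketch only assumes what the comments in the diff say, namely that the script's own directory has to be dropped from the import path on Windows-like platforms.

```python
import os


def strip_script_dir(path_entries, script_file, platform):
    """Return the path entries without the script's own directory.

    Hypothetical helper: on 'win32' and 'cygwin' the launcher would
    otherwise import itself instead of the markdown library.
    """
    if platform not in ('win32', 'cygwin'):
        return list(path_entries)
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return [p for p in path_entries if os.path.abspath(p) != script_dir]
```

A launcher could then call something like sys.path[:] = strip_script_dir(sys.path, __file__, sys.platform) before importing markdown.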
This is a copy of Ticket 86 from our old bug tracker. It has been copied as-is:
The code checks for 1.0; I had 1.0.2 installed. The Comment and PI symbols were not defined. Installing cElementTree 1.0.5 fixed the problem.
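One way to avoid depending on a version string is to test for the needed symbols directly. The sketch below is an assumption about how such a check could look, not the code in Python-Markdown; missing_symbols is a hypothetical helper, and the standard library's xml.etree.ElementTree is used as the example module.

```python
import xml.etree.ElementTree as etree

# The two names this ticket reports as undefined.
REQUIRED = ('Comment', 'PI')


def missing_symbols(module, required=REQUIRED):
    """List the required names that the given module does not provide."""
    return [name for name in required if not hasattr(module, name)]
```

An importer could fall back to another ElementTree implementation whenever missing_symbols(candidate) is non-empty.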
It would be great if the footnote extension allowed one to decide where the footnote text should be placed, allowing (depending on some config param) one to:
Rationale:
a) (my main use-case) On an ebook reader (while viewing an EPUB file), navigating to the footnote and back tends to be slow and (on non-touch readers) sometimes requires troublesome navigation. Having the footnote rendered below the current paragraph (likely styled with a smaller font) would make for a much nicer user experience.
b) Also on webpages it may make better presentation if footnotes are close to the text they refer to (here some javascript instrumentation may make it possible to dynamically show them on mouseover or mouseclick)
I was trying to figure out which markdown was being used by a project, and so ended up comparing the output of markdown (python-markdown-2.0.3-3.fc15.noarch) with markdown2 (python-markdown2-1.0.1.17-3.fc15.noarch). I found that both got some emphasis wrong. I'll mention both here for orientation.
Did a diff between markdown2 and markdown outputs, so the second line is markdown's. Before each I put the original source line:
To alter the environment we can set the _NODE_ENV_ environment variable, for example:
106,108c81
<
< <p>To alter the environment we can set the <em>NODE</em>ENV_ environment variable, for example:</p>
<
---
> <p>To alter the environment we can set the <em>NODE_ENV</em> environment variable, for example:</p>
Note that this method _end()_s the response, so you will want to use node's _res.write()_ for multiple writes or streaming.
971,973c738
<
< <p>Note that this method <em>end()</em>s the response, so you will want to use node's <em>res.write()</em> for multiple writes or streaming.</p>
<
---
> <p>Note that this method <em>end()_s the response, so you will want to use node's _res.write()</em> for multiple writes or streaming.</p>
connections will be accepted via _INADDR_ANY_.
1490,1491c1130
< connections will be accepted via <em>INADDR</em>ANY_.</p>
<
---
> connections will be accepted via <em>INADDR_ANY</em>.</p>
markdown2 is getting it wrong when handling tokens that have embedded '_' characters, such as NODE_ENV and INADDR_ANY. Apparently it is using lazy regexes excluding '_' characters.
But then markdown gets this line wrong, apparently because it is using greedy RE matching and allowing '_' characters:
Note that this method _end()_s the response, so you will want to use node's _res.write()_ for multiple writes or streaming.
The desired result was obviously what markdown2 ended up producing.
How do you have more than one emphasis span on a line?
Ahh, got it! Changing _end()_s to _end()_'s produces <em>end()</em>'s as desired. That changed the interior '_' to an ending '_'.
You wouldn't believe how many times I've been told not to add 'extra' apostrophes in my writing. Now I know why I want to...
Oh, boo. Changing it to _end()_\s gives you <em>end()</em>s. Since that's closer to what the author wrote, that's better? (So I'm back to "okay, okay, I'll take out the apostrophes!")
Is there some kind of "when things don't work" FAQ you could add this to?
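The two behaviours compared above can be reproduced with two small regexes. These are illustrative patterns only; they are not taken from markdown or markdown2, and the names EXCLUDE_UNDERSCORE and WORD_BOUNDARY are made up for the example. The first forbids underscores inside the span; the second only accepts a closing underscore that is not followed by a word character.

```python
import re

# Span may not contain '_' at all (splits NODE_ENV in the middle).
EXCLUDE_UNDERSCORE = re.compile(r'_([^_]+)_')
# Opening '_' must not follow a word character, closing '_' must not
# precede one (keeps NODE_ENV whole, but cannot close at "_end()_s").
WORD_BOUNDARY = re.compile(r'(?<!\w)_(.+?)_(?!\w)')


def emphasize(pattern, text):
    """Wrap every match of the given pattern in <em> tags."""
    return pattern.sub(r'<em>\1</em>', text)


env = "set the _NODE_ENV_ environment variable"
plural = "this method _end()_s the response, use node's _res.write()_ for"
fixed = "this method _end()_'s the response, use node's _res.write()_ for"

for sample in (env, plural, fixed):
    print(emphasize(EXCLUDE_UNDERSCORE, sample))
    print(emphasize(WORD_BOUNDARY, sample))
```

Under these assumptions neither pattern handles both inputs, which is why adding the apostrophe (so the closing underscore is no longer followed by a letter) makes the second pattern behave as wanted.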
If I run markdown on a single set of docs where all code examples are in the same language, I don't see a reason to repeat ...python over and over again. It would be nice if codehilite allowed specifying the common language once.
When in Markdown source an HTML comment contains Markdown's link markup, the resulting HTML contains link placeholder in place of the comment's text:
Example.md:
Please see <!-- [Example][1] -->
[1]: http://example.com/
Example.html (rendered with python-markdown 2.0.3):
<p>Please see <!-- �klzzwxh:0000� --></p>
Actually there are STX and ETX chars in the output; they are just not shown.
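Since the control characters are invisible in most viewers, a small helper can make them visible when inspecting the output. This is a debugging sketch, not part of Python-Markdown; reveal_control_chars is a hypothetical name, and the sample string is rebuilt by hand from the output shown above.

```python
STX = '\u0002'  # "start of text" control character
ETX = '\u0003'  # "end of text" control character


def reveal_control_chars(text):
    """Replace STX/ETX with visible markers so leaked placeholders stand out."""
    return text.replace(STX, '[STX]').replace(ETX, '[ETX]')


rendered = '<p>Please see <!-- ' + STX + 'klzzwxh:0000' + ETX + ' --></p>'
print(reveal_control_chars(rendered))
```

Checking for STX or ETX in the final HTML is also a cheap way to detect any placeholder that was never substituted back.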