berniey / html5print

HTML5, CSS, Javascript Pretty Print

License: Other

html5print's Introduction

HTML5 Pretty Print

This tool pretty-prints your HTML, CSS, and JavaScript files. The package comes in two parts:

a command line tool, html5-print

a python module, html5print

Introduction

This module reformats web page code to make it more readable. It is aimed at developers, so it is not optimized for speed. I started out looking for a tool and ended up creating this module. Hope it helps you!

Key features:

Pretty print HTML as well as embedded CSS and JavaScript within it

Pretty print pure CSS and JavaScript

Try to fix fragmented HTML5

Try to fix HTML with broken unicode encoding

Try to guess encoding of the document, and in some cases manage to convert 8-bit byte code back into correct UTF-8 format

Support both Python 2 and 3
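
The encoding-related features above deal with text whose UTF-8 bytes were decoded as an 8-bit encoding somewhere along the way. Below is a minimal sketch of one common repair, assuming Latin-1 was the wrong decoding. `repair_mojibake` is a hypothetical helper written for illustration; it is not the module's own implementation.

```python
def repair_mojibake(text):
    """Undo a UTF-8 -> Latin-1 mis-decoding, if that is what happened.

    Hypothetical helper for illustration only. It assumes the damage came
    from decoding UTF-8 bytes as Latin-1; other cases are left untouched.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Text was not Latin-1 mojibake (or is already correct): keep it.
        return text


# Simulate the damage, then repair it.
damaged = u"Dobr\u00fd den".encode("utf-8").decode("latin-1")
print(repair_mojibake(damaged))
```

Text that is already correct falls through the `except` branch unchanged, so the helper is safe to apply to clean input.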

Installation

$ [sudo] pip install html5print

Uninstallation

$ [sudo] pip uninstall html5print
$ [sudo] pip uninstall bs4 html5lib slimit tinycss2 requests chardet

Command Line Tool

Synopsis

$ html5-print --help
usage: html5-print [-h] [-o OUTFILE] [-s INDENT_WIDTH] [-e ENCODING]
                    [-t {html,js,css}] [-v]
                    infile

Beautify HTML5, CSS, JavaScript - Version 0.1.2 (By Bernard Yue)
This tool reformat the input and return a beautified version,
in unicode.

positional arguments:
  infile                filename | url | -, a dash, which represents stdin

optional arguments:
  -h, --help            show this help message and exit
  -o OUTFILE, --output OUTFILE
                        filename for formatted HTML, stdout if omitted
  -s INDENT_WIDTH, --indent-width INDENT_WIDTH
                        number of space for indentation, default 2
  -e ENCODING, --encoding ENCODING
                        encoding of input, default UTF-8
  -t {html,js,css}, --filetype {html,js,css}
                        type of file to parse, default "html"
  -v, --version         show program's version number and exit

Example

Pretty print HTML:

$ html5-print -s4 -
Press Ctrl-D when finished
<html><head><title>Small HTML page</title>
<style>p { margin: 10px 20px; color: black; }</style>
<script>function myFunction() {
document.getElementById("demo").innerHTML = "Paragraph changed.";
}</script>
</head><body>
<p>Some text for testing</body></html>
^D
<html>
    <head>
        <title>
            Small HTML page
        </title>
        <style>
            p {
                margin              : 10px 20px;
                color               : black;
            }
        </style>
        <script>
            function myFunction() {
                document.getElementById("demo").innerHTML = "Paragraph changed.";
            }
        </script>
    </head>
    <body>
        <p>
            Some text for testing
        </p>
    </body>
</html>
$

Create valid HTML5 document from HTML fragment:

$ html5-print -s4 -
Press Ctrl-D when finished
<title>Hello in different language</title>
<p>Here is "hello" in different languages</p>
<ul>
<li>Hello
<li>您好
<li>こんにちは
<li>Dobrý den,
<li>สวัสดี
^D
<html>
    <head>
        <title>
            Hello in different language
        </title>
    </head>
    <body>
        <p>
            Here is "hello" in different languages
        </p>
        <ul>
            <li>
                Hello
            </li>
            <li>
                您好
            </li>
            <li>
                こんにちは
            </li>
            <li>
                Dobrý den,
            </li>
            <li>
                สวัสดี
            </li>
        </ul>
    </body>
</html>
$

Python API

This module requires Python 2.6+ (should work for Python 3.0 and 3.1 but was not tested).

Pretty Print HTML

>>> from html5print import HTMLBeautifier
>>> html = '<title>Page Title</title><p>Some text here</p>'
>>> print(HTMLBeautifier.beautify(html, 4))
<html>
    <head>
        <title>
            Page Title
        </title>
    </head>
    <body>
        <p>
            Some text here
        </p>
    </body>
</html>
<BLANKLINE>
>>>

Pretty Print CSS

Format common CSS

>>> from html5print import CSSBeautifier
>>> css = """
... .para { margin: 10px 20px;
... /* Cette règle contrôle l'espacement de tous les côtés \*/"""
>>> print(CSSBeautifier.beautify(css, 4))
.para {
    margin              : 10px 20px; /* Cette règle contrôle l'espacement de tous les côtés \*/
}

Format media query

>>> from html5print import CSSBeautifier
>>> css = '''@media (-webkit-min-device-pixel-ratio:0) {
... h2.collapse { margin: -22px 0 22px 18px;
... }
... ::i-block-chrome, h2.collapse { margin: 0 0 22px 0; } }
... '''
>>> print(CSSBeautifier.beautify(css, 4))
@media (-webkit-min-device-pixel-ratio:0) {
    h2.collapse {
        margin              : -22px 0 22px 18px;
    }
    ::i-block-chrome, h2.collapse {
        margin              : 0 0 22px 0;
    }
}

Pretty Print JavaScript

>>> from html5print import JSBeautifier
>>> js = '''
... "use strict"; /* Des bribes de commentaires ici et là \*/
... function MSIsPlayback() { try { return parent && parent.WebPlayer }
... catch (e) { return !1 } }
... '''
>>> print(JSBeautifier.beautify(js, 4))
"use strict"; /* Des bribes de commentaires ici et là \*/

function MSIsPlayback() {
    try {
        return parent && parent.WebPlayer
    } catch (e) {
        return !1
    }
}

Testing

The module uses pytest. Use pip to install pytest if you do not have it installed.

$ [sudo] pip install pytest

Then check out the source code and run the tests as usual.

$ git clone https://github.com/berniey/html5print.git
$ cd html5print
$ python setup.py test

You are encouraged to use virtualenv and virtualenvwrapper to avoid changing your current environment.

License

This module is distributed under the Apache License, Version 2.0.

html5print's Issues

How can html5print auto-detect the input language (html/css/js) and then return pretty code?

Hello sir.
I am trying html5print in Django. It seems good, but how can I make html5print auto-detect the input language and then return pretty code?
thanks man.
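
One possible approach, sketched under assumptions: guess the type with a few rough regular expressions and then pick the matching beautifier. `guess_filetype` is a hypothetical helper, not part of html5print, and the patterns are heuristics that can misclassify unusual input. The return values mirror the `-t {html,js,css}` choices of the command line tool, with "html" as the fallback.

```python
import re

# Rough heuristics only; these patterns are assumptions for illustration.
_HTML_RE = re.compile(r"<!doctype|</?[a-z][a-z0-9]*(\s[^<>]*)?>", re.IGNORECASE)
_JS_RE = re.compile(r"\bfunction\b\s*[\w$]*\s*\(|\b(?:var|let|const)\s+[\w$]+|=>")
_CSS_RE = re.compile(r"[^{}]+\{\s*[-a-z]+\s*:[^{}]*\}", re.IGNORECASE)


def guess_filetype(text):
    """Return 'html', 'js' or 'css' as a best guess for the given source."""
    if _HTML_RE.search(text):
        return "html"   # tags or a doctype were found
    if _JS_RE.search(text):
        return "js"     # function definitions, declarations or arrows
    if _CSS_RE.search(text):
        return "css"    # something shaped like "selector { prop: value }"
    return "html"       # same default as the command line tool


print(guess_filetype(".para { margin: 10px 20px; }"))
```

The guessed type could then be used to choose between HTMLBeautifier, CSSBeautifier and JSBeautifier.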

JSBeautifier throws TypeError under Python 3.10.4 on Windows 10

Addon error: Traceback (most recent call last):
  File "recordTraffic.py", line 134, in response
    outFile.write(JSBeautifier.beautify(flow.response.content.decode(charSet), 2))
  File "C:\bin\Python-3.10.4\lib\site-packages\html5print\jsprint.py", line 93, in beautify
    tree = parser.parse(decodeText(js))
  File "C:\bin\Python-3.10.4\lib\site-packages\slimit\parser.py", line 93, in parse
    return self.parser.parse(text, lexer=self.lexer, debug=debug)
  File "C:\bin\Python-3.10.4\lib\site-packages\ply\yacc.py", line 265, in parse
    return self.parseopt_notrack(input,lexer,debug,tracking,tokenfunc)
  File "C:\bin\Python-3.10.4\lib\site-packages\ply\yacc.py", line 971, in parseopt_notrack
    p.callable(pslice)
  File "C:\bin\Python-3.10.4\lib\site-packages\slimit\parser.py", line 1101, in p_case_block
    p[0] = p[2:-1]
  File "C:\bin\Python-3.10.4\lib\site-packages\ply\yacc.py", line 198, in __getitem__
    if n >= 0: return self.slice[n].value
TypeError: '>=' not supported between instances of 'slice' and 'int'

I am using html5print to pretty print JS code in an MITMproxy addon
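
The last two frames of the traceback show a `__getitem__` that compares its argument with 0, while the caller passes a slice (`p[2:-1]`). In Python 3, slicing always reaches `__getitem__` as a `slice` object, so the comparison fails. The sketch below reproduces that failure with small stand-in classes. These are illustrative only and are not the ply or slimit source.

```python
class Symbol(object):
    """Stand-in for a parser symbol that carries a value."""
    def __init__(self, value):
        self.value = value


class IndexOnlyProduction(object):
    """Mimics a __getitem__ written with only integer indexes in mind."""
    def __init__(self, symbols):
        self.slice = symbols

    def __getitem__(self, n):
        if n >= 0:                      # TypeError when n is a slice object
            return self.slice[n].value
        return self.slice[n].value


class SliceAwareProduction(IndexOnlyProduction):
    """Same container, but slices are handled explicitly."""
    def __getitem__(self, n):
        if isinstance(n, slice):
            return [s.value for s in self.slice[n]]
        return self.slice[n].value


symbols = [Symbol(v) for v in ("a", "b", "c", "d")]
print(SliceAwareProduction(symbols)[2:-1])
```

Whether a different version of the parsing dependency avoids the error in this setup is something to verify locally.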

Failed install on debian 8

I tried to install html5print, but an error occurred, as shown below:
"pip install html5print
Collecting html5print
Using cached html5print-0.1.1.tar.gz
Collecting beautifulsoup4>=4.3.2 (from html5print)
Using cached beautifulsoup4-4.4.1-py3-none-any.whl
Collecting chardet>=2.2.1 (from html5print)
Using cached chardet-2.3.0.tar.gz
Collecting html5lib>=0.999 (from html5print)
Using cached html5lib-0.9999999.tar.gz
Collecting requests>=2.3.5 (from html5print)
Using cached requests-2.9.1-py2.py3-none-any.whl
Collecting slimit>=0.8.1 (from html5print)
Using cached slimit-0.8.1.zip
Collecting tinycss2>=0.4 (from html5print)
Using cached tinycss2-0.5.tar.gz
Collecting ply==3.4 (from html5print)
Using cached ply-3.4.tar.gz
Requirement already satisfied (use --upgrade to upgrade): six in ./v3env/lib/python3.4/site-packages (from html5lib>=0.999->html5print)
Collecting webencodings>=0.4 (from tinycss2>=0.4->html5print)
Using cached webencodings-0.4.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 1, in
File "/tmp/pip-build-dha2bzr1/webencodings/setup.py", line 8, in
).read().strip()).group(1)
File "/opt/projects/v3env/lib/python3.4/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1663: ordinal not in range(128)

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-dha2bzr1/webencodings/
"

Please tell me how to fix this. Thanks.

Use lib as standalone

How can I use this lib standalone, without pip install, etc.?

HTML5 print footer does not stick to the bottom of the last printed page

We tried a simple HTML5 file and pressed Ctrl + P for the print preview, but the footer cannot be made to stick to the bottom of the last page of the printout.

The number of pages in a single print job is not fixed, but I need the footer to stick to the bottom of the last page no matter how many pages are printed.

Framework: HTML5

sys.version should not be used for version comparisons

The string sys.version should not be used to determine the current Python version as there is no guarantee the version number is at the beginning of the string. Instead, the tuple sys.version_info should be used (which you do in some cases but not everywhere).
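
A short sketch of the tuple-based check described above. `version_at_least` is a hypothetical helper name used only for this example.

```python
import sys


def version_at_least(major, minor=0):
    """Compare against the structured version tuple, not the sys.version string."""
    return sys.version_info[:2] >= (major, minor)


# Comparing version strings is lexicographic and breaks at two-digit minors:
string_says_older = "3.10.4" < "3.9"     # True, although 3.10 is newer
tuple_says_older = (3, 10, 4) < (3, 9)   # False, as expected

print(version_at_least(3), string_says_older, tuple_says_older)
```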

ModuleNotFoundError: No module named 'minifier'

from html5print import HTMLBeautifier
Traceback (most recent call last):
File "", line 1, in
File "/home/XXXXX/.local/lib/python3.9/site-packages/html5print/__init__.py", line 20, in
from .jsprint import JSBeautifier
File "/home/XXXXX/.local/lib/python3.9/site-packages/html5print/jsprint.py", line 22, in
import slimit
File "/home/XXXXX/.local/lib/python3.9/site-packages/slimit/__init__.py", line 27, in
from minifier import minify
ModuleNotFoundError: No module named 'minifier'

TypeError: argument of type 'module' is not iterable

I'm getting "TypeError: argument of type 'module' is not iterable" when trying to run this code.
http://pastebin.com/mvx4kYyL

Additional Space After Anchor

I have the following markup <p>This sentence end with a <a href="#">link</a>.</p>. HTMLBeautifier.beautify() creates

<p>
  This sentence end with a
  <a href="#">
    link
  </a>
  .
</p>

Which adds an additional space after the link.

How to unwrap long lines of code

How can I keep a long line of code unwrapped (shown on one line, scrollable)?
Thanks

SyntaxError: Unexpected token in application/ld+json in HTML (slimit parser error)

Error SyntaxError: Unexpected token (COLON, u':') at 1:33 between LexToken(STRING,u'"@context"',1,22) and LexToken(STRING,u'"http://schema.org"',1,35) on valid JSON-LD markup in HTML.

#!/usr/bin/env python
# coding: utf8

from html5print import HTMLBeautifier

html = '''<!DOCTYPE html>
<html>
  <head></head>
  <body>
    <script type="application/ld+json">
      {
        "@context": "http://schema.org",
        "@type": "Organization",
        "name": "name",
        "url": "http://www.example.com/"
      }
    </script>
  </body>
</html>'''

print HTMLBeautifier.beautify(html)

The result:

python prettify-json-ld-error.py
Traceback (most recent call last):
  File "D:\\prettify-json-ld-error.py", line 21, in <module>
    print HTMLBeautifier.beautify(html)
  File "D:\python\lib\html5print\html5print.py", line 111, in beautify
    html = JSBeautifier.beautifyTextInHTML(html, indent, encoding)
  File "D:\python\lib\html5print\jsprint.py", line 127, in beautifyTextInHTML
    cls.beautify, (indent,), indent)
  File "D:\python\lib\html5print\utils.py", line 197, in _findAndReplace
    lines = [thisIndent + l for l in bfunc(*params).splitlines()]
  File "D:\python\lib\html5print\jsprint.py", line 93, in beautify
    tree = parser.parse(decodeText(js))
  File "D:\python\lib\slimit\parser.py", line 93, in parse
    return self.parser.parse(text, lexer=self.lexer, debug=debug)
  File "D:\python\lib\ply\yacc.py", line 265, in parse
    return self.parseopt_notrack(input,lexer,debug,tracking,tokenfunc)
  File "D:\python\lib\ply\yacc.py", line 1047, in parseopt_notrack
    tok = self.errorfunc(errtoken)
  File "D:\python\lib\slimit\parser.py", line 116, in p_error
    self._raise_syntax_error(token)
  File "D:\python\lib\slimit\parser.py", line 89, in _raise_syntax_error
    self.lexer.prev_token, self.lexer.token())
SyntaxError: Unexpected token (COLON, u':') at 1:20 between LexToken(STRING,u'"context"',1,11) and LexToken(STRING,u'"http://schema.org"',1,22)

Maybe scripts that cannot be parsed should be ignored, or a JSON-LD parser should be used?
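
A minimal sketch of that idea, using only the standard library: look at the script's `type` attribute first, and format JSON-typed scripts with `json` instead of handing them to the JavaScript parser. `format_json_script` and `JSON_SCRIPT_TYPES` are hypothetical names. This is not how html5print currently behaves.

```python
import json

# Script types treated as JSON data rather than JavaScript (an assumption).
JSON_SCRIPT_TYPES = {"application/json", "application/ld+json"}


def format_json_script(body, script_type, indent=2):
    """Return pretty-printed JSON for JSON-typed scripts.

    Returns None when the type is not a JSON type, so the caller can fall
    back to the JavaScript beautifier. Unparseable JSON is returned as-is.
    """
    if script_type.strip().lower() not in JSON_SCRIPT_TYPES:
        return None
    try:
        return json.dumps(json.loads(body), indent=indent, ensure_ascii=False)
    except ValueError:
        return body  # leave the script untouched rather than raising


ld_json = '{"@context": "http://schema.org", "@type": "Organization"}'
print(format_json_script(ld_json, "application/ld+json"))
```

Returning the body unchanged on a parse failure keeps one bad script from aborting the formatting of the whole page.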