tutorcruncher / pydf
PDF generation in python using wkhtmltopdf for heroku and docker
License: MIT License
Hi! I love your work. I am trying to make page numbers alternately appear at the bottom left for odd pages and bottom right for even pages. Is there a way to do that?
What I have so far is this...
import pdfkit

wkhtmltopdf_path = 'C:/Program Files/wkhtmltopdf/bin/wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=wkhtmltopdf_path)
options = {
    'footer-center': '~ [page] of [topage] ~',
}
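One approach that might work here: wkhtmltopdf passes values such as page and topage to header and footer HTML documents as query-string parameters, so a footer supplied through the footer-html option can choose its alignment per page with a small script. The sketch below only builds that footer file and the options dict; the file name footer.html and the helper write_footer are made up for illustration, and the result has not been tested against pdfkit on Windows.

```python
import tempfile
from pathlib import Path

# Footer page: wkhtmltopdf appends ?page=N&topage=M (among others) to the
# footer URL, so a small script can pick the alignment for each page.
FOOTER_HTML = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<script>
function alignFooter() {
  var params = {};
  window.location.search.substring(1).split('&').forEach(function (pair) {
    var kv = pair.split('=', 2);
    params[kv[0]] = decodeURIComponent(kv[1] || '');
  });
  var page = parseInt(params.page, 10);
  var el = document.getElementById('num');
  el.textContent = '~ ' + params.page + ' of ' + params.topage + ' ~';
  // Odd pages: bottom left. Even pages: bottom right.
  el.style.textAlign = (page % 2 === 1) ? 'left' : 'right';
}
</script>
</head>
<body onload="alignFooter()">
<div id="num"></div>
</body>
</html>
"""


def write_footer(directory: str) -> str:
    """Write the footer page into `directory` and return its path (hypothetical helper)."""
    path = Path(directory) / "footer.html"
    path.write_text(FOOTER_HTML, encoding="utf-8")
    return str(path)


footer_dir = tempfile.mkdtemp()
options = {
    # Used instead of 'footer-center'; 'footer-html' takes a path or URL.
    "footer-html": write_footer(footer_dir),
}
```

The options dict would then be passed to pdfkit together with the existing config object, in place of the footer-center entry.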
We can set wkhtmltopdf options via HTTP headers, but I'm afraid that only works for arguments that take a value.
HTTP headers are read here:
Lines 39 to 41 in 8f9ce76
Then in:
Lines 33 to 42 in 8f9ce76
The only way to get a flag onto the command line is to pass a real True, and header values are always strings.
So via the API we can't set options like --disable-javascript or even --lowquality.
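A minimal sketch of one possible fix, assuming the headers arrive as a plain dict of strings and that the pdf- prefix from the curl example elsewhere in this list is the convention: the strings "true" and "false" are coerced to real booleans before the options are built. The function names are hypothetical and this is not the code at 8f9ce76.

```python
def coerce_header_value(value: str):
    """Turn 'true'/'false' header strings into real booleans.

    Any other string is returned unchanged, so options that take a
    value (page size, margins, ...) keep working.
    """
    lowered = value.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    return value


def options_from_headers(headers: dict, prefix: str = "pdf-") -> dict:
    """Collect 'pdf-*' headers into keyword options (hypothetical helper)."""
    options = {}
    for name, value in headers.items():
        name = name.lower()
        if name.startswith(prefix):
            key = name[len(prefix):].replace("-", "_")
            options[key] = coerce_header_value(value)
    return options


print(options_from_headers({"Pdf-Disable-Javascript": "true", "Pdf-Page-Size": "A4"}))
# → {'disable_javascript': True, 'page_size': 'A4'}
```

With this in place a request header such as pdf-disable-javascript: true would reach the argument builder as a real True.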
Hi there
Could be me being new to Python, but I do think I've got plenty of decades of programming to make up for that bit of ignorance.
I was attracted to pydf because it can run async.
I'm already up and saving PDF files using the pdfkit module.
I was sort of surprised that I needed to modify the source code to get pydf to run on Windows.
I needed to add an environment variable pointing to the wkhtmltopdf.exe file. OK, not such a bad thing, but it would have been nice to see that in the installation notes for Windows. I also changed the wkhtmltopdf.py file because the executable name did not have the required .exe on the end.
OK, I got past those initial install problems, but I'm still crashing.
So I'm taking a moment to log an issue to see whether I've bitten off more than I can chew. I was looking for something off the shelf, where I didn't need to modify the package source.
Maybe I didn't read the docs carefully enough??
Sorry to be a moron. It's my 3rd week on Python and Github, etc. Still, I've come a very long ways!!
==========================================
Curious what IDE folks use on these projects. I've been using PyCharm and I LOVE it! Wow, what a treat to have a full debugger with breakpoints and a great variable browser. I tried Thromber, but it didn't work quite as well.
==========================================
Thank you to the authors of this package just the same! And I appreciate anyone that takes the time to answer stupid questions.
When an image link is present in the HTML, the renderer just ignores it.
HTML:
<html><style>
@page {
size: 9.25in 6.25in;
margin: 0;
}
@font-face {
font-family: "Arial";
src: url("https://f001.backblazeb2.com/file/inkit-cdn/arial.ttf")
format("truetype");
}
@font-face {
font-family: "Times New Roman";
src: url("https://f001.backblazeb2.com/file/inkit-cdn/times-new-roman.ttf")
format("truetype");
}
@font-face {
font-family: "Courier New";
src: url("https://f001.backblazeb2.com/file/inkit-cdn/courier-new.ttf")
format("truetype");
}
</style><body style="user-select:none;margin:0"><div style="width:9.25in;height:6.25in;position:relative"><div><div style="position:absolute;left:0in;top:0in;z-index:5002;right:0;bottom:0;padding:0.1875in;background:url(file:///home/user/Projects/services/pdf-renderer/service/eef55f7f-7bed-44c1-ab2d-0d11c61d4fd7.png) no-repeat center center / cover"><div style="position:relative"></div></div></div><div style="position:absolute;left:0.1875in;top:0.1875in;right:0.1875in;bottom:0.1875in"></div></div></body></html>
Also tried base64 string.
Now incompatible with Heroku-18.
RuntimeError: error running wkhtmltopdf, command: ['--cache-dir', '/tmp/pydf_cache', '--margin-bottom', '5mm', '--margin-top', '5mm', '--orientation', 'Portrait', '--page-size', 'Legal', '-', '-']
response: "/app/.heroku/python/lib/python3.7/site-packages/pydf/bin/wkhtmltopdf: error while loading shared libraries: libXrender.so.1: cannot open shared object file: No such file or directory"
@samuelcolvin are you still maintaining this? thanks....
hi.
I have a script working fine on Linux (openSUSE).
I tried to use it on a Windows machine and it doesn't work.
I even tried the simple example:
pdf = pydf.generate_pdf('<h1>this is html</h1>')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\ProgramData\Miniconda3\lib\site-packages\pydf\wkhtmltopdf.py", line 145, in generate_pdf
p = _execute_wk(*cmd_args, input=html.encode())
File "C:\ProgramData\Miniconda3\lib\site-packages\pydf\wkhtmltopdf.py", line 30, in _execute_wk
return subprocess.run(wk_args, input=input, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
File "C:\ProgramData\Miniconda3\lib\subprocess.py", line 403, in run
with Popen(*popenargs, **kwargs) as process:
File "C:\ProgramData\Miniconda3\lib\subprocess.py", line 707, in __init__
restore_signals, start_new_session)
File "C:\ProgramData\Miniconda3\lib\subprocess.py", line 990, in _execute_child
startupinfo)
OSError: [winerror 193]: %1 not a valid win32 program
Note: I translated the last line ("OSError..."); my machine produced it in Spanish.
Is there any trick to avoid this error?
Thanks.
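WinError 193 generally means Windows was asked to run a file that is not a Windows executable. One way to check whether the binary bundled with pydf is the cause (an assumption, nothing in this thread confirms it) is to look at the file's first bytes. The helper names below are hypothetical; the magic numbers are the standard ones for ELF, PE and Mach-O.

```python
def binary_format(header: bytes) -> str:
    """Guess an executable's format from its first four bytes."""
    if header.startswith(b"\x7fELF"):
        return "ELF (Linux)"
    if header.startswith(b"MZ"):
        return "PE (Windows)"
    # 64-bit, 32-bit and universal Mach-O magic numbers. The last one is
    # shared with Java class files, so treat it as a hint only.
    if header[:4] in (b"\xcf\xfa\xed\xfe", b"\xce\xfa\xed\xfe", b"\xca\xfe\xba\xbe"):
        return "Mach-O (macOS)"
    return "unknown"


def format_of_file(path: str) -> str:
    """Read the first four bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return binary_format(f.read(4))
```

Calling format_of_file on the wkhtmltopdf file inside the installed pydf package (other tracebacks in this list show it under site-packages/pydf/bin/) would show whether it is a Linux binary, in which case Windows cannot run it.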
How do I render external files: CSS, images, fonts?
ar_template = """
<html>
<head>
<title>arabic</title>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
<style type="text/css">
@font-face {{
font-family: "Font";
src: url("font.ttf") format("truetype");
}}
#ar{{
font-family:Font;
font-size: 36px;
margin: 30px;
border: 1px solid black;
}}
</style>
</head>
<body>
<p id="ar" dir="rtl" lang="ar">{content}</p>
</body>
</html>
"""
I used an external font link, but it did not work.
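A possible explanation, offered as a guess: the command lines in other reports here end in '-', '-', which suggests the HTML is piped to wkhtmltopdf on stdin, so a relative URL like font.ttf has no base directory to resolve against. The sketch below swaps the relative path for an absolute file:// URL. The {font_url} placeholder and the helper absolute_file_url are additions for illustration, and the template is a shortened copy of the one above.

```python
from pathlib import Path

# Shortened version of the template above; {font_url} is a new,
# hypothetical placeholder that replaces the relative "font.ttf".
ar_template = """
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
<style type="text/css">
@font-face {{
  font-family: "Font";
  src: url("{font_url}") format("truetype");
}}
#ar {{ font-family: Font; font-size: 36px; }}
</style>
</head>
<body>
<p id="ar" dir="rtl" lang="ar">{content}</p>
</body>
</html>
"""


def absolute_file_url(relative_path: str) -> str:
    """Resolve a path against the current directory and return a file:// URL."""
    return Path(relative_path).resolve().as_uri()


html = ar_template.format(font_url=absolute_file_url("font.ttf"), content="مرحبا")
```

I believe recent wkhtmltopdf releases also block local files unless --enable-local-file-access is passed, so that may be needed as well, depending on the bundled version.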
I'm using pydf with the wkhtmltopdf "footer-text" option,
but since every parameter is lowercased in:
Line 41 in 8f9ce76
my text content is not passed through to wkhtmltopdf correctly.
I don't know if there is a reason for this lower() call on all argument values? It's OK for the key, I guess.
curl -d '<h1>this is html</h1>' -H "pdf-footer-right: UPPERCASE [page]/[topage]" http://localhost:8000/generate.pdf > created.pdf
Actual footer: uppercase 1/1
Expected footer: UPPERCASE 1/1
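A sketch of the change being asked for, under the assumption that headers are handled as name/value string pairs: only the header name is lowercased and the value is passed through untouched. The function header_to_option is hypothetical and is not the code at line 41.

```python
def header_to_option(name: str, value: str, prefix: str = "pdf-"):
    """Return (option_key, value) for a 'pdf-*' header, else None.

    Only the header name is normalised; the value keeps its case.
    """
    lowered = name.lower()
    if not lowered.startswith(prefix):
        return None
    return lowered[len(prefix):].replace("-", "_"), value


print(header_to_option("Pdf-Footer-Right", "UPPERCASE [page]/[topage]"))
# → ('footer_right', 'UPPERCASE [page]/[topage]')
```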
I couldn't use pydf inside Docker.
I am using Poetry, and I installed the packages like this:
FROM python:3.8
RUN mkdir /src
WORKDIR /src
COPY . /src
RUN curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
ENV PATH="/root/.poetry/bin:$PATH"
COPY pyproject.toml poetry.lock /opt/project/
RUN poetry config virtualenvs.create false && \
poetry install --no-dev
COPY . /opt/project/
RUN poetry install --no-dev
Here is the code where I use pydf:
from pydf import AsyncPydf
from io import BytesIO
apydf = AsyncPydf()
async def pdf_file():
    pdf_content = await apydf.generate_pdf("<h1>hello world</h1>")
    bytes_ = BytesIO(pdf_content)
    bytes_name = "file.pdf"
    return bytes_
It raises this error:
File "/src/ptime/core/tools.py", line 189, in pdf_file
pdf_content = await apydf.generate_pdf("<h1>hello world</h1>")
File "/usr/local/lib/python3.8/site-packages/pydf/wkhtmltopdf.py", line 73, in generate_pdf
raise RuntimeError('error running wkhtmltopdf, command: {!r}\n'
RuntimeError: error running wkhtmltopdf, command: ['/usr/local/lib/python3.8/site-packages/pydf/bin/wkhtmltopdf', '--cache-dir', '/tmp/pydf_cache', '-', '-']
response: "/usr/local/lib/python3.8/site-packages/pydf/bin/wkhtmltopdf: error while loading shared libraries: libjpeg.so.8: cannot open shared object file: No such file or directory"
How can I fix this?
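The error says the bundled binary needs a shared library that the image does not provide. A small diagnostic sketch, assuming a Linux container with ldd available: it lists every library the binary cannot resolve, so they can all be installed in one go. The sample text is hand-written to show the parsing and is not real output from this container.

```python
import subprocess


def missing_libraries(ldd_output: str) -> list:
    """Return library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path: str) -> list:
    """Run ldd on `path` (Linux only) and list unresolved libraries."""
    result = subprocess.run(["ldd", path], stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    return missing_libraries(result.stdout)


# Hand-written sample in ldd's usual format, for illustration only.
sample = """\
	libjpeg.so.8 => not found
	libXrender.so.1 => not found
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a000000)
"""
print(missing_libraries(sample))
# → ['libjpeg.so.8', 'libXrender.so.1']
```

Running check_binary on the path from the traceback inside the container would give the real list; which packages provide those libraries depends on the base image's distribution.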
#! /usr/bin/python
import asyncio
import aiohttp
import uvloop
from pydf import AsyncPydf
html = "<html>hello</html>"
uvloop.install() # toggle this line on/off
async def generate_async():
    apydf = AsyncPydf()
    await apydf.generate_pdf(html)
asyncio.run(generate_async())
Error:
RuntimeError: error running wkhtmltopdf, command: ['/home/sevaho/.local/share/virtualenvs/test-hTe5GLvs/lib/python3.8/site-packages/pydf/bin/wkhtmltopdf', '--cache-dir', '/tmp/pydf_cache', '-', '-']
response: "Loading pages (1/6)
QPainter::begin(): Returned false============================] 100%
Error: Unable to write to destination
Exit with code 1, due to unknown error."
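A possible workaround, not verified against uvloop: skip the asyncio subprocess path and run the blocking pydf.generate_pdf call in a worker thread instead. The sketch uses a stand-in renderer so it runs without pydf installed; render_in_thread and fake_render are hypothetical names.

```python
import asyncio
from functools import partial


async def render_in_thread(render, html: str, **options) -> bytes:
    """Run a blocking `render(html, **options)` call in the default executor."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, partial(render, html, **options))


def fake_render(html: str, **options) -> bytes:
    """Stand-in for a blocking renderer such as pydf.generate_pdf."""
    return b"%PDF-fake " + html.encode()


result = asyncio.run(render_in_thread(fake_render, "<html>hello</html>"))
print(result)
# → b'%PDF-fake <html>hello</html>'
```

Swapping fake_render for pydf.generate_pdf keeps the event loop free while the subprocess is handled by the standard library's synchronous code.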
In the rendered PDF, images are not loaded.
Here is main.py:
import pydf
html_str = """
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<title>بن اعتباری</title>
</head>
<body>
<img src="./pizza.jpg" />
</body>
</html>
"""
pdf = pydf.generate_pdf(html_str)
with open("test_doc.pdf", "wb") as f:
    f.write(pdf)
The image exists in that directory, and everything displays fine in the HTML file.
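Since the HTML is passed as a string, a relative path like ./pizza.jpg may have nothing to resolve against. One thing to try, untested here, is embedding the image directly as a base64 data: URI. The helpers to_data_uri and img_tag below are hypothetical names.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(path: str) -> str:
    """Read a file and return it as a base64 data: URI."""
    mime, _ = mimetypes.guess_type(path)
    data = Path(path).read_bytes()
    encoded = base64.b64encode(data).decode("ascii")
    return "data:{};base64,{}".format(mime or "application/octet-stream", encoded)


def img_tag(path: str) -> str:
    """Build an <img> element whose source is the embedded file."""
    return '<img src="{}" />'.format(to_data_uri(path))
```

In main.py the line with the img element would become the result of img_tag("./pizza.jpg"), inserted into html_str before calling pydf.generate_pdf.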
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/bogdan/Projects/inkit_microservices/services/pdf-renderer/service/consumer.py", line 19, in run
PDFRenderer(self.campaign_id)
File "/home/bogdan/Projects/inkit_microservices/services/pdf-renderer/service/renderer.py", line 58, in __init__
self.ioloop.run_until_complete(self.process_contacts())
File "/usr/lib/python3.6/asyncio/base_events.py", line 468, in run_until_complete
return future.result()
File "/home/bogdan/Projects/inkit_microservices/services/pdf-renderer/service/renderer.py", line 67, in process_contacts
await asyncio.gather(*tasks)
File "/home/bogdan/Projects/inkit_microservices/services/pdf-renderer/service/renderer.py", line 100, in render_pdf
front_pdf = await self.apydf.generate_pdf(front_html)
File "/home/bogdan/Projects/inkit_microservices/services/pdf-renderer/.env/lib/python3.6/site-packages/pydf/wkhtmltopdf.py", line 65, in generate_pdf
loop=self.loop
File "/usr/lib/python3.6/asyncio/subprocess.py", line 225, in create_subprocess_exec
stderr=stderr, **kwds)
File "/usr/lib/python3.6/asyncio/base_events.py", line 1194, in subprocess_exec
bufsize, **kwargs)
File "/usr/lib/python3.6/asyncio/unix_events.py", line 203, in _make_subprocess_transport
self._child_watcher_callback, transp)
File "/usr/lib/python3.6/asyncio/unix_events.py", line 867, in add_child_handler
"Cannot add child handler, "
RuntimeError: Cannot add child handler, the child watcher does not have a loop attached
Not sure if this library is still supported, but FWIW:
Python 3.6.2 (default, Jul 17 2017, 16:44:45)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import pydf
In [2]: p = pydf.generate_pdf('http://www.google.com')
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-2-911a3ffb27a6> in <module>()
----> 1 p = pydf.generate_pdf('http://www.google.com')
~/.virtualenvs/barberscore-api/lib/python3.6/site-packages/pydf/wkhtmltopdf.py in generate_pdf(html, cache_dir, grayscale, lowquality, margin_bottom, margin_left, margin_right, margin_top, orientation, page_height, page_width, page_size, image_dpi, image_quality, **extra_kwargs)
143 cmd_args = _convert_args(**py_args)
144
--> 145 p = _execute_wk(*cmd_args, input=html.encode())
146 pdf_content = p.stdout
147
~/.virtualenvs/barberscore-api/lib/python3.6/site-packages/pydf/wkhtmltopdf.py in _execute_wk(input, *args)
28 """
29 wk_args = (WK_PATH,) + args
---> 30 return subprocess.run(wk_args, input=input, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
31
32
/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
401 kwargs['stdin'] = PIPE
402
--> 403 with Popen(*popenargs, **kwargs) as process:
404 try:
405 stdout, stderr = process.communicate(input, timeout=timeout)
/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
705 c2pread, c2pwrite,
706 errread, errwrite,
--> 707 restore_signals, start_new_session)
708 except:
709 # Cleanup if the child failed starting.
/usr/local/Cellar/python3/3.6.2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
1331 else:
1332 err_msg += ': ' + repr(orig_executable)
-> 1333 raise child_exception_type(errno_num, err_msg)
1334 raise child_exception_type(err_msg)
1335
OSError: [Errno 8] Exec format error
I am running the basic example as follows:
import pydf
pdf = pydf.generate_pdf('<h1>this is html</h1>')
with open('test_doc.pdf', 'wb') as f:
    f.write(pdf)
and I get the following error:
Traceback (most recent call last):
File "pdytest.py", line 2, in <module>
pdf = pydf.generate_pdf('<h1>this is html</h1>')
File "C:\Program Files\Python37\lib\site-packages\pydf\wkhtmltopdf.py", line 145, in generate_pdf
p = _execute_wk(*cmd_args, input=html.encode())
File "C:\Program Files\Python37\lib\site-packages\pydf\wkhtmltopdf.py", line 30, in _execute_wk
return subprocess.run(wk_args, input=input, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
File "C:\Program Files\Python37\lib\subprocess.py", line 472, in run
with Popen(*popenargs, **kwargs) as process:
File "C:\Program Files\Python37\lib\subprocess.py", line 775, in __init__
restore_signals, start_new_session)
File "C:\Program Files\Python37\lib\subprocess.py", line 1178, in _execute_child
startupinfo)
OSError: [WinError 193] %1 is not a valid Win32 application
I am using python 3.7.
Could you please help?
Thank you.
When running the following:
async def show(request, response):
    response.body = await pydf.AsyncPydf().generate_pdf(
        my_html, print_media_type=True
    )
It works well on macOS, but I get the following error on Linux:
Loading pages (1/6)
QPainter::begin(): Returned false============================] 100%
Error: Unable to write to destination
Exit with code 1, due to unknown error.
Note that this is probably due to the following difference in behavior:
wkhtmltopdf http://google.com /tmp/pydf_cache
works well on macOS and returns the error above on Linux.
To solve this with wkhtmltopdf:
wkhtmltopdf http://google.com /tmp/pydf_cache/myfile
I suppose that pydf should create a temporary file instead of writing to the tmp folder, so that it behaves the same across platforms.
Is there any option I missed?
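The temporary-file idea suggested above can be sketched as follows. This is an illustration of the suggestion, not how pydf works: render_via_tempfile and fake_run are hypothetical, and the stand-in writer replaces the real wkhtmltopdf call so the snippet runs anywhere.

```python
import os
import tempfile


def render_via_tempfile(run) -> bytes:
    """Call `run(output_path)` with a fresh temp file path and return the bytes.

    `run` is any callable that writes a PDF to the given path, for example
    a wrapper around `wkhtmltopdf <url> <output_path>`.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        output_path = os.path.join(tmp_dir, "output.pdf")
        run(output_path)
        with open(output_path, "rb") as f:
            return f.read()


def fake_run(output_path: str) -> None:
    """Stand-in for the real wkhtmltopdf call."""
    with open(output_path, "wb") as f:
        f.write(b"%PDF-1.4 fake")


print(render_via_tempfile(fake_run))
# → b'%PDF-1.4 fake'
```

The output always goes to a regular file inside a directory that is removed afterwards, which avoids pointing wkhtmltopdf at a directory or at stdout.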
Dear Samuel Colvin,
I am using your library to generate PDF files. However, I found a bug when it renders this link:
https://www.highcharts.com/blog/news/175-highcharts-performance-boost.
Error response: "wkhtmltopdf: /home/sysadmin/wkhtmltopdf/qt/src/3rdparty/harfbuzz/src/harfbuzz-shaper.cpp:484: void HB_HeuristicSetGlyphAttributes(HB_ShaperItem*): Assertion `glyph_pos == item->num_glyphs' failed."
I think the bundled library needs a newer patch; please refer to the fix in ariya/phantomjs#11513. Two files, harfbuzz-hebrew.c and harfbuzz-shaper.cpp, need to be fixed.
Could you give me some guidance on how to rebuild the wkhtmltopdf binary for AWS Lambda, or, if you don't mind, could you fix this bug and upload a new wkhtmltopdf binary?
Thank you very much. I am looking forward to your reply.
Best Regards,
Andy
In the "wkhtmltopdf.py" file, the docstring mentions "Generate a pdf from either a url or a html string". How do I generate a PDF from a URL?
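An IPython traceback elsewhere in this list shows generate_pdf('http://www.google.com') sending its argument to wkhtmltopdf as HTML on stdin, so at least in that version a URL may not get special treatment. As a fallback, wkhtmltopdf itself accepts a URL and an output file on the command line. The wrapper below is a sketch, not part of pydf; both function names are hypothetical.

```python
import shutil
import subprocess


def url_to_pdf_command(url: str, output_path: str, binary: str = "wkhtmltopdf") -> list:
    """Build the command line `wkhtmltopdf <url> <output_path>`."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) URL, got {!r}".format(url))
    return [binary, url, output_path]


def url_to_pdf(url: str, output_path: str) -> None:
    """Run wkhtmltopdf if it is on PATH; raises CalledProcessError on failure."""
    binary = shutil.which("wkhtmltopdf")
    if binary is None:
        raise RuntimeError("wkhtmltopdf not found on PATH")
    subprocess.run(url_to_pdf_command(url, output_path, binary), check=True)


print(url_to_pdf_command("http://www.google.com", "google.pdf"))
# → ['wkhtmltopdf', 'http://www.google.com', 'google.pdf']
```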