alvarcarto / url-to-pdf-api

Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.

License: MIT License

url-to-pdf-api's Introduction

Deploy

URL to PDF Microservice

Web page PDF rendering done right. Microservice for rendering receipts, invoices, or any content. Packaged to an easy API.

⚠️ WARNING ⚠️ Don't serve this API publicly on the internet unless you are aware of the risks. It allows API users to run any JavaScript code inside a Chrome session on the server. It's fairly easy to expose the contents of files on the server. You have been warned! See #12 for background.

⭐️ Features:

  • Converts any URL or HTML content to a PDF file or an image (PNG/JPEG)
  • Rendered with Headless Chrome, using Puppeteer. The PDFs should match the ones generated with desktop Chrome.
  • Sensible defaults but everything is configurable.
  • Single-page app (SPA) support. Waits until all network requests are finished before rendering.
  • Easy deployment to Heroku. We love Lambda but...Deploy to Heroku button.
  • Renders lazy loaded elements. (scrollPage option)
  • Supports optional x-api-key authentication. (API_TOKENS env var)

Usage is as simple as https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com. There's also a POST /api/render if you prefer to send options in the body.
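
As a minimal sketch of the GET usage above, the snippet below builds the request URL and, optionally, an x-api-key header. The helper name buildRenderRequest, the localhost base URL, and the token value are assumptions made for this example; only the /api/render path, the url parameter, and the x-api-key header name come from this README.

```javascript
// Build the URL and headers for GET /api/render.
// `baseUrl` and `apiKey` are illustrative placeholders; the key is assumed
// to be one of the tokens the server was given through API_TOKENS.
function buildRenderRequest(baseUrl, targetUrl, apiKey) {
  const endpoint = new URL('/api/render', baseUrl);
  // searchParams.set() percent-encodes the target URL for us.
  endpoint.searchParams.set('url', targetUrl);

  const headers = {};
  if (apiKey) {
    headers['x-api-key'] = apiKey;
  }
  return { url: endpoint.toString(), headers };
}

const request = buildRenderRequest('http://localhost:9000', 'http://google.com', 'my-token');
console.log(request.url);
```

The returned url and headers can be handed to any HTTP client. Leaving apiKey out yields an empty headers object, which fits a server that runs without API_TOKENS.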

🔍 Why?

This microservice is useful when you need to automatically produce PDF files for whatever reason. The files could be receipts, weekly reports, invoices, or any content.

PDFs can be generated in many ways, but one of them is to convert HTML+CSS content to a PDF. This API does just that.

How it works

Local setup is identical except Express API is running on your machine and requests are direct connections to it.

Good to know

  • By default, the page's @media print CSS rules are ignored. We set Chrome to emulate @media screen to make the default PDFs look more like actual sites. To get results closer to desktop Chrome, add the &emulateScreenMedia=false query parameter. See more in the Puppeteer API docs.

  • Chrome is launched with --no-sandbox --disable-setuid-sandbox flags to fix usage in Heroku. See this issue.

  • Heavy pages may cause Chrome to crash if the server doesn't have enough RAM.

  • Docker image for this can be found here: https://github.com/restorecommerce/pdf-rendering-srv

Examples

⚠️ Restrictions ⚠️:

  • For security reasons the urls have been restricted and HTML rendering is disabled. For full demo, run this app locally or deploy to Heroku.
  • The demo Heroku app runs on a free dyno, which sleeps when idle. A request to a sleeping dyno may take as long as 30 seconds.

The most minimal example, render google.com

https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com

The most minimal example, render google.com as PNG image

https://url-to-pdf-api.herokuapp.com/api/render?output=screenshot&url=http://google.com

Use the default @media print instead of @media screen.

https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com&emulateScreenMedia=false

Use scrollPage=true which tries to reveal all lazy loaded elements. Not perfect but better than without.

https://url-to-pdf-api.herokuapp.com/api/render?url=http://www.andreaverlicchi.eu/lazyload/demos/lazily_load_lazyLoad.html&scrollPage=true

Render only the first page.

https://url-to-pdf-api.herokuapp.com/api/render?url=https://en.wikipedia.org/wiki/Portable_Document_Format&pdf.pageRanges=1

Render A5-sized PDF in landscape.

https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com&pdf.format=A5&pdf.landscape=true

Add 2cm margins to the PDF.

https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com&pdf.margin.top=2cm&pdf.margin.right=2cm&pdf.margin.bottom=2cm&pdf.margin.left=2cm

Wait an extra 1000 ms before rendering.

https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com&waitFor=1000

Download the PDF with a given attachment name

https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com&attachmentName=google.pdf

Wait until an element matching the selector input appears.

https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com&waitFor=input

Render HTML sent in JSON body

NOTE: Demo app has disabled html rendering for security reasons.

curl -o html.pdf -XPOST -d'{"html": "<body>test</body>"}' -H"content-type: application/json" http://localhost:9000/api/render

Render HTML sent as text body

NOTE: Demo app has disabled html rendering for security reasons.

curl -o html.pdf -XPOST -d@test/resources/large.html -H"content-type: text/html" http://localhost:9000/api/render

API

To understand the API options, it's useful to know how Puppeteer is internally used by this API. The render code is quite simple, check it out. Render flow:

  1. page.setViewport(options) where options matches viewport.*.

  2. Possibly page.emulateMedia('screen') if emulateScreenMedia=true is set.

  3. Render url or html.

    If url is defined, page.goto(url, options) is called and options match goto.*. Otherwise page.setContent(html, options) is called where html is taken from request body, and options match goto.*.

  4. Possibly page.waitFor(numOrStr) if e.g. waitFor=1000 is set.

  5. Possibly scroll the whole page to the end before rendering if e.g. scrollPage=true is set.

    Useful if you want to render a page which lazy loads elements.

  6. Render the output

  • If output is pdf, rendering is done with page.pdf(options) where options matches pdf.*.
  • Else if output is screenshot, rendering is done with page.screenshot(options) where options matches screenshot.*.
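
The render flow above can be sketched roughly as follows. This is not the project's actual render code, only an illustration of the six steps. It assumes a page object that behaves like a Puppeteer Page and uses the method names given in the list (newer Puppeteer releases may name some of them differently). scrollToEnd is a hypothetical helper written for this example.

```javascript
// Hypothetical helper: jump to the bottom of the page once.
async function scrollToEnd(page) {
  await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
}

// Sketch of the render flow; `opts` mirrors the API options described below.
async function render(page, opts) {
  // 1. Viewport
  await page.setViewport(opts.viewport);

  // 2. Media emulation
  if (opts.emulateScreenMedia) {
    await page.emulateMedia('screen');
  }

  // 3. Load the content, either from a URL or from raw HTML
  if (opts.url) {
    await page.goto(opts.url, opts.goto);
  } else {
    await page.setContent(opts.html, opts.goto);
  }

  // 4. Optional extra wait (milliseconds or a selector)
  if (opts.waitFor !== undefined && opts.waitFor !== null) {
    await page.waitFor(opts.waitFor);
  }

  // 5. Optional scroll to trigger lazy loading
  if (opts.scrollPage) {
    await scrollToEnd(page);
  }

  // 6. Produce the output
  if (opts.output === 'screenshot') {
    return page.screenshot(opts.screenshot);
  }
  return page.pdf(opts.pdf);
}
```

Because the function only depends on the page object passed in, it can be exercised with a stub page in tests, without launching Chrome.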

GET /api/render

All options are passed as query parameters. Parameter names match Puppeteer options.

These options are exactly the same as its POST counterpart, but options are expressed with the dot notation. E.g. ?pdf.scale=2 instead of { pdf: { scale: 2 }}.

The only required parameter is url.

Parameter | Type | Default | Description
--- | --- | --- | ---
url | string | - | URL to render as PDF. (required)
output | string | pdf | Specify the output format. Possible values: pdf, screenshot or html.
emulateScreenMedia | boolean | true | Emulates @media screen when rendering the PDF.
enableGPU | boolean | false | When set, enables Chrome GPU. For Windows users, this is always false. See https://developers.google.com/web/updates/2017/04/headless-chrome
ignoreHttpsErrors | boolean | false | Ignores possible HTTPS errors when navigating to a page.
scrollPage | boolean | false | Scroll page down before rendering to trigger lazy loading elements.
waitFor | number or string | - | Number of milliseconds to wait before rendering, or a selector of an element to wait for before rendering.
attachmentName | string | - | When set, the content-disposition headers are set and the browser will download the PDF instead of showing it inline. The given string will be used as the name for the file.
viewport.width | number | 1600 | Viewport width.
viewport.height | number | 1200 | Viewport height.
viewport.deviceScaleFactor | number | 1 | Device scale factor (could be thought of as dpr).
viewport.isMobile | boolean | false | Whether the meta viewport tag is taken into account.
viewport.hasTouch | boolean | false | Specifies if viewport supports touch events.
viewport.isLandscape | boolean | false | Specifies if viewport is in landscape mode.
cookies[0][name] | string | - | Cookie name (required)
cookies[0][value] | string | - | Cookie value (required)
cookies[0][url] | string | - | Cookie url
cookies[0][domain] | string | - | Cookie domain
cookies[0][path] | string | - | Cookie path
cookies[0][expires] | number | - | Cookie expiry in unix time
cookies[0][httpOnly] | boolean | - | Cookie httpOnly
cookies[0][secure] | boolean | - | Cookie secure
cookies[0][sameSite] | string | - | Strict or Lax
goto.timeout | number | 30000 | Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout.
goto.waitUntil | string | networkidle0 | When to consider navigation succeeded. Options: load, domcontentloaded, networkidle0, networkidle2. load - consider navigation to be finished when the load event is fired. domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired. networkidle0 - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms. networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.
pdf.scale | number | 1 | Scale of the webpage rendering.
pdf.printBackground | boolean | false | Print background graphics.
pdf.displayHeaderFooter | boolean | false | Display header and footer.
pdf.headerTemplate | string | - | HTML template to use as the header of each page in the PDF. Currently Puppeteer basically only supports a single line of text and you must use pdf.margins+CSS to make the header appear! See #77.
pdf.footerTemplate | string | - | HTML template to use as the footer of each page in the PDF. Currently Puppeteer basically only supports a single line of text and you must use pdf.margins+CSS to make the footer appear! See #77.
pdf.landscape | boolean | false | Paper orientation.
pdf.pageRanges | string | - | Page ranges to print, e.g., '1-5, 8, 11-13'. Defaults to the empty string, which means print all pages.
pdf.format | string | A4 | Paper format. If set, takes priority over width or height options.
pdf.width | string | - | Paper width, accepts values labeled with units.
pdf.height | string | - | Paper height, accepts values labeled with units.
pdf.fullPage | boolean | - | Create PDF in a single page
pdf.margin.top | string | - | Top margin, accepts values labeled with units.
pdf.margin.right | string | - | Right margin, accepts values labeled with units.
pdf.margin.bottom | string | - | Bottom margin, accepts values labeled with units.
pdf.margin.left | string | - | Left margin, accepts values labeled with units.
screenshot.fullPage | boolean | true | When true, takes a screenshot of the full scrollable page.
screenshot.type | string | png | Screenshot image type. Possible values: png, jpeg
screenshot.quality | number | - | The quality of the JPEG image, between 0-100. Only applies when screenshot.type is jpeg.
screenshot.omitBackground | boolean | false | Hides default white background and allows capturing screenshots with transparency.
screenshot.clip.x | number | - | Specifies x-coordinate of top-left corner of clipping region of the page.
screenshot.clip.y | number | - | Specifies y-coordinate of top-left corner of clipping region of the page.
screenshot.clip.width | number | - | Specifies width of clipping region of the page.
screenshot.clip.height | number | - | Specifies height of clipping region of the page.
screenshot.selector | string | - | Specifies css selector to clip the screenshot to.

Example:

curl -o google.pdf https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com
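
To illustrate the dot notation, here is a small sketch that flattens a POST-style options object into GET query parameters. The function names flattenOptions and toQueryString are made up for this example. The bracket form for array items simply mirrors the cookies[0][name] names in the table, and how the server decodes it is assumed rather than verified here.

```javascript
// Flatten nested options into [key, value] pairs.
// Objects use dot notation (pdf.margin.top); anything inside an array uses
// bracket notation (cookies[0][name]), mirroring the parameter table.
function flattenOptions(value, key, inArray, out) {
  if (Array.isArray(value)) {
    value.forEach((item, i) => flattenOptions(item, `${key}[${i}]`, true, out));
  } else if (value !== null && typeof value === 'object') {
    for (const [name, child] of Object.entries(value)) {
      let childKey;
      if (key === '') {
        childKey = name;
      } else if (inArray) {
        childKey = `${key}[${name}]`;
      } else {
        childKey = `${key}.${name}`;
      }
      flattenOptions(child, childKey, inArray, out);
    }
  } else if (value !== undefined && value !== null) {
    out.push([key, String(value)]);
  }
  return out;
}

// Serialize the flattened pairs; URLSearchParams handles percent-encoding.
function toQueryString(options) {
  return new URLSearchParams(flattenOptions(options, '', false, [])).toString();
}

console.log(toQueryString({ url: 'http://google.com', pdf: { scale: 2, margin: { top: '2cm' } } }));
```

Note that URLSearchParams percent-encodes the square brackets in cookie keys, which query string parsers normally decode again.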

POST /api/render - (JSON)

All options are passed in a JSON body object. Parameter names match Puppeteer options.

These options are exactly the same as its GET counterpart.

Body

Either url or html is required.

{
  // Url to render. Either url or html is required
  url: "https://google.com",

  // Either "pdf" or "screenshot"
  output: "pdf",

  // HTML content to render. Either url or html is required
  html: "<html><head></head><body>Your content</body></html>",

  // If we should emulate @media screen instead of print
  emulateScreenMedia: true,

  // If we should ignore HTTPS errors
  ignoreHttpsErrors: false,

  // If true, page is scrolled to the end before rendering
  // Note: this makes rendering a bit slower
  scrollPage: false,

  // Passed to Puppeteer page.waitFor()
  waitFor: null,

  // Passed to Puppeteer page.setCookie()
  cookies: [{ ... }],

  // Passed to Puppeteer page.setViewport()
  viewport: { ... },

  // Passed to Puppeteer page.goto() as the second argument after url
  goto: { ... },

  // Passed to Puppeteer page.pdf()
  pdf: { ... },

  // Passed to Puppeteer page.screenshot()
  screenshot: { ... },
}

Example:

curl -o google.pdf -XPOST -d'{"url": "http://google.com"}' -H"content-type: application/json" http://localhost:9000/api/render
curl -o html.pdf -XPOST -d'{"html": "<body>test</body>"}' -H"content-type: application/json" http://localhost:9000/api/render
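
The same JSON request can be prepared from Node. The sketch below only builds the request. The helper name buildJsonRenderRequest is an assumption, and the commented usage presumes a locally running server on port 9000 and a Node version with a global fetch.

```javascript
// Prepare a POST /api/render request with a JSON body.
// `options` follows the body shape documented above (url or html, pdf, etc.).
function buildJsonRenderRequest(baseUrl, options) {
  return {
    url: new URL('/api/render', baseUrl).toString(),
    init: {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify(options),
    },
  };
}

// Usage (requires a running server):
//   const { url, init } = buildJsonRenderRequest('http://localhost:9000', {
//     url: 'http://google.com',
//     pdf: { format: 'A5', landscape: true },
//   });
//   const response = await fetch(url, init);
//   const pdfBytes = Buffer.from(await response.arrayBuffer());
```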

POST /api/render - (HTML)

HTML to render is sent in the body. All options are passed as query parameters. Supports exactly the same query parameters as GET /api/render, except the url parameter.

Remember that relative links do not work.

Example:

curl -o receipt.html https://rawgit.com/wildbit/postmark-templates/master/templates_inlined/receipt.html
curl -o html.pdf -XPOST -d@receipt.html -H"content-type: text/html" http://localhost:9000/api/render?pdf.scale=1

GET /healthcheck

Health check endpoint used for monitoring if the service is still up and running.

curl -XGET http://localhost:9000/healthcheck
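
A monitoring script might poll this endpoint until the service responds. The following is a sketch only: waitUntilHealthy, the retry count, and the delay are assumptions for this example, and the actual check is passed in as a function so the polling logic stays independent of any HTTP client.

```javascript
// Poll `check` until it resolves to a truthy value or attempts run out.
// `check` is any async function; a thrown error counts as "not healthy yet".
async function waitUntilHealthy(check, attempts, delayMs) {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await check()) {
        return true;
      }
    } catch (err) {
      // Connection refused, timeout, etc.: fall through and retry.
    }
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Usage (requires a running server and a global fetch):
//   const ok = await waitUntilHealthy(
//     async () => (await fetch('http://localhost:9000/healthcheck')).ok,
//     10,
//     1000
//   );
```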

Development

To get this thing running, you have two options: run it in Heroku, or locally.

The code requires Node 8+ (async, await).

1. Heroku deployment

Scroll up in this readme to the Deploy to Heroku button. Click it and follow the instructions.

WARNING: Heroku dynos have a very low amount of RAM. Rendering heavy pages may cause the Chrome instance to crash inside the Heroku dyno. 512MB should be enough for most real-life use cases such as receipts. Some news sites may need as much as 2GB of RAM.

2. Local development

First, clone the repository and cd into it.

  • cp .env.sample .env

  • Fill in the blanks in .env

  • npm install

  • npm start (starts the Express server locally)

  • Server runs at http://localhost:9000 or on the port defined by the $PORT env variable

url-to-pdf-api's People

Contributors

aman601, andreyshishkanov, anilkapoorwingify, arcatdmz, danielruf, elroadster, esvitaly, guptarohit, keefbaker, kimmobrunfeldt, kjagiello, lanre-ade, louim, marconi, maxstalker, micgro42, nicky9door, nkimadusanka, onagurna, steakunderscore, tomasc, tranv94, tsingwong, vanthome, yundifu

url-to-pdf-api's Issues

Font weight ignored

I have an issue, whatever font-weight property I set it's being ignored. When I open html in Chrome it looks fine, but when I generate pdf from it everything has 'regular' font weight. Has anyone experienced this issue? Is there a workaround?

Add support to 0 timeout requests

As written inside README.md, you can pass the value 0 to goto.timeout when performing a request to /api/render. However, this is supported from puppeteer 0.12.0, while this project currently uses 0.11.0.
I've seen that there is a branch for updating to the newest puppeteer, so this is maybe a non-issue (but the README could be updated before we actually merge the new branch)

ERR_CERT_AUTHORITY_INVALID

In my corporation we have self-signed certs, which cause errors to be thrown. How do I disable SSL?

2017-10-11T20:44:32.919Z - info: [pdf-core.js] Set browser viewport..
2017-10-11T20:44:32.920Z - info: [pdf-core.js] Emulate @media screen..
2017-10-11T20:44:32.921Z - info: [pdf-core.js] Goto url http://google.com ..
2017-10-11T20:44:33.689Z - error: [pdf-core.js] Error when rendering page: Error: SSL Certificate error: ERR_CERT_AUTHORITY_INVALID
2017-10-11T20:44:33.689Z - error: [pdf-core.js] Error: SSL Certificate error: ERR_CERT_AUTHORITY_INVALID
    at NavigatorWatcher.waitForNavigation (/usr/src/app/node_modules/puppeteer/lib/NavigatorWatcher.js:73:20)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)
2017-10-11T20:44:33.690Z - info: [pdf-core.js] Closing browser..
2017-10-11T20:44:33.708Z - error: [error-logger.js] Request headers: host=localhost:9000, user-agent=Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0, accept=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, accept-language=en-US,en;q=0.5, accept-encoding=gzip, deflate, connection=keep-alive, upgrade-insecure-requests=1
2017-10-11T20:44:33.708Z - error: [error-logger.js] Request parameters:
2017-10-11T20:44:33.709Z - error: [error-logger.js] Request body:
2017-10-11T20:44:33.710Z - error: [error-logger.js] Error: SSL Certificate error: ERR_CERT_AUTHORITY_INVALID
    at NavigatorWatcher.waitForNavigation (/usr/src/app/node_modules/puppeteer/lib/NavigatorWatcher.js:73:20)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7) 'Error: SSL Certificate error: ERR_CERT_AUTHORITY_INVALID\n    at NavigatorWatcher.waitForNavigation (/usr/src/app/node_modules/puppeteer/lib/NavigatorWatcher.js:73:20)\n    at <anonymous>\n    at process._tickCallback (internal/process/next_tick.js:188:7)'
GET /api/render?url=http://google.com&pdf.margin.top=2cm&pdf.margin.right=2cm&pdf.margin.bottom=2cm&pdf.margin.left=2cm 500 1021.139 ms - -


how to generate a PDF with automatic height?

I use this Puppeteer microservice to generate receipts in PDF. For each receipt, width is always the same, but height changes, according to the article count in the order.

For now, I'm using the article count to approximate the required height for my receipt. It kind of works, but it's not perfect and it's a dirty way to do it.
Is there a way to tell the Puppeteer API: "Please automatically find the right PDF height, according to the HTML body height, in order to generate a perfectly sized PDF"?
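
One possible approach, sketched here and not confirmed by the maintainers, is to measure the document height inside the page and pass it to page.pdf() as an explicit height. The function name renderAutoHeightPdf and the pixel-based width are assumptions. The sketch calls Puppeteer directly, because in this API pdf.format defaults to A4 and takes priority over width and height.

```javascript
// Measure the rendered document and size the PDF page to match.
// `page` is expected to behave like a Puppeteer Page that has already
// loaded the receipt; `widthPx` is the fixed receipt width in pixels.
async function renderAutoHeightPdf(page, widthPx) {
  // Runs in the browser context and returns the full content height.
  const heightPx = await page.evaluate(() => document.documentElement.scrollHeight);

  // width/height accept values labeled with units, so pixels are passed as strings.
  return page.pdf({
    width: `${widthPx}px`,
    height: `${heightPx}px`,
  });
}
```

Margins and print-specific CSS can change the measured height, so the result may still need some tuning for a given template.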

Feature request: Support rendering images

Hi,

First of all, thanks for this awesome project. It seems to be really well thought-out, so thank you for your efforts. I also really like the ability to render logged in pages by setting a cookie in the POST request.

Since you are using puppeteer, which also supports rendering pages to images via "screenshot", it would be possible to render images as well. Is this something you're interested in? We have some users who would like this, for example for dashboards that are displayed on monitors.

Internal Server Error

Some requests to the demo Heroku app return:

{
  status: 500,
  statusText: "Internal Server Error",
  messages: [
    "Internal Server Error"
  ]
}

API key authentication

Hi folks! Could someone please point me to some documentation on how to do API key authentication. There's mention of it in the README, but no instructions yet, and I didn't see anything relevant in the Puppeteer docs. Any help appreciated!

grayscale

How can I convert my HTML to a grayscale PDF?

Security issues

It's easy to make Chrome display any file:// link. A couple of ways:

  • Redirect
  • window.location.href

Let's figure out if we could have a few ways in Puppeteer to block as much of these as possible. In any case, I'm quite confident that it's not possible to catch all of them. I would definitely recommend serving this API for "trusted" users, e.g. inside your organization.
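
As a partial mitigation only, request interception could abort anything that is not plain HTTP(S). The sketch below assumes a Puppeteer version that provides page.setRequestInterception(); the function names isAllowedUrl and blockNonHttpRequests are made up for this example. As noted above, this should not be expected to catch every way of reaching file:// content.

```javascript
// Allow only http: and https: URLs; everything else (file:, unparsable input) is rejected.
function isAllowedUrl(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch (err) {
    return false;
  }
  return parsed.protocol === 'http:' || parsed.protocol === 'https:';
}

// Abort every intercepted request whose URL is not allowed.
// `page` is expected to behave like a Puppeteer Page.
async function blockNonHttpRequests(page) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    if (isAllowedUrl(request.url())) {
      request.continue();
    } else {
      request.abort();
    }
  });
}
```

This filter also rejects other schemes such as data:, which some pages rely on for inline assets, so it trades compatibility for a narrower attack surface.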

Only HTTPS allowed?

http://localhost:9000/api/render?url=http://google.com


2017-10-05T16:05:58.491Z - warn: [error-logger.js] Request headers: host=localhost:9000, user-agent=Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0, accept=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, accept-language=en-US,en;q=0.5, accept-encoding=gzip, deflate, connection=keep-alive, upgrade-insecure-requests=1
2017-10-05T16:05:58.491Z - warn: [error-logger.js] Request parameters:
2017-10-05T16:05:58.491Z - warn: [error-logger.js] Request body:
2017-10-05T16:05:58.491Z - warn: [error-logger.js] Error: Only HTTPS allowed.
GET /api/render?url=https://google.com 403 0.824 ms - 74

Improve error handling

Puppeteer await calls are not throwing all errors. Some errors can only be caught from the page.on('error', cb) callback. We should be able to surface these errors better in the responses. Currently almost all errors except validation errors are 500 Internal Server Error. The only place to see what happened is the application logs.
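
One way this could look, as a sketch rather than a proposed patch, is to register listeners before navigation and collect whatever they report, so the API response can include more than a generic 500. collectPageErrors is a hypothetical helper. 'error' and 'pageerror' are Puppeteer page events (page crash and uncaught exception inside the page, respectively).

```javascript
// Collect errors that are only reported through page events.
// `page` is expected to behave like a Puppeteer Page (anything with .on()).
function collectPageErrors(page) {
  const errors = [];
  // Emitted when the page itself crashes.
  page.on('error', (err) => {
    errors.push({ type: 'crash', message: err.message });
  });
  // Emitted when an uncaught exception happens inside the page's JavaScript.
  page.on('pageerror', (err) => {
    errors.push({ type: 'pageerror', message: err.message });
  });
  return errors;
}

// Usage: call before page.goto(), then inspect the array after rendering.
//   const pageErrors = collectPageErrors(page);
//   await page.goto(url);
//   if (pageErrors.length > 0) { /* include them in the error response */ }
```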

Font size decrease when pdf.width and pdf.height parameters are passed.

If the height and width parameters are passed while rendering an HTML page, the font size is somehow reduced, but the size of the content boxes is not affected.
Is this the expected result? If not, is there any solution (any flag) to make sure the PDF rendering does not affect any applied styles (CSS)?

scrollPage bug

Some websites, such as example, use a special lazy loading strategy.

When users scroll quickly (<300ms), the images are not loaded. Images load only when users stop to look at the content.

So I think we need another option (scrollInterval) to let users test and decide the interval.

Related discussions:

puppeteer/puppeteer#338 (comment)

Thanks!
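
A rough sketch of what the proposed option could do: scroll one viewport at a time and pause between steps. scrollInterval does not exist in the API; it is the option suggested in this issue, and scrollPageInSteps is a hypothetical helper, not the project's scrollPage implementation.

```javascript
// Scroll down one viewport height at a time, pausing between steps so that
// lazy loaders that ignore fast scrolling get a chance to fire.
// `page` is expected to behave like a Puppeteer Page.
async function scrollPageInSteps(page, scrollIntervalMs) {
  // Measure the page once up front (content loaded later may extend it).
  const { total, step } = await page.evaluate(() => ({
    total: document.body.scrollHeight,
    step: window.innerHeight,
  }));

  for (let top = 0; top < total; top += step) {
    // The second argument is passed into the browser-side function.
    await page.evaluate((y) => window.scrollTo(0, y), top);
    await new Promise((resolve) => setTimeout(resolve, scrollIntervalMs));
  }
}
```

With the <300ms behaviour described above, an interval somewhat above 300 would be the value to experiment with.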

Searching for maintainers

Hi,

I'm searching for a few helping hands with the maintenance. This repo is definitely on my top open source maintenance priorities and I'll continue to be a maintainer also but I haven't had enough time to do good maintenance lately. I think it's healthy for any project to have at least 2 persons with collaboration rights. If you'd like to join the effort, please respond to this issue describing a bit your background in open source.

Support cookies

I would not want to pass the hosted version auth cookies but locally I would like to pass in a url and a cookie to be set. This would allow me to generate, locally, pdfs of my authenticated pages.

Thanks. It looks neat.

Becker

Issues with header and footer templates

This issue gathers a lot of issues with PDF header and footer templates. They are not as flexible as I and apparently many others have thought.

Headers and footers are not appearing

Working example: https://url-to-pdf-api.herokuapp.com/api/render?url=https://github.com&pdf.margin.bottom=100px&pdf.displayHeaderFooter=true&pdf.footerTemplate=%3Cp%20style=%22font-size:20px%22%3EFooter%20text%3C/p%3E

Styling is not working

See puppeteer/puppeteer#2916 and puppeteer/puppeteer#2388

Crash after 'read ECONNRESET' error

Hi,

I get this error randomly when I try to generate a pdf from my local url-to-pdf.

What I get

The server crashes with the following error: Error: read ECONNRESET at exports._errnoException (util.js:1018:11) at TCP.onread (net.js:568:26).
curl prints curl: (52) Empty reply from server

What I do

curl -o test_.pdf -XPOST [email protected] -H"content-type: text/html" http://localhost:9000/api/render\?emulateScreenMedia=false\&goto.waitUntil\=load

Solution?

This bug only happens AFTER the pdf generation, when browser.close() is called, but I don't know whether it is caused by puppeteer closing its connection to chrome or by the connection to one of the page's assets. Because this error happens after the pdf generation, I'm inclined to ignore it, which can be done by adding a callback with process.on('uncaughtException', (error) => {}). I'm not sure that's the correct thing to do, but for now it's the only solution I can provide.
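
A slightly narrower variant of that workaround, sketched under the assumption that the error carries the usual Node code property, would ignore only ECONNRESET and still exit on anything else. The function names are made up for this example, and whether ignoring the error is safe here remains an open question.

```javascript
// True only for the socket reset described above.
function isIgnorableSocketError(error) {
  return Boolean(error) && error.code === 'ECONNRESET';
}

// Install a handler that swallows ECONNRESET but still exits on other errors.
// `proc` is normally the global `process`; `log` is any logging function.
function installUncaughtExceptionFilter(proc, log) {
  proc.on('uncaughtException', (error) => {
    if (isIgnorableSocketError(error)) {
      log('Ignoring ECONNRESET: ' + error.message);
      return;
    }
    log('Fatal uncaught exception: ' + (error && error.stack));
    proc.exit(1);
  });
}

// Usage:
//   installUncaughtExceptionFilter(process, console.error);
```

Unlike an empty callback, this keeps unrelated bugs from being silently swallowed.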

The html file I use

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Test</title>
  <!-- Normalize or reset CSS with your favorite library -->
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/normalize/3.0.3/normalize.css">

  <!-- Load paper.css for happy printing -->
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/paper-css/0.2.3/paper.css">
  <style>
    @page { 
      size: A4; 
    }
    img {
      display: block;
      position: absolute;
    }

    img:nth-of-type(1) {
      left: 200px;
      top: 200px;
      transform: rotate(30deg);
    }
    img:nth-of-type(2) {
      left: 10%;
      top: 70%;
      transform: rotate(200deg);
    }
    img:nth-of-type(2) {
      float: right;
    }
  </style>
</head>
<body class="A4">
  <section class="sheet">
    <h1>Lorem ipsum dolor sit amet, consectetur adipisicing elit. Cum, laboriosam!</h1>
    <p>Lorem ipsum dolor sit amet, <u>consectetur</u> adipisicing elit. <em>Officia</em> <strong>aspernatur sed</strong> <i>quis</i> veniam! Itaque fugiat voluptas rerum necessitatibus iste, <b>dolores id eligendi minus! <i>Velit <u>alias</u></i> quos</b> , deleniti optio quod numquam perspiciatis sequi. Hic autem omnis non ipsam odio. Sit nostrum officia, ea officiis corporis tempore ut illum minus placeat repellat similique natus facere iusto aperiam rerum magni inventore in vero error, quisquam nihil dolore culpa optio necessitatibus, dicta? Sit quos enim, id quidem ea amet voluptas vitae odit sequi, ex aliquid commodi illum aperiam odio suscipit reiciendis</p>
    <img src="https://placehold.it/400x400" alt="placeholder">
    <img src="https://placehold.it/400x400" alt="placeholder">
    <img src="https://placehold.it/400x400" alt="placeholder">
    <img src="https://placehold.it/400x400" alt="placeholder">
    <img src="https://placehold.it/400x400" alt="placeholder">
  </section>
  <section class="sheet">
    <h1>Such wow</h1>
    <h2>Such wow</h2>
    <h3>Such wow</h3>
    <h4>Such wow</h4>
    <h5>Such wow</h5>
    <h6>Such wow</h6>
    <p style="text-align: left">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Minima, tempora? Lorem ipsum dolor sit amet, consectetur adipisicing elit. Molestiae ipsa inventore laborum rem deserunt placeat, praesentium soluta exercitationem corporis at, voluptatibus id atque amet voluptate mollitia nam sunt nisi, excepturi facilis nemo! Maiores deserunt qui, quia soluta culpa accusantium distinctio numquam eaque asperiores maxime suscipit, iusto inventore. Adipisci, quasi corporis!</p>
    <p style="text-align: right">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Minima, tempora? Lorem ipsum dolor sit amet, consectetur adipisicing elit. Laborum, suscipit? Officia rem dolorum, quisquam autem expedita ea odio aliquam dicta amet corporis voluptatum ipsam sequi ipsa accusantium enim molestiae nemo, qui, et odit quod corrupti ab? Odio, quisquam voluptatem aperiam totam illum repellendus temporibus harum dolores, laboriosam alias, doloremque et?</p>
    <p style="text-align: center">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Minima, tempora? Lorem ipsum dolor sit amet, consectetur adipisicing elit. Sapiente ipsam consectetur omnis ut repellendus, amet commodi minus fugit consequatur recusandae necessitatibus explicabo quasi nostrum eveniet dolores similique eligendi, expedita blanditiis doloremque nemo nobis. Sint aspernatur, mollitia expedita nulla est, rerum aliquam error. Provident saepe similique, dignissimos quia explicabo ab, nihil.</p>
    <p style="text-align: justify;">Lorem ipsum dolor sit amet, consectetur adipisicing elit. Minima, tempora? Lorem ipsum dolor sit amet, consectetur adipisicing elit. Sapiente ipsam consectetur omnis ut repellendus, amet commodi minus fugit consequatur recusandae necessitatibus explicabo quasi nostrum eveniet dolores similique eligendi, expedita blanditiis doloremque nemo nobis. Sint aspernatur, mollitia expedita nulla est, rerum aliquam error. Provident saepe similique, dignissimos quia explicabo ab, nihil.</p>
    <h1 style="transform: rotate(180deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(50deg);text-align: center;">AMAZING</h1>
    <h1 style="transform: rotate(80deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(300deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(260deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(120deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
    <h1 style="transform: rotate(190deg);text-align: center;">WOOOOOOOOOOOOOOW</h1>
  </section>

</body>
</html>

URL and HTML issues with POST

Hi there,

I'm having trouble getting a POST request in Mithril.js to a locally hosted version of this repo to generate a PDF from the URL I pass through. The URL field is undefined on the server side.

This is what my call looks like:

m.request({
		method: "POST",
		url: "http://localhost:9000/api/render",
		headers: {
			"content-type": "application/json",
		},
		data: {
			"url": "http://www.google.com",
		},
	})
	.then(function (result) {
		try{
			console.log('Worked');
		} catch (error) {
			console.log('Error:' + error);
		}
	})
	.catch(function (result) {
		console.log('Error: ' + result);
	})

On the server side I output the opts. I get this:

{ cookies: [],
  scrollPage: false,
  emulateScreenMedia: true,
  ignoreHttpsErrors: false,
  html: {},
  viewport:
   { width: 1600,
     height: 1200,
     deviceScaleFactor: undefined,
     isMobile: undefined,
     hasTouch: undefined,
     isLandscape: undefined },
  goto:
   { waitUntil: 'networkidle',
     networkIdleTimeout: 2000,
     timeout: undefined,
     networkIdleInflight: undefined },
  pdf:
   { format: 'A4',
     printBackground: true,
     scale: undefined,
     displayHeaderFooter: undefined,
     landscape: undefined,
     pageRanges: undefined,
     width: undefined,
     height: undefined,
     margin:
      { top: undefined,
        right: undefined,
        bottom: undefined,
        left: undefined } },
  url: undefined,
  attachmentName: undefined,
  waitFor: undefined }

When I do a curl command it works as expected, html is null and url contains the expected url.

What am I doing wrong? Thanks in advance!

Fails to navigate on a non-.com

We have an internal site that I'm trying to grab PDFs from on the fly. The app works fine on any public url, but not on our internal one.

2017-10-12T14:48:19.108Z - info: [pdf-core.js] Set browser viewport..
2017-10-12T14:48:19.109Z - info: [pdf-core.js] Emulate @media screen..
2017-10-12T14:48:19.109Z - info: [pdf-core.js] Goto url https://cef.erwf.nin.asn/ ..
2017-10-12T14:48:21.395Z - error: [pdf-core.js] Error when rendering page: Error: Failed to navigate: https://cef.erwf.nin.asn/
2017-10-12T14:48:21.396Z - error: [pdf-core.js] Error: Failed to navigate: https://cef.erwf.nin.asn/
    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)
    at <anonymous>
2017-10-12T14:48:21.396Z - info: [pdf-core.js] Closing browser..
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request headers: host=localhost:9000, user-agent=Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0, accept=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, accept-language=en-US,en;q=0.5, accept-encoding=gzip, deflate, connection=keep-alive, upgrade-insecure-requests=1
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request parameters:
2017-10-12T14:48:21.407Z - error: [error-logger.js] Request body:
2017-10-12T14:48:21.408Z - error: [error-logger.js] Error: Failed to navigate: https://cef.erwf.nin.asn/
    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)
    at <anonymous> 'Error: Failed to navigate: https://cef.erwf.nin.asn/\n    at Page.goto (/usr/src/app/node_modules/puppeteer/lib/Page.js:390:13)\n    at <anonymous>'
GET /api/render?url=https://cef.erwf.nin.asn/ 500 2484.461 ms - -

options in .env not used

When I alter the options in .env they are not used:

export NODE_ENV=development
export PORT=9990
export ALLOW_HTTP=true

When I use them as a prefix for the start command it works just fine:


ALLOW_HTTP=true PORT=9990 npm start

What am I doing wrong?

BTW, very nice piece of software!
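
Lines that start with `export` are shell syntax, so they only take effect when the file is sourced into the shell that runs `npm start`. Assuming the app does not read the file itself, a minimal sketch of loading such a file from Node could look like this; `parseEnvFile` and `loadEnvFile` are hypothetical helpers, not part of the project:

```javascript
const fs = require('fs');

// Parse lines such as "export PORT=9990" or "PORT=9990" into an object.
function parseEnvFile(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blanks and comments
    const match = line.match(/^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$/);
    if (!match) continue;
    // Strip one pair of surrounding quotes, if present.
    vars[match[1]] = match[2].replace(/^(['"])(.*)\1$/, '$2');
  }
  return vars;
}

// Hypothetical helper: copy parsed values into process.env without
// overriding variables that are already set in the shell.
function loadEnvFile(path) {
  if (!fs.existsSync(path)) return {};
  const vars = parseEnvFile(fs.readFileSync(path, 'utf8'));
  for (const [key, value] of Object.entries(vars)) {
    if (process.env[key] === undefined) process.env[key] = value;
  }
  return vars;
}

const sample = 'export NODE_ENV=development\nexport PORT=9990\nexport ALLOW_HTTP=true\n';
console.log(parseEnvFile(sample));
```

Running `source .env` in the shell before `npm start` should have the same effect without any code changes.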

random errors when rendering pdf from html via POST

First - thank you so much for creating and working on this project.

I've deployed to Heroku. Most of the time the PDF is generated, but sometimes there is an error and the entire Node server crashes.

Here is the log: url_to_pdf_api_error-01-25-2018.log

Is there a good way to debug this problem? Currently it crashes ~20% of the time. I was running on "hobby" initially, but had the same results on 1x and 2x instance types.

Cloudflare and 301 redirects

Hello, has this software been tested to handle 301 redirects?
Cloudflare issues them, and other tools don't seem to follow them.

CORS config is missing

CORS_ORIGIN is missing in config.js, and it is used in app.js:

  const corsOpts = {
    origin: config.CORS_ORIGIN, //undefined
    methods: ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS', 'HEAD', 'PATCH'],
  };
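
One possible shape for the missing setting is sketched below. It assumes the value should come from a `CORS_ORIGIN` environment variable; the `'*'` fallback and the helper names `buildConfig` and `buildCorsOptions` are illustrative assumptions, not the project's actual code:

```javascript
// Hypothetical excerpt of a config module: read CORS_ORIGIN from the
// environment, falling back to '*' (an assumed default).
function buildConfig(env) {
  return {
    CORS_ORIGIN: env.CORS_ORIGIN || '*',
  };
}

// Mirrors the corsOpts object quoted above, but with a defined origin.
function buildCorsOptions(config) {
  return {
    origin: config.CORS_ORIGIN,
    methods: ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS', 'HEAD', 'PATCH'],
  };
}

console.log(buildCorsOptions(buildConfig(process.env)));
```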

Cookies support

I was having difficulties getting cookies to be sent with my request, and I think I may have found the problem. This function here is missing the cookies assignment, and therefore the resulting cookies array is always empty.

Am I missing something? Thanks for the library by the way - it's just awesome!

Issue passing more than one parameter.

If I try to pass more than one parameter, I get an error on the second parameter.
Example: https://urltopdf2.herokuapp.com/api/render?url=https://server1.outsystemscloud.com/automatedterritoryas/PDFEmail.aspx?Tenantid=109&Territoryid=564

If I browse to the URL directly, it works with no problem. When I try to use url-to-pdf-api, I get the following error:
{"status":400,"statusText":"Bad Request","errors":[{"field":["Territoryid"],"location":"query","messages":["\"Territoryid\" is not allowed"],"types":["object.allowUnknown"]}]}

Again, if I leave off the trailing &Territoryid=564, it works without error; if I add it back, the error returns.
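
The error is consistent with the `&` in the target URL being read as a separator of the outer query string, which makes `Territoryid` a parameter of `/api/render` rather than part of `url`. A minimal sketch of percent-encoding the target before appending it, using the URLs from the example above:

```javascript
// Base endpoint of the rendering service (taken from the example above).
const endpoint = 'https://urltopdf2.herokuapp.com/api/render';

// The page to render, including its own query string.
const target =
  'https://server1.outsystemscloud.com/automatedterritoryas/PDFEmail.aspx?Tenantid=109&Territoryid=564';

// Percent-encode the whole target so that its "&" and "=" are not
// parsed as separators of the outer query string.
const requestUrl = `${endpoint}?url=${encodeURIComponent(target)}`;

console.log(requestUrl);

// Round trip: parsing the outer URL gives back the full target.
const parsed = new URL(requestUrl);
console.log(parsed.searchParams.get('url') === target);
```

With the encoded form, the outer query string contains only the `url` parameter.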

cookies

I am confused about how to assign cookies in the API. Could anyone help me? I have 3 cookies,
e.g. Evnetid = 6235765; sessionid = jshdak; documentID= sjdh; how do I enter these in the API?

I read the documentation and tried to put the values in, but I get an error every time. Could anyone help, please?
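
As a rough sketch, a cookie string like the one above can be split into the object form that Puppeteer's `page.setCookie` accepts (`name`, `value`, plus a `url` or `domain`). The `parseCookieString` helper, the `https://example.com` placeholder, and the `cookies` field of the request body are assumptions here and should be checked against the API documentation:

```javascript
// Turn "a = 1; b = 2;" into [{ name: 'a', value: '1', url }, ...].
function parseCookieString(cookieString, url) {
  return cookieString
    .split(';')
    .map((pair) => pair.trim())
    .filter((pair) => pair.includes('=')) // drops the empty trailing piece
    .map((pair) => {
      const index = pair.indexOf('=');
      return {
        name: pair.slice(0, index).trim(),
        value: pair.slice(index + 1).trim(),
        url, // scope of the cookie; page.setCookie accepts url or domain
      };
    });
}

const cookies = parseCookieString(
  'Evnetid = 6235765; sessionid = jshdak; documentID= sjdh;',
  'https://example.com' // placeholder for the page being rendered
);

// Assumed request body shape for POST /api/render.
const body = { url: 'https://example.com', cookies };
console.log(JSON.stringify(body, null, 2));
```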

Adding a footer and header on every page

Great work here -- it's 2017 and generating PDFs is still unnecessarily complicated. I'm currently using wkhtmltopdf. I'd love to stop using it and use this project as a microservice to handle all my PDF needs. However, the one thing I can't figure out is how to add a footer and/or header to every generated page. The header/footer would need to contain things like a logo, page number, warning, date, invoice number, etc., so it needs to be more customizable than the standard pdf.displayHeaderFooter option allows.

Does anyone have any experience with this? Is there something I'm missing? Thanks again for this awesome project.
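
For reference, later Puppeteer releases accept `headerTemplate` and `footerTemplate` HTML strings in `page.pdf()`, where elements with the classes `date`, `title`, `url`, `pageNumber`, and `totalPages` are filled in at print time. The sketch below only builds such an options object; whether this API forwards these fields to Puppeteer is an assumption, and `buildPdfOptions` and `escapeHtml` are hypothetical helpers:

```javascript
// Escape text before placing it into an HTML template.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: builds options in the shape page.pdf() accepts.
function buildPdfOptions(invoiceNumber) {
  const style = 'font-size:9px; width:100%; margin:0 10mm;';
  return {
    format: 'A4',
    printBackground: true,
    displayHeaderFooter: true,
    // The empty "date" span is filled in when the PDF is printed.
    headerTemplate:
      `<div style="${style}">Invoice ${escapeHtml(invoiceNumber)}` +
      ' <span class="date" style="float:right;"></span></div>',
    // "pageNumber" and "totalPages" are filled in per page.
    footerTemplate:
      `<div style="${style} text-align:center;">Page ` +
      '<span class="pageNumber"></span> of <span class="totalPages"></span></div>',
    // Leave room in the page margins for the header and footer.
    margin: { top: '25mm', bottom: '20mm' },
  };
}

console.log(buildPdfOptions('INV-0042'));
```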
