lampnick / doctron

HTML to PDF and HTML to image: Docker-powered conversion of HTML to PDF (html2pdf) and HTML to image (html2image, such as JPEG or PNG), built in Go on the Chrome kernel.

Home Page: http://doctron.lampnick.com

License: Apache License 2.0

Go 96.50% Makefile 2.57% Dockerfile 0.93%
html2pdf html2image pdf-watermark converter html pdf image png jpeg watermark

doctron's Introduction

Doctron description

Doctron is a Docker-powered, serverless, simple, fast, high-quality document conversion tool. It converts HTML to PDF (html2pdf) and HTML to images (html2image, such as JPEG or PNG) using the Chrome (Chromium) kernel, adds watermarks to PDFs, and aims to convert PDFs to images (coming soon).

Online experience

Open the following website to give it a try. Conversion may be slow because the demo server is low-spec. Doctron Live Demo

Convert preview

Encourage

If you find Doctron useful, please give it a star and a fork. Stars and forks are my greatest encouragement!

Features

  • Converts HTML to PDF or image using the Chrome kernel, so what you see is what you get.
  • Easy deployment (using Docker or Kubernetes).
  • Rich conversion parameters.
  • Custom page size when converting HTML to PDF or image.
  • Serverless support.

Installing

  • Using docker
#using default config
docker run -p 8080:8080 --rm --name doctron-alpine lampnick/doctron  
#using custom config
docker run -p 8080:8080 --rm --name doctron-alpine \
-v /path/to/doctron/conf/doctron.yaml:/doctron.yaml \
lampnick/doctron  
  • Using k8s
kubectl apply -f https://raw.githubusercontent.com/lampnick/doctron/master/manifests/k8s-doctron.yaml
  • From source code
First step:
Download google-chrome and add its location to the system PATH.
Second step:
git clone https://github.com/lampnick/doctron.git
cd doctron
go run main.go --config=./conf/default.yaml
  • Install doctron using go get
First step:
Download google-chrome and add its location to the system PATH.
Second step:
go get -v github.com/lampnick/doctron
Once that finishes, run it directly:
doctron --config=./conf/default.yaml

Quick Start

Html convert to pdf

basic
http://127.0.0.1:8080/convert/html2pdf?u=doctron&p=lampnick&url=<url>  
custom size
http://127.0.0.1:8080/convert/html2pdf?u=doctron&p=lampnick&url=<url>&marginTop=0&marginLeft=0&marginRight=0&marginbottom=0&paperwidth=4.1  
supported params
  • u/username // Doctron username
  • p/password // Doctron password
  • uploadKey // Key under which the result is uploaded to OSS
  • url // URL of the HTML page to convert
  • landscape // Paper orientation. Defaults to false.
  • displayHeaderFooter // Display header and footer. Defaults to false.
  • printBackground // Print background graphics. Defaults to false.
  • scale // Scale of the webpage rendering. Defaults to 1.
  • paperWidth // Paper width in inches. Defaults to 8.5 inches.
  • paperHeight // Paper height in inches. Defaults to 11 inches.
  • marginTop // Top margin in inches. Defaults to 1cm (~0.4 inches).
  • marginBottom // Bottom margin in inches. Defaults to 1cm (~0.4 inches).
  • marginLeft // Left margin in inches. Defaults to 1cm (~0.4 inches).
  • marginRight // Right margin in inches. Defaults to 1cm (~0.4 inches).
  • pageRanges // Page ranges to print, e.g., '1-5, 8, 11-13'. Defaults to the empty string, which means print all pages.
  • ignoreInvalidPageRanges // Whether to silently ignore invalid but successfully parsed page ranges, such as '3-2'. Defaults to false.
  • headerTemplate // HTML template for the print header. Should be valid HTML markup with the following classes used to inject printing values into them: date (formatted print date), title (document title), url (document location), pageNumber (current page number), totalPages (total pages in the document). For example, <span class=title></span> would generate a span containing the title.
  • footerTemplate // HTML template for the print footer. Should use the same format as the headerTemplate.
  • preferCSSPageSize // Whether or not to prefer the page size defined by CSS. Defaults to false, in which case the content is scaled to fit the paper size. (Setting it generally solves the problem of a single-page PDF not matching the specified size.)
  • WaitingTime // Time to wait after the page has loaded, in milliseconds. Defaults to 0, which means no wait.
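
Because the page to convert is itself passed as a query parameter, it needs to be percent-encoded, and paper sizes are given in inches. The following is a minimal sketch, not part of Doctron or its official clients, that builds an html2pdf request URL with Go's standard net/url package. The helper names mmToInches and buildHTML2PDFURL are hypothetical; the host, credentials, and parameter names are taken from the examples above, and the target page is a placeholder.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// mmToInches converts millimetres to inches (1 inch = 25.4 mm) and formats
// the result with four decimal places for use as a query parameter.
func mmToInches(mm float64) string {
	return strconv.FormatFloat(mm/25.4, 'f', 4, 64)
}

// buildHTML2PDFURL is a hypothetical helper: it appends the html2pdf path
// to the Doctron base address and percent-encodes every parameter,
// including the URL of the page to convert.
func buildHTML2PDFURL(base, user, pass, target string, extra map[string]string) string {
	q := url.Values{}
	q.Set("u", user)
	q.Set("p", pass)
	q.Set("url", target)
	for k, v := range extra {
		q.Set(k, v)
	}
	return base + "/convert/html2pdf?" + q.Encode()
}

func main() {
	// A4 portrait is 210mm x 297mm.
	a4 := map[string]string{
		"paperWidth":  mmToInches(210),
		"paperHeight": mmToInches(297),
	}
	fmt.Println(buildHTML2PDFURL("http://127.0.0.1:8080", "doctron", "lampnick",
		"https://example.com/report?id=1&lang=en", a4))
}
```

url.Values.Encode sorts parameters by key, so the order in the printed URL differs from the hand-written examples; query parameter order should not matter to the server.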

Html convert to image

basic
http://127.0.0.1:8080/convert/html2image?u=doctron&p=lampnick&url=<url>  
custom size
http://127.0.0.1:8080/convert/html2image?u=doctron&p=lampnick&url=<url>&customClip=true&clipX=0&clipY=0&clipWidth=400&clipHeight=1500&clipScale=2&format=jpeg&Quality=80  
supported params
  • u/username // Doctron username
  • p/password // Doctron password
  • uploadKey // Key under which the result is uploaded to OSS
  • url // URL of the HTML page to convert
  • format // Image compression format (defaults to png).
  • quality // Compression quality in the range [0..100] (jpeg only).
  • customClip // The clip parameters below take effect only when this is set.
  • clipX // Capture the screenshot of a given region only. X offset in device independent pixels (dip).
  • clipY // Capture the screenshot of a given region only. Y offset in device independent pixels (dip).
  • clipWidth // Capture the screenshot of a given region only. Rectangle width in device independent pixels (dip).
  • clipHeight // Capture the screenshot of a given region only. Rectangle height in device independent pixels (dip).
  • WaitingTime // Time to wait after the page has loaded, in milliseconds. Defaults to 0, which means no wait.
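
Several issue reports further down this page show Doctron answering with a JSON body such as {"code":30000000,...} where a file was expected, and the quoted access logs appear to show status 200 on those requests. A client may therefore want to inspect the body before saving it. The sketch below is one way to do that in Go, under the assumption that every error uses that envelope; apiError, parseResult, and fetch are hypothetical names, and a local stand-in server replaces a real Doctron instance.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// apiError mirrors the JSON envelope quoted in the issue reports,
// e.g. {"code":30000000,"message":"context canceled","data":null}.
type apiError struct {
	Code    int    `json:"code"`
	Message string `json:"message"`
}

// parseResult returns the body unchanged when it looks like converted
// output, or an error when it decodes as the JSON envelope above.
// PNG, JPEG, and PDF data never start with '{', so the check is cheap.
func parseResult(body []byte) ([]byte, error) {
	trimmed := bytes.TrimSpace(body)
	if len(trimmed) > 0 && trimmed[0] == '{' {
		var e apiError
		if err := json.Unmarshal(trimmed, &e); err == nil && e.Code != 0 {
			return nil, fmt.Errorf("doctron error %d: %s", e.Code, e.Message)
		}
	}
	return body, nil
}

// fetch performs the GET request and hands the body to parseResult.
func fetch(requestURL string) ([]byte, error) {
	resp, err := http.Get(requestURL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseResult(body)
}

func main() {
	// A stand-in server so the sketch runs without a Doctron instance.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"code":30000000,"message":"context canceled","data":null}`)
	}))
	defer srv.Close()

	_, err := fetch(srv.URL + "/convert/html2image?u=doctron&p=lampnick&url=https%3A%2F%2Fexample.com")
	fmt.Println(err)
}
```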

Pdf add watermark

add image watermark
http://127.0.0.1:8080/convert/pdfAddWatermark?u=doctron&p=lampnick&url=<pdf url>&imageUrl=<image url>
supported params
  • u/username // Doctron username
  • p/password // Doctron password
  • uploadKey // Key under which the result is uploaded to OSS
  • url // URL of the PDF to watermark
  • imageUrl // URL of the watermark image; png and jpeg are supported

Pdf convert to image

coming soon

Doctron Client

Doctron go client

Doctron php client

License

Doctron is released under the Apache 2.0 license. See LICENSE.txt

doctron's People

Contributors

andotorg, pinguo-chenhuaibing, schaepher

doctron's Issues

Error 30000000 after the Docker container starts

After starting the Docker container, requesting http://localhost:8080/convert/html2image?u=doctron&p=lampnick&url=...... returns the following error:
{"code":30000000,"message":"context canceled","data":null}
What causes this?

Starting locally with go run requires a .doctron config file. What is the default content of this file?

Error after generation

Hello.

I have this issue.
When I try to generate an image or a pdf, in the output of the generated file I can read this string

{"code":30000000,"message":"worker run process failed.job request timed out","data":null}

What type of error is this? What can I do to solve it?

Thanks.

Timeout of loading page or something else

Hello again!
After updating to v0.3.0-alpine, I tried to render a heavy page that loads a lot of data via AJAX.

After that I ran into two issues:

  1. On the first request to http://development-docker-monitoring:10000/convert/html2image?u=ddd&p=ddd&url=http://dev-monitoring-axibase:8080/pub/db/widgets_BIGBOARD?do-not-track=true&theme=black I get this result page:
    {"code":30000000,"message":"worker run process failed.job request timed out","data":null} and these compose logs:
ddd-render | Now listening on: http://0.0.0.0:10000
ddd-render | Application started. Press CTRL+C to shut down.
ddd-render | [INFO] 2021/02/04 14:33 [/convert/html2image?u=ddd&p=ddd&url=http://dev-monitoring-axibase:8080/pub/db/widgets_BIGBOARD?do-not-track=true&theme=black][schema: invalid path "theme"]
ddd-render | [INFO] 2021/02/04 14:33 200 10.000796493s 172.16.0.2 GET /convert/html2image 
  2. After URL-encoding the URL, I first get the same error as above, and then https://prnt.sc/ya5ker after a refresh, with these logs:
ddd-render | [INFO] 2021/02/04 14:35 200 10.001182781s 172.16.0.2 GET /convert/html2image
ddd-render | [INFO] 2021/02/04 14:35 uuid:[c0207652-5f7f-4065-9f9b-874eaa92e83a],doctron.Convert Elapsed [10.019974817s],url:[/convert/html2image?u=ddd&p=ddd&url=http%3A%2F%2Fdev-monitoring-axibase%3A8080%2Fpub%2Fdb%2Fwidgets_BIGBOARD%3Fdo-not-track%3Dtrue%26theme%3Dblack]
ddd-render | [INFO] 2021/02/04 14:39 uuid:[3a2bf1ab-8a60-4b4e-937a-a796311723ed],doctron.Convert Elapsed [3.361823045s],url:[/convert/html2image?u=ddd&p=ddd&url=http%3A%2F%2Fdev-monitoring-axibase%3A8080%2Fpub%2Fdb%2Fwidgets_BIGBOARD%3Fdo-not-track%3Dtrue%26theme%3Dblack]
ddd-render | [INFO] 2021/02/04 14:39 200 3.366392817s 172.16.0.2 GET /convert/html2image

The timeout update in v0.3.0-alpine does not fully help.

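Chinese characters do not appear in the generated PDF

In the returned PDF file, all Chinese characters are blank. The system is Ubuntu 18.04. Docker is not used; Chromium was installed directly from the command line: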

apt-get install chromium-browser chromium-browser-l10n 

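PDF generation is rather slow

I pulled the image with docker pull and started it.
Converting a URL to an image succeeds in a few hundred milliseconds,
but converting to PDF takes five to six seconds or more.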

Error after startup

I deployed with go get and it started successfully, but running a conversion returns this error:
{"code":30000000,"message":"exec: "google-chrome": executable file not found in $PATH","data":null}
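
The message above says that no executable named google-chrome could be found on the PATH of the Doctron process, which matches the first installation step in the README. The sketch below is a small pre-flight check using Go's standard os/exec package; findBrowser is a hypothetical helper, and the candidate names other than google-chrome are assumptions about common Chromium package names. Doctron itself, judging by the error, looks for google-chrome, so finding only another name would probably still call for a symlink or a PATH change.

```go
package main

import (
	"fmt"
	"os/exec"
)

// findBrowser returns the path of the first candidate found on PATH.
func findBrowser(candidates []string) (string, error) {
	for _, name := range candidates {
		if path, err := exec.LookPath(name); err == nil {
			return path, nil
		}
	}
	return "", fmt.Errorf("none of %v found in $PATH", candidates)
}

func main() {
	// "google-chrome" is the name from the error message; the other
	// names are assumptions, not something Doctron is known to try.
	path, err := findBrowser([]string{"google-chrome", "chromium-browser", "chromium"})
	if err != nil {
		fmt.Println("browser missing:", err)
		return
	}
	fmt.Println("browser found at", path)
}
```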

I found some pictures that couldn't be displayed

Errors under concurrent conversion requests

Error details
[0925/052143.197062:FATAL:zygote_host_impl_linux.cc(173)] Check failed: process.IsValid(). Failed to launch zygote process
Received signal 6
r8: 00007ffff07033c4 r9: 00007fcace1de420 r10: 0000000000000008 r11: 0000000000000246
r12: 00007ffff0703b90 r13: 00007ffff0703acc r14: 00007ffff0703928 r15: 000000000000007a
di: 0000000000000002 si: 00007ffff0703290 bp: 00007ffff0703290 bx: 0000000000000000
dx: 0000000000000000 ax: 0000000000000000 cx: 00007fcad30a8a71 sp: 00007ffff0703288
ip: 00007fcad30a8a71 efl: 0000000000000246 cgf: 002b000000000033 erf: 0000000000000000
trp: 0000000000000000 msk: 0000000000000000 cr2: 0000000000000000
[end of stack trace]
Calling _exit(1). Core file will not be generated.

Custom config cannot be mounted on CentOS

When I start the container under WSL (Ubuntu), it works as expected:
docker run --rm --name doctron-alpine -v ${PWD}/doctron.yaml:/doctron.yaml lampnick/doctron
But on CentOS 8.12 it reads the default config, even though the custom config has been mounted into the container.
image
My Docker version is 20.10.8.

Error when running with go

Running with go produces the following error:
[read config ReadInConfig] err: Config File ".doctron" Not Found in "[C:\Users\iso]"
How can I fix this?

Tag 0.3.1

Hello.

Can I use the latest release 0.3.1?
In this page the label "Verified" isn't present on the relative tag.
Let me know.

Many thanks.

Cannot upload to Alibaba Cloud OSS; the error is as follows:

http://127.0.0.1:8000/convert/html2image?u=doctron&p=lampnick&uploadKey=1123123123&url=http://www.baidu.com&customClip=true&clipX=0&clipY=0&clipWidth=400&clipHeight=1500&clipScale=2&format=jpeg&Quality=80

Error message:
{"code":30000000,"message":"worker run process failed.job request timed out","data":null}

The contents of /doctron.yaml are as follows:

{
    "Doctron": {
        "MaxConvertWorker": 50,
        "Env": "prod",
        "Retry": true,
        "MaxConvertQueue": 60,
        "ConvertTimeout": 10,
        "Uploader": "alioss",
        "Domain": "0.0.0.0:8080",
        "TLSCertFile": "certfile",
        "TLSKeyFile": "keyfile",
        "User": [
            {
                "Username": "doctron",
                "Password": "lampnick"
            }
        ]
    },
    "Oss": {
        "Endpoint": "oss-cn-hangzhou-internal.aliyuncs.com",
        "AccessKeyId": "LTAI5t7U",
        "AccessKeySecret": "gbFg5soa",
        "BucketName": "images",
        "PrivateServerDomain": "images.r3434.cn"
    }
}

Also, how can I save the output to a specific directory, such as images/test/?

Custom HTML

Is it possible to pass in custom HTML as a parameter instead of pointing to a URL?
I want to generate a pdf from HTML templates (using Jinja/Nunjucks)

Trying to get in touch regarding a security issue

Hey there!

I'd like to report a security issue but cannot find contact instructions on your repository.

If not a hassle, might you kindly add a SECURITY.md file with an email, or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.

Thank you for your consideration, and I look forward to hearing from you!

(cc @huntr-helper)

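Converted PDF comes out smaller than expected

When converting HTML to PDF, HTML content sized for A4 paper (210mm×297mm) comes out smaller than expected in the resulting PDF. (DefaultMargin is set to 0, DefaultPaperWidth is set to the value converted to inches, 11.69291339, and the height is paperHeight=8.2677 inches.)
Example: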


HTML to PDF conversion fails on ARM architecture

Basic information:

image

Process information during conversion

image

Error: with a doctron image built for arm64 Alpine from the provided Dockerfile, converting HTML to PDF returns the following error. What is the cause?
{"code":20000000,"message":"worker run process failed.job request timed out","data":null}

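Conversion when the image exceeds 3 MB

For images converted via a one-time link, there is no response once the image exceeds 3 MB (the timeout is set to 600000). The largest successful conversion so far is 2.46 MB.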
width = "420"
height = "11000"
scope = "1.0";

go run main.go fails on startup

[read config ReadInConfig] err: Config File ".doctron" Not Found in "[/root]"
exit status 1
go version go1.18.3 linux/amd64
centos

URLs with query parameters cannot be converted

For example, converting a Grafana page such as http://10.206.230.7:30001/d/G9Q1Ncdnk/zhu-ji-jin-cheng-zi-yuan-jian-kong?orgId=1&refresh=10s&kiosk=tv produces the following errors in the log:
[INFO] 2021/11/25 09:55 [/convert/html2pdf?u=doctron&p=lampnick&url=http://10.206.230.7:30001/d/G9Q1Ncdnk/zhu-ji-jin-cheng-zi-yuan-jian-kong?orgId=1&refresh=10s&kiosk=tv][schema: invalid path "refresh" (and 1 other error)]
[INFO] 2021/11/25 09:55 uuid:[02864717-7b67-4814-ada3-f7cb245538b8],doctron.Convert Elapsed [2.678866504s],url:[/convert/html2pdf?u=doctron&p=lampnick&url=http://10.206.230.7:30001/d/G9Q1Ncdnk/zhu-ji-jin-cheng-zi-yuan-jian-kong?orgId=1&refresh=10s&kiosk=tv]
[INFO] 2021/11/25 09:55 200 2.681220589s 10.254.207.38 GET /convert/html2pdf
[INFO] 2021/11/25 09:55 [/convert/html2pdf?u=doctron&p=lampnick&url=http://10.206.230.7:30001/d/G9Q1Ncdnk/zhu-ji-jin-cheng-zi-yuan-jian-kong?orgId=1&refresh=10s&kiosk=tv][schema: invalid path "kiosk" (and 1 other error)]
[INFO] 2021/11/25 09:55 uuid:[acbdb29a-ac2c-4135-be97-a31d07a3c74f],doctron.Convert Elapsed [2.374941014s],url:[/convert/html2pdf?u=doctron&p=lampnick&url=http://10.206.230.7:30001/d/G9Q1Ncdnk/zhu-ji-jin-cheng-zi-yuan-jian-kong?orgId=1&refresh=10s&kiosk=tv]
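
A likely explanation for the schema: invalid path lines, not confirmed by the maintainers, is that the target URL was passed without percent-encoding, so every & inside it starts a new top-level parameter of the Doctron request. The sketch below reproduces that splitting with Go's standard net/url package and shows that url.QueryEscape keeps the target intact; topLevelKeys is a hypothetical helper written for this illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// topLevelKeys parses a raw query string and returns the names of the
// parameters a server would see, plus the value it gets for "url".
func topLevelKeys(rawQuery string) (map[string]bool, string, error) {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return nil, "", err
	}
	keys := make(map[string]bool, len(values))
	for k := range values {
		keys[k] = true
	}
	return keys, values.Get("url"), nil
}

func main() {
	target := "http://10.206.230.7:30001/d/G9Q1Ncdnk/zhu-ji-jin-cheng-zi-yuan-jian-kong?orgId=1&refresh=10s&kiosk=tv"

	// Unencoded: "refresh" and "kiosk" leak out as separate parameters
	// and "url" is cut off after orgId=1.
	keys, got, _ := topLevelKeys("u=doctron&p=lampnick&url=" + target)
	fmt.Println(keys["refresh"], keys["kiosk"], got)

	// Encoded: the whole target stays inside "url".
	keys, got, _ = topLevelKeys("u=doctron&p=lampnick&url=" + url.QueryEscape(target))
	fmt.Println(keys["refresh"], keys["kiosk"], got == target)
}
```

The first line printed reports true for both stray parameters along with a truncated target; the second reports false for both and confirms the target round-trips unchanged.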

Render Icon

I have an issue with doctron, Doctron can't render emoji
Somebody can help me to fix it?
Thanks
example: 🛍 🛍🛍

Custom header and footer

I see that the code has two parameters, HeaderTemplate and FooterTemplate, but they are not mentioned in the documentation. Why is that?
