shark.pdfconvert's Introduction

Shark.PdfConvert

What is Shark.PdfConvert?

Shark.PdfConvert is a simple .NET Core (also targets net451) wrapper around the WkHtmlToPdf tool. Most options are exposed through a PdfConversionSettings object; others can be specified with the Custom overrides for the configuration area you want.

Conversion setting defaults are set for a Windows environment and assume you have the WkHTMLToPDF (x64) tool installed. You can override the path to the tool by setting PdfConversionSettings.PdfToolPath.

You will need to install or download WkHtmlToPdf yourself; it is not embedded in the NuGet package.
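
Because the defaults assume Windows, a small helper that picks a tool path per operating system may be useful on other platforms. The following is only a sketch: the WkHtmlToPdfLocator class, its method names, and the /usr/local/bin/wkhtmltopdf location are assumptions for illustration and are not part of Shark.PdfConvert. The Windows path is the one that appears in the library's "not found" error message.

```csharp
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of Shark.PdfConvert.
public static class WkHtmlToPdfLocator
{
    // Windows location taken from the library's "not found" error message.
    private const string WindowsDefault = @"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe";

    // Assumed location on Linux/macOS; adjust to wherever wkhtmltopdf is installed.
    private const string UnixDefault = "/usr/local/bin/wkhtmltopdf";

    // Pure function so the choice can be checked without touching the file system.
    public static string CandidatePath(bool isWindows)
    {
        return isWindows ? WindowsDefault : UnixDefault;
    }

    // Picks the candidate for the current OS and fails early if it does not exist.
    public static string Resolve()
    {
        string path = CandidatePath(RuntimeInformation.IsOSPlatform(OSPlatform.Windows));
        if (!File.Exists(path))
        {
            throw new FileNotFoundException(
                "wkhtmltopdf was not found; install it or adjust the path.", path);
        }
        return path;
    }
}
```

The value returned by Resolve() could then be assigned to PdfConversionSettings.PdfToolPath before calling PdfConvert.Convert.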

Sample 1: Static HTML Content

PdfConvert.Convert(new PdfConversionSettings
{
    Title = "My Static Content",
    Content = @"<h1>Lorem ipsum dolor sit amet consectetuer adipiscing elit 
	    I SHOULD BE RED BY JAVASCRIPT</h1>
		<script>document.querySelector('h1').style.color = 'rgb(128,0,0)';</script>",
    OutputPath = @"C:\temp\temp.pdf"
});

Sample 2: Get Content from a URL

PdfConvert.Convert(new PdfConversionSettings
{
    Title = "My URL based Content",
    ContentUrl = "http://www.lipsum.com/",
    OutputPath = @"C:\temp\temp-url.pdf"
});

Sample 3: Use Streams for Output and Input

PdfConversionSettings config = new PdfConversionSettings
{
    Title = "Streaming my HTML to PDF"
};

using (var fileStream = new FileStream(Path.GetTempFileName() + ".pdf", FileMode.Create))
{
    var task = new System.Net.Http.HttpClient().GetStreamAsync("http://www.google.com");
    task.Wait();

    using (var inputStream = task.Result)
    {
        PdfConvert.Convert(config, fileStream, inputStream);
    }
}

Sample 4: Mix and Match

PdfConversionSettings config = new PdfConversionSettings
{
    Title = "A little bit of Everything",
    GenerateToc = true,
    TocHeaderText = "Table of MY Contents",
    PageCoverUrl = "https://blackrockdigital.github.io/startbootstrap-landing-page/",
    ContentUrl = "http://www.lipsum.com/",
    PageHeaderHtml = @"
        <!DOCTYPE html>
        <html><body>
        <div style=""background-color: red; color: white; text-align: center; width: 100vw;"">SECRET SAUCE</div>
        </body></html>"
};

using (var fileStream = new FileStream(Path.GetTempFileName() + ".pdf", FileMode.Create))
{
    PdfConvert.Convert(config, fileStream);
}

Sample 5: Usage inside MVC Controller Action

public IActionResult ConvertToPdf([FromBody] PdfConversionSettings model) 
{
    // TAKE CARE when accepting the conversion settings from user land. It would be best
    // to just NOT DO it: accept your own custom model and map the parameters as needed.
    // If you insist, you could do something like the following to prevent malicious code
    // execution. In my testing the Custom*Args members are not a valid attack vector, but
    // PdfToolPath certainly is. Never trust the client.
#if DEBUG
    // set path to executable, UNSAFE DEBUG USE ONLY FOR TESTING
    model.PdfToolPath = model.PdfToolPath ?? _host.ContentRootPath + @"\wkhtmltopdf.exe";
#else
    // set path to executable
    model.PdfToolPath = _host.ContentRootPath + @"\wkhtmltopdf.exe";
#endif	  

    if (model.OutputFilename.EndsWith(".pdf") == false) model.OutputFilename = model.OutputFilename + ".pdf";

    var memoryStream = new MemoryStream();
    PdfConvert.Convert(model, memoryStream);
    return new FileContentResult(memoryStream.ToArray(), MimeTypes.Pdf)
    {
        FileDownloadName = model.OutputFileName
    };
}

Sample 6: Get Content from multiple URLs

var settings = new PdfConversionSettings
{
    Title = "My Content from multiple URLs",
    OutputPath = @"C:\temp\temp-url-multiple.pdf"
};
settings.ContentUrls.Add("http://www.lipsum.com/");
settings.ContentUrls.Add("http://www.google.com/");

PdfConvert.Convert(settings);

Sample 7: Zoom and Page Size (Issue #5)

PdfConvert.Convert(new PdfConversionSettings
{
    Title = "Converted by Shark.PdfConvert",
    LowQuality = false,
    Margins = new PdfPageMargins() { Bottom = 10, Left = 10, Right = 10, Top = 10 },
    Size = PdfPageSize.A3,
    Zoom = 3.2f,
    Content = @"<h1>Lorem ipsum dolor sit amet consectetuer adipiscing elit I SHOULD BE RED BY JAVASCRIPT</h1><script>document.querySelector('h1').style.color = 'rgb(128,0,0)';</script>",
    OutputPath = @"C:\temp\sample7.pdf"
});

Revision History

  • 1.0.4 - Merged PRs from very patient PR submitters #16, #9, and #7
  • 1.0.3 - Fixed Issue #5 with Zoom / Page Size options and Fixed Header/Footer/Cover issues.
  • 1.0.2 - Added ContentUrls property to PdfConversionSettings to allow for multiple URLs to be specified. Requested via Issue.
  • 1.0.1 - Spoke too soon: updated the samples (they had a typo) and made small tweaks in the code; nothing breaking or signature modifying
  • 1.0.0 - Should be stable going forward except for any bugs found. Modified Convert method signature to be a bit more sane, Added additional static content options, Added Url overrides if you wanted to have WkHTMLToPDF grab external sites for any portion of the generated document, exposed some process options
  • 0.1.0 - Initial Upload

shark.pdfconvert's People

Contributors

bartek4c, cp79shark, luca-defranceschi-touchmultimedia, sbfrancies

shark.pdfconvert's Issues

not converting arabic language

I have an HTML document in two languages, English and Arabic.
The English text and the design convert properly to PDF, but the Arabic text is replaced by special characters.

Throws an error on Linux about not finding wkhtmltopdf.exe.

Unhandled exception. System.ArgumentException: File 'C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe' not found. Check if wkhtmltopdf application is installed and set the correct location before calling this method.
   at Shark.PdfConvert.PdfConvert.Convert(PdfConversionSettings config, Stream pdfOutputStream, Stream contentInputStream, Stream coverInputStream, Stream footerInputStream, Stream headerInputStream, Action`2 outputCallback)
   at .Program.Main(String[] args) in /source/App/Program.cs:line 11

IOException in the library

It doesn't seem like this library is getting much attention anymore, but I figure I'll put this here just in case the developer sees it. It appears the library is throwing an IOException when attempting to delete one of the temporary files it creates in C:\WINDOWS\TEMP. Here is the error I got:

The process cannot access the file 'C:\WINDOWS\TEMP\ce20a607-f3c8-4116-bd58-74a731b790b0.html' because it is being used by another process.
STACKTRACE: at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.File.InternalDelete(String path, Boolean checkHost)
at System.IO.File.Delete(String path)
at Shark.PdfConvert.PdfConvert.Convert(PdfConversionSettings config, Stream pdfOutputStream, Stream contentInputStream, Stream coverInputStream, Stream footerInputStream, Stream headerInputStream, Action`2 outputCallback)
at .ConvertToPdf(String& htmlText) in :line 67

I know for a fact I'm not doing anything in the temp directory in my application, and when I dug down into this library I saw that you're creating and writing the html content into a temporary file in the format of ".html" out at C:\Windows\Temp.

I looked through the code to see where the file is getting used, and I'm unable to tell exactly which file it is that is having the issue when being deleted. I'm using this library in a small print service that my server uses, it seems like the issue is happening early in the morning. My sneaking suspicion is that Windows is locking down this file somehow because of where it lives, but I don't know enough about Windows to make that kind of conclusion. I'll be running a test on my end here to see if I can replicate the issue after not printing something for a while and coming back in in the morning and trying again and see if the issue repeats. If that's the case, I'll likely just fork off the project and modify the code to cause the temporary files to be created in a folder I know the system won't fiddle with and see then if the issue persists. As far as I can tell, the library is properly handling files, and it's not an immediate logic problem.
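
For anyone forking the project as described above, one option is to retry the delete a few times before giving up, since a sharing violation on a temporary file is often short-lived. This is a minimal sketch and not the library's code: the TempFileCleanup class, the TryDelete name, and the default attempt count and delay are all assumptions chosen for illustration.

```csharp
using System.IO;
using System.Threading;

// Hypothetical helper for a fork; not part of Shark.PdfConvert.
public static class TempFileCleanup
{
    // Tries to delete a file, retrying when an IOException (for example
    // "being used by another process") is thrown. Returns false if every
    // attempt failed. Other exception types are not caught.
    public static bool TryDelete(string path, int attempts = 5, int delayMs = 200)
    {
        for (int i = 0; i < attempts; i++)
        {
            try
            {
                File.Delete(path); // does nothing if the file is already gone
                return true;
            }
            catch (IOException)
            {
                // Wait before the next attempt, but not after the last one.
                if (i < attempts - 1) Thread.Sleep(delayMs);
            }
        }
        return false;
    }
}
```

A false return value could be logged instead of thrown, so that a locked temporary file does not fail a conversion that otherwise succeeded.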

Custom header

Hi,

Thanks for a great tool! It works great for me, apart from one fairly big issue. When I generate the PDF I point the tool to the URL of another application, but I also need to pass an authorization token with the request. Currently I don't know how to do it. I have looked into the WkHtmlToPdf documentation and found the --custom-header option in the tool, but I'm not sure how to use it with your code. Any chance you could help me with that?
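
A possible approach, untested against this library, is to pass the flag through one of the Custom overrides. wkhtmltopdf's --custom-header option takes a header name and a value, and --custom-header-propagation sends the header with each resource request as well. The sketch below only builds the argument string; the CustomHeaderArgs class and its Build method are hypothetical names, and assigning the result to PdfConversionSettings.CustomWkHtmlPageArgs is an assumption based on the BuildOptions listing quoted elsewhere on this page, where setting that property also replaces the built-in --zoom handling.

```csharp
// Hypothetical helper, not part of Shark.PdfConvert.
public static class CustomHeaderArgs
{
    // Builds wkhtmltopdf page arguments for one custom HTTP header.
    // Double quotes are stripped so the value cannot break out of its
    // quoted command-line argument.
    public static string Build(string name, string value, bool propagate)
    {
        string safeName = name.Replace("\"", "");
        string safeValue = value.Replace("\"", "");
        string args = "--custom-header \"" + safeName + "\" \"" + safeValue + "\" ";
        if (propagate)
        {
            // Also send the header when fetching images, scripts, and stylesheets.
            args += "--custom-header-propagation ";
        }
        return args;
    }
}
```

For example, Build("Authorization", "Bearer <token>", false) yields a string that could be assigned to CustomWkHtmlPageArgs before calling PdfConvert.Convert.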
Thanks

pageto, page, etc. keywords are not being replaced

Is it possible that the keyword replacer for 'frompage', 'topage', 'page', 'webpage', 'section', 'subsection', 'subsubsection' is not working, or not implemented correctly, for the headerHtml and footerHtml? You know, where you should just type <span class="page"></span> and the page number should appear inside the <span>.

I fixed this by inserting the subst() JS script and calling it onload of <body> in my html-file like this.

    ...
    <script>
        function subst() {
          var vars={};
          var x=document.location.search.substring(1).split('&');
          for (var i in x) {var z=x[i].split('=',2);vars[z[0]] = unescape(z[1]);}
          var x=['frompage','topage','page','webpage','section','subsection','subsubsection'];
          for (var i in x) {
            var y = document.getElementsByClassName(x[i]);
            for (var j=0; j<y.length; ++j) y[j].textContent = vars[x[i]];
          }
        }
    </script>
</head>

<body onload="subst()">
 ...

Maybe this helps somebody.

Problem with Zoom and Page Size settings

Hello,

I have a problem with the zoom and page size settings: I receive an error from wkhtmltopdf. This is my config:

Shark.PdfConvert.PdfConversionSettings config = new Shark.PdfConvert.PdfConversionSettings
                    {
                        Title = "Converted by Shark.PdfConvert",
                        LowQuality = false,
                        Margins = new Shark.PdfConvert.PdfPageMargins() { Bottom = 10, Left = 10, Right = 10, Top = 10 },
                        Size = Shark.PdfConvert.PdfPageSize.A4,
                        Zoom = 1.2f,
                        OutputPath = @"C:\DATA\pdfSharkTest.pdf"
                    };

This is the error I receive:

WkHTMLToPdf conversion of HTML data failed. Output: \r\nWkHTMLToPdf exited with code 1.

Am I using it wrong? :) Without the Zoom and Size options it generates the file normally.

PageWidth and PageHeight not working

I need to create a pdf file with custom page width and height in mm.

I have tried to set it in the configuration, but it does not seem to work. I checked the source code, and it seems that this parameter is not used in the rendering process of the PDF file.

Can you please provide support for it?
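
Until the library handles these settings, a possible workaround is to pass wkhtmltopdf's own --page-width and --page-height options through a Custom override. The sketch below only builds the argument string; the PageSizeArgs class and its Build method are hypothetical names. Assigning the result to PdfConversionSettings.CustomWkHtmlArgs is an assumption based on the BuildOptions listing quoted elsewhere on this page, where a non-empty CustomWkHtmlArgs replaces the other global options (margins, title, and so on), so those would have to be appended to the string by hand.

```csharp
using System.Globalization;

// Hypothetical helper, not part of Shark.PdfConvert.
public static class PageSizeArgs
{
    // Builds wkhtmltopdf global arguments for a custom page size in millimetres.
    // InvariantCulture keeps the decimal separator a dot regardless of the
    // machine's regional settings.
    public static string Build(double widthMm, double heightMm)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "--page-width {0}mm --page-height {1}mm ",
            widthMm,
            heightMm);
    }
}
```

For example, Build(100, 150.5) produces "--page-width 100mm --page-height 150.5mm ".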

Handle multiple pages

Library has worked well for me so far, is there any plan to support multiple urls for a combination into a single PDF?

I've been using the following method which requires some specific ordering of parameters to get the required outcome for multiple web pages.

var config = new Shark.PdfConvert.PdfConversionSettings
{
    Title = "Combined PDF",
    ContentUrl = "thirdurlrequired" // Will render as the last pages in the PDF
};

var customArgs = new List<string>();
customArgs.Add("--javascript-delay 1000");
customArgs.Add("--print-media-type");
// Additional pages must come as the last args; the first URL renders as the first document, and so on.
customArgs.Add(string.Format("page \"{0}\" ", "firsturlrequired"));
customArgs.Add(string.Format("page \"{0}\" ", "secondurlrequired"));
config.CustomWkHtmlArgs = string.Join(" ", customArgs);

Header and footer not showing with cover

When adding a cover page, the header and footer are not rendered. This seems to be a problem with wkhtmltopdf itself (wkhtmltopdf/wkhtmltopdf#1676), but I got it to work with your code by putting the header and footer part before the cover, so that they are global.

Running 0.12.4 of wkhtmltopdf.

I'm posting this as an issue since I can't do a pull-request at this time.

private static string BuildOptions(PdfConversionSettings config,
            string temporaryContentFilePath,
            string temporaryPdfFilePath,
            string temporaryCoverFilePath,
            string temporaryHeaderFilePath,
            string temporaryFooterFilePath)
        {
            StringBuilder options = new StringBuilder();

            // GLOBAL OPTIONS
            if (string.IsNullOrWhiteSpace(config.CustomWkHtmlArgs))
            {
                if (config.Grayscale) options.Append("--grayscale ");
                if (config.LowQuality) options.Append("--lowquality ");
                if (config.Margins.Bottom != null) options.AppendFormat("--margin-bottom {0} ", config.Margins.Bottom);
                if (config.Margins.Top != null) options.AppendFormat("--margin-top {0} ", config.Margins.Top);
                if (config.Margins.Left != null) options.AppendFormat("--margin-left {0} ", config.Margins.Left);
                if (config.Margins.Right != null) options.AppendFormat("--margin-right {0} ", config.Margins.Right);
                if (config.Size != PdfPageSize.Default) options.AppendFormat("--page-size {0} ", config.Size.ToString());
                if (config.Orientation != PdfPageOrientation.Default) options.AppendFormat("--orientation {0} ", config.Orientation.ToString());
                if (string.IsNullOrWhiteSpace(config.Title) == false) options.AppendFormat("--title \"{0}\" ", config.Title.Replace("\"", ""));
            }
            else
            {
                options.Append(config.CustomWkHtmlArgs);
                options.Append(" ");
            }

            // FOOTER
            if (string.IsNullOrWhiteSpace(temporaryFooterFilePath) == false ||
                string.IsNullOrWhiteSpace(config.PageFooterUrl) == false)
            {
                options.AppendFormat("--footer-html  \"{0}\" ",
                    string.IsNullOrWhiteSpace(config.PageFooterUrl) ? temporaryFooterFilePath : config.PageFooterUrl);

                if (string.IsNullOrWhiteSpace(config.CustomWkHtmlFooterArgs) == false)
                {
                    options.Append(config.CustomWkHtmlFooterArgs);
                    options.Append(" ");
                }
            }

            // HEADER
            if (string.IsNullOrWhiteSpace(temporaryHeaderFilePath) == false ||
                string.IsNullOrWhiteSpace(config.PageHeaderUrl) == false)
            {
                options.AppendFormat("--header-html  \"{0}\" ",
                    string.IsNullOrWhiteSpace(config.PageHeaderUrl) ? temporaryHeaderFilePath : config.PageHeaderUrl);

                if (string.IsNullOrWhiteSpace(config.CustomWkHtmlHeaderArgs) == false)
                {
                    options.Append(config.CustomWkHtmlHeaderArgs);
                    options.Append(" ");
                }
            }

            // COVER
            if (string.IsNullOrWhiteSpace(temporaryCoverFilePath) == false ||
                string.IsNullOrWhiteSpace(config.PageCoverUrl) == false)
            {
                options.AppendFormat("cover  \"{0}\" ",
                    string.IsNullOrWhiteSpace(config.PageCoverUrl) ? temporaryCoverFilePath : config.PageCoverUrl);

                if (string.IsNullOrWhiteSpace(config.CustomWkHtmlCoverArgs) == false)
                {
                    options.Append(config.CustomWkHtmlCoverArgs);
                    options.Append(" ");
                }
            }

            // TABLE OF CONTENTS
            if (config.GenerateToc)
            {
                options.Append("toc ");
                if (string.IsNullOrWhiteSpace(config.CustomWkHtmlTocArgs) == false)
                {
                    options.Append(config.CustomWkHtmlTocArgs);
                    options.Append(" ");
                }
            }

            // PAGE
            options.AppendFormat("page \"{1}\" \"{0}\" ",
                temporaryPdfFilePath,
                string.IsNullOrWhiteSpace(config.ContentUrl) ? temporaryContentFilePath : config.ContentUrl);

            // PAGE OPTIONS
            if (string.IsNullOrWhiteSpace(config.CustomWkHtmlPageArgs))
            {
                if (config.Zoom != null) options.AppendFormat("--zoom {0} ", config.Zoom);
            }
            else
            {
                options.Append(config.CustomWkHtmlPageArgs);
            }

            return options.ToString();
        }

Margin Issue

Hello cp79shark,

You've done a good job. It works well, but a few things could be improved; if you fix them, your product will rock. Here is what I found on my side:

  1. If I change the margin to 0, the header and footer are not displayed.
  2. background-image: url("xxx.png"); does not work for any control.

This is for your kind information.

Legal OSS compliance question

Since this framework uses WkHtmlToPdf, and WkHtmlToPdf is under LGPLv3, which is a heavy copyleft license, can this framework be licensed under MIT? I know it's not directly distributing WkHtmlToPdf as part of the distro package; however, to make use of the package a developer needs WkHtmlToPdf.
