wkhtmltox's Introduction

WkHtmlToX

C# wrapper for the wkhtmltopdf.org HTML to PDF and image library.

Usage

In a Web API project (in combination with SimpleInjector), registration should look as follows:

    // For simplicity, only the Windows configuration is shown here.
    var configuration = new WkHtmlToXConfiguration((int)Environment.OSVersion.Platform, null);
    _container.RegisterInstance(configuration);
    _container.RegisterSingleton<IWkHtmlToXEngine, WkHtmlToXEngine>();
    _container.RegisterSingleton<IPdfConverter, PdfConverter>();
    _container.RegisterInitializer<IWkHtmlToXEngine>(e => e.Initialize());

In a command-line application:

    // For simplicity, only the Windows configuration is shown here.
    var configuration = new WkHtmlToXConfiguration((int)Environment.OSVersion.Platform, null);
    using (var engine = new WkHtmlToXEngine(configuration))
    {
        engine.Initialize();

        var converter = new PdfConverter(engine);
    }

The method for conversion to PDF takes three parameters. The first is a settings object, which carries the HTML content to be converted. The second is a func that creates the output stream based on the length it is given. The third is a CancellationToken.

The second parameter may look tricky, but this construction lets you use, for example, Microsoft.IO.RecyclableMemoryStream to reuse blocks of memory. The whole conversion can then look like this:

    // doc settings object created earlier
    Stream? stream = null;
    var converted = await _pdfConverter.ConvertAsync(
        doc,
        length =>
        {
            stream = _recyclableMemoryStreamManager.GetStream(
                Guid.NewGuid(),
                "wkhtmltox",
                length);
            return stream;
        },
        _httpContextAccessor.HttpContext?.RequestAborted ?? CancellationToken.None);
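    // The stream is only assigned once the converter has asked for it,
    // so check it before rewinding.
    if (converted && stream != null)
    {
        // Rewind so the response is read from the start of the PDF.
        stream.Position = 0;
        var result = new FileStreamResult(stream, "application/pdf")
        {
            FileDownloadName = "sample.pdf",
        };

        return result;
    }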
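
The stream factory does not have to rely on RecyclableMemoryStream. The following is a minimal sketch, assuming the second parameter is a `Func<int, Stream>` (as the call above suggests). `StreamFactoryDemo` and `CreateFactory` are hypothetical names used only for illustration and are not part of the library. It shows a factory that allocates a plain `MemoryStream` of the requested size and hands the created stream back to the caller through a callback.

```csharp
using System;
using System.IO;

public static class StreamFactoryDemo
{
    // Hypothetical helper (not part of WkHtmlToX): builds a factory that
    // creates a MemoryStream with the requested capacity and reports the
    // created stream to the caller through the onCreated callback.
    public static Func<int, Stream> CreateFactory(Action<Stream> onCreated)
    {
        return length =>
        {
            // Pre-size the buffer so writing the output does not
            // have to grow the stream repeatedly.
            var stream = new MemoryStream(length);
            onCreated(stream);
            return stream;
        };
    }
}
```

The returned delegate can be passed where the lambda is passed in the example above. The callback plays the same role as the captured `stream` variable there: it gives the caller a reference to the stream after the conversion finishes.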

The WkHtmlToX native library comes with support for the following operating system flavours:

  • WinX64,
  • WinX86,
  • OsxX64,
  • AmazonLinux2,
  • Centos6,
  • Centos7,
  • Centos8,
  • Debian9X64,
  • Debian9X86,
  • Debian10X64,
  • Debian10X86,
  • OpenSuseLeap15,
  • Ubuntu1404X64,
  • Ubuntu1404X86,
  • Ubuntu1604X64,
  • Ubuntu1604X86,
  • Ubuntu1804X64,
  • Ubuntu1804X86,
  • Ubuntu2004X64.

On Linux, you can select the flavour using the second parameter of the configuration:

    var configuration = new WkHtmlToXConfiguration((int)PlatformID.Unix, WkHtmlToXRuntimeIdentifier.Ubuntu2004X64);

Influencers

The library is based on the DinkToPdf wrapper (https://github.com/rdvojmoc/DinkToPdf). The interoperability layer was completely reworked and is now under test to see whether memory leaks can be avoided.

wkhtmltox's Issues

Crashes trying to convert an image

I'm trying to use the ImageConverter to convert a website to an image, but the application crashes.
This might be an issue with libwkhtmltox itself, but I figured I'd raise the issue with you first.

Running on Windows 11, using .NET 6, WkHtmlToX v6.0.0 and WkHtmlToX.native.win.x64 v0.12.6.

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Repeat 2 times:
--------------------------------
   at AdaskoTheBeAsT.WkHtmlToX.Native.ImageNativeMethods.wkhtmltoimage_destroy_converter(IntPtr)
--------------------------------
   at AdaskoTheBeAsT.WkHtmlToX.Modules.WkHtmlToImageModule.DestroyConverter(IntPtr)
   at AdaskoTheBeAsT.WkHtmlToX.Engine.ImageProcessor.Convert(AdaskoTheBeAsT.WkHtmlToX.Abstractions.IHtmlToImageDocument, System.Func`2<Int32,System.IO.Stream>)
   at AdaskoTheBeAsT.WkHtmlToX.Engine.WkHtmlToXEngine.AdaskoTheBeAsT.WkHtmlToX.WorkItems.IWorkItemVisitor.Visit(AdaskoTheBeAsT.WkHtmlToX.WorkItems.ImageConvertWorkItem)
   at AdaskoTheBeAsT.WkHtmlToX.WorkItems.ImageConvertWorkItem.Accept(AdaskoTheBeAsT.WkHtmlToX.WorkItems.IWorkItemVisitor)
   at AdaskoTheBeAsT.WkHtmlToX.Engine.WkHtmlToXEngine.Process(System.Object)
   at System.Threading.Thread.StartCallback()

using AdaskoTheBeAsT.WkHtmlToX;
using AdaskoTheBeAsT.WkHtmlToX.Documents;
using AdaskoTheBeAsT.WkHtmlToX.Engine;
using Microsoft.IO;

RecyclableMemoryStreamManager streamManager = new();
var configuration = new WkHtmlToXConfiguration((int)Environment.OSVersion.Platform, null);

using var engine = new WkHtmlToXEngine(configuration);
engine.Initialize();

HtmlToImageDocument doc = new()
{
	ImageSettings =
	{
		In = "https://www.google.com/",
		Format = "jpg",
		Out = ""
	}
};

ImageConverter converter = new(engine);
Stream? stream;

Console.WriteLine("before convert");

var converted = await converter.ConvertAsync(
	doc,
	length =>
	{
		stream = streamManager.GetStream(
			Guid.NewGuid(),
			"wkhtmltox",
			length);

		return stream;
	},
	CancellationToken.None);

Console.WriteLine("converted: " + converted);

Generated PDFs are corrupted after the first request

Hi, first of all, thank you for putting effort into this project.
When I try the sample project WebApiCore, the PDF generated by the first request looks as expected, but in the following requests everything is rendered more or less as text only.
Here is the sample HTML you provide. Look at how the paragraphs are rendered inline.
[image]

With complex HTML it gets worse. Here is a sample:
[image]

Do you have any suggestions as to why this happens after the first request?
Thanks in advance.

Sample HTML I use:

complexpdf.zip

How to use the HTML to image converter?

I almost had everything set up, but I don't know where to set the HTML document in the HtmlToImageDocument.

Which property should I use to set the HTML document?

HtmlToImageDocument.ImageSettings. ??? = htmlDocumentTemplate;

HtmlToPdfDocument sample

I can't figure out how to add my own HTML text to HtmlToPdfDocumentGenerator, or how to add an interface to HtmlToPdfDocument.

Parallel Requests Getting Slower and Slower

Hi,
I did some load testing as I promised. When making a single request, the response time is roughly 500 ms.
But when making 20 parallel requests, the response time gets slower and slower.
I prepared a test project with your package and with the PDF conversion I currently use, which starts a new process for every request.
Because spawning multiple processes causes CPU peaks, I added request throttling (max 5 processes) by using https://www.tpeczek.com/2017/08/implementing-concurrent-requests-limit.html

Here is the demo project (I used Apache Benchmark for load testing):
https://github.com/revocengiz/PdfConvert

Maybe adding some kind of pool of multiple cached process threads would improve performance.

Thank you.
Result preview:
[image: loadtest]