Comments (3)

arklumpus commented on June 11, 2024

Hi! My understanding is that those are the input formats for images; in fact, the list on page 136 of MuPDF Explored corresponds to the MuPDFCore.InputFileTypes enum.

The output formats are described in chapter 14 of MuPDF Explored ("Rendered Output Formats"), page 87 and following. These are the same formats supported by MuPDFCore and mutool (one of the "official" programs using MuPDF created by Artifex).

If you need to produce TIFF images (or images in any other format), an option would be to use MuPDF to get the raw image pixel data (e.g. by using one of the overloads of the Render method that return a byte[] or take an IntPtr argument) and then use another library such as ImageSharp to create the TIFF file.

from mupdfcore.

Zanzacar commented on June 11, 2024

After posting last night, I was afraid that was the case.

What I am currently doing, which might not be optimal, is calling document.SaveImage(page, zoom, color, PNG format) and then processing the resulting file with Magick.NET.

It may be faster to process the image via a byte[] instead of writing it out to the hard drive and reading it back in.

arklumpus commented on June 11, 2024

I don't know about Magick.NET, but using ImageSharp I think the most efficient way to do it would be something like this:

using System;
using MuPDFCore;
using SixLabors.ImageSharp;

// ...

// Initialize MuPDF context.
using MuPDFContext ctx = new MuPDFContext();

// Open PDF document.
using MuPDFDocument doc = new MuPDFDocument(ctx, @"path/to/PDF/file.pdf");

// Page number.
int pageNumber = 0;

// Zoom level at which the page will be rendered.
double zoom = 1.0;

// Get the size of the rendered image (width and height).
RoundedRectangle pageSize = doc.Pages[pageNumber].Bounds.Round(zoom);

// Get the size in bytes of the rendered image (this should be width * height * 3).
int byteSize = doc.GetRenderedSize(pageNumber, zoom, PixelFormats.RGB);

// Allocate the required unmanaged memory.
IntPtr destination = System.Runtime.InteropServices.Marshal.AllocHGlobal(byteSize);

try
{
    // Render the image to raw pixels in RGB format, saving the results in the memory that has been allocated.
    doc.Render(pageNumber, zoom, PixelFormats.RGB, destination);

    // We need an unsafe context in order to create a ReadOnlySpan from an IntPtr.
    unsafe
    {
        // Create the ImageSharp image from the data in unmanaged memory.
        using Image image = Image.LoadPixelData<SixLabors.ImageSharp.PixelFormats.Rgb24>(new ReadOnlySpan<byte>((void*)destination, byteSize), pageSize.Width, pageSize.Height);

        // Save the image as TIFF.
        image.SaveAsTiff(@"path/to/output/file.tiff");
    }
}
finally
{
    // Release the unmanaged memory, even if rendering or saving throws an exception.
    System.Runtime.InteropServices.Marshal.FreeHGlobal(destination);
}

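You will need to compile this with the /unsafe compiler option (in an SDK-style project, by setting <AllowUnsafeBlocks>true</AllowUnsafeBlocks> in the project file).

I assume it will be similar with other graphics libraries; you just need to find a way to load the image data from an IntPtr in your library.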

I would recommend using the overloads of the Render method that save the image to an IntPtr: these are faster, because the data is not marshaled. If you instead use the overloads that return a byte[], the library will first save the image to unmanaged memory, and then copy the unmanaged array into a managed byte array. This means that you need twice as much RAM (though only briefly), and you need some time to copy the data as well (though, depending on your use case, this might not make a big difference).
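To make the extra copy described above concrete, here is a minimal, library-independent sketch of what copying from unmanaged memory into a managed byte[] involves. It is only an illustration under stated assumptions, not MuPDFCore's actual implementation: the names MarshalCopyDemo and CopyToManaged are hypothetical, and the loop that fills the buffer merely stands in for the renderer writing pixel data.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MarshalCopyDemo
{
    // Hypothetical helper: copies byteSize bytes from unmanaged memory
    // into a newly allocated managed array. While this runs, both the
    // unmanaged buffer and the managed array are alive at the same time,
    // which is where the temporary doubling of memory use comes from.
    public static byte[] CopyToManaged(IntPtr source, int byteSize)
    {
        byte[] managed = new byte[byteSize];
        Marshal.Copy(source, managed, 0, byteSize);
        return managed;
    }

    public static void Main()
    {
        // Assumed image size for illustration: 4 x 2 pixels, 3 bytes per pixel (RGB).
        int byteSize = 4 * 2 * 3;

        IntPtr unmanaged = Marshal.AllocHGlobal(byteSize);
        try
        {
            // Stand-in for the renderer writing raw pixel data.
            for (int i = 0; i < byteSize; i++)
            {
                Marshal.WriteByte(unmanaged, i, (byte)i);
            }

            // The extra step that the IntPtr-based approach avoids.
            byte[] pixels = CopyToManaged(unmanaged, byteSize);
            Console.WriteLine(pixels.Length);
        }
        finally
        {
            // Release the unmanaged memory even if the copy throws.
            Marshal.FreeHGlobal(unmanaged);
        }
    }
}
```

For a full page rendered at a high zoom level, byteSize can reach tens or hundreds of megabytes, so both the second allocation and the copy itself become noticeable; for small images the difference is likely negligible.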
