
Multiplatform .NET bindings for MuPDF

License: GNU Affero General Public License v3.0

MuPDFCore: Multiplatform .NET bindings for MuPDF

MuPDFCore is a set of multiplatform .NET bindings for MuPDF. It can render PDF, XPS, EPUB and other formats to raster images returned either as raw bytes, or as image files in multiple formats (including PNG, JPEG, and PSD). It also supports multithreading.

It also includes MuPDFCore.MuPDFRenderer, an Avalonia control to display documents compatible with MuPDFCore in Avalonia windows (with multithreaded rendering).

The library is released under the AGPLv3 licence.

Getting started

The MuPDFCore library targets .NET Standard 2.0, so it can be used in projects that target .NET Standard 2.0+, .NET Core 2.0+, .NET 5.0+, .NET Framework 4.6.1 (see the note about .NET Framework below) and possibly others. MuPDFCore includes a pre-compiled native library, which currently supports the following platforms:

- Windows x86 (32 bit): win-x86
- Windows x64 (64 bit): win-x64
- Windows arm64 (ARM 64 bit): win-arm64
- Linux x64 (64 bit)
  - glibc-based: linux-x64
  - musl-based: linux-musl-x64
- Linux arm64/aarch64 (ARM 64 bit)
  - glibc-based: linux-arm64
  - musl-based: linux-musl-arm64 (see the note below)
- macOS Intel x86_64 (64 bit): osx-x64
- macOS Apple silicon (ARM 64 bit): osx-arm64

To use the library in your project, you should install the MuPDFCore NuGet package and/or the MuPDFCore.MuPDFRenderer NuGet package. When you publish a program that uses MuPDFCore, the correct native library for the target architecture will automatically be copied to the build folder (but see the note for .NET Framework).

Note: you should make sure that end users on Windows install the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017, 2019 and 2022 for their platform, otherwise they will get an error message stating that MuPDFWrapper.dll could not be loaded because a module was not found.

Note for musl-based Linux arm64: I could not find a way to ensure that the linux-musl-arm64 native artifact overwrites the linux-arm64 (glibc) artifact. As a result, when you publish a project that uses MuPDFCore targeting linux-musl-arm64, you will find two native assets in the build directory (MuPDFWrapper.so, which is the musl artifact, and libMuPDFWrapper.so, which is the glibc artifact). Everything will work fine out of the box (because the name of the musl artifact has higher priority), but you may want to delete libMuPDFWrapper.so in order to reduce size. You can use e.g. a post-build target to do this.
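
If you prefer to do this clean-up from a script rather than from a post-build target, it could look roughly like the sketch below. The function name `prune_glibc_wrapper` and the publish folder passed to it are hypothetical; the only things taken from the note above are the two file names.

```shell
# Hypothetical helper: remove the redundant glibc artifact from a publish
# folder produced for linux-musl-arm64. Nothing is deleted unless the musl
# artifact (MuPDFWrapper.so) is also present, so a glibc-only publish folder
# is left untouched.
prune_glibc_wrapper() {
    publish_dir="$1"
    if [ -f "$publish_dir/MuPDFWrapper.so" ] && [ -f "$publish_dir/libMuPDFWrapper.so" ]; then
        rm -f "$publish_dir/libMuPDFWrapper.so"
    fi
}

# Example call (the path is an assumption, adjust it to your project):
# prune_glibc_wrapper bin/Release/net8.0/linux-musl-arm64/publish
```

Checking for both files first means that running the script against a glibc publish folder by mistake does not remove the only native library in it.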

Usage

Documentation

You can find detailed descriptions of how to use MuPDFCore and some code examples in the MuPDFCore wiki. Interactive documentation for the library API can be accessed from the documentation website. A PDF reference manual is also available.

Minimal working example

The following example shows the bare minimum code necessary to render a page of a PDF document to a PNG image using MuPDFCore:

//The MuPDFCore types used below live in the MuPDFCore namespace.
using MuPDFCore;

//Initialise the MuPDF context. This is needed to open or create documents.
using MuPDFContext ctx = new MuPDFContext();

//Open a PDF document
using MuPDFDocument document = new MuPDFDocument(ctx, "path/to/document.pdf");

//Page index (page 0 is the first page of the document)
int pageIndex = 0;

//Zoom level, converting document units into pixels. For a PDF document, a 1x zoom level corresponds to a
//72dpi resolution.
double zoomLevel = 1;

//Save the first page as a PNG image with transparency, at a 1x zoom level (1pt = 1px).
document.SaveImage(pageIndex, zoomLevel, PixelFormats.RGBA, "path/to/output.png", RasterOutputFileTypes.PNG);

Look at the wiki for more information.

Examples

The Demo folder in the repository contains some examples of how the library can be used to extract pages from a PDF or XPS document, render them to a raster image, or combine them in a new document.

The PDFViewerDemo folder contains a complete (though minimal) example of a PDF viewer program built around the MuPDFCore.MuPDFRenderer.PDFRenderer control.

Note that these examples intentionally avoid any error handling code: in a production setting, you should typically make sure that calls to MuPDFCore library functions are within a try...catch block to handle any resulting MuPDFExceptions.

Building from source

Building the MuPDFCore library from source requires the following steps:

1. Building the libmupdf native library
2. Building the MuPDFWrapper native library
3. Creating the MuPDFCore.NativeAssets.xxx-yyy native assets NuGet packages
4. Creating the MuPDFCore library NuGet package

Starting from MuPDFCore 1.8.0, the native assets are split into their own NuGet packages, on which the main MuPDFCore package depends. Aside from reducing the size of individual packages, this means that if you are making changes that do not affect the native assets, you can skip steps 1-3 and go straight to step 4.

Steps 1 and 2 need to be performed on all of Windows, macOS and Linux, and on the various possible architectures (x86, x64 and arm64 for Windows, x64/Intel and arm64/Apple for macOS, and x64 and arm64 for Linux, both glibc and musl - no cross-compiling)! Otherwise, some native assets will be missing and it will not be possible to build the NuGet packages in step 3.

1. Building libmupdf

You can download the open-source (GNU AGPL) MuPDF source code from here. You will need to uncompress the source file and compile the library on Windows, macOS and Linux. You need the following files:

From Windows (x86, x64, arm64):
- libmupdf.lib
From macOS (Intel - x64, Apple silicon - arm64):
- libmupdf.a
- libmupdf-third.a
From Linux (x64, arm64):
- libmupdf.a
- libmupdf-third.a

Note that the files from macOS and Linux are different, despite sharing the same name.

For convenience, these compiled files for MuPDF 1.24.0 are included in the native/MuPDFWrapper/lib folder of this repository.

Tips for compiling MuPDF 1.24.0

On all platforms:
- Delete or comment line 316 in source/fitz/output.c (the fz_throw invocation within the buffer_seek method - this should leave the buffer_seek method empty). This line throws an exception when a seek operation on a buffer is attempted. The problem is that this makes it impossible to render a document as a PSD image in memory, because the fz_write_pixmap_as_psd method performs a few seek operations. By removing this line, we turn buffer seeks into no-ops; this doesn't seem to have catastrophic side-effects and the PSD documents produced in this way appear to be fine.
On Windows (x64):
- Open the platform/win32/mupdf.sln solution in Visual Studio 2022. You should get a prompt to retarget your projects. Accept the default settings (latest Windows SDK and v143 of the tools).
- Select the ReleaseExtra configuration and x64 architecture. Select every project in the solution except javaviewer and javaviewerlib and right-click to open the project properties. Go to C/C++ > Code Generation and set the Runtime Library to Multi-threaded DLL (/MD).
- Open the properties for the libpkcs7 project, go to C/C++ > Preprocessor and remove HAVE_LIBCRYPTO from the Preprocessor Definitions. Then go to Librarian > General and remove libcrypto.lib from the Additional Dependencies. Now, go to Custom Build Step and clear the Command Line and the Output.
- Save everything (CTRL+SHIFT+S) and close Visual Studio.
- Download the win64-binary release of libarchive - I used version 3.7.2. Extract the zip file and copy the libarchive folder to the thirdparty folder in the MuPDF source tree.
  - Open the thirdparty\libarchive\include folder and create a new subfolder called libarchive. Move the archive.h and archive_entry.h from thirdparty\libarchive\include to thirdparty\libarchive\include\libarchive.
  - Open the thirdparty\libarchive\lib folder and create a new subfolder called x64. Move the archive.lib and archive_static.lib files from thirdparty\libarchive\lib to thirdparty\libarchive\lib\x64.
- Download the bzip2 library (I used version 1.0.8) and extract the source code.
  - Open the x64 Native Tools Command Prompt for VS, move to the bzip2 source code folder, and run the following commands:
```
cl -Zi -EHsc -c bzlib.c blocksort.c compress.c crctable.c decompress.c huffman.c randtable.c
lib bzlib.obj blocksort.obj compress.obj crctable.obj decompress.obj huffman.obj randtable.obj
```
  - This will create some files, including one called bzlib.lib. Copy this file into the thirdparty\libarchive\lib\x64 folder, renaming it to libbz2-static.lib.
- Download the XZ Utils (I used v5.6.1) and extract the source code.
  - Open the x64 Native Tools Command Prompt for VS, move to the windows subfolder of the XZ Utils source code folder, and run the following commands:
```
cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_NLS=OFF -DBUILD_SHARED_LIBS=OFF ..
msbuild xz.sln /p:Configuration=Release
```
  - Now go to the Release folder and copy liblzma.lib to the thirdparty\libarchive\lib\x64 in the MuPDF source tree.
- Download the Zstandard source code (I used v1.5.5) and extract it. Note that the precompiled version will not work because it was not compiled against the MSVCRT.
  - Open the zstd.sln file located in the build\VS2010 folder in Visual Studio. You should get a prompt to retarget your projects. Accept the default settings (latest Windows SDK and v143 of the tools).
  - Save everything (CTRL+SHIFT+S) and close Visual Studio.
  - Open the x64 Native Tools Command Prompt for VS, move to the folder with the solution file, and build it with msbuild zstd.sln /p:Configuration=Release.
  - Copy the libzstd_static.lib file from the x64_Release folder to the thirdparty\libarchive\lib\x64 folder in the MuPDF source tree.
- Now, open the x64 Native Tools Command Prompt for VS, move to the folder with the solution file, and build it using msbuild mupdf.sln
- Then, build again using msbuild mupdf.sln /p:Configuration=Release.
- Finally, build again using msbuild mupdf.sln /p:Configuration=ReleaseExtra.
- This may still show some errors, but it should produce the required libmupdf.lib file in the x64/ReleaseExtra folder (the file should be ~524MB in size).
On Windows (x86):
- You will have to use Visual Studio 2019, as Visual Studio 2022 is not supported on x86 platforms.
- Open the platform/win32/mupdf.sln solution in Visual Studio and select the ReleaseExtra configuration and Win32 architecture. Select every project in the solution except javaviewer and javaviewerlib and right-click to open the project properties. Go to C/C++ > Code Generation and set the Runtime Library to Multi-threaded DLL (/MD).
- Open the properties for the libpkcs7 project, go to C/C++ > Preprocessor and remove HAVE_LIBCRYPTO from the Preprocessor Definitions. Then go to Librarian > General and remove libcrypto.lib from the Additional Dependencies. Now, go to Custom Build Step and clear the Command Line and the Output.
- Save everything (CTRL+SHIFT+S) and close Visual Studio.
- Download the source code release of libarchive (I used version 3.7.2) and extract it.
  - Open the x86 Native Tools Command Prompt for VS, move to the source code folder, and run the following commands:
```
cmake .
msbuild libarchive/archive_static.vcxproj /p:Configuration=Release /p:Platform="Win32"
```
  - This will create a file called archive_static.lib in the libarchive/Release folder.
  - Now, go to the MuPDF source directory and open the thirdparty folder.
    - Create a new folder called libarchive; within this folder, create two subfolders: include and lib.
    - In the thirdparty\libarchive\include folder, create another subfolder, called libarchive. Copy archive.h and archive_entry.h from the libarchive folder in the libarchive source tree, to the thirdparty\libarchive\include\libarchive folder within the MuPDF source code.
    - Copy the archive_static.lib file from the libarchive/Release folder in the libarchive source to thirdparty\libarchive\lib.
- Download the bzip2 library (I used version 1.0.8) and extract the source code.
  - Open the x86 Native Tools Command Prompt for VS, move to the bzip2 source code folder, and run the following commands:
```
cl -Zi -EHsc -c bzlib.c blocksort.c compress.c crctable.c decompress.c huffman.c randtable.c
lib bzlib.obj blocksort.obj compress.obj crctable.obj decompress.obj huffman.obj randtable.obj
```
  - This will create some files, including one called bzlib.lib. Copy this file into the thirdparty\libarchive\lib folder, renaming it to libbz2-static.lib.
- Download the XZ Utils (I used v5.6.1) and extract the source code.
  - Open the x86 Native Tools Command Prompt for VS, move to the windows subfolder of the XZ Utils source code folder, and run the following commands:
```
cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_NLS=OFF -DBUILD_SHARED_LIBS=OFF ..
msbuild xz.sln /p:Configuration=Release /p:Platform=Win32
```
  - Now go to the Release folder and copy liblzma.lib to the thirdparty\libarchive\lib folder in the MuPDF source tree.
- Download the Zstandard source code (I used v1.5.5) and extract it. Note that the precompiled version will not work because it was not compiled against the MSVCRT.
  - Open the zstd.sln file located in the build\VS2010 folder in Visual Studio. You should get a prompt to retarget your projects. Accept the default settings (latest Windows SDK and v142 of the tools).
  - Save everything (CTRL+SHIFT+S) and close Visual Studio.
  - Open the x86 Native Tools Command Prompt for VS, move to the folder with the solution file, and build it with msbuild zstd.sln /p:Configuration=Release /p:Platform=Win32.
  - Copy the libzstd_static.lib file from the bin/Win32_Release folder to the thirdparty\libarchive\lib folder in the MuPDF source tree.
- Now, open the x86 Native Tools Command Prompt for VS, move to the folder with the solution file, and build it using msbuild mupdf.sln /p:Platform=Win32
- Then, build again using msbuild mupdf.sln /p:Configuration=Release /p:Platform=Win32.
- Finally, build again using msbuild mupdf.sln /p:Configuration=ReleaseExtra /p:Platform=Win32.
- This should produce the required libmupdf.lib file in the ReleaseExtra folder (the file should be ~475MB in size).
On Windows (arm64):

This is going to be a bit more complicated, because it appears that MuPDF is not meant to be built on ARM. These instructions will assume that you are building MuPDF on an ARM machine.

First of all, make sure that you have installed Visual Studio 2022 and have selected the C++ ARM64 build tools component of the "Desktop development with C++" workload.
- Download and extract the MuPDF source code and follow the instructions for all platforms above.
- Add || defined(_M_ARM64) at the end of line 16 in scripts/tesseract/endianness.h.
- Open the file thirdparty/openjpeg/src/lib/openjp2/ht_dec.c and add the following after line 57 (source):
```
unsigned int __popcnt(unsigned int x) {
    unsigned int c = 0;
    for (; x; ++c) {
        x &= x - 1;
    }
    return c;
}
```
- Now we need to edit a few files in the thirdparty/tesseract/src/arch folder.
  - Comment or delete lines 183-212 (inclusive) in simddetect.cpp. You should now have an empty block between # elif defined(_WIN32) and #else. Also comment or delete lines 235-262 (inclusive) and 286-319 (inclusive).
  - Comment or delete lines 18-26 (inclusive) in dotproductsse.cpp. Delete everything from line 31 to line 142 (inclusive) and replace with:
```
double DotProductSSE(const double* u, const double* v, int n) {
    return DotProductNative(u, v, n);
}
```
  - Comment or delete lines 24-25 (inclusive) in dotproductavx.cpp. Delete everything from line 30 to line 82 (inclusive) and replace with:
```
double DotProductAVX(const double* u, const double* v, int n) {
    return DotProductNative(u, v, n);
}
```
  - Comment or delete lines 24-25 (inclusive) in dotproductfma.cpp. Delete everything from line 30 to line 86 (inclusive) and replace with:
```
double DotProductFMA(const double* u, const double* v, int n) {
    return DotProductNative(u, v, n);
}
```
  - Delete the contents of thirdparty/tesseract/src/arch/intsimdmatrixavx2.cpp and thirdparty/tesseract/src/arch/intsimdmatrixsse.cpp (do not delete the files, just their contents).
  - Comment or delete lines 119-120 (inclusive) in intsimdmatrix.h
- Download the source code release of libarchive (I used version 3.7.2) and extract it.
  - Open the Developer Command Prompt for VS, move to the source code folder, and run the following commands:
```
cmake .
msbuild libarchive/archive_static.vcxproj /p:Configuration=Release /p:Platform="Win32"
```
  - This will create a file called archive_static.lib in the libarchive/Release folder.
  - Now, go to the MuPDF source directory and open the thirdparty folder.
    - Create a new folder called libarchive; within this folder, create two subfolders: include and lib.
    - In the thirdparty\libarchive\include folder, create another subfolder, called libarchive. Copy archive.h and archive_entry.h from the libarchive folder in the libarchive source tree, to the thirdparty\libarchive\include\libarchive folder within the MuPDF source code.
    - In the thirdparty\libarchive\lib folder, create a new subfolder called x64 and copy the archive_static.lib file from the libarchive/Release folder in the libarchive source to thirdparty\libarchive\lib\x64.
- Download the bzip2 library (I used version 1.0.8) and extract the source code.
  - Open the Developer Command Prompt for VS, move to the bzip2 source code folder, and run the following commands:
```
cl -Zi -EHsc -c bzlib.c blocksort.c compress.c crctable.c decompress.c huffman.c randtable.c
lib bzlib.obj blocksort.obj compress.obj crctable.obj decompress.obj huffman.obj randtable.obj
```
  - This will create some files, including one called bzlib.lib. Copy this file into the thirdparty\libarchive\lib\x64 folder, renaming it to libbz2-static.lib.
- Download the XZ Utils (I used v5.6.1) and extract the source code.
  - Open the Developer Command Prompt for VS, move to the windows subfolder of the XZ Utils source code folder, and run the following commands:
```
cmake -DENABLE_NLS=OFF -DBUILD_SHARED_LIBS=OFF ..
msbuild xz.sln /p:Configuration=Release
```
  - Now go to the Release folder and copy liblzma.lib to the thirdparty\libarchive\lib\x64 in the MuPDF source tree.
- Download the Zstandard source code (I used v1.5.5) and extract it. Note that the precompiled version will not work because it was not compiled against the MSVCRT.
  - Open the zstd.sln file located in the build\VS2010 folder in Visual Studio. You should get a prompt to retarget your projects. Accept the default settings (latest Windows SDK and v142 of the tools).
  - In Visual Studio, click on the "Configuration Manager" item from the "Build" menu. In the new window, click on the drop down menu for the "Active solution platform" and select <New...>. In this new dialog, select the ARM64 platform and choose to copy the settings from x64. Leave the Create new project platforms option enabled and click on OK (this may take some time).
  - Save everything (CTRL+SHIFT+S) and close Visual Studio.
  - Open the Developer Command Prompt for VS, move to the folder with the solution file, and build it with msbuild zstd.sln /p:Configuration=Release /p:Platform=ARM64.
  - Copy the libzstd_static.lib file from the bin/ARM64_Release folder to the thirdparty\libarchive\lib\x64 folder in the MuPDF source tree.
- Back in the MuPDF source code folder, open the platform/win32/mupdf.sln solution in Visual Studio. You should get a prompt to retarget your projects. Accept the default settings (latest Windows SDK and v143 of the tools).
- In Visual Studio, click on the "Configuration Manager" item from the "Build" menu. In the new window, click on the drop down menu for the "Active solution platform" and select <New...>. In this new dialog, select the ARM64 platform and choose to copy the settings from x64. Leave the Create new project platforms option enabled and click on OK (this may take some time).
- Close the Configuration Manager and select the ReleaseExtra configuration and ARM64 architecture. Select every project in the solution except javaviewer and javaviewerlib and right-click to open the project properties. Go to C/C++ > Code Generation and set the Runtime Library to Multi-threaded DLL (/MD).
- Open the properties for the libpkcs7 project, go to C/C++ > Preprocessor and remove HAVE_LIBCRYPTO from the Preprocessor Definitions. Then go to Librarian > General and remove libcrypto.lib from the Additional Dependencies. Now, go to Custom Build Step and clear the Command Line and the Output.
- Save everything (CTRL+SHIFT+S) and close Visual Studio.
- Create a new folder platform/win32/Release. Now, the problem is that the bin2coff script included with MuPDF cannot create obj files for ARM64 (only for x86 and x64). Since I could not find a version that can do this, I translated the source code of bin2coff to C# and added this option myself. You can download an ARM64 bin2coff.exe from here; place it in the Release folder that you have just created.
- Open the Developer Command Prompt for VS, move to the folder with the solution file (platform/win32), and build it using msbuild mupdf.sln /p:Configuration=ReleaseExtra. Some compilation errors may occur towards the end, but they should not matter.
- After a while, this should produce libmupdf.lib in the ARM64/ReleaseExtra folder (the file should be ~521MB in size).
On Linux (x64, for both glibc- and musl- based distros):
- Edit the Makefile, adding the -fPIC compiler option at the end of line 24 (which specifies the CFLAGS).
- Comment line 218 in include/mupdf/fitz/config.h (for some reason, this seems to disable OCR even when using USE_TESSERACT=yes to build).
- Make sure that you are using a recent enough version of GCC (version 7.3.1 seems to be enough).
- Compile by running USE_TESSERACT=yes make HAVE_X11=no HAVE_GLUT=no (this builds just the command-line libraries and tools, and enables OCR through the included Tesseract library).
On Linux (arm64, for both glibc- and musl- based distros):
- Edit the Makefile, adding the -fPIC compiler option at the end of line 24 (which specifies the CFLAGS).
- Make sure that you are using a recent enough version of GCC (version 7.3.1 seems to be enough).
- Compile by running USE_TESSERACT=yes make HAVE_X11=no HAVE_GLUT=no (this builds just the command-line libraries and tools, and enables OCR through the included Tesseract library).
On macOS (Intel - x64):
- Edit the Makefile, adding the -fPIC compiler option at the end of line 24 (which specifies the CFLAGS). Also add the -std=c++11 option at the end of line 58 (which specifies the CXX_CMD).
- Compile by running USE_TESSERACT=yes make (this enables OCR through the included Tesseract library).
On macOS (Apple silicon - arm64):
- Edit the Makefile, adding the -fPIC compiler option at the end of line 24 (which specifies the CFLAGS). Also add the -std=c++11 option at the end of line 58 (which specifies the CXX_CMD).
- Compile by running USE_TESSERACT=yes make (this enables OCR through the included Tesseract library).
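
The Makefile edits described for Linux and macOS can also be applied from a script instead of by hand. The following is a minimal sketch, not part of the MuPDF or MuPDFCore build scripts: `append_to_line` is a hypothetical helper, and the line numbers are the ones quoted above for MuPDF 1.24.0, so they will likely differ in other versions. Check the edited lines before building.

```shell
# Hypothetical helper: append TEXT (preceded by a space) to line LINE of FILE.
# A temporary file is used instead of "sed -i", whose syntax differs between
# GNU sed (Linux) and BSD sed (macOS).
append_to_line() {
    file="$1"
    line="$2"
    text="$3"
    awk -v n="$line" -v t="$text" 'NR == n { $0 = $0 " " t } { print }' "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# From the root of the MuPDF 1.24.0 source tree:
# append_to_line Makefile 24 "-fPIC"        # Linux and macOS
# append_to_line Makefile 58 "-std=c++11"   # macOS only
# USE_TESSERACT=yes make HAVE_X11=no HAVE_GLUT=no   # Linux build command from above
```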

2. Building MuPDFWrapper

Once you have the required static library files, you should download the MuPDFCore source code (just clone this repository) and place the library files in the appropriate subdirectories in the native/MuPDFWrapper/lib/ folder (for Linux x64, copy the library built against glibc to the linux-x64 folder, and the library built against musl to the linux-musl-x64 folder, and do the same for Linux arm64).

To compile MuPDFWrapper you will need CMake (version 3.8 or higher) and (on Windows) Ninja.

On Windows, the easiest way to get all the required tools is probably to install Visual Studio. By selecting the "Desktop development with C++" workload you should get everything you need.

On macOS, you will need to install at least the Command-Line Tools for Xcode (if necessary, you should be prompted to do this while you perform the following steps) and CMake.

Once you have everything at the ready, you will have to build MuPDFWrapper on the nine platforms.

Build instructions

Windows (x86 and x64)

Assuming you have installed Visual Studio, you should open the "x64 Native Tools Command Prompt for VS" or the "x86 Native Tools Command Prompt for VS" (you should be able to find these in the Start menu). Take care to open the version corresponding to the architecture you are building for, otherwise you will not be able to compile the library. A normal command prompt will not work, either.

Note 1: you must build the library on two separate systems, one running a 32-bit version of Windows and the other running a 64-bit version. If you try to build the x86 library on an x64 system, the system will probably build a 64-bit library and place it in the 32-bit output folder, which will just make things very confusing.

Note 2 for Windows x86: for some reason, Visual Studio might install the 64-bit version of CMake and Ninja, even though you are on a 32-bit machine. If this happens, you will have to manually install the 32-bit CMake and compile a 32-bit version of Ninja. You will notice if this is an issue because the 64-bit programs will refuse to run.
CD to the directory where you have downloaded the MuPDFCore source code.
CD into the native directory.
Type build. This will start the build.cmd batch script that will delete any previous build and compile the library.

After this finishes, you should find a file named MuPDFWrapper.dll in the native/out/build/win-x64/MuPDFWrapper/ directory or in the native/out/build/win-x86/MuPDFWrapper/ directory. Leave it there.

Windows (arm64)

Locate the batch file that sets up the developer command prompt environment. You can do this by finding the "Developer Command Prompt for VS" link in the start menu, then clicking on Open file location, opening the properties of the link and looking at the Target. This could be e.g. C:\Program Files\Microsoft Visual Studio\2022\Preview\Common7\Tools\VsDevCmd.bat.
Open a normal command prompt and invoke the batch script with the -arch=arm64 -host_arch=x86 arguments (add quotes if there are spaces in the path to the batch script), e.g.:
```
"C:\Program Files\Microsoft Visual Studio\2022\Preview\Common7\Tools\VsDevCmd.bat" -arch=arm64 -host_arch=x86
```
CD to the directory where you have downloaded the MuPDFCore source code.
CD into the native directory.
Type build. This will start the build.cmd batch script that will delete any previous build and compile the library.

After this finishes, you should find a file named MuPDFWrapper.dll in the native/out/build/win-arm64/MuPDFWrapper/ directory. Leave it there.

macOS and Linux

Assuming you have everything ready, open a terminal in the folder where you have downloaded the MuPDFCore source code.
cd into the native directory.
Type chmod +x build.sh.
Type ./build.sh. This will delete any previous build and compile the library.

After this finishes, you should find a file named libMuPDFWrapper.dylib in the native/out/build/mac-x64/MuPDFWrapper/ directory (on macOS running on an Intel x64 processor) or in the native/out/build/mac-arm64/MuPDFWrapper/ directory (on macOS running on an Apple silicon arm64 processor), and a file named libMuPDFWrapper.so in the native/out/build/linux-XXX/MuPDFWrapper/ directory (on Linux - where XXX can be x64, arm64, musl-x64, or musl-arm64). Leave it there.
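
To locate the output from a script, the mapping described above can be written down as a small function. This is only a sketch: `wrapper_output_dir` is a hypothetical helper, it takes the values printed by `uname -s` and `uname -m` as arguments, and it does not try to detect musl, which has to be passed explicitly.

```shell
# Hypothetical helper: print the build output folder (relative to the
# repository root) for a given kernel name, machine name and, on Linux,
# C library ("glibc" or "musl", defaulting to glibc).
wrapper_output_dir() {
    case "$2" in
        x86_64) arch=x64 ;;
        arm64 | aarch64) arch=arm64 ;;
        *) return 1 ;;
    esac
    case "$1" in
        Darwin) platform="mac-$arch" ;;
        Linux)
            if [ "${3:-glibc}" = "musl" ]; then
                platform="linux-musl-$arch"
            else
                platform="linux-$arch"
            fi ;;
        *) return 1 ;;
    esac
    echo "native/out/build/$platform/MuPDFWrapper"
}

# Example: ls "$(wrapper_output_dir "$(uname -s)" "$(uname -m)" glibc)"
```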

3. Creating the native assets MuPDFCore NuGet packages

Once you have the MuPDFWrapper.dll (3x), libMuPDFWrapper.dylib (2x) and libMuPDFWrapper.so (4x) files, make sure they are in the correct folders (native/out/build/xxx-yyy/MuPDFWrapper/), all on the same machine.
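
Before moving on, it can be useful to check that all nine files are where they are expected. The sketch below assumes the folder and file names given in step 2; `check_native_outputs` is a hypothetical helper and is not part of the repository.

```shell
# Hypothetical helper: report which of the nine native libraries are missing
# under REPO_ROOT/native/out/build. Prints one missing path per line and
# returns a non-zero status if anything is missing.
check_native_outputs() {
    build_dir="$1/native/out/build"
    missing=0
    for entry in \
        win-x86:MuPDFWrapper.dll \
        win-x64:MuPDFWrapper.dll \
        win-arm64:MuPDFWrapper.dll \
        mac-x64:libMuPDFWrapper.dylib \
        mac-arm64:libMuPDFWrapper.dylib \
        linux-x64:libMuPDFWrapper.so \
        linux-arm64:libMuPDFWrapper.so \
        linux-musl-x64:libMuPDFWrapper.so \
        linux-musl-arm64:libMuPDFWrapper.so
    do
        # Each entry is "<platform folder>:<library file name>".
        lib_path="$build_dir/${entry%%:*}/MuPDFWrapper/${entry#*:}"
        if [ ! -f "$lib_path" ]; then
            echo "$lib_path"
            missing=1
        fi
    done
    return "$missing"
}

# Example, from the root of the MuPDFCore source tree:
# check_native_outputs . && echo "All native libraries found"
```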

To create the native assets NuGet packages, you will need the .NET Core 2.0 SDK or higher for your platform. Once you have installed it and have everything ready, open a terminal in the folder where you have downloaded the MuPDFCore source code and type:

BuildNativeAssets

This will create the NuGet packages in the MuPDFCore.NativeAssets/NuGetPackages folder. Once the script finishes, this folder should contain 9 files. Make sure you add this folder as a local NuGet source.

4. Creating the MuPDFCore NuGet package

If you have made updates to the native assets, make sure to use the appropriate version numbers in MuPDFCore/MuPDFCore.csproj. Then, to create the main MuPDFCore NuGet package, open a terminal in the folder where you have downloaded the MuPDFCore source code and type:

cd MuPDFCore
dotnet pack -c Release

This will create a NuGet package in MuPDFCore/bin/Release. You can install this package on your projects by adding a local NuGet source.

5. Running tests

To verify that everything is working correctly, you should build the MuPDFCore test suite and run it on all platforms. To build the test suite, you will need the .NET 7 SDK or higher. You will also need to have enabled the Windows Subsystem for Linux.

To build the test suite:

Make sure that you have changed the version of the MuPDFCore NuGet package so that it is higher than the latest version of MuPDFCore in the NuGet repository (you should use a pre-release suffix, e.g. 1.4.0-a1, to avoid future headaches with new versions of MuPDFCore). This is set in line 9 of the MuPDFCore/MuPDFCore.csproj file.
Add the MuPDFCore/bin/Release folder to your local NuGet repositories (you can do this e.g. in Visual Studio).
If you have not done so already, create the MuPDFCore NuGet package following step 4 above.
Update line 56 of the Tests/Tests.csproj project file so that it refers to the version of the MuPDFCore package you have just created.

These steps ensure that you are testing the right version of MuPDFCore (i.e. your freshly built copy) and not something else that may have been cached.

Now, open a Windows command line in the folder where you have downloaded the MuPDFCore source code, type BuildTests and press Enter. This will create a number of files in the Release\MuPDFCoreTests folder, where each file is an archive containing the tests for a certain platform and architecture:

MuPDFCoreTests-linux-x64.tar.gz contains the tests for Linux environments using glibc on x64 processors.
MuPDFCoreTests-linux-arm64.tar.gz contains the tests for Linux environments using glibc on arm64 processors.
MuPDFCoreTests-linux-musl-x64.tar.gz contains the tests for Linux environments using musl on x64 processors.
MuPDFCoreTests-linux-musl-arm64.tar.gz contains the tests for Linux environments using musl on arm64 processors.
MuPDFCoreTests-mac-x64.tar.gz contains the tests for macOS environments on Intel processors.
MuPDFCoreTests-mac-arm64.tar.gz contains the tests for macOS environments on Apple silicon processors.
MuPDFCoreTests-win-x64.tar.gz contains the tests for Windows environments on x64 processors.
MuPDFCoreTests-win-x86.tar.gz contains the tests for Windows environments on x86 processors.
MuPDFCoreTests-win-arm64.tar.gz contains the tests for Windows environments on arm64 processors.

To run the tests, copy each archive to a machine running the corresponding operating system, and extract it (note: on Windows, the default zip file manager may struggle when extracting the text file with non-Latin characters; you may need to manually extract this file). Then:

Windows

Open a command prompt and CD into the folder where you have extracted the contents of the test archive.
Enter the command MuPDFCoreTestHost (this will run the test program).

macOS and Linux

Open a terminal and cd into the folder where you have extracted the contents of the test archive.
Enter the command chmod +x MuPDFCoreTestHost (this will add the executable flag to the test program).
Enter the command ./MuPDFCoreTestHost (this will run the test program).
On macOS, depending on your security settings, you may get a message saying zsh: killed when you try to run the program. To address this, you need to sign the executable, e.g. by running codesign --timestamp --sign <certificate> MuPDFCoreTestHost, where <certificate> is the name of a code signing certificate in your keychain (e.g. Developer ID Application: John Smith). After this, you can try again to run the test program with ./MuPDFCoreTestHost.
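
On macOS and Linux, extracting an archive and starting the test program can be combined in a short function such as the one sketched below. `run_test_archive` is a hypothetical helper, and it assumes that MuPDFCoreTestHost sits at the top level of the archive; adjust the path if your archive is laid out differently.

```shell
# Hypothetical helper: extract a test archive into WORK_DIR, mark the test
# host as executable and run it from inside that folder.
run_test_archive() {
    archive="$1"
    work_dir="$2"
    mkdir -p "$work_dir" &&
        tar -xzf "$archive" -C "$work_dir" &&
        chmod +x "$work_dir/MuPDFCoreTestHost" &&
        (cd "$work_dir" && ./MuPDFCoreTestHost)
}

# Example:
# run_test_archive MuPDFCoreTests-linux-x64.tar.gz ./MuPDFCoreTests-linux-x64
```

The test program is started in a subshell, so the current directory of the calling shell does not change.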

The test suite will start; it will print the name of each test, followed by a green Succeeded or a red Failed depending on the test result. If everything went correctly, all tests should succeed.

When all the tests have been run, the program will print a summary showing how many tests have succeeded (if any) and how many have failed (if any). If any tests have failed, a list of these will be printed, and then they will be run again one at a time, waiting for a key press before running each test (this makes it easier to follow what is going on). If you wish to kill the test process early, you can do so with CTRL+C.

Note about MuPDFCore and .NET Framework

If you wish to use MuPDFCore in a .NET Framework project, you will need to manually copy the native MuPDFWrapper library for the platform you are using to the executable directory (this is done automatically if you target .NET/.NET Core).

One way to obtain the appropriate library files is:

Manually download the appropriate native assets NuGet package from the table below. Note that AnyCPU builds on Windows need the win-x86 native asset.
Rename the .nupkg file so that it has a .zip extension.
Extract the zip file.
Within the extracted folder, the library files are in the runtimes/xxx/native/ folder, where xxx is linux-x64, linux-arm64, linux-musl-x64, linux-musl-arm64, osx-x64, osx-arm64, win-x64, win-x86 or win-arm64, depending on the platform you are using.
The file you need to copy should be called MuPDFWrapper.dll on Windows, libMuPDFWrapper.so or MuPDFWrapper.so on Linux, and libMuPDFWrapper.dylib on macOS.

Make sure you copy the appropriate file to the same folder as the executable!

OS	Platform	C library	NuGet package
Windows	x86		win-x86
Windows	x64		win-x64
Windows	arm64		win-arm64
Linux	x64	glibc	linux-x64
Linux	x64	musl	linux-musl-x64
Linux	arm64	glibc	linux-arm64
Linux	arm64	musl	linux-musl-arm64
macOS	x64 (Intel)		osx-x64
macOS	arm64 (Apple Silicon)		osx-arm64
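
From a POSIX shell, the manual extraction steps can be approximated as follows. This is a minimal sketch, assuming the unzip tool is installed and the native assets package has already been downloaded; native_lib_dir and copy_native_lib are hypothetical helper names, and the package file name in the usage comment is a placeholder.

```shell
# Hypothetical helpers (not part of MuPDFCore).

# Print the folder inside an extracted package that holds the native library
# for the given platform identifier (e.g. linux-x64, osx-arm64, win-x64).
native_lib_dir() {
    printf 'runtimes/%s/native\n' "$1"
}

# copy_native_lib <package.nupkg> <platform> <destination folder>
copy_native_lib() {
    nupkg="$1"; rid="$2"; dest="$3"
    work=$(mktemp -d) || return 1
    # A .nupkg file is a zip archive, so unzip can read it without renaming.
    unzip -q "$nupkg" -d "$work" || return 1
    # Copy whatever library files the package provides for this platform.
    cp "$work/$(native_lib_dir "$rid")"/* "$dest"/ || return 1
    rm -rf "$work"
}

# Example (placeholder file and folder names):
#   copy_native_lib ./native-assets.nupkg linux-x64 ./bin/Debug
```

Copying every file in the native folder avoids having to guess whether the Linux library is named libMuPDFWrapper.so or MuPDFWrapper.so.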

mupdfcore's Issues

MuPDFCore.dll in MuPDFCore NuGet package 1.7 has no strong name

Hi,

I just added MuPDFCore 1.7 NuGet package to my project, but MuPDFCore.dll couldn't be loaded because of no strong name:

MuPDFCore, Version=1.7.0.0, Culture=neutral, PublicKeyToken=null

Is this an issue, or is it by design?
Do I need to build from source to add a strong name?

Thanks!
Wicky Hu

Other output image formats

I was reviewing muPDF_explored and noticed that, on page 136, the original C API supports TIFF output, which is what I am mainly looking for in my current project. However, per line 184 of muPDF.cs, it appears we only have an enum for a select set of these values, plus some not supported in the original C API. Is there a way to add the TIFF format or other image formats to the .NET library?

What is the best way to get the text of a pdf?

Hi,
First, thanks for the effort put in this library, greatly appreciate it! 👍 :)
I more or less got it done but I am unsure I am doing it fully correctly or having the full picture.
At the moment I do not see a document.GetText as can be found in pymupdf lib, https://pymupdf.readthedocs.io/en/latest/app2.html#plain-text so what I did is:
iterate on all pages of a document to get the MuPdfStructuredTextPage(s).
From each, get the MuPDFStructuredTextBlocks
From each get the MuPdftext lines
and concatenate those in a big string if they are not null or empty/whitespace...
Is this correct? meaning the best way to do this... or is there a better way?

I assume this will provide a UTF-8 Unicode... right?
What happens with text that comes in other languages? will we get it with that encoding or should we do a conversion?

Also to do this properly, we should use the OCR functions over all images, to try to obtain the text from them... right?

Set anti-aliasing level

Hi!
Native MuPDF library has the ability to set how many bits of anti-aliasing to use (fz_set_aa_level). The default value is 8.
But MuPdfCore has no such parameter in the WriteImage method.
Please add it, because sometimes anti-aliasing messes up schemes and fonts.

Why does the rendered page not display the signature stamp

I opened a PDF with a signature, but found that your project does not display the stamp. Is there any way to display it?

Opened by PDFViewerDemo

Opened by SumatraPDF

is there a plan to support the outline

Always get MuPDFCore.MuPDFException:“Cannot open document” exception in PDFViewer demo

Windows 10 x64 with VS2019, using the original source: when I open a PDF or XPS file, it throws a MuPDFCore.MuPDFException:“Cannot open document” exception. I then upgraded the related AVA and MuPDF packages to the newest version, but the exception is still there.

Is it possible to create a text page using this library?

like SumatraPDF, the text part can be selected and copied. Any advice? :)

PDFViewerDemo cannot change pages

I managed to build the demo from the current master, but turning pages doesn't work.

Clicking ^ and v only changes the number displayed and nothing else. Pressing Enter also doesn't work.

How to improve or decrease png quality?

Hello

I am trying this method, but how can I increase or decrease the image quality?

MuPDFContext context = new MuPDFContext();
MuPDFDocument document = new MuPDFDocument(context, @"blankAI\test.pdf");

document.SaveImage(1, 1, PixelFormats.RGB, @"blankAI\0test.png", RasterOutputFileTypes.PNG);

I also tried an older demo in which I changed the width and height, but it gives memory leak errors:

for (int i = 0; i < document.Pages.Count; i++)
{
    //Initialise the renderer for the current page, using two threads (total number of threads: number of pages x 2)
    renderers[i] = document.GetMultiThreadedRenderer(i, 2);
    //Determine the boundaries of the page when it is rendered with a 1.5x zoom factor
    RoundedRectangle roundedBounds = document.Pages[i].Bounds.Round(1);
    renderedPageSizes[i] = new RoundedSize(roundedBounds.Width / 2, roundedBounds.Height / 2);
    //Determine the boundaries of each tile by splitting the total size of the page by the number of threads.
    tileBounds[i] = renderedPageSizes[i].Split(renderers[i].ThreadCount);
    destinations[i] = new IntPtr[renderers[i].ThreadCount];
    for (int j = 0; j < renderers[i].ThreadCount; j++)
    {
        //Allocate the required memory for the j-th tile of the i-th page.
        //Since we will be rendering with a 24-bit-per-pixel format, the required memory in bytes is height x width x 3.
        destinations[i][j] = System.Runtime.InteropServices.Marshal.AllocHGlobal(tileBounds[i][j].Height * tileBounds[i][j].Width * 3);
    }
}

How to extract font name, font binary data, font 2D coordinates, image binary data, image 2D coordinates within a PDF page?

as shown in the title.

deepl translator.

Can MuPDFWrapper.dll work on 32-bit Windows?

Hi,
I am trying to build MuPDFWrapper.dll for the x86 architecture.
The build succeeds, but when the CreateContext method is called in MuPDFCore, I get System.AccessViolationException:“Attempted to read or write protected memory. This is often an indication that other memory is corrupt.”
I think my x86 version of libmupdf.lib is invalid or has a version mismatch; it was compiled from https://github.com/ArtifexSoftware/mupdf , branch 1.18.x.

Method not found: 'Void MuPDFCore.MuPDFContext..ctor(Int64)'.

I get exceptions when trying to run the code

        //Save the full page as a PNG image.
        renderedPage.SaveAsPNG("page" + i.ToString() + ".png");

The code is taken from your readme, where you show how to convert PDF to png.

Actual exception

System.MissingMethodException: Method not found: 'Void MuPDFCore.MuPDFContext..ctor(Int64)'.
   at VectSharp.Raster.Raster.SaveAsPNG(Page page, String fileName, Double scale)
   at MuPDFWrapperCore.Program.Main(String[] args) in

Project .NET6 64bit.

show pdf files continuously

Please

I want to show pdf files continuously. Is it possible to do so now?

Looking forward to your reply, thanks

using MuPDFRenderer control on Avalonia 11

Hello,

I am a newcomer to Avalonia and I am encountering an issue while trying to use the MuPDFRenderer control in my project. The MuPDFRenderer control works fine on Avalonia 0.10.* versions, but I am facing difficulties in getting it to work on Avalonia 11.

It seems that there might be some compatibility issues or changes introduced in Avalonia 11 that are affecting the functionality of the MuPDFRenderer control. As a beginner, I am not familiar with the internal structure and specific features of Avalonia, so I am uncertain about the steps needed to resolve this problem.

I would greatly appreciate any guidance or suggestions you can provide in order to successfully use the MuPDFRenderer control on Avalonia 11. If there are any specific configuration settings or code modifications required, please let me know.

Thank you for your time and assistance!

Best regards,

Linux throws 'System.DllNotFoundException'

Hello, I am learning the knowledge of PDF.
I found that your project is great. I followed your instructions and successfully ran PDFViewerDemo on both Windows and Mac.
But under Linux, I got this error. I don’t know where the file "MuPDFWrapper" is. Is it the directory "\native\out\build\linux-x64\MuPDFWrapper"? I have this directory but still get this error. There is also a libMuPDFWrapper.so in my running directory.

blank area above PDFRender

Hello, I am using the 1.6.0 nuget package, but when I was using the up and down layout, I found an area above the PDFRender where I cannot set the background color or hide it. However, the PDF file can be dragged and dropped to that area. Is there any way for me to manipulate that area?

Do you support deleting pages

.dll problem when released

Hello. I released an app using MuPDF with Windows Forms .core 3.2. On my PC it worked fine, but on others it says "Unable to load DLL MuPDFWrapper". Are there any dependencies besides core 3.2? The DLL is right next to the executable. I saw this: "(but see the note for .NET Framework)". But that is about Framework, not Core :) I guess it hasn't changed then (it worked when I installed that pack).

Cannot open document

Install MuPDFCore in Blazor WebAssembly

Hello, I am trying to install the library to a Blazor WebAssembly project in .NET 6.0, and I am getting the error of a missing MuPDFWrapper dll.
Do you have any ideas?

edit: I need to mention that I even tried adding the library as a blazor native dependency (using NativeFileReference) but the error persisted.

Don't work with musl-based linux distros

Hi!

I am trying to use MuPDFCore in a container with Alpine Linux, but it doesn't work.
Here ldd result:

/app/runtimes/linux-x64/native # ldd libMuPDFWrapper.so
        /lib/ld-musl-x86_64.so.1 (0x7fafacb69000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x7fafa9cdf000)
        libm.so.6 => /lib/ld-musl-x86_64.so.1 (0x7fafacb69000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x7fafa9cc1000)
        libc.so.6 => /lib/ld-musl-x86_64.so.1 (0x7fafacb69000)
Error relocating libMuPDFWrapper.so: __isnan: symbol not found
Error relocating libMuPDFWrapper.so: __isinff: symbol not found
Error relocating libMuPDFWrapper.so: __finite: symbol not found
Error relocating libMuPDFWrapper.so: __isnanf: symbol not found
Error relocating libMuPDFWrapper.so: __isinf: symbol not found

Can you add linux-musl-x64 support?

MuPDFPageCollection.GetEnumerator bug

Hi, there is a bug in the MuPDFPageCollection.GetEnumerator implementation.

        public IEnumerator<MuPDFPage> GetEnumerator()
        {
            for (int i = 0; i < Pages.Length; i++)
            {
                if (Pages[i] == null)
                {
                    Pages[i] = new MuPDFPage(OwnerContext, OwnerDocument, i);
                }
            }

            return (IEnumerator<MuPDFPage>)Pages.GetEnumerator();
        }

The type of Pages.GetEnumerator() is SZArrayEnumerator, which cannot be cast to System.Collections.Generic.IEnumerator<MuPDFPage>.

JPX support disabled

When I try to export a PNG image from a PDF that contains an image in JPEG2000 format, I receive the "JPX support disabled" error.

That indicates that support for those files was explicitly disabled with the "FZ_ENABLE_JPX" header (since it's enabled by default).

I've built everything from sources with the default config.h headers, and PDFs with JPEG2000 files work correctly.

Did you disable it explicitly? If yes, what was the reason? And if it is an omission, is it possible for you to release a new version with FZ_ENABLE_JPX enabled?

Thanks in advance.

Accept and return ReadOnlySpan<byte> instead of IntPtr

Hello, first of all let me say that this is a great library and especially the multithreaded render.
Is there somewhere on the roadmap the ability to expose an API with Span<byte> instead of IntPtr ?

The reason is that in most cases, after rendering a file (e.g. a PDF), we have to either change the exported file format or perform some sort of image manipulation (with another library, e.g. ImageSharp). Therefore, it would be beneficial to avoid marshalling memory back and holding twice the amount.

The only way I know to expose a Span<byte> out of an IntPtr is by going into unsafe mode and casting IntPtr to a void* pointer but this would make my project require an /unsafe build, which I was really hoping to avoid.

Do you have any ideas?

Thanks,
--Theodore

do you support reading encrypted PDF files?

Hello, do you support reading encrypted PDF files?

Is there a way to support iOS and Mac Catalyst?
