evotecit / pswritepdf

PowerShell Module to create, edit, split, merge PDF files on Windows / Linux and MacOS

License: GNU Affero General Public License v3.0

PowerShell 92.79% HTML 7.21%
powershell pdf merge split edit create hacktoberfest

pswritepdf's Introduction

PSWritePDF

PSWritePDF is by no means a finished product. As with most of my modules, I build a concept that matches my view of how I would like it to look, and in the coming months I will probably update its functionality to match my expectations. Since PSWritePDF is based on iText 7, it should be possible, with some work, to bring all of that functionality into PowerShell. That means this module has excellent possibilities when it comes to potential use cases.

For now, I've divided the module functionality into two categories:

  • ☑ Standalone functions such as Split-PDF, Merge-PDF or Convert-PDFtoText
  • ☑ Bundled functions that work like PSWriteHTML: they are not meant to be used separately and, for now, mainly serve to create PDF files

To find out more, read the following blog posts:

3rd Party Notices

This PowerShell module uses iText 7 Community for .NET, therefore the license needs to be kept the same as iText's (or at least I think so). If that isn't the case, I would be more than happy to release my PowerShell code under the MIT license. I don't intend to modify the iText 7 codebase, just use its API. As I'm not an expert on licensing, I'm attaching some articles I found that may make these license terms clearer.

Recommended read:

Other software used:

All that additional software is required to work with iText and so it's part of this package.

Installing / Updating

Install-Module PSWritePDF -Force

pswritepdf's People

Contributors

chrismagnuson, markdem, mathisdukatz, przemyslawklys, user8446

pswritepdf's Issues

Register-PDFFont failed to add kaiu.ttf

I used the new version 0.0.18 to add kaiu.ttf, but it does not work. With the old version 0.0.17, the font was added and I could output a PDF.

Maybe something is wrong with the encoding. My Windows language is Traditional Chinese.

I want to use the New-PDFImage function from 0.0.18. Can I just add this function to 0.0.17?

Not an issue at all, but a question re: tables/checkboxes

Hello, just a question if you don't mind. (Seems like you've all done the most Itext/Powershell related work I can find scouring the net.)

Can PSWritePDF currently handle the insertion of checkboxes into a table to allow for centering within the table boxes? I've been looking through the currently posted code to see if you did, and if so, how you handled it.

I've been trying to mimic the approach demonstrated at https://kb.itextpdf.com/home/it7kb/examples/create-fields-in-a-table and can get it all working, with the exception of figuring out how to get the coordinates of the next cell to be drawn.

I figured it was worth a shot to reach out and see if you might have any insight or if I'm just missing where it was handled currently.

Thank you in advance
Scott

If Output File already exists powershell ISE is crashing and closing

Hi,

I recognized another failure. If the output file already exists, the whole ISE crashes.
Normally I add a Unix timestamp to file names to avoid this, but I noticed this behavior with the last example I uploaded in #13.
In addition, all other work was lost, because the usual recovery of unsaved work after the ISE is killed did not happen.

Kind regards.
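
The workaround mentioned in this report, appending a Unix timestamp so the output path never collides with an existing file, could be sketched roughly as follows. New-TimestampedPath is a hypothetical helper, not part of PSWritePDF, and it only avoids the collision; it does not address the crash itself.

```powershell
# Hypothetical helper (not part of PSWritePDF): returns a path that does not
# exist yet by appending a Unix timestamp to the base file name.
function New-TimestampedPath {
    param(
        [Parameter(Mandatory)][string] $Path
    )
    $Directory = [System.IO.Path]::GetDirectoryName($Path)
    $BaseName  = [System.IO.Path]::GetFileNameWithoutExtension($Path)
    $Extension = [System.IO.Path]::GetExtension($Path)
    $Stamp     = [DateTimeOffset]::UtcNow.ToUnixTimeSeconds()

    $Candidate = [System.IO.Path]::Combine($Directory, "$BaseName-$Stamp$Extension")
    $Counter = 1
    # If a file with the same timestamp already exists, add a counter suffix.
    while (Test-Path -LiteralPath $Candidate) {
        $Candidate = [System.IO.Path]::Combine($Directory, "$BaseName-$Stamp-$Counter$Extension")
        $Counter++
    }
    $Candidate
}
```

The returned string can then be passed wherever an output file path is expected, for example as the value of -OutputFile.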

Split-PDF is not possible for PDFs with password

It would be great if you could implement a new parameter "-password" that is used to open the PDF before splitting it.
When I try to split a password-protected PDF, I get this error:
WARNING: Split-PDF - Error has occured: Exception calling ".ctor" with "1" argument(s): "Bad user password. Password is not provided or wrong password provided. Correct password should be passed to PdfReader constructor with properties. See ReaderProperties#setPassword() method."

Merge-PDF does not merge multiple files with hyphens or dashes

Consider the following command:
Merge-PDF -InputFile "VIP-Build_4_16.pdf", "VIP-Build_4_15.pdf", "VIP-Build_4_14.pdf", "VIP-Build_4_13.pdf" -OutputFile 'test.pdf'

The merged file does not contain all pages. It only contains the first page, which comes from VIP-Build_4_16.pdf. All other files passed to the Merge-PDF command are ignored.

Next test:
Merge-PDF -InputFile "0I-fqaav6_MVP-Build_3_22.pdf", "test3.pdf", "test3-Copy.pdf" -OutputFile "test.pdf"

All files are merged.

[IDEA]Adding images

Could you try to make a function that adds an image to a PDF? I want to connect this module with PSGraph and add a graph to a PDF.

PDF Form Field Flattening

Hi...I'm attempting to craft a Powershell solution which merges a series of existing PDF Forms, which "flattens" the forms prior to adding them to the merged PDF. I want the original data displayed that was put into the form, but I don't care that the form functionality is lost. I was hoping this might be available in PSWritePDF. -- Thanks!

Split-PDF option to split every X pages

The current behavior (which works fine as a "default") is to split the PDF at every page break. I have a need of something that will split a PDF every x pages. Could look something like this:

Split-PDF -FilePath '.\PDFtoSplit.pdf' -OutputFolder $PathToOutput -pages 2

If no '-pages' parameter is specified, the command defaults to "1".
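
To make the requested behavior concrete, here is a small sketch of the page ranges such an option would produce. Get-PageChunk and its parameter names are hypothetical and not part of PSWritePDF; the sketch only does the arithmetic and does not touch any PDF.

```powershell
# Hypothetical helper (not part of PSWritePDF): computes the page ranges that
# a "split every N pages" option would produce for a document of a given length.
function Get-PageChunk {
    param(
        [Parameter(Mandatory)][ValidateRange(1, [int]::MaxValue)][int] $PageCount,
        [ValidateRange(1, [int]::MaxValue)][int] $PagesPerFile = 1
    )
    for ($First = 1; $First -le $PageCount; $First += $PagesPerFile) {
        # The last chunk may be shorter than $PagesPerFile.
        $Last = [Math]::Min($First + $PagesPerFile - 1, $PageCount)
        [PSCustomObject]@{ FirstPage = $First; LastPage = $Last }
    }
}

# A 5-page document split every 2 pages yields the ranges 1-2, 3-4 and 5-5.
Get-PageChunk -PageCount 5 -PagesPerFile 2
```

With the default of one page per file, the ranges collapse to single pages, which matches the current behavior described above.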

Feature request: Merge to a fixed paper size

I'm trying to merge two or more PDFs (A5 size) into a new PDF (A4 size).
I was wondering if it's possible to add a parameter that imposes a page size, for example:
Merge-PDF -InputFile $src1, $src2 -OutputFile $destination -OutputPageSize A4

Question: Table width

Hi,

I'm sorry, but I can't figure out how to keep tables within the page borders. For most tables it works, but if there are only a few characters, the table extends past the page width.

Any suggestions?

The code looks like this:

New-PDFPage -PageSize A4 -MarginLeft 20 -MarginRight 20 -MarginTop 20 -MarginBottom 20 {
    New-PDFText -Text 'TITEL' -FontBold $true
    New-PDFText -Text 'LONGTEXT'
    New-PDFText -Text 'TITEL' -FontBold $true
    New-PDFText -Text 'LONGTEXT'
    New-PDFText -Text 'TITEL' -FontBold $true
    New-PDFTable -DataTable $PropertyTable
}

version 0.0.18 and 0.0.19 cannot output pdf file

I think this is an OS language problem.
I have two test environments: one uses the en-us language, the other zh-tw.
In the en-us environment, PDF output works normally. The zh-tw environment shows an error.
[Image: screenshot of the error]

Transpose Tables

Waiting for a feature to transpose a data table, like:
DataTable $variable -Transpose

PSWriteWord has the same feature.

Convert-PDFToText - possible bug

Reported on LinkedIn; to be verified.

It seems that the function Convert-PDFToText is working somewhat incorrectly. I have to test further, but for the moment (in my environment) it works like this:

Assuming that a PDF has multiple pages with PageText1, PageText2, ..., PageTextN, after running the function I get a result where the text for each page also contains all the text from the previous pages, something like "PageText1PageText1PageText2PageText1PageText2PageText3" for a PDF of 3 pages.

It seems that (in my environment) I could fix it by explicitly creating a new text extraction strategy for every call of GetTextFromPage.

so, at line 1754,

[iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor]::GetTextFromPage($ExtractedPage, $iTextExtractionStrategy)

was changed to

[iText.Kernel.Pdf.Canvas.Parser.PdfTextExtractor]::GetTextFromPage($ExtractedPage, [iText.Kernel.Pdf.Canvas.Parser.Listener.LocationTextExtractionStrategy]::new())

after this fix extraction worked as expected.
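
The accumulation pattern described in this report can be reproduced without iText. The sketch below uses ToyStrategy, a made-up stand-in class that keeps appending to one internal buffer; it is not an iText type and only illustrates why reusing one stateful object across pages repeats earlier text, while a fresh object per page does not.

```powershell
# Toy stand-in for a stateful extraction strategy (NOT an iText class):
# every call appends to the same internal buffer and returns all of it.
class ToyStrategy {
    hidden [System.Text.StringBuilder] $Buffer = [System.Text.StringBuilder]::new()
    [string] Extract([string] $PageText) {
        [void] $this.Buffer.Append($PageText)
        return $this.Buffer.ToString()
    }
}

$Pages = 'PageText1', 'PageText2', 'PageText3'

# Reusing one instance: each result also contains the earlier pages.
$Shared = [ToyStrategy]::new()
$Reused = foreach ($Page in $Pages) { $Shared.Extract($Page) }

# Creating a new instance per page: each result contains only its own page.
$Fresh = foreach ($Page in $Pages) { [ToyStrategy]::new().Extract($Page) }

-join $Reused   # PageText1PageText1PageText2PageText1PageText2PageText3
-join $Fresh    # PageText1PageText2PageText3
```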

Module issue with UNC paths

Cmdlets like Split-PDF, Merge-PDF, and Convert-PDFToText don't work with UNC paths.

WARNING: Split-PDF - Error has occured: Exception calling ".ctor" with "1" argument(s): "Microsoft.PowerShell.Core\FileSystem::\FileServer01\Data\test.pdf not found as
file or resource."

The Resolve-Path cmdlet adds the Microsoft.PowerShell.Core\FileSystem:: to the UNC path which causes iText to puke, this doesn't happen with local file paths. I switched Resolve-Path to Convert-Path and it seems to be working fine on my local copy.
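
The difference between the two cmdlets can be inspected with a short sketch. It uses a temporary local file so that it runs anywhere, which means the provider prefix from the error above will not necessarily appear in its output; the file name is arbitrary.

```powershell
# Create a throwaway file so the example runs anywhere.
$File = Join-Path ([System.IO.Path]::GetTempPath()) 'pswritepdf-path-demo.txt'
Set-Content -LiteralPath $File -Value 'demo'

$Resolved  = Resolve-Path -LiteralPath $File   # PathInfo object
$Converted = Convert-Path -LiteralPath $File   # plain string

# .Path is the PowerShell path; for UNC locations it can carry the
# 'Microsoft.PowerShell.Core\FileSystem::' provider prefix seen in the error.
$Resolved.Path
# .ProviderPath and Convert-Path both give the native file system path,
# which is the form .NET libraries expect.
$Resolved.ProviderPath
$Converted

Remove-Item -LiteralPath $File
```

If Resolve-Path has to stay for other reasons, reading its ProviderPath property instead of Path is an alternative that should yield the same native path.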

about_method

Please describe the module's functions in the README and explain how they work. This could be helpful.

Add Logo Image As Header

Hi There!
When will this function be added:
Add Logo Image As Header in the PDF
There is currently no way to do it.
Thank you!
Yaro

Feature: International encoding support for New-PDF

Please, add encoding support into this Module!

By default it uses the 1252 encoding, but it is able to show other characters; in my case, Cyrillic.

Quick-and-dirty proof of concept:

--- PSWritePDF~.psm1	Fri Dec 25 16:23:26 2020
+++ PSWritePDF.psm1	Fri Dec 25 16:23:33 2020
@@ -545,7 +545,7 @@
             if ($null -ne $Font[$i]) {
                 if ($Font[$i]) {
                     $ConvertedFont = Get-PDFConstantFont -Font $Font[$i]
-                    $ApplyFont = [iText.Kernel.Font.PdfFontFactory]::CreateFont($ConvertedFont)
+                   $ApplyFont = [iText.Kernel.Font.PdfFontFactory]::CreateFont('C:\Windows\Fonts\verdana.ttf', [iText.IO.Font.PdfEncodings]::IDENTITY_H, $true)
                     $PDFText = $PDFText.SetFont($ApplyFont)
                 }
             } else {

After this change, Cyrillic (UTF-8) letters are displayed in the Verdana font whenever I use any non-default -Font with New-PDFText.

Rotate Text

Hi,

First of all thanks for this awesome piece of code, it has saved me tons of hours, just a quick question, how can I rotate text to print it vertically on a page?

Thank you

Register-PDFFont problem?

When I try to use the new font feature, I run into a problem when I repeat the PDF creation:

PS D:\> Register-PDFFont -FontName 'Verdana' -FontPath 'C:\Windows\fonts\verdana.ttf' -Encoding IDENTITY_H -Cached -Default
PS D:\> New-PDF -FilePath d:\2.pdf -PDFContent { New-PDFText -Text 'Hello ', 'Привет !' }
PS D:\> New-PDF -FilePath d:\2.pdf -PDFContent { New-PDFText -Text 'Hello ', 'Привет !' }
MethodInvocationException: D:\PowerShell\Modules\PSWritePDF\0.0.11\PSWritePDF.psm1:994
Line |
 994 |          $Script:Document.Close();
     |          ~~~~~~~~~~~~~~~~~~~~~~~~
     | Exception calling "Close" with "0" argument(s): "Pdf indirect object belongs to other PDF document. Copy object to current
     | pdf document."

Maybe I am using it wrong?

Merged PDF is losing its form capabilities

First of all, thank you for the great plugin!

I ran into a problem when merging multiple PDF files where one PDF file contains a form with scripts (to automatically calculate field values based on entered numbers).

The form functionality is gone when merging, at least the scripts to calculate the fields and apply functionality to the send/approve button.

Is there a way to merge multiple PDFs and keeping the full form functionality?

Ordered Output of Split Pages

Awesome module which I have used to sort through large PDF files at incredible speeds.

First time posting anything on GitHub, so I hope this is acceptable.

The only issue I have is that, when splitting documents with a large number of pages, the [CustomSplitter] class names each file after its page number without padding. This can make it hard to read through the split files in the correct order.

Suggest expanding the file name to include leading zeros. I have successfully been able to modify the [CustomSplitter] Class to do this with the below code:

class CustomSplitter : iText.Kernel.Utils.PdfSplitter {
    [int] $_order
    [string] $_destinationFolder
    [string] $_outputName

    CustomSplitter([iText.Kernel.Pdf.PdfDocument] $pdfDocument, [string] $destinationFolder, [string] $OutputName) : base($pdfDocument) {
        $this._destinationFolder = $destinationFolder
        $this._order = 1
        $this._outputName = $OutputName
    }

    [iText.Kernel.Pdf.PdfWriter] GetNextPdfWriter([iText.Kernel.Utils.PageRange] $documentPageRange) {
        $Name = -join ($this._outputName, $this._order.ToString("D4"), ".pdf")
        $Path = [IO.Path]::Combine($this._destinationFolder, $Name)
        $this._order++
        return [iText.Kernel.Pdf.PdfWriter]::new($Path)
    }
}

"$this._order = 1" as a start for page 1.
"$this._order.ToString("D4")" will handle files that are up to 9999 pages long, so shouldn't push the limits too often.
"$this._order++" to increment to the next page number.

Ideally, if I had time, I would expand this to look at the file prior to splitting, get the total number of pages, and adjust how many leading zeros are required, so that the naming convention adapts to the content.

Tested this to work with both 0.0.10 and 0.0.17.

Thanks again for the module.
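
The dynamic padding idea from this post could look roughly like the sketch below. Get-PaddedPageName is a hypothetical helper, not part of PSWritePDF or of the class above, and the total page count is passed in as a plain number rather than read from a PDF.

```powershell
# Hypothetical helper (not part of PSWritePDF): builds a zero-padded file name
# whose width follows the total number of pages in the document.
function Get-PaddedPageName {
    param(
        [Parameter(Mandatory)][string] $OutputName,
        [Parameter(Mandatory)][int] $PageNumber,
        [Parameter(Mandatory)][int] $TotalPages
    )
    # Number of digits in the highest page number, e.g. 3 for 250 pages.
    $Width = $TotalPages.ToString().Length
    -join ($OutputName, $PageNumber.ToString("D$Width"), '.pdf')
}

Get-PaddedPageName -OutputName 'Report' -PageNumber 7 -TotalPages 250    # Report007.pdf
Get-PaddedPageName -OutputName 'Report' -PageNumber 7 -TotalPages 12000  # Report00007.pdf
```

Inside the class, the total could presumably be taken from the PdfDocument passed to the constructor (iText exposes GetNumberOfPages()), but that wiring is an assumption and has not been tested here.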

Issue when deploying to Azure Function

Hello,

I am new to using this module. It works great in scripts on my local machine with PowerShell Core 7.2, but it does not work when deployed to an Azure Function with the same PowerShell Core 7.2. I am getting the following error:

"[Warning] WARNING: Get-PDF - Processing document C:\local\Temp\tmp72B6.pdf failed with error: Exception calling ".ctor" with "1" argument(s): "The type initializer for 'iText.Commons.Actions.EventManager' threw an exception."

I am not sure how to further debug this since it works great on my local machine, but not when deploying the same code to Azure function.

Can anyone shed some light on what is going on?

Incorrect display of hashtables

Hi,
I have been working with 'New-PDFTable' and noticed a very unusual behavior. While the examples worked perfectly, my own code did not work. If I manually inserted something into $DataTable2, it worked again. If the variable was adjusted, not anymore.

As far as I could see, the problem is in the function 'New-InternalPDFTable', in the line
'[Array] $TemporaryTable = foreach ($_ in $DataTable2) {'

the function must be changed to
'[Array] $TemporaryTable = foreach ($_ in $DataTable) {'

So far it seems to be working. Can you verify the error?

Merge-PDF with Bookmarks

Would you consider adding a switch to Merge-PDF that would allow you to keep the bookmarks? I believe the iText framework allows for it. I could not figure out how to keep them on my own using that framework, but your module works great for me.

Not working on Mac Powershell 7.0.3

Hi, I run this command on the latest Mac OS with the latest Powershell (just installed using brew)
Merge-PDF -InputFile .\1.pdf -OutputFile .\output.pdf
The command completed successfully but no output.pdf file is generated in local directory.

List inside DataTable

When a PSCustomObject has only single values in its properties, the DataTable works fine.
But when a PSCustomObject has both single values and an array in a property, the array is shown as a single variable. Could you please add a feature that lets me pass a list into the data table when creating it?

Merge-PDF not producing any output File (no error message)

Hi,

I am trying to use Merge-PDF, however it is not working for me. I do not get any error messages, however no output file is being generated.

PS C:\mydir> Test-Path -PathType Leaf .\0001_.pdf
True

PS C:\mydir> Test-Path -PathType Leaf .\0006_.pdf
True

PS C:\mydir> Merge-PDF -InputFile .\0001_.pdf, .\0006_.pdf -OutputFile .\out.pdf -Verbose

PS C:\mydir> Test-Path -PathType Leaf .\out.pdf
False

I tried PDFs from different sources because I thought maybe the issue is with my PDFs being produced from my own code, however the result (no result) is always the same.

> Get-InstalledModule PSWritePDF | Format-List


Name                       : PSWritePDF
Version                    : 0.0.20
Type                       : Module
Description                : Little project to create, read, modify, split, merge PDF files on Windows, Linux and Mac.
Author                     : Przemyslaw Klys
CompanyName                : Przemyslaw.Klys
Copyright                  : (c) 2011 - 2022 Przemyslaw Klys @ Evotec. All rights reserved.
PublishedDate              : 02.10.2022 17:24:49
InstalledDate              : 28.06.2024 14:04:38
UpdatedDate                : 
LicenseUri                 : https://github.com/EvotecIT/PSWritePDF/blob/master/LICENSE
ProjectUri                 : https://github.com/EvotecIT/PSWritePDF
IconUri                    : https://evotec.xyz/wp-content/uploads/2019/11/PSWritePDF.png
Tags                       : {PDF, macOS, linux, windows...}
Includes                   : {Function, RoleCapability, Command, DscResource...}
PowerShellGetFormatVersion : 
ReleaseNotes               : 
Dependencies               : {}
RepositorySourceLocation   : https://www.powershellgallery.com/api/v2
Repository                 : PSGallery
PackageManagementProvider  : NuGet
AdditionalMetadata         : @{summary=Little project to create, read, modify, split, merge PDF files on Windows, Linux and Mac.; SourceName=PSGallery; 
                             installeddate=28.06.2024 14:04:38; copyright=(c) 2011 - 2022 Przemyslaw Klys @ Evotec. All rights reserved.; tags=PDF macOS linux windows 
                             PSModule PSEdition_Desktop PSEdition_Core; Type=Module; InstalledLocation=C:\Program Files\WindowsPowerShell\Modules\PSWritePDF\0.0.20; 
                             description=Little project to create, read, modify, split, merge PDF files on Windows, Linux and Mac.; published=02.10.2022 17:24:49; 
                             PackageManagementProvider=NuGet}
InstalledLocation          : C:\Program Files\WindowsPowerShell\Modules\PSWritePDF\0.0.20
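
One thing that might be worth checking here, purely as an assumption since the report does not identify a cause: .NET APIs resolve relative paths against the process working directory, which is tracked separately from the current PowerShell location. The sketch below compares the two and turns a relative output path into an absolute one before it would be handed to a cmdlet. It does not call Merge-PDF.

```powershell
# PowerShell's current location and the process working directory used by
# .NET for relative paths are tracked separately and can differ after Set-Location.
$PowerShellLocation = (Get-Location).Path
$DotNetDirectory    = [System.Environment]::CurrentDirectory

# Resolve a relative path (which may not exist yet) against the PowerShell location.
$RelativeOutput = Join-Path '.' 'out.pdf'
$AbsoluteOutput = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($RelativeOutput)

[PSCustomObject]@{
    PowerShellLocation = $PowerShellLocation
    DotNetDirectory    = $DotNetDirectory
    AbsoluteOutput     = $AbsoluteOutput
}
```

If the two directories differ, a file written through a relative path may end up somewhere other than the current PowerShell location; passing the absolute path removes that ambiguity. Whether this explains the missing out.pdf in this report is not confirmed.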

Null array error when data table contains only one element

Since v.0.0.8, when using New-PDFTable, a null array error is thrown if the -DataTable array contains only one element:

Cannot index into a null array.
At C:\Program Files\WindowsPowerShell\Modules\PSWritePDF\0.0.8\PSWritePDF.psm1:343 char:9

  • if ($DataTable[0] -is [System.Collections.IDictionary]) {
    
  •     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    • CategoryInfo : InvalidOperation: (:) [], RuntimeException
    • FullyQualifiedErrorId : NullArray

Using "New-PDFTable -DataTable $Data", the input array was:
$Data = @(
[PSCustomObject] @{ Test = 'Name'; Test2 = 'Name2'; Test3 = 'Name3' }
)

If at least one more element is added to this input array, a PDF will be generated as expected. This error did not occur prior to version 0.0.8.
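
As general PowerShell background that may be relevant here (the report does not establish the root cause): a collection holding a single element is unrolled to a scalar when it passes through the pipeline, so code that expects an array can receive a bare object instead. Wrapping a value in @() is one common way to normalize it. A minimal sketch using the input from the report:

```powershell
$Data = @(
    [PSCustomObject] @{ Test = 'Name'; Test2 = 'Name2'; Test3 = 'Name3' }
)

# Passing a one-element array through the pipeline yields a scalar, not an array.
$Unrolled = $Data | ForEach-Object { $_ }
$Unrolled -is [array]      # False

# @() guarantees an array again, whether the input was one object or many.
$Normalized = @($Unrolled)
$Normalized -is [array]    # True
$Normalized.Count          # 1
$Normalized[0].Test        # Name
```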

Split-PDF

Hi, Excellent tool. I was wondering if it would be possible to:

  1. Split all PDFs in a folder
  2. Use the original pdf names in the output file appended with the page no, i.e., ABC-xxx.pdf (Where xxx is PDF page)

Thanks for the excellent tool!!

Formatting a table to fit text to sheet.

Hello,

I have a New-PDFTable where one of the columns has a large amount of text that makes the table expand past the margins and onto another page. Is there a way to format tables so that the text is smaller and the columns don't exceed the margins?

Thank you

Example

$NewEmployeeInquiriesTable = @(
    [pscustomobject]@{
        # Checkbox state read from the dashboard element
        Completed = (Get-UDElement -Id 'NewEmployeeInquiriesCheckbox1').checked
        # One long cell; `n inserts a line break before each bullet
        Task      = "Inquire with the new hire’s manager to determine:`n● Where the new hire will be seated.`n● What software packages the new hire will require.`n● If the new hire is an engineer, inquire if they should be given a general-purpose laptop or an engineering laptop for design work requiring enhanced graphics processing. `n● If ERP account(s) are required, create linked request(s) for each ERP access for the ERP technician to configure for the particular RBAC role and provide login information. [Article]`n● If offsite/VPN access will be required. If you do not have access to create a VPN account, create a linked request for an infrastructure technician."
    }
)

New-PDFTable -DataTable $NewEmployeeInquiriesTable

New-PDF without arguments crashes PowerShell 7.0.0-rc1

Hi.

In PSWritePDF 0.0.5 - just installed from Gallery - running just New-PDF without any further parameters instantly crashes/closes PowerShell 7.0.0-rc1 when running inside the new Windows Terminal (v. 0.7.3451.0). When running from the ol' conhost terminal, it just prints the following:

WARNING: New-InternalPDF - Terminating error: Exception calling ".ctor" with "1" argument(s): "Empty path name is not legal. (Parameter 'path')"

but does not crash.

question

from the website:
It means given a file, it will split it into X number of files, where X is a number of pages in PDF.
Where do we enter X?
