testarea-pdfbox2's People

Contributors

dependabot[bot], elvisbegovic, mkl-public

testarea-pdfbox2's Issues

One issue with BreakLongString.java example

Hi there,

I posted this question on Stack Overflow, but it was deleted by an administrator.
I'm following the BreakLongString.java example to show my content in a signature field.
The problem is that the example does not yet handle text that contains a very long word.

For example, this is my text

String text = "I am trying toTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTM create a PDF file with a lot of text contents in the document. I am using PDFBox";

The text is truncated in the generated PDF file.

Could you please take a look at this issue?
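
One way to handle an overlong word is to fall back to character-level splitting whenever a single word is wider than the line. The following is a minimal sketch of that idea, not the BreakLongString.java code itself. `LongWordWrapper`, `wrap`, and the `widthOf` callback are hypothetical names; the callback stands in for whatever width measurement the caller uses (with PDFBox it could be derived from the font's string width scaled by the font size). The sketch assumes that an overlong word may start on a fresh line and that splitting by `char` is acceptable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Greedy line breaker that also splits words wider than the line (illustrative sketch). */
final class LongWordWrapper {

    /**
     * @param text     text to wrap; words are separated by whitespace
     * @param maxWidth available line width, in the same units as widthOf returns
     * @param widthOf  hypothetical width measure supplied by the caller
     */
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> widthOf) {
        List<String> lines = new ArrayList<>();
        String line = "";
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for blank input
            }
            String candidate = line.isEmpty() ? word : line + " " + word;
            if (widthOf.applyAsDouble(candidate) <= maxWidth) {
                line = candidate; // the word still fits on the current line
                continue;
            }
            if (!line.isEmpty()) {
                lines.add(line); // flush the current line
            }
            // The word now starts a fresh line; chop it while it is too wide.
            while (!word.isEmpty() && widthOf.applyAsDouble(word) > maxWidth) {
                int cut = 1; // always take at least one character to make progress
                while (cut < word.length()
                        && widthOf.applyAsDouble(word.substring(0, cut + 1)) <= maxWidth) {
                    cut++;
                }
                lines.add(word.substring(0, cut));
                word = word.substring(cut);
            }
            line = word; // the remainder (possibly empty) continues on the next line
        }
        if (!line.isEmpty()) {
            lines.add(line);
        }
        return lines;
    }
}
```

With a width function that counts one unit per character and a maximum width of 10, the text `ab cdefghijklmnopqrstuvwxyz end` wraps into `ab`, `cdefghijkl`, `mnopqrstuv`, and `wxyz end`.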

How to add image to pdf

Hi,

I saw your test case for creating a button and adding an image appearance.

I'm looking to use an existing button (as a placeholder) on the PDF to draw an image. Is this the correct approach?

If so, does this code look correct? (It is based on the test you defined.)

        try (InputStream resource = getClass().getClassLoader().getResourceAsStream("2x2colored.png")) {
            BufferedImage bufferedImage = ImageIO.read(resource);
            PDImageXObject pdImageXObject = LosslessFactory.createFromImage(acrobatDocument, bufferedImage);
            float width = 10 * pdImageXObject.getWidth();
            float height = 10 * pdImageXObject.getHeight();

            PDAppearanceStream pdAppearanceStream = new PDAppearanceStream(acrobatDocument);
            pdAppearanceStream.setResources(new PDResources());
            try (PDPageContentStream pdPageContentStream = new PDPageContentStream(acrobatDocument, pdAppearanceStream)) {
                pdPageContentStream.drawImage(pdImageXObject, 0, 0, width, height);
            }
            pdAppearanceStream.setBBox(new PDRectangle(width, height));

            //get button you want to replace
            PDButton button  = (PDPushButton)acrobatAcroForm.getField("PushButton");

            List<PDAnnotationWidget> widgets = button.getWidgets();
            for (PDAnnotationWidget pdAnnotationWidget : widgets) {

                PDAppearanceDictionary pdAppearanceDictionary = pdAnnotationWidget.getAppearance();
                if (pdAppearanceDictionary == null) {
                    pdAppearanceDictionary = new PDAppearanceDictionary();
                    pdAnnotationWidget.setAppearance(pdAppearanceDictionary);
                }

                //add appearance to button
                pdAppearanceDictionary.setNormalAppearance(pdAppearanceStream);
            }
            button.setReadOnly(true);
            acrobatAcroForm.getFields().add(button);

            acrobatDocument.save(new File("build", "imageWithButton.pdf"));
        }

Thanks,
Shane.

How to set multi-language fonts on a text form field?

I see that mkl has answered many PDF-related questions and is clearly a PDF expert. We have run into some issues when using PDFBox.
1. How can we set multi-language (dynamic) fonts on a form field? I saw that your implementation works well for "showText": https://github.com/mkl-public/testarea-pdfbox2/blob/master/src/test/java/mkl/testarea/pdfbox2/content/AddTextWithDynamicFonts.java
2. The Javadoc says:
```
public static PDType0Font load(org.apache.pdfbox.pdmodel.PDDocument doc,
                               java.io.InputStream input,
                               boolean embedSubset)
embedSubset – True if the font will be subset before embedding. Set this to false when creating a font for AcroForm.
```

How can we embed a subset of a font when creating a font for an AcroForm? CJK fonts can be very large.

How to Attach an Image to a PDF Form Field Without Invalidating an Existing Signature?

Use case: I'm working on a feature that requires attaching an image to a PDF without invalidating a pre-existing digital signature. The PDF should have a form field designated for image attachment, which can be populated later. I want to implement this using PDFBox.

Implementation: Since PDFs lack a dedicated image form field, I’m utilizing a PDPushButton as a workaround, following the method outlined in the following issue;

Additionally, the PDF includes a signature form field. The process involves signing the signature field first and subsequently attaching the image to the PDPushButton field. However, this sequence is causing the signature to become invalid.

Here is the code for attaching an image to the PDPushButton.

@SneakyThrows
public static void fillInitialField(String inputFilePath, String outputFilePath) {
    // Load input file
    PDDocument document = PDDocument.load(new File(inputFilePath));

    // Find and link the relevant signature field
    PDPushButton initial = PdfService.findInitial(document, "132323423180965");

    PDImageXObject pdImageXObject = PDImageXObject.createFromFile("initial.png", document);
    float width = 10 * pdImageXObject.getWidth();
    float height = 10 * pdImageXObject.getHeight();

    PDAppearanceStream pdAppearanceStream = new PDAppearanceStream(document);
    pdAppearanceStream.setResources(new PDResources());
    try (PDPageContentStream pdPageContentStream = new PDPageContentStream(document, pdAppearanceStream)) {
        pdPageContentStream.drawImage(pdImageXObject, 200, 300, width, height);
    }
    pdAppearanceStream.setBBox(new PDRectangle(width, height));

    List<PDAnnotationWidget> widgets = initial.getWidgets();
    for (PDAnnotationWidget pdAnnotationWidget : widgets) {

        PDAppearanceDictionary pdAppearanceDictionary = pdAnnotationWidget.getAppearance();
        if (pdAppearanceDictionary == null) {
            pdAppearanceDictionary = new PDAppearanceDictionary();
            pdAnnotationWidget.setAppearance(pdAppearanceDictionary);
        }

        pdAppearanceDictionary.setNormalAppearance(pdAppearanceStream);
    }
    initial.setReadOnly(true);

    // Save and close the document
    FileOutputStream fos = new FileOutputStream(outputFilePath);
    document.save(fos);
    document.close();
}

I’ve created a repository that replicates this issue, which can be found here: https://github.com/ContractSPAN/ImageFormFieldIssue

How can I implement an image attachment to a PDF form field in such a way that it doesn’t invalidate an existing signature? I am open to alternative approaches to achieve this functionality.

cosstream has been closed and cannot be read. perhaps its enclosing pddocument has been closed

Hi!

Thanks for the tool.
I have an issue when calling the merge method: sometimes (not always, but quite frequently) it throws an error saying "cosstream has been closed and cannot be read. perhaps its enclosing pddocument has been closed" at line 50 of DenseMerge, and the same happens with VeryDenseTool.

My usage is this:

PdfVeryDenseMergeTool pdfDenseMergeTool = new PdfVeryDenseMergeTool(PDRectangle.A4, dim1, dim2, dim3);

ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
        
pdfDenseMergeTool.merge(byteArrayOutputStream, listOfDocuments);

Any idea why this happens only SOMETIMES? It feels like a synchronization problem between the streams, but I don't know.

The problem is that it is not deterministic: with exactly the same input, it may or may not throw the error.

OptimizeAfterMerge breaks PDF file on native Firefox pdf reader

Hi there,

Considering issue title :

Reproduction :

  1. Starting from this PDF: form-empty.pdf

  2. I create a merged PDF, apply the optimize() method to it, and the result is form-filled.pdf.
    When you open it in Chrome you can see 22 pages, but when you open it in Firefox you can only see 4 pages.
    When you open this PDF file in Adobe Reader you can see 22 pages, but if you scroll down you get error 14 after page 3:

image

  3. If I create the same merged PDF without the optimize() method, Firefox reads it correctly with 22 pages: form-filled-NO-optimized.pdf

Temporary workaround :

We cannot keep a file this large without compression, whereas with optimize() the PDF size is reduced by a factor of 5. So we keep optimizing and ask users to read the PDF in Adobe Reader OR to use Chrome/Edge.

Expected behavior :

It seems the optimization method is too aggressive. How can we enhance optimize() so that it does not break the Firefox reader, or how should we adapt our initial form-empty.pdf file to avoid this situation? It seems my initial PDF form-empty.pdf was not created correctly, maybe due to copy/paste of AcroForm fields... Can this be caught/fixed by the optimize method?

Additional info :

We have other PDFs similar to form-filled2.pdf that work with the optimize() method, but for this one I can't understand why it breaks Firefox's built-in PDF reader.

How to extract all the lines in PDF?

Not all lines are extracted

pdfbox 2.0.21
linux 18.04
java:jdk1.8.0

test file:src/test/resources/mkl/testarea/pdfbox2/extract/demo.pdf
test code:src/test/java/mkl/testarea/pdfbox2/extract/ExtractLinesWithDir.java

demo

  1. This method extracts some lines, but not all of them. In the figure, the green lines are the extracted lines, and the red "?" marks indicate the lines that were not extracted.
  2. How to extract all the lines in PDF?

how to modify form object stream?

https://github.com/mkl-public/testarea-pdfbox2/blob/master/src/test/java/mkl/testarea/pdfbox2/content/RemoveText.java#L49
`
public void processForm(PDFormXObject form) throws IOException {

    final PDStream formContentStream = form.getContentStream();
    final PDResources resources = form.getResources();
    formReplacementStream = formContentStream.createOutputStream(COSName.FLATE_DECODE);
    formReplacement = new ContentStreamWriter(formReplacementStream);
    super.showForm(form);

}

`
How can I update the PDFormXObject's stream?

Performance improvement for OptimizeAfterMerge

I tested the code with a PDF of 50k pages.
The total time it took was 3:10 min.

If you change the following code on line 292 of the equals() method:

if (keys.equals(bDict.keySet())) {

with this:

if (keys.size() == bDict.keySet().size()) {

then it takes only 1:08 min. The logic of the code stays the same: once we know that the two sets have equal size, the if block compares each element of the first set with the corresponding element of the second one. If the second one does not have the key, its value is null, so the comparison does not happen and the method returns false on line 295.
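
The equivalence argued above can be illustrated with plain maps. This is a sketch under assumptions, not the OptimizeAfterMerge code: `DictionaryEquality` and both method names are hypothetical, `Map<String, Object>` stands in for the dictionary type, and stored values are assumed to be non-null. Under that assumption, equal sizes plus a successful lookup of every key of the first map in the second imply equal key sets, so the explicit key set comparison (which itself performs a lookup per key) can be skipped.

```java
import java.util.Map;

/** Two ways to compare dictionaries; plain maps stand in for the real dictionary type. */
final class DictionaryEquality {

    /** Variant A: compare the key sets first, then the values. */
    static boolean equalsByKeySet(Map<String, Object> a, Map<String, Object> b) {
        if (!a.keySet().equals(b.keySet())) {
            return false;
        }
        for (Map.Entry<String, Object> entry : a.entrySet()) {
            Object other = b.get(entry.getKey());
            if (other == null || !other.equals(entry.getValue())) {
                return false;
            }
        }
        return true;
    }

    /** Variant B: compare only the sizes; a key missing from b shows up as a null lookup. */
    static boolean equalsBySize(Map<String, Object> a, Map<String, Object> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (Map.Entry<String, Object> entry : a.entrySet()) {
            Object other = b.get(entry.getKey());
            if (other == null || !other.equals(entry.getValue())) {
                return false; // missing key or different value
            }
        }
        return true;
    }
}
```

Both variants return the same result for maps without null values. If null values were allowed, a missing key and a key mapped to null would have to be told apart, for example with `containsKey`.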

Error on execute

Capturar
Hello, I'm trying to run the tests and I always get the attached error. Can you help me?

Remove text behind image

Help needed:
I need to split a PDF into two: one with the images and one with the text. I don't want to remove text that is behind an image; it should be part of the image PDF. I want to extract only the top-layer text in the PDF. Can anyone help with this?

I have already extracted the images and the text into two PDFs by looping through the PDF operators. I am having trouble deciding when not to remove text that is behind an image.

https://stackoverflow.com/questions/52334071/pdfbox-remove-text-behind-image
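
One possible reading of "behind an image" is: the text is drawn earlier in the content stream than an image whose bounding box fully contains the text's bounding box. A minimal sketch of that test follows. `CoveredTextCheck`, `isBehind`, and the notion of a drawing-order index are assumptions made for illustration; the boxes are assumed to be already transformed into page coordinates, and transparency, clipping paths, and image masks are ignored.

```java
import java.awt.geom.Rectangle2D;

/** Decides whether a text box counts as hidden behind an image (illustrative sketch). */
final class CoveredTextCheck {

    /**
     * @param textBox    bounding box of the text, in page coordinates
     * @param textOrder  position of the text-showing operation in drawing order
     * @param imageBox   bounding box of the image, in page coordinates
     * @param imageOrder position of the image-drawing operation in drawing order
     * @return true if the image is drawn after the text and fully covers it
     */
    static boolean isBehind(Rectangle2D textBox, int textOrder,
                            Rectangle2D imageBox, int imageOrder) {
        return imageOrder > textOrder && imageBox.contains(textBox);
    }
}
```

Text for which this check is true for some image would stay with the image PDF; all other text would go to the text PDF. Whether partially covered text should count as hidden is a design decision this sketch leaves open.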
