typst / pdf-writer

A step-by-step PDF writer.

License: Apache License 2.0

Rust 100.00%

pdf-generation rust

pdf-writer's Introduction

pdf-writer

A step-by-step PDF writer.

[dependencies]
pdf-writer = "0.11"

The entry point into the API is the main Pdf, which constructs the document into one big internal buffer. The top-level writer has many methods to create specialized writers for specific PDF objects. These all follow the same general pattern: They borrow the main buffer mutably, expose a builder pattern for writing individual fields in a strongly typed fashion and finish up the object when dropped.

There are a few more top-level structs with internal buffers, like the builder for Content streams, but wherever possible buffers are borrowed from parent writers to minimize allocations.
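
The pattern described above (a child writer borrows the parent's buffer mutably, offers chained setters, and closes its object when dropped) can be sketched with the standard library alone. The names `Doc`, `DictWriter`, `dict`, and `pair` below are hypothetical and exist only for illustration; they are not pdf-writer's types or signatures.

```rust
// A std-only sketch of the "borrow the buffer, finish on drop" pattern.
// `Doc` and `DictWriter` are hypothetical names, not pdf-writer types.

/// Owns the single output buffer, like a top-level document writer.
struct Doc {
    buf: Vec<u8>,
}

/// Borrows the parent's buffer mutably while one dictionary is written.
struct DictWriter<'a> {
    buf: &'a mut Vec<u8>,
}

impl Doc {
    fn new() -> Self {
        Doc { buf: Vec::new() }
    }

    /// Starts an indirect object holding a dictionary and returns a child writer.
    fn dict(&mut self, id: u32) -> DictWriter<'_> {
        self.buf.extend_from_slice(format!("{} 0 obj\n<<", id).as_bytes());
        DictWriter { buf: &mut self.buf }
    }

    /// Consumes the document and returns the bytes written so far.
    fn finish(self) -> Vec<u8> {
        self.buf
    }
}

impl<'a> DictWriter<'a> {
    /// Writes one `/Key value` pair and returns `self` for chaining.
    fn pair(&mut self, key: &str, value: &str) -> &mut Self {
        self.buf
            .extend_from_slice(format!("\n  /{} {}", key, value).as_bytes());
        self
    }
}

impl Drop for DictWriter<'_> {
    /// Closes the dictionary and the object when the writer goes out of scope.
    fn drop(&mut self) {
        self.buf.extend_from_slice(b"\n>>\nendobj\n");
    }
}
```

A call such as `doc.dict(1).pair("Type", "/Catalog");` writes the whole object in one statement, because the temporary writer is dropped at the end of that statement. While a child writer is alive, the borrow checker prevents any other use of the parent, so two objects cannot be interleaved in the buffer.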

Minimal example

The following example creates a PDF with a single, empty A4 page.

use pdf_writer::{Pdf, Rect, Ref};

// Define some indirect reference ids we'll use.
let catalog_id = Ref::new(1);
let page_tree_id = Ref::new(2);
let page_id = Ref::new(3);

// Write a document catalog and a page tree with one A4 page that uses no resources.
let mut pdf = Pdf::new();
pdf.catalog(catalog_id).pages(page_tree_id);
pdf.pages(page_tree_id).kids([page_id]).count(1);
pdf.page(page_id)
    .parent(page_tree_id)
    .media_box(Rect::new(0.0, 0.0, 595.0, 842.0))
    .resources();

// Finish with cross-reference table and trailer and write to file.
std::fs::write("target/empty.pdf", pdf.finish())?;

For more examples, check out the examples folder in the repository.

Safety

This crate forbids unsafe code, but it depends on a few popular crates that use unsafe internally.

License

This crate is dual-licensed under the MIT and Apache 2.0 licenses.

pdf-writer's Issues

Bug: writer.finish() stuck when page resource is accessed in between

Hey team,

I was trying to insert images as shown in the examples. My images are generated on the fly.

So I set up my first page in A4 size as described in the docs:

  let mut page = writer.page(page_id);
  let a4 = Rect::new(0.0, 0.0, 595.0, 842.0);
  page.media_box(a4);
  page.parent(page_tree_id);
  page.contents(content_id);
  page.finish();

Then I tried to link the resources once my images were generated, like this:

 writer
        .page(page_id)
        .resources()
        .x_objects()
        .pair(image_name, image_id);

But this is done after the page.finish() call, somewhere later in the code where the image is created.

This caused writer.finish() to hang; execution gets stuck on that line. If I move the snippet above to just before page.finish(), as in the example, it works. I guess it gets stuck because the PDF tree is invalid.

So I read the docs and found another example from here

I copied and pasted the exact line with my page_id, after page.finish().
This also made writer.finish() freeze.

Maybe my syntax is wrong and I should use an insert operation or something like that. It feels like I am missing something here.
Can you please point out the issue?

Incorrect naming for `annotations::Appearance`

annotations::Appearance should be named annotations::AppearanceCharacteristics, as it is not an appearance dictionary as specified in PDF 1.7, Section 12.5.5. This faulty naming was introduced in #20.

API & documentation cleanup

In the wake of #26 the API would experience a breaking change. This is unavoidable if we want to support interactive forms (#25).

If we break the API, we may as well make the best of it and unify a few things. There are small differences in the API that I'd like to clean up to give the end user a cleaner and more consistent API, while also allowing the average contributor to write less code.

Things I had in mind:

Module structure
- it could be a bit deeper to avoid having very long files
Documentation
- small inconsistencies break up the flow when reading
- required or disallowed attributes are only mentioned here and there
- the spec version which introduced an attribute is listed for some but not for others
- some things are linked directly, while flags are often simply put into CODE_BLOCKS
API
- lots of code duplication for very simple attributes
  - declarative macros for defining the most common attribute definitions would be a great addition (simple writers or single argument functions)
- methods which are only available for some subtypes for a writer are hard to find without checking the docs, prefixes like in #23 could help find the right ones quicker

I would like to take this issue to discuss changes to the API before the next update, so we can minimize the amount of breaking updates we have by thinking ahead.

How to add an image to pdf ?

Hey there,

First of all, thank you for creating this crate. I have been using it with svg2pdf to generate images for a personal project and it's been great.

But because svg2pdf rasterizes, the file size seems to be too big. So I am planning to switch to PNG images via the image crate. Can you please share some insights on how to add a PNG to a PDF?

I saw there are image XObjects, but I can't figure out how to position the elements or how to use them properly. I have been reading the source code for quite some time.

let id = Ref::new(5);

let sample = image::from("/file.png");
let image = writer.image_xobject(id, sample);
image.finish();

By reading some of the dependent codebases I came to this, but I am still unsure how to place an image at a particular location on a page.

Would be great if you could add an example.

Thank you
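
On the positioning part of this question: in PDF, an image XObject is painted into the unit square of the current coordinate system, so placement and size come from the transformation matrix set before the image is invoked. The sketch below is std-only and computes that matrix for an image fitted into a box while keeping its aspect ratio. The function names `image_matrix` and `image_ops` are hypothetical. With pdf-writer, the six numbers would presumably be handed to the content stream's transform operation before the XObject is drawn; the image example in the repository is the place to confirm the exact calls.

```rust
/// Returns the six operands `[a, b, c, d, e, f]` of the `cm` operator that
/// map the unit square onto a box with its lower-left corner at `(x, y)`.
/// The image is scaled to fit inside `max_w` x `max_h` points while keeping
/// the aspect ratio given by its pixel dimensions.
fn image_matrix(px_w: u32, px_h: u32, x: f32, y: f32, max_w: f32, max_h: f32) -> [f32; 6] {
    let scale = (max_w / px_w as f32).min(max_h / px_h as f32);
    let w = px_w as f32 * scale;
    let h = px_h as f32 * scale;
    [w, 0.0, 0.0, h, x, y]
}

/// Formats the raw content-stream operators that paint the image registered
/// under `name` in the page's XObject resources: save the graphics state,
/// apply the matrix, draw, restore.
fn image_ops(name: &str, m: [f32; 6]) -> String {
    format!(
        "q\n{} {} {} {} {} {} cm\n/{} Do\nQ\n",
        m[0], m[1], m[2], m[3], m[4], m[5], name
    )
}
```

For a 200 x 100 pixel image fitted into a 100 x 100 point box at (50, 700), `image_matrix` yields a 100 x 50 point placement. PDF coordinates start at the lower-left corner of the page, so `y` is measured upwards from the bottom edge.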

A few use cases: How to center text? Generate page breaks? Does pdf-writer support text wrap?

Hi, I'm poking around the different options for PDF generation in Rust. The use cases I am unsure how to handle with pdf-writer are:

- generating pages while looping through an iterator
- centering text
- text wrap

I couldn't figure out how to add new pages so far. I see that there is an enum for centered text, but I didn't see where to pass that in. Lastly, is text wrap supported?

Thanks for the crate, well done. So far I was happy with the generated file sizes.
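
The three use cases above come down to layout arithmetic that can be done before anything is written to the PDF. The following std-only sketch assumes the text is set in Courier, where every glyph advances 0.6 em, so widths can be computed without a font metrics table; a proportional font would need per-glyph widths instead. All function names here are hypothetical and none of them are part of pdf-writer.

```rust
/// Advance width of every Courier glyph, as a fraction of the font size.
const COURIER_ADVANCE: f32 = 0.6;

/// Width of `text` in points when set in Courier at `font_size`.
fn text_width(text: &str, font_size: f32) -> f32 {
    text.chars().count() as f32 * COURIER_ADVANCE * font_size
}

/// X coordinate at which `text` must start to appear horizontally centered.
fn centered_x(text: &str, font_size: f32, page_width: f32) -> f32 {
    (page_width - text_width(text, font_size)) / 2.0
}

/// Greedy word wrap: packs as many words as fit into each line.
/// A single word wider than `max_width` is placed on its own line unbroken.
fn wrap(text: &str, font_size: f32, max_width: f32) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let candidate = if current.is_empty() {
            word.to_string()
        } else {
            format!("{} {}", current, word)
        };
        if !current.is_empty() && text_width(&candidate, font_size) > max_width {
            lines.push(std::mem::take(&mut current));
            current = word.to_string();
        } else {
            current = candidate;
        }
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

/// Splits wrapped lines into pages of at most `lines_per_page` lines each.
fn paginate(lines: &[String], lines_per_page: usize) -> Vec<&[String]> {
    lines.chunks(lines_per_page.max(1)).collect()
}
```

Each slice returned by `paginate` would then correspond to one page object with its own content stream, and the page tree's kids and count would be written once the number of pages is known.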

Question: What specification of PDF format is implemented?

Hi! Thank you for the crate!

I am starting to look into creating PDF files and wanted to know what is actually being written with this crate.

As with any format, there have been many versions of the PDF specification over the years.
What does this crate actually implement?

Annots attribute should contain indirect references

According to the PDF 1.7 standard (7.7.3.2, Table 30) the Annots attribute in a Page dictionary should contain indirect references; however, this library puts the annotation dictionaries directly into the page dictionary.

Most (all?) PDF readers seem to deal fine with the direct dictionaries. However, when looking into what would be needed to support interactive forms in PDFs (typst/typst#1765), I noticed that in this case annotation dictionaries need to be referenced from multiple places at once (the page on which they are placed and the form field which they represent).
This is only possible with indirect references.

Note that this change would require a backwards-incompatible API change.

Question: link to jump to a specific page

Hello,

I am trying to create a link to jump to a specific page within the PDF. Is it something doable using pdf-writer?

PDF/A-1b

I noticed that you mention PDF/A standard and that current functionality should allow to write documents compliant with that standard. I was just wondering: how would I embed things like ICC color profiles and add a correct OutputIntent?

internal documentation

I'd be really interested in documentation on the inner workings of this crate.

Can you provide an example of loading a font from a TTF file?

Measuring page content ahead of time

Hey!

I want to use this crate to create PDFs of some text files. The content of the files is rarely larger than 4-8 lines.
Is it somehow possible to automatically shrink the page size once text has been written? Maybe with some way to specify a margin?

Something similar to LaTeX's separate document class :)

Thanks!
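
One way to approach this is to lay out the text first and only then decide on the page rectangle, since the page size is just a value written into the page dictionary. The std-only sketch below computes a tight box from measured line widths, a line height, and a margin. The function name `fitted_media_box` is hypothetical, and the line widths are assumed to have been measured beforehand from the font's metrics.

```rust
/// Returns `[x1, y1, x2, y2]` for a page that tightly fits the given lines.
///
/// `line_widths` holds the measured width of each line in points,
/// `leading` is the distance between consecutive baselines, and
/// `margin` is added on all four sides.
fn fitted_media_box(line_widths: &[f32], leading: f32, margin: f32) -> [f32; 4] {
    // Widest line determines the page width; an empty slice gives 0.0.
    let widest = line_widths.iter().copied().fold(0.0_f32, f32::max);
    // Each line occupies one unit of leading vertically.
    let height = line_widths.len() as f32 * leading;
    [0.0, 0.0, widest + 2.0 * margin, height + 2.0 * margin]
}
```

The four numbers would then be used for the page's media box in place of the fixed A4 values from the minimal example.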

PDF with ICC based color space defined in the document's resource directory crashes CorelDRAW

I've implemented the ICC example, which works fine in all readers (Acrobat, Illustrator, Preview, CorelDRAW, ...).

As I'd like to reuse the profile across pages, I moved it to the resource dict of the document as opposed to the resource dict of the page. This generates a working file that works fine in all readers except CorelDRAW: the program seems to crash with a segfault when parsing the color space:

0   CrlPDFImport.dylib            	       0x365a893e8 CPDFColorSpaceResource::~CPDFColorSpaceResource() + 40
1   CrlPDFImport.dylib            	       0x365ad7e78 CPDFPageResources::CleanUp() + 260
2   CrlPDFImport.dylib            	       0x365ad7fac CPDFPageResources::~CPDFPageResources() + 32
3   CrlPDFImport.dylib            	       0x365abae54 CPDFPage::~CPDFPage() + 76
4   CrlPDFImport.dylib            	       0x365ada994 CPDFPagesTree::~CPDFPagesTree() + 92
5   CrlPDFImport.dylib            	       0x365a7d4d0 CPDFCatalog::~CPDFCatalog() + 76
6   CrlPDFImport.dylib            	       0x365a900a0 CPDFDocument::~CPDFDocument() + 304

Please see the following screen recording to illustrate the issue:

Screen.Recording.2024-06-27.at.15.38.57.mp4

Here's the adapted implementation which creates 2 PDF-files:

output-page-resources.pdf: works fine in all readers
output-document-resources.pdf: works fine in all readers except CorelDRAW

//! This example shows how to use ICC-based color spaces.
use pdf_writer::writers::ColorSpace;
use pdf_writer::{Content, Finish, Name, Pdf, Rect, Ref};

fn create_pdf_with_page_resources() -> std::io::Result<()> {
    let mut pdf = Pdf::new();
    let catalog_id = Ref::new(1);
    let page_tree_id = Ref::new(2);
    let page_id = Ref::new(3);
    let content_id = Ref::new(4);
    let icc_id = Ref::new(5);

    // Setup catalog and pages
    pdf.catalog(catalog_id).pages(page_tree_id);

    // Set up the page tree. For more details see `hello.rs`.
    pdf.pages(page_tree_id).kids([page_id]).count(1);

    // Create an A4 page.
    let mut page = pdf.page(page_id);
    page.media_box(Rect::new(0.0, 0.0, 595.0, 842.0));
    page.parent(page_tree_id);
    page.contents(content_id);

    // Setup page resources
    let color_space_name = Name(b"sRGB");
    page.resources()
        .color_spaces()
        .insert(color_space_name)
        .start::<ColorSpace>()
        .icc_based(icc_id);
    page.finish();

    // Build the content: a single filled rectangle using the named color space.
    let mut content = Content::new();
    content.set_fill_color_space(color_space_name);
    content.set_fill_cmyk(0.0, 1.0, 0.0, 0.0);
    content.rect(108.0, 734.0, 100.0, 100.0);
    content.fill_even_odd();

    // Write the content stream.
    pdf.stream(content_id, &content.finish());

    // Read the ICC profile from a file.
    let icc_data = std::fs::read("sRGB_v4.icc")?;
    let mut icc_profile = pdf.icc_profile(icc_id, &icc_data);
    icc_profile.n(4);
    icc_profile.alternate().device_cmyk();
    icc_profile.finish();

    // Write the PDF
    std::fs::write("output-page-resources.pdf", pdf.finish())
}

fn create_pdf_with_document_resources() -> std::io::Result<()> {
    let mut pdf = Pdf::new();
    let catalog_id = Ref::new(1);
    let page_tree_id = Ref::new(2);
    let page_id = Ref::new(3);
    let content_id = Ref::new(4);
    let icc_id = Ref::new(5);

    // Setup catalog and pages
    pdf.catalog(catalog_id).pages(page_tree_id);

    // Set up the page tree. For more details see `hello.rs`.
    let mut pages = pdf.pages(page_tree_id);
    pages.kids([page_id]).count(1);

    // Setup document resources
    let color_space_name = Name(b"sRGB");
    let mut resources = pages.resources();
    let mut spaces = resources.color_spaces();
    spaces
        .insert(color_space_name)
        .start::<ColorSpace>()
        .icc_based(icc_id);
    spaces.finish();
    resources.finish();
    pages.finish();

    // Create an A4 page.
    let mut page = pdf.page(page_id);
    page.media_box(Rect::new(0.0, 0.0, 595.0, 842.0));
    page.parent(page_tree_id);
    page.contents(content_id);
    page.finish();

    // Build the content: a single filled rectangle using the named color space.
    let mut content = Content::new();
    content.set_fill_color_space(color_space_name);
    content.set_fill_cmyk(0.0, 1.0, 0.0, 0.0);
    content.rect(108.0, 734.0, 100.0, 100.0);
    content.fill_even_odd();

    // Write the content stream.
    pdf.stream(content_id, &content.finish());

    // Read the ICC profile from a file.
    let icc_data = std::fs::read("sRGB_v4.icc")?;
    let mut icc_profile = pdf.icc_profile(icc_id, &icc_data);
    icc_profile.n(4);
    icc_profile.alternate().device_cmyk();
    icc_profile.finish();

    // Write the PDF
    std::fs::write("output-document-resources.pdf", pdf.finish())
}

fn main() {
    create_pdf_with_page_resources().expect("Failed to create PDF with page resources");
    create_pdf_with_document_resources().expect("Failed to create PDF with document resources");
}