pdf-writer's Introduction

pdf-writer

A step-by-step PDF writer.

[dependencies]
pdf-writer = "0.11"

The entry point into the API is the main Pdf, which constructs the document into one big internal buffer. The top-level writer has many methods to create specialized writers for specific PDF objects. These all follow the same general pattern: they borrow the main buffer mutably, expose a builder pattern for writing individual fields in a strongly typed fashion, and finish up the object when dropped.

There are a few more top-level structs with internal buffers, like the builder for Content streams, but wherever possible buffers are borrowed from parent writers to minimize allocations.
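
The borrow-and-finish-on-drop pattern described above can be illustrated with a small standalone sketch. The `Doc` and `DictWriter` types below are hypothetical stand-ins written only for this illustration; they are not pdf-writer's real types and use a `String` rather than a byte buffer.

```rust
/// Hypothetical top-level writer that owns the single output buffer.
struct Doc {
    buf: String,
}

/// Hypothetical object writer: borrows the parent's buffer mutably
/// and closes the object when it goes out of scope.
struct DictWriter<'a> {
    buf: &'a mut String,
}

impl Doc {
    fn new() -> Self {
        Doc { buf: String::new() }
    }

    /// Start an indirect dictionary object and hand out a writer for it.
    fn dict(&mut self, id: u32) -> DictWriter<'_> {
        self.buf.push_str(&format!("{} 0 obj\n<<", id));
        DictWriter { buf: &mut self.buf }
    }

    /// Give up the finished buffer.
    fn finish(self) -> String {
        self.buf
    }
}

impl DictWriter<'_> {
    /// Builder-style field writer; returns `&mut Self` so calls can be chained.
    fn pair(&mut self, key: &str, value: &str) -> &mut Self {
        self.buf.push_str(&format!("\n  /{} {}", key, value));
        self
    }
}

impl Drop for DictWriter<'_> {
    /// Closing the dictionary happens automatically on drop.
    fn drop(&mut self) {
        self.buf.push_str("\n>>\nendobj\n");
    }
}
```

Because the writer holds a mutable borrow of the buffer, the compiler rejects starting a second object while the first is still open, which is presumably the main benefit of this design.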

Minimal example

The following example creates a PDF with a single, empty A4 page.

use pdf_writer::{Pdf, Rect, Ref};

// Define some indirect reference ids we'll use.
let catalog_id = Ref::new(1);
let page_tree_id = Ref::new(2);
let page_id = Ref::new(3);

// Write a document catalog and a page tree with one A4 page that uses no resources.
let mut pdf = Pdf::new();
pdf.catalog(catalog_id).pages(page_tree_id);
pdf.pages(page_tree_id).kids([page_id]).count(1);
pdf.page(page_id)
    .parent(page_tree_id)
    .media_box(Rect::new(0.0, 0.0, 595.0, 842.0))
    .resources();

// Finish with cross-reference table and trailer and write to file.
std::fs::write("target/empty.pdf", pdf.finish())?;

For more examples, check out the examples folder in the repository.

Safety

This crate forbids unsafe code, but it depends on a few popular crates that use unsafe internally.

License

This crate is dual-licensed under the MIT and Apache 2.0 licenses.

pdf-writer's People

Contributors

awehrfritz, bqdong, cmoog, elegaanz, heinenen, kianmeng, laurenzv, laurmaedje, niklasei, reknih, tingerrr

pdf-writer's Issues

Bug: writer.finish() stuck when page resource is accessed in between

Hey team,

So I was trying to insert images as shown in the examples. My images are generated on the fly.

So I set up my first page in A4 size as described in the docs:

  let mut page = writer.page(page_id);
  let a4 = Rect::new(0.0, 0.0, 595.0, 842.0);
  page.media_box(a4);
  page.parent(page_tree_id);
  page.contents(content_id);
  page.finish();

Then I tried to link the resources once my images are generated, like this:

  writer
      .page(page_id)
      .resources()
      .x_objects()
      .pair(image_name, image_id);

But this is done after the page.finish() part, somewhere further along in the code where the image is formed.

This caused writer.finish() to hang; execution gets stuck on that line. If I move the snippet above into the first block, just before page.finish() as in the example, it works. I guess it hangs because the PDF tree is invalid.

So I read the docs and found another example from here

I copy-pasted the exact line with my page_id, after page.finish().
This also made writer.finish() freeze.

Maybe my syntax is wrong and I should use an insert operation or something like that. It feels like I am missing something here.
Can you please point out the issue?
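
One plausible reading of the report (an assumption, not something confirmed in this thread) is that calling writer.page(page_id) a second time starts a second object with the same id, rather than reopening the first one. A way to sidestep that is to keep the page data in your own struct until all images are known, and only then write the page once. The sketch below shows the idea with a hypothetical `PendingPage` type that emits raw PDF syntax into a `String`; it does not use pdf-writer itself.

```rust
/// Hypothetical holder for page data that is not yet complete.
struct PendingPage {
    id: u32,
    parent: u32,
    contents: u32,
    /// (resource name, object id) pairs collected while images are generated.
    x_objects: Vec<(String, u32)>,
}

impl PendingPage {
    fn new(id: u32, parent: u32, contents: u32) -> Self {
        PendingPage { id, parent, contents, x_objects: Vec::new() }
    }

    /// Register an image whenever it becomes available.
    fn add_x_object(&mut self, name: &str, id: u32) {
        self.x_objects.push((name.to_string(), id));
    }

    /// Write the page object exactly once, with all resources in place.
    fn write(&self, out: &mut String) {
        let x_objects: String = self
            .x_objects
            .iter()
            .map(|(name, id)| format!("/{} {} 0 R ", name, id))
            .collect();
        out.push_str(&format!(
            "{} 0 obj\n<< /Type /Page /Parent {} 0 R /MediaBox [0 0 595 842] \
             /Contents {} 0 R /Resources << /XObject << {}>> >> >>\nendobj\n",
            self.id, self.parent, self.contents, x_objects
        ));
    }
}
```

With pdf-writer, the equivalent would be to collect the (name, id) pairs first and make the single writer.page(page_id) call afterwards, as in the working variant described above.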

API & documentation cleanup

In the wake of #26 the API would experience a breaking change; this is unavoidable if we want to support interactive forms (#25).

If we break the API, we may as well make the best of it and unify a few things. There are small differences in the API that I'd like to clean up to give end users a cleaner and more consistent API, while also allowing the average contributor to write less code.

Things I had in mind:

  • Module structure
    • it could be a bit deeper to avoid having very long files
  • Documentation
    • small inconsistencies break up the flow when reading
    • required or disallowed attributes are only mentioned here and there
    • the spec version which introduced an attribute is listed for some but not for others
    • some things are linked directly, while flags are often simply put into CODE_BLOCKS
  • API
    • lots of code duplication for very simple attributes
      • declarative macros for defining the most common attribute definitions would be a great addition (simple writers or single argument functions)
    • methods which are only available for some subtypes for a writer are hard to find without checking the docs, prefixes like in #23 could help find the right ones quicker

I would like to use this issue to discuss changes to the API before the next update, so we can minimize the number of breaking updates by thinking ahead.
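
To make the declarative-macro idea from the list above concrete, here is a minimal sketch of what such a macro could look like. The `Dict` type, its `pair` method, and the `simple_attrs!` macro are all hypothetical and written only for this illustration; they are not part of pdf-writer and do not reflect how the crate defines its writers.

```rust
use std::fmt::Display;

/// Hypothetical stand-in for a dictionary writer.
struct Dict {
    buf: String,
}

impl Dict {
    fn new() -> Self {
        Dict { buf: String::from("<<") }
    }

    /// Generic key/value writer that the generated methods delegate to.
    fn pair(&mut self, key: &str, value: impl Display) -> &mut Self {
        self.buf.push_str(&format!(" /{} {}", key, value));
        self
    }

    fn finish(mut self) -> String {
        self.buf.push_str(" >>");
        self.buf
    }
}

/// Generate one single-argument setter per `name: Type => "Key"` entry,
/// forwarding any doc comments to the generated method.
macro_rules! simple_attrs {
    ($ty:ty { $($(#[$doc:meta])* $name:ident: $arg:ty => $key:literal),* $(,)? }) => {
        impl $ty {
            $(
                $(#[$doc])*
                fn $name(&mut self, value: $arg) -> &mut Self {
                    self.pair($key, value)
                }
            )*
        }
    };
}

simple_attrs!(Dict {
    /// Write the `/Count` attribute.
    count: i32 => "Count",
    /// Write the `/Rotate` attribute.
    rotate: i32 => "Rotate",
});
```

Each attribute then costs one line plus its doc comment, which would also make it easier to keep the documentation wording uniform.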

How to add an image to a PDF?

Hey there,

First of all, thank you for creating this crate. I was using it with svg2pdf to generate images for a personal project and it's been great.

But because svg2pdf rasterizes, the file size seems to be very big. So I am planning to switch to PNG images via the image crate. Can you please share some insights on how to add a PNG to a PDF?

I saw there are image XObjects, but I can't seem to understand how to position the elements and how to use them properly. I have been reading the source code for quite some time.

let id = Ref::new(5);

let sample = image::from("/file.png");
let image = writer.image_xobject(id, sample);
image.finish();

By reading some of the dependent codebases I came to this, but I am still unsure how to place an image at a particular location on a page.

Would be great if you could add an example.

Thank you
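
On the positioning part of the question: in PDF, an image XObject is painted into the unit square, and the current transformation matrix decides where that square lands on the page. A matrix of the form [width 0 0 height x y] scales the image to width × height and moves its lower-left corner to (x, y). The sketch below writes the raw content-stream operators for that; the helper name `place_image` is made up for this example. With pdf-writer, the matching `Content` calls are, to my understanding, `save_state`, `transform`, `x_object` and `restore_state`, but please check the crate docs for the exact signatures.

```rust
/// Emit content-stream operators that draw the image resource `name`
/// with its lower-left corner at (x, y), scaled to `width` x `height`
/// (all values in PDF points).
fn place_image(name: &str, x: f32, y: f32, width: f32, height: f32) -> String {
    format!(
        // q/Q save and restore the graphics state so the matrix
        // does not leak into later drawing operations.
        "q\n{} 0 0 {} {} {} cm\n/{} Do\nQ\n",
        width, height, x, y, name
    )
}
```

The name passed in must match the key under which the image was registered in the page's XObject resources.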

A few use cases: How to center text? Generate page breaks? Does pdf-writer support text wrap?

Hi, I'm poking around the different options for pdf generation in Rust. The use cases I am unsure how to do with pdf-writer are:

  • Generate pages while looping through an iterator
  • center text
  • text wrap

I couldn't figure out how to add new pages so far. I see that there is an enum for centered text, but I didn't see where to pass that in. Lastly, is text wrap supported?

Thanks for the crate, well done. So far I was happy with the generated file sizes.
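
Since the README describes pdf-writer as a step-by-step writer, a reasonable assumption is that wrapping, centering and pagination are left to the caller. The three use cases can be sketched independently of the crate as below. All function names are made up for this example, and the width estimate uses a single average glyph width, which is a rough placeholder for real font metrics.

```rust
/// Rough text width in points, assuming every glyph is
/// `avg_glyph_em` em wide (a placeholder for real font metrics).
fn approx_text_width(text: &str, font_size: f32, avg_glyph_em: f32) -> f32 {
    text.chars().count() as f32 * font_size * avg_glyph_em
}

/// X coordinate at which a line of `text_width` starts when centered.
fn centered_x(page_width: f32, text_width: f32) -> f32 {
    (page_width - text_width) / 2.0
}

/// Greedy word wrap: fill each line with as many words as fit in `max_chars`.
fn wrap(text: &str, max_chars: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        let current_len = current.chars().count();
        // Start a new line if adding this word (plus a space) would overflow.
        if !current.is_empty() && current_len + 1 + word_len > max_chars {
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

/// Split wrapped lines into pages of at most `lines_per_page` lines.
fn paginate(lines: &[String], lines_per_page: usize) -> Vec<Vec<String>> {
    lines
        .chunks(lines_per_page.max(1))
        .map(|page| page.to_vec())
        .collect()
}
```

Each entry returned by `paginate` would then get its own page id and content stream, with all page ids listed in the page tree's kids array and the count set to the number of pages.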

Question: What specification of PDF format is implemented?

Hi! Thank you for the crate!

I am starting to look into creating PDF files and wanted to know what is actually being written with this crate.

As with any format, there have been many specifications of the PDF file format over the years.
What does this crate actually implement?

Annots attribute should contain indirect references

According to the PDF 1.7 standard (7.7.3.2, Table 30), the Annots attribute in a Page dictionary should contain indirect references; however, this library writes the annotation dictionaries directly into the page dictionary.

Most (all?) PDF readers seem to deal fine with the direct dictionaries. However, when looking into what would be needed to support interactive forms in PDFs (typst/typst#1765), I noticed that in this case annotation dictionaries need to be referenced from multiple places at once (the page on which they are placed and the form field which they represent).
This is only possible with indirect references.

Note that this change would require a backwards-incompatible API change.
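
To illustrate the difference at the syntax level, the sketch below builds both forms of the Annots entry as plain strings. The function names and the example dictionary contents are made up for this illustration and do not come from pdf-writer.

```rust
/// Direct form: the annotation dictionary is embedded in the page,
/// so nothing else in the file can point at it.
fn annots_direct(annotation_dict: &str) -> String {
    format!("/Annots [{}]", annotation_dict)
}

/// Indirect form: the page only lists references; each annotation
/// lives in its own object and can be shared.
fn annots_indirect(ids: &[u32]) -> String {
    let refs: Vec<String> = ids.iter().map(|id| format!("{} 0 R", id)).collect();
    format!("/Annots [{}]", refs.join(" "))
}

/// A form field referring to the same widget annotation by reference.
fn field_kids(ids: &[u32]) -> String {
    let refs: Vec<String> = ids.iter().map(|id| format!("{} 0 R", id)).collect();
    format!("/Kids [{}]", refs.join(" "))
}
```

With the indirect form, the page's Annots array and the field's Kids array can both contain `7 0 R`, which is the sharing that interactive forms require.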

PDF/A-1b

Hi

I noticed that you mention the PDF/A standard and that the current functionality should allow writing documents compliant with that standard. I was just wondering: how would I embed things like ICC color profiles and add a correct OutputIntent?

Measuring page content ahead of time

Hey!

I want to use this crate to create PDFs of some text files. The content of the files is rarely larger than 4-8 lines.
Is it somehow possible to automatically shrink the page size once text has been written? Maybe with some way to specify a margin?

Something similar to LaTeX's separate document class :)

Thanks!
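
One way to approach this, assuming the text is laid out by the caller (so the number of lines is known before the page is written), is to compute the media box from the line count instead of shrinking the page afterwards. The helper names and the choice of a fixed leading are assumptions made for this sketch.

```rust
/// Media box [x0, y0, x1, y1] that just fits `line_count` lines of text
/// with the given leading (distance between baselines) and margin, in points.
fn fitted_media_box(line_count: usize, leading: f32, margin: f32, width: f32) -> [f32; 4] {
    let height = line_count as f32 * leading + 2.0 * margin;
    [0.0, 0.0, width, height]
}

/// Baseline y coordinate of line `index` (0 = first line), measured
/// from the bottom of the page as PDF coordinates are.
fn baseline_y(page_height: f32, margin: f32, leading: f32, index: usize) -> f32 {
    page_height - margin - leading * (index as f32 + 1.0)
}
```

The four values can then be passed to `Rect::new` for the page's media box, in the same way the A4 rectangle is passed in the minimal example.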

PDF with ICC based color space defined in the document's resource directory crashes CorelDRAW

I've implemented the ICC example, which works fine in all readers (Acrobat, Illustrator, Preview, CorelDRAW, ...).

As I'd like to reuse the profile across pages, I moved it to the resource dict of the document as opposed to the resource dict of the page. This generates a working file that works fine in all readers except CorelDRAW: the program seems to crash with a segfault when parsing the color space:

0   CrlPDFImport.dylib            	       0x365a893e8 CPDFColorSpaceResource::~CPDFColorSpaceResource() + 40
1   CrlPDFImport.dylib            	       0x365ad7e78 CPDFPageResources::CleanUp() + 260
2   CrlPDFImport.dylib            	       0x365ad7fac CPDFPageResources::~CPDFPageResources() + 32
3   CrlPDFImport.dylib            	       0x365abae54 CPDFPage::~CPDFPage() + 76
4   CrlPDFImport.dylib            	       0x365ada994 CPDFPagesTree::~CPDFPagesTree() + 92
5   CrlPDFImport.dylib            	       0x365a7d4d0 CPDFCatalog::~CPDFCatalog() + 76
6   CrlPDFImport.dylib            	       0x365a900a0 CPDFDocument::~CPDFDocument() + 304

Please see the following screen recording to illustrate the issue:

Screen.Recording.2024-06-27.at.15.38.57.mp4

Here's the adapted implementation which creates 2 PDF-files:

  • output-page-resources.pdf: works fine in all readers
  • output-document-resources.pdf: works fine in all readers except CorelDRAW

//! This example shows how to use ICC-based color spaces.
use pdf_writer::writers::ColorSpace;
use pdf_writer::{Content, Finish, Name, Pdf, Rect, Ref};

fn create_pdf_with_page_resources() -> std::io::Result<()> {
    let mut pdf = Pdf::new();
    let catalog_id = Ref::new(1);
    let page_tree_id = Ref::new(2);
    let page_id = Ref::new(3);
    let content_id = Ref::new(4);
    let icc_id = Ref::new(5);

    // Setup catalog and pages
    pdf.catalog(catalog_id).pages(page_tree_id);

    // Set up the page tree. For more details see `hello.rs`.
    pdf.pages(page_tree_id).kids([page_id]).count(1);

    // Create an A4 page.
    let mut page = pdf.page(page_id);
    page.media_box(Rect::new(0.0, 0.0, 595.0, 842.0));
    page.parent(page_tree_id);
    page.contents(content_id);

    // Setup page resources
    let color_space_name = Name(b"sRGB");
    page.resources()
        .color_spaces()
        .insert(color_space_name)
        .start::<ColorSpace>()
        .icc_based(icc_id);
    page.finish();

    // Build the page content: a single filled rectangle.
    let mut content = Content::new();
    content.set_fill_color_space(color_space_name);
    content.set_fill_cmyk(0.0, 1.0, 0.0, 0.0);
    content.rect(108.0, 734.0, 100.0, 100.0);
    content.fill_even_odd();

    // Write the content stream.
    pdf.stream(content_id, &content.finish());

    // Read the ICC profile from a file.
    let icc_data = std::fs::read("sRGB_v4.icc")?;
    let mut icc_profile = pdf.icc_profile(icc_id, &icc_data);
    icc_profile.n(4);
    icc_profile.alternate().device_cmyk();
    icc_profile.finish();

    // Write the PDF
    std::fs::write("output-page-resources.pdf", pdf.finish())
}

fn create_pdf_with_document_resources() -> std::io::Result<()> {
    let mut pdf = Pdf::new();
    let catalog_id = Ref::new(1);
    let page_tree_id = Ref::new(2);
    let page_id = Ref::new(3);
    let content_id = Ref::new(4);
    let icc_id = Ref::new(5);

    // Setup catalog and pages
    pdf.catalog(catalog_id).pages(page_tree_id);

    // Set up the page tree. For more details see `hello.rs`.
    let mut pages = pdf.pages(page_tree_id);
    pages.kids([page_id]).count(1);

    // Setup document resources
    let color_space_name = Name(b"sRGB");
    let mut resources = pages.resources();
    let mut spaces = resources.color_spaces();
    spaces
        .insert(color_space_name)
        .start::<ColorSpace>()
        .icc_based(icc_id);
    spaces.finish();
    resources.finish();
    pages.finish();

    // Create an A4 page.
    let mut page = pdf.page(page_id);
    page.media_box(Rect::new(0.0, 0.0, 595.0, 842.0));
    page.parent(page_tree_id);
    page.contents(content_id);
    page.finish();

    // Build the page content: a single filled rectangle.
    let mut content = Content::new();
    content.set_fill_color_space(color_space_name);
    content.set_fill_cmyk(0.0, 1.0, 0.0, 0.0);
    content.rect(108.0, 734.0, 100.0, 100.0);
    content.fill_even_odd();

    // Write the content stream.
    pdf.stream(content_id, &content.finish());

    // Read the ICC profile from a file.
    let icc_data = std::fs::read("sRGB_v4.icc")?;
    let mut icc_profile = pdf.icc_profile(icc_id, &icc_data);
    icc_profile.n(4);
    icc_profile.alternate().device_cmyk();
    icc_profile.finish();

    // Write the PDF
    std::fs::write("output-document-resources.pdf", pdf.finish())
}

fn main() {
    create_pdf_with_page_resources().expect("Failed to create PDF with page resources");
    create_pdf_with_document_resources().expect("Failed to create PDF with document resources");
}
