pdf-association / pdf-issues
Industry-based resolutions for issues and errata reported against any PDF-related specification
Home Page: https://pdf-issues.pdfa.org/
Table 317 — Entries in a 3D background dictionary
3D support in PDF is currently heavily centred around RGB-based colour spaces - see ColorSpace key in "Table 311 — Entries in a 3D stream dictionary" and many places in clause 13.6 and related sub-clauses. Trying to future-proof things for other colour families for some future PDF version is fraught with potential issues and creates confusion now. Of course, the future is also highly unlikely to support Patterns, named colours or other advanced PDF colour spaces!
Lack of clarity for CS and C keys with confusing statements about a possible future:
The CS key is a "name or array" yet the description explicitly constrains values to only DeviceRGB (name) by an explicit shall statement: "The only valid value shall be the name DeviceRGB."
It then goes on to state "PDF consumers shall be prepared to encounter other values that may be supported in future versions of PDF.". Clearly, the intention is that the array format might be one of the other PDF colour spaces, but this is no different to handling future PDF features anywhere else in PDF as described in "Annex I (normative) PDF versions and compatibility". So this second explicit "shall" makes no sense with the other explicit "shall" statement.
Correspondingly the C key is typed as "(various)" with a description that doesn't really help, except for the DeviceRGB case (which is the only valid current color space as mandated by the CS "shall" statement!). It needs to be reworded to a similar statement to the C key from "Table 166 — Entries common to all annotation dictionaries":
"An array of 3 numbers in the range 0.0 to 1.0, representing the background colour in the colour space defined by the CS key".
If/when 3D ever supports other colour spaces, this would just be one place of many that need careful review and updating.
In "Table 132 — Entries in a Type 5 halftone dictionary", the Default key description says "The value shall not be 5.", but the key is of type "dictionary or stream", not a number or integer.
It should really say what is said for the "any colorant name" key description in Table 132: "The halftone may be of any Type other than 5."
Describe the bug
Section 12.5.1, Paragraph 4:
An interactive PDF processor shall provide certain expected behaviour for all annotation types that it does not recognise, as documented in 12.5.2, "Annotation dictionaries".
However, there is nothing in 12.5.2 (or anywhere else that I can find) that actually tells a processor (interactive or otherwise) what to do in the case of an unknown annotation Subtype.
Recommend that some text be written to define the behavior - or at least to agree that it is undefined.
Clause "13.6.4.3 3D background dictionaries" does not make clear if or when the BM key, which is common to all annotation dictionaries (see Table 166 — Entries common to all annotation dictionaries), is used.
3D annots have two situations - when an appearance stream is used (such as when print rendering or in a non-3D-capable processor) and when the 3D artwork is invoked (such as in an interactive viewer that is 3D aware). 13.6.4.3 also states "In effect, the 3D artwork and its background form a transparency group whose flattened results have an opacity of 1 (see 11, "Transparency")." but this does not mention the BM key...
Table 166 defines BM as "The blend mode that shall be used when painting the annotation onto the page ..." so does it also apply when using the 3D artwork?
In ISO 32000-2:2020, "Table 5 — Entries common to all stream dictionaries" states that Length is a required key.
The F key is then described as "(Optional; PDF 1.2) The file containing the stream data. If this entry is present, the bytes between stream and endstream shall be ignored. However, the Length entry should still specify the number of those bytes (usually, there are no bytes and Length is 0). The filters that are applied to the file data shall be specified by FFilter and the filter parameters shall be specified by FDecodeParms."
In this text it says Length "should still specify..." but if Length is really mandatory then it "shall specify...".
Or Length is optional when F is present.
I personally think /Length is always required so the correct fix is "should" to "shall" in the description of F.
The Value cell for AP in table 166 explicitly says "(Optional; PDF 1.2)", but then goes on to say that "Every annotation [...], except for the two cases listed below, shall have at least one appearance dictionary."
I recommend changing the initial parenthetical text to "(Usually required; PDF 1.2)".
"Table 27 — Additional crypt filter dictionary entries for public-key security handlers" Recipients key is described as being of type "string or array".
The array case is then clearly defined to be a byte string: "If the crypt filter is referenced from StmF or StrF in the encryption dictionary, this entry shall be an array of byte strings, where each string shall be a binary-encoded CMS object that shall ...".
An improvement would be to change "where each string ..." to "where each byte string ..."
However for the string case, it just says "... this entry shall be a string that shall be a binary-encoded CMS object that shall contain a list of all recipients ..."
Question: is this also required to be a byte string or is just string OK?
Clause 7.6.5.3 describes the key derivation procedure for PubKey handlers as follows:
These operations digest the following data, in order:
a) The 20 bytes of seed.
b) The bytes of each item in the Recipients array of CMS objects in the order in which they appear in the array.
c) 4 bytes with the value 0xFF if the key being generated is intended for use in document-level encryption and the document metadata is being left as plaintext.
d) The first n/8 bytes of the resulting digest shall be used as the file encryption key, where n is the bit length of the file encryption key.
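As a rough illustration, steps (a) to (d) can be sketched as below. The function name `derive_file_key` and its parameters are hypothetical; the hash algorithm is passed in by name because the quoted text does not state it, and the `metadata_plaintext` flag is only one possible reading of the condition in step (c).

```python
import hashlib


def derive_file_key(seed, recipients, key_bits, hash_name, metadata_plaintext=False):
    """Sketch of steps (a)-(d); names and parameters are illustrative only."""
    if len(seed) != 20:
        raise ValueError("seed must be exactly 20 bytes")
    digest = hashlib.new(hash_name)
    digest.update(seed)                      # (a) the 20 bytes of seed
    for cms_object in recipients:            # (b) each Recipients item, in array order
        digest.update(cms_object)
    if metadata_plaintext:                   # (c) 4 bytes of 0xFF (condition is an assumption)
        digest.update(b"\xff\xff\xff\xff")
    return digest.digest()[: key_bits // 8]  # (d) first n/8 bytes of the digest


# Usage with dummy byte strings standing in for real CMS objects.
key = derive_file_key(b"\x00" * 20, [b"cms-1", b"cms-2"], 128, "sha1")
```

Note that the order of the Recipients items changes the result, so two writers that list the same recipients in a different order derive different keys.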
The step I'm confused about is (c).
StrF or StmF in the document-wide encryption settings?
SubFilter set to adbe.pkcs7.s4 or adbe.pkcs7.s3, crypt filters are not supported. In that case, how do we know whether metadata is supposed to be encrypted? The EncryptMetadata entry in the document-wide encryption dictionary is only defined for the standard security handler, after all. Is this an oversight? If not, should we assume that EncryptMetadata is true for the purposes of key derivation unless a crypt filter dictionary says otherwise?
This entry has been undocumented since the PDF Reference 1.7 era, but it has always been supported by commercial PDF products such as Acrobat. I'd like to propose formalizing this entry in the standard for completeness and consistency.
Proposed entry:
PV
Proposed description text:
(Optional) A flag indicating the visibility of the cutting plane. If true, then the cutting plane shall be visible. If false, then the cutting plane shall not be visible.
Default value: true
There is language in clause 7.6.5.2 indicating that adbe.pkcs7.s5 shall be used when crypt filters are used in a public key security handler. Clause 7.6.4.1 also states that when a security handler of version 4 or 5 is specified, the standard reader shall support crypt filters.
A literal reading of this requirement doesn't appear to forbid using adbe.pkcs7.s4 as the SubFilter entry for a V5 public key security handler (e.g. in case you'd want to encrypt a document using AES-256 without bothering with crypt filters). Nonetheless, apparently Acrobat refuses to decrypt such files.
Was the intention here to make crypt filters effectively mandatory for version 4 and 5? In V4 they'd be effectively required anyway (to distinguish between RC4 and AES-128), but that doesn't apply to V5. If V4 and V5 require crypt filters to be used, adding a sentence to clause 7.6.5.2 to that effect could be helpful.
I've attached a couple of example files testing different combinations of security handler versions and subfilter values. As indicated in the specification, the S5 files use crypt filters, while the S4 files do not. Acrobat opens all of them except the V5-S4 one.
aes-tests-V2-S5.pdf
aes-tests-V5-S5.pdf
aes-tests-V2-S4.pdf
aes-tests-V5-S4.pdf
GitHub won't allow me to attach it, but the key material can be found here: https://github.com/MatthiasValvekens/pyHanko/tree/master/pyhanko_tests/data/crypto (the relevant files are named selfsigned.*; you can either grab them in PKCS#12 format or as a straight PEM dump of the certificate + key)
Minor issue in 12.8.2.4 on the Fields value:
(Required if Action is Include or Exclude) An array of text strings containing field names.
Should say "fully qualified field names"
Clause 7.6.5.3 mandates the following:
A key shall be used to encrypt (and decrypt) the enveloped data. This key (the plaintext key in "Figure 4 — Public-key encryption algorithm") shall be encrypted for each recipient, using that recipient’s public key, and shall be stored in the CMS object (as the encrypted key for each recipient). To decrypt the document, that key shall be decrypted using the recipient’s private key, which yields a decrypted (plaintext) key.
The next paragraph includes provisions on the (symmetric) ciphers that can be used to encrypt the envelope contents (i.e. 20-byte seed + permissions):
The algorithms that shall be used to encrypt the enveloped data in the CMS object are:
- RC4 with key lengths up to 256-bits (deprecated);
- DES, Triple DES, RC2 with key lengths up to 128 bits (deprecated);
- 128-bit AES in Cipher Block Chaining (CBC) mode (deprecated);
- 192-bit AES in CBC mode (deprecated);
- 256-bit AES in CBC mode.
However, there is nothing in the clause restricting which public-key encryption schemes and key lengths are permissible to encrypt the plaintext key.
Even if we take "public-key encryption" to mean "RSA", there's still the issue of padding schemes. Obviously, classic RSA with PKCS#1 v1.5 padding probably works with virtually every implementation, but what about RSA-OAEP? The latter is a more modern parametrised scheme (RSA-OAEP is to encryption what RSA-PSS is to signing, essentially), and is not as widely supported.
I wouldn't necessarily oppose leaving this up to the implementation, but it feels a bit strange to me to constrain the enveloped data encryption to a well-defined list of ciphers, while at the same time not restricting the ways in which the envelope key can be encrypted for each recipient.
Clauses 7.6.4.3.3 and 7.6.4.4.9 state that the Perms entry shall be computed using AES-256 in ECB mode. More precisely:
[...] Encrypt the 16-byte block using AES-256 in ECB mode with an initialization vector of zero, using the file encryption key as the key.
I believe this is a typo, since initialisation vectors don't make sense for block ciphers operating in ECB mode, and the specification consistently uses CBC mode elsewhere. This includes cases where the initialisation vector is mandated to be zero.
Note: for encrypting a single block, ECB mode is equivalent to CBC mode with an IV of zero, so this issue is unlikely to lead to errors in implementation. Nonetheless, it's a little confusing.
Proposed solution: either strike the words "with an initialization vector of zero", or replace "ECB" with "CBC" in both instances.
In the former case, explaining that both produce the same result in a note might also be useful, since some cryptographic libraries don't expose ECB on account of its obvious issues with repeating patterns (see here for an example).
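The single-block equivalence mentioned in the note follows from how the modes are built: CBC XORs the first plaintext block with the IV before encrypting it, and XOR with zero is a no-op. The sketch below shows this with a deliberately trivial stand-in cipher (`toy_encrypt_block` is hypothetical and NOT AES or secure), so it illustrates the mode structure only, not the actual Perms computation.

```python
BLOCK = 16  # block size in bytes, as for AES


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key, block):
    """Insecure stand-in for a real block cipher: XOR with key, rotate one byte."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]


def ecb_encrypt(key, data):
    # ECB: every block is encrypted independently; no IV is involved.
    return b"".join(
        toy_encrypt_block(key, data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)
    )


def cbc_encrypt(key, iv, data):
    # CBC: each plaintext block is XORed with the previous ciphertext block (or the IV).
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        prev = toy_encrypt_block(key, xor_bytes(data[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


key = bytes(range(BLOCK))
one_block = b"PDF Perms block!"  # exactly 16 bytes
zero_iv = bytes(BLOCK)

same_for_one_block = ecb_encrypt(key, one_block) == cbc_encrypt(key, zero_iv, one_block)
same_for_two_blocks = ecb_encrypt(key, one_block * 2) == cbc_encrypt(key, zero_iv, one_block * 2)
```

With one block the two modes agree; with two identical blocks they no longer do, which is the repeating-pattern weakness of ECB.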
Table 311, entry ColorSpace
The RGB colour space in which the 3D artwork’s colour values are encoded. Valid values are the name DeviceRGB, an array specifying a valid CalRGB color space (see 8.6.5.3 "CalRGB colour spaces"), or an array specifying a valid RGB-based ICCBased color space (see 8.6.5.5 "ICCBased colour spaces"). If this key is not present, the colour space for the 3D artwork colour values are considered undefined and a PDF processor may choose any appropriate RGB-based colour space, such as sRGB.
It is not clear from other parts of the text whether a DefaultRGB present on the page where the Annotation referencing the stream is used, should be used "in place" of the DeviceRGB (as would be the case for other uses of DeviceRGB). I believe that it should.
Clause 14.8.4.7.2 General inline level structure types, 2nd bullet below Table 368/Note 1 says:
"One object reference (see 14.7.5.3, "PDF objects as content items") to one link annotation associated with the content"
This restriction (new to PDF 2.0) is too strong; it doesn't allow for multiple link annotations tagged with a single element as would be necessary to fully represent semantics for content that spans pages.
There is a contradiction between the Contents entry in Table 31, which says that an array value for Contents in a page object is only a single content stream, built from multiple streams.
"(Optional) A content stream (see 7.8.2, "Content streams") that shall describe the contents of this page. If this entry is absent, the page shall be empty. The value shall be either a single stream or an array of streams. If the value is an array, the effect shall be as if all of the streams in the array were concatenated with at least one white-space character added between the streams’ data, in order, to form a single stream. PDF writers can create image objects and other resources as they occur, even though they interrupt the content stream. The division between streams may occur only at the boundaries between lexical tokens (see 7.2, "Lexical conventions") but shall be unrelated to the page’s logical content or organisation. Applications that consume or produce PDF files need not preserve the existing structure of the Contents array. PDF writers shall not create a Contents array containing no elements."
And clause 7.8.3 Resource dictionaries, which says that each item in the array is a content stream in its own right:
"For a content stream that is the value of a page’s Contents entry (or is an element of an array that is the value of that entry), ..."
Removing the text in parentheses in clause 7.8.3 would remove this contradiction.
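To illustrate why Table 31 asks for white-space between the array elements, here is a minimal sketch. The helper `effective_content` is hypothetical, and `bytes.split()` is only a crude stand-in for a real PDF lexer; the stream data is assumed to be already decoded.

```python
def effective_content(streams):
    """Join decoded Contents array elements with one white-space byte between them."""
    if not streams:
        raise ValueError("a Contents array shall not be empty")
    return b"\n".join(streams)


# A split at a token boundary: the matrix operands end one stream and continue in the next.
parts = [b"q 1 0 0 1", b"72 72 cm", b"Q"]

tokens_with_separator = effective_content(parts).split()
tokens_without_separator = b"".join(parts).split()
```

Without the separator the operands `1` and `72` fuse into `172` and `cm` fuses with `Q`, so the individual array elements only make sense as one concatenated stream.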
Trap network annotations are deprecated and the error goes back to the PDF 1.3 Reference, but the following should probably be corrected nevertheless. Table 404 "Additional entries specific to a trap network appearance stream" contains the following obvious typos:
"Valid values are DeviceGray, DeviceRGB, DeviceCMYK, DeviceCMY, DeviceRGBK, and DeviceN."
This should probably read:
"Valid values are DeviceGray, DeviceRGB, DeviceCMYK, and DeviceN."
In PDF2, Artifacts can be added to the document Structure Tree at any point in the tree. There are rules about tag ordering in section 14.8.4 which don't take this into account, and I'm not sure if this is an oversight (as this didn't apply in PDF1.7) or by design.
Specifically: a Caption has to be "the first or last structure element inside its parent structure element". If a Caption is preceded by an Artifact StructureElement in the tree (as it might well be if that Artifact was used to wrap the draw operations for the table background or border, for example) it's going to fail this rule.
Obviously in PDF1 this wasn't an issue as Artifacts were never part of the tree. Given that Artifacts represent "not real content" and can pop up anywhere (largely depending on the technical requirements of the tool creating the document), it seems to me their position relative to anything else shouldn't really matter; they should be transparent to any sort of restrictions on ordering of children. I am pretty certain the only time this restriction occurs is this rule for the Caption element.
I'll be sure to raise this in the PDF/UA-2 WG too, but as the restriction comes from the wording in ISO32K it really needs to be resolved there too. If it's intentional, it's not an insurmountable problem I'm sure, but wanted to flag it up in case it slipped through.
Clause 14.8.6.1 Namespaces for standard structure types and attributes
There is a gap in ISO 32000-2:2020 that allows undefined element types. Need to make it clearer in future editions that we still allow non-namespaced elements. But they need to be roll-mapped to a known type.
The Value cell for AP in table 166 explicitly says "(Optional; PDF 1.2)", but then goes on to say that "Every annotation [...], except for the two cases listed below, shall have at least one appearance dictionary."
I recommend changing the initial parenthetical text to something that makes people realise that AP is not simply optional and that they need to read the full text; perhaps "(Usually required; PDF 1.2)".
Several integer keys in dictionaries do not state any explicit valid ranges, such as "positive integer ..." or "non-negative integer ...".
One way to fix this quickly may be to simply state once up the front of ISO 32K somewhere (where?) that key values that represent counts, sizes, widths, heights, file byte offsets, object numbers, page numbers and <anything else that is common?> are non-negative unless stated otherwise.
Or we could review each and add the explicit wording in place.
Here is an incomplete list (from a search of ISO 32000-2 for "integer" up to about Table 100 - more to be added later):
In 14.13.5, paragraph 3 it emboldens the phrase "Property List", which is not a formal key and therefore should not be bold.
The simplest change is to make it italic and all lower case, which matches other uses of that phrase.
8.9.7 Inline images is inconsistent between the text in the 1st para after Table 90: “Unless the image uses ASCIIHexDecode or ASCII85Decode as one of its filters” and in Note 2: “if the final or only filter is ASCIIHexDecode or ASCII85Decode”.
Shouldn’t the first of those also talk about “if the final or only filter” rather than ASCII~ being “one of its filters”? I realise you’d need to have done something very odd if you’re using ASCII~ and something else, and the ASCII~ is not last, but …
Clause 9.8.3.3 FD, 2nd paragraph currently states:
The key for each entry in an FD dictionary shall be the name of a class of glyphs — that is, a particular subset of the CIDFont’s character collection. The entry’s value shall be a font descriptor whose contents shall override the font-wide attributes for that class only. This font descriptor shall contain entries for metric information only; it shall not include FontFile, FontFile2, FontFile3, or any of the entries listed in “Table 120 — Entries common to all font descriptors”.
All the metrics that should go in the FD sub-dictionaries are thus supposedly listed in Table 120. Looking back through old specs, ISO 32000-1:2008 references Table 122, which is also "Entries common to all font descriptors”, but section 5.7.2 of the PDF 1.6 spec (page 433) references Table 5.21, which is "Additional font descriptor entries for CIDFonts”, which makes total sense. Looks like there was a typo in the table number when transitioning the PDF spec to ISO.
Table 111 "Type 3 font operators" describes the d0 and d1 operators which specify metrics of glyphs in a Type 3 font. wx is described as horizontal displacement in the glyph coordinate system, but its data type is not specified. Is it integer, real or number (i.e. both)?
The data type for wx, wy, llx, lly, urx, ury should be stated explicitly.
The example below Table 111 "Type 3 font operators" and Figure 62 "Output from the example" generates two glyphs which are appropriately named /square and /triangle. However, the EXAMPLE text mentions character codes a and b, and the comment within the code similarly talks about Type 3 font definition encoding two glyphs, 'a' and 'b'.
While the glyphs are placed at positions 97/98 in the Encoding which correspond to a and b in WinAnsiEncoding, there's nothing in the Type 3 font which would imply any relationship of those glyphs to the "glyphs" or "character codes" a and b. It just so happens that they occupy the same slot as a and b in some other encoding which is unrelated to the example.
Suggestions:
Table 164 — Entries in a transition dictionary
Di key - should be an "integer", not a "number" as it’s a predefined set of integer-only values.
Some clarification is required regarding whether or not an annotation requires an appearance stream /AP. The main source is Table 166 "Entries common to all annotation dictionaries" which states that the following don't need any appearance:
But there are additional sources:
Suggestions:
Change the /AP description in Table 166 from Optional to Required in some cases; see below.
(UNCHANGED) Every annotation (including those whose Subtype value is Widget, as used for form fields), except for the two cases listed below, shall have at least one appearance dictionary.
(NEW) The AP entry is not allowed in the following cases:
Table 177: The annotation dictionary’s AP entry, if present, shall take precedence...
Modify as follows:
The annotation dictionary’s AP entry shall take precedence...
Delete the following phrase in 12.5.6.24 as it duplicates information from Table 166:
A projection annotation with a Rect entry that has zero height or zero width shall not have an AP dictionary.
ISO 32000-2:2020 "Table 120 — Entries common to all font descriptors" FontName is described as "(Required) The PostScript name of the font. This name shall be the same as the value of BaseFont in the font or CIDFont dictionary that refers to this font descriptor."
Type 3 fonts don't have BaseFont as defined by Table 110, so the FontDescriptor FontName should not be required for Type 3.
Section 14.8.4.8.3 - Table 371
Current text reads: "A row of table header cells (TH) or table data cells (TD) in a table."
This could imply that a TR cannot have both TH and TD in it.
Recommended edit:
"A row of table header cells (TH) and/or table data cells (TD) in a table." (Adding "and".)
This text is from ISO32K2:2020, bottom of page 191:
PDF writers shall only use the profile types shown in "Table 67 — ICC profile types" for specifying calibrated colour spaces for colouring graphics objects. Each of the indicated fields shall have one of the values listed for that field in the second column of the table. Profiles shall satisfy both the criteria shown in the table. The terminology is taken from the ICC specifications.
...
Note 1. XYZ and 16-bit Lab profiles are not listed.
and here's table 67:
Header Field | Required Value |
---|---|
deviceClass | icSigInputClass ('scnr'), icSigDisplayClass ('mntr'), icSigOutputClass ('prtr'), icSigColorSpaceClass ('spac') |
colorSpace | icSigGrayData ('GRAY'), icSigRgbData ('RGB '), icSigCmykData ('CMYK'), icSigLabData ('Lab ') |
I've a few minor issues with this.
First, "The terminology is taken from the ICC specifications." - maybe, but there's no "icSigNNN" anywhere in ICC v4.3 or 2.4. Not a big deal, but replacing "icSigDisplayClass ('mntr')" with just "mntr" might be an improvement.
Second, the note explicitly states "16 Bit Lab is not listed" - not listed doesn't really tell us anything. The text was "not supported" in PDF1.7, which I think was a better phrasing.
Third, despite being not listed (or supported), Lab is listed in table 67. It's not clear what's supposed to be allowed - 8 bit Lab only? In fact there's no such thing as "16 bit Lab" in ICC: there's just Lab. It may have an 8-bit table, a 16 bit lookup table or a parametric curve. I've never seen an 8-bit one, but neither 16-bit nor parametric curve Lab profiles are supported in Acrobat.
I think the intent is just to exclude Lab as a colorSpace type, in which case I'd suggest dropping the "Lab" line from Table 67, and changing the Note to "Note 1. in particular, XYZ and Lab profiles are not supported." - although it's a bit redundant given the lack of those entries in table 67.
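For reference, a check against Table 67 as currently printed (including the Lab row questioned above) might look like the sketch below. The function name is hypothetical, and the byte offsets 12 and 16 come from the ICC profile header layout, not from the text quoted here.

```python
ALLOWED_DEVICE_CLASS = {b"scnr", b"mntr", b"prtr", b"spac"}
ALLOWED_COLOR_SPACE = {b"GRAY", b"RGB ", b"CMYK", b"Lab "}  # Lab kept as printed in Table 67


def check_icc_header(profile):
    """Return (device_class_ok, color_space_ok) for the 128-byte ICC header."""
    if len(profile) < 128:
        raise ValueError("ICC profile header is 128 bytes")
    device_class = profile[12:16]  # profile/device class signature
    color_space = profile[16:20]   # data colour space signature
    return device_class in ALLOWED_DEVICE_CLASS, color_space in ALLOWED_COLOR_SPACE


# Usage with a fabricated header that only fills in the two fields of interest.
header = bytearray(128)
header[12:16] = b"mntr"
header[16:20] = b"RGB "
result = check_icc_header(bytes(header))
```

Dropping the Lab row, as suggested, would amount to removing `b"Lab "` from the second set.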
In Table 176 the BS entry specifies "the line width and dash pattern that shall be used in drawing the annotation’s border". It's pretty obvious what that means when only Rect is present. But what does it mean if QuadPoints is in use? Should the outline of every rectangle be painted?
In table 322 the default value of the IV entry is false. However, in existing commercial products like Acrobat, the value of this entry has always been interpreted as true if missing. (And true is indeed the more natural default value.)
I'm not sure whether it's better to just change the specification in this case, or just claim that every existing implementation has been wrong. Personally I feel the former approach is more practical and helpful.
Note 3 in "8.9.7 Inline images" says that various colour space names don’t refer to resources in the ColorSpace subdictionary; all well and good. But it starts by saying that they identify the corresponding colour spaces directly, which implies that DefaultGray, DefaultRGB and DefaultCMYK do not apply here.
If that's true then it shouldn't be in a note, but it would be odd to make inline images so different to everything else. I recommend clarifying that default colour spaces do apply here.
I suggest something like "The names DeviceGray, DeviceRGB, and DeviceCMYK (as well as their abbreviations G, RGB, and CMYK) never refer to resources in the ColorSpace subdictionary; they always identify the corresponding colour spaces either directly or via a default color space (see 8.6.5.6 Default colour spaces)."
Table 116 — Predefined CJK CMap names
Table 116 in the "Korean" section lists only predefined CMaps that belong to the deprecated "Adobe-Korea1-2" character collection. On the other hand the predefined CMaps belonging to the new "Adobe-KR-9" are not listed in the "Korean" section of Table 116.
The predefined CMaps in the "Korean" section belonging to the deprecated "Adobe-Korea1-2" character collection should be removed from Table 116. The new predefined CMaps "UniAKR-UTF8-H", "UniAKR-UTF16-H" and "UniAKR-UTF32-H" should be added to the "Korean" section of Table 116.
See also The Adobe-KR-9 Character Collection in the README.md file in the Adobe CMap Resources GitHub repository.
I've just had another developer who misunderstood the text in 7.4.9 around using CIEJab for JPXDecode.
It says "Data used in PDF image XObjects shall be limited to the JPX baseline set of features, except for enumerated colour space 19 (CIEJab)." My understanding, supported by the wording in Adobe v1.7, is that CIEJab is not in the baseline set, but is allowed.
That's immediately followed by "In addition, enumerated colour space 12 (CMYK), which is part of JPX but not JPX baseline, shall be supported in a PDF file." I think the difference in the way that CIEJab and CMYK are described contributes to the confusion.
I suggest something like: "Data used in PDF image XObjects shall be limited to the JPX baseline set of features. In addition, enumerated colour spaces 12 (CMYK) and ### (CIEJab), which are part of JPX but not JPX baseline, shall be supported in a PDF file."
Clause 12.7.5.5, Table 237 (under "AddRevInfo") [...] adbe.pkcs7.detached and adbe.pkcs7.sha1 are deprecated in PDF 2.0.
Clause 12.8.3.1, Table 260 [...] values adbe.x509.rsa_sha1 and adbe.pkcs7.sha1 have been deprecated with PDF 2.0.
PeterW: If you search for “adbe.x509.rsa_sha1”, it comes up in:
If you search for “adbe.pkcs7.sha1”, it comes up in:
If you search for “adbe.pkcs7.detached”, you get the following:
Clause Q.2 (last para) states the following in regards to annotations:
Since Annotations require an appearance stream which is drawn by a PDF processor on top of the page content, it is possible that their presence may cause a page without any transparency to acquire some transparency. Therefore, all annotations object's in the page dictionary's Annots array shall have their appearance streams processed as a form XObject, according to Q.3, "Form XObjects".
This neglects the new BM key introduced by PDF 2.0 in Table 166 which is described as "The blend mode that shall be used when painting the annotation onto the page ..."
An additional sentence should be added describing the BM key when its value is not Normal.
Section 8.6.5.6 - Default color spaces defines the following cases when the default color spaces are applicable as:
A colour space is selected for painting each graphics object. This is either the current colour space parameter in the graphics state or a colour space given as an entry in an image XObject, inline image, or shading dictionary.
On the other hand, we have C entry in the annotation dictionary (Table 166) specifying a number of cases when Device colors are used to define the appearance of the annotation. Currently there is no mechanism available to remap these device colors to any device-independent ones.
This is not an issue if the annotation has an appearance stream, which is required for most cases in PDF 2.0. But, for example, Link annotations do not require AP entry and yet use the value of C entry to draw the border. Another case is when the annotation is modified and the appearance stream has to be recreated.
One way to resolve this would be to state that default color spaces defined in the page resource dictionary are applicable also to color spaces used in C and IC entries of the annotation dictionaries on that page.
This is especially important for PDF/A and PDF/X standards which forbid the use of Device colors with undefined matching output profile.
Clause 8.9.7 Inline images, Table 91 establishes that many keys in an inline image header pseudo-dictionary can be abbreviations of longer named keys in an Image XObject. But what happens if an inline image has both the full key name and its abbreviation (and they have different values)?
Is this a "duplicate key" and an error? In which case the clause 7.3.7 wording "Multiple entries in the same dictionary shall not have the same key" is inadequate as they are not actually the same key names (just logically/semantically).
We do NOT want to have "first key in dict" / "last key in dict" logic as that would also clearly contradict everywhere else in PDF and clause 7.3.7 "That ordering shall be ignored."
Or does one form of the key take precedence? e.g. long-form over abbreviation? Or vice-versa?
Or do we call this case out explicitly as an error in 8.9.7 and state that the inline image "shall" be skipped in any rendered output.
My preference is for some type of "resolve and continue processing" handling so that the inline image might hopefully get painted.
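One possible shape for such "resolve and continue" handling is sketched below. It is order-independent, in line with 7.3.7, and lets the full key name win over its abbreviation; that precedence is purely an assumption for illustration, not something the specification states. The function name is hypothetical and the abbreviation map may be incomplete.

```python
ABBREVIATIONS = {
    "BPC": "BitsPerComponent",
    "CS": "ColorSpace",
    "D": "Decode",
    "DP": "DecodeParms",
    "F": "Filter",
    "H": "Height",
    "IM": "ImageMask",
    "I": "Interpolate",
    "W": "Width",
}


def normalise_inline_image_dict(entries):
    """Map abbreviated keys to full names; report keys given twice with different values."""
    resolved, conflicts = {}, []
    # Pass 1: take full-name keys first so the result never depends on entry order.
    for key, value in entries.items():
        if key not in ABBREVIATIONS:
            resolved[key] = value
    # Pass 2: fold in abbreviations; the full name wins on conflict (assumed policy).
    for key, value in entries.items():
        full_name = ABBREVIATIONS.get(key)
        if full_name is None:
            continue
        if full_name in resolved:
            if resolved[full_name] != value:
                conflicts.append(full_name)
            continue
        resolved[full_name] = value
    return resolved, conflicts


resolved, conflicts = normalise_inline_image_dict({"W": 10, "Width": 20, "H": 5, "BPC": 8})
```

A processor could then log the conflicting keys and still paint the image, or treat a non-empty conflict list as an error if that is what the clause ends up requiring.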
The first sentence of the first para under Table 40 says:
"Values of Domain shall constrain x in such a way that if N is not an integer, all values of x will be nonnegative,
and if N is negative, no value of x will be zero."
We have a sample file that has opened a debate about whether those are two separate statements, or if the second clause is a follow on from the first.
In other words, which of these is it?
a) "Values of Domain shall constrain x in such a way that:
b) "Values of Domain shall constrain x in such a way that if N is not an integer, all values of x will be nonnegative,
and if N is a negative integer, no value of x will be zero."
My reading would be a), but a couple of editing applications seem to be happy to retain a negative non-integer value of N in combination with a Domain of [ 0 1 ] when updating files.
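The difference between the two readings can be made concrete with a small check. This is only a sketch: `domain_allowed` is a hypothetical helper, and reading a) is taken to mean two independent conditions, which is an interpretation of the question above rather than quoted text.

```python
def domain_allowed(n, domain, reading):
    """Check a Type 2 function's N against its Domain under reading 'a' or 'b'."""
    lo, hi = domain
    non_integer = not float(n).is_integer()
    contains_zero = lo <= 0 <= hi
    # Common to both readings: a non-integer N requires all x to be nonnegative.
    if non_integer and lo < 0:
        return False
    if reading == "a":
        zero_forbidden = n < 0                      # any negative N
    elif reading == "b":
        zero_forbidden = n < 0 and not non_integer  # negative integers only
    else:
        raise ValueError("reading must be 'a' or 'b'")
    return not (zero_forbidden and contains_zero)


# The case from the sample file: negative non-integer N with Domain [ 0 1 ].
under_a = domain_allowed(-0.5, (0, 1), "a")
under_b = domain_allowed(-0.5, (0, 1), "b")
```

Under reading a) that combination is rejected, while under reading b) it is accepted, which matches the behaviour of the editing applications mentioned.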
Thanks
Martin
Clause 9.6.2.2 Standard Type 1 fonts (standard 14 fonts) (PDF 1.0-1.7) ends with the following text:
PDF processors supporting PDF 1.0 to PDF 1.7 files shall have these fonts, or their font metrics and suitable substitution fonts, available.
These fonts, or their font metrics and suitable substitution fonts, shall be available to the PDF processor.
Editorial issue: the second sentence duplicates the first sentence. Suggestion: delete second sentence.
Table 47 — Entries in a collection subitem dictionary
"Default: None" is stated for both D and P keys. Both D and P allow text-strings as a valid type. "None" is also capitalized and italic implying it is a value.
Does this mean the default is a text-string with the value None (as in "(None)") - or does it mean that there is no default specified?
If the latter then suggest deleting both Default statements as this is the usual way of indicating there is no default value. I believe this is what is intended.
In Table 166 AP is not required in a couple of cases, but immediately after the table the text says "A PDF reader shall render the appearance dictionary without regard to any other keys and values in the annotation dictionary and shall ignore the values of the C, IC, Border, BS, BE, BM, CA, ca, H, DA, Q, DS, LE, LL, LLE, and Sy keys."
So what should the renderer do if there is no AP?
For a zero-size annotation I recommend that it doesn't render anything, and maybe the requirement that the border is drawn completely inside the annotation rectangle (12.5.4 para 1) is enough to state that anyway.
I think for Popup and Projection annotations the correct behaviour is stated adequately elsewhere, so the lack of any allowance for a missing AP here simply muddies the water. It's less clear that there is a clear statement for Link annotations anywhere. Did I miss something?
A good start would be to amend the para after Table 166 to start "For all annotations containing an AP entry, a PDF reader ...", and then amending the following note to say "Requiring an appearance dictionary for most annotations ..."
A large number of places (49 to be precise – see attached XLSX file) in ISO 32000-2 describe keys as being “meaningful” under certain conditions. The term is vague: it is unclear what “meaningful” really means, how or when it could be codified or validated, or whether it has exactly the same meaning in every case.
Clause 11.6.5.2, "Table 143 — Restrictions on the entries in a soft-mask image dictionary" defines a number of image XObject keys as being "Ignored", yet this restriction for soft-masks is neither mentioned in nor cross-referenced from "Table 87 — Additional entries specific to an image dictionary".
Specifically for Table 87 keys: Intent, Alternates, Name, ID, StructParent.
Table 143 also says SMask key shall be absent, but Table 87 doesn't mention this.
Possible solutions that are simple, not overly wordy, and do not duplicate technical requirements include:
adding a reference to 11.6.5.2 Soft-mask images to the NOTE above Table 87
adding "Additional limitations also apply to this key when used in soft-mask image dictionaries - see clause 11.6.5.2 Soft-mask images." to each of the above keys in Table 87.
ISO 32000-1:2008, Table 220, identifies the T key as Optional. ISO 32000-2:2020, Table 226, shows the same T key as Required. However, 12.7.4.2 says:
A field dictionary that does not have a partial field name (T entry) of its own shall not be considered a field but simply a Widget annotation.
This clearly implies that a field dictionary need not have a T key.
I don't know how Optional became Required, but I consider it a typo and we need to put it back.
Table 117 — Character collections for predefined CMaps, by PDF version
The information in "Table 117 — Character collections for predefined CMaps, by PDF version" was eliminated in ISO 32000-2. In ISO 32000-1:2008 this table contained the information in which PDF version which character collection was first introduced.
In ISO 32000-2 the contents of the table were replaced with the following text:
Table intentionally empty to retain table numbering in this document (2020). Information is now located in the appropriate normative reference for each character collection.
According to section "2 Normative references" the "appropriate normative reference for each character collection" is located in the Adobe CMap Resources GitHub repository.
Nowhere in that repository can I find any resource that contains the information that was formerly available in Table 117.
The contents of Table 117 should either be restored and extended for PDF 2.0, or the README.md file in the CMap Resources GitHub repository should be amended with this information.
In 9.6.2.1, Table 109, the entry for FontDescriptor includes the text:
For the standard 14 fonts, the entries FirstChar, LastChar, Widths, and FontDescriptor shall either all be present or all be absent. Ordinarily, these dictionary keys may be absent; specifying them enables a standard font to be overridden; see 9.6.2.2, "Standard Type 1 fonts (standard 14 fonts) (PDF 1.0-1.7)".
For PDF 2.0, all of those fields are marked required, so there is NEVER a case where they can "all be absent".
I would recommend we simply remove the paragraph.
Do all non-repeated attributes from multiple attribute objects with a repeated owner apply to a structure element?
14.7.6.1, Paragraph 1 defines:
Paragraph 3 ("When an array...") states that when the owner is repeated and a given attribute (as opposed to attribute object) is also repeated, the later entry takes precedence. I interpret "entry" to mean a single attribute, not an array entry. This implies that the following two arrays of attribute objects on a TD structure element should be equivalent:
[ << /O /Table /RowSpan 2 >> << /O /Table /ColSpan 2 >> ]
[ << /O /Table /RowSpan 2 /ColSpan 2 >> ]
In contrast, I have seen two different PDF processors treat the first array as equivalent to specifying only one of the two attribute objects, although admittedly this happened on a PDF 1.5 document. I'm not sure what is the correct interpretation here.
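The per-attribute interpretation described above can be sketched as follows. This is an illustration of that reading only, not a statement of what the specification requires. It assumes attribute objects are already parsed into Python dicts with an "O" owner entry, and it ignores attribute revision numbers, which may also appear in such arrays. The function name `merge_attributes` is hypothetical.

```python
def merge_attributes(attribute_objects):
    """Merge attribute objects per owner, attribute by attribute.

    When the same owner repeats, a later value replaces an earlier
    one only for the attribute that is itself repeated; attributes
    that are not repeated are all kept.
    """
    merged = {}
    for obj in attribute_objects:
        owner = obj["O"]
        owner_attrs = merged.setdefault(owner, {})
        for key, value in obj.items():
            if key == "O":
                continue  # the owner entry is not an attribute
            owner_attrs[key] = value  # later entry takes precedence
    return merged


split_form = [
    {"O": "Table", "RowSpan": 2},
    {"O": "Table", "ColSpan": 2},
]
combined_form = [
    {"O": "Table", "RowSpan": 2, "ColSpan": 2},
]
```

Under this reading, `merge_attributes(split_form)` and `merge_attributes(combined_form)` produce the same result, with both RowSpan and ColSpan applying to the TD element. The processors mentioned above appear instead to replace the whole attribute object when the owner repeats.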
Proposed Solutions (mutually exclusive). Replace the final sentence in paragraph 3 with: