cfrg / draft-irtf-cfrg-hpke

Hybrid Public Key Encryption

License: Other

Makefile 100.00%

draft-irtf-cfrg-hpke's Issues

Point validation?

If receivers don't validate ephemeral keys (point on the curve, and in the right subgroup), what can go wrong? An active and malicious initiator could, for example, use that to learn the responder's private key: https://safecurves.cr.yp.to/twist.html
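To make the "point on the curve" part of the question concrete, here is a minimal sketch of an on-curve check for P-256 in Python. The function name `is_on_p256` is hypothetical and not taken from the draft; the constants are the standard P-256 domain parameters. Since P-256 has cofactor 1, a point that passes this check and is not the point at infinity is already in the prime-order group; curves with a cofactor would need an additional subgroup check that this sketch does not cover.

```python
# Sketch only: partial public-key validation for P-256
# (coordinate range check plus the curve equation).
P = 2**256 - 2**224 + 2**192 + 2**96 - 1  # field prime
A = P - 3                                  # curve coefficient a = -3 mod p
B = 0x5AC635D8AA3A93E7B3EBBD55769886BC651D06B0CC53B0F63BCE3C3E27D2604B


def is_on_p256(x: int, y: int) -> bool:
    """Return True if (x, y) satisfies y^2 = x^3 + a*x + b (mod p)."""
    # Reject coordinates that are not reduced field elements.
    if not (0 <= x < P and 0 <= y < P):
        return False
    return (y * y - (x * x * x + A * x + B)) % P == 0
```

A receiver would run a check like this on the decoded ephemeral key before using it in a DH computation and reject the message if it fails.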

Avoid confusing normative language

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 5.1.3: The lower-case "must" in "the sender must be the other" might be confused with normative "MUST". Suggest using a different word or changing to the normative form.

Clarify "formally verified"

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Should "formally verified" be "proven secure under standard cryptographic assumptions"? Or is the intent indeed to enable tools that check correctness of an implementation?

Fix some nits

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

"assumed that the sender" --> "assured that the sender"

Section 8.2:

"KEM public key pkR" --> "KEM public key "pkR""

"ciphertext enc" --> "encapsulated key enc" (two occurrences). "Ciphertext" is used elsewhere in the draft to refer to the AEAD output.

Section 8.3: There is a non-normative (lower-case) "should" in the first sentence. (Contrasting against a normative/upper-case "SHOULD" in the first sentence of 8.4.) Should this "should" be "SHOULD"?

Section 8.7: There are missing quotes around "(enc2, ciphertext2, enc, ciphertext)".

Bind DHKEM labels to the group

Section 4.1: this may be paranoia, but it would be slightly nicer to include
the DH group name in the label arguments of LabeledExtract and LabeledExpand
to ensure that invocations from different DHKEM instantiations are orthogonal.

Add CCM ciphersuites

... for use in ECHO.

AES-GCM-128: https://tools.ietf.org/html/rfc5116
AES-CCM-128 (8-byte IV): https://tools.ietf.org/html/rfc6655

We should reference where these AEADs are defined, too (5116 for GCM, 8439 for ChaCha20Poly1305).

Note that DHKEM's Unmarshal function can fail

Ambiguity about Secret Export

Section 5.3 explains secrets are exported with the KDF Expand function but the included code in the same section now calls LabeledExpand with a "sec" label.

The JSON test vectors contain sample results for the export function, but they match an unlabeled implementation with Expand and not LabeledExpand.

Mismatch on zz length for P-256

zz, as computed by Decap, is indicated to have length Npk https://github.com/cfrg/draft-irtf-cfrg-hpke/blob/master/draft-irtf-cfrg-hpke.md#dh-based-kem.
This does not match the length of the test vector provided for P-256 https://github.com/cfrg/draft-irtf-cfrg-hpke/blob/master/draft-irtf-cfrg-hpke.md#base-setup-information-2

pkSm does nothing in KeySchedule

I was looking through a code coverage map of my implementation and realized that default_pkSm is never actually used anywhere. KeySchedule takes in a pkSm and uses it to sanity-check the given mode, and then uses it nowhere in the key schedule itself. Should pkSm be removed as an argument?

Add domain separation for expanded secrets

We currently have none!

Use I2OSP instead of encode_big_endian

See https://tools.ietf.org/html/rfc8017.
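For reference, I2OSP as described in RFC 8017 is an unsigned, fixed-length, big-endian encoding that fails when the integer does not fit. A minimal Python sketch follows; the function name `i2osp` and the choice of `ValueError` are illustrative assumptions, not part of the draft.

```python
def i2osp(x: int, length: int) -> bytes:
    """Encode a nonnegative integer as exactly `length` big-endian octets."""
    if x < 0:
        raise ValueError("integer must be nonnegative")
    if x >= 256 ** length:
        # RFC 8017 specifies an "integer too large" error for this case.
        raise ValueError("integer too large")
    return x.to_bytes(length, "big")


print(i2osp(1, 4).hex())  # 00000001
```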

Clarify additional key material in authenticated modes

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 5:
"we include two authenticated variants .": We would also suggest mentioning that these variants also contribute additional keying material to the encryption operation. See also discussion in Section 8.1.

After the sentence, "the constructions described here presume .", mention that the recipient also needs a way to determine which of its public keys was used for the encapsulation operation (if the recipient has more than one public key). Also add a reference to Section 9 which addresses the corresponding issues for message encoding.

Clarify directionality of HPKE with multiple encryptions

There's currently a single sequence number space that's incremented by 1 for each message encrypted. This implies that only the initiator can encrypt messages to the receiver, else we risk key/nonce re-use. We should be clear about this in the draft!

Lengths of PSKs in test vectors don't match requirements

The PSKs used in the test vectors currently are of length 6 bytes but should be 32 bytes for HKDF-SHA256 and 64 bytes for HKDF-SHA512.

Fix some references

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

[ANSI]: Add "X9.63" to title.

[BNT19] and other references as needed: Add authors' names.

[MAEA10]: Use "authoritative" URI for long-term stability: https://ieeexplore.ieee.org/abstract/document/5604194/.

Ambiguous Nzz definition, possible wrong value for P-521

Table 7.1 specifies an Nzz value of 64 for DHKEM(P-521, HKDF-SHA512). The test vectors are using a value of 66, which seems right given:

Nzz: The length in bytes of a shared secret produced by the algorithm.

(in context of KEM identifiers, not KDF).

But, 4.1 also states:

For the variants of DHKEM defined in this document, Ndh is equal to Npk, and the output length of the KDF's Extract function is Nzz bytes.

We should clarify whether Nzz is based on the ECDH or HKDF-Extract output length, and fix the test vectors if necessary.

Guidance for future KEMs

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

As guidance for future revisions, we would recommend adding a section about the issues that need to be considered when adding support for other KEMs. There will presumably be industry interest in including post-quantum KEMs (as anticipated in Sec. 8.1), and there may also be interest in including RSA-based KEMs, for legacy support. The technical subtleties in adding such mechanisms include:

Assumptions about the relationship between the private key and the public key and the definition of the "pk()" function. For instance, GenerateKeyPair, listed as part of a KEM in Section 4, doesn't really need to be part of one (it's not part of RSA-KEM).
Assumptions about the length of the public key. It may not always be a fixed value, "Npk", for a KEM with a given set of parameters. The other (and unrelated) "hybrid" draft, draft-ietf-tls-hybrid-design, Section 3.2, makes accommodation for public keys associated with a given set of parameters to vary in size.

Clarify DH-only KEMs in the abstract

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

The draft only specifies a Diffie-Hellman-based KEM (Section 4.1). To set expectations for the implementer, we recommend stating this limitation in the abstract, e.g., by adding "based on elliptic curve Diffie-Hellman key agreement" at the end of the last sentence of the abstract.

Consider static DH oracles

Do we need to be concerned about them? If receivers don't validate ephemeral keys (point on the curve, and in the right subgroup), what can go wrong?

Inconsistent naming of mode AuthPSK

Both the constant mode_psk_auth and function names like SetupAuthPSKR are used. That's inconsistent because the order of psk and auth is different. Suggestion: change the constant to be mode_auth_psk, because there are more function names that would need to be changed otherwise.

Nits from Riad

Section 5.1.3: it would be nice to include a reference or citation for
unknown key share attacks.

Section 5.2: is there a reason to put the word "amortize" in quotes?

Section 7.1.2: it might be worth mentioning here that [keyagreement] also
includes checking that the public key is not the identity point.

Section 7.1.2: is there a reason to recommend either checking for a nonzero
scalar or checking for a non-identity DH output? Checking the latter covers
the former and also covers the check from my prior comment. Moreover, it is
not clear to me that checking the scalar is useful for the recipient, since
this is essentially just checking that their long-term secret is nonzero.

Section 8.1: the sentence "In particular, the KDFs and DH groups..." might
want to clarify that this statement is true only when these primitives are
used as specified. The concern is that HKDF is only indifferentiable under
some restrictions on salt length (for reasons noted in Section 8.3).

Clarify test vector labels

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Appendix: "pkR", "pkS" values are given. These are presumably the same as the marshalled versions "pkRm", "pkSm"; this should be stated for completeness. (In contrast, both "pkE" and the equivalent "enc" are shown.)

Clarify pseudocode and define undefined operands

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 5.2.:

The symbol "<<" isn't defined, but assuming it means "shift left by a specified number of bits", the number of bits to shift should be "8*Nn" rather than "Nn".

Does "overflow" in the third paragraph refer to the same condition as "wrap" in the fifth paragraph? If so, the text should be combined and a single term used for consistency. If not, the differences between the two requirements should be explained. We would also suggest adding a note indicating that the reference code assumes the sequence number is the same length as the nonce.

The use of "Nonce" (capitalized) as a function and "nonce" (lower case) as a value may be confusing. We suggest instead that the function be named "ComputeNonce" or similar.

On a similar object-oriented programming note, it should be stated that the underlying "Seal" and "Open" functions are the ones determined by the "aead_id" property.

Contents of a context aren't well-defined

This is just an editorial comment. The spec doesn't seem to define the contents of a "context" very clearly. Section 5 says:

A "context" encodes the AEAD algorithm and key in use, and manages the nonces used so that the same nonce is not used with multiple plaintexts.

But it also has an exporter secret. Then section 5.1 says:

return Context(key, nonce, exporter_secret)

But we haven't defined the Context function yet. I'm guessing the intent is that Context produces some sort of record type with those field names? But that wouldn't initialize Context.seq used later in 5.2. (Confusingly, this Context is distinct from the context variable which contains an HPKEContext structure. Maybe the latter could be renamed?)

Then 5.2 lists out the contents of a "context" more explicitly, including the first mention of a sequence number. But it omits the exporter secret again. (Should "The sender's context MUST be used for encryption only. Similarly, the recipient's context MUST be used for decryption only." be rephrased? One could read that as saying export is also not okay.)

Then 5.3 mentions a context having an "exporter secret", but this is actually the only instance of that phrase in the document.

Inconsistent use of X25519 vs Curve25519

The HPKE draft refers to "Curve25519" and "DHKEM(Curve25519, HKDF-SHA256)" throughout the draft, but then section 8.8 mentions DHKEM-X25519.

I believe X25519 is correct here. RFC7748 defines "curve25519" as a particular Montgomery curve. It then defines "X25519" as a Diffie-Hellman primitive on top of curve25519, with particular encodings and everything else. HPKE is using the Diffie-Hellman primitive, so it should use X25519. As a bonus, it's shorter and "DHKEM(Curve25519, HKDF-SHA256)" is already a mouthful. :-)

Outdated reference

In "DH-Based KEM" the paragraph

The GenerateKeyPair, Marshal, and Unmarshal functions are the same as for the underlying DH group. The Marshal functions for the curves referenced in {#ciphersuites} are as follows:

references the #ciphersuites section that no longer seems to exist.

Clarify KEM shared secret for AuthEncap/Decap

Section 4: the definitions of AuthEncap and AuthDecap contain words to the
effect, 'the KEM shared secret key is known only to the holder of the
private key "skS".' It would be more accurate to say, 'the KEM shared
secret key was generated by the holder of the private key "skS"'.

Add negative test vectors

The draft only has success test vectors. Negative ones would be good, too.

Add an exporter

It may be desirable to export a secret, as with the TLS exporter. Adding such a feature would add a bit of complexity, and dilute the focus on PKE.

Caveat in test vectors

The plaintext is always the same, but the nonces and AADs differ by just one bit, which is hard to spot and easily missed.

Harmonize label values

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

The "label" argument to LabeledExtract is being used in some cases to identify the output, in one case to identify the input, and in one case to identify the intent. We suggest harmonizing on the former, and also consistently suffixing the output variable name with "_hash" when the purpose of the extraction is to produce a hash of the input. This would result in the following statements being updated:

info_hash = LabeledExtract(zero(Nh), "info_hash", info) // new label

psk_hash = LabeledExtract(zero(Nh), "psk_hash", psk) // new output name

secret = LabeledExtract(psk_hash, "secret", zz) // new input name and label

Mismatch on psk length for SHA256

At the moment, the length of psk in the test vectors is the same for SHA256 and SHA512. As far as I can tell, psk should only contain 32 bytes for SHA256 instead of 64.
Additionally, psk, pskID and pkS are not needed in the Base setup (https://github.com/cfrg/draft-irtf-cfrg-hpke/blob/master/draft-irtf-cfrg-hpke.md#base-setup-information for instance)

Shared secret size for P-256

It's currently 32 bytes, i.e., just the x-coordinate of the point. But Npk suggests it should be a fully-encoded public key. Which do we prefer?

cc @blipp @bifurcation

(Thanks to Michael Scott for raising this!)

Nenc and Npk for P-521 are inconsistent within the draft

In Section “DH-Based KEM”:

* P-521: The X-coordinate of the point, encoded as a 66-octet
  big-endian integer

In “Algorithm Identifiers” > “Key Encapsulation Mechanisms”:

| Value  | KEM               | Nenc | Npk | Reference      |
|:-------|:------------------|:-----|:----|:---------------|
| 0x0012 | DHKEM(P-521)      | 65   | 65  | {{NISTCurves}} |
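As a quick sanity check on these numbers, the sketch below computes coordinate and point lengths from the field size, assuming the SEC 1 uncompressed point format (one prefix octet followed by the X and Y coordinates). The function names are hypothetical. Under that assumption, 65 is the uncompressed length for P-256, while P-521 comes out at 133.

```python
def coordinate_len(field_bits: int) -> int:
    """Octets needed to hold one field element of the given bit size."""
    return (field_bits + 7) // 8


def uncompressed_point_len(field_bits: int) -> int:
    """Length of a SEC 1 uncompressed point: 0x04 || X || Y."""
    return 1 + 2 * coordinate_len(field_bits)


for name, bits in (("P-256", 256), ("P-521", 521)):
    print(name, coordinate_len(bits), uncompressed_point_len(bits))
# P-256 32 65
# P-521 66 133
```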

Updated test vectors do not match the spec

While updating labels for draft-03, I noticed that the test vectors added in 5bc57ba seem to be incorrect, matching an implementation that:

Uses the label "info_hash" to extract info_hash, when draft-03 specifies "info".
Uses the label "psk" to extract psk, when "psk_hash" is specified.

With the "_hash" suffix swapped as above, my implementation generates matching outputs.

Document security properties

For example, the base mode does not provide KCI resistance.

Include mode as KeySchedule input

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 5.1:

"mode" should also be listed as a key schedule input.

Link of reference [SECG] broken

Hi, the link of reference “[SECG] Elliptic Curve Cryptography, Standards for Efficient Cryptography Group, ver. 2” seems broken. Instead of http://www.secg.org/download/aid-780/sec1-v2.pdf the following seems to work: https://secg.org/sec1-v2.pdf

KeySchedule notation issues

In the current draft (looking at branch master):

   def KeySchedule(mode, pkRm, zz, enc, info, psk, pskID, pkIm):
     VerifyMode(mode, psk, pskID, pkI)

     pkRm = Marshal(pkR)

pkRm is given as a parameter to KeySchedule but is also calculated from pkR inside it (suggestion: remove the line calculating it).
pkI is passed to VerifyMode; it should be pkIm.

From @dwd and @blipp

Add some color to post quantum proof discussion

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 8.1: "A full proof of post-quantum security .". Although we understand that a full proof of post-quantum security may not be achievable within the timeline of this draft's publication, we would nevertheless recommend some additional discussion on what might be desirable to prove. In the draft, the PSK is employed as an authentication factor, so presumably the proof being contemplated would be that authentication in the modes involving PSKs remains secure against a quantum computer. A stronger property would be more attractive: that encryption in the PSK modes remains secure against a quantum computer, whether the KEM itself is post-quantum or not. If the authors consider this property plausible, then it should be mentioned here as a goal for security analysis. If not, then the reasons for not targeting this property should also be given.

Add acknowledgements

Benjamin Lipp, David Benjamin, Benjamin Beurdouche, Riad Wahby, Kevin Jacobs, Michael Rosenberg, Michael Scott, Raphael Robert, and probably more!

Issues in test vectors

Two issues I found in the test vectors:

kemID is incorrect, e.g., for Curve25519 the test vectors have kemID: 1, but the draft specifies 2.
Sequence numbers for generating nonces look off by one.
For DHKEM(Curve25519), HKDF-SHA256, AES-GCM-128, the initial nonce is 0d8e01f89fa5abab107f7fe9, but the nonce used in the first encryption (sequence number 0) is 0d8e01f89fa5abab107f7fe8 - the initial one XOR 1.
As I understand the spec says it should be XOR 0.
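The expected behaviour can be reproduced with a short Python sketch, assuming the nonce for a message is the initial nonce XORed with the sequence number encoded big-endian to the nonce length. The function name `compute_nonce` is illustrative; the hex values are the ones quoted in this issue.

```python
def compute_nonce(base_nonce: bytes, seq: int) -> bytes:
    """XOR the base nonce with seq, encoded big-endian to the nonce length."""
    nn = len(base_nonce)
    # The sequence number must fit in 8 * Nn bits.
    if seq < 0 or seq >= 1 << (8 * nn):
        raise ValueError("sequence number does not fit in the nonce length")
    seq_bytes = seq.to_bytes(nn, "big")
    return bytes(a ^ b for a, b in zip(base_nonce, seq_bytes))


base = bytes.fromhex("0d8e01f89fa5abab107f7fe9")
print(compute_nonce(base, 0).hex())  # 0d8e01f89fa5abab107f7fe9
print(compute_nonce(base, 1).hex())  # 0d8e01f89fa5abab107f7fe8
```

With sequence number 0 the nonce equals the initial nonce, so the value ending in e8 corresponds to sequence number 1.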

Typo for Single-Shot and clarification on AEAD binding to Context.

In section 5.1, might be useful to have return KeySchedule/Context take in the AEAD, to make it clear that a Context is bound to a particular AEAD.
In section 6, Seal calls SetupI instead of SetupS.

Clarify "hybrid" in the introduction

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Asymmetric and symmetric algorithms have been combined since the 1980s, e.g., in Privacy-Enhanced Mail [RFC1113], so a hybrid approach (in the sense of combining the two) can by now be considered the "tradition" of public-key cryptography. We would therefore suggest replacing the first sentence with the following:

Encryption schemes that combine asymmetric and symmetric algorithms have been specified and practiced since the early days of public-key cryptography (e.g., [RFC1113]). Combining the two brings the "best of both worlds": the key management advantages of asymmetric cryptography and the performance benefits of symmetric cryptography. However, the traditional combination has been "encrypt the symmetric key with the public key." "Hybrid" public-key encryption schemes (HPKE), specified here, take a different combination, "generate the symmetric key and its encapsulation with the public key." .

Limits on Inputs to LabeledExtract and LabeledExpand

I'm trying to remove all allocation from my implementation, and there's really only 1 snag I'm hitting: LabeledExtract and LabeledExpand do a concat operation before passing to their respective HKDF functions, and there isn't always an upper bound on the size of the concatenated result. Specifically, there's

KeySchedule(info, psk, pskID):
    LabeledExtract(..., info)
    LabeledExtract(..., psk)
    LabeledExtract(..., pskID)
Context.Export(exporter_context):
    LabeledExpand(..., exporter_context, ...)

If there were a (reasonably small) upper bound on the sizes of info, psk, pskID, and exporter_context, then it would be trivial to implement HPKE without allocation.

I've thought about "streaming" the input into the above functions, instead of sending a concatenated bytestring. This could theoretically work for HKDF-Extract with SHA256, since it's an MD hash, but this doesn't work generically. Also the definition of HKDF-Expand does not admit a way to stream in the info string.
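To illustrate the streaming idea for the Extract case only: RFC 5869 defines HKDF-Extract(salt, IKM) as HMAC-Hash(salt, IKM), so an HMAC API with incremental updates can absorb the label and the input piece by piece. The Python sketch below compares that with the concatenating version. The function names and the label bytes are placeholders, not the draft's actual label encoding, and this says nothing about KDFs that are not HMAC-based.

```python
import hashlib
import hmac


def extract_concat(salt: bytes, label: bytes, ikm: bytes) -> bytes:
    """One-shot variant: builds label || ikm in memory first."""
    return hmac.new(salt, label + ikm, hashlib.sha256).digest()


def extract_streamed(salt: bytes, label: bytes, ikm_chunks) -> bytes:
    """Incremental variant: feeds the label, then each chunk, without concatenating."""
    mac = hmac.new(salt, digestmod=hashlib.sha256)
    mac.update(label)
    for chunk in ikm_chunks:
        mac.update(chunk)
    return mac.digest()


salt = bytes(32)
label = b"example-label"  # placeholder label
info = b"some application-supplied info string"
same = extract_concat(salt, label, info) == extract_streamed(salt, label, [info[:10], info[10:]])
print(same)  # True
```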

ExtractAndExpand input parameters

From Michael Scott:

A minor observation. In ExtractAndExpand the salt parameter is zero(Nh).

In fact this is the same as using zero(0), as HMAC internally pads this up to a blocksize of zeros.

So for example if using SHA512 and Nh=64, the hash blocksize is 128, and zero(0) gets padded up to 128 zeros, as does zero(64). In fact the parameter to zero(.) is irrelevant.

We might consider zero(2*Nh) or zero(0). What do you think, @blipp?
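The observation is easy to check with Python's standard `hmac` module, treating HKDF-Extract(salt, IKM) as HMAC(salt, IKM) per RFC 5869. The helper name `extract_sha512` and the sample input are illustrative only. For SHA-512, all-zero salts of length 0, 64 (Nh) and 128 (2*Nh, which equals the block size) give the same output; only a salt longer than the block size is hashed first and so behaves differently.

```python
import hashlib
import hmac


def extract_sha512(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract with SHA-512, i.e. HMAC-SHA-512 keyed with the salt."""
    return hmac.new(salt, ikm, hashlib.sha512).digest()


ikm = b"example input keying material"
outputs = {n: extract_sha512(bytes(n), ikm) for n in (0, 64, 128)}
print(len(set(outputs.values())))  # 1
```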

Cite Shoup for identity misbinding prevention in 8.2

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

"avoid identity mis-binding issues": Perhaps also note that including the public key and the encapsulated key as inputs to key derivation can help with the security proof. [Shoup] makes this observation in Section 15.6.1.

[Shoup] @Article{shoup2001proposal,
title={A proposal for an ISO standard for public key encryption (version 2.1)},
author={Shoup, Victor},
journal={IACR e-Print Archive},
volume={112},
year={2001}
}

Unnecessary return value in Decap(), AuthDecap()

The spec definition of Decap() includes taking enc as a parameter, and returning it unmodified:

   def Decap(enc, skR):
     pkE = Unmarshal(enc)
     dh = DH(skR, pkE)

     pkRm = Marshal(pk(skR))
     kemContext = concat(enc, pkRm)

     zz = ExtractAndExpand(dh, kemContext)
     return zz, enc

Only return zz is needed. The same applies to AuthDecap.

Clarify unsigned property of encode_big_endian

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 3: Definition of encode_big_endian: Add "unsigned" before "integer" if this is the intent (so that the set of encodable n-byte integers clearly includes 0 through 2^{8n}-1).
