cfrg / draft-irtf-cfrg-hpke
Hybrid Public Key Encryption
License: Other
If receivers don't validate ephemeral keys (point on the curve, and in the right subgroup), what can go wrong? An active and malicious initiator could, for example, use that to learn the responder's private key: https://safecurves.cr.yp.to/twist.html
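A minimal sketch of the receiver-side check the question refers to, for P-256 only. It assumes the ephemeral key has already been decoded into affine coordinates (x, y); the function name is_valid_p256_point is hypothetical. P-256 has cofactor 1, so an on-curve affine point is already in the right subgroup; curves with a larger cofactor would need an additional subgroup check.

```python
# NIST P-256 domain parameters.
P = 2**256 - 2**224 + 2**192 + 2**96 - 1
A = P - 3
B = 0x5AC635D8AA3A93E7B3EBBD55769886BC651D06B0CC53B0F63BCE3C3E27D2604B
# Base point, used here only as a known-good example input.
GX = 0x6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296
GY = 0x4FE342E2FE1A7F9B8EE7EB4A7C0F9E162BCE33576B315ECECBB6406837BF51F5


def is_valid_p256_point(x: int, y: int) -> bool:
    """Hypothetical helper: True if (x, y) is an affine point on P-256."""
    # Coordinates must be canonical field elements.
    if not (0 <= x < P and 0 <= y < P):
        return False
    # Curve equation: y^2 = x^3 + a*x + b (mod p).
    return (y * y - (x * x * x + A * x + B)) % P == 0
```

A receiver would run such a check on the decoded ephemeral key before using it in DH, and reject the message if it fails.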
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Section 5.1.3: The lower-case "must" in "the sender must be the other" might be confused with normative "MUST". Suggest using a different word or changing to the normative form.
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Should "formally verified" be "proven secure under standard cryptographic assumptions"? Or is the intent indeed to enable tools that check correctness of an implementation?
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
"assumed that the sender" --> "assured that the sender"
Section 8.2:
"KEM public key pkR" --> "KEM public key "pkR""
"ciphertext enc" --> "encapsulated key enc" (two occurrences). "Ciphertext" is used elsewhere in the draft to refer to the AEAD output.
Section 8.3: There is a non-normative (lower-case) "should" in the first sentence. (Contrasting against a normative/upper-case "SHOULD" in the first sentence of 8.4.) Should this "should" be "SHOULD"?
Section 8.7: There are missing quotes around "(enc2, ciphertext2, enc, ciphertext)".
- Section 4.1: this may be paranoia, but it would be slightly nicer to include
the DH group name in the label arguments of LabeledExtract and LabeledExpand
to ensure that invocations from different DHKEM instantiations are orthogonal.
... for use in ECHO.
AES-GCM-128: https://tools.ietf.org/html/rfc5116
AES-CCM-128 (8-byte IV): https://tools.ietf.org/html/rfc6655
We should reference where these AEADs are defined, too (5116 for GCM, 8439 for ChaCha20Poly1305).
Section 5.3 explains secrets are exported with the KDF Expand function but the included code in the same section now calls LabeledExpand with a "sec" label.
The JSON test vectors contain sample results for the export function, but they match an unlabeled implementation with Expand and not LabeledExpand.
zz, as computed by Decap, is indicated to have length Npk: https://github.com/cfrg/draft-irtf-cfrg-hpke/blob/master/draft-irtf-cfrg-hpke.md#dh-based-kem
This does not match the length of the test vector provided for P-256: https://github.com/cfrg/draft-irtf-cfrg-hpke/blob/master/draft-irtf-cfrg-hpke.md#base-setup-information-2
I was looking through a code coverage map of my implementation and realized that default_pkSm is never actually used anywhere. KeySchedule takes in a pkSm and uses it to sanity-check the given mode, and then uses it nowhere in the key schedule itself. Should pkSm be removed as an argument?
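To illustrate the point, here is a sketch of a mode sanity check in which pkSm is consulted only to decide whether the inputs match the mode. The constants, default values, sizes (Nh = 32, Npk = 32), and the exact logic are assumptions for illustration and are not copied from the draft.

```python
# Assumed mode identifiers and sizes; hypothetical, for illustration only.
MODE_BASE, MODE_PSK, MODE_AUTH, MODE_AUTH_PSK = 0x00, 0x01, 0x02, 0x03
NH, NPK = 32, 32
DEFAULT_PSK = bytes(NH)
DEFAULT_PSK_ID = b""
DEFAULT_PKSM = bytes(NPK)


def verify_mode(mode: int, psk: bytes, psk_id: bytes, pk_sm: bytes) -> None:
    """Raise ValueError if the supplied inputs are inconsistent with mode."""
    got_psk = psk != DEFAULT_PSK and psk_id != DEFAULT_PSK_ID
    no_psk = psk == DEFAULT_PSK and psk_id == DEFAULT_PSK_ID
    got_pksm = pk_sm != DEFAULT_PKSM
    no_pksm = pk_sm == DEFAULT_PKSM

    # For each mode, the combination of inputs that is acceptable.
    acceptable = {
        MODE_BASE: no_psk and no_pksm,
        MODE_PSK: got_psk and no_pksm,
        MODE_AUTH: no_psk and got_pksm,
        MODE_AUTH_PSK: got_psk and got_pksm,
    }
    if mode not in acceptable:
        raise ValueError("unknown mode")
    if not acceptable[mode]:
        raise ValueError("inputs do not match mode")
```

In this sketch pk_sm never feeds into any derived value; it only gates the check, which is the behavior the question is about.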
We currently have none!
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Section 5:
"we include two authenticated variants .": We would also suggest mentioning that these variants also contribute additional keying material to the encryption operation. See also discussion in Section 8.1.
After the sentence, "the constructions described here presume .", mention that the recipient also needs a way to determine which of its public keys was used for the encapsulation operation (if the recipient has more than one public key). Also add a reference to Section 9 which addresses the corresponding issues for message encoding.
There's currently a single sequence number space that's incremented by 1 for each message encrypted. This implies that only the initiator can encrypt messages to the receiver, else we risk key/nonce re-use. We should be clear about this in the draft!
The PSKs used in the test vectors currently are of length 6 bytes but should be 32 bytes for HKDF-SHA256 and 64 bytes for HKDF-SHA512.
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
[ANSI]: Add "X9.63" to title.
[BNT19] and other references as needed: Add authors' names.
[MAEA10]: Use "authoritative" URI for long-term stability: https://ieeexplore.ieee.org/abstract/document/5604194/.
Table 7.1 specifies an Nzz value of 64 for DHKEM(P-521, HKDF-SHA512). The test vectors are using a value of 66, which seems right given:
Nzz: The length in bytes of a shared secret produced by the algorithm.
(in context of KEM identifiers, not KDF).
But, 4.1 also states:
For the variants of DHKEM defined in this document, Ndh is equal to Npk, and the output length of the KDF's Extract function is Nzz bytes.
We should clarify whether Nzz is based on the ECDH or HKDF-Extract output length, and fix the test vectors if necessary.
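The two candidate values can be reproduced with a short calculation. This sketch only shows where 66 and 64 come from; it does not decide which one Nzz should be. The helper name field_element_len is hypothetical.

```python
import hashlib


def field_element_len(bits: int) -> int:
    """Number of bytes needed to hold an unsigned integer of the given bit length."""
    return (bits + 7) // 8


# Length of a P-521 coordinate, i.e. of a raw ECDH output on that curve.
P521_COORD_LEN = field_element_len(521)

# Output length of HKDF-Extract when instantiated with SHA-512.
SHA512_OUT_LEN = hashlib.sha512().digest_size

print(P521_COORD_LEN, SHA512_OUT_LEN)
```

So a value of 66 corresponds to the raw ECDH output, while 64 corresponds to the HKDF-Extract output.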
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
As guidance for future revisions, we would recommend adding a section about the issues that need to be considered when adding support for other KEMs. There will presumably be industry interest in including post-quantum KEMs (as anticipated in Sec. 8.1), and there may also be interest in including RSA-based KEMs, for legacy support. The technical subtleties in adding such mechanisms include:
Assumptions about the relationship between the private key and the public key and the definition of the "pk()" function. For instance, GenerateKeyPair, listed as part of a KEM in Section 4, doesn't really need to be part of one (it's not part of RSA-KEM).
Assumptions about the length of the public key. It may not always be a fixed value, "Npk", for a KEM with a given set of parameters. The other (and unrelated) "hybrid" draft, draft-ietf-tls-hybrid-design, Section 3.2, makes accommodation for public keys associated with a given set of parameters to vary in size.
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
The draft only specifies a Diffie-Hellman-based KEM (Section 4.1). To set expectations for the implementer, we recommend stating this limitation in the abstract, e.g., by adding "based on elliptic curve Diffie-Hellman key agreement" at the end of the last sentence of the abstract.
Do we need to be concerned about them? If receivers don't validate ephemeral keys (point on the curve, and in the right subgroup), what can go wrong?
Both the constant mode_psk_auth and function names like SetupAuthPSKR are used. That's inconsistent because the order of psk and auth is different. Suggestion: change the constant to be mode_auth_psk, because there are more function names that would need to be changed otherwise.
- Section 5.1.3: it would be nice to include a reference or citation for unknown key share attacks.
- Section 5.2: is there a reason to put the word "amortize" in quotes?
- Section 7.1.2: it might be worth mentioning here that [keyagreement] also includes checking that the public key is not the identity point.
- Section 7.1.2: is there a reason to recommend either checking for a nonzero scalar or checking for a non-identity DH output? Checking the latter covers the former and also covers the check from my prior comment. Moreover, it is not clear to me that checking the scalar is useful for the recipient, since this is essentially just checking that their long-term secret is nonzero.
- Section 8.1: the sentence "In particular, the KDFs and DH groups..." might want to clarify that this statement is true only when these primitives are used as specified. The concern is that HKDF is only indifferentiable under some restrictions on salt length (for reasons noted in Section 8.3).
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Appendix: "pkR", "pkS" values are given. These are presumably the same as the marshalled versions "pkRm", "pkSm", this should be stated for completeness. (In contrast, both "pKE" and the equivalent "enc" are shown.)
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Section 5.2.:
The symbol "<<" isn't defined, but assuming it means "shift left by a specified number of bits", the number of bits to shift should be "8*Nn" rather than "Nn".
Does "overflow" in the third paragraph refer to the same condition as "wrap" in the fifth paragraph? If so, the text should be combined and a single term used for consistency. If not, the differences between the two requirements should be explained. We would also suggest adding a note indicating that the reference code assumes the sequence number is the same length as the nonce.
The use of "Nonce" (capitalized) as a function and "nonce" (lower case) as a value may be confusing. We suggest instead that the function be named "ComputeNonce" or similar.
On a similar object-oriented programming note, it should be stated that the underlying "Seal" and "Open" functions are the ones determined by the "aead_id" property.
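The comments on the shift amount, the overflow condition, and the function name can be made concrete with a small sketch. It assumes the nonce is the base nonce XORed with the sequence number encoded big-endian over Nn bytes; compute_nonce is the name suggested above, and the base nonce in the usage lines is just an example 12-byte value.

```python
def encode_big_endian(x: int, n: int) -> bytes:
    """Encode a non-negative integer as exactly n bytes, most significant first."""
    return x.to_bytes(n, "big")


def compute_nonce(base_nonce: bytes, seq: int) -> bytes:
    """XOR the big-endian sequence number into the base nonce."""
    nn = len(base_nonce)
    # The sequence number must fit in Nn bytes, i.e. 8*Nn bits (not Nn bits).
    if seq < 0 or seq >= (1 << (8 * nn)):
        raise OverflowError("sequence number does not fit in the nonce")
    seq_bytes = encode_big_endian(seq, nn)
    return bytes(a ^ b for a, b in zip(base_nonce, seq_bytes))


base = bytes.fromhex("0d8e01f89fa5abab107f7fe9")
first = compute_nonce(base, 0)   # equal to the base nonce itself
second = compute_nonce(base, 1)  # last byte differs in its lowest bit
```

With sequence number 0 the XOR leaves the base nonce unchanged, so the first message is encrypted under the base nonce.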
This is just an editorial comment. The spec doesn't seem to define the contents of a "context" very clearly. Section 5 says:
A "context" encodes the AEAD algorithm and key in use, and manages the nonces used so that the same nonce is not used with multiple plaintexts.
But it also has an exporter secret. Then section 5.1 says:
return Context(key, nonce, exporter_secret)
But we haven't defined the Context function yet. I'm guessing the intent is that Context produces some sort of record type with those field names? But that wouldn't initialize Context.seq used later in 5.2. (Confusingly, this Context is distinct from the context variable which contains an HPKEContext structure. Maybe the latter could be renamed?)
Then 5.2 lists out the contents of a "context" more explicitly, including the first mention of a sequence number. But it omits the exporter secret again. (Should "The sender's context MUST be used for encryption only. Similarly, the recipient's context MUST be used for decryption only." be rephrased? One could read that as saying export is also not okay.)
Then 5.3 mentions a context having an "exporter secret", but this is actually the only instance of that phrase in the document.
The HPKE draft refers to "Curve25519" and "DHKEM(Curve25519, HKDF-SHA256)" throughout the draft, but then section 8.8 mentions DHKEM-X25519.
I believe X25519 is correct here. RFC7748 defines "curve25519" as a particular Montgomery curve. It then defines "X25519" as a Diffie-Hellman primitive on top of curve25519, with particular encodings and everything else. HPKE is using the Diffie-Hellman primitive, so it should use X25519. As a bonus, it's shorter and "DHKEM(Curve25519, HKDF-SHA256)" is already a mouthful. :-)
In "DH-Based KEM" the paragraph
The GenerateKeyPair, Marshal, and Unmarshal functions are the same as for the underlying DH group. The Marshal functions for the curves referenced in {#ciphersuites} are as follows:
references the #ciphersuites section that no longer seems to exist.
- Section 4: the definitions of AuthEncap and AuthDecap contain words to the
effect, 'the KEM shared secret key is known only to the holder of the
private key "skS".' It would be more accurate to say, 'the KEM shared
secret key was generated by the holder of the private key "skS"'.
The draft only has success test vectors. Negative ones would be good, too.
It may be desirable to export a secret, as with the TLS exporter. Adding such a feature would add a bit of complexity, and dilute the focus on PKE.
The plaintext is always the same, but the nonces and AADs differ by just one bit, which is hard to spot and easily missed.
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
The "label" argument to LabeledExtract is being used in some cases to identify the output, in one case to identify the input, and in one case to identify the intent. We suggest harmonizing on the former, and also consistently suffixing the output variable name with "_hash" when the purpose of the extraction is to produce a hash of the input. This would result in the following statements being updated:
info_hash = LabeledExtract(zero(Nh), "info_hash", info) // new label
psk_hash = LabeledExtract(zero(Nh), "psk_hash", psk) // new output name
secret = LabeledExtract(psk_hash, "secret", zz) // new input name and label
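A runnable sketch of the three proposed statements, using HKDF-Extract as defined in RFC 5869 (HMAC keyed with the salt). The label prefix, the way the label is concatenated with the input, and the example inputs are assumptions for illustration, not the draft's exact encoding.

```python
import hashlib
import hmac

NH = 32  # output length of SHA-256 in bytes
LABEL_PREFIX = b"HPKE-sketch "  # hypothetical prefix, not taken from the draft


def zero(n: int) -> bytes:
    """Return n zero bytes."""
    return bytes(n)


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract from RFC 5869: PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def labeled_extract(salt: bytes, label: bytes, ikm: bytes) -> bytes:
    """Assumed labeling scheme: prefix || label || ikm fed to HKDF-Extract."""
    return hkdf_extract(salt, LABEL_PREFIX + label + ikm)


# Each label names the output, as the comment proposes.
info_hash = labeled_extract(zero(NH), b"info_hash", b"example info")
psk_hash = labeled_extract(zero(NH), b"psk_hash", b"example psk")
secret = labeled_extract(psk_hash, b"secret", b"example zz")
```

Because the label is part of the extracted input, the same bytes extracted under two different labels give unrelated outputs.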
At the moment, the length of psk in the test vectors is the same for SHA256 and SHA512. As far as I can tell, psk should only contain 32 bytes for SHA256 instead of 64.
Additionally, psk, pskID and pkS are not needed in the Base setup (https://github.com/cfrg/draft-irtf-cfrg-hpke/blob/master/draft-irtf-cfrg-hpke.md#base-setup-information for instance)
It's currently 32 bytes, i.e., just the x-coordinate of the point. But Npk suggests it should be a fully-encoded public key. Which do we prefer?
(Thanks to Michael Scott for raising this!)
In Section “DH-Based KEM”:
* P-521: The X-coordinate of the point, encoded as a 66-octet
big-endian integer
In “Algorithm Identifiers” > “Key Encapsulation Mechanisms”:
| Value | KEM | Nenc | Npk | Reference |
|:-------|:------------------|:-----|:----|:---------------|
| 0x0012 | DHKEM(P-521) | 65 | 65 | {{NISTCurves}} |
While updating labels for draft-03, I noticed that the test vectors added in 5bc57ba seem to be incorrect, matching an implementation that:
- info_hash, when draft-03 specifies "info".
- psk, when "psk_hash" is specified.
With the "_hash" suffix swapped as above, my implementation generates matching outputs.
For example, the base mode does not provide KCI resistance.
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Section 5.1:
"mode" should also be listed as a key schedule input.
Hi, the link of reference “[SECG] Elliptic Curve Cryptography, Standards for Efficient Cryptography Group, ver. 2” seems broken. Instead of http://www.secg.org/download/aid-780/sec1-v2.pdf the following seems to work: https://secg.org/sec1-v2.pdf
In the current draft (looking at branch master):
def KeySchedule(mode, pkRm, zz, enc, info, psk, pskID, pkIm):
    VerifyMode(mode, psk, pskID, pkI)
    pkRm = Marshal(pkR)
pkRm is given as parameter to KeySchedule but is calculated from pkR inside (suggestion: remove the line calculating it)
pkI is passed to VerifyMode, should be pkIm.
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Section 8.1: "A full proof of post-quantum security .". Although we understand that a full proof of post-quantum security may not be achievable within the timeline of this draft's publication, we would nevertheless recommend some additional discussion on what might be desirable to prove. In the draft, the PSK is employed as an authentication factor, so presumably the proof being contemplated would be that authentication in the modes involving PSKs remains secure against a quantum computer. A stronger property would be more attractive: that encryption in the PSK modes remains secure against a quantum computer, whether the KEM itself is post-quantum or not. If the authors consider this property plausible, then it should be mentioned here as a goal for security analysis. If not, then the reasons for not targeting this property should also be given.
Benjamin Lipp, David Benjamin, Benjamin Beurdouche, Riad Wahby, Kevin Jacobs, Michael Rosenberg, Michael Scott, Raphael Robert, and probably more!
Two issues I found in the test vectors:
- kemID: 1 but it's specified as 2 in the draft.
- 0d8e01f89fa5abab107f7fe9, but the nonce used in the first encryption (sequence number 0) is 0d8e01f89fa5abab107f7fe8 (the initial one XOR 1).
In section 5.1, might be useful to have return KeySchedule/Context take in the AEAD, to make it clear that a Context is bound to a particular AEAD.
In section 6, Seal calls SetupI instead of SetupS.
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Asymmetric and symmetric algorithms have been combined since the 1980s, e.g., in Privacy-Enhanced Mail [RFC1113], so a hybrid approach (in the sense of combining the two) can by now be considered the "tradition" of public-key cryptography. We would therefore suggest replacing the first sentence with the following:
Encryption schemes that combine asymmetric and symmetric algorithms have been specified and practiced since the early days of public-key cryptography (e.g., [RFC1113]). Combining the two brings the "best of both worlds": the key management advantages of asymmetric cryptography and the performance benefits of symmetric cryptography. However, the traditional combination has been "encrypt the symmetric key with the public key." "Hybrid" public-key encryption schemes (HPKE), specified here, take a different combination, "generate the symmetric key and its encapsulation with the public key." .
I'm trying to remove all allocation from my implementation, and there's really only 1 snag I'm hitting: LabeledExtract and LabeledExpand do a concat operation before passing to their respective HKDF functions, and there isn't always an upper bound on the size of the concatenated result. Specifically, there's
KeySchedule(info, psk, pskID):
    LabeledExtract(..., info)
    LabeledExtract(..., psk)
    LabeledExtract(..., pskID)
Context.Export(exporter_context):
    LabeledExpand(..., exporter_context, ...)
If there were a (reasonably small) upper bound on the sizes of info, psk, pskID, and exporter_context, then it would be trivial to implement HPKE without allocation.
I've thought about "streaming" the input into the above functions, instead of sending a concatenated bytestring. This could theoretically work for HKDF-Extract with SHA256, since it's an MD hash, but this doesn't work generically. Also the definition of HKDF-Expand does not admit a way to stream in the info string.
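For the Extract half of the question, a sketch of what "streaming" could look like, assuming the HMAC implementation exposes an incremental update interface (Python's hmac module does). The labeling by plain concatenation is an assumption, and the sketch does not address the HKDF-Expand concern raised above.

```python
import hashlib
import hmac


def labeled_extract_concat(salt: bytes, label: bytes, ikm: bytes) -> bytes:
    """Reference behavior: build label || ikm in one buffer, then extract."""
    return hmac.new(salt, label + ikm, hashlib.sha256).digest()


def labeled_extract_streaming(salt: bytes, label: bytes, chunks) -> bytes:
    """Feed the label and then each input chunk to HMAC without concatenating."""
    mac = hmac.new(salt, digestmod=hashlib.sha256)
    mac.update(label)
    for chunk in chunks:
        mac.update(chunk)
    return mac.digest()


salt = bytes(32)
one_shot = labeled_extract_concat(salt, b"info", b"a long info string")
streamed = labeled_extract_streaming(salt, b"info", [b"a long ", b"info ", b"string"])
```

Both calls produce the same output, so the concatenated buffer is not strictly needed when an incremental HMAC is available.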
From Michael Scott:
A minor observation. In ExtractAndExpand the salt parameter is zero(Nh).
In fact this is the same as using zero(0), as HMAC internally pads this up to a blocksize of zeros.
So for example if using SHA512 and Nh=64, the hash blocksize is 128, and zero(0) gets padded up to 128 zeros, as does zero(64) . In fact the parameter to zero(.) is irrelevant.
We might consider zero(2*Nh) or zero(0). What do you think, @blipp?
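The observation is easy to check with HMAC-SHA512, where Nh = 64 and the block size is 128. The sketch below treats HKDF-Extract as HMAC keyed with the salt (RFC 5869); the input string is an arbitrary example.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract with SHA-512: PRK = HMAC-SHA512(salt, IKM)."""
    return hmac.new(salt, ikm, hashlib.sha512).digest()


ikm = b"example dh output"
nh = hashlib.sha512().digest_size      # 64
block = hashlib.sha512().block_size    # 128

prk_empty = hkdf_extract(bytes(0), ikm)         # zero(0)
prk_nh = hkdf_extract(bytes(nh), ikm)           # zero(Nh)
prk_2nh = hkdf_extract(bytes(2 * nh), ikm)      # zero(2*Nh), exactly one block
prk_long = hkdf_extract(bytes(block + 1), ikm)  # longer than a block
```

The first three outputs are identical because HMAC zero-pads any key up to the block size. A salt longer than the block size is hashed first, so prk_long differs.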
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
"avoid identity mis-binding issues": Perhaps also note that including the public key and the encapsulated key as inputs to key derivation can help with the security proof. [Shoup] makes this observation in Section 15.6.1.
[Shoup] @Article{shoup2001proposal,
title={A proposal for an ISO standard for public key encryption (version 2.1)},
author={Shoup, Victor},
journal={IACR e-Print Archive},
volume={112},
year={2001}
}
The spec definition of Decap() includes taking enc as a parameter, and returning it unmodified:
def Decap(enc, skR):
    pkE = Unmarshal(enc)
    dh = DH(skR, pkE)
    pkRm = Marshal(pk(skR))
    kemContext = concat(enc, pkRm)
    zz = ExtractAndExpand(dh, kemContext)
    return zz, enc
Only return zz is needed. The same applies to AuthDecap.
From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/
Section 3: Definition of encode_big_endian: Add "unsigned" before "integer" if this is the intent (so that the set of encodable n-byte integers clearly includes 0 through 2^{8n}-1).
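A sketch of the unsigned reading of encode_big_endian, with an explicit range check so that exactly the integers 0 through 2^{8n}-1 are encodable. The error type is an illustrative choice.

```python
def encode_big_endian(x: int, n: int) -> bytes:
    """Encode the unsigned integer x as exactly n bytes, most significant first."""
    if not 0 <= x < (1 << (8 * n)):
        raise ValueError("x is not an unsigned integer that fits in n bytes")
    return x.to_bytes(n, "big")
```

For example, encode_big_endian(1, 2) yields the two bytes 0x00 0x01, while negative inputs and values of 2^{8n} or more are rejected.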