loupvaillant / monocypher Goto Github PK

View Code? Open in Web Editor NEW

580.0 580.0 80.0 4.02 MB

An easy to use, easy to deploy crypto library

Home Page: https://monocypher.org

License: Other

Makefile 5.24% C 76.93% Shell 5.49% Python 8.13% C++ 4.22%

monocypher's People

Contributors

Stargazers

Watchers

Forkers

skyformat99 gaigelama njlr kimusan vdt jakschu linecode rain-1 christophejunke secworks mikejsavage piyush-kurur vbmithr nattuck advmach h6ntechnologies dl6aku eax- buckaroo-pm leoribg sadoon-albader codyps elix22 refaqtor nnathan davylandman michaelforney moneytech monrapps zeta1999 stevefan1999-personal pascal-cuoq forksnd fscoto piomerti maroneze unilang tis-cv rrozestw kknckk claycharlie perfchecks1 px4 laplacekorea david-maison-trustinsoft kryptine walterschell snej cfogelklou ewd340 all-your-locks-are-belong-to-us mliberty1 awnumar 844562078 laczen ceedriic dch yonasbsd rafal-rozestwinski-forks setharchambault aikawayataro davidpu warfork alebcay

monocypher's Issues

Slight change in README, or a suggestion to really add manual.

Aloha!

The README.md states "There is a html version in doc/html/, that you can regenerate by executing the doc/man2html.sh script (this requires mandoc)."

the first part "There is a html version in doc/html/" is not really correct. The html version of the manual is not there. The html pages aren't that big so could we please add them instead of either (1) having users generate the pages, or (2) leave the repo (on Github) to go to the Monocypher to find the manual? Having the html files included would make the repo self contained manual-wise.

If not adding the pages, I would suggest changing the README text to state: "The html version of the manual can be regenerated by running the..." etc.

Split monocypher.c into a few separate files?

Aloha!

The monocypher.c is 1600+ is one gigantic file with all but the sha512 functionality. How about splitting the file inta few separate files? One for poly1305, chacha etc.
It would make reading the code a bit easier. And possibly to maintain.

Consider renaming or aliasing make test to make check

Currently, the test suite runs with make test. automake specifies make check that does: "Run the test suite, if any."

Based on the principle of least surprise, it may be worth renaming or aliasing make test to make check.

Optimising AEAD

I used to think RFC 7539 was over-engineered. I suspected their padding game were little more than some alignment game, which probably arisen from sub-par interfaces for the primitives. Interfaces that I was quite proud to generalise, if not simplify a bit.

Hem.

Turned out I was misguided and ignorant. First, I believed serialization and de-serialization took a negligible amount of time, compared to the heavy duty work. Nope. Nope. Nope. Nope. Nope. Nope. Nope. The most impresive there was Poly1305, which became 5 times faster than it was. (De-)serialisation was a full blown bottleneck.

Second, I didn't know that the best way to optimise many such primitives, despite their structure, was to perform the good old "unroll then vectorise" trick. For Chacha20, processing 8 blocks at a time with AVX-2 instructions is four times faster than using AVX instructions for a single round. Despite Chacha20 being designed for single-round vectorisation. I wonder how fast full blown bit slicing would be… Poly1305 enjoys similar effects (Libsodium is 2.3 times faster than Monocypher).

Thus, the details of the AEAD construction have a significant impact on the performance of small and medium messages (I'm thinking 1-8 kilobytes). I've measured something between 10-30% slow-down. Not that we want Monocypher to reach ultimate performance. This would cost way too much code not to mention the portability nightmare, of which #85 gives a taste. But we do want to scale as needed. If some user start to notice Monocypher's suboptimal performance is hurting their power bill (they just overthrew Google or something), it would be nice to get that performance without breaking compatibility.

In this light, I have to admit RFC 7539 is actually pretty well thought out. The only real qualm I have with it is it's half assed nonce, which isn't long enough to support randomness and cuts the counter to a worryingly short length. It's padding scheme however is pretty much ideal:

Wasting a whole Chacha block (64 bytes) to compute the authentication key instead of just 32-bytes is actually faster than keeping to 32 bytes. First, it tends to hit the fast path for de-serialization in typical implementations. Second, it facilitates streaming block by block, where the size of the blocks are powers of 2: by starting at a block boundary, we can ensure we never hit the slow serialisation path, and we take full advantage of the "unroll-then-vectorise" scheme that is fond of multiple of 8 blocks, without odd remainders at the end.

Assuming 4k blocks (a size that is dear to me personally because I'd like to implement a file system like encrypted archive), this alone speeds up the fast version of Monocypher by more than 15%. Smaller blocks (2k, 1k) are even more sensitive to this effect, and such sizes might be relevant to something like a chat application (where you would pad messages to multiples of 1kiB to reduce information leak).
In terms of wasted computation, padding Poly1305 wastes at most 1 block. Considering this enables a 2.3 speedup on message boundaries when doing the same as Chacha20, and the fact that the encrypted buffers are generally distinct from the additional data, aligning the block at the beginning of the message makes sense.

Of course, the padding means we now have to authenticate the length of both the additional data and the encrypted message. RFC 7539 have us covered there, using exactly 1 additional block to authenticate both lengths. 64 bits each should be plenty. Oh, and it removes the minor worry I had about authenticating the length of the additional data. Now it's all clean, even if that length is neither fixed nor self contained. That should simplify the manual a bit.

So I'm thinking of breaking compatibility. Again. This should fit the current API, we'll just have incompatible encryption and mac. The upside is, if some one wants to optimise this up to the heavens later, they can.

While we're at it, how about adding raw Chacha20 and IETF initialisation? The former can help with performance in some cases that don't require a random nonce, and the latter has test vectors. I still don't endorse 96-bit nonces, but at this point we might as well add it for completion's sake. I'm not sure yet. Maybe we could limit those alternate initialisation schemes to the incremental interface, and have the monolithic interface present a clear default?

There are no man pages

People are making packages for Monocypher, e.g. one in AUR and the one for Void Linux in progress, which is interesting. However, Monocypher decidedly does currently not really work as a package.

The AUR package currently installs monocypher.c into /usr/include, which is questionable at best. However, Monocypher cannot be built as a library. Maybe doing so is overkill or unnecessarily complex in terms of encumberance of the build path. It is already currently broken with BSD make and requires GNU make. Dynamic libraries should probably be discouraged as Monocypher intends to be fairly static in the first place.
There are no man pages. Any self-respecting C library supplies man pages. The manual is currently on the website. If going down this path, mandoc is probably the way to go (supported natively by GNU man, but always puts "BSD General Commands Manual" in the heading; man on at least FreeBSD and OpenBSD natively support it because they use it; not sure about macOS/Solaris/that one guy probably running on VMS). It's semantic (e.g. .Nm inserts the program name and formats it as expected, as opposed to manual repetition and formatting on old-style man/roff pages), which allows searching by macro keys like apropos Fa=ad to find all man pages referencing a function argument ad. Sensible mandoc guidelines can be found on OpenBSD mdoc(7). Otherwise, ronn exists.

Integrate tests from google

https://github.com/google/wycheproof/blob/master/testvectors/eddsa_test.json#L94

thanks to thaidn_ for sharing this

Build target self is broken on OpenBSD 6.1

OpenBSD 6.1 is the current version of OpenBSD. It ships gcc 4.2.1. While OpenBSD is going to use clang for 6.2, this only affects i386 and amd64 targets for now.

In tests/self.c, there are constructions that attempt to build a const value by involving another const value. gcc on OpenBSD 6.1 rejects those, saying "error: initializer element is not constant". Affected lines in tests/self.c are 295, 296, 334, 376, 377, 416, 417, 453.

This currently breaks the build. I would normally file a pull request, but in this case, I'm not sure how you would want it to be handled, I could imagine simply hardcoding the values, hardcoding the values and extending the comments (usually pushing them past the 80 column boundary), or using macros.

(After fixing this, the build on OpenBSD 6.1 goes through without issues. There's a warning that the ALIGN macro is redefined in tests/ed25519-donna/ed25519-donna-portable.h, conflicting with the one in /usr/include/sys/param.h, but the code in the donna implementation is out of your control.)

Wording for the poly1305 key in the manual.

I would like to suggest a slight change of wording. The manual for Poly1305, user error.
https://monocypher.org/manual/crypto_poly1305.html#User_error

(Wouldn't it be great if that link went to the repo instead? ;-)

First thing:
I don't think "User error" is a great heading. "User considerations", "Important details" etc. The text does not describe errors users do, but what the user must consider to use the function in a secure manner.

Secondly:
The third paragraph states: "Random numbers cannot be used either..." In fact this is not true. In order to follow the security of Poly1305. The DJB paper (https://cr.yp.to/mac/poly1305-20050329.pdf), chapter 3 states: "Any protocol that uses Poly1305-AES must ensure unpredictability of the secret key (k,r). This section assumes that secret keys are chosen from the uniform distribution: i.e., probability 2−234 for each of the 2234 possible pairs (k, r)." Stating that random numbers can not be used open open for misuse.

The text in the manual continues to describe one way to generate a key (using the initial keystream generated by the stream cipher). That keystream is a stream of random numbers. The important second requirement is that the key must be kept secret. That secrecy is achieved with the scheme presented. And this corresponds to what DJB states in chapter 3: "Any protocol that uses Poly1305-AES must ensure that the secret key is, in fact, kept secret."

I would therefore suggest to:

Change the name of the heading to "User considerations" or similar.
Add a note about the key has to be a non predictable random number
Split up para three to have a new paragraph focusing on the importance of keeping the key secret. For example by using the method described.

crypto_sign() requires the user to have non-overlaping input message and output signature

This has to do with the way the signature is computed internally: part of the signature is written to before we last read the message. If they overlap, we would mess up the last message read, leading to a wrong hash, and a wrong signature.

Regardless of whether the user was asking for trouble or not, this restriction is avoidable, and therefore should be lifted. The test suite should also be augmented accordingly.

The rest of the library should be checked for these kind of restrictions while we're at it.

EdDSA overlap test depends on uninitialized memory

This surfaced as failure when testing on OpenBSD. If the test has been passing until now, it was by coincidence.

https://github.com/LoupVaillant/Monocypher/blob/master/tests/test.c#L621-L627

input is INPUT_SIZE bytes long, i.e. 160 bytes. SHA_512_BLOCK_SIZE is 128 bytes. The routine thus reads 128 bytes from input + 64. This overreads input at offset 32.

I suspect you meant MESSAGE_SIZE instead of SHA_512_BLOCK_SIZE, but then it'd make no sense to do the multiplication by two in #define INPUT_SIZE (MESSAGE_SIZE + (2 * 64)). Due to this uncertainty, I'm reporting this as an issue, rather than creating a pull request.

package for Void linux

Hello !
I proposed a package for Void linux, see my Pull Request #7407.

Regards.

Warnings and errors when compiling using MSVC 2017

I get the following warnings and errors when compiling using Microsoft Visual C/C++ 2017.

monocypher.c(64): warning C4244: 'function': conversion from 'u64' to 'u32', possible loss of data
monocypher.c(240): error C4146: unary minus operator applied to unsigned type, result still unsigned
monocypher.c(329): warning C4244: '=': conversion from 'const u64' to 'uint32_t', possible loss of data
monocypher.c(314): warning C4244: 'initializing': conversion from 'const u64' to 'const u32', possible loss of data
monocypher.c(381): error C4146: unary minus operator applied to unsigned type, result still unsigned
monocypher.c(432): warning C4244: 'function': conversion from 'const u64' to 'u32', possible loss of data
monocypher.c(433): warning C4244: 'function': conversion from 'const u64' to 'u32', possible loss of data
monocypher.c(434): warning C4244: 'function': conversion from 'const u64' to 'u32', possible loss of data
monocypher.c(435): warning C4244: 'function': conversion from 'const u64' to 'u32', possible loss of data
monocypher.c(587): error C4146: unary minus operator applied to unsigned type, result still unsigned
monocypher.c(948): warning C4267: 'function': conversion from 'size_t' to 'u32', possible loss of data
monocypher.c(954): warning C4267: 'initializing': conversion from 'size_t' to 'u32', possible loss of data
monocypher.c(955): warning C4267: 'initializing': conversion from 'size_t' to 'u32', possible loss of data
monocypher.c(958): warning C4267: 'initializing': conversion from 'size_t' to 'u32', possible loss of data
monocypher.c(1052): warning C4244: '=': conversion from 'i64' to 'i32', possible loss of data
monocypher.c(1062): warning C4244: '=': conversion from 'i64' to 'i32', possible loss of data
monocypher.c(1118): warning C4244: '=': conversion from 'i64' to 'i32', possible loss of data
monocypher.c(1152): warning C4244: '=': conversion from 'i64' to 'i32', possible loss of data
monocypher.c(1714): error C4146: unary minus operator applied to unsigned type, result still unsigned
monocypher.c(1758): error C4146: unary minus operator applied to unsigned type, result still unsigned

Due to the errors compilation fails with MSVC. Perhaps you might want to check this for a future release.

Seen these warnings?

Aloha!

Did a "make test" and clang/llvm emitted a number of warnings on test.c:
tests/test.c:237:12: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
d = (d < 0 ? -d : d);
~ ^ ~
tests/test.c:254:12: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
d = (d < 0 ? -d : d);
~ ^ ~
tests/test.c:274:25: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 output_chunk[input_size];
^~~~~~~~~~
tests/test.c:275:25: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 output_whole[input_size];
^~~~~~~~~~
tests/test.c:277:25: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 input [input_size]; p_random(input, input_size);
^~~~~~~~~~
tests/test.c:308:24: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 output_part[stream_size ];
^~~~~~~~~~~
tests/test.c:309:24: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 output_all [stream_size ];
^~~~~~~~~~~
tests/test.c:310:24: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 output_more[stream_size * 2];
^~~~~~~~~~~~~~~
tests/test.c:357:18: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 input[input_size]; p_random(input, input_size);
^~~~~~~~~~
tests/test.c:398:18: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 input[input_size]; p_random(input, input_size);
^~~~~~~~~~
tests/test.c:436:18: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 input[input_size]; p_random(input, input_size);
^~~~~~~~~~
tests/test.c:466:16: warning: variable length array folded to constant array as an extension [-Wgnu-folding-constant]
u8 message[message_size]; p_random(message, 32);
^~~~~~~~~~~~

This is with gcc aliased to clang Apple LLVM version 9.0.0 (clang-900.0.37).

Have you seen these before?

crypto_lock_key doesn't exist

There appears to be a mistake where the manual refers to crypto_lock_key() which doesn't exist in the header/source, and vice versa crypto_key_exchange() exists but is not reflected in the manual.

Makefiles aren't portable

The various makefiles seem to only work with GNU make. Which is a pity, considering how simple they now are. They should be compatible with other make utilities.

Ginormous, impossible to review assembly output for constant time comparison functions

Tested with GCC 5.4.0, -O3 optimisations. Upon noticing suspect timings for some usages of crypto_memcmp(), I tried to look at the assembly output. I got this hard to review mess:

.LCOLDB18:
.text
.LHOTB18:
	.p2align 4,,15
	.globl	crypto_memcmp
	.type	crypto_memcmp, @function
crypto_memcmp:
.LFB11:
	.cfi_startproc
	testq	%rdx, %rdx
	pushq	%r12
	.cfi_def_cfa_offset 16
	.cfi_offset 12, -16
	pushq	%rbp
	.cfi_def_cfa_offset 24
	.cfi_offset 6, -24
	pushq	%rbx
	.cfi_def_cfa_offset 32
	.cfi_offset 3, -32
	je	.L174
	movq	%rdi, %r8
	negq	%r8
	andl	$15, %r8d
	cmpq	%rdx, %r8
	cmova	%rdx, %r8
	cmpq	$17, %rdx
	ja	.L197
	movq	%rdx, %r8
.L164:
	movzbl	(%rdi), %ecx
	xorb	(%rsi), %cl
	cmpq	$1, %r8
	movzbl	%cl, %eax
	je	.L177
	movzbl	1(%rdi), %eax
	xorb	1(%rsi), %al
	orl	%eax, %ecx
	cmpq	$2, %r8
	movzbl	%cl, %eax
	je	.L178
	movzbl	2(%rdi), %eax
	xorb	2(%rsi), %al
	orl	%eax, %ecx
	cmpq	$3, %r8
	movzbl	%cl, %eax
	je	.L179
	movzbl	3(%rdi), %eax
	xorb	3(%rsi), %al
	orl	%eax, %ecx
	cmpq	$4, %r8
	movzbl	%cl, %eax
	je	.L180
	movzbl	4(%rdi), %eax
	xorb	4(%rsi), %al
	orl	%eax, %ecx
	cmpq	$5, %r8
	movzbl	%cl, %eax
	je	.L181
	movzbl	5(%rdi), %eax
	xorb	5(%rsi), %al
	orl	%eax, %ecx
	cmpq	$6, %r8
	movzbl	%cl, %eax
	je	.L182
	movzbl	6(%rdi), %eax
	xorb	6(%rsi), %al
	orl	%eax, %ecx
	cmpq	$7, %r8
	movzbl	%cl, %eax
	je	.L183
	movzbl	7(%rdi), %eax
	xorb	7(%rsi), %al
	orl	%eax, %ecx
	cmpq	$8, %r8
	movzbl	%cl, %eax
	je	.L184
	movzbl	8(%rdi), %eax
	xorb	8(%rsi), %al
	orl	%eax, %ecx
	cmpq	$9, %r8
	movzbl	%cl, %eax
	je	.L185
	movzbl	9(%rdi), %eax
	xorb	9(%rsi), %al
	orl	%eax, %ecx
	cmpq	$10, %r8
	movzbl	%cl, %eax
	je	.L186
	movzbl	10(%rdi), %eax
	xorb	10(%rsi), %al
	orl	%eax, %ecx
	cmpq	$11, %r8
	movzbl	%cl, %eax
	je	.L187
	movzbl	11(%rdi), %eax
	xorb	11(%rsi), %al
	orl	%eax, %ecx
	cmpq	$12, %r8
	movzbl	%cl, %eax
	je	.L188
	movzbl	12(%rdi), %eax
	xorb	12(%rsi), %al
	orl	%eax, %ecx
	cmpq	$13, %r8
	movzbl	%cl, %eax
	je	.L189
	movzbl	13(%rdi), %eax
	xorb	13(%rsi), %al
	orl	%eax, %ecx
	cmpq	$14, %r8
	movzbl	%cl, %eax
	je	.L190
	movzbl	14(%rdi), %eax
	xorb	14(%rsi), %al
	orl	%eax, %ecx
	cmpq	$15, %r8
	movzbl	%cl, %eax
	je	.L191
	movzbl	15(%rdi), %eax
	xorb	15(%rsi), %al
	orl	%eax, %ecx
	cmpq	$17, %r8
	movzbl	%cl, %eax
	jne	.L192
	movzbl	16(%rsi), %eax
	xorb	16(%rdi), %al
	orl	%eax, %ecx
	movzbl	%cl, %eax
	movl	$17, %ecx
.L166:
	cmpq	%r8, %rdx
	je	.L167
.L165:
	movq	%rdx, %r11
	leaq	-1(%rdx), %r10
	subq	%r8, %r11
	leaq	-16(%r11), %r9
	subq	%r8, %r10
	shrq	$4, %r9
	addq	$1, %r9
	movq	%r9, %rbx
	salq	$4, %rbx
	cmpq	$14, %r10
	jbe	.L168
	pxor	%xmm1, %xmm1
	leaq	(%rdi,%r8), %r12
	xorl	%r10d, %r10d
	pxor	%xmm5, %xmm5
	addq	%rsi, %r8
	pxor	%xmm2, %xmm2
	xorl	%ebp, %ebp
.L169:
	movdqu	(%r8,%r10), %xmm0
	addq	$1, %rbp
	pxor	(%r12,%r10), %xmm0
	addq	$16, %r10
	cmpq	%rbp, %r9
	movdqa	%xmm0, %xmm3
	punpckhbw	%xmm5, %xmm0
	punpcklbw	%xmm5, %xmm3
	movdqa	%xmm3, %xmm4
	punpckhwd	%xmm2, %xmm3
	punpcklwd	%xmm2, %xmm4
	por	%xmm4, %xmm3
	movdqa	%xmm0, %xmm4
	punpckhwd	%xmm2, %xmm0
	punpcklwd	%xmm2, %xmm4
	por	%xmm4, %xmm3
	por	%xmm3, %xmm0
	por	%xmm0, %xmm1
	ja	.L169
	movdqa	%xmm1, %xmm0
	addq	%rbx, %rcx
	psrldq	$8, %xmm0
	por	%xmm1, %xmm0
	movdqa	%xmm0, %xmm1
	psrldq	$4, %xmm1
	por	%xmm1, %xmm0
	movd	%xmm0, %r8d
	orl	%r8d, %eax
	cmpq	%r11, %rbx
	je	.L167
.L168:
	movzbl	(%rsi,%rcx), %r8d
	xorb	(%rdi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	1(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	1(%rdi,%rcx), %r8d
	xorb	1(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	2(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	2(%rdi,%rcx), %r8d
	xorb	2(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	3(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	3(%rdi,%rcx), %r8d
	xorb	3(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	4(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	4(%rdi,%rcx), %r8d
	xorb	4(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	5(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	5(%rdi,%rcx), %r8d
	xorb	5(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	6(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	6(%rdi,%rcx), %r8d
	xorb	6(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	7(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	7(%rdi,%rcx), %r8d
	xorb	7(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	8(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	8(%rdi,%rcx), %r8d
	xorb	8(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	9(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	9(%rdi,%rcx), %r8d
	xorb	9(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	10(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	10(%rdi,%rcx), %r8d
	xorb	10(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	11(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	11(%rdi,%rcx), %r8d
	xorb	11(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	12(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	12(%rdi,%rcx), %r8d
	xorb	12(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	13(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	13(%rdi,%rcx), %r8d
	xorb	13(%rsi,%rcx), %r8b
	movzbl	%r8b, %r8d
	orl	%r8d, %eax
	leaq	14(%rcx), %r8
	cmpq	%r8, %rdx
	jbe	.L167
	movzbl	14(%rsi,%rcx), %edx
	xorb	14(%rdi,%rcx), %dl
	movzbl	%dl, %edx
	orl	%edx, %eax
.L167:
	subl	$1, %eax
	shrl	$8, %eax
	andl	$1, %eax
	subl	$1, %eax
.L163:
	popq	%rbx
	.cfi_remember_state
	.cfi_def_cfa_offset 24
	popq	%rbp
	.cfi_def_cfa_offset 16
	popq	%r12
	.cfi_def_cfa_offset 8
	ret
	.p2align 4,,10
	.p2align 3
.L197:
	.cfi_restore_state
	testq	%r8, %r8
	jne	.L164
	xorl	%ecx, %ecx
	xorl	%eax, %eax
	jmp	.L165
	.p2align 4,,10
	.p2align 3
.L174:
	xorl	%eax, %eax
	jmp	.L163
	.p2align 4,,10
	.p2align 3
.L183:
	movl	$7, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L184:
	movl	$8, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L185:
	movl	$9, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L186:
	movl	$10, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L187:
	movl	$11, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L188:
	movl	$12, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L189:
	movl	$13, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L190:
	movl	$14, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L179:
	movl	$3, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L180:
	movl	$4, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L181:
	movl	$5, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L182:
	movl	$6, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L177:
	movl	$1, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L178:
	movl	$2, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L191:
	movl	$15, %ecx
	jmp	.L166
	.p2align 4,,10
	.p2align 3
.L192:
	movl	$16, %ecx
	jmp	.L166
	.cfi_endproc

crypto_zerocmp() is just as gigantic:

.LHOTB19:
	.p2align 4,,15
	.globl	crypto_zerocmp
	.type	crypto_zerocmp, @function
crypto_zerocmp:
.LFB12:
	.cfi_startproc
	testq	%rsi, %rsi
	je	.L210
	movq	%rdi, %rdx
	movq	%rsi, %rcx
	negq	%rdx
	andl	$15, %edx
	cmpq	%rsi, %rdx
	cmova	%rsi, %rdx
	cmpq	$19, %rsi
	ja	.L217
.L200:
	xorl	%edx, %edx
	xorl	%eax, %eax
	.p2align 4,,10
	.p2align 3
.L202:
	movzbl	(%rdi,%rdx), %r8d
	addq	$1, %rdx
	orl	%r8d, %eax
	cmpq	%rcx, %rdx
	jne	.L202
	cmpq	%rcx, %rsi
	movq	%rcx, %rdx
	je	.L203
.L201:
	movq	%rsi, %r10
	leaq	-1(%rsi), %r9
	subq	%rdx, %r10
	leaq	-16(%r10), %r8
	subq	%rdx, %r9
	shrq	$4, %r8
	addq	$1, %r8
	movq	%r8, %r11
	salq	$4, %r11
	cmpq	$14, %r9
	jbe	.L204
	pxor	%xmm1, %xmm1
	addq	%rdi, %rdx
	xorl	%r9d, %r9d
	pxor	%xmm5, %xmm5
	pxor	%xmm2, %xmm2
.L205:
	movdqa	(%rdx), %xmm0
	addq	$1, %r9
	addq	$16, %rdx
	cmpq	%r9, %r8
	movdqa	%xmm0, %xmm3
	punpckhbw	%xmm5, %xmm0
	punpcklbw	%xmm5, %xmm3
	movdqa	%xmm3, %xmm4
	punpckhwd	%xmm2, %xmm3
	punpcklwd	%xmm2, %xmm4
	por	%xmm4, %xmm3
	movdqa	%xmm0, %xmm4
	punpckhwd	%xmm2, %xmm0
	punpcklwd	%xmm2, %xmm4
	por	%xmm4, %xmm3
	por	%xmm3, %xmm0
	por	%xmm0, %xmm1
	ja	.L205
	movdqa	%xmm1, %xmm0
	addq	%r11, %rcx
	psrldq	$8, %xmm0
	por	%xmm1, %xmm0
	movdqa	%xmm0, %xmm1
	psrldq	$4, %xmm1
	por	%xmm1, %xmm0
	movd	%xmm0, %edx
	orl	%edx, %eax
	cmpq	%r11, %r10
	je	.L203
.L204:
	movzbl	(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	1(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	1(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	2(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	2(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	3(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	3(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	4(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	4(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	5(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	5(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	6(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	6(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	7(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	7(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	8(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	8(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	9(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	9(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	10(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	10(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	11(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	11(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	12(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	12(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	13(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	13(%rdi,%rcx), %edx
	orl	%edx, %eax
	leaq	14(%rcx), %rdx
	cmpq	%rdx, %rsi
	jbe	.L203
	movzbl	14(%rdi,%rcx), %edx
	orl	%edx, %eax
.L203:
	subl	$1, %eax
	shrl	$8, %eax
	andl	$1, %eax
	subl	$1, %eax
	ret
	.p2align 4,,10
	.p2align 3
.L217:
	testq	%rdx, %rdx
	jne	.L218
	xorl	%ecx, %ecx
	xorl	%eax, %eax
	jmp	.L201
	.p2align 4,,10
	.p2align 3
.L210:
	xorl	%eax, %eax
	ret
.L218:
	movq	%rdx, %rcx
	jmp	.L200
	.cfi_endproc
.LFE12:
	.size	crypto_zerocmp, .-crypto_zerocmp
	.section	.text.unlikely

Are those really constant time?

Please consider using a free software license

I would recommend https://directory.fsf.org/wiki/License:ISC or BSD 3-clause. I think that using a standard free software license will encourage greater adoption of this excellent library!

The current license seems to be some kind of home brew public domain disclaimer. I believe there are some problems with such a license and it doesn't correctly/legally express what you're aiming for with releasing this code http://www.linuxjournal.com/article/6225.

Date in ChangeLog, and possible news page?

(This is really for the monocypher home page, but github issues are nice.)

I would like to suggest to add date information to items on the ChangeLog page. There is information that there are a 2.0.0 version, but no info when it was released. Since there is neither a News-page, a potential user won't be able to figure out how old a release is. Is the project really active or is it in fact stale and unsupported? It is hard to know without date information for releases. How about at least add a News-section or release info section at the top of the home page? Makes it easier to promoto Monocypher too.

The Github repo of course shows a lot of info about activity. And since the first three search items on Google for "Monocypher" or "monocypher" is to Github, this is where people will end up. But still I think the home page would be improved with date info. (The actual home page does not appear on the first page of search results anymore.)

Package self-tests into Monocypher itself

The README reads:

Do not use Monocypher without running the self contained tests at least once.

Which is solid advice. However, I'm concerned about the scenario where you're the developer of an application unable to control where your code is actually going to run (e.g. you have some code, someone runs it on MIPS64 and Monocypher starts breaking; however, the user only sees that the application "doesn't work" because the crypto is broken).

Thoughts on including tests/self.c into Monocypher itself or at least a mention in the manual to cat src/monocypher.c tests/self.c > amalgamation.c and renaming main()? It has to be optional and not called automatically in case of things like embedded platforms, where developers can check the platform itself but running all the tests would be (1) rather time consuming, or (2) impossible with its stack size.

crypto_key_exchange may leak shared_secret

It looks to me that the shared_secret array shall be zeroed before returning from the function.

I am not familiar with HChaCha20 so what is done maybe OK in some context, but if the key is to be used as a session secret key you shall hash (Blake2B or SHA512) the x25519 secret with the 2 public keys

Not thread safe

You should remove "static" from "sv": https://github.com/LoupVaillant/Monocypher/blob/master/src/sha512.c#L4

README.md is completely out of sync with the curent repository

It should explain that new dist/ business, and the fact that end users should fetch the latest tarball instead. (Note: there is no such tarball just yet, just older releases.)

style.css is installed to man page location

After writing PR #44, I noticed that style.css is installed to $PREFIX/share/man. That is very likely the wrong place for it.

I see two options:

Do not install style.css, but that would prevent it from being included in packages, where users may want to
Put it in $PREFIX/share/doc/monocypher and make a note in README.md; which will likely get packaged there as well.

Recovering EdDSA performance

The latest bug fix took a huge toll on EdDSA performance. We should try and re-introduce the optimisation it reverted, this time without the bug.

missing make uninstall target

Don't want to lock people in, do we?

www.monocypher.org is parked

www.monocypher.org

Adding an issue for this so it doesn't get forgotten

Monocypher.org is up. Any comment?

Here: http://monocypher.org

No HTTPS for now, I'll set that up shortly.

Finish releasing 1.1.0

website is down atm so i can't check it, but IIRC the wget/tar listing on the front page refers to 1.0.1 still, and there's no 1.1.0 on github

2.0.0 release candidate

I have pushed the new AEAD interface. Hopefully it will be really stable this time. (I believe it will be because it is now compatible with Libsodium's crypto_aead_xchacha20poly1305_ietf_encrypt(), to the point where I was able to make test vectors.)

Since this change breaks everything, I took the opportunity to tweak the incremental interface. The subsequent compile time errors may help users figure out why their data isn't decrypting itself any more. Or maybe it will just piss them off. I'm not sure.

I'm thinking of renaming crypto_aead_lock() into crypto_lock_aead() (and crypto_aead_unlock into crypto_unlock_aead). This would have 2 effects:

Naming wise, this would make everything start by crypto_lock or crypto_unlock. This may or may not matter. I just like being consistent, and this was a tiny inconsistency.
Versioning wise, this change will trigger compile time errors for virtually every user. Is signalling the incompatibility this way actually a good idea, or will it just piss them off, and I'd better not force them to change their source code? (I know I already did this for the incremental interface, but I expect virtually no one had the time to actually use it yet. Doing the same to the monolithic interface will affect virtually everyone.)

@culex, @mikejsavage, @secworks, @konovod, what do you think?

C++ compile Settings

thanks for your great work。
To make a small suggestion, add the following code to the head and tail of the h file:

#ifdef __cplusplus
extern "C" {
#endif

#ifdef __cplusplus
}
#endif

Proper handling of non-portable optimisations

Monocypher has two significant bottlenecks at the moment: Chacha20, and Argon2i. This makes libsodium 3.6 times as fast on authenticated encryption, and 57% faster at password hashing. The former is such a huge gap that many will take it as a deal breaker. The latter is not as brutal, but its incentive structure means performance directly impacts security.

So I've started an experimental speed branch to try and go beyond purely portable C code. So far, I've integrated the vectorised implementation of Chacha20 from Romain Dolebeau. The question is, how should we handle this code?

Right now, I (very crudely) rely on the __AVX2__ pre-processor definition, which on GCC is activated whenever the architecture supports AVX-2 instructions (modern Intel and AMD processors do). Obviously, this is not enough: part of the code is actually AVX compatible (only use 128-bit registers), and could be activated separately on AVX-2, AVX, MMX, and perhaps others I have yet to be aware of.

Still, those pre-processor options may not be exactly what we want. This is non-portable code we're talking about. Is it good practice to rely on those prep-processor definitions? If so, should we enable optimisations by default? If so, should we provide a way to disable it even for users who use -march=native?

My inclination would be to activate optimisations by default, so that users who compile non-portably (either enabling AVX or such, or by using -march=native) could benefit from the optimisations without thinking about it. Rationale being, if you compile non-portably, you're not portable anyway, so you might as well use those options to the fullest. I just fear unanticipated problems.

Oh, and I'd rather not have the user "initialise" the library like Libsodium does.

@culex , @mikejsavage, @secworks, @anywone, what do you think?

Timing tests are too strict

Reported by /u/knotdjb on reddit.

Timing tests for constant time comparison fail on some machines. To reproduce, just run:

$ make self
$ ./self

It might or might not fail.

We could make the threshold more tolerant, or take several measurements (the same way we do for the speed benchmark), or be a little cleverer, and compare against a variable time comparison.

CC0 is not recommended, Public Domain is not available everywhere.

CC itself recommends not to use CC0 for software.. Please reconsider. This is similar like Public Domain not being available in many juristictions (Europe). MIT/BSD or GPL/EUPL are better choices depending on your needs.

Random number generation and the manual

Throughout the Monocypher Manual, recommendations saying to use the operating system's RNG can be found. That is in and of itself probably good advice, but if your goal is being "easy to deploy", just stating the following in the manual may not be enough.

The easiest (and recommended) way to generate this nonce is to use your OS's random number generator (/dev/urandom on UNIX systems). Don't worry about accidental collisions, the nonce is big enough to make them virtually impossible.

Using /dev/urandom is actually surprisingly difficult:

read(2) can get interrupted, violating the expectations some programmers may have, resulting in less random bytes than expected;
If you care about your randomness not being manipulated by an attacker that already basically owns the machine, verifying the device numbers between various operating systems is difficult because the operating systems are free to re-number them and there is a time of check/time of use disparity between the device number verification and the time; and
/dev/urandom may not necessarily be a CSPRNG on the OS the code is being run on.

The point is that doing random numbers is difficult, just like the cryptographic primitives provided in Monocypher are. Providing some way to wrap the RNGs of the most relevant operating systems (Windows, macOS, Linux, Solaris, FreeBSD, OpenBSD, NetBSD) or at least some further reading in the documentation may be a good idea.

crypto_verify32 and crypto_verify64 fail to spot some differences

Step to reproduce: just run the latest tests:

$ make test

Docs feedback

I'm not done yet, but here are my notes so far. I hope this is readable!

intro.html

Should explicitly mention that dictionary coders (i.e. every mainstream compression library) leaks info about the plaintext
Not related to intro.html but perhaps there should be a validation layer that checks things are aligned how they should be/things aren't overlapping when they shouldn't be/etc.

crypto_aead_lock.html
crypto_aead_unlock.html
crypto_lock.html
crypto_unlock.html

Is it ok to use a counter for the nonce?
"Make sure message lengths do not leak secret information." - Again be explicit about dictionary coders
"That length is not authenticated." - I don't fully understand the problem or the solution here. You have to put the length in the authenticated data or it's not secure? Why doesn't monocypher do that for you? It's only 8 bytes.
In the in place encryption example the cipher_text variable is unused, except for the last comment where it is incorrect.
Do you need to call crypto_wipe even when crypto_unlock fails? Can it partially decrypt your data?

crypto_argon2i.html

"It can be used to encrypt private keys or password databases." - "it" is the resulting key?
"malloc() should yield a suitable chunk of memory." - malloc will (not should) return 8 byte aligned addresses on any normal platform. Might be worth explicitly mentioning that.
"The version provided by Monocypher has no threading support, so the degree of parallelism is limited to 1. This is considered good enough for most purposes." - I don't know why you mention this
"This most likely has no practical application but is exposed for the sake of completeness." - TBH I would not expose it in that case
"Use crypto_verify16(3monocypher), crypto_verify32(3monocypher) or crypto_verify64(3monocypher) to compare password hashes to prevent timing attacks, which are feasible against the weakest passwords." - maybe drop the bit after the comma

"The hardness of the computation can be chosen thus:" and the two bullet points are oddly phrased. How about:

  To select the nb_blocks and nb_iterations parameters, you should first
  decide how long the computation should take. For user authentication,
  somewhere between half a second (convenient) and several seconds
  (paranoid) is a good starting point, and you should use as much memory
  as you can spare.

  Since parameter selection depends on your hardware, you will need to
  use some trial and error to find the ideal settings. Three iterations
  and 100k blocks (one hundred megabytes of memory) is a good starting
  point. Adjust nb_blocks first. If using all available memory is not
  slow enough, increase nb_iterations.

crypto_blake2b.html
crypto_blake2b_final.html
crypto_blake2b_general.html
crypto_blake2b_general_init.html
crypto_blake2b_init.html
crypto_blake2b_update.html

"It can authenticate messages with a naive approach." - Naive approach = just calling the functions? Is it still a naive approach if it works and is the intended usage?
"The output hash. Must have at least hash_size free bytes." - You don't normally say "Must have at least x free bytes".
"Use crypto_verify16(3monocypher), crypto_verify32(3monocypher), or crypto_verify64(3monocypher) to compare MACs created this way." - is it ok to use memcmp if you didn't use a key?
"Message to hash. May overlap with the hash argument where present." - What do you mean by "where present"? The hash and message are always present
"the length of the message in bytes." - should start with a capital T for consistency (I went back and looked at the argon2i page, you stop using capitals half down there too. Presumably this comes up elsewhere too)
"The direct interface has 2 functions" - I think it's considered good style to write two etc for small numbers. Doesn't really matter though

"crypto_blake2b(), provided for convenience (calling it is the same as calling crypto_blake2b_general() with a null key and a 64-byte hash)." - how about:

  The direct interface has two functions, crypto_blake2b() and
  crypto_blake2b_general(). crypto_blake2b() is provided for convenience,
  and is identical to calling crypto_blake2b_general() with no key and a
  64 bit hash.

crypto_chacha20_H.html

"The input in fills the space reserved for the nonce and counter. It does not have to be random." - huh?
"will be random iff the above holds" - capitalisation inconsistency again. Rather than "iff the above holds", you could say "iff key has enough entropy". It's not much longer
Should this function actually be exposed?

crypto_chacha20_encrypt.html
crypto_chacha20_init.html
crypto_chacha20_set_ctr.html
crypto_chacha20_stream.html
crypto_chacha20_x_init.html

TODO

crypto_check.html
crypto_sign.html
crypto_sign_public_key.html

No block describing the arguments!
TODO

crypto_check_final.html
crypto_check_init.html
crypto_check_update.html
crypto_sign_final.html
crypto_sign_init_first_pass.html
crypto_sign_init_second_pass.html
crypto_sign_update.html

TODO

crypto_key_exchange.html
crypto_x25519.html
crypto_x25519_public_key.html

I don't understand what crypto_x25519 is for. Is it an optimisation if you want to generate multiple keys? The old manual was more forceful about telling people to not use it unless they understand it.
I think the randomised key exchange part needs to be explained better. Why crypto_lock instead of crypto_sign?
There should probably be a function like "crypto_is_dangerous_key". If you're randomly generating keys for each session it would be nice to be able to check the key you're sending isn't going to be rejected for being malicious.

crypto_lock_auth.html
crypto_lock_encrypt.html
crypto_lock_final.html
crypto_lock_init.html
crypto_lock_update.html
crypto_unlock_final.html
crypto_unlock_update.html

TODO

crypto_memcmp.html
crypto_zerocmp.html

Remove these functions? Or update them so they do run in constant time

crypto_poly1305.html
crypto_poly1305_final.html
crypto_poly1305_init.html
crypto_poly1305_update.html

No block describing the arguments!
Not done reading this one yet

crypto_verify16.html
crypto_verify32.html
crypto_verify64.html

"Standard comparison functions tend to exit as soo as they find a difference" - "as soo" -> "as soon"

how about:

  You will often need to compare secrets, or values derived from secrets.
  Standard comparison functions (like memcmp) tend to exit when they find
  the first difference, leaking information through timing differences.

"provide comparison functions whose timing is independent from the content of their input." - maybe: "provide comparison functions that run in constant time"

crypto_wipe.html

"Secrets and values derived from them should stay in memory for the shortest possible amount of time..." - how about:

  crypto_wipe() securely erases sensitive data in memory.

  Sensitive data (such as cryptographic keys or secret plaintexts) should
  be erased from memory as early as possible, to minimise the window in
  which it can be leaked.

Size argument should maybe be called secret_size.

Critical vulnerability in crypto_check*()

Signature verification accepts all-zero signatures as valid, even with a genuine public key.

Should we alias multi-purpose functions?

This has become glaringly apparent when shuffling the header for ease of use by end users: some functions are either "missing" or have the wrong name.

crypto_lock_init() and crypto_lock_auth(), and crypto_lock_ctx do not have any unlock counterpart (they are used for both locking and unlocking). Also, crypto_lock_auth() is a general authentication function, that can authenticate additional data and the rest. A users writing (or reviewing) code using those functions might think the following:

crypto_lock_ctx ctx;                            // wrong type! use crypto_unlock_ctx;
crypto_lock_init(ctx, key, nonce);              // wrong init!
crypto_lock_auth(ctx, aead, aead_size);         // not AEAD specific, still not unlocking!
crypto_unlock_update(ctx, msg, msg_size);       // OK
if (crypto_unlock_final(ctx, mac)) { abort(); } // OK

Should we add the following aliases?

typedef crypto_unlock_ctx     crypto_lock_ctx
#define crypto_unlock_init    crypto_lock_init
#define crypto_lock_aead      crypto_lock_auth
#define crypto_unlock_aead    crypto_lock_auth

We have crypto_x25519(), crypto_key_exchange(), and crypto_x25519_public_key(). End users are meant to use the last two, which then display inconsistent naming. Should we add the following alias?

#define crypto_key_exchange_public_key  crypto_x25519_public_key

crypto_sign_update() is used for the first pass and second pass. Should we add the following aliases?

#define crypto_sign_update_first_pass   crypto_sign_update
#define crypto_sign_update_second_pass  crypto_sign_update

I'm a bit torn. On the one hand, I want to keep the minimum required to use Monocypher. On the other hand, I'd like to avoid needless confusion. I believe we should add some aliases, but not all:

I would alias the crypto_lock*() functions, but not the context.
I would alias crypto_x225519_public_key() to avoid the blatant inconsistency, or rename crypto_key_exchange() into crypto_x25519_HChacha20() (and keep the old name for backward compatibility). I'm not sure which is best.
I would probably not alias crypto_sign_update.

@culex, @mikejsavage, what do you think?

Document signature malleability

Signatures are not like hashes. For any public key/message pair, there are many possible signatures.

Some of those signatures can be constructed with just the message, the public key, and an existing valid signature. They're what we call "S malleability", where we just add L to S. This can be detected by verifying that S < L. Some libraries do this (Libsodium) others don't (TweetNaCl, Monocypher).
Others require the private key as well. To make those, we use a random nonce instead of hashing the message and the private key. The resulting signature will still be valid, and the trick is undetectable by the verifier.
There might be other kinds of malleability I'm not aware of yet.

Some protocol designers tend to expect security properties not provided by the primitives, resulting in insecure designs. For instance, if you assume a message/public key pair can only yield a single signature, the same way it would yield a single cryptographic hash, you might be tempted to treat the signature as a unique identifier, and fall prey to replay attacks (crypto currencies seem to be especially vulnerable).

The documentation should be updated to make clear what EdDSA does not provide.

Building produces many switch/case fallthrough many warnings

When compiling the tests there are so many warnings it is hard to see what's happening. Based on Warning Options why not either:

add -Wimplicit-fallthrough=0
add -Wimplicit-fallthrough=3 plus /* FALLTHRU */ comments.

Add comment about generating vectors.h in dist/README.md

Aloha!

I would suggest pointing to, or adding a comment like "remember to generate the test vectors and the file vectors.h, see XYZ" in dist/README.md under the "make test" info. The need for generating the vectors are described in the top level README.md, but is worth repeating in the dist/README.md to not confuse those that gets surprised when following the "make test" instruction in the dist/README.md.

typo on monocypher.org :-)

Hey,
sorry for contacting you like this but there was no contact info on monocypher.org.

Way down on the page (ctrl-f yh will point it out) you misspelled the file name the user is recommended to copy.

:-)

Ambiguous header

(First reported by @mikejsavage)

It is not clear by looking at the header alone whether one should use crypto_lock_update() or crypto_lock_encrypt() (we should use the former). Users may use one instead of the other, and end up not authenticating their messages. This may happen even if they have read the manual.

We should add a little comment about crypto_lock_encrypt() being a low level function, not intended for end users.

Actually, we should add that warning about every low-level primitives (Chacha20, Poly1305 etc…).

Packaging is not as easy as it should be

Packages started to appear, but we've learned since #19 that packagers make mistakes. This suggest the procedure to packages Monocypher lacks either automation or documentation —possibly both.

We should make the packager's life easy. I suggest the following:

Add a system wide installation script (or Makefile) with sensible defaults for POSIX systems. Possibly provide both static and dynamic libraries.
Add a PACKAGING.md file or similar to motivate these choices.
Have a way to easily link to a Monocypher installation (might be an environment variable, might be pkg_config).

Undefined behaviour in Blake2b (no visible effect)

This line was meant to zero out the relevant byte of the input buffer before actually putting in the input. This basically filters out the garbage before putting in the information. In practice, this works perfectly well, and compilers generate correct machine code.

Unfortunately, this also means that when the input buffer is uninitialised, we are reading uninitialised memory. And that's undefined behaviour under those crazy C and C++ standards. Here's a minimal example:

uint32_t x;
x &= 0xffffff00; // reading x is undefined

The bug was first found with Frama-C, and can be reproduced with the TIS interpreter (run the formal-analysis.sh script, then run the TIS interpreter on the sources in tests/formal-analysis/ folder, it errors out within the first 20 minutes). It could have been caught if I ran it before release. I didn't, because I didn't want to wait the 15+ hours required to run the entire test suite under the TIS interpreter, and doing so never caught bugs that Valgrind or the clang sanitisers couldn't catch. Until now, that is. Lesson learned, I guess.

A fix will be coming by the end of the week. In the mean time, users who are in a hurry may replace line 532 by the following:

if (byte == 0) { ctx->input[word] = 0; }

(That one has been verified under the TIS interpreter.)

Profiling test doesn't show the profile

Steps to reproduce

$ make clean
$ make test CC="clang -std=c99" CFLAGS="-fprofile-instr-generate"
$ tests/coverage.sh

Observed result

error: Failed to load coverage: No coverage data found

Expected result

A big fat listing describing code coverage.

Notes

The make test step generates a default.profraw file, as expected. The tests/coverage.sh just invokes 2 commands:

llvm-profdata-3.8 merge default.profraw -o all.profdata
llvm-cov show -instr-profile=all.profdata "./test.out"

The first line generates a all.profdata, as expected. The second line is expected to read that file, but for some reason doesn't.

It used to work, so maybe my llvm setup broke upon update or something. But it could be that there is indeed a problem with the makefile or the coverage script.

Could someone reproduce this on their machine?

Incremental crypto_lock/crypto_sign API

It may be worthwhile to also create an incremental crypto_lock/crypto_unlock API for cases where you have multiple buffers to decrypt (e.g. header and body in separate buffers) or a large file to read out in chunks.

Same for crypto_sign/crypto_check (think: signing fairly large update files).

YMMV, but it's an idea.

Line count on the website mismatches observations

The website says that "Monocypher takes less than 1300 lines of code". This is no longer true. By no measures I could get less than 1300.

Methods attempted:

wc -l monocypher.c (1649)
grep -v '^$' monocypher.c|wc -l (1492)
sloccount, which gets very close (1309)

You may want to either adjust the figure or note how you got your number.

(In more positive news, sloccount thinks monocypher.c costs $35,845 to develop.)

Add RFC 7539 test vectors

RFC 7539 "ChaCha20 and Poly1305 for IETF Protocols" contains serveral test vectors for both ChaCha20, Poly1305 as separate constructs, as well for the combined construct. At least the test vectors for ChaCha and Poly1305 could be useful to add to Monocypher. This would show that the implementations in Monocypher seems to match other specs and implementations.

See for example chapter 2.5.2 for Poly1305 test vectors with a three block message with results for each block and final result.

https://tools.ietf.org/html/rfc7539

Explain the purpose of "key" and "ad" in argon

Hi,
I like the manual because it's pretty much self-sufficient for a crypto newbie, but it doesn't really explain anything about these two parameters of the argon hash, except that they're optional.
What is their purpose, and when would you want to use them?

loupvaillant / monocypher Goto Github PK

monocypher's People

Contributors

Stargazers

Watchers

Forkers

monocypher's Issues

Steps to reproduce

Observed result

Expected result

Notes

Recommend Projects

Recommend Topics

Recommend Org