0xade1a1de / cryptopt
CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives
Home Page: https://0xade1a1de.github.io/CryptOpt/
License: Apache License 2.0
Add the cycle goal to the assembly file as a comment
Do you have a good sense of how much RAM and disk CryptOpt should use? I haphazardly attempted a week-long CryptOpt run and saw it reach memory exhaustion on day 2. Here's an output from the rare instance that got killed by node instead of Linux:
fiat_curve25519_solinas_square| run| 6|bs 156|#inst: 103|cycl_ 12|G 26 cycl _ 0|B 26 cycl _ 0|L 64|l/g 2.5098| P|P[ -1/ 0/ 1/ -1]|D[FL/ 48/ 13/ 5
7]|47.2M(43%) 115/s
<--- Last few GCs --->
[212:0x67a1c50] 513566914 ms: Scavenge 3885.3 (4129.6) -> 3878.9 (4129.6) MB, 3.2 / 0.0 ms (average mu = 0.629, current mu = 0.642) task;
[212:0x67a1c50] 513566933 ms: Scavenge 3886.7 (4129.6) -> 3879.1 (4129.6) MB, 3.4 / 0.0 ms (average mu = 0.629, current mu = 0.642) allocation failure;
[212:0x67a1c50] 513566952 ms: Scavenge 3886.0 (4129.6) -> 3879.2 (4145.6) MB, 3.7 / 0.0 ms (average mu = 0.629, current mu = 0.642) task;
Does 4GB/process look like expected memory usage to you? I could provision that much, but I can't think of a reason why it would be needed.
JS stacktrace:
I believe here is the invocation I used:
for i in $(seq 0 "$(("$(nproc)"-1))"); do
tmux new-window -n "c$i"
tmux send-keys "taskset -c $i ~/CryptOpt/CryptOpt --no-proof --resultDir /mnt/results --curve curve25519_solinas --method square --framePointer save --evals $((175*60*60*24*9))" C-m
sleep "0.$RANDOM"
done
Additionally, the results directory seems to have acquired 49GB of csv files (and some asm and json). Are these something I may want to look at, or should I perhaps not be collecting them at all?
It would be nice to use CryptOpt to generate plain x86_64 code that does not depend on the ADX extension, to serve as a fallback from CryptOpt-optimized fast assembly in distributed binaries. This is a requirement for deployment in BoringSSL, and I hear it may be relevant to adoption of mit-plv/fiat-crypto#1582 as well.
I am thinking of use of CryptOpt in this context as primarily an assurance benefit, though if it's decently fast still, even better.
I would be happy to do the work for adapting CryptOpt here if you think that this would be a good first project to hack on in the CryptOpt codebase.
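A fallback of this kind is usually paired with a run-time feature check in the distributed binary. The following is a minimal sketch of such a dispatcher, not code from CryptOpt or BoringSSL. The names `square_fn` and `select_square` are hypothetical. Checking BMI2 next to ADX is an assumption made here because `mulx` belongs to BMI2 rather than ADX. The bit positions are the ones CPUID leaf 7 (subleaf 0) reports in EBX.

```c
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang helper header for the CPUID instruction */
#endif

/* Returns true if CPUID leaf 7 reports both BMI2 (mulx) and ADX (adcx/adox).
 * On non-x86 targets this conservatively returns false. */
static bool cpu_has_bmi2_and_adx(void) {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return false; /* leaf 7 not available */
    }
    bool bmi2 = ((ebx >> 8) & 1u) != 0;  /* EBX bit 8: BMI2 */
    bool adx = ((ebx >> 19) & 1u) != 0;  /* EBX bit 19: ADX */
    return bmi2 && adx;
#else
    return false;
#endif
}

/* Hypothetical signature for a 4-limb field squaring routine. */
typedef void (*square_fn)(uint64_t out[4], const uint64_t in[4]);

/* Picks the ADX routine only when the CPU supports it; otherwise the
 * plain x86_64 fallback. Both arguments are supplied by the caller. */
static square_fn select_square(square_fn fast_adx, square_fn plain) {
    return cpu_has_bmi2_and_adx() ? fast_adx : plain;
}
```

A caller would run `select_square` once at start-up and cache the returned pointer, so the check does not sit on the hot path.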
Hi,
I am trying to run CryptOpt natively on Debian testing because reasons. I'd like to know whether you think this has a chance of working. The current state is that I think I have managed to get both AssemblyLine and CryptOpt to build and link. asmline runs, but running CryptOpt exits with a segmentation fault:
Last lines of ./CryptOpt output with DEBUG=1 --verbose:
executing cmd gcc -march=native -mtune=native -O3 -fPIC -shared -o /tmp/CryptOpt.cache/uczayywaxs/libcheckfunctions-s1678729657975-bfiat-p4149251.so /home/andreser/CryptOpt/dist/data/fiat-bridge/.cache/85ea43d3500c431ba0ae9e2f56985f9990aa34fa42fe048a10d26e52b40f2df2.c with opts {"shell":"/usr/bin/bash"}
./CryptOpt: line 31: 4149251 Segmentation fault PATH="$(realpath ./bins/node/bin):${PATH}" /usr/bin/env node "./dist/CryptOpt.js" "${@}"
gdb says:
Thread 1 "node" received signal SIGSEGV, Segmentation fault.
0x00007ffff7e25fd6 in ms_initialize () from /home/andreser/CryptOpt/node_modules/measuresuite/build/Release/measuresuite.node
(gdb) bt
#0 0x00007ffff7e25fd6 in ms_initialize ()
from /home/andreser/CryptOpt/node_modules/measuresuite/build/Release/measuresuite.node
#1 0x00007ffff7e25b1b in init ()
from /home/andreser/CryptOpt/node_modules/measuresuite/build/Release/measuresuite.node
#2 0x0000000000b10d7d in v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo const&) ()
#3 0x0000000000db0230 in v8::internal::MaybeHandle v8::internal::(anonymous namespace)::HandleApiCallHelper(v8::internal::Isolate*, v8::internal::Handle, v8::internal::Handle, v8::internal::Handle, v8::internal::Handle, v8::internal::BuiltinArguments) ()
#4 0x0000000000db176f in v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) ()
#5 0x00000000016ef579 in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit ()
#6 0x00000000016734d0 in Builtins_InterpreterEntryTrampoline ()
#7 0x00002e0bd55015b9 in ?? ()
#8 0x000015cb6c393811 in ?? ()
#9 0x0000000800000000 in ?? ()
#10 0x00002e0bd5501689 in ?? ()
#11 0x00003ebecb2fef29 in ?? ()
#12 0x0000000500000000 in ?? ()
#13 0x0000000100000000 in ?? ()
#14 0x0000000100000000 in ?? ()
#15 0x0000000100000000 in ?? ()
#16 0x0000000100000000 in ?? ()
#17 0x0000000500000000 in ?? ()
#18 0x00003ebecb2fef29 in ?? ()
#19 0x000015cb6c393811 in ?? ()
#20 0x0000064757026351 in ?? ()
#21 0x0000009400000000 in ?? ()
#22 0x000027d6971beb29 in ?? ()
#23 0x0000000000000008 in ?? ()
#24 0x00003a9153c02699 in ?? ()
#25 0x0000064757026351 in ?? ()
#26 0x00007fffffff9220 in ?? ()
#27 0x0000000001670e62 in Builtins_JSConstructStubGeneric ()
Backtrace stopped: frame did not save the PC
(gdb) disas $pc
Dump of assembler code for function ms_initialize:
0x00007ffff7e25f30 <+0>: push %rbp
0x00007ffff7e25f31 <+1>: xor %eax,%eax
0x00007ffff7e25f33 <+3>: mov %rsp,%rbp
0x00007ffff7e25f36 <+6>: push %r15
0x00007ffff7e25f38 <+8>: mov %ecx,%r15d
0x00007ffff7e25f3b <+11>: push %r14
0x00007ffff7e25f3d <+13>: mov %edx,%r14d
0x00007ffff7e25f40 <+16>: push %r13
0x00007ffff7e25f42 <+18>: mov %esi,%r13d
0x00007ffff7e25f45 <+21>: push %r12
0x00007ffff7e25f47 <+23>: mov %rdi,%r12
0x00007ffff7e25f4a <+26>: push %rbx
0x00007ffff7e25f4b <+27>: sub $0x8,%rsp
0x00007ffff7e25f4f <+31>: call 0x7ffff7e24400
0x00007ffff7e25f54 <+36>: mov %rax,(%r12)
0x00007ffff7e25f58 <+40>: test %rax,%rax
0x00007ffff7e25f5b <+43>: je 0x7ffff7e25f6f
0x00007ffff7e25f5d <+45>: mov %r13d,%esi
0x00007ffff7e25f60 <+48>: mov %rax,%rdi
0x00007ffff7e25f63 <+51>: mov %rax,%rbx
0x00007ffff7e25f66 <+54>: call 0x7ffff7e24580
0x00007ffff7e25f6b <+59>: test %eax,%eax
0x00007ffff7e25f6d <+61>: je 0x7ffff7e25f88
0x00007ffff7e25f6f <+63>: mov $0x1,%eax
0x00007ffff7e25f74 <+68>: add $0x8,%rsp
0x00007ffff7e25f78 <+72>: pop %rbx
0x00007ffff7e25f79 <+73>: pop %r12
0x00007ffff7e25f7b <+75>: pop %r13
0x00007ffff7e25f7d <+77>: pop %r14
0x00007ffff7e25f7f <+79>: pop %r15
0x00007ffff7e25f81 <+81>: pop %rbp
0x00007ffff7e25f82 <+82>: ret
0x00007ffff7e25f83 <+83>: nopl 0x0(%rax,%rax,1)
0x00007ffff7e25f88 <+88>: mov %r14d,%esi
0x00007ffff7e25f8b <+91>: mov %rbx,%rdi
0x00007ffff7e25f8e <+94>: call 0x7ffff7e24540
0x00007ffff7e25f93 <+99>: test %eax,%eax
0x00007ffff7e25f95 <+101>: jne 0x7ffff7e25f6f
0x00007ffff7e25f97 <+103>: mov %r15d,%esi
0x00007ffff7e25f9a <+106>: mov %rbx,%rdi
0x00007ffff7e25f9d <+109>: call 0x7ffff7e242b0
0x00007ffff7e25fa2 <+114>: test %eax,%eax
0x00007ffff7e25fa4 <+116>: jne 0x7ffff7e25f6f
0x00007ffff7e25fa6 <+118>: mov %rbx,%rdi
0x00007ffff7e25fa9 <+121>: call 0x7ffff7e24460
0x00007ffff7e25fae <+126>: test %eax,%eax
0x00007ffff7e25fb0 <+128>: jne 0x7ffff7e25f6f
0x00007ffff7e25fb2 <+130>: mov %rbx,%rdi
0x00007ffff7e25fb5 <+133>: call 0x7ffff7e241f0
0x00007ffff7e25fba <+138>: test %eax,%eax
0x00007ffff7e25fbc <+140>: jne 0x7ffff7e25f6f
0x00007ffff7e25fbe <+142>: mov %rbx,%rdi
0x00007ffff7e25fc1 <+145>: call 0x7ffff7e246d0
0x00007ffff7e25fc6 <+150>: test %eax,%eax
0x00007ffff7e25fc8 <+152>: jne 0x7ffff7e25f6f
0x00007ffff7e25fca <+154>: mov %rbx,%rdi
0x00007ffff7e25fcd <+157>: call 0x7ffff7e24560
0x00007ffff7e25fd2 <+162>: test %eax,%eax
0x00007ffff7e25fd4 <+164>: jne 0x7ffff7e25f6f
=> 0x00007ffff7e25fd6 <+166>: movl $0x0,0x88(%rbx)
0x00007ffff7e25fe0 <+176>: jmp 0x7ffff7e25f74
rax 0x0 0
rbx 0x80eefa3fe0 553765191648
rcx 0x7ffff7b20733 140737349027635
rdx 0xffffffffffffff00 -256
rsi 0x1000 4096
rdi 0x0 0
rbp 0x7fffffff8e60 0x7fffffff8e60
rsp 0x7fffffff8e30 0x7fffffff8e30
r8 0xfffffff9 4294967289
r9 0x0 0
r10 0x1 1
r11 0x246 582
r12 0x7fffffff8e88 140737488326280
r13 0x5 5
r14 0x1 1
r15 0x1 1
rip 0x7ffff7e25fd6 0x7ffff7e25fd6
eflags 0x10246 [ PF ZF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
-----[ Native Stacktraces ]-----
[pc=0x00007ffff7fb569e, sp=0x00007fffffff8860] in segfault_handler(int)+0x4e
[pc=0x00007ffff7a5af90, sp=0x00007fffffff8880] in __sigaction+0x40
[pc=0x00007ffff7e25fd6, sp=0x00007fffffff8e30] in ms_initialize+0xa6
[pc=0x00007ffff7e25b1b, sp=0x00007fffffff8e70] in init+0xab
[pc=0x0000000000b10d7d, sp=0x00007fffffff8ed0] in v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo const&)+0x7d
[pc=0x0000000000db0230, sp=0x00007fffffff8f30] in v8::internal::MaybeHandle v8::internal::(anonymous namespace)::HandleApiCallHelper(v8::internal::Isolate*, v8::internal::Handle, v8::internal::Handle, v8::internal::Handle, v8::internal::Handle, v8::internal::BuiltinArguments)+0x380
[pc=0x0000000000db176f, sp=0x00007fffffff9070] in v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*)+0xaf
[pc=0x00000000016ef579, sp=0x00007fffffff90e0] in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit+0x39
[pc=0x00000000016734d0, sp=0x00007fffffff90f0] in Builtins_InterpreterEntryTrampoline+0xd0
[pc=0x0000000001670e62, sp=0x00007fffffff9100] in Builtins_JSConstructStubGeneric+0x122
[pc=0x000000000178d3ff, sp=0x00007fffffff9110] in Builtins_ConstructHandler+0x2bf
[pc=0x00000000016734d0, sp=0x00007fffffff9120] in Builtins_InterpreterEntryTrampoline+0xd0
[pc=0x00000000016734d0, sp=0x00007fffffff9130] in Builtins_InterpreterEntryTrampoline+0xd0
[pc=0x00000000016734d0, sp=0x00007fffffff9140] in Builtins_InterpreterEntryTrampoline+0xd0
[pc=0x0000000001670e62, sp=0x00007fffffff9150] in Builtins_JSConstructStubGeneric+0x122
[pc=0x000000000178d3ff, sp=0x00007fffffff9160] in Builtins_ConstructHandler+0x2bf
[pc=0x00000000016734d0, sp=0x00007fffffff9170] in Builtins_InterpreterEntryTrampoline+0xd0
[pc=0x00000000016f96d7, sp=0x00007fffffff9180] in Builtins_AsyncModuleEvaluate+0x97
[pc=0x0000000001671adc, sp=0x00007fffffff9190] in Builtins_JSEntryTrampoline+0x5c
[pc=0x0000000001671803, sp=0x00007fffffff91a0] in Builtins_JSEntry+0x83
[pc=0x0000000000e9293e, sp=0x00007fffffff91b0] in v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&)+0x13e
[pc=0x0000000000e93a10, sp=0x00007fffffff9a50] in v8::internal::(anonymous namespace)::InvokeWithTryCatch(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [clone .constprop.84]+0x50
[pc=0x0000000000e94012, sp=0x00007fffffff9ab0] in v8::internal::Execution::TryCall(v8::internal::Isolate*, v8::internal::Handle, v8::internal::Handle, int, v8::internal::Handle*, v8::internal::Execution::MessageHandling, v8::internal::MaybeHandle*, bool)+0x62
[pc=0x00000000011c3611, sp=0x00007fffffff9b40] in v8::internal::SourceTextModule::InnerExecuteAsyncModule(v8::internal::Isolate*, v8::internal::Handle, v8::internal::Handle)+0xd1
[pc=0x00000000011c38df, sp=0x00007fffffff9ba0] in v8::internal::SourceTextModule::ExecuteAsyncModule(v8::internal::Isolate*, v8::internal::Handle)+0x1af
[pc=0x00000000011c5785, sp=0x00007fffffff9c10] in v8::internal::SourceTextModule::AsyncModuleExecutionFulfilled(v8::internal::Isolate*, v8::internal::Handle)+0x135
[pc=0x0000000000dbdc2b, sp=0x00007fffffff9ce0] in v8::internal::Builtin_CallAsyncModuleFulfilled(int, unsigned long*, v8::internal::Isolate*)+0x3b
[pc=0x00000000016ef579, sp=0x00007fffffff9d20] in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit+0x39
[pc=0x000000000173fa71, sp=0x00007fffffff9d30] in Builtins_PromiseFulfillReactionJob+0x31
[pc=0x000000000169893b, sp=0x00007fffffff9d40] in Builtins_RunMicrotasks+0x27b
[pc=0x0000000001671a03, sp=0x00007fffffff9d50] in Builtins_JSRunMicrotasksEntry+0x83
[pc=0x0000000000e92e0a, sp=0x00007fffffff9d60] in v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&)+0x60a
[pc=0x0000000000e93a10, sp=0x00007fffffff9fb0] in v8::internal::(anonymous namespace)::InvokeWithTryCatch(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) [clone .constprop.84]+0x50
[pc=0x0000000000e9411a, sp=0x00007fffffffa010] in v8::internal::Execution::TryRunMicrotasks(v8::internal::Isolate*, v8::internal::MicrotaskQueue*, v8::internal::MaybeHandle*)+0x5a
[pc=0x0000000000ec161a, sp=0x00007fffffffa080] in v8::internal::MicrotaskQueue::RunMicrotasks(v8::internal::Isolate*) [clone .part.50]+0x8a
[pc=0x0000000000ec19f2, sp=0x00007fffffffa130] in v8::internal::MicrotaskQueue::PerformCheckpoint(v8::Isolate*)+0x42
[pc=0x0000000000aaec29, sp=0x00007fffffffa160] in node::InternalCallbackScope::Close()+0x109
[pc=0x0000000000aaee01, sp=0x00007fffffffa1b0] in node::InternalCallbackScope::~InternalCallbackScope()+0x11
[pc=0x0000000000b70e5b, sp=0x00007fffffffa1d0] in node::fs::FileHandle::CloseReq::Resolve()+0x9b
[pc=0x0000000000b73420, sp=0x00007fffffffa250] in node::fs::FileHandle::ClosePromise()::{lambda(uv_fs_s*)#1}::_FUN(uv_fs_s*)+0x230
[pc=0x0000000000b681f4, sp=0x00007fffffffa2d0] in node::MakeLibuvRequestCallback::Wrapper(uv_fs_s*)+0x54
[pc=0x000000000164e11d, sp=0x00007fffffffa300] in uv__work_done+0x9d
[pc=0x0000000001652906, sp=0x00007fffffffa350] in uv__async_io.part.1+0x126
[pc=0x0000000001664e44, sp=0x00007fffffffa7a0] in uv__io_poll+0x494
[pc=0x000000000165326e, sp=0x00007fffffffd8b0] in uv_run+0x14e
[pc=0x0000000000aafa2d, sp=0x00007fffffffd910] in node::SpinEventLoop(node::Environment*)+0x14d
[pc=0x0000000000bb11f4, sp=0x00007fffffffd9c0] in node::NodeMainInstance::Run()+0xf4
[pc=0x0000000000b26c44, sp=0x00007fffffffda50] in node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*)+0xb4
[pc=0x0000000000b2a83f, sp=0x00007fffffffdb00] in node::Start(int, char**)+0x2df
[pc=0x00007ffff7a4618a, sp=0x00007fffffffdb70] in __libc_init_first+0x8a
[pc=0x00007ffff7a46245, sp=0x00007fffffffdc10] in __libc_start_main+0x85
[pc=0x0000000000aad7ee, sp=0x00007fffffffdc60] in _start+0x2e
---[ V8 JavaScript Stacktraces ]---
at Measuresuite (file:///home/andreser/CryptOpt/node_modules/measuresuite/ts/dist/index.js:62:16)
at createMS (file:///home/andreser/CryptOpt/dist/optimizer.helper.class-13535478.js:4108:23)
at init (file:///home/andreser/CryptOpt/dist/optimizer.helper.class-13535478.js:4129:12)
at Optimizer (file:///home/andreser/CryptOpt/dist/CryptOpt.js:4535:46)
at (null) (file:///home/andreser/CryptOpt/dist/CryptOpt.js:4856:20)
$ node --version
v18.13.0
It'd be neat if we could have CryptOpt do optimization on assembly as it does, but then print the scheduled and register-allocated code as C to see how much worse C compilers do when given the easiest possible task. The code generated in this manner might also be preferable to raw fiat-crypto output in cases where a C implementation needs to be deployed. I don't think we should make any effort to use standard intrinsics; rather, let's do something like fiat-crypto, where we just define C functions that compute the same values as the supported assembly instructions.
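As a rough illustration of what "C functions that compute the same values as assembly instructions" could look like, here is a minimal sketch. The names `insn_mulx_u64` and `insn_addcarryx_u64` are hypothetical and are not taken from CryptOpt or fiat-crypto. The code assumes a compiler that provides `unsigned __int128` (a GCC/Clang extension), and it models only the computed values, not the flag behavior of the real instructions.

```c
#include <stdint.h>

/* Value-level model of MULX: full 64x64 -> 128-bit product,
 * split into low and high 64-bit halves. */
static inline void insn_mulx_u64(uint64_t *lo, uint64_t *hi,
                                 uint64_t a, uint64_t b) {
    unsigned __int128 p = (unsigned __int128)a * b;
    *lo = (uint64_t)p;
    *hi = (uint64_t)(p >> 64);
}

/* Value-level model of an add-with-carry (ADC/ADCX/ADOX):
 * out = low 64 bits of a + b + carry_in, carry_out is 0 or 1.
 * carry_in is expected to be 0 or 1. */
static inline void insn_addcarryx_u64(uint64_t *out, uint8_t *carry_out,
                                      uint8_t carry_in,
                                      uint64_t a, uint64_t b) {
    unsigned __int128 s = (unsigned __int128)a + b + carry_in;
    *out = (uint64_t)s;
    *carry_out = (uint8_t)(s >> 64);
}
```

A C printer for scheduled code could then emit one such call per instruction, in the order chosen by the optimizer, and leave instruction selection to the C compiler.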
On a clean clone I got the following:
andreser@andreser ~/CryptOpt % make -j all
mkdir -p ./bins
Installing dependencies
curl -L https://nodejs.org/dist/v20.5.1/node-v20.5.1-linux-x64.tar.xz | tar --extract --xz --directory ./bins
cd src/bridge/fiat-bridge/data && sha256sum unsaturated_solinas word_by_word_montgomery dettman_multiplication solinas_reduction > sha256sums
/bin/sh: line 1: npm: command not found
make: *** [Makefile:37: node_modules] Error 127
make: *** Waiting for unfinished jobs....
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 23.9M 100 23.9M 0 0 12.1M 0 0:00:01 0:00:01 --:--:-- 12.1M
mv -f ./bins/node-v20.5.1-linux-x64 "/usr/local/google/home/andreser/CryptOpt/bins/node"
I was following https://github.com/0xADE1A1DE/CryptOpt/blob/main/INSTALL.md#bare-metal
rbp-based stack unwinding is a requirement for stack-sampling profiling in security-conscious production deployments where debug information cannot be evaluated during sampling, e.g. Google-wide profiling.
I imagine it would be pretty easy to just get CryptOpt to entirely ignore the rbp register by ripping it out from the list of registers. It'd be nice to add a command-line flag that achieves this behavior. Perhaps we could also have an option to update rbp as in -fno-omit-frame-pointer so time spent in CryptOpt-generated routines does not get misattributed during profiling.
Again, I'd be happy to do the work given the go-ahead and high-level guidance.
When using the red zone, the stack is shifted by 0x80. With this switch, the stack will start at 0x0 (needed on non-Unix systems, for example).
CryptOpt code contains more spills than comparable handwritten code, and sometimes more than compiler-generated code. It is not clear whether this is important for performance, but I do have an implementation where CryptOpt is not winning over naive handwritten assembly and that exhibits this pattern.
What if register allocation first identified uses that can be unspilled using x86 memory operands, "allocated" to memory the temporaries for which all uses are of this form, and then allocated the remaining variables into registers based on the remaining uses? In particular, I believe this generic heuristic would do a decent job on the adx-multiplication pattern of mulx a0 b0; mulx a0 b1; mulx a0 b2 ... ; mulx a1 b0; ...: all b can stay in memory and only one a needs to be live at a time. This suggestion is based on the hypothesis that memory-operand access is cheaper than a separate load even if that load is amortized over multiple instructions; I am not sure whether this is always true.
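The first step of this proposal, finding the variables whose every use can be a memory operand, can be sketched as a single pass over the uses. This is an illustrative model only and does not reflect CryptOpt's actual register allocator. The `struct use` layout and the function name `classify_memory_resident` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* One operand use in the instruction stream (hypothetical model). */
struct use {
    size_t var;   /* index of the variable being read */
    bool mem_ok;  /* true if this operand slot accepts a memory operand */
};

/* Marks in_memory[v] = true for every variable v whose uses are all
 * memory-operand capable. Variables with at least one register-only use
 * are marked false and are left for ordinary register allocation.
 * Uses that refer to an out-of-range variable index are ignored. */
static void classify_memory_resident(const struct use *uses, size_t n_uses,
                                     bool *in_memory, size_t n_vars) {
    for (size_t v = 0; v < n_vars; v++) {
        in_memory[v] = true;
    }
    for (size_t i = 0; i < n_uses; i++) {
        if (uses[i].var < n_vars && !uses[i].mem_ok) {
            in_memory[uses[i].var] = false;
        }
    }
}
```

For the mulx pattern, each `a` is the implicit rdx operand (so `mem_ok` is false), while each `b` is the r/m64 source (so `mem_ok` is true). The pass then keeps every `b` in memory and leaves only the `a` values competing for registers.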
Another guess for a tweak would be spilling and restoring callee-saved registers using push/pop at the very beginning and end of the function.
Have you tried something like these options already?
Hello,
I followed the Docker installation step.
It reported that my hardware is incompatible with the current requirements (although I believe it is compatible).
Then, I worked around this by directly executing ./dist/CryptOpt.js with node. It failed with an illegal hardware instruction.
My Hardware:
Note:
./CryptOpt --bridge manual --cFile ./p256sqr2.c --jsonFile ./p256sqr2.json --seed 2
Start on brg/symbolname>>manual/p256_sqr<< >>without proofing correct<< on cpu >>Intel(R) Xeon(R) CPU @ 2.80GHz<< writing results to>>/home/andreser/CryptOpt/results<< with seed >>4156876406132912<< for >> 200<< evaluations against CC>>gcc -march=native -mtune=native -O3<< with cycle goal>>10000<< for each measurement on host>>andreser<< with pid>>3079059<< using counter>>RDTSCP<< framePointer=>>omit<< memoryConstraints>>none<< starting @>>2023-09-11T19:23:01.677Z<<
{
curOperation: {
datatype: 'u64',
name: [ 'x59', '_' ],
operation: 'addcarryx',
arguments: [ 'x58', 'x38', 'x40' ],
decisions: {
di_choose_arg: [Array],
di_spill_location: [Array],
di_flag: [Array],
di_handle_flags_kk: [Array],
di_choose_imm: [Array]
},
decisionsHot: []
},
e: Error: tried to spill OF, but didnt work. TSNH.
at Jt (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:24964)
at pe (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:37026)
at $e (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:33719)
at Ae (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:51131)
at file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:61683
at Timeout._onTimeout (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:62312)
at listOnTimeout (node:internal/timers:573:17)
at process.processTimers (node:internal/timers:514:7),
allocs: {
'0x100000000': { datatype: 'u64', store: 'r10' },
arg1: { datatype: 'u64[4]', store: 'rsi' },
'calSv-r12': { datatype: 'u64', store: '[ rsp - 0x70 ]' },
'calSv-r13': { datatype: 'u64', store: '[ rsp - 0x68 ]' },
'calSv-r14': { datatype: 'u64', store: '[ rsp - 0x60 ]' },
'calSv-r15': { datatype: 'u64', store: '[ rsp - 0x58 ]' },
'calSv-rbp': { datatype: 'u64', store: '[ rsp - 0x78 ]' },
'calSv-rbx': { datatype: 'u64', store: '[ rsp - 0x80 ]' },
out1: { datatype: 'u64[4]', store: '[ rsp - 0x50 ]' },
x106: { datatype: 'u64', store: 'r15' },
x107: { datatype: 'u64', store: '[ rsp - 0x10 ]' },
x22: { datatype: 'u64', store: '[ rsp - 0x18 ]' },
x38: { datatype: 'u1', store: 'OF' },
x39: { datatype: 'u64', store: '[ rsp - 0x40 ]' },
x40: { datatype: 'u64', store: '[ rsp - 0x48 ]' },
x41: { datatype: 'u64', store: '[ rsp - 0x20 ]' },
x42: { datatype: 'u64', store: '[ rsp - 0x38 ]' },
x51: { datatype: 'u64', store: 'rcx' },
x53: { datatype: 'u64', store: 'r9' },
x55: { datatype: 'u64', store: 'r12' },
x57: { datatype: 'u64', store: 'r11' },
x58: { datatype: 'u1', store: 'CF' },
x66: { datatype: 'u64', store: 'rbp' },
x68: { datatype: 'u1', store: 'dil' },
x69: { datatype: 'u64', store: 'rdx' },
x70: { datatype: 'u1', store: 'r8b' },
x71: { datatype: 'u64', store: '[ rsp - 0x30 ]' },
x72: { datatype: 'u64', store: '[ rsp - 0x28 ]' },
x77: { datatype: 'u64', store: 'rbx' },
x78: { datatype: 'u64', store: 'r13' },
x96: { datatype: 'u64', store: 'r14' },
x97: { datatype: 'u64', store: 'rax' }
},
pres: [
'',
';should save OF(x38) but as it has not dependents, we just ignore it.'
],
failfile: '/home/andreser/CryptOpt/results/lastFail.asm'
}
{
curOperation: {
datatype: 'u64',
name: [ 'x59', '_' ],
operation: 'addcarryx',
arguments: [ 'x58', 'x38', 'x40' ],
decisions: {
di_choose_arg: [Array],
di_spill_location: [Array],
di_flag: [Array],
di_handle_flags_kk: [Array],
di_choose_imm: [Array]
},
decisionsHot: []
},
e: Error: tried to spill OF, but didnt work. TSNH.
at Jt (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:24964)
at pe (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:37026)
at $e (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:33719)
at Ae (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:51131)
at file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:61683
at Timeout._onTimeout (file:///home/andreser/CryptOpt/dist/CryptOpt.js:1:62312)
at listOnTimeout (node:internal/timers:573:17)
at process.processTimers (node:internal/timers:514:7),
allocs: {
'0x100000000': { datatype: 'u64', store: 'r10' },
arg1: { datatype: 'u64[4]', store: 'rsi' },
'calSv-r12': { datatype: 'u64', store: '[ rsp - 0x70 ]' },
'calSv-r13': { datatype: 'u64', store: '[ rsp - 0x68 ]' },
'calSv-r14': { datatype: 'u64', store: '[ rsp - 0x60 ]' },
'calSv-r15': { datatype: 'u64', store: '[ rsp - 0x58 ]' },
'calSv-rbp': { datatype: 'u64', store: '[ rsp - 0x78 ]' },
'calSv-rbx': { datatype: 'u64', store: '[ rsp - 0x80 ]' },
out1: { datatype: 'u64[4]', store: '[ rsp - 0x50 ]' },
x106: { datatype: 'u64', store: 'r15' },
x107: { datatype: 'u64', store: '[ rsp - 0x10 ]' },
x22: { datatype: 'u64', store: '[ rsp - 0x18 ]' },
x38: { datatype: 'u1', store: 'OF' },
x39: { datatype: 'u64', store: '[ rsp - 0x40 ]' },
x40: { datatype: 'u64', store: '[ rsp - 0x48 ]' },
x41: { datatype: 'u64', store: '[ rsp - 0x20 ]' },
x42: { datatype: 'u64', store: '[ rsp - 0x38 ]' },
x51: { datatype: 'u64', store: 'rcx' },
x53: { datatype: 'u64', store: 'r9' },
x55: { datatype: 'u64', store: 'r12' },
x57: { datatype: 'u64', store: 'r11' },
x58: { datatype: 'u1', store: 'CF' },
x66: { datatype: 'u64', store: 'rbp' },
x68: { datatype: 'u1', store: 'dil' },
x69: { datatype: 'u64', store: 'rdx' },
x70: { datatype: 'u1', store: 'r8b' },
x71: { datatype: 'u64', store: '[ rsp - 0x30 ]' },
x72: { datatype: 'u64', store: '[ rsp - 0x28 ]' },
x77: { datatype: 'u64', store: 'rbx' },
x78: { datatype: 'u64', store: 'r13' },
x96: { datatype: 'u64', store: 'r14' },
x97: { datatype: 'u64', store: 'rax' }
},
pres: [
'',
';should save OF(x38) but as it has not dependents, we just ignore it.'
],
failfile: '/home/andreser/CryptOpt/results/lastFail.asm'
}
Done with code: 1 (statefile: /home/andreser/CryptOpt/results/manual/p256_sqr/seed0000000000000002.json)
Wrote RES/manual/p256_sqr/seed0000000000000002.json exiting.
It is possible that the input is silly, I haven't proven anything about it yet.
re bitcoin-core/secp256k1#1329 (comment)
- r and a may point to the same object, but neither can be equal to b. (...)
*/
static void secp256k1_fe_mul(secp256k1_fe *r, const secp256k1_fe *a, const secp256k1_fe *b);
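The aliasing contract in the comment above can be checked at run time, for example in debug builds. The sketch below is a minimal illustration and not secp256k1 code. The type `fe_t` is a hypothetical stand-in for `secp256k1_fe`, and `fe_same_or_disjoint` is an additional, stricter check assumed here for illustration (it rejects partially overlapping `r` and `a`).

```c
#include <stdint.h>

/* Hypothetical stand-in for secp256k1_fe; the real layout may differ. */
typedef struct { uint64_t n[5]; } fe_t;

/* Documented contract: r and a may point to the same object,
 * but neither may be equal to b. Returns 1 if the contract holds. */
static int fe_mul_contract_ok(const fe_t *r, const fe_t *a, const fe_t *b) {
    return r != b && a != b;
}

/* Stricter, assumed check: r and a are either the same object or their
 * storage does not overlap at all. Addresses are compared as integers
 * because relational comparison of unrelated pointers is not defined in C. */
static int fe_same_or_disjoint(const fe_t *r, const fe_t *a) {
    uintptr_t pr = (uintptr_t)r;
    uintptr_t pa = (uintptr_t)a;
    if (pr == pa) {
        return 1;
    }
    return pr + sizeof(fe_t) <= pa || pa + sizeof(fe_t) <= pr;
}
```

A wrapper around the multiplication routine could assert both predicates on entry, so a caller that violates the contract fails loudly in testing and does not silently compute a wrong product.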
The idea is that we could have three enforcement levels:
- a[0] after writing r[0] is fine.
- r and a can be equal, but must be aligned. Then, writing to r[0] and subsequently reading from a[1] is valid. Reading a[0] after writing r[0] would be invalid.
A cartoon image of CryptOpt I've had is that it rejects mutations that make the program slower. I know that this is idealistic, even the optimization trace in the paper goes down for a bit. But how is it that continuing to try more mutations seems to have a real chance of making the cycle counts in the CryptOpt output go up? I understand that there's some chance that a mutation will misleadingly appear attractive due to measurement noise, but looking at the above log I can tell at a glance that a previous program would likely perform better. What is going on here -- is the theory that something changes about the machine to make both versions run slower, or does the wrong one just get picked sometimes?