lowrisc / ot-sca
Side-channel analysis setup for OpenTitan
License: Apache License 2.0
Implement a mechanism to upload and download bitstreams and traces to/from GCP buckets. Additional requirements:
The variable to distinguish the board (CW310 vs. CW305) is currently just a string, but this should be an enum. See comment.
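A minimal sketch of such an enum (names and values are illustrative, not the actual ones used in the capture scripts); subclassing str keeps existing string comparisons working during the migration:

```python
from enum import Enum

class BoardType(str, Enum):
    """Supported ChipWhisperer target boards (illustrative names)."""
    CW305 = "cw305"
    CW310 = "cw310"

# Subclassing str means call sites that still compare against the raw
# string (e.g. board == "cw310") keep working during the migration.
def is_cw310(board: BoardType) -> bool:
    return board is BoardType.CW310
```

Parsing the existing config string with BoardType("cw310") also rejects typos with a ValueError, which a plain string comparison silently misses.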
Our capture scripts still use the CW-Lite scope. We should switch to CW-Husky which has higher capture performance.
In the AES capture script, the "aes_fvsr_key" config option is misspelled as "aes_fsvr_key".
Integrate TVLA testing for the AES core using ChipWhisperer's test suite as documented here: https://cwtvla.readthedocs.io/en/latest/
OT PR 19380 updated ecc384_serial.c and p384_ecdsa_sca.s to enable trace capturing with the latest setup. Binaries should be updated.
Currently, the notes stored in the trace database need to be added in the Python script. In a future version of the capture script, the notes could be passed as an argument when executing the script or during the execution in the CLI.
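A hedged sketch of the CLI variant with argparse (flag name and prompt text are hypothetical, not the capture script's actual interface):

```python
import argparse

def parse_notes(argv=None):
    """Read trace-database notes from a CLI flag, falling back to an
    interactive prompt when the flag is omitted (sketch only)."""
    parser = argparse.ArgumentParser(description="capture script (sketch)")
    parser.add_argument("--notes", default=None,
                        help="free-form notes to store in the trace database")
    args = parser.parse_args(argv)
    if args.notes is None:
        # Interactive fallback during execution in the CLI.
        args.notes = input("Notes for the trace database: ")
    return args.notes
```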
With PR #95, we can now load histograms and use them for the general test. This is done automatically when a PR is created, but it does not cover all functionality of the tvla.py script. We should also add an automatic test which takes power measurements, computes their t-values, and compares them with the expected values.
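Such a test could compute the t-values with Welch's formula, which is the statistic TVLA thresholds at +/-4.5, and compare against stored reference values. A minimal stdlib-only sketch (function names and the per-sample trace layout are assumptions, not tvla.py's actual interface):

```python
import math
from statistics import mean, variance

def welch_t(fixed, random_):
    """Welch's t-statistic between two sets of leakage values at one
    sample point; TVLA flags |t| > 4.5 as significant leakage."""
    n1, n2 = len(fixed), len(random_)
    m1, m2 = mean(fixed), mean(random_)
    v1, v2 = variance(fixed), variance(random_)  # sample variance (n-1)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

def t_values(fixed_traces, random_traces):
    """One t-value per sample point across two groups of traces
    (each trace is a sequence of samples)."""
    n_samples = len(fixed_traces[0])
    return [welch_t([t[s] for t in fixed_traces],
                    [t[s] for t in random_traces])
            for s in range(n_samples)]
```

The CI check would then assert that every computed value is within a small tolerance of the stored expected value.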
In addition, there may be other functionality of the tvla.py script that could be tested. Maybe @vrozic, @vogelpi and @andreaskurth have some ideas?
All of us now have access to the Husky scope, and we almost exclusively use Husky instead of CW-Lite due to the better ADC resolution, higher sampling frequency, and higher capture rate (larger on-chip buffer). Therefore, we should consider deprecating the CW-Lite support to clean up the repository.
Please see #36 for context and newaetech/chipwhisperer#306 for what needs to be done.
Implement firmware bootstrap using the native target SPI interface, or the capture board SPI connection available in the target connector.
This removes the dependency on the SPI FTDI cable currently used to load the firmware on the device.
SPIFLASH_RAW_BUFFER_SIZE (sw/device/boot_rom/spiflash_frame.h) increased from 1024 to 2048 in lowRISC/opentitan#5068. I took a quick look and, unless I missed something, this shouldn't break our workflow, but we should increase the frame size for the sake of efficiency. FYI @bilgiday
At the moment, the measurement offset in capture configuration files is specified in samples. It probably makes more sense to use clock cycles of the target as the unit of measurement.
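The conversion is a one-liner once the target clock and the scope sampling rate are known; a sketch with illustrative frequencies (the capture scripts' actual config keys and frequencies may differ):

```python
def cycles_to_samples(offset_cycles, target_freq_hz, sampling_freq_hz):
    """Convert a trigger offset given in target clock cycles into scope
    samples, so config files can specify offsets in cycles."""
    samples_per_cycle = sampling_freq_hz / target_freq_hz
    return round(offset_cycles * samples_per_cycle)
```

For example, with a 100 MHz scope and a 10 MHz target, an offset of 5 target cycles corresponds to 50 samples.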
We currently hold all traces in RAM after capture / during analysis and write to CW projects, i.e. numpy files. Larger trace sets require iterating through the traces sequentially to reduce RAM consumption.
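One way to bound RAM is a chunk-wise reader that streams traces from disk; a stdlib-only sketch assuming raw float32 samples stored back to back (the real CW project format differs, so the layout and helper name here are illustrative only):

```python
import array

def iter_trace_chunks(path, samples_per_trace, chunk_traces=1000):
    """Yield lists of traces (each an array of float32 samples) without
    loading the whole file into RAM at once."""
    bytes_per_trace = samples_per_trace * 4  # float32
    with open(path, "rb") as f:
        while True:
            buf = f.read(bytes_per_trace * chunk_traces)
            if not buf:
                break
            flat = array.array("f", buf)
            yield [flat[i * samples_per_trace:(i + 1) * samples_per_trace]
                   for i in range(len(flat) // samples_per_trace)]
```

Analysis code can then fold over the chunks (e.g. accumulating sums and sums of squares for t-tests) instead of materializing all traces.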
While writing the code for the OTBN capture and helping to debug the OTBN batch mode, @wettermo and I stumbled across the following lines (lines 175 to 187 in 111c140).
Can someone explain to me why self._scope.clock.clkgen_freq = 100000000 is hardcoded there? Is this just a leftover, or is there a particular reason? I'd have expected self._scope.clock.clkgen_freq = pll_frequency, like in line 130 in 111c140.
While writing this, I just discovered that @bilgiday had changed this behavior in cw/util/device.py in f51c6ec. So, may I assume we could change this also in cw_segmented.py, or is this a bad idea for some reason?
In python-requirements.txt we do not specify the version of bokeh. In one of the recent versions, plot_width was replaced by width. (The changelog says it was removed in the docs in version 2.4; however, I only encountered the problem with release 3.0.3, not with 2.4.3.) This causes some errors in line 17 in 7455181.
When changing the parameter, I encountered bokeh.core.serialization.SerializationError: can't serialize <class 'range'>, which is also discussed in bokeh/bokeh#12313 (comment). So it looks like there are more things to deal with when upgrading to 3.0.3.
What's the best way to fix this issue? Pin the bokeh version, or fix it and require a bokeh update?
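One immediate option (version numbers taken from the issue above) is to pin the last known-good release in python-requirements.txt until the plotting code is ported:

```
# python-requirements.txt
bokeh==2.4.3
```

Porting to 3.x would then mean replacing plot_width with width and resolving the range serialization error in one go.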
Currently, only the path to the used binary is stored into the trace database. In a future version of the capture script, the binary itself could be stored in the database.
On the CW310, the detection whether a bitstream has been programmed previously doesn't work. The bitstream is always programmed which takes around 15 seconds due to the bigger FPGA.
This is to remind me to have a look at #109 (comment)
PR #82 added the corresponding target binaries to support the aes-fvsr-key-batch command. For most of us this seems to work just fine but I saw lots of errors like:
Fixed key: b'811e3731b0120a7842781e22b25cddf9'
Connected to ChipWhisperer (num_samples: 1200, num_samples_actual: 1200, num_segments_actual: 1)
Capturing: 16%|███▋ | 162/1000 [00:00<00:01, 492.97 traces/s]
Traceback (most recent call last):
File "./capture.py", line 771, in <module>
app()
...
File "./capture.py", line 502, in aes_fvsr_key_batch
capture_aes_fvsr_key_batch(
File "./capture.py", line 460, in capture_aes_fvsr_key_batch
check_ciphertext(ot, expected_last_ciphertext, 4)
File "./capture.py", line 221, in check_ciphertext
assert actual_last_ciphertext == expected_last_ciphertext[0:ciphertext_len], (
AssertionError: Incorrect encryption result!
actual: bytearray(b'^\xdc\xe8{')
expected: [ 85 154 226 3 30 12 31 147 54 197 151 35 4 134 80 98]
In words, there are sometimes mismatches between the expected and received ciphertexts. Sometimes the failure occurs for the first batch, sometimes for a later batch, and sometimes it doesn't occur at all. The failures seem to depend on timing (adding a sleep command on the Python side seems to help), Husky firmware (the latest firmware seems to be more affected), and maybe also the USB connection setup (docking station, hub, laptop directly).
We should root-cause and fix the problem. Otherwise we can't reliably do long-running captures. Imagine we collect 10 million traces and get such a failure after 2 hours - all traces would be lost.
Riscure has made a trs library available here: https://github.com/Riscure/python-trsfile. Implement a script to interoperate with ChipWhisperer traces.
Currently, the device SW sets the trigger HI/LO and in HW this is AND-ed with the AES !IDLE signal. This generates tight trigger HI/LO envelopes around the AES computation. For the longer term, we need to change this behaviour and have the trigger purely controlled by SW. TODO: Change this and test.
Similar to AES, KMAC and SHA3 we should have a golden model running in parallel with the target to verify the produced signature on the host side.
As the old capture.py file is not working anymore due to the repo restructure, we need to implement the KMAC capture.
Please use capture_aes.py as a boilerplate.
Currently, we have configuration files in ci/ and also in capture/. To simplify the repository, there should only be one location for the configs.
Currently, the capture starts generating plaintexts and keys directly from PRNG seed=0 (using a Mersenne twister PRNG). It is split into two phases: execute, and generate the next batch of keys and plaintexts in memory. The device side could be improved:
- the sample_fixed value (sample_fixed = 0). This is easy when using a SW AES instead of the PRNG.

In #54 we added support for KMAC using pyXKCP, which uses some vendor C sources. We did that because, at that point in time, there was no good Python KMAC implementation available. However, this adds some C sources and some cffi scripting. In the meantime, kmac128 and kmac256 were added to pycryptodome v3.12.0. Do we want to streamline our Python flow? This has, of course, a very low priority, but it could be a good first issue for someone else.
Currently, data capture, target and analysis scripts are stored in the same directory. Consider a top level view similar to the following:
tree .
.
├── analysis (models, pre/post processing libraries)
├── attacks (mountable attacks)
├── capture (data capture)
│ ├── cw
│ └── picoscope
├── obj (access to precompiled bitstreams, etc, TBD)
├── targets (target boards)
│ └── cw305
├── tests (integration tests, and end-to-end testing)
└── utils (utility libraries, e.g. cw to trs conversion scripts)
At the moment, binaries built for the CW310 board from OpenTitan master don't seem to work. I am not 100% sure what the root cause for this is. I suspect the main issue is that we changed the clock frequency for the CW310 some time ago to speed up TLT (see this PR lowRISC/opentitan#19479 or specifically commit lowRISC/opentitan@162cdab).
As a temporary workaround, we can use the ot-sca_2023-07-15_dd8b709_cw310 tag on my fork as a base (this is when I last updated CW310 bitstreams and binary): https://github.com/vogelpi/opentitan/releases/tag/ot-sca_2023-07-15_dd8b709_cw310
As discussed with @vrozic, currently there is no CI job for testing the container. To ensure that future changes to the container infrastructure keep working, adding a new CI job could be useful.
Test capture script #150 uncovered a problem with the naming of some of the capture commands and arguments. At the moment we don't use a consistent naming strategy, which makes it complicated to automate the testing. Since the repository is likely to be extended with new tests, we should come up with a consistent naming strategy to make maintenance easier.
Tasks:
- Align naming in capture.py and the test scripts.

Now that #23 is merged and programming OpenTitan using SAM3U seems to be working, we should consider removing FTDI-related code and updating the getting started doc.
As discussed in #115 and https://docs.google.com/document/d/1sZjASiUji_IT-t9jEFlK-kGc_Gw3mGfubNvpkM_IQT4/edit#heading=h.qd65x0y76b0e, we should check the usage of key, text = ktp.next() in cw/capture.py.
The default behavior is that ktp.next() returns const_key, random_plaintext. This default behavior can be changed with ctx.obj.ktp.fixed_key = False.
Afaik, we call ktp.next()[0] only once per function and use only the first call of the iterator to assign a constant key. For random keys we use ktp.next()[1]. Thus, I think we are fine and can change the default behavior, but we should double-check that and read the spec to see whether the first call of the iterator still outputs a deterministic value. On the other hand, if we really rely on a constant non-random value, we should use a constant and not a constant iterator.
After the latest changes in capture setups, we are experiencing problems when capturing using ChipWhisperer Husky.
Currently, I am not sure what is causing the problem, but these are the symptoms:
- Capturing fails in the scope.capture_and_transfer_waves() function and results in the USB error usb1.USBErrorIO: LIBUSB_ERROR_IO [-1]
- ValueError: Unknown ChipWhisperer
I will investigate this further to find a permanent fix, but a temporary workaround is to:
Huskies in the CI setup are always powered-on, so the CI should be stable for now.
Noticed this while working on #23 (should not be related). When I run simple_capture_traces.py and send multiple commands, it sends a response for each one except for the first one.
We currently use Python random host-side and an implementation of the same Mersenne twister on the device side. We could replace it with a SW implementation of AES to generate the data on-device.
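For reference, the host side of such a scheme only needs a PRNG seeded identically to the device; a stdlib sketch using Python's Mersenne twister (seed, block size, and the byte-derivation below are illustrative and not bit-compatible with the device implementation):

```python
import random

def batch_plaintexts(seed, num_plaintexts, block_len=16):
    """Regenerate the same plaintext batch the device derives from its
    seeded PRNG, so the host can predict the expected ciphertexts.
    Python's random module is itself a Mersenne twister."""
    prng = random.Random(seed)
    return [bytes(prng.randrange(256) for _ in range(block_len))
            for _ in range(num_plaintexts)]
```

With a SW AES, the same structure applies: both sides run the identical AES-based generator from a shared seed instead of the Mersenne twister.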
Currently, we're sampling with 50 MHz and the target is running at ~5 MHz for ECDSA captures to keep a 10x oversampling ratio (some of this work is not yet in the repo). Above 50 MHz it seems the synchronization between multiple runs/segments is lost. We should re-investigate this with an updated Husky firmware.
When the simple capture script is mature enough, use as template to implement remaining capture commands.
Evaluate and consider implementation of the correlation-enhanced power analysis collision attack formulated here: https://eprint.iacr.org/2010/297.pdf
With #71 we have moved our TVLA code to the public repo. Besides adding new features, we have so far identified the following things to improve:
- Simplify the handling of input arguments (similar to capture.py).
- Use Ray (see ceca.py).
- trace-file: currently we need to open the project file even when we are using traces from the trace file.
- (start_sample and end_sample)
- t_value vs. number of samples graphs: currently a constant range is used.
- tvla_cfg_aes.yaml should be added.
- mode and general_test are used to specify the type of test. This should be reworked to use only one argument, with the value specifying either the general test or the type of the specific test.
- tmp/ttest.npy containing the analysis results is generated if and only if cfg["input_file"] is provided. This should be a separate option because it is not related to the input file.
- ttest-step.npy should be saved whenever n_steps != 0, regardless of the other input parameters.
- save_to_disk_trace and save_to_disk_leakage
- tvla.py cannot generate figures if data is loaded from histograms.
- tvla.py cannot generate some figures if data is loaded from the new database format.

Currently, traces generated in the CI need to be manually inspected. Ideally, an automated comparison of generated traces to a golden model is conducted.
Implement cycle-based config as shown in #203.
As the old capture.py file is not working anymore due to the repo restructure, we need to implement the OTBN capture.
Please use capture_aes.py as a boilerplate.
It seems the target is currently reset between capturing different traces or trace segments which leads to very low performance as the binary needs to be transferred over SPI for every segment. We should look into changing the capture setup to not require a reset of the target between segments.
We haven't been updating the Dockerfile for a while, and it would be worth checking whether it still works with the latest changes in ot-sca or if it needs updating.
As the old capture.py file is not working anymore due to the repo restructure, we need to implement the SHA3 capture.
Please use capture_aes.py as a boilerplate.