spispy's Introduction

ULX3S connected to SPI flash with a 8-SOIC clip and debugged with oscilloscope probes

SPI Spy: Flash emulation

The SPI Spy is an open source (both hardware and software) SPI flash emulation tool. It can store a flash image in the SDRAM connected to the FPGA and serve the image to a host CPU over the SPI bus. This allows you to avoid the lengthy SPI flash erase/write cycles during firmware development as well as to more easily explore early boot time security against TOCTOU attacks.

Platform

The design is currently based on the ULX3S, which has a Lattice ECP5-12F FPGA and a 16-bit wide 32 MB SDRAM. It might be portable to the TinyFPGA-EX or other open source ECP5 boards, although it uses a custom SDRAM controller to be able to meet the difficult timing requirements of the SPI flash protocol (described below).

It has been tested on the Thinkpad x230 (no SFDP) and the Supermicro X11SSH-F (with SFDP). Write support is very flaky; the entire state machine needs to be redone (issue #17).

Supported features

  • Single SPI up to 20 MHz clock
  • 3-byte addressing (up to 16 MB of flash image)
  • High-speed (1 MB/s) /dev/ttyACM0 interface
  • Serial port updates to the SDRAM (could have a better interface, issue #3)
  • Logging flash access patterns (could be longer, issue #5)
  • SFDP pages (with some caveats)
  • TOCTOU changes to the flash image based on read patterns

Not yet supported

  • Dual- and Quad-SPI (issue #1)
  • Multiple !CS pins (issue #7)
  • Fast read command (issue #1)
  • Erase/Write emulation (issue #12)
  • Status registers (partially supported, could be better)
  • Block protection bits (maybe worth it, probably not)
  • Linux RISC-V core in the FPGA

Wiring

8-SOIC chip clip and !CS pin mod

Typical 8-SOIC and 8-DIP flash chips (!RST and Vcc are optional):

                    +------+
   J1 7     !CS 1---| o    |---8  Vcc   J1 + (for low voltage)
   J1 10     SO 2---|      |---7 !RST   J1 11 (optional)
            !WP 3---|      |---6  SCK   J1 8
   J1 GND   GND 4---|      |---5  SI    J1 9
                    +------+

Typical 16-SOIC flash chips (!RST and Vcc are optional):

                   +--------+
           IO3 1---| o      |--16 CLK      J1 8
   J1 +    Vcc 2---|        |--15 SI/IO1   J1 9
   J1 11  !RST 3---|        |--14
               4---|        |--13
               5---|        |--12
               6---|        |--11
   J1 7    !CS 7---|        |--10 GND      J1 GND
   J1 10    SO 8---|        |---9 !WP/IO2
                   +--------+

If there is a series resistor on the !CS pin, it might be possible to clip directly to the chip with a Pomona 8-SOIC "chip clip" and use TOCTOU mode to override the signal from the PCH. However, this doesn't always work, so sometimes it is necessary to desolder pin 1 from the board, bend the leg upwards and solder a jumper wire to the pad on the mainboard as shown in the above photo.

If the board has a "Dediprog" or programming header it might be possible to attach directly to the header and also override the chip select pin, although more testing is necessary.

IMPORTANT NOTE: the system defaults to using 3.3v signalling for the SPI bus. If you have a more modern system, it might use 1.8v, and driving it at the higher voltage can cause problems. It is possible to remove the RV3 resistor from the board and provide power to the FPGA GPIO bank through the + pin on the left side connector (J1, pin labeled "2.5/3.3V"); you can connect this pin to the Vcc pin on the SPI flash, which will allow the FPGA to output the same voltage. More details are in issue #10.

Usage

spispy connected to a Supermicro X11SSH-F mainboard

If using the spispy with a 3.3V chip and a clip, you can leave the Vcc pin disconnected; otherwise be sure to see the important note above about hardware changes to support lower voltage flash chips. Be sure to set the TOCTOU flag in the spispy.v file so that the spispy will prevent the real flash from responding (or use the !RST pin; need to document when this works).

When you plug in the spispy it should show up as a USB-CDC-ACM device with a device file like /dev/ttyACM0. You might have to start minicom or some other terminal program to configure the control lines correctly (and to prevent ModemManager from screwing with it).

Install the sfdp.bin image into the top of DRAM to tell the PCH that the flash only supports single read commands at the slowest speed:

write-ram 0x1000000 sfdp.bin > /dev/ttyACM0

Install the ROM image into the bottom of DRAM (pv is optional to provide a bargraph and bandwidth measurement):

write-ram 0x0 coreboot.bin | pv > /dev/ttyACM0

If you want to update part of the ROM image, such as the top 8 MB of the coreboot image, you can use dd to extract that part:

dd < coreboot.bin bs=1024 skip=8192 | write-ram 0x800000 | pv > /dev/ttyACM0
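The offset arithmetic in the dd example can be checked with a short sketch. It assumes nothing about write-ram itself; the helper name `dd_skip_to_address` is hypothetical and only relies on dd skipping `bs * skip` bytes, so the remaining data belongs at that byte offset in the image.

```python
def dd_skip_to_address(block_size: int, skip_blocks: int) -> int:
    """Byte offset where dd output starts, i.e. the address to hand to write-ram.

    Hypothetical helper for illustration; not part of the spispy tools.
    """
    return block_size * skip_blocks


# Values from the dd example above: bs=1024 skip=8192
address = dd_skip_to_address(1024, 8192)
print(hex(address))  # 0x800000
```

If the block size or skip count changes, the address given to write-ram has to change with it, otherwise the partial image lands at the wrong place in SDRAM.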

Protocol

SPI data

The SPI protocol is difficult to emulate without specialized hardware since it has very demanding timing requirements. The flash device has no control over the clock and must be able to respond to a random read request on the very next clock. At 20 MHz, the slowest SPI bus on some Intel PCH chipsets, this is 50ns from receiving the last bit of the address to having to supply the first bit of the data.
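The 50ns figure is just the clock period at 20 MHz. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the 33 MHz value is taken from the "Support higher SPI clock frequency" issue further down rather than from any measurement.

```python
def spi_clock_period_ns(clock_hz: float) -> float:
    """Length of one SPI clock cycle in nanoseconds."""
    return 1e9 / clock_hz


# Time available between the last address bit and the first data bit.
print(spi_clock_period_ns(20e6))            # 50.0
print(round(spi_clock_period_ns(33e6), 1))  # 30.3
```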

Unfortunately, most microcontroller CPUs aren't able to respond to an incoming SPI byte on the next SPI cycle due to internal muxes and buses, so they aren't able to reply in time. Even if the CPU could do it, most DRAM memory has a 100ns or longer latency for a random read, so it won't be able to answer quickly enough. Additionally, DRAM requires a refresh cycle that takes it offline during the refresh, which adds a random latency.

SDRAM read waveform

These difficulties can be overcome with an FPGA using a custom DRAM controller. The FPGA is able to inhibit refresh cycles during the SPI critical sections, which reduces the latency jitter, and it can split the DRAM access into two parts: the "row activation" once 16 of the 24 address bits are known, and then a "column read" of two bytes worth of data once 7 of the last 8 bits are known. The correct byte is selected once the last bit of the address has been received.
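The address split described above can be sketched in Python. This only illustrates the bit counts from the text (16 bits, then 7 bits, then 1 bit); how those bits map onto the SDRAM's bank, row and column lines is decided by the HDL and is not shown here, and the function name is hypothetical.

```python
def split_flash_address(addr: int) -> tuple[int, int, int]:
    """Split a 24-bit flash address into (upper_16, word_index, byte_select)."""
    if not 0 <= addr < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    upper_16 = addr >> 8              # known first: enough to start the row activation
    word_index = (addr >> 1) & 0x7F   # next 7 bits: selects the 16-bit word to read
    byte_select = addr & 1            # last bit: picks one byte of that word
    return upper_16, word_index, byte_select


print([hex(x) for x in split_flash_address(0x123457)])  # ['0x1234', '0x2b', '0x1']
```

Because SPI addresses arrive most significant bit first, each field becomes available in the order it is needed.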

The row activation command requires at least four DRAM clock cycles, but can be stretched arbitrarily long with a special control signal wired into the FPGA's SDRAM controller from the SPI bus interface. This allows the FPGA to overlap the activation with the reception of the last bits, and the final column read requires only two clock cycles when the DRAM is configured with a CAS latency of two.

Subsequent bytes are "easy" at 20 MHz for single SPI since the full SDRAM read cycle (7 FPGA clocks) fits into the 8 clocks of the SPI bus (roughly 24 FPGA clocks). For dual or quad-SPI it will be necessary to configure a burst mode on the SDRAM controller or allow new column addresses to be provided dynamically.

spispy's Issues

Multiple !CS pins

A few mainboards have multiple flash chips (like the x230) with their own separate !CS pins. Since they won't be accessed simultaneously, the FPGA should be able to emulate the two chips with separate address ranges.
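One way to picture "separate address ranges" is a lookup from chip select to an SDRAM window. The sketch below is an assumption for illustration only: the window table, the chip sizes and the function name are hypothetical and do not come from the spispy HDL.

```python
# Hypothetical layout: each emulated chip gets its own window in SDRAM.
# Sizes are illustrative assumptions, not values taken from the project.
CHIP_WINDOWS = [
    (0x000000, 8 * 1024 * 1024),  # chip 0: (base offset, size in bytes)
    (0x800000, 4 * 1024 * 1024),  # chip 1: placed directly after chip 0
]


def sdram_offset(chip_select: int, addr: int) -> int:
    """Translate (active !CS index, flash address) into an SDRAM byte offset."""
    base, size = CHIP_WINDOWS[chip_select]
    # Assumption: addresses beyond the chip size wrap around within the chip.
    return base + (addr % size)


print(hex(sdram_offset(0, 0x10)))  # 0x10
print(hex(sdram_offset(1, 0x10)))  # 0x800010
```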

spispy CLI uses 100% CPU in monitor mode if device disconnects

It seems like the read loop in spispy will spin if the underlying /dev/ttyACM0 disappears while it is running. I showed up to work today and found a pretty angry sounding laptop that had spent a few hundred CPU hours in spispy, seemingly since everything was powered off.

I don't know Perl, and its syntax confuses and scares me, so I can't suggest a PR - but my guess would be that the issue is here:

spispy/bin/spispy

Lines 223 to 224 in cbf8d48

my $response = $dev->read(4)
or next;

The read shouldn't fail in blocking mode, so probably exiting with an error is better there - maybe?
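The suggested behavior can be sketched outside of Perl. This is a Python illustration of the idea, not a patch for bin/spispy: the `DeviceGone` exception and `read_responses` generator are hypothetical names, and an in-memory stream stands in for the serial device.

```python
import io


class DeviceGone(Exception):
    """Raised when a blocking read returns no data."""


def read_responses(dev, size=4):
    """Yield responses from dev; stop with an error instead of retrying forever."""
    while True:
        chunk = dev.read(size)
        if not chunk:
            # An empty result from a blocking read means EOF or a vanished
            # device, so retrying immediately would busy-loop at 100% CPU.
            raise DeviceGone("device returned no data")
        yield chunk


# Stand-in for the serial port: two 4-byte responses, then end of stream.
fake_dev = io.BytesIO(b"abcdefgh")
try:
    for response in read_responses(fake_dev):
        print(response)
except DeviceGone as err:
    print("stopping:", err)
```

Whether the tool should exit or try to reopen the port is a separate design choice; the point is only that a failed read is treated as an event rather than silently retried.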

Support flash chip ID command

The flash protocol could be improved to support more commands, including the chip ID. The serial protocol would also need a way to set the chip ID.

Improve command line interface

The write-ram tool is a bit of a hack and could be merged into a read/write/monitor tool. The command protocol should be updated to include ACKs and ensure that the serial port is in the correct mode.

SPI flash state machine is a mess

The SPI flash state machine is a mess of intertwined state and registers; it needs an overhaul. The attached photo (48199502661_35396806d4_k) shows a back-of-the-envelope FSM for how it should be structured instead.

Dual- and Quad- SPI support

Booting with 20 MHz single SPI takes a while. It would be better to support the faster SPI modes, and the SDRAM should be able to support it.

Missing LICENSE

I see you have no LICENSE file for this project. The default is copyright.

I would suggest releasing the code under the GPL-3.0-or-later or AGPL-3.0-or-later license so that others are encouraged to contribute changes back to your project.

SRAM instead ?

Cool project !
Looks quiet, too bad...

Had you considered fast parallel SRAM instead ? No RAS/CAS nonsense, 10/15ns grades should be readily available...

I haven't looked at your HDL, do you take advantage of a 16 or 32-bit wide RAM read to start the external access just before receiving the last address bits (MSB-first seems pretty typical) ?

Reprogramming via tinyfpga bootloader

When the board is flashed with the tinyfpga bootloader it is possible to upload new bitstreams via the /dev/ttyACM0 port if the bootloader is active. The PWR button can be used for this or we can add a command on the serial port to say "reboot!".

Default SFDP in block ram

The SFDP pages must be reloaded on every boot of the device, even though they are relatively static and can be defaulted for most platforms (based on the actual capabilities of the spispy firmware). Moving them into block RAM allows the contents to live in the bitstream.

Actually fixing this probably depends on cleaning up the state machine #17

Flash erase support

The ME likes to log things during the boot and the UEFI NVRAM updates also try to write. We should support logging the writes somewhere and using the TOCTOU interface to be able to selectively apply them.

quick question about hardware for this

I'm assuming the fpga board mentioned in the readme works 'as is' with this?
sorry if it was already stated elsewhere but it was not 100% clear to me if one
could just buy the board, load this firmware/bitstream/whatever, and go to town.

Support "Serial Flash Discoverable Parameter" (SFDP) registers

More recent PCH will probe the SFDP with command 0x5A to read flash parameter pages (in addition to the parameters set in the IFD). The serial protocol will need a way to write to this data as well.

Since the SFDP is only 256 bytes, it could live in block RAM, or it could have a special region of SDRAM.
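For checking an sfdp.bin-style image before loading it, a small header parser can help. The sketch below reflects my reading of the JEDEC SFDP header layout (a 4-byte "SFDP" signature, then minor revision, major revision, and a zero-based parameter header count) and should be verified against the standard; the function name is hypothetical and the sample bytes are made up for the example.

```python
def parse_sfdp_header(page: bytes) -> dict:
    """Decode the first 8 bytes of an SFDP image (layout assumed, see lead-in)."""
    if len(page) < 8:
        raise ValueError("SFDP header needs at least 8 bytes")
    if page[0:4] != b"SFDP":
        raise ValueError("missing SFDP signature")
    return {
        "minor_revision": page[4],
        "major_revision": page[5],
        # The count field is zero-based: 0 means one parameter header follows.
        "parameter_headers": page[6] + 1,
    }


# Made-up sample header for illustration only.
sample = b"SFDP" + bytes([0x06, 0x01, 0x00, 0xFF])
print(parse_sfdp_header(sample))
```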

Can't load rom into RAM

Hi there,

I managed to get a working copy of spispy to run on my freshly installed Debian bookworm.
I did compile prjtrellis and nextpnr-ecp5 using this tutorial:

https://libre-soc.org/HDL_workflow/nextpnr/

I compiled spispy using a workaround.patch from a friendly dev on Slack in Nov. 2022.

https://osfw.slack.com/files/U02SQ5B6J91/F04CJRP6S1Z/workaround.patch

I flashed the spispy.bit (bitstream) using openFPGAloader.
Now I don't get a /dev/ttyACM0 device when I plug the ulx3s in, but a /dev/ttyUSB0.

When I try to load a compiled coreboot image (skulls), it doesn't get loaded into the RAM of the ulx3s.

What am I doing wrong?
Is it even possible to load the ROM into RAM without being connected to the mainboard of my x230?

I also tried hooking up the devices, but nothing changed: nothing loads to emulate the SPI chip.

Monitor mode

The monitor mode should be able to receive incoming data on MOSI to log both sides of the transactions.

High-speed USB

The serial port at 3 megabaud takes too long to upload a flash image. What faster USB endpoint could we support from the ECP5?

Emulate HOLD# behavior

Some SPI flash parts, such as the W25Q64FV, have a HOLD# signal on pin 7 instead of a RESET#. From the datasheet:

The /HOLD pin allows the device to be paused while it is actively selected. When /HOLD is brought low, while /CS is low, the DO pin will be at high impedance and signals on the DI and CLK pins will be ignored (don’t care). When /HOLD is brought high, device operation can resume.
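The quoted behavior can be modeled in a few lines. This is a behavioral sketch in Python, not HDL from the project; the function names are hypothetical and `None` stands in for high impedance.

```python
def do_pin(cs_n: int, hold_n: int, data_bit: int):
    """Return the level driven on DO, or None when the pin is high impedance."""
    if cs_n == 1:
        return None      # device not selected
    if hold_n == 0:
        return None      # selected but paused by /HOLD
    return data_bit      # normal operation


def clock_edge_accepted(cs_n: int, hold_n: int) -> bool:
    """DI and CLK are only acted on while selected and not held."""
    return cs_n == 0 and hold_n == 1


print(do_pin(cs_n=0, hold_n=1, data_bit=1))   # 1
print(do_pin(cs_n=0, hold_n=0, data_bit=1))   # None
print(clock_edge_accepted(cs_n=0, hold_n=0))  # False
```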

Cant build spispy because of Error in Verilog

make
yosys \
-p "read_verilog -Iusb spispy.v" \
-p "synth_ecp5 -json spispy.json" \
-E .spispy.d \
-q \

uart.v:218: ERROR: Identifier \serial_txd_data' is implicitly declared and default_nettype is set to none.
make: *** [Makefile.icestorm:10: spispy.json] Error 1

Support higher SPI clock frequency

Many recent PCs use a 33 MHz SPI clock, so the current 20 MHz limitation might not work on those platforms. I think the implementation could be enhanced a little to allow a higher SPI clock frequency.

The current implementation issues the SDRAM read command immediately after receiving the A1 SPI address bit of a SPI read command. Later, the A0 address bit is used to select the low or high byte of the 16-bit word. This approach gives the SDRAM around 1.5 SPI clocks to complete the read and prepare the first output bit on SPI MISO. However, the timing is very tight for the SDRAM read, and it limits the maximum supported SPI clock frequency.

Could we issue the SDRAM read command immediately after receiving the A2 SPI address bit instead? This provides one more SPI clock for the SDRAM read. Accordingly, the SDRAM needs to do a burst read with BL=2, which reads a 32-bit dword. Later, the A1:A0 address bits can be used to index one of the 8-bit slices. In this way, I think it should be possible to allow a higher SPI frequency.

Is this feasible?
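The byte selection step of this proposal can be sketched as follows. It is an illustration of the idea only: the function name is hypothetical, and it assumes the first word of the burst corresponds to A1=0 and that A0=0 selects the low byte of a 16-bit word, neither of which is confirmed by the existing HDL.

```python
def select_byte(burst: tuple[int, int], addr: int) -> int:
    """Pick one byte from a burst of two 16-bit words using address bits A1:A0."""
    word = burst[(addr >> 1) & 1]   # A1 chooses which 16-bit word of the burst
    shift = 8 * (addr & 1)          # A0 chooses the low or high byte (assumed order)
    return (word >> shift) & 0xFF


burst = (0x2211, 0x4433)  # two words returned by a BL=2 read
print([hex(select_byte(burst, a)) for a in range(4)])
# ['0x11', '0x22', '0x33', '0x44']
```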

Understanding vulnerable BIOS landscape

From reading about SpiSpy I understand that the vulnerability exploited by SpiSpy has been fixed by Intel and/or BIOS vendors. Of course the fix may not be distributed to all affected computers in the field. But I'm wondering if SpiSpy works with older systems only or are still vulnerable systems coming on the market?

`spi_cs` pullup might not be strong enough

The SPI_CS pin seems to be floating or have a very weak pullup. When the mainboard is disconnected the pin seems to glitch and when the mainboard is powered off it is grounded?

SDRAM contents corrupted

The refresh might not be doing the right thing - after sitting all night the SDRAM contents had a few bit errors compared to what was loaded into it. For example:

29969c29969
< 00075100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
---
> 00075100: ff ff ff ff ff ff ff ff ff ff bf ff ff ff ff ff  ................
44677c44677
< 000ae840: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
---
> 000ae840: ff ff ff ff ff ff ff ff ff ff fe ff ff ff ff ff  ................
50487c50487
< 000c5360: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
---
> 000c5360: ff ff ff ff fd ff ff ff ff ff ff ff ff ff ff ff  ................
52533c52533
< 000cd340: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
---
> 000cd340: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff fe  ................
86337c86337
< 00151400: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
---
> 00151400: ff ff ff ff ff ff ff f7 ff ff ff ff ff ff ff ff  ................
93746c93746
< 0016e310: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
