byuccl / bfat
Bitstream Fault Analysis Tool
License: Apache License 2.0
Add in bit identification and association for bits used in HCLK and RIO tiles so Fault Errors can be identified in these tiles.
Currently, bfat.py and the other utility scripts only function correctly if run from the main bfat
directory. We need to make it possible for these scripts to run correctly from anywhere.
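One common way to make a script independent of the working directory is to resolve paths relative to the script file itself. The following is a minimal sketch of that idea; the helper name `repo_path` and the example file names are hypothetical and are not taken from bfat.

```python
from pathlib import Path


def repo_path(script_file: str, *parts: str) -> Path:
    """Build an absolute path relative to the directory that holds script_file.

    Hypothetical helper: a script would pass its own __file__ so that lookups
    no longer depend on the current working directory.
    """
    return Path(script_file).resolve().parent.joinpath(*parts)


# Example use inside a script (names are illustrative only):
#   db_file = repo_path(__file__, "db", "tilegrid.json")
```

Every relative open() or subprocess path in a script would then go through a helper like this instead of assuming the caller's directory.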
BFAT currently only supports the part architectures included in the Project X-Ray database, but it would be nice to be able to run it on designs implemented on UltraScale or UltraScale+ architectures.
BFAT would run much faster if it were able to schedule different analysis processes on different threads.
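As a rough sketch of what scheduling independent analysis tasks on a pool could look like, the standard library's concurrent.futures module can be used. The function `analyze_fault_bit` is a hypothetical stand-in for one unit of analysis and is not part of bfat; whether the real analysis steps are independent enough to run this way is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_fault_bit(bit: str) -> str:
    """Hypothetical stand-in for one independent analysis task."""
    return f"analyzed {bit}"


def analyze_all(bits, max_workers: int = 4):
    """Run the tasks on a thread pool; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(analyze_fault_bit, bits))
```

In CPython, threads mainly help when tasks wait on I/O or on an external tool. For CPU-bound work, ProcessPoolExecutor offers the same map interface and may be the better fit.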
This could improve the stability and functionality of the TCL based design query.
In our latest BFAT analysis of the VexRiscv Linux system we found that the data lines on the DDR are the largest contributor to faults in our TMR system. It would be nice to "predict" what these bits are and actually count the sensitive bits associated with these nets. To do this, it would be nice to have the ability to trace nets from source to all sinks and return all the PIPs and bits associated with the net so we can get a feel for the sensitivity of the net. This would help us get a feel for how "long" the nets are in terms of CRAM sensitivity.
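The net tracing described above can be sketched as a breadth-first walk from the source over the PIPs the net uses, collecting the bits of each PIP along the way. The two input mappings below are hypothetical data structures chosen for illustration; they are not bfat's actual design or database representation.

```python
from collections import deque


def trace_net(source, pips_from, bits_of_pip):
    """Walk a net from its source and collect every PIP and CRAM bit on it.

    pips_from:   node -> list of (pip_name, next_node) used by this net
    bits_of_pip: pip_name -> list of CRAM bit identifiers
    Both mappings are assumed inputs for this sketch.
    """
    seen_nodes = {source}
    pips, bits, sinks = [], set(), []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        edges = pips_from.get(node, [])
        if not edges and node != source:
            sinks.append(node)  # no outgoing PIPs: treat the node as a sink
        for pip, nxt in edges:
            pips.append(pip)
            bits.update(bits_of_pip.get(pip, []))
            if nxt not in seen_nodes:
                seen_nodes.add(nxt)
                queue.append(nxt)
    return pips, sorted(bits), sinks
```

The length of the returned bit list gives a simple count of how "long" the net is in terms of CRAM bits.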
Expanding on issue #29, it would be nice if we could analyze the .dcp of a file and come up with a list of sensitive resources/bits from BFAT. This may not be fully comprehensive but it would allow us to estimate ahead of time the sensitivity of a design.
The following paper is related and should be reviewed for contributions to bfat.
Steps to reproduce: use a design whose name contains the letter b.
Example:
Design name is 'betrusted_soc'. Header looks like this:
00000000 00 09 0f f0 0f f0 0f f0 0f f0 00 00 01 61 00 2f |.............a./|
00000010 62 65 74 72 75 73 74 65 64 5f 73 6f 63 3b 55 73 |betrusted_soc;Us|
00000020 65 72 49 44 3d 30 58 46 46 46 46 46 46 46 46 3b |erID=0XFFFFFFFF;|
00000030 56 65 72 73 69 6f 6e 3d 32 30 32 30 2e 32 00 62 |Version=2020.2.b|
00000040 00 0c 37 73 35 30 63 73 67 61 33 32 34 00 63 00 |..7s50csga324.c.|
00000050 0b 32 30 32 32 2f 30 38 2f 30 32 00 64 00 09 31 |.2022/08/02.d..1|
00000060 34 3a 31 35 3a 33 38 00 65 00 21 72 8c ff ff ff |4:15:38.e.!r....|
00000070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
00000080 ff ff ff ff ff ff ff ff ff ff ff ff ff 00 00 00 |................|
00000090 bb 11 22 00 44 ff ff ff ff ff ff ff ff aa 99 55 |..".D..........U|
This line (line 98 in 42c4a78) finds the letter b (ord value 98) in the design name and decides that it is the beginning of the part number record, but it is actually part of the design name.
Workaround:
dd if=betrusted_soc.bit of=betrusted_soc_trunc.bit skip=30 bs=1
This simply lops off the name of the design and allows the script to run on the resulting _trunc.bit file.
A more permanent solution might be to parse the bitstream to look for a more robust sentinel. I'm not familiar enough with the .bit format to recommend what that would be, but maybe searching for the leading and trailing '00', so a sequence of [0x00, 0x62, 0x00], would be robust, since the file name is terminated with a ; character and not a null. See also http://www.pldtool.com/pdf/fmt_xilinxbit.pdf. This sequence would work for any part number length that is shorter than 255 characters (a longer length would put a 0x01 after the 0x62), but I don't know of any Xilinx part numbers that are that long.
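The sentinel idea can be tried against the header from the hexdump in this report. The sketch below is not bfat's parser: the function name is hypothetical, and it assumes the two bytes after the 0x62 key are a big-endian length, which is consistent with the 00 0c before the 12-byte part string in the dump.

```python
# First 80 bytes of the header shown in the hexdump in this report.
HEADER = bytes.fromhex(
    "00 09 0f f0 0f f0 0f f0 0f f0 00 00 01 61 00 2f"
    "62 65 74 72 75 73 74 65 64 5f 73 6f 63 3b 55 73"
    "65 72 49 44 3d 30 58 46 46 46 46 46 46 46 46 3b"
    "56 65 72 73 69 6f 6e 3d 32 30 32 30 2e 32 00 62"
    "00 0c 37 73 35 30 63 73 67 61 33 32 34 00 63 00"
)

SENTINEL = bytes([0x00, 0x62, 0x00])


def find_part_name(header: bytes) -> str:
    """Locate the part record via the 00 62 00 sentinel and return its string."""
    idx = header.find(SENTINEL)
    if idx < 0:
        raise ValueError("part number record not found")
    length_at = idx + 2  # the 0x62 key sits at idx + 1; the length follows it
    length = int.from_bytes(header[length_at:length_at + 2], "big")  # assumed big-endian
    data = header[length_at + 2:length_at + 2 + length]
    return data.rstrip(b"\x00").decode("ascii")
```

On this header, a plain search for the byte b stops at offset 16 (the first letter of the design name), while the sentinel search lands on the actual record.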
GND and VCC nets change based on the design and need to be handled as such. Develop a way to automatically assess any potential GND/VCC nets and their connections. (Generally, GND/VCC nets end with some form of GND/VCC and may have many names but are all routed the same, but this could potentially be incorrect for some designs)
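A first pass at the name-based heuristic could look like the sketch below. The exact pattern is an assumption (a name ending in GND or VCC, optionally followed by an underscore and digits), and the function name is hypothetical. As noted above, a purely name-based check can misclassify nets in some designs.

```python
import re

# Assumption: constant nets end in GND or VCC, optionally followed by "_<digits>".
_CONST_NET = re.compile(r"(GND|VCC)(?:_\d+)?$")


def classify_const_net(net_name: str):
    """Return "GND" or "VCC" if the name looks like a constant net, else None."""
    match = _CONST_NET.search(net_name)
    return match.group(1) if match else None
```

A user signal that happens to end in VCC would also match, so the result is better treated as a candidate list to confirm against the net's actual driver.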
Provide a simple HDL/XDC and its corresponding DCP as an example for users to run. With the source, the users can generate their own DCP to make sure they can run the tool on a simple example.
BFAT currently only supports 7 series devices based on Project X-Ray. It would be nice to support UltraScale devices based on Project U-Ray.
Extending issue #30, it would be nice to have the ability to analyze a TMR design, identify the single points of failure, and ignore the sensitivity of the TMR'd portions.
Because the xc7a35t is missing a tilegrid.json file in Project X-Ray, the part is not currently supported by BFAT.
For undefined bits, print out information about the possible tiles they could affect even though a specific resource and function could not be determined for the bit in the Project X-Ray database. If no possible tiles could be found, make that clear.
For some CLB functions (such as CLKINV) there is no cell mapped onto the BEL. We need to properly detect these bits and classify them correctly, making it clear that there isn't a design resource related to that BEL.
As part of the fault analysis process, BFAT must determine the resources associated with a given CRAM bit. However, this determination is embedded in higher-level fault analysis logic. It would be helpful to separate the higher-level fault analysis from the low-level CRAM bit resource identification. This way, users can apply BFAT for simply identifying resources, or even create other higher-level functions based on this resource identification.
The top.route file from an F4PGA build folder may work as a replacement for a Vivado DCP file.
It would be nice to model failures in the LUTs of a CLB. While the current approach clearly indicates that a LUT bit has been upset, it would be nice to see how the overall LUT changed. A simple approach would be to print the complete LUT contents before and after the LUT upset. This could later be expanded to evaluate how it impacts the logic function based on the number of inputs (for example, if one of the LUT inputs is hard coded to a value it is possible that the LUT upset is a "don't care"). Some logic evaluation could help with this understanding.
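The simple approach (printing the full LUT contents before and after the upset) and the "don't care" check for a hard-coded input can be sketched as follows. This assumes a common truth-table convention in which entry i is selected when the inputs, read as a binary number with input k as bit k, equal i. The function names are hypothetical.

```python
def lut_before_after(init: int, bit: int, width: int = 64):
    """Return (before, after) hex strings for a LUT whose entry `bit` is flipped."""
    if not 0 <= bit < width:
        raise ValueError("bit index outside the LUT")
    digits = width // 4
    upset = init ^ (1 << bit)  # a single upset toggles exactly one entry
    return f"{init:0{digits}X}", f"{upset:0{digits}X}"


def is_dont_care(bit: int, tied_input: int, tied_value: int) -> bool:
    """True if entry `bit` can never be addressed because one input is constant."""
    return ((bit >> tied_input) & 1) != tied_value
```

For example, in a 6-input AND (only entry 63 set), an upset of entry 0 changes the contents from 8000000000000000 to 8000000000000001, but if input 5 is tied to 1 that entry is never selected, so the upset would not change the logic function.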
The general best practice is to make the LICENSE file the full text of the license and then put a reference in the README and at the top of files. If you do this, then GitHub should detect that this repository is available under an Apache 2.0 license.
You can see examples under https://github.com/chipsalliance and https://github.com/f4pga
Add support for using RapidWright instead of Vivado for design database querying.
Provide a test script that can be used to test the installation of bfat as well as the other tools. The script could be used as a start of a CI for automated testing.
I'm running Vivado 2022.1 and Python 3.9, and invoking the tool with this command:
python3.9 find_fault_bits.py betrusted_soc.bit betrusted_soc_route.dcp -r -d
Running off of commit 415fab1
If I run this command twice in a row, I get different outputs for e.g. betrusted_soc_sample_bits.json.
One run, for example, yields this:
[
[
[
"00020aa0",
"027",
"31"
]
],
[
[
"00020691",
"051",
"22"
]
],
[
[
"00020694",
"051",
"22"
]
],
[
[
"00020694",
"052",
"00"
]
],
[
[
"0000002a",
"000",
"00"
]
],
[
[
"00400120",
"002",
"15"
],
[
"00400215",
"004",
"07"
]
]
]
and the next will give this
[
[
[
"00020ca2",
"086",
"31"
]
],
[
[
"00020e15",
"080",
"09"
]
],
[
[
"00000904",
"093",
"07"
]
],
[
[
"0000090e",
"094",
"11"
]
],
[
[
"0000002a",
"000",
"00"
]
],
[
[
"00401320",
"004",
"15"
],
[
"00001515",
"083",
"07"
]
]
]
Here's what the diff of the two is:
4,5c4,5
< "00020aa0",
< "027",
---
> "00020ca2",
> "086",
11,13c11,13
< "00020691",
< "051",
< "22"
---
> "00020e15",
> "080",
> "09"
18,20c18,20
< "00020694",
< "051",
< "22"
---
> "00000904",
> "093",
> "07"
25,27c25,27
< "00020694",
< "052",
< "00"
---
> "0000090e",
> "094",
> "11"
39,40c39,40
< "00400120",
< "002",
---
> "00401320",
> "004",
44,45c44,45
< "00400215",
< "004",
---
> "00001515",
> "083",
Is this the expected outcome? My assumption is the tool would list all the problem bits and it would be deterministic, but perhaps I am not understanding the nature of the tool correctly.
To reproduce, you can copy these files (temporarily staged in a weird directory, will be removed eventually):
https://ci.betrusted.io/trng/betrusted_soc_route.dcp
https://ci.betrusted.io/trng/betrusted_soc.bit
Provide support for parsing .ll files and essential bits files so that users can easily identify potential bits that can be analyzed. Users could generate a list of bits to query from these files without doing fault injection or radiation testing.
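As a starting point, a .ll parser could be as small as the sketch below. It assumes each relevant line has the form "Bit <offset> <frame address> <frame offset> <information>", with the frame address in hex. Files for some devices may carry extra columns, so the layout should be checked against a real file. The function name and the sample line in the usage note are hypothetical.

```python
def parse_ll_lines(lines):
    """Parse logic-location lines of the assumed form
    'Bit <offset> <frame address> <frame offset> <information>'."""
    bits = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or fields[0] != "Bit":
            continue  # skip comments, headers, and anything that is not a Bit line
        bits.append({
            "offset": int(fields[1]),
            "frame": int(fields[2], 16),      # frame address, written as 0x...
            "frame_offset": int(fields[3]),   # bit position within the frame
            "info": " ".join(fields[4:]),
        })
    return bits
```

A made-up line such as "Bit 1234 0x00020aa0 27 Block=SLICE_X0Y0 Latch=AQ" would yield one entry with frame 0x20aa0 and frame offset 27, which could then be converted into whatever bit format the tool expects.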