ucb-bar / midas
FPGA-Accelerated Simulation Framework Automatically Transforming Arbitrary RTL
License: Other
The remote URL for zc706_MIG/fpga-images-zc706 is git@github.com:ucb-bar/fpga-images-zc706.git
This makes the submodule inaccessible to anyone who is not a project member.
Fix:
Edit .gitmodules
git submodule sync
The functional model needs to be more flexible, since it is driven by an edge whose parameters can vary widely from target to target.
E.g., non-power-of-two multi-queues:
Generating a Midas Memory Model
Max Read Requests: 16
Max Write Requests: 16
Max Read Length: 8
Max Write Length: 8
Max Read ID Reuse: 3
Max Write ID Reuse: 3
Timing Model Parameters
Timing Model Class: Latency Bandwidth Pipe
No LLC Model Instantiated
[error] (run-main-0) java.lang.IllegalArgumentException: requirement failed
[error] java.lang.IllegalArgumentException: requirement failed
[error] at scala.Predef$.require(Predef.scala:264)
[error] at midas.widgets.MultiQueue.<init>(Lib.scala:118)
For now, midas depends on rocketchip, so regardless of the target design, we have to import rocketchip to use midas. This is unacceptable in most cases, including midas-examples. midas needs to depend only on chisel and firrtl. Here are the stats for code sharing from rocketchip:
Thus, we use fewer than 300 lines out of more than 10K lines of rocketchip. If we no longer need parameterized bundles, it is even less. I don't see any justification for the rocketchip dependency right now. We may use tilelink for midas, but that is an uncertain future. Also, don't tell me that submoduling rocketchip and its build time do not matter at all. Even avoiding the riscv-tools import is very hard without a script. Writing a build system for rocketchip is even harder.
Here's my plan to cut off the rocketchip dependency:
Of course, we should cut off the barstools dependency too.
With #6, ZynqShimTester is not working because of weird timing behavior of testers, so it's time for strober to graduate from chisel tester.
The quick fix: add messages to assertions.
This will let us better allocate host-dram.
The DRAM FRFCFSModel and the PCRAM model use queues and buffers whose sizes are configurable through mmReg. However, if, for example, transactionQueueDepth=8 is applied to the model, the model sets the queue depth to "000". As discussed with David, this may be caused by an overflow: log2Ceil(8) is 3, and a 3-bit register cannot hold the value 8. The code below should be modified to increase the register width.
===
class FirstReadyFCFSMMRegIO(val cfg:FirstReadyFCFSConfig) extends BaseDRAMMMRegIO(cfg) {
val schedulerWindowSize = Input(UInt(log2Ceil(cfg.schedulerWindowSize).W))
val transactionQueueDepth = Input(UInt(log2Ceil(cfg.transactionQueueDepth).W))
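The suspected overflow can be illustrated outside of Chisel. The following is a minimal C++ sketch, not the MIDAS code: `log2Ceil` here is a hand-written stand-in for Chisel's function of the same name, and `truncateToWidth` is a hypothetical helper that models writing a value into a register of a given width.

```cpp
#include <cstdint>

// Stand-in for Chisel's log2Ceil for n >= 1: smallest k such that 2^k >= n.
unsigned log2Ceil(uint64_t n) {
    unsigned k = 0;
    while ((uint64_t{1} << k) < n) ++k;
    return k;
}

// Hypothetical helper: keep only the low `width` bits of `value`,
// as a hardware register of that width would (width must be < 64).
uint64_t truncateToWidth(uint64_t value, unsigned width) {
    return value & ((uint64_t{1} << width) - 1);
}

// Width sized to hold indices 0 .. depth-1 (what log2Ceil(depth) gives).
unsigned indexWidth(uint64_t depth) { return log2Ceil(depth); }

// Width sized to hold the count 0 .. depth inclusive.
unsigned countWidth(uint64_t depth) { return log2Ceil(depth + 1); }
```

With a depth of 8, `indexWidth` gives 3 bits, so storing 8 truncates to 0, which matches the "000" reported above. `countWidth` gives 4 bits and keeps the value intact. Whether sizing the register with `depth + 1` is the right fix for the model is an assumption; the report only says the register should be made wider.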
Currently, every target DecoupledIO between an endpoint and the transformed-RTL model is given a decoupled channel with latency = 1 (each channel is seeded with one initial token).
While this will be fixed in the new FAME compiler, a short-term solution would be to allow endpoints to specify what sort of channel (or latency) they would like on the interconnect to and from the transformed RTL.
The new endpoint system will break how this is currently invoked.
step() in simif can be deceptive for users who are familiar with Chisel's PeekPokeTesters. Consider the following example:
import chisel3._
class ShiftRegister extends Module {
  val io = IO(new Bundle {
    val in1 = Input(UInt(8.W))
    val in2 = Input(UInt(8.W))
    val out = Output(UInt(8.W))
    val enable = Input(Bool())
  })
  val out = RegInit(0.U(8.W))
  io.out := out
  when (io.enable) {
    out := io.in1 + io.in2
  }
}
#include "simif.h"
class ShiftRegister_t: virtual simif_t
{
public:
  void run() {
    std::vector<uint32_t> reg(4);
    target_reset();
    poke(io_enable, 0);
    step(5);
    poke(io_enable, 1);
    step(1);
    poke(io_in1, 10);
    poke(io_in2, 20);
    step(1);
    expect(io_out, 30);
  }
};
In chisel-testers, the poke-step-expect sequence works as expected, but since simif's "step" is really "fire one cycle of targetFire", the expect line actually dequeues the old value of io_out right before the posedge, which is different from what the chisel-testers do.
Suggestions: either document this, or perhaps rename it to "targetFireStep()" or some other disambiguated name, or provide step(1) as an alias for targetFireStep(2), etc.
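The difference can be reproduced with a toy model. The sketch below is not the simif implementation: `ToyTarget`, `peekOut`, and `runSequence` are hypothetical names, and the model simply assumes that each fired target cycle first emits an output token holding the pre-edge register value and only then applies the clock edge, as described above.

```cpp
#include <cstdint>

// Toy token-based model of the ShiftRegister target (an assumption-laden
// sketch, not simif): outputs are sampled before the clock edge.
struct ToyTarget {
    uint8_t in1 = 0;
    uint8_t in2 = 0;
    bool enable = false;
    uint8_t reg = 0;       // models the `out` register, reset value 0
    uint8_t outToken = 0;  // most recent output token

    // Fire n target cycles with the currently poked inputs.
    void targetFireStep(int n) {
        for (int i = 0; i < n; ++i) {
            outToken = reg;  // token carries the value before the posedge
            if (enable) reg = static_cast<uint8_t>(in1 + in2);  // posedge
        }
    }

    uint8_t peekOut() const { return outToken; }
};

// Replays the poke/step sequence from the test above and returns the
// value an expect(io_out, ...) would observe.
uint8_t runSequence(int cyclesAfterPoke) {
    ToyTarget t;
    t.enable = false;
    t.targetFireStep(5);
    t.enable = true;
    t.targetFireStep(1);
    t.in1 = 10;
    t.in2 = 20;
    t.targetFireStep(cyclesAfterPoke);
    return t.peekOut();
}
```

In this model `runSequence(1)` observes 0, the old register value, while `runSequence(2)` observes 30. That is the behavior the suggested step(1) to targetFireStep(2) alias is meant to paper over.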
MSHRs can and should be made a runtime-configurable setting.
In a modified design (modified to also stall when myStall is high), the SerialWidget inBuf starts to fill while the stall signal is active. As a result, the inBuf already contains an element once the simulation is restored, and this element is immediately sent to the target. In the golden design, by contrast, the inBuf sends the element 2 rocket-chip cycles later than in the modified design, because the inBuf has to be filled first.
I'm just going to start opening issues for things that need obvious improvement. It'll be easy to track them here.
Presently, all leaf signals are broken into fame decoupled bundles, despite the fact that all input tokens will be consumed on the same host cycle and all outputs are produced on the same host cycle.
We should only create fame decoupled bundles for subsets of the output that can be produced on different host cycles, and subsets of the input that can be consumed on different host cycles. An example of this would be a fame-1 decoupled target with multiple clock domains (whose frequencies differ).
For fame-1 decoupled targets with a single clock, only a single output and a single input fame-1 channel should be produced.
There seems to be a mismatch between the cycle count associated with synthesized prints and the target cycle count. This seems to manifest in the case of sparse prints (~370000 print statements out of 6246847756 target cycles). As a particular example, the last print statement in an experiment indicates CYCLE: 167551685485, while the end of simulation indicates Runs 6246847756 cycles.
For long-running workloads we see output like this:
==> spec-test/473.astar.test.err <==
SEED: 7282986
time elapsed: 18446744072974.2 s, simulation speed = 0.00 KHz
*** PASSED *** after 74901751853 cycles
Runs 74901751853 cycles
[PASS] MidasTop Test
SEED: 7282989
real 130m58.391s
user 0m0.237s
sys 0m0.526s
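The elapsed time printed above is just below 2^64 microseconds expressed in seconds (about 18446744073709.6 s). One possible explanation, which the report does not confirm, is that a small negative time difference wrapped around in unsigned 64-bit arithmetic. The sketch below shows that effect; `elapsedSeconds` is a hypothetical function, and the 735.35 s offset is a value chosen to reproduce the printed number, not a measurement.

```cpp
#include <cstdint>

// Hypothetical elapsed-time helper working on microsecond timestamps.
// If end_us < start_us, the unsigned subtraction wraps modulo 2^64.
double elapsedSeconds(uint64_t start_us, uint64_t end_us) {
    return static_cast<double>(end_us - start_us) / 1e6;
}

// Illustrative offset only: a difference of -735.35 s in microseconds.
constexpr uint64_t kIllustrativeOffsetUs = 735350000ULL;

// Wrapped result for an end timestamp that is "earlier" than the start.
double wrappedExample() {
    return elapsedSeconds(kIllustrativeOffsetUs, 0);
}
```

Under that assumption, `wrappedExample()` lands at roughly 18446744072974.2 s, and dividing any cycle count by such a value would round to the 0.00 KHz shown in the log.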
We recently made changes to IceNet/SimpleNIC that changed the NICIO to be a subclass of SerialIO instead of StreamIO. Unfortunately, this caused midas widget mapping to break, because the SerialWidget was matching on SerialIO and all its subclasses, so it mistakenly matched the NICIO with the SerialWidget instead of the SimpleNICWidget. The solution was to change SimSerialIO's matchType function to explicitly return false for NICIO. I'm not sure how exactly this issue could be avoided in the future.
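The failure mode is not specific to Scala pattern matching. The C++ sketch below reproduces it with `dynamic_cast`: the type names follow the issue, but `Port`, `pickWidgetNaive`, and `pickWidgetExcluding` are hypothetical and only model the matching logic, not the real widget mapping code.

```cpp
#include <string>

struct Port { virtual ~Port() = default; };  // hypothetical common base
struct SerialIO : Port {};
struct NICIO : SerialIO {};                  // NICIO subclasses SerialIO

// Naive mapping: the serial matcher accepts SerialIO and every subclass,
// so a NICIO is claimed before the NIC matcher gets a chance.
std::string pickWidgetNaive(const Port& io) {
    if (dynamic_cast<const SerialIO*>(&io)) return "SerialWidget";
    if (dynamic_cast<const NICIO*>(&io)) return "SimpleNICWidget";
    return "none";
}

// Mirrors the reported fix: the serial matcher explicitly rejects NICIO.
std::string pickWidgetExcluding(const Port& io) {
    const bool isNic = dynamic_cast<const NICIO*>(&io) != nullptr;
    if (dynamic_cast<const SerialIO*>(&io) && !isNic) return "SerialWidget";
    if (isNic) return "SimpleNICWidget";
    return "none";
}
```

Here `pickWidgetNaive` maps a `NICIO` to "SerialWidget", while `pickWidgetExcluding` maps it to "SimpleNICWidget" and still maps a plain `SerialIO` to "SerialWidget". Testing the most specific type first would have the same effect without an explicit exclusion, though whether that ordering is practical in the widget mapping is an open question here.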