harvard-acc / gem5-aladdin

End-to-end SoC simulation: integrating the gem5 system simulator with the Aladdin accelerator simulator.

License: BSD 3-Clause "New" or "Revised" License

Python 13.80% Shell 0.23% C 7.11% C++ 76.56% Makefile 0.17% CMake 0.18% M4 0.12% HTML 0.30% Assembly 1.40% Perl 0.07% Emacs Lisp 0.01% Java 0.01% Roff 0.02% Scala 0.01% Awk 0.01% sed 0.01% Dockerfile 0.01% Forth 0.01% SWIG 0.01% BASIC 0.01%

gem5-aladdin's Issues

panic: invalid stat name 'system.aes-aes_datapath.tlb.hits'

I can create the config files (aladdin/gem5-cache/gem5-cpu) and compile the simulator. However, when I run the run.sh file I have two problems. First, I need to change the cpu-type from timing to DerivedO3 and the memory type from DDR3_1600 to something else because they are no longer supported. Then, when I run the simulator, I get this error:
panic: invalid stat name 'system.aes-aes_datapath.tlb.hits'
And then simulation is terminated.
Any idea?
Thanks

Can we execute the deep RL maddpg-pytorch algorithm on gem5-aladdin?

Hello,
I'm trying to execute maddpg https://github.com/shariqiqbal2810/maddpg-pytorch on an architecture simulator. So, I'm looking around for various applications. I wanted to ask if I can simulate maddpg on gem5-Aladdin?
Note: maddpg uses Python 3.6.

panic: invalid stat name 'system.<BenchmarkName>_datapath.tlb.hits'

Reminder to fix the issue!! (Probably in docker image)

Compilation error

On running the command scons ./build/RISCV/gem5.debug I get the following compilation error:

build/RISCV/arch/riscv/linux/process.cc:671:44: error: no matching function for call to 'SyscallDesc::SyscallDesc(const char [7], )'
{215, SyscallDesc("munmap", munmapFunc)},
^
In file included from build/RISCV/arch/riscv/linux/process.cc:48:0:
build/RISCV/sim/syscall_desc.hh:74:5: note: candidate: SyscallDesc::SyscallDesc(const char*, SyscallDesc::SyscallExecutor)
SyscallDesc(const char name, SyscallExecutor sys_exec=unimplementedFunc)
^~~~~~~~~~~
build/RISCV/sim/syscall_desc.hh:74:5: note: no known conversion for argument 2 from '' to 'SyscallDesc::SyscallExecutor {aka std::function<SyscallReturn(SyscallDesc, int, ThreadContext*)>}'
build/RISCV/sim/syscall_desc.hh:69:7: note: candidate: SyscallDesc::SyscallDesc(const SyscallDesc&)
class SyscallDesc {
^~~~~~~~~~~
build/RISCV/sim/syscall_desc.hh:69:7: note: candidate expects 1 argument, 2 provided
build/RISCV/sim/syscall_desc.hh:69:7: note: candidate: SyscallDesc::SyscallDesc(SyscallDesc&&)
build/RISCV/sim/syscall_desc.hh:69:7: note: candidate expects 1 argument, 2 provided

How did file dynamic_trace.gz come about?

/usr/bin/ld: /tmp/ccP7FT6j.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a PIE object; recompile with -fPIC

root@3f112d8239ca:/workspace/gem5-aladdin/src/aladdin/SHOC/triad# ls
Makefile example triad.c triad.h
root@3f112d8239ca:/workspace/gem5-aladdin/src/aladdin/SHOC/triad# make run-trace
/workspace/LLVM-Tracer/bin/get-labeled-stmts triad.c -- -I/usr/local/lib/clang/6.0.0/include -I/workspace/gem5-aladdin/src/aladdin -I/workspace/gem5-aladdin/src/aladdin/SHOC/common/ -DLLVM_TRACE
"/usr/local/bin/clang-6.0" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-emit-llvm" "-disable-free" "-disable-llvm-verifier" "-main-file-name" "triad.c" "-static-define" "-mrelocation-model" "static" "-mthread-model" "posix" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "x86-64" "-dwarf-column-info" "-debug-info-kind=limited" "-dwarf-version=4" "-debugger-tuning=gdb" "-momit-leaf-frame-pointer" "-coverage-notes-file" "/workspace/gem5-aladdin/src/aladdin/SHOC/triad/triad.gcno" "-resource-dir" "/usr/local/lib/clang/6.0.0" "-I" "/workspace/gem5-aladdin/src/aladdin" "-I" "/workspace/gem5-aladdin/src/aladdin/SHOC/common/" "-D" "LLVM_TRACE" "-I" "/workspace/gem5-aladdin/src/aladdin/gem5" "-D" "LLVM_TRACE" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/usr/local/lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-O1" "-fdebug-compilation-dir" "/workspace/gem5-aladdin/src/aladdin/SHOC/triad" "-ferror-limit" "19" "-fmessage-length" "0" "-fno-unroll-loops" "-fno-builtin" "-fno-inline" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-o" "triad.llvm" "-x" "c" "triad.c"
opt -S -load=/workspace/LLVM-Tracer/lib/full_trace.so -fulltrace -labelmapwriter triad.llvm -o triad-opt.llvm
llvm-link -o full.llvm triad-opt.llvm /workspace/LLVM-Tracer/lib/trace_logger.llvm
llc -O0 -disable-fp-elim -filetype=asm -o full.s full.llvm
g++ -O0 -fno-inline -o triad-instrumented full.s -lm -lz -lpthread
/usr/bin/ld: /tmp/ccP7FT6j.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
../common/Makefile.tracer:64: recipe for target 'triad-instrumented' failed
make: *** [triad-instrumented] Error 1
root@3f112d8239ca:/workspace/gem5-aladdin/src/aladdin/SHOC/triad#

Incorrect path to misc.hh

In src/aladdin/gem5/HybridDatapath.cpp, line 11:
#include "base/misc.hh" --> #include "arch/x86/regs/misc.hh"

visualization of the DDDG.

Is there any tool to visualize the DDDG?

CPU clock doesn't get set right

I wrote about this in the gem5-aladdin group some time ago: configs/aladdin/aladdin_se.py needs some further configuration to set the CPU clock correctly; otherwise the CPU is clocked at the system clock.

You need to add:

# All cpus belong to a common cpu_clk_domain, therefore running at a common     
# frequency.                                                           
for cpu in system.cpu:    
    cpu.clk_domain = system.cpu_clk_domain

I put it at line 333 since gem5 puts it there, but I think it is fine anywhere.

Error: can't find library python2.7

Hi,
I am getting the following error in the latest gem5-aladdin image downloaded from docker.
Error: can't find library python2.7. In the scons_config.log, it states the following:
/usr/bin/ld: cannot find -lboost_regex
collect2: error: ld returned 1 exit status
Am I doing something wrong here? I ran the following command:
`which python2.7` `which scons` build/X86/gem5.opt
Also, I am able to build gem5 on the same machine.

How to fix error; build/X86/gem5.opt configs/example/se.py -c tests/test-progs/hello/bin/x86/linux/hello

In /workspace/gem5-aladdin:

build/X86/gem5.opt configs/example/se.py -c tests/test-progs/hello/bin/x86/linux/hello
Error:
AttributeError: Values instance has no attribute 'accel_cfg_file'
How to fix this error?
071_CompileRun_Gem5_Aladdin_C_code_CommonZone.pptx

Checking ISA Word Length

On line 622 of syscall_emul.cc, it looks like you're trying to get the ISA's word size? It might be better to just say size_t word_size = sizeof(Addr); instead of explicitly checking for ISA, since the size of an address should be the size of a word (actually you can probably entirely replace word_size with sizeof(Addr)). I think this will make gem5-Aladdin entirely ISA-independent, since I only had to change that to make it compatible with RISC-V.

Can we simulate 8-bit integer by using Gem5-Aladdin?

Hi.
As the title says: can we run the simulation with a data width of 1 byte (8 bits), such as 8-bit fixed-point, with a system bus of 32 or 64 bits?

High memory usage

Appears while executing __to_nodes(self)

OSError: [Errno 2] No such file or directory

root@5e60d163e16d:/workspace/gem5-aladdin/sweeps# python generate_design_sweeps.py benchmarks/machsuite.xe
use benchmarks.designsweeptypes.Gem5DesignSweep
begin Gem5DesignSweep single
use benchmarks.machsuite.*
generate configs
generate dma_trace
generate gem5_binary
set output_dir "machsuite"
set source_dir "../src/aladdin/MachSuite"
set simulator "gem5-cpu"
set memory_type "spad"
sweep cycle_time from 1 to 5
set unrolling for bfs_bulk.bfs.loop_horizons 1
On line 4: set unrolling for bfs_bulk.bfs.loop_horizons 1
XenonSelectionError: Failed to find object named b.f.s._.b.u.l.k
root@5e60d163e16d:/workspace/gem5-aladdin/sweeps#
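
The object name in the error above, b.f.s._.b.u.l.k, is exactly what results from joining the characters of bfs_bulk with dots. One possible reading (an assumption, not confirmed anywhere in this report) is that a plain string reached code that expected a list of selection tokens. The sketch below reproduces only that string behaviour in plain Python; `describe_selection` is a hypothetical helper and is not part of Xenon.

```python
def describe_selection(tokens):
    """Join selection tokens with dots, the way an error message might.

    Hypothetical helper for illustration; not Xenon's actual code.
    """
    return ".".join(tokens)


# A list of tokens gives the dotted path one would expect.
as_list = describe_selection(["bfs_bulk", "bfs", "loop_horizons"])

# A bare string is also iterable, so each character becomes its own token.
as_string = describe_selection("bfs_bulk")

print(as_list)    # bfs_bulk.bfs.loop_horizons
print(as_string)  # b.f.s._.b.u.l.k
```

If this reading is right, checking where the benchmark name is tokenized before the selection is resolved would be a reasonable first step.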

Docker pull problem

Using default tag: latest
latest: Pulling from xyzsam/gem5-aladdin
34667c7e4631: Pull complete
d18d76a881a4: Pull complete
119c7358fbfc: Pull complete
2aaf13f3eff0: Pull complete
d6b6c17dc2be: Pull complete
39cd4d0c35bb: Pull complete
d3671fc2b01c: Pull complete
16f31d567feb: Pull complete
7626505b6474: Pull complete
a70da26685ea: Pull complete
b30642ff8ef6: Extracting 10.56GB/10.56GB
22ab926116f1: Download complete
c9cfe83b14f9: Download complete
bfa5f9231501: Download complete
a20900fd9e01: Download complete
5cd15bb32aec: Download complete
eea7b62d6dbc: Download complete
8644728c092c: Download complete
latest: Pulling from xyzsam/gem5-aladdin
34667c7e4631: Downloading 7.105MB/43.56MB
d18d76a881a4: Pulling fs layer
119c7358fbfc: Download complete
2aaf13f3eff0: Download complete
d6b6c17dc2be: Downloading 7.506MB/189.2MB
39cd4d0c35bb: Waiting
d3671fc2b01c: Waiting
16f31d567feb: Waiting
7626505b6474: Waiting
a70da26685ea: Waiting
b30642ff8ef6: Waiting
22ab926116f1: Waiting
c9cfe83b14f9: Waiting
bfa5f9231501: Waiting
a20900fd9e01: Waiting
5cd15bb32aec: Waiting
eea7b62d6dbc: Waiting
8644728c092c: Waiting

When pulling the Docker image, the pull restarts immediately after it completes once, and the docker images command does not list the image that was pulled.

Error when running test_multiple_accelerators example with default settings

I ran the default run.sh script for test_multiple_accelerators example and got the following error:

root@ed4c7a3ad50b:/workspace/gem5-aladdin/src/aladdin/integration-test/with-cpu/test_multiple_accelerators# sh run.sh 
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "build/X86/python/m5/main.py", line 468, in main
  File "/workspace/gem5-aladdin/src/aladdin/../../configs/aladdin/aladdin_se.py", line 398, in <module>
    system.find_all(SystolicArray)[0]) == len(system.ruby._cpu_ports))
AssertionError

On investigating further, I found the assertion error is due to "ruby" option in run.sh.

The failing assertion was:
assert(options.num_cpus + 3*len(system.find_all(HybridDatapath)[0] + \
       system.find_all(SystolicArray)[0]) == len(system.ruby._cpu_ports))

I printed str(arg) for each of the arguments in the assertion. These are the values:
str(options.num_cpus) = 1, str(len(system.find_all(HybridDatapath)[0])) = 4, str(system.find_all(SystolicArray)[0]) = [], str(len(system.ruby._cpu_ports)) = 1
Hope this helps. Also, can you comment on why len(system.ruby._cpu_ports) equals 1?
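
The mismatch can be checked with plain arithmetic. The sketch below plugs the printed values into the left-hand side of the assertion; the list contents are placeholder strings standing in for the simulator objects, an assumption made only so the snippet runs outside gem5.

```python
# Values taken from the printed arguments above; the list entries are
# placeholders, not real gem5 objects.
num_cpus = 1
hybrid_datapaths = ["dp0", "dp1", "dp2", "dp3"]  # len(...) was reported as 4
systolic_arrays = []                             # printed as []
num_ruby_cpu_ports = 1                           # len(system.ruby._cpu_ports)

# Left-hand side of the failing assertion.
expected_ports = num_cpus + 3 * len(hybrid_datapaths + systolic_arrays)

print(expected_ports)                        # 13
print(expected_ports == num_ruby_cpu_ports)  # False
```

With these values the assertion expects 13 entries in system.ruby._cpu_ports but finds only 1.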

Thanks,
Isaar

docker compile error

I use Docker to build the platform, but I get the error "Error: can't find library python2.7 required by python".

root@db2d3e806f12:/workspace/gem5-aladdin# scons build/RISCV/gem5.opt -j4
scons: Reading SConscript files ...

You're missing the gem5 style or commit message hook. These hooks help
to ensure that your code follows gem5's style rules on git commit.
This script will now install the hook in your .git/hooks/ directory.
Press enter to continue, or ctrl-c to abort:
Mkdir("/workspace/gem5-aladdin/build/sconsign")
Warning: Your compiler doesn't support incremental linking and lto at the same time, so lto is being disabled. To force lto on anyway, use the --force-lto option. That will disable partial linking.
Checking for C header file Python.h... yes
Checking for C library python2.7... no
Error: can't find library python2.7 required by python

Systolic arrays sizes

I am testing the systolic array accelerator, and I am having problems when changing the array sizes. For the default configuration (8x8), everything works fine.

For smaller sizes, it never ends. It gets stuck in a loop, and never leaves. The following is an example of a loop for a 4x4 systolic array. This happens even for the base test.c.

963152000: system.systolic_array_acc.input_fetch0: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963152000: system.systolic_array_acc.input_fetch1: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963152000: system.systolic_array_acc.input_fetch2: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963152000: system.systolic_array_acc.input_fetch3: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963152000: system.systolic_array_acc.weight_fetch0: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963152000: system.systolic_array_acc.weight_fetch1: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963152000: system.systolic_array_acc.weight_fetch2: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963152000: system.systolic_array_acc.weight_fetch3: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963153000: global: Weight fold barrier, arrived: 1.
963153000: global: Weight fold barrier, arrived: 2.
963153000: global: evaluate
963153000: system.systolic_array_acc.input_fetch0: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963153000: system.systolic_array_acc.input_fetch1: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963153000: system.systolic_array_acc.input_fetch2: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963153000: system.systolic_array_acc.input_fetch3: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963153000: system.systolic_array_acc.weight_fetch0: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963153000: system.systolic_array_acc.weight_fetch1: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963153000: system.systolic_array_acc.weight_fetch2: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963153000: system.systolic_array_acc.weight_fetch3: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963154000: global: Weight fold barrier, arrived: 3.
963154000: global: Weight fold barrier, arrived: 4.
963154000: global: evaluate
963154000: system.systolic_array_acc.input_fetch0: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963154000: system.systolic_array_acc.input_fetch1: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963154000: system.systolic_array_acc.input_fetch2: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963154000: system.systolic_array_acc.input_fetch3: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963154000: system.systolic_array_acc.weight_fetch0: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963154000: system.systolic_array_acc.weight_fetch1: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963154000: system.systolic_array_acc.weight_fetch2: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963154000: system.systolic_array_acc.weight_fetch3: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963155000: global: Weight fold barrier, arrived: 5.
963155000: global: Weight fold barrier, arrived: 6.
963155000: global: evaluate
963155000: system.systolic_array_acc.input_fetch0: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963155000: system.systolic_array_acc.input_fetch1: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963155000: system.systolic_array_acc.input_fetch2: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963155000: system.systolic_array_acc.input_fetch3: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963155000: system.systolic_array_acc.weight_fetch0: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963155000: system.systolic_array_acc.weight_fetch1: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963155000: system.systolic_array_acc.weight_fetch2: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963155000: system.systolic_array_acc.weight_fetch3: Fetch queue occupied space: 0 / 32, allFetched: 1, allConsumed: 1, arrived at barrier: 1.
963156000: global: Weight fold barrier, arrived: 7.
963156000: global: Weight fold barrier, arrived: 8.
963156000: global: All have arrived at the weight fold barrier.

As for the bigger SAs, there are two cases. In the base test.c, SAs up to a size of 16 (not included) work as expected. For bigger sizes, I am getting a segmentation fault.

If I want to simulate bigger layers (e.g., ResNet's conv3), for bigger SAs I am getting the following error:

fatal: Streaming out premature data!

How can these errors be fixed?

Fail to run test_load_store with default settings

I ran the default run.sh script for test_load_store example and got the following output:

0: system.remote_gdb: listening for remote gdb on port 7000
info: Entering event queue @ 0.  Starting simulation...
info: Increasing stack size by one page.
warn: ignoring syscall access(...)
warn: x86 cpuid family 0x0000: unimplemented function 2
warn: x86 cpuid family 0x0000: unimplemented function 2
warn: x86 cpuid family 0x0000: unimplemented function 2
warn: x86 cpuid family 0x0000: unimplemented function 2
info: Received mapping for array store_vals at vaddr 6d0810 of length 8192.
info: Received mapping for array store_loc at vaddr 6d2820 of length 8192.
[WARNING]: Overlapping array declarations found!
  store_vals: 0x226e7d0 - 0x226e7d0
  store_loc: 0x22707e0 - 0x22727e0
[WARNING]: Overlapping array declarations found!
  store_loc: 0x22707e0 - 0x22727e0
  store_vals: 0x226e7d0 - 0x226e7d0
Overlapping array address ranges can lead to incorrect behavior, such as DMA nodes trying to access the wrong arrays, ACP accessing the wrong memory, etc. Please check your Aladdin configuration file and the mapArrayToAccelerator() calls to verify that they are correct.
Global loop pipelining is not ON.
 Current power model supports trig functions running at 10 ns. 
 Cycle time: 2 is not supported yet. Use 10ns power model instead.

And in stdout.gz, I got the output:

44938000: system.test_load_store_datapath: Accelerator completed.
44938000: system.test_load_store_datapath: Elapsed host seconds 0.39.
44938000: system.test_load_store_datapath: Sent finished signal.
44954000: system.test_load_store_datapath: cacheRespCallback for control signal access: 0xd9860
44956000: system.test_load_store_datapath: Woken up the CPU thread context.
Accelerator finished!
FAILED: store_loc[0] = -1, should be 0
FAILED: store_loc[1] = -1, should be 1
FAILED: store_loc[2] = -1, should be 2
FAILED: store_loc[3] = -1, should be 3
FAILED: store_loc[4] = -1, should be 4
FAILED: store_loc[5] = -1, should be 5
...
...
FAILED: store_loc[2045] = -1, should be 2045
FAILED: store_loc[2046] = -1, should be 2046
FAILED: store_loc[2047] = -1, should be 2047
Test failed with 2048 errors.Exiting @ tick 3275724000 because exiting with last active thread context
Simulated exit code not 0! Exit code is 255

Thanks.

How to build gem5 on Linux?

How to build gem5 on Linux? In my case, it is in a secure zone with Python 2.7 as the default.
scons is only available in versions > 3.0, so I have to switch the system Python (from 2.7 to 3.0+). The attached ppt is a step-by-step guide with screen captures for both Ubuntu and the Python 2.6 terminal.
052_BuildRun_Gem5_Dependency_RedZone.pptx

Regarding PE's configuration

When the accelerator type is set to aladdin, does it imply a certain configuration of how the PEs are arranged? If I wish to change the number of PEs, say to 16 in a 4x4 grid, each with 8 vector MAC units, how can I configure this?
For this,
in smv_convolution_op.cpp
const int kNumPEs = 16;
const int kNumMaccsPerPE = 64? cause originally it was 32 (assuming 4 NUM_MACC_INSTS x 8)
and in params.h
#define NUM_MACC_INSTS 8
#define NUM_PE_INSTS 16

but making these changes hardly brought a change in the stats files

Also, if I want to change the word width by reducing the number of bits (to 10 or 8) in which data is stored and communicated, will changing smv-accel.cfg be enough? Where else will I have to make changes? (Is this feasible, or do only fp16 and fp32 work?)

Thank you.

For each algorithm of MachSuite, how to generate the corresponding executable file and the necessary files configured in gem5.cfg such as dynamic_trace.gz?

For each algorithm of MachSuite, how to generate the corresponding executable file and the necessary files configured in gem5.cfg such as dynamic_trace.gz?
For example, I want to simulate test_aes to experiment on the gemm algorithm. Which files are necessary, how are they generated, which are generated by tools, and which need to be configured manually?

MachSuite and SHOC benchmarks are still using DMA interface v1

DMA interface v1 is deprecated. All benchmarks should be migrated to v3. The primary difference is that now there is an explicit separation between host memory and accelerator memory, rather than the implicit difference in the past.

compilation error in the docker image

Hello,

I am trying to compile a gem5 in the docker image but it is unsuccessful. My workflow is as follows:

Getting the docker image and accessing it:

docker pull xyzsam/gem5-aladdin
docker run -it --rm --mount source=gem5-aladdin-workspace,target=/workspace xyzsam/gem5-aladdin

Updating submodules and aladdin

cd /workspace/gem5-aladdin/; git pull; git submodule update --init --recursive
cd /workspace/gem5-aladdin/src/aladdin; git pull origin master

Finally compiling gem5:

cd /workspace/gem5-aladdin/; scons build/X86/gem5.opt

First it asks about the missing hook

scons: Reading SConscript files ...

You're missing the gem5 style or commit message hook. These hooks help
to ensure that your code follows gem5's style rules on git commit.
This script will now install the hook in your .git/hooks/ directory.
Press enter to continue, or ctrl-c to abort:

Then it fails with the following message:

{standard input}: Assembler messages:
{standard input}:6254574: Warning: end of file not at end of a line; newline inserted
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
scons: *** [build/X86/arch/x86/generated/inst-constrs.o] Error 4
scons: building terminated because of errors.

Please find the log attached.
log.txt

Any help would be much appreciated,

Tomas

LLVM-Tracer GitHub URL in the Dockerfile needs to be updated

https://github.com/ysshao/LLVM-Tracer ==> https://github.com/harvard-acc/LLVM-Tracer

error while loading shared libraries: libprotobuf.so.9: cannot open shared object file: No such file or directory

Hi Sam,

I am getting the following error with the latest version of gem5-Aladdin docker file:

/workspace/gem5-aladdin/src/aladdin/../../build/X86_MESI_Two_Level_aladdin/gem5.opt: error while loading shared libraries: libprotobuf.so.9: cannot open shared object file: No such file or directory

Please suggest what needs to be done.

Thanks in advance.

Compilation Error

When I try to compile gem5 using
scons build/X86/gem5.opt
I get this error
ImportError: cannot import name MemTraceProbe
File "/data/omran/emerald/gem5-aladdin/src/aladdin/gem5/HybridDatapath.py", line 2:
from m5.objects import CommMonitor, Cache, MemTraceProbe

Questions about the support of fully-coherent caches in gem5-Aladdin

Hello. The README says that gem5-Aladdin supports three coherence models: non-coherent DMA, LLC-coherent direct access (using ACP), and fully-coherent caches. Meanwhile, in the integration test 'test_load_store', the accelerator uses a private cache to access data from main memory.
(1) Does the 'test_load_store' test belong to the so-called 'fully-coherent caches' model?

In my mind, fully-coherent caches mean that the accelerator's private cache should be coherent with the cpu's private cache (maybe they should be both connected to a shared L2 cache with a coherence protocol).
(2) However, gem5-Aladdin directly connects the accelerator's private cache to the membus by default, and how can it be coherent with the cpu's L1 cache?

Then, I tried adding an L2 cache in 'test_load_store' and modifying aladdin_se.py to reconnect the accelerator's private cache to the L2 cache, so that the L2 is shared by both the accelerator's and the CPU's private caches. The simulation result seems to make sense, but I don't know whether it is correct. Meanwhile, I found a comment in lines 248-251 of configs/common/CacheConfig.py:

gem5-aladdin/configs/common/CacheConfig.py

Line 248 in d4efbee

# The ability for the accelerator to have an L2 cache has been removed for now. The original implementation of attaching the accelerator's dcache to the CPU's L2 cache is probably not what users would expect anyways.

(3) I wonder what the authors mean by this comment.
(4) What does the real 'fully-coherent caches' mean in gem5-Aladdin?

Xenon sweep scripts should be updated to always run with a cpu

Several issues in one here:

Xenon sweep scripts currently default to set simulator "gem5-cache", for an accelerator-only simulation. This feature has been deprecated. All scripts should be updated to set simulator "gem5-cpu".
Xenon sweep scripts need to add generate gem5_binary so that the benchmarks get built.
We removed the hyphens from the Machsuite benchmark names (issue #6). This now conflicts with the names of the benchmarks in their Makefiles, which still contain the hyphens. This needs to be updated too, along with any other references.

SCons and python3 issues

gem5 has not yet fully migrated to Python3, but the version of Scons we use in the docker image has, which causes issues like this if the user just runs scons build/X86/gem5.opt:

TypeError: a bytes-like object is required, not 'str':
  File "/workspace/gem5-aladdin/SConstruct", line 393:
    main['GCC'] = CXX_version and (CXX_version.find('g++') >= 0 or \

For now, a workaround is to use the command: python2.7 `which scons` build/X86/gem5.opt.
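
The TypeError above is the usual Python 3 symptom of passing a str argument to a method of a bytes object. A minimal sketch, assuming the compiler version string arrives as bytes (as subprocess output does under Python 3 by default); the sample version string and the `has_gpp` helper are made up for illustration and are not taken from SConstruct.

```python
def has_gpp(version):
    """Return True if 'g++' appears in version, given either bytes or str."""
    if isinstance(version, bytes):
        # Decode first so that str methods and str arguments can be used.
        version = version.decode("utf-8", errors="replace")
    return version.find("g++") >= 0


# Made-up sample; under Python 3, subprocess output is bytes by default.
cxx_version = b"g++ (GCC) 5.4.0"

try:
    cxx_version.find("g++")  # bytes.find() called with a str argument
    raised_type_error = False
except TypeError:
    raised_type_error = True

print(raised_type_error)     # True on Python 3
print(has_gpp(cxx_version))  # True
```

Decoding at the point where the output is read keeps the rest of the comparison logic unchanged, which is why the sketch normalizes the value inside one helper.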

Is gem5-aladdin okay in simulating systolic arrays?

I hope that systolic array is not an unfamiliar term nowadays. I see a lot of custom accelerators (like the TPU) use a systolic array as a crucial part of their design. Therefore, I wonder whether gem5-Aladdin can be used to build such a systolic array-based accelerator, fetching data from surrounding buffers to feed into the accelerator with a specific dataflow?

Of course, my final target is analyzing the interaction between this accelerator and the surrounding components, not internal NoC issues...

Disable performance, power, or area

Is there a way to control

Performance only
Power and area only

Error while generating sweeps

Hello,
I'm discovering Aladdin and gem5-Aladdin, and I'd like to try some examples from the SHOC benchmark suite. The example provided in https://github.com/ysshao/ALADDIN cites the SHOC/triad example, so I'd like to generate other configurations using Xenon.

For shoc.xe and machsuite.xe I get the following error. Did I miss something?
(I also mention that cortexsuite is not in the repo).

Thanks for the help!
Best regards,

root@ef5b091eb106:/workspace/gem5-aladdin/sweeps# python generate_design_sweeps.py benchmarks/shoc.xe     
Traceback (most recent call last):
  File "generate_design_sweeps.py", line 31, in <module>
    main()
  File "generate_design_sweeps.py", line 28, in main
    run(args.xenon_file)
  File "generate_design_sweeps.py", line 12, in run
    genfiles = interpreter.run()
  File "/workspace/gem5-aladdin/sweeps/xenon/xenon_interpreter.py", line 85, in run
    self.execute()
  File "/workspace/gem5-aladdin/sweeps/xenon/xenon_interpreter.py", line 64, in execute
    current_sweep = command(current_sweep)
  File "/workspace/gem5-aladdin/sweeps/xenon/base/commands.py", line 39, in __call__
    return self.execute(*args)
  File "/workspace/gem5-aladdin/sweeps/xenon/base/commands.py", line 245, in execute
    selected_objs = self.selection(sweep_obj)
  File "/workspace/gem5-aladdin/sweeps/xenon/base/commands.py", line 39, in __call__
    return self.execute(*args)
  File "/workspace/gem5-aladdin/sweeps/xenon/base/commands.py", line 58, in execute
    return self.select(env)
  File "/workspace/gem5-aladdin/sweeps/xenon/base/commands.py", line 55, in select
    return common.getSelectedObjs(self.tokens, env)
  File "/workspace/gem5-aladdin/sweeps/xenon/base/common.py", line 35, in getSelectedObjs
    current_view = getattr(current_view, token)
TypeError: getattr(): attribute name must be string

Related harvard-acc/ALADDIN#16 .

GEMM in Systolic Array

Is there currently a way to compute matrix-matrix multiplications with the systolic array or is it limited to convolutional tensors?

[err][test] Get error DMA and ACP accessing the wrong memory when run the test_loop_sampling with the default config

Hi, devs! I ran into a problem (similar to #35, although the message looks like an error, not a warning) when running the test_loop_sampling test in gem5-aladdin.

Error

root@e426682e09dc:/workspace/gem5-aladdin/src/aladdin/integration-test/with-cpu/test_loop_sampling# sh run.sh 
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (4096 Mbytes)
0: system.remote_gdb: listening for remote gdb on port 7000
info: Entering event queue @ 0.  Starting simulation...
info: Increasing stack size by one page.
warn: ignoring syscall access(...)
warn: x86 cpuid family 0x0000: unimplemented function 2
warn: x86 cpuid family 0x0000: unimplemented function 2
warn: x86 cpuid family 0x0000: unimplemented function 2
warn: x86 cpuid family 0x0000: unimplemented function 2
info: Received mapping for array inputs_host at vaddr 6d0840 of length 128.
info: Received mapping for array results_host at vaddr 6d0920 of length 128.
[WARNING]: Overlapping array declarations found!
  inputs_acc: 0x1259a60 - 0x1259ae0
  results_acc: 0x1259b40 - 0x1259bc0
[WARNING]: Overlapping array declarations found!
  inputs_host: 0x12598a0 - 0x12598a0
  results_acc: 0x1259b40 - 0x1259bc0
[WARNING]: Overlapping array declarations found!
  inputs_host: 0x12598a0 - 0x12598a0
  inputs_acc: 0x1259a60 - 0x1259ae0
[WARNING]: Overlapping array declarations found!
  inputs_host: 0x12598a0 - 0x12598a0
  results_host: 0x1259980 - 0x1259a00
[WARNING]: Overlapping array declarations found!
  results_host: 0x1259980 - 0x1259a00
  results_acc: 0x1259b40 - 0x1259bc0
[WARNING]: Overlapping array declarations found!
  results_host: 0x1259980 - 0x1259a00
  inputs_acc: 0x1259a60 - 0x1259ae0
[WARNING]: Overlapping array declarations found!
  results_host: 0x1259980 - 0x1259a00
  inputs_host: 0x12598a0 - 0x12598a0
Overlapping array address ranges can lead to incorrect behavior, such as DMA nodes trying to access the wrong arrays, ACP accessing the wrong memory, etc. Please check your Aladdin configuration file and the mapArrayToAccelerator() calls to verify that they are correct.
Global loop pipelining is not ON.
 Current power model supports trig functions running at 10 ns. 
 Cycle time: 2 is not supported yet. Use 10ns power model instead.

How to reproduce

1. Pull the Docker image from Docker Hub
2. Run the image with docker run
3. python2.7 `which scons` build/X86/gem5.opt
4. cd /workspace/gem5-aladdin/src/aladdin/integration-test/with-cpu/test_loop_sampling
5. sh run.sh

Error: can't find library python2.7 required by python

Hi,

I am getting this error while compiling the source:
Error: can't find library python2.7 required by python
I have Python 2.7, Python-dev, and SQLite installed. I can compile the latest version of gem5 on my machine, but I cannot install aladdin.

SHOC/fft dynamic trace issue

Hi,
I was trying to run the fft test in SHOC. In the normal mode (without dma) it makes the dynamic trace but it does not give out the power and area summary (it does give out the stats). When I try to make the trace using the dma-trace-binary, the fft-instrumented has a segmentation fault. Please let me know if there is a workaround. My main goal is to get power and area numbers.

Thanks