
esa-tu-darmstadt / tapasco Goto Github PK


The Task Parallel System Composer (TaPaSCo)

License: GNU Lesser General Public License v3.0

CMake 0.22% Tcl 5.05% C 2.72% Makefile 0.12% C++ 1.81% Python 0.59% Shell 0.37% Scala 3.45% Verilog 83.66% Bluespec 0.04% Roff 0.15% Rust 1.83%
hardware-acceleration hardware fpga fpga-soc


Introduction

Specialized accelerators in a heterogeneous system play a vital role in providing enough compute power for current and upcoming computational tasks. Field-programmable gate arrays (FPGA) are an established platform for such custom and highly specialized accelerators. However, an accelerator implementation alone is only part of the way to a usable system. In order to be used as a specialized co-processor in a heterogeneous setup, the accelerator still needs to be integrated into the overall system and requires a connection to the host (typically a software-programmable CPU) and often also external memory.

The open-source TaPaSCo (Task-Parallel System Composer) framework was created to serve exactly this purpose: The fast integration of FPGA-based accelerators into heterogeneous compute platforms or systems-on-chip (SoC) and their connection to relevant components on the FPGA board.

TaPaSCo can support developers in all steps of the development process of heterogeneous systems:

  • TaPaSCo Toolflow: from cores resulting from High-Level Synthesis or cores manually written in an HDL, a complete FPGA-design can be created. TaPaSCo will automatically connect all processing elements to the memory- and host-interface and generate a complete bitstream.

  • TaPaSCo Runtime API: lets software interface with the accelerator and supports operations such as transferring data to the FPGA memory, passing values to accelerator cores, and controlling the execution of the processing elements.

Next to the setup and usage instructions in this README, you can find additional information about TaPaSCo in the tutorial videos and the scientific publications describing and using TaPaSCo.

We welcome contributions from anyone interested in this field. Check the contributor's guide for more information.

Supported FPGA devices

  • Zynq-based: PYNQ-Z1, ZC706, ZedBoard, Ultra96V2, ZCU102
  • PCIe cards: VC709, NetFPGA-SUME, VCU108, VCU118, VCU1525, Alveo U250, Alveo U280, BittWare XUP-VVH, PRO DESIGN HAWK, VCK5000

System Requirements

TaPaSCo is known to work in this environment:

  • Intel x86_64 arch
  • Linux kernel 4.4+
  • CentOS 8, Fedora 30+, Ubuntu 16.04+
  • Fedora 24/25 does not support debug mode due to a GCC bug
  • Bash Shell 4.2.x+

Other setups likely work as well, but are untested.

Prerequisites for Toolflow

To use TaPaSCo, you'll need working installations of

  • Vivado Design Suite 2017.4 or newer
  • Java SDK 8 - 11
  • git
  • python3
  • GCC newer than 5.x.x for C++11 support
  • OPTIONAL: Local Installation of gradle 5.0+, if you do not want to use the included wrapper.

If you want to use the High-Level Synthesis flow for generating custom IP cores, you will also need:

  • Vivado HLS 2017.4+ or Vitis HLS 2020.2+

Check that at least the following are in your $PATH:

  • vivado - if it is not, source path/to/vivado/settings64.sh
  • git
  • bash
  • [vivado_hls,vitis_hls] - Since Vivado 2018.1 this is included in vivado
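The PATH check above can be scripted. The following is a minimal sketch, not part of TaPaSCo: the helper name missing_tools is made up for illustration, and it only relies on the shell builtin command -v.

```bash
# Print every tool from the argument list that is not found in $PATH.
# (missing_tools is a hypothetical helper, not a TaPaSCo command.)
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools named in the list above. Add vivado_hls or vitis_hls if you
# plan to use the HLS flow.
missing="$(missing_tools vivado git bash)"
if [ -n "$missing" ]; then
  # $missing is intentionally unquoted: one output line per missing tool.
  printf 'Not in PATH: %s\n' $missing
  echo 'For vivado, try: source path/to/vivado/settings64.sh'
else
  echo 'All required tools found.'
fi
```

The script only reports what is missing; it does not change the environment.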

When using Ubuntu, ensure that the following packages are installed:

  • unzip
  • zip
  • git
  • findutils
  • curl
  • default-jdk
apt-get -y install unzip git zip findutils curl default-jdk

When using Fedora, ensure that the following packages are installed:

  • which
  • java-openjdk
  • findutils
dnf -y install which java-openjdk findutils

Prerequisites for Simulation

  • Vivado Design Suite 2021 or newer
  • Questa Simulator 2021 or newer
  • python3
  • pip3

TaPaSCo-Toolflow Setup

Using the prebuilt packages, the setup of TaPaSCo is very easy:

  1. Create or open a folder that you would like to use as your TaPaSCo workspace. Within this folder, run the TaPaSCo-Initialization-Script, which is located at /opt/tapasco/tapasco-init-toolflow.sh. This will set up your current folder as TAPASCO_WORK_DIR. It will also create the file tapasco-setup.sh within your current directory.
  2. Source tapasco-setup.sh.

If you want to use a specific (pre-release) version or branch, you can do the following:

  1. Clone TaPaSCo: git clone https://github.com/esa-tu-darmstadt/tapasco.git
  2. Optionally check out a corresponding branch: git checkout <BRANCH>
  3. Create or open a folder that you would like to use as your TaPaSCo workspace. Within this folder, run the TaPaSCo-Initialization-Script tapasco-init.sh, which is located in the root folder of your cloned repo. This will set up your current folder as TAPASCO_WORK_DIR. It will also create the file tapasco-setup.sh within your workdir.
  4. Source tapasco-setup.sh to set up the TaPaSCo-Environment.
  5. Build the TaPaSCo-Toolflow using tapasco-build-toolflow.

Whenever you want to use TaPaSCo in the future, just source the tapasco-setup.sh of the corresponding workspace. This also allows you to have multiple independent TaPaSCo-Workspaces.

Prerequisites for compiling the runtime

Ubuntu:

apt-get -y install build-essential linux-headers-generic python3 cmake libelf-dev git rpm protobuf-compiler

Fedora:

dnf -y install kernel-devel make gcc gcc-c++ elfutils-libelf-devel cmake python3 libatomic git rpm-build protobuf-compiler

Arch:

pacman -S linux-headers make gcc libelf libatomic_ops cmake python3 git protobuf

Rust:

The runtime uses Rust and requires a recent version of it. The versions provided by most distributions are too old. We recommend the official way of installing Rust through rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs -o /tmp/rustup.sh && sh /tmp/rustup.sh -y
source ~/.cargo/env

TaPaSCo-Runtime Setup

If you want to use a specific (pre-release) version or branch, you can do the following:

  1. Clone TaPaSCo: git clone https://github.com/esa-tu-darmstadt/tapasco.git
  2. Optionally check out a corresponding branch: git checkout <BRANCH>
  3. Create or open a folder that you would like to use as your TaPaSCo workspace. Within this folder, run the TaPaSCo-Initialization-Script tapasco-init.sh, which is located in the root folder of your cloned repo. This will set up your current folder as TAPASCO_WORK_DIR. It will also create the file tapasco-setup.sh within your workdir.
  4. Source tapasco-setup.sh to set up the TaPaSCo-Environment.
  5. Build the TaPaSCo runtime libraries using tapasco-build-libs.
  6. Optionally, if you want to use the simulation features, build the libraries using tapasco-build-libs -s. Currently the only supported simulator is Questa.

All of this is not necessary when using the prebuilt packages. In that case, the corresponding libraries and files are installed as usual for your OS. Simulation support is currently not available with prebuilt packages.

Getting Started - Build a TaPaSCo design

  1. Import your kernels
    • HDL flow: tapasco import path/to/ZIP as <ID> -p <PLATFORM> will import the corresponding ZIP file as a new HDL-based core. The Kernel-ID is set from <ID>, and the optional flag -p <PLATFORM> determines for which platform the kernel will be available. If it is omitted, the kernel will be made available for all platforms, which may take a lot of time.
    • HLS flow: tapasco hls <KERNEL> -p <PLATFORM> will perform HLS according to the kernel.json. The resulting HLS-based core will be made available for the platform given by -p <PLATFORM>. Again, -p can be omitted. HLS-Kernels are generally located in $TAPASCO_WORKDIR/kernel. If you want to add kernels, you can either symlink or copy them into the folder. Additionally, the folder can be temporarily changed using the optional --kernelDir path/to/kernels flag like this: tapasco --kernelDir path/to/kernels hls <KERNEL> -p <PLATFORM>
  2. Create a composition: tapasco compose [<KERNEL> x <COUNT>] @ <NUM> MHz -p <PLATFORM>
  3. Load the bitstream: tapasco-load-bitstream <BITSTREAM>
  4. Implement your host software
    • C API
    • C++ API

You can get more information about commands with tapasco --help and the corresponding subpages with tapasco --help <TOPIC>
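As an illustration of the compose syntax in step 2, the sketch below prints a concrete command line instead of running it. The helper compose_cmd is hypothetical. The example values (arrayinit, 4 instances, 100 MHz, vcu118) are taken from an issue report further down this page. Quoting the bracketed composition keeps it as a single shell argument.

```bash
# Print (not run) a compose command for a single kernel type.
# compose_cmd is a hypothetical helper, not part of TaPaSCo.
compose_cmd() {
  local kernel="$1" count="$2" mhz="$3" platform="$4"
  printf "tapasco compose '[%s x %s]' @ %s MHz -p %s\n" \
    "$kernel" "$count" "$mhz" "$platform"
}

compose_cmd arrayinit 4 100 vcu118
# prints: tapasco compose '[arrayinit x 4]' @ 100 MHz -p vcu118
```

Running the printed command requires a sourced tapasco-setup.sh and Vivado in your $PATH.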

Getting Started - Build a Software-Interface

  1. Design your Accelerator using HLS/HDL according to the previous section.
  2. Load your bitstream: tapasco-load-bitstream my-design.bit --reload-driver. To do this, you have to source vivado and tapasco-setup.sh.
  3. Write a C/C++ executable that interfaces with your design accordingly. To get a better understanding of this, you might want to refer to the collection of examples and the corresponding README which is located in $TAPASCO_HOME/runtime/examples
  4. Build and Compile your Software.

Getting Started - Build a Boot Image

This repository provides a script to generate boot images for some common AMD Xilinx rSoC boards. Refer to the dedicated README for more information.

Using the Simulation

  1. Design the Accelerator using HLS/HDL for the platform sim.
  2. The TaPaSCo-Toolflow will generate a ZIP-file in place of a bitstream.
  3. Load the design and start the simulation with tapasco-start-sim path/to/zip
    • After being started, the simulation can be stopped using CTRL-C
    • Subsequent simulations of that design can be started simply using tapasco-start-sim
    • New designs need to be loaded again by following step 3.
    • The currently used Questa-Simulator offers the option to interact with the simulation via a GUI. To start the simulation in GUI mode, the flag --gui needs to be used.
    • To speed up data transfers to the simulated design, the flag --unsafe-sim can be used. With this flag, the simulation allows the host software to schedule multiple write requests, which makes it impossible to match errors in the simulator to the corresponding write request in the host software.
  4. By default the simulation listens on port 4040 for incoming connections of the TaPaSCo-Runtime. Before starting your software-interface, make sure that it can connect to the simulation. You can specify that port in the runtime by setting the environment variable SIM_PORT and by using the flag --sim-port with the tapasco-start-sim command.
    • If simulation and your software-interface are running on the same host, there shouldn't be an issue
    • If simulation and software-interface are running on different hosts, port 4040 can be forwarded via ssh using ssh -L 4040:localhost:4040 simulation-host on the host where the software-interface should run.
  5. Run your Software
    • Make sure to select the correct TaPaSCo kernel-device in your software when instantiating the Tapasco Class/Structure.
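The port handling from step 4 can be sketched as follows. SIM_PORT and the ssh -L form come from the text above. The helper forward_cmd is a made-up name, and simulation-host stands for whichever machine runs the simulator. The sketch only prints the ssh command; it does not open a connection.

```bash
# Use the same port on both sides; 4040 is the default named above.
SIM_PORT="${SIM_PORT:-4040}"
export SIM_PORT   # read by the TaPaSCo runtime according to step 4

# Print the port-forwarding command for a remote simulation host.
# forward_cmd is a hypothetical helper, not a TaPaSCo command.
forward_cmd() {
  printf 'ssh -L %s:localhost:%s %s\n' "$SIM_PORT" "$SIM_PORT" "$1"
}

forward_cmd simulation-host
```

If you change the port, remember to pass the same value to tapasco-start-sim through its --sim-port flag.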

Acknowledgements

TaPaSCo is based on ThreadPoolComposer, which was developed by us as part of the REPARA project, a Framework Seven (FP7) funded project by the European Union.

We would also like to thank Bluespec, Inc. for making their Bluespec SystemVerilog (BSV) tools available to us and their permission to distribute the Verilog code generated by the Bluespec Compiler (bsc).

Publications

A List of publications about TaPaSCo or TaPaSCo-related research can be found here.

If you want to cite TaPaSCo, please use the following information:

[Heinz2021a] Heinz, Carsten, Jaco Hofmann, Jens Korinth, Lukas Sommer, Lukas Weber, and Andreas Koch. 2021. The Tapasco Open-Source Toolflow. In Journal of Signal Processing Systems.

Releases

We provide pre-compiled packages for many popular Linux distributions. All packages are built for the x86_64 variant.

Distribution    Kernel Driver   Kernel Driver (Debug)   Runtime   Runtime (Debug)   Toolflow
Ubuntu 18.04    Download        Download                DEB       DEB               DEB
Ubuntu 20.04    Download        Download                DEB       DEB               DEB
Ubuntu 22.04    Download        Download                DEB       DEB               DEB
Rocky Linux 8   Download        Download                RPM       RPM               RPM
Fedora 36       Download        Download                RPM       RPM               RPM

tapasco's People

Contributors

atomcrafty, c-93, cahz, hmentzer, jahofmann, jkorinth, kmeinhar, lukasmweber, m-ober, mhrtmnn, shadaar, sommerlukas, stacu, teflonantihaft, timksf, tsmk94, wirthjohannes, yannickl96, zyno42

tapasco's Issues

BlueDMA support in ZC706

ZC706 could benefit from a DMA engine feature that makes the on-board DDR banks usable. Port BlueDMA to Zynq and implement a Platform Feature for it.

Feature: System Cache

(Re-)Implement Feature for Xilinx System Cache

  • Check datasheet: Write-through strategy?
  • TCL integration

Allow direct view of the device memory on PCIe

This can be implemented by using a sliding window and a second BAR. The Xilinx Core does not support this feature directly, though. Will use a little Bluespec Module that has one configuration register for the address offset which forwards the requests accordingly.

$TAPASCO_HOME/common/common.tcl causes vivado_hls to exit (non-zero exit code)

System Details

  • 4.4.0-64-generic
  • g++-5 (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
  • g++-4.9 (Ubuntu 4.9.3-13ubuntu2) 4.9.3
  • Ubuntu LTS 16.04
  • Vivado installed (2018.2); in Path
  • PATH="/home/demo/tapasco/bin:/home/demo/tapasco/build/install/usr/local/bin/:/home/demo/moreSpace/Vivado/2018.2/bin:/home/demo/tapasco/bin:/home/demo/tapasco/build/install/usr/local/bin/:/home/demo/moreSpace/Vivado/2018.2/bin:/home/demo/tapasco/bin:/home/demo/tapasco/build/install/usr/local/bin/:/home/demo/moreSpace/Vivado/2018.2/bin:/home/demo/.sdkman/candidates/sbt/current/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"

Problem Summary

tapasco -v compose [ arrayinit x 4 ] @ 100 MHz -p vcu118 fails because Vivado HLS finishes with non-zero exit code (1):

[15:46:48 <main: Tapasco$> INFO] Running with configuration: [Configuration @/home/demo/tapasco/default.cfg]
Verbose = Some(verbose)
KernelDir = /home/demo/tapasco/kernel
CoreDir = /home/demo/tapasco/core
ArchDir = /home/demo/tapasco/arch
PlatformDir = /home/demo/tapasco/platform
Slurm = false
Parallel = false
MaxThreads = unlimited
MaxTasks = unlimited
Jobs = List(ComposeJob([arrayinit x 4],100.0,Vivado,None,Some(ArrayBuffer(vcu118)),None,None))
[15:46:48 <main: Compose$> INFO] need to synthesize the following cores first: arrayinit @ axi4mm@vcu118
[15:46:48 <pool-1-thread-1: VivadoHighLevelSynthesis$> INFO] starting run 'arrayinit' for axi4mm@vcu118: output in /home/demo/tapasco/core/arrayinit/axi4mm/vcu118/hls/axi4mm.log
[15:46:48 <pool-1-thread-1: VivadoHighLevelSynthesis$> INFO] verbose mode verbose is active, starting to watch /home/demo/tapasco/core/arrayinit/axi4mm/vcu118/hls/axi4mm.log
[15:46:50 <pool-1-thread-1: VivadoHighLevelSynthesis$> INFO] Script name was: /home/demo/tapasco/core/arrayinit/axi4mm/vcu118/hls/axi4mm.tcl
[15:46:50 <pool-1-thread-1: VivadoHighLevelSynthesis$> INFO] Logfile name was: /home/demo/tapasco/core/arrayinit/axi4mm/vcu118/hls/axi4mm.log
[15:46:50 <pool-1-thread-1: VivadoHighLevelSynthesis$> INFO] Pwd was: /home/demo/tapasco/core/arrayinit/axi4mm/vcu118/hls
[15:46:50 <pool-1-thread-1: VivadoHighLevelSynthesis$> ERROR] Vivado HLS finished with non-zero exit code: 1 for 'arrayinit' for axi4mm@vcu118
[15:46:50 <main: HighLevelSynthesis$> INFO] all HLS tasks have finished.
[15:46:50 <main: Compose$> ERROR] HLS tasks failed, aborting composition
[15:46:50 <main: Tapasco$> ERROR] TaPaSCo finished with errors

Note that Vivado complains about not knowing how to deal with -notrace.

Problem Isolation (after first run of tapasco compose [...])

  • tapasco -v compose [ arrayinit x 4 ] @ 100 MHz -p vcu118 -> fails
  • cd /home/demo/tapasco/core/arrayinit/axi4mm/vcu118/hls -> change into path where vivado_hls had been executed before
  • vivado_hls -f /home/demo/tapasco/core/arrayinit/axi4mm/vcu118/hls/axi4mm.tcl -> fails
  • replace lines 25 - 27 in home/demo/tapasco/common.tcl with:
    proc source_quiet {fn} {
    eval "source [expr {[string is space [info commands version]] ? {} : {}}]" $fn
    }
  • vivado_hls -f /home/demo/tapasco/core/arrayinit/axi4mm/vcu118/hls/axi4mm.tcl -> succeeds
  • tapasco -v compose [ arrayinit x 4 ] @ 100 MHz -p vcu118 -> succeeds

Add board part repository to platform configuration

It would be nice to have an additional parameter in the platform.json to configure a board part repository (e.g. BoardRepository). The resulting code should then look like this (new code between create_project and if).

# setup the project
create_project microarch [pwd]/microarch -part {xcvu9p-flgb2104-2-i} -force

set urepo1 $::env(AWS_FPGA_REPO_DIR)/hdk/common/shell_v04261818/hlx/design/boards
set_property BOARD_PART_REPO_PATHS ${urepo1} [current_project]

if {"xilinx.com:f1_cl:part0:1.0" != ""} {
  set_property board_part {xilinx.com:f1_cl:part0:1.0} [current_project]
}

LED feature on VC709 crashes Vivado

Enabling the LED feature on VC709 compositions reproducibly crashes Vivado. While this is certainly a Vivado bug, we should investigate a workaround.

Local memory slots not considered in area estimation, causing DSE to fail

If a PE has local memories (or more than one slave interface, for that matter), DSE will still try to build more instances than will fit in the current 128 slots limit. There are several possible solutions:

  1. Have separate enumeration for memory slots (affects status core, platform_info and potentially requires a more sophisticated way to determine accessibility for each PE).
  2. Fix the algorithms to account for each slave interface instead of just assuming one.

Need to think about it some more; I guess, each PE will always have exactly one control slave interface. We could require a naming convention to identify it if more than one candidate is present on a PE, e.g., S_AXI_CTRL or similar. All other slave interfaces could be assigned a base address from a different pool, e.g., using the upper 64 base addresses already reserved for platform addresses. But we'd have to come up with some O(k) or at least O(n) scheme to find the base addresses of all slaves on a PE. 🤔
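Solution 2 amounts to counting one slot per slave interface rather than one per PE. The sketch below illustrates that arithmetic only. It is not the DSE code, and the "count:interfaces" input format and the helper names are assumptions made for this example.

```bash
# Slot limit mentioned above.
MAX_SLOTS=128

# Sum count * interfaces over arguments of the form "count:interfaces".
# (The argument format is hypothetical, chosen for this sketch.)
slots_needed() {
  local total=0 pair count ifaces
  for pair in "$@"; do
    count="${pair%%:*}"    # text before the colon
    ifaces="${pair##*:}"   # text after the colon
    total=$(( total + count * ifaces ))
  done
  echo "$total"
}

# Succeed only if the composition fits into the available slots.
fits() {
  [ "$(slots_needed "$@")" -le "$MAX_SLOTS" ]
}

slots_needed 16:1 32:2            # 16*1 + 32*2 = 80
fits 16:1 32:2 && echo "fits"
fits 100:2 || echo "too large"    # 100*2 = 200 > 128
```

Under the one-interface-per-PE assumption, 100 instances would look acceptable. With two slave interfaces each they need 200 slots, which is the failure described in this issue.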

Compose fails after HLS runs

Sometimes a compose job fails after successful HLS runs with the following error:

[16:22:41 <pool-1-thread-2: ImportTask> INFO] Import of 'arrayinit_axi4mm.zip' with target axi4mm@vc709
[16:22:41 <pool-1-thread-2: Import$> INFO] SynthesisReport for arrayinit not found, starting evaluation ...
[16:22:41 <pool-1-thread-2: EvaluateIP$> INFO] starting evaluation of /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arrayinit/axi4mm/vc709/ipcore/arrayinit_axi4mm.zip for xc7vx690tffg1761-2@1000,000 MHz, output in /tmp/372075065893313964/evaluate.log
[16:30:38 <pool-1-thread-2: EvaluateIP$> INFO] evaluation of /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arrayinit/axi4mm/vc709/ipcore/arrayinit_axi4mm.zip for xc7vx690tffg1761-2@1000,000 MHz finished successfully, report in /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arrayinit/axi4mm/vc709/ipcore/arrayinit_export.xml
[16:30:38 <pool-1-thread-3: VivadoHighLevelSynthesis$> INFO] starting run 'arraysum' for axi4mm@vc709: output in /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arraysum/axi4mm/vc709/hls/axi4mm.log
[16:31:16 <pool-1-thread-3: VivadoHighLevelSynthesis$> INFO] Vivado HLS finished successfully for 'arraysum' for axi4mm@vc709
[16:31:16 <main: HighLevelSynthesis$> INFO] all HLS tasks have finished.
[16:31:16 <main: HighLevelSynthesis$> WARN] executed HLS with co-sim for [Kernel @/home/wimi/jah/projects/tapasco/tapasco_2018.2/kernel/arraysum/kernel.json]
Name = arraysum
TopFunction = arraysum
Version = 1.0
Files = /home/wimi/jah/projects/tapasco/tapasco_2018.2/kernel/arraysum/arraysum.c
TestbenchFiles = /home/wimi/jah/projects/tapasco/tapasco_2018.2/kernel/arraysum/arraysum-tb.c
CompilerFlags = 
TestbenchCompilerFlags = 
Args = arr by reference
OtherDirectives = None, but no co-simulation report was found
[16:31:16 <pool-1-thread-2: ImportTask> INFO] Import of 'arraysum_axi4mm.zip' with target axi4mm@vc709
[16:31:16 <pool-1-thread-2: Import$> INFO] SynthesisReport for arraysum not found, starting evaluation ...
[16:31:16 <pool-1-thread-2: EvaluateIP$> INFO] starting evaluation of /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arraysum/axi4mm/vc709/ipcore/arraysum_axi4mm.zip for xc7vx690tffg1761-2@1000,000 MHz, output in /tmp/9791558545089762559/evaluate.log
[16:39:08 <pool-1-thread-2: EvaluateIP$> INFO] evaluation of /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arraysum/axi4mm/vc709/ipcore/arraysum_axi4mm.zip for xc7vx690tffg1761-2@1000,000 MHz finished successfully, report in /home/wimi/jah/projects/tapasco/tapasco_2018.2/core/arraysum/axi4mm/vc709/ipcore/arraysum_export.xml
[16:39:08 <main: Compose$> INFO] all HLS tasks finished successfully, beginning compose run...
[16:39:08 <pool-1-thread-4: ComposeTask> ERROR] java.lang.Exception: could not find all required cores for target axi4mm@vc709, missing: arrayinit, arraysum

Command to reproduce:
tapasco compose '[counter x 16, arrayinit x16, arraysum x 32]' @ 200MHz -p vc709

@cahz mentioned that it does work sometimes but not reliably.
I'd guess the problem lies in src/main/scala/tapasco/filemgmt/FileAssetManager.scala not refreshing properly on directory changes.

Support PE-local memories in HLS

Use new PE-local memory support to enable a new kind of HLS port pattern: localmem. A Tcl script should automatically wrap the PE with BRAM and also make the BRAM accessible via secondary S-AXI. Using the new PE-local memories, it should be possible to use BRAMs for HLS-based kernels, e.g., AES.

Tcl: Writes[T]

Implement Tcl serialization support like Json: Define package tcl with Writes[T]. Default types should include Writes[(String, Int)] (which writes set name 42) and similar.

Interface

trait Writes[T] {
  def writes(t: T): String
}

object Tcl {
  def toTcl[T](t: T)(implicit w: Writes[T]): String = w.writes(t)
}

Asynchronous Memory Transfers

From @jkorinth: Similar to asynchronous job launches: Check if asynchronous memory transfers could be useful. I guess probably not so much, because we need to wait for the transfers to finish before we can launch the job in any case. In the worst (and ideal) case, the job starts immediately and the data must already be available. It would be possible to add memory barriers based on the job struct, but I do not think this would be worth the effort.

Make synthesis and implementation effort configurable

The default settings used at the moment are AlternateRoutability + Retiming for Synthesis and Explore + PHYS_OPT_DESIGN for Implementation. These settings could be considered to be very high effort. A switch could be added to let the user decide between different "effort levels". For most synthesis runs it is not necessary to go with very high effort and the user might be happy about the much lower run-time.

Possible overlap with #21

Improve SLURM job names

SLURM compose jobs currently have names like compose-0xd4c6a941-axi4mm-pynq-180.00. Use new naming scheme instead to show full configuration. Possibly also change the comment to the working directory.

Improve LogTrackingPanel

The LogTrackingPanel is not yet as useful as it could be. Ideas:

  • highlight lines from different files in different colors
  • prepend logfile name (probably unreadable)
  • implement search field for free text / regex searching
  • enable quick filters: ERROR, CRITICAL, WARNING

relative path names in values for --coreDir and --kernelDir

Something like

tapasco -v --coreDir ./Cores --kernelDir ./Ips/ hls FooIp -p vcu118

does not work on my system. In my shell the environment variable TAPASCO_HOME is set to /home/demo/tapasco.

The error message shows that tapasco appends Ips to the TAPASCO_HOME environment variable. Is this intended behavior?

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: kernels directory /home/demo/tapasco/Ips does not exist
	at scala.Predef$.require(Predef.scala:277)
	at de.tu_darmstadt.cs.esa.tapasco.base.ConfigurationImpl.$anonfun$new$2(ConfigurationImpl.scala:77)
	at de.tu_darmstadt.cs.esa.tapasco.base.ConfigurationImpl.$anonfun$new$2$adapted(ConfigurationImpl.scala:74)
	at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:788)
	at scala.collection.immutable.List.foreach(List.scala:388)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:787)
	at de.tu_darmstadt.cs.esa.tapasco.base.ConfigurationImpl.<init>(ConfigurationImpl.scala:74)
	at de.tu_darmstadt.cs.esa.tapasco.base.ConfigurationImpl.copy(ConfigurationImpl.scala:50)
	at de.tu_darmstadt.cs.esa.tapasco.base.ConfigurationImpl.kernelDir(ConfigurationImpl.scala:60)
	at de.tu_darmstadt.cs.esa.tapasco.parser.GlobalOptions$.mkConfig(GlobalOptions.scala:113)
	at de.tu_darmstadt.cs.esa.tapasco.parser.GlobalOptions$.$anonfun$globalOptions$1(GlobalOptions.scala:104)
	at fastparse.parsers.Transformers$Mapper.parseRec(Transformers.scala:20)
	at fastparse.parsers.Combinators$Sequence$Flat.parseRec(Combinators.scala:316)
	at fastparse.parsers.Transformers$Mapper.parseRec(Transformers.scala:19)
	at fastparse.core.Parser.parseInput(Parsing.scala:376)
	at fastparse.core.Parser.parse(Parsing.scala:358)
	at de.tu_darmstadt.cs.esa.tapasco.parser.CommandLineParser$.apply(CommandLineParser.scala:35)
	at de.tu_darmstadt.cs.esa.tapasco.Tapasco$.main(Tapasco.scala:76)
	at de.tu_darmstadt.cs.esa.tapasco.Tapasco.main(Tapasco.scala)
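Assuming the behaviour described in this report (relative values are appended to TAPASCO_HOME), one possible workaround is to expand the directories to absolute paths in the shell before calling tapasco. The helper abs_dir below is a made-up name for this sketch and uses only cd and pwd.

```bash
# Resolve an existing directory to an absolute path, relative to the
# current shell directory. Returns non-zero if the directory is missing.
# (abs_dir is a hypothetical helper, not part of TaPaSCo.)
abs_dir() {
  (cd "$1" 2>/dev/null && pwd)
}

abs_dir .   # prints the current directory as an absolute path

# Usage with the command from this report:
#   tapasco -v --coreDir "$(abs_dir ./Cores)" --kernelDir "$(abs_dir ./Ips)" hls FooIp -p vcu118
```

This does not answer whether the current resolution against TAPASCO_HOME is intended. It only sidesteps it.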

Boot: Replace Xilinx Root FS

The rootfs is currently repurposed from the official PyNQ image (publicly available). This has been convenient, but in the long term it would be preferable to build a custom rootfs from scratch with less baggage. Replace it with an Ubuntu rootfs, or buildroot.

Properly report incorrect kernel.json files

tapasco -v hls mytest -p vcu118 does not produce a helpful error message on the following json:

{
    "Description" : "Matrix multiplication kernel",
    "Name" : "mytesu",
     ...
    "Files" : ["mytest.c"],
    "Arguments" : [
       ...
    ]
}
[15:49:00 <main: Tapasco$> ERROR] no valid Kernels selected! (available: warraw, mytesu, myarray, arrayinit, arrayupdate, arraysum, counter, sudoku, rot13)
  • Maybe add a check for correct semantics?
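Such a check could compare the requested kernel name with the Name field of kernel.json. The sketch below shows the idea outside of TaPaSCo. It is line-based sed matching rather than a real JSON parser, and the helper names kernel_name and check_kernel are assumptions for this example.

```bash
# Extract the first "Name" value from a kernel.json file.
# Line-based matching only; this is not a full JSON parser.
kernel_name() {
  sed -n 's/.*"Name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Succeed only if the requested kernel matches the declared Name.
check_kernel() {
  local requested="$1" json="$2" actual
  actual="$(kernel_name "$json")"
  if [ "$actual" != "$requested" ]; then
    echo "kernel.json declares Name \"$actual\", but \"$requested\" was requested" >&2
    return 1
  fi
}

# Demo with a throwaway file that mirrors the typo from this report.
demo="$(mktemp)"
printf '{\n  "Name" : "mytesu",\n  "Files" : ["mytest.c"]\n}\n' > "$demo"
check_kernel mytest "$demo" || echo "name mismatch detected"
rm -f "$demo"
```

A message of this kind would point directly at the misspelled Name instead of only listing the available kernels.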

Build fails at target tapasco-benchmark

System Details

  • 4.4.0-64-generic
  • g++-5 (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
  • g++-4.9 (Ubuntu 4.9.3-13ubuntu2) 4.9.3
  • Ubuntu LTS 16.04
  • Vivado installed (2018.2); in Path
  • PATH="/home/demo/tapasco/bin:/home/demo/tapasco/build/install/usr/local/bin/:/home/demo/moreSpace/Vivado/2018.2/bin:/home/demo/tapasco/bin:/home/demo/tapasco/build/install/usr/local/bin/:/home/demo/moreSpace/Vivado/2018.2/bin:/home/demo/tapasco/bin:/home/demo/tapasco/build/install/usr/local/bin/:/home/demo/moreSpace/Vivado/2018.2/bin:/home/demo/.sdkman/candidates/sbt/current/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"

Steps to reproduce

  • cd ~/tapasco
  • mkdir build && cd build
  • cmake ..
  • make

Make output at fail

[ 90%] Generating json11/json11.cpp, json11/json11.hpp
Cloning into '/home/demo/tapasco/build/examples/tapasco-benchmark/json11'...
remote: Enumerating objects: 299, done.
remote: Total 299 (delta 0), reused 0 (delta 0), pack-reused 299
Receiving objects: 100% (299/299), 82.88 KiB | 0 bytes/s, done.
Resolving deltas: 100% (164/164), done.
Checking connectivity... done.
Scanning dependencies of target tapasco-benchmark
[ 92%] Building CXX object examples/tapasco-benchmark/CMakeFiles/tapasco-benchmark.dir/tapasco_benchmark.cpp.o
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:34:0:
/home/demo/tapasco/arch/include/tapasco.hpp:30:2: error: #error "g++ 5.x.x or newer required (C++11 features)"
 #error "g++ 5.x.x or newer required (C++11 features)"
  ^
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:34:0:
/home/demo/tapasco/arch/include/tapasco.hpp: In constructor ‘tapasco::OutOnly<T>::OutOnly(T&)’:
/home/demo/tapasco/arch/include/tapasco.hpp:59:19: error: ‘is_trivially_copyable’ was not declared in this scope
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                   ^
/home/demo/tapasco/arch/include/tapasco.hpp:59:42: error: expected primary-expression before ‘>’ token
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                          ^
/home/demo/tapasco/arch/include/tapasco.hpp:59:43: error: ‘::value’ has not been declared
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                           ^
/home/demo/tapasco/arch/include/tapasco.hpp: In constructor ‘tapasco::InOnly<T>::InOnly(T&)’:
/home/demo/tapasco/arch/include/tapasco.hpp:79:19: error: ‘is_trivially_copyable’ was not declared in this scope
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                   ^
/home/demo/tapasco/arch/include/tapasco.hpp:79:42: error: expected primary-expression before ‘>’ token
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                          ^
/home/demo/tapasco/arch/include/tapasco.hpp:79:43: error: ‘::value’ has not been declared
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                           ^
/home/demo/tapasco/arch/include/tapasco.hpp: In constructor ‘tapasco::RetVal<T>::RetVal(T&)’:
/home/demo/tapasco/arch/include/tapasco.hpp:98:19: error: ‘is_trivially_copyable’ was not declared in this scope
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                   ^
/home/demo/tapasco/arch/include/tapasco.hpp:98:42: error: expected primary-expression before ‘>’ token
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                          ^
/home/demo/tapasco/arch/include/tapasco.hpp:98:43: error: ‘::value’ has not been declared
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                           ^
/home/demo/tapasco/arch/include/tapasco.hpp: In constructor ‘tapasco::Local<T>::Local(T&)’:
/home/demo/tapasco/arch/include/tapasco.hpp:110:19: error: ‘is_trivially_copyable’ was not declared in this scope
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                   ^
/home/demo/tapasco/arch/include/tapasco.hpp:110:42: error: expected primary-expression before ‘>’ token
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                          ^
/home/demo/tapasco/arch/include/tapasco.hpp:110:43: error: ‘::value’ has not been declared
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                           ^
/home/demo/tapasco/arch/include/tapasco.hpp: In constructor ‘tapasco::WrappedPointer<T>::WrappedPointer(T*, size_t)’:
/home/demo/tapasco/arch/include/tapasco.hpp:126:19: error: ‘is_trivially_copyable’ was not declared in this scope
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                   ^
/home/demo/tapasco/arch/include/tapasco.hpp:126:42: error: expected primary-expression before ‘>’ token
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                          ^
/home/demo/tapasco/arch/include/tapasco.hpp:126:43: error: ‘::value’ has not been declared
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                           ^
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:34:0:
/home/demo/tapasco/arch/include/tapasco.hpp: In member function ‘tapasco_res_t tapasco::Tapasco::set_arg(tapasco_job_id_t, size_t, tapasco::WrappedPointer<T>, tapasco_device_alloc_flag_t, tapasco_copy_direction_flag_t)’:
/home/demo/tapasco/arch/include/tapasco.hpp:414:19: error: ‘is_trivially_copyable’ was not declared in this scope
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                   ^
/home/demo/tapasco/arch/include/tapasco.hpp:414:42: error: expected primary-expression before ‘>’ token
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                          ^
/home/demo/tapasco/arch/include/tapasco.hpp:414:43: error: ‘::value’ has not been declared
     static_assert(is_trivially_copyable<T>::value, "Types must be trivially copyable!");
                                           ^
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp: In member function ‘double TransferSpeed::operator()(size_t, long int)’:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:60: error: no match for ‘operator-’ (operand types are ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ and ‘std::initializer_list<std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > > >’)
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                            ^
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:60: note: candidates are:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:30:0:
/usr/include/c++/4.9/chrono:393:7: note: template<class _Rep1, class _Period1, class _Rep2, class _Period2> constexpr typename std::common_type<std::chrono::duration<_Rep1, _Period1>, std::chrono::duration<_Rep2, _Period2> >::type std::chrono::operator-(const std::chrono::duration<_Rep1, _Period1>&, const std::chrono::duration<_Rep2, _Period2>&)
       operator-(const duration<_Rep1, _Period1>& __lhs,
       ^
/usr/include/c++/4.9/chrono:393:7: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:62: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::chrono::duration<_Rep1, _Period1>’
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                              ^
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:30:0:
/usr/include/c++/4.9/chrono:639:7: note: template<class _Clock, class _Dur1, class _Rep2, class _Period2> constexpr std::chrono::time_point<_Clock, typename std::common_type<_Dur1, std::chrono::duration<_Rep2, _Period2> >::type> std::chrono::operator-(const std::chrono::time_point<_Clock, _Duration1>&, const std::chrono::duration<_Rep2, _Period2>&)
       operator-(const time_point<_Clock, _Dur1>& __lhs,
       ^
/usr/include/c++/4.9/chrono:639:7: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:62: note:   ‘std::initializer_list<std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > > >’ is not derived from ‘const std::chrono::duration<_Rep2, _Period2>’
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                              ^
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:30:0:
/usr/include/c++/4.9/chrono:650:7: note: template<class _Clock, class _Dur1, class _Dur2> constexpr typename std::common_type<_Duration1, _Duration2>::type std::chrono::operator-(const std::chrono::time_point<_Clock, _Duration1>&, const std::chrono::time_point<_Clock, _Duration2>&)
       operator-(const time_point<_Clock, _Dur1>& __lhs,
       ^
/usr/include/c++/4.9/chrono:650:7: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:62: note:   ‘std::initializer_list<std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > > >’ is not derived from ‘const std::chrono::time_point<_Clock, _Duration2>’
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                              ^
In file included from /usr/include/c++/4.9/vector:65:0,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:32:
/usr/include/c++/4.9/bits/stl_bvector.h:208:3: note: std::ptrdiff_t std::operator-(const std::_Bit_iterator_base&, const std::_Bit_iterator_base&)
   operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
   ^
/usr/include/c++/4.9/bits/stl_bvector.h:208:3: note:   no known conversion for argument 1 from ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ to ‘const std::_Bit_iterator_base&’
In file included from /usr/include/c++/4.9/bits/stl_algobase.h:67:0,
                 from /usr/include/c++/4.9/bits/char_traits.h:39,
                 from /usr/include/c++/4.9/ios:40,
                 from /usr/include/c++/4.9/ostream:38,
                 from /usr/include/c++/4.9/iostream:39,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:26:
/usr/include/c++/4.9/bits/stl_iterator.h:1128:5: note: template<class _Iterator> decltype ((__x.base() - __y.base())) std::operator-(const std::move_iterator<_Iterator>&, const std::move_iterator<_Iterator>&)
     operator-(const move_iterator<_Iterator>& __x,
     ^
/usr/include/c++/4.9/bits/stl_iterator.h:1128:5: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:62: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::move_iterator<_Iterator>’
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                              ^
In file included from /usr/include/c++/4.9/bits/stl_algobase.h:67:0,
                 from /usr/include/c++/4.9/bits/char_traits.h:39,
                 from /usr/include/c++/4.9/ios:40,
                 from /usr/include/c++/4.9/ostream:38,
                 from /usr/include/c++/4.9/iostream:39,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:26:
/usr/include/c++/4.9/bits/stl_iterator.h:1121:5: note: template<class _IteratorL, class _IteratorR> decltype ((__x.base() - __y.base())) std::operator-(const std::move_iterator<_Iterator>&, const std::move_iterator<_IteratorR>&)
     operator-(const move_iterator<_IteratorL>& __x,
     ^
/usr/include/c++/4.9/bits/stl_iterator.h:1121:5: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:62: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::move_iterator<_Iterator>’
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                              ^
In file included from /usr/include/c++/4.9/bits/stl_algobase.h:67:0,
                 from /usr/include/c++/4.9/bits/char_traits.h:39,
                 from /usr/include/c++/4.9/ios:40,
                 from /usr/include/c++/4.9/ostream:38,
                 from /usr/include/c++/4.9/iostream:39,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:26:
/usr/include/c++/4.9/bits/stl_iterator.h:380:5: note: template<class _IteratorL, class _IteratorR> decltype ((__y.base() - __x.base())) std::operator-(const std::reverse_iterator<_Iterator>&, const std::reverse_iterator<_IteratorR>&)
     operator-(const reverse_iterator<_IteratorL>& __x,
     ^
/usr/include/c++/4.9/bits/stl_iterator.h:380:5: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:62: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::reverse_iterator<_Iterator>’
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                              ^
In file included from /usr/include/c++/4.9/bits/stl_algobase.h:67:0,
                 from /usr/include/c++/4.9/bits/char_traits.h:39,
                 from /usr/include/c++/4.9/ios:40,
                 from /usr/include/c++/4.9/ostream:38,
                 from /usr/include/c++/4.9/iostream:39,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:26:
/usr/include/c++/4.9/bits/stl_iterator.h:328:5: note: template<class _Iterator> typename std::reverse_iterator<_Iterator>::difference_type std::operator-(const std::reverse_iterator<_Iterator>&, const std::reverse_iterator<_Iterator>&)
     operator-(const reverse_iterator<_Iterator>& __x,
     ^
/usr/include/c++/4.9/bits/stl_iterator.h:328:5: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:62: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::reverse_iterator<_Iterator>’
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                              ^
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:69: error: no matching function for call to ‘std::chrono::duration<double>::duration(<brace-enclosed initializer list>)’
     duration<double> d      { high_resolution_clock::now() - tstart };
                                                                     ^
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:46:69: note: candidates are:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:30:0:
/usr/include/c++/4.9/chrono:270:14: note: template<class _Rep2, class _Period2, class> constexpr std::chrono::duration<_Rep, _Period>::duration(const std::chrono::duration<_Rep, _Period>&)
    constexpr duration(const duration<_Rep2, _Period2>& __d)
              ^
/usr/include/c++/4.9/chrono:270:14: note:   template argument deduction/substitution failed:
/usr/include/c++/4.9/chrono:263:23: note: template<class _Rep2, class> constexpr std::chrono::duration<_Rep, _Period>::duration(const _Rep2&)
    constexpr explicit duration(const _Rep2& __rep)
                       ^
/usr/include/c++/4.9/chrono:263:23: note:   template argument deduction/substitution failed:
/usr/include/c++/4.9/chrono:257:2: note: std::chrono::duration<_Rep, _Period>::duration(const std::chrono::duration<_Rep, _Period>&) [with _Rep = double; _Period = std::ratio<1l>]
  duration(const duration&) = default;
  ^
/usr/include/c++/4.9/chrono:257:2: note:   no known conversion for argument 1 from ‘<type error>’ to ‘const std::chrono::duration<double>&’
/usr/include/c++/4.9/chrono:252:12: note: constexpr std::chrono::duration<_Rep, _Period>::duration() [with _Rep = double; _Period = std::ratio<1l>]
  constexpr duration() = default;
            ^
/usr/include/c++/4.9/chrono:252:12: note:   candidate expects 0 arguments, 1 provided
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:40: error: no match for ‘operator-’ (operand types are ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ and ‘std::initializer_list<std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > > >’)
       d = high_resolution_clock::now() - tstart;
                                        ^
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:40: note: candidates are:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:30:0:
/usr/include/c++/4.9/chrono:393:7: note: template<class _Rep1, class _Period1, class _Rep2, class _Period2> constexpr typename std::common_type<std::chrono::duration<_Rep1, _Period1>, std::chrono::duration<_Rep2, _Period2> >::type std::chrono::operator-(const std::chrono::duration<_Rep1, _Period1>&, const std::chrono::duration<_Rep2, _Period2>&)
       operator-(const duration<_Rep1, _Period1>& __lhs,
       ^
/usr/include/c++/4.9/chrono:393:7: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:42: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::chrono::duration<_Rep1, _Period1>’
       d = high_resolution_clock::now() - tstart;
                                          ^
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:30:0:
/usr/include/c++/4.9/chrono:639:7: note: template<class _Clock, class _Dur1, class _Rep2, class _Period2> constexpr std::chrono::time_point<_Clock, typename std::common_type<_Dur1, std::chrono::duration<_Rep2, _Period2> >::type> std::chrono::operator-(const std::chrono::time_point<_Clock, _Duration1>&, const std::chrono::duration<_Rep2, _Period2>&)
       operator-(const time_point<_Clock, _Dur1>& __lhs,
       ^
/usr/include/c++/4.9/chrono:639:7: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:42: note:   ‘std::initializer_list<std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > > >’ is not derived from ‘const std::chrono::duration<_Rep2, _Period2>’
       d = high_resolution_clock::now() - tstart;
                                          ^
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:30:0:
/usr/include/c++/4.9/chrono:650:7: note: template<class _Clock, class _Dur1, class _Dur2> constexpr typename std::common_type<_Duration1, _Duration2>::type std::chrono::operator-(const std::chrono::time_point<_Clock, _Duration1>&, const std::chrono::time_point<_Clock, _Duration2>&)
       operator-(const time_point<_Clock, _Dur1>& __lhs,
       ^
/usr/include/c++/4.9/chrono:650:7: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:42: note:   ‘std::initializer_list<std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > > >’ is not derived from ‘const std::chrono::time_point<_Clock, _Duration2>’
       d = high_resolution_clock::now() - tstart;
                                          ^
In file included from /usr/include/c++/4.9/vector:65:0,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:32:
/usr/include/c++/4.9/bits/stl_bvector.h:208:3: note: std::ptrdiff_t std::operator-(const std::_Bit_iterator_base&, const std::_Bit_iterator_base&)
   operator-(const _Bit_iterator_base& __x, const _Bit_iterator_base& __y)
   ^
/usr/include/c++/4.9/bits/stl_bvector.h:208:3: note:   no known conversion for argument 1 from ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ to ‘const std::_Bit_iterator_base&’
In file included from /usr/include/c++/4.9/bits/stl_algobase.h:67:0,
                 from /usr/include/c++/4.9/bits/char_traits.h:39,
                 from /usr/include/c++/4.9/ios:40,
                 from /usr/include/c++/4.9/ostream:38,
                 from /usr/include/c++/4.9/iostream:39,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:26:
/usr/include/c++/4.9/bits/stl_iterator.h:1128:5: note: template<class _Iterator> decltype ((__x.base() - __y.base())) std::operator-(const std::move_iterator<_Iterator>&, const std::move_iterator<_Iterator>&)
     operator-(const move_iterator<_Iterator>& __x,
     ^
/usr/include/c++/4.9/bits/stl_iterator.h:1128:5: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:42: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::move_iterator<_Iterator>’
       d = high_resolution_clock::now() - tstart;
                                          ^
In file included from /usr/include/c++/4.9/bits/stl_algobase.h:67:0,
                 from /usr/include/c++/4.9/bits/char_traits.h:39,
                 from /usr/include/c++/4.9/ios:40,
                 from /usr/include/c++/4.9/ostream:38,
                 from /usr/include/c++/4.9/iostream:39,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:26:
/usr/include/c++/4.9/bits/stl_iterator.h:1121:5: note: template<class _IteratorL, class _IteratorR> decltype ((__x.base() - __y.base())) std::operator-(const std::move_iterator<_Iterator>&, const std::move_iterator<_IteratorR>&)
     operator-(const move_iterator<_IteratorL>& __x,
     ^
/usr/include/c++/4.9/bits/stl_iterator.h:1121:5: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:42: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::move_iterator<_Iterator>’
       d = high_resolution_clock::now() - tstart;
                                          ^
In file included from /usr/include/c++/4.9/bits/stl_algobase.h:67:0,
                 from /usr/include/c++/4.9/bits/char_traits.h:39,
                 from /usr/include/c++/4.9/ios:40,
                 from /usr/include/c++/4.9/ostream:38,
                 from /usr/include/c++/4.9/iostream:39,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:26:
/usr/include/c++/4.9/bits/stl_iterator.h:380:5: note: template<class _IteratorL, class _IteratorR> decltype ((__y.base() - __x.base())) std::operator-(const std::reverse_iterator<_Iterator>&, const std::reverse_iterator<_IteratorR>&)
     operator-(const reverse_iterator<_IteratorL>& __x,
     ^
/usr/include/c++/4.9/bits/stl_iterator.h:380:5: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:42: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::reverse_iterator<_Iterator>’
       d = high_resolution_clock::now() - tstart;
                                          ^
In file included from /usr/include/c++/4.9/bits/stl_algobase.h:67:0,
                 from /usr/include/c++/4.9/bits/char_traits.h:39,
                 from /usr/include/c++/4.9/ios:40,
                 from /usr/include/c++/4.9/ostream:38,
                 from /usr/include/c++/4.9/iostream:39,
                 from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:26:
/usr/include/c++/4.9/bits/stl_iterator.h:328:5: note: template<class _Iterator> typename std::reverse_iterator<_Iterator>::difference_type std::operator-(const std::reverse_iterator<_Iterator>&, const std::reverse_iterator<_Iterator>&)
     operator-(const reverse_iterator<_Iterator>& __x,
     ^
/usr/include/c++/4.9/bits/stl_iterator.h:328:5: note:   template argument deduction/substitution failed:
In file included from /home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:42:0:
/home/demo/tapasco/examples/tapasco-benchmark/TransferSpeed.hpp:62:42: note:   ‘std::chrono::_V2::system_clock::time_point {aka std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long int, std::ratio<1l, 1000000000l> > >}’ is not derived from ‘const std::reverse_iterator<_Iterator>’
       d = high_resolution_clock::now() - tstart;
                                          ^
/home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp: In function ‘int main(int, const char**)’:
/home/demo/tapasco/examples/tapasco-benchmark/tapasco_benchmark.cpp:182:45: error: ‘put_time’ was not declared in this scope
     str << put_time(&tm, "%Y-%m-%d %H:%M:%S");
                                             ^
examples/tapasco-benchmark/CMakeFiles/tapasco-benchmark.dir/build.make:70: recipe for target 'examples/tapasco-benchmark/CMakeFiles/tapasco-benchmark.dir/tapasco_benchmark.cpp.o' failed
make[2]: *** [examples/tapasco-benchmark/CMakeFiles/tapasco-benchmark.dir/tapasco_benchmark.cpp.o] Error 1
CMakeFiles/Makefile2:573: recipe for target 'examples/tapasco-benchmark/CMakeFiles/tapasco-benchmark.dir/all' failed
make[1]: *** [examples/tapasco-benchmark/CMakeFiles/tapasco-benchmark.dir/all] Error 2
Makefile:149: recipe for target 'all' failed

Core Import: Add Synthesis and PnR parameters

It would be useful to be able to control the parameters of synthesis and implementation directly from TaPaSCo. Maybe we should define modes, e.g.,

  • fastest - lowest effort, minimal runtime
  • fast - slightly slower, but still short runtime
  • normal - default options
  • optimal - slower, get as close to real values as possible
  • aggressive_performance - maximal optimization for performance
  • aggressive_area - maximal optimization for area
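
As a rough illustration of how the proposed modes could be translated into tool options, the following sketch maps each mode name to a pair of Vivado `-directive` values for `synth_design` and `place_design`. The names `EffortMode`, `DIRECTIVES` and `tcl_commands` are hypothetical, and the pairing of modes with directives is an assumption made for this example, not something TaPaSCo defines.

```python
from enum import Enum
from typing import List


class EffortMode(Enum):
    """The six modes proposed in the list above."""
    FASTEST = "fastest"
    FAST = "fast"
    NORMAL = "normal"
    OPTIMAL = "optimal"
    AGGRESSIVE_PERFORMANCE = "aggressive_performance"
    AGGRESSIVE_AREA = "aggressive_area"


# Assumed mapping: (synth_design directive, place_design directive).
# The pairing is illustrative only and would need tuning per platform.
DIRECTIVES = {
    EffortMode.FASTEST: ("RuntimeOptimized", "Quick"),
    EffortMode.FAST: ("RuntimeOptimized", "RuntimeOptimized"),
    EffortMode.NORMAL: ("Default", "Default"),
    EffortMode.OPTIMAL: ("Default", "Explore"),
    EffortMode.AGGRESSIVE_PERFORMANCE: ("PerformanceOptimized", "ExtraTimingOpt"),
    EffortMode.AGGRESSIVE_AREA: ("AreaOptimized_high", "Default"),
}


def tcl_commands(mode_name: str) -> List[str]:
    """Return the synthesis and placement commands for a mode name."""
    mode = EffortMode(mode_name)  # raises ValueError for unknown modes
    synth, place = DIRECTIVES[mode]
    return [
        f"synth_design -directive {synth}",
        f"place_design -directive {place}",
    ]
```

Keeping the mapping in a single table would make it straightforward to adjust individual modes later without touching the code that emits the commands.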

libncurses5-dev not in the list of required packages

Steps to reproduce

  • cd ~/tapasco
  • mkdir build && cd build
  • cmake ..

CMake output on failure

-- The C compiler identification is GNU 4.9.3
-- The CXX compiler identification is GNU 4.9.3
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at /usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
  Could NOT find Curses (missing: CURSES_LIBRARY CURSES_INCLUDE_PATH)
Call Stack (most recent call first):
  /usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:388 (_FPHSA_FAILURE_MESSAGE)
  /usr/share/cmake-3.5/Modules/FindCurses.cmake:206 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
  examples/tapasco-debug/CMakeLists.txt:12 (find_package)


-- Configuring incomplete, errors occurred!

DSE: Abort runs after PlacerErrors

When a run results in a PlacerError it is extremely unlikely that any run with the same (or a larger) Composition will succeed. Runs are already pruned after the batch finishes, but it could be useful to be even more aggressive and abort runs in the current batch, if they would be pruned. This could speed up batches and increase convergence speed.
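
The check described above could look roughly like the following sketch. It assumes a composition can be modelled as a mapping from kernel name to instance count, and that "the same or larger" means "at least as many instances of every kernel"; `Composition`, `covers` and `runs_to_abort` are hypothetical names, not part of the DSE code.

```python
from typing import Dict, List

# Assumed model: kernel name -> number of PE instances.
Composition = Dict[str, int]


def covers(candidate: Composition, failed: Composition) -> bool:
    """True if candidate has at least as many instances of every kernel in failed."""
    return all(candidate.get(kernel, 0) >= count for kernel, count in failed.items())


def runs_to_abort(running: List[Composition], failed: Composition) -> List[int]:
    """Indices of runs in the current batch that would be pruned anyway."""
    return [i for i, comp in enumerate(running) if covers(comp, failed)]
```

For example, if the composition `{"a": 2, "b": 1}` fails in the placer, a batch containing `{"a": 2, "b": 1}`, `{"a": 1, "b": 1}` and `{"a": 3, "b": 2}` would have its first and third runs aborted, while the smaller second run continues.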

[VC709] Separate addresses of different memory regions

Currently, TPC has different devices at the same address depending on the viewpoint. For example, the TPC configuration registers start at 0x0, which is visible from the host. The on-board DDR memory is also located at 0x0, but is only visible to the DMA engine and the PEs. It might be advisable to split these memory regions. A new address map could look like

Address             Device
0x0001000000000000  MIG
0x0002000000000000  Configuration
0x0003000000000000  PEs

etc. Accordingly Configuration and PEs would be separated into different BARs.
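
To make the proposed map concrete, here is a minimal decoding sketch. It assumes that the region is selected by the bits above bit 47 (since 0x0001000000000000 equals 1 << 48) and that each region spans the full 48-bit offset range; `REGION_SHIFT`, `REGIONS` and `decode` are hypothetical names, and only the three regions listed above are included.

```python
REGION_SHIFT = 48  # 0x0001000000000000 == 1 << 48

# Region selector (upper 16 address bits) -> device, taken from the table above.
REGIONS = {
    0x1: "MIG",
    0x2: "Configuration",
    0x3: "PEs",
}


def decode(address: int):
    """Split a 64-bit address into (device name, offset within the region)."""
    if not 0 <= address < (1 << 64):
        raise ValueError("address does not fit into 64 bits")
    device = REGIONS.get(address >> REGION_SHIFT)
    if device is None:
        raise ValueError(f"address {address:#018x} is not mapped")
    return device, address & ((1 << REGION_SHIFT) - 1)
```

With this layout, `decode(0x0002000000000010)` yields `("Configuration", 0x10)`, and address 0x0 is no longer shared between devices because it is not mapped at all.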

Infrastructure: Tapasco Status Core

Upgrade TPC Status Core to incorporate performance counters. Extend libtpc to gather statistics after the run, possibly writing to a file using an environment variable.

  • Version Register: Vivado
  • Version Register: TaPaSCo
  • PerfCounter: # of IRQs/slot
  • PerfCounter: busy cycles/slot
  • PerfCounter: IRQ cycles (waiting for ACK)/slot

Add PE to interrupt mapping in Status Core

Interrupts are currently mapped iteratively to the corresponding interrupt lines. To increase flexibility, the status core can store the mapping used.

Advantages are flexible mappings that enable the use of more than one interrupt per PE.

Interrupts go missing sometimes

The PCIe MSI-X interrupts coming from the DMA engine are received properly by the interrupt controller. The interrupt controller properly issues an AXI write request to the correct address in host memory. The PCIe AXI bridge does ACK the transfer and BRESP is OKAY. However, sometimes the interrupts do not reach the host for some reason. This can be confirmed by checking /proc/interrupts.

This might be related to the interrupt controller taking too long. However, the DMA interrupt simply increases a value and schedules the userspace. This should not take too long.

Another possibility is that the PCIe bridge loses data when it is under heavy pressure.

For now as a quick fix I will try to disable a certain interrupt whenever the interrupt has just fired and see if that fixes the problem at the cost of latency. If that doesn't help maybe there is some possibility to remove protocol converters in between the interrupt handler and the PCIe bridge to avoid problems with those.

Overall, there is no clear indication of what might be going wrong as long as we don't have the hardware to debug directly on the PCIe bus.

Feature: BRAM

Implement a Feature to generate a chunk of BRAM and map it into the address space for small allocations. Parameters: size + offset

Implement tapasco_load_bitstream* functions

Since its inception, the TaPaSCo/TPC API had two functions to load a new bitstream at runtime. This is meant to support complex use cases where an application switches between multiple bitstreams optimized for the specific stage of computation. This is arguably a useful thing and reasonably simple to implement on Zynq (given appropriate permissions on /dev/xdevcfg).

Is there a way to implement similar support on PCIe devices with reasonable effort? I suppose it would involve an ICAP as a platform component; however, I'm not sure if this works with non-partial bitstreams.

Include Scala-based tools in OS-packages

Identify a reliable way to include the Scala-based tools (tapasco CLI-tool and itapasco) in the .deb/.rpm-packages.

All necessary dependencies should also be packaged.

Investigate Logic Utilization reports

It seems that the utilization report does not make sense for BRAM in the user logic. Sometimes utilization for user logic is higher than for the complete system logic.

`get_number_of_processors` does not take into account hyper-threading

Currently, the procedure get_number_of_processors reads /proc/cpuinfo and returns the number of logical cores. On a 6-core CPU with hyper-threading, this results in 12 threads being launched for synthesis when out-of-context mode is enabled. This is probably slower than launching only 6 threads due to HT overhead. This number of threads can also exhaust even large amounts of RAM (32 GiB in my case), so it may be necessary to lower the number of threads even below the number of available (physical) cores.

My suggestion is:

  • Use nproc (output can be used directly) instead of (probably error-prone) parsing of /proc/cpuinfo
  • Maybe introduce an environment variable like TAPASCO_MAX_THREADS to set a maximum number of threads or TAPASCO_NUM_THREADS to use a fixed number of threads
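
A minimal sketch of how the two suggested environment variables could interact is shown below. The variable names come from the suggestion above; the function name `synthesis_threads`, the precedence (a fixed count wins over a maximum) and the lower bound of one thread are assumptions. Note that `os.cpu_count()` reports logical processors, so on a hyper-threaded machine it is the cap that actually reduces the thread count.

```python
import os
from typing import Mapping, Optional


def synthesis_threads(env: Optional[Mapping[str, str]] = None,
                      available: Optional[int] = None) -> int:
    """Pick the number of synthesis threads from the environment."""
    env = os.environ if env is None else env
    if available is None:
        available = os.cpu_count() or 1  # logical processors

    # Assumed precedence: a fixed count overrides everything else.
    fixed = env.get("TAPASCO_NUM_THREADS")
    if fixed is not None:
        return max(1, int(fixed))

    # Otherwise cap the detected count at the configured maximum.
    limit = env.get("TAPASCO_MAX_THREADS")
    if limit is not None:
        return max(1, min(available, int(limit)))

    return available
```

On the 6-core machine described above, setting `TAPASCO_MAX_THREADS=6` would reduce the 12 detected logical processors to 6 threads.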

Don't choke on patched Vivado versions

Currently, when using a patched Vivado version, Tapasco fails in at least two places:

Could not find /scratch/mio/tapasco_f1/common/common_2018.2_AR71715.tcl, Vivado 2018.2_AR71715 is not supported yet!

While this can easily be resolved by placing a symlink to the common_2018.2.tcl file, there is also a problem with the tapasco_status.json which causes Tapasco (or rather the JSON Parser) to throw an Exception and exit:

   "Versions"              : [{
        "Software" : "Vivado",
        "Year"     : 2018_AR71715,
        "Release"  : 2_AR71715

Results in:

com.fasterxml.jackson.core.JsonParseException: Unexpected character ('_' (code 95))
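
One possible way to avoid the invalid JSON is to split the patch suffix off before writing the numeric fields. The following Python sketch only illustrates the idea; it is not the code that generates tapasco_status.json. The helper `version_entry` and the extra `Patch` field are hypothetical, and the suffix pattern is an assumption based on the single example above.

```python
import json
import re

# Matches e.g. "2018.2" or "2018.2_AR71715"; the suffix format is an
# assumption based on the single example in the report above.
VERSION_RE = re.compile(r"^(\d{4})\.(\d+)(?:_(\w+))?$")


def version_entry(version):
    """Split a Vivado version string into JSON-safe fields."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized Vivado version: {version!r}")
    year, release, patch = match.groups()
    entry = {"Software": "Vivado", "Year": int(year), "Release": int(release)}
    if patch is not None:
        # Hypothetical extra field; emitted as a quoted JSON string.
        entry["Patch"] = patch
    return entry


print(json.dumps(version_entry("2018.2_AR71715")))
```

With the suffix separated, year and release stay plain integers, and the base version ("2018.2") could also be used to locate the matching common_2018.2.tcl file.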

Allow PE Masters to have any valid AXI Data Width

The data width of PE masters is currently limited to either 32 or 64 bits. Considering that most platforms outside of Zynq have much wider memory controllers, it would be beneficial to support all valid AXI data widths up to 1024 bits. This might also be relevant for Zynq platforms if the designer of a PE wants to keep their logic simple and rely on data width converters to interface with the memories correctly.
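
As a rough illustration of what "valid" could mean here, the following Python sketch checks a width against the powers of two from 32 to 1024 bits. The lower bound of 32 is an assumption (the AXI specification also lists 8 and 16 bits), and both helper names are hypothetical.

```python
def is_valid_axi_width(bits, minimum=32, maximum=1024):
    """Return True if `bits` is a power of two within [minimum, maximum]."""
    is_power_of_two = bits > 0 and (bits & (bits - 1)) == 0
    return is_power_of_two and minimum <= bits <= maximum


def bytes_per_beat(bits):
    """Number of bytes moved per data beat for a given bus width."""
    if not is_valid_axi_width(bits):
        raise ValueError(f"unsupported AXI data width: {bits}")
    return bits // 8
```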

Evaluate 64 bit for platform_addr_t

All platforms except for legacy Zynq, such as the PCIe-based systems or MPSoC, use addresses larger than 32 bits. While we currently get by with smaller addresses, this might change in the future, and we should consider moving to 64-bit addresses.

I currently don't see any problem with just changing the address width. The Zynq platform should continue to work with the required casts, and all other platforms currently cast to 64-bit addresses anyway.
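
To illustrate the risk of staying with a 32-bit type, the following Python sketch mimics what a cast to a 32-bit address does to a value above the 4 GiB boundary. It is only a model of the truncation, not TaPaSCo code; the helper names are hypothetical.

```python
ADDR32_MASK = 0xFFFF_FFFF


def to_addr32(addr):
    """Mimic a cast to a 32-bit address type: the upper bits are dropped."""
    return addr & ADDR32_MASK


def fits_in_32_bits(addr):
    """Return True if `addr` survives a round trip through 32 bits."""
    return 0 <= addr <= ADDR32_MASK


addr = 0x1_0000_1000  # just above the 4 GiB boundary
print(hex(to_addr32(addr)), fits_in_32_bits(addr))
```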

Fix VLNV for Vivado 2018.3

When composing a design for vc709 I encounter the following error:
ERROR: [BD 5-216] VLNV xilinx.com:ip:mig_7series:4.1 is not supported for the current part. The latest supported version for this part is:4.2
ERROR: [Common 17-39] 'create_bd_cell' failed due to earlier errors.

while executing

"create_bd_cell -type ip -vlnv $vlnv $name"
(procedure "tapasco::ip::create_mig_core" line 7)
invoked from within
"tapasco::ip::create_mig_core $name"
(procedure "create_mig_core" line 10)
invoked from within
"create_mig_core "mig""
(procedure "create_subsystem_memory" line 25)
invoked from within
"create_subsystem_memory"
("eval" body line 1)
invoked from within
"eval $cmd"
("foreach" body line 9)
invoked from within
"foreach ss $sss {
set name [string trim $ss "/"]
set cmd "create_subsystem_$name"
puts "Creating subsystem $name ..."
if {[ll..."
(procedure "platform::create" line 12)
invoked from within
"platform::create"

Just a reminder to fix the VLNV for Vivado 2018.3.
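
The required version can be read directly from the error text. As a small illustration, this Python sketch extracts the suggested VLNV from a BD 5-216 message of the form quoted above. The helper `suggested_vlnv` is hypothetical, and the pattern assumes the message wording stays exactly as shown.

```python
import re

# Pattern for the BD 5-216 message quoted above.
BD_5_216 = re.compile(
    r"VLNV (?P<vendor>[^:\s]+):(?P<library>[^:\s]+):(?P<name>[^:\s]+):"
    r"(?P<old>[\d.]+) is not supported for the current part\. "
    r"The latest supported version for this part is:\s*(?P<new>[\d.]+)"
)


def suggested_vlnv(message):
    """Return the VLNV with the latest supported version, or None."""
    match = BD_5_216.search(message)
    if match is None:
        return None
    return ":".join(
        [match["vendor"], match["library"], match["name"], match["new"]]
    )
```

For the message above, this yields the same vendor, library, and name with version 4.2 in place of 4.1.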

Slurm HLS: Store evaluation temporaries in accessible directory

Evaluation log and files are stored in /tmp, which is fine on a local system. In SLURM mode this means that the log file cannot be tracked in iTPC in many cases, since /tmp is node-local. Check if it is possible to move to a location that is at least accessible across the workgroup.
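
One way this could look is to take the base directory from a setting and fall back to the system temporary directory only when none is given. The Python sketch below is purely illustrative: the helper `make_eval_dir` and the environment variable `TAPASCO_EVAL_TMP` are hypothetical and do not exist in TaPaSCo.

```python
import os
import tempfile


def make_eval_dir(env=None):
    """Create a fresh evaluation directory under a configurable base."""
    env = os.environ if env is None else env
    # TAPASCO_EVAL_TMP is a hypothetical variable that would point at
    # storage shared across the workgroup; default is the node-local tmp.
    base = env.get("TAPASCO_EVAL_TMP") or tempfile.gettempdir()
    os.makedirs(base, exist_ok=True)
    return tempfile.mkdtemp(prefix="tapasco-eval-", dir=base)
```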
