pc2 / aurora-hls

Ready-to-link, packaged Aurora IP on four QSFP28 lanes, providing 100Gb/s throughput

License: Apache License 2.0

Makefile 5.60% Shell 3.91% Jupyter Notebook 8.57% C++ 37.68% Verilog 28.75% Tcl 10.44% CMake 3.76% C 1.28%

aurora-hls's People

Contributors

mellich, michaellass, papeg

Forkers

papeg mellich

aurora-hls's Issues

Only support a single connection per Aurora core in emulator

Currently, the AuroraSwitch instance only works as a basic message router based on the provided string tags. This functionality can be "abused" by using it for many-to-all or all-to-many communication. However, this kind of setup is not supported by Aurora and should thus be prevented by the emulator in some way.
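One way the emulator could enforce this is to track which endpoint tags already have a peer and refuse any further connection to them. The following is only a minimal sketch under that assumption: `ConnectionRegistry`, `connect` and `peer_of` are hypothetical names and do not reflect the actual AuroraSwitch implementation.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry of point-to-point links, keyed by string tag.
// This is NOT the repository's AuroraSwitch, only an illustration.
class ConnectionRegistry {
public:
    // Registers a link between two endpoint tags.
    // Returns false if the tags are equal or either one already has a peer.
    bool connect(const std::string& a, const std::string& b) {
        assert(!a.empty() && !b.empty());
        if (a == b || peers_.count(a) != 0 || peers_.count(b) != 0) {
            return false;
        }
        peers_[a] = b;
        peers_[b] = a;
        return true;
    }

    // Returns the peer of a tag, or an empty string if it is unconnected.
    std::string peer_of(const std::string& tag) const {
        auto it = peers_.find(tag);
        return it == peers_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> peers_;
};
```

With such a check, a second `connect` call naming an already connected tag fails, so many-to-all and all-to-many setups are rejected up front instead of silently working in emulation.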

Add more general Aurora Constructor

Currently, the kernel name for the Aurora cores is hard-coded in the host header files. In most situations, users will create their own bitstream using a custom link configuration, in which case the kernel names for the Aurora cores may be chosen arbitrarily. Our host-side interface for the Aurora core should therefore offer a constructor that supports arbitrary kernel names.

Suggestion:
Offer a constructor taking an xrt::kernel instance.

Use less than 4 lanes with one aurora core

Using, for example, two lanes per Aurora core could enable more network topologies, as one QSFP port could be connected in two different directions. This is just an idea, with no guarantee that it is possible.

Flow Control in Emulation

Currently, the TX direction of the emulator will never stall, even if the RX FIFO of the receiver is full. In that case, data will be buffered within 0MQ until the data can be pushed to the FIFO, so no data will get lost. However, this can lead to drastically different behavior, where executions working in emulation will deadlock in hardware.
To achieve a better quality of emulation results, (optional) flow control similar to the one used in hardware should be implemented in the emulator to prevent buffering of data outside the FIFOs.
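The core of such flow control is that a full RX FIFO rejects data, so the sender has to stall rather than let the transport buffer it. The sketch below illustrates only this backpressure idea. `RxFifo`, `try_push` and `try_pop` are hypothetical names, and the mechanism is a plain bounded queue, not the flow control scheme used by the hardware core.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical bounded RX FIFO; the capacity is counted in words.
class RxFifo {
public:
    explicit RxFifo(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns false when the FIFO is full instead of buffering elsewhere,
    // so the sending side has to stall and retry later.
    bool try_push(std::uint64_t word) {
        if (words_.size() >= capacity_) {
            return false;
        }
        words_.push_back(word);
        return true;
    }

    // Pops the oldest word into `word`; returns false if the FIFO is empty.
    bool try_pop(std::uint64_t& word) {
        if (words_.empty()) {
            return false;
        }
        word = words_.front();
        words_.pop_front();
        return true;
    }

private:
    std::size_t capacity_;
    std::deque<std::uint64_t> words_;
};
```

An emulated TX path built on this would loop on `try_push` until it succeeds, which reproduces a stall and makes deadlocks visible in emulation that would otherwise only appear in hardware.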

Integrate Aurora Emulator in example code

Currently, the Aurora emulator in the emulation folder is not used by our actual Aurora Tests/Examples.
Instead, the XRT software emulation is used. We should also add support for the Aurora emulator to all our codes provided in this repository to show how it can be used.

Test other Boards

The design is only tested on the U280 Alveo card. Verify that it also works on other boards.

Make the CRC counter accessible from the host code

The CRC counter is a module that counts the number of frames and the number of frames with errors. The correctness of the module can be verified with the integrated logic analyzer. However, when the value is passed out through the AXI Control module, it is always zero. This needs to be investigated. It may have something to do with constraints on which register addresses can actually be read.

Test transfer of messages smaller than 64 byte

The smallest message size that can be configured in the example design is 64 bytes, which is the width of the FIFO. The Aurora core itself sends messages in chunks of 32 bytes. A 32-byte message can be achieved by setting the keep bits in a single transfer to half of the bytes. This should flush the data width converter and lead to a single transfer of the Aurora core. With some small changes to the issue kernel, it could be verified whether this is really the case.
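To make the keep-bit setting concrete, the sketch below computes the mask for a 64-byte word. It assumes one keep bit per byte with the valid bytes in the low-order positions. The helper name `keep_mask` is hypothetical and not part of the repository.

```cpp
#include <cassert>
#include <cstdint>

// Returns a mask with the lowest `valid_bytes` bits set, for a
// 64-byte wide stream word with one keep bit per byte (assumed layout).
inline std::uint64_t keep_mask(unsigned valid_bytes) {
    assert(valid_bytes <= 64);
    if (valid_bytes == 64) {
        return ~std::uint64_t{0};  // avoid an undefined shift by 64
    }
    return (std::uint64_t{1} << valid_bytes) - 1;
}
```

Under these assumptions, `keep_mask(32)` yields `0x00000000FFFFFFFF`, which marks half of the bytes in a single transfer as valid.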

Enable software emulation

Right now, designs containing this aurora kernel can only be verified in hardware, which results in a very tedious development process. One possible solution would be to create a drop-in replacement of the aurora core which uses named pipes instead of the QSFP ports.

Add helper functions to support two cores per MPI rank

Right now, the Aurora class handles a single Aurora core. The check_status_core_global() function assumes that there is one rank per Aurora core. In typical use cases, however, there is one rank per FPGA, which in this case means two Aurora cores. It should be possible to handle this in some way, maybe by adapting the Aurora class or by creating another class that handles two Aurora cores.

Implement software-controlled reset

Currently the FPGA needs to be reset/reprogrammed to flush the FIFOs in this design. It would be nice to have a software-triggerable reset that purges the FIFOs. This could also then automatically be performed when instantiating the Aurora object.

Before implementing this, one has to think about a proper sequence though, such that purged RX FIFOs are not immediately filled again by outstanding transfers of the connected party (e.g., if flow control was triggered such that not only RX but also TX FIFOs still contain data).

Remove MPI as dependency of the Aurora host interface

Currently, the Aurora host interface has MPI as a mandatory dependency:

#include <mpi.h>

However, there is no specific reason for this dependency, and our Aurora core could also be used without it. Therefore, it should only be an optional dependency, e.g., guarded by preprocessor directives, so that the interface can also be used in environments without MPI.
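A preprocessor guard for this could look roughly like the sketch below. The macro name `AURORA_USE_MPI` and the helpers `aurora_rank` and `aurora_world_size` are hypothetical and chosen only for illustration. The MPI calls themselves are the standard `MPI_Comm_rank` and `MPI_Comm_size`.

```cpp
#include <cassert>

// AURORA_USE_MPI is a hypothetical macro, set by the build system
// only when MPI is available.
#ifdef AURORA_USE_MPI
#include <mpi.h>
#endif

// Returns the rank of the calling process, or 0 when built without MPI.
// With MPI enabled, MPI_Init must have been called beforehand.
inline int aurora_rank() {
    int rank = 0;
#ifdef AURORA_USE_MPI
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
#endif
    return rank;
}

// Returns the number of processes, or 1 when built without MPI.
inline int aurora_world_size() {
    int size = 1;
#ifdef AURORA_USE_MPI
    MPI_Comm_size(MPI_COMM_WORLD, &size);
#endif
    assert(size >= 1);
    return size;
}
```

Code that only needs the rank and the process count can then call these helpers and behaves like a single-process run when the macro is not defined.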
