cactusdynamics / cactus-rt

A C++ framework for programming real-time applications

License: Mozilla Public License 2.0

CMake 7.07% C++ 90.15% C 0.23% Dockerfile 0.46% Shell 1.05% Makefile 1.04%

cactus-rt's People

Contributors

botbench, shuhaowu, stephanie-eng

cactus-rt's Issues

std::thread parity

Should consider making cactus_rt::Thread equivalent to std::thread.

  • joinable()
  • std::terminate on destruction?
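
To make the two bullets concrete, here is a minimal sketch of what std::thread-like semantics could look like on top of a raw pthread. `SketchThread` is a hypothetical class, not `cactus_rt::Thread`; it assumes a separate `Start()` step (unlike `std::thread`, which starts in its constructor), and it ignores scheduling attributes entirely.

```cpp
#include <pthread.h>

#include <cassert>
#include <exception>
#include <functional>
#include <utility>

// Hypothetical wrapper (NOT cactus_rt::Thread): shows std::thread-like
// joinable() and destructor behaviour on top of a raw pthread.
class SketchThread {
 public:
  explicit SketchThread(std::function<void()> fn) : fn_(std::move(fn)) {}

  SketchThread(const SketchThread&) = delete;
  SketchThread& operator=(const SketchThread&) = delete;

  // Mirrors std::thread: destroying a still-joinable thread terminates.
  ~SketchThread() {
    if (joinable()) std::terminate();
  }

  // Returns true if the thread was created; false if already started or on error.
  bool Start() {
    if (started_) return false;
    started_ = pthread_create(&handle_, nullptr, &SketchThread::Trampoline, this) == 0;
    return started_;
  }

  // True between a successful Start() and the matching Join().
  bool joinable() const noexcept { return started_ && !joined_; }

  void Join() {
    assert(joinable());
    pthread_join(handle_, nullptr);
    joined_ = true;
  }

 private:
  static void* Trampoline(void* self) {
    static_cast<SketchThread*>(self)->fn_();
    return nullptr;
  }

  std::function<void()> fn_;
  pthread_t handle_{};
  bool started_ = false;
  bool joined_ = false;
};
```

Whether terminating on destruction is the right default for a real-time framework is exactly the open question in the second bullet; joining in the destructor (as `std::jthread` does) is the other obvious option.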

Built-in tracing

The framework should be able to automatically trace regions of the application's execution, similar to Golang's trace region API: https://pkg.go.dev/runtime/trace#hdr-User_annotation. The application can then leverage the same system to create even more detailed traces.

One possible way to do this is via LTTng-UST; another is via Perfetto. Emitting an event with either should be lock-free and constant time. However, Perfetto uses string interning internally, and it's unclear how that affects the worst-case runtime of the trace API calls.
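
As a rough illustration of a region-style API, the sketch below uses an RAII object that records a begin/end timestamp pair into a pre-allocated, single-writer buffer. All names (`RegionEvent`, `RegionBuffer`, `ScopedRegion`, `NowNs`) are hypothetical; this is neither the LTTng-UST nor the Perfetto API, and it leaves out how events would leave the thread.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// One completed region. `name` must point to static storage (e.g. a literal).
struct RegionEvent {
  const char* name;
  std::int64_t begin_ns;
  std::int64_t end_ns;
};

// Fixed-capacity, single-writer buffer: no allocation or locking on Push.
template <std::size_t N>
class RegionBuffer {
 public:
  bool Push(const RegionEvent& e) noexcept {
    if (size_ == N) {
      ++dropped_;  // Buffer full: count the drop instead of blocking.
      return false;
    }
    events_[size_++] = e;
    return true;
  }
  std::size_t size() const noexcept { return size_; }
  std::size_t dropped() const noexcept { return dropped_; }
  const RegionEvent& operator[](std::size_t i) const { return events_[i]; }

 private:
  std::array<RegionEvent, N> events_{};
  std::size_t size_ = 0;
  std::size_t dropped_ = 0;
};

inline std::int64_t NowNs() noexcept {
  using namespace std::chrono;
  return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
}

// Records the region when it goes out of scope.
template <std::size_t N>
class ScopedRegion {
 public:
  ScopedRegion(RegionBuffer<N>& buf, const char* name) noexcept
      : buf_(buf), name_(name), begin_ns_(NowNs()) {}
  ~ScopedRegion() { buf_.Push(RegionEvent{name_, begin_ns_, NowNs()}); }

  ScopedRegion(const ScopedRegion&) = delete;
  ScopedRegion& operator=(const ScopedRegion&) = delete;

 private:
  RegionBuffer<N>& buf_;
  const char* name_;
  std::int64_t begin_ns_;
};
```

A loop iteration would then be traced with `{ ScopedRegion<1024> r(buf, "Loop"); /* work */ }`. Storing only a `const char*` sidesteps string interning on the hot path, at the cost of requiring names with static lifetime.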

Refactor `TraceAggregator::Run`

  • Reduce duplication of logic during the loop and during the flush before exit.
  • Check for dropped message counts and investigate Perfetto data packets for appropriate signals to emit.
  • Consider using a buffer (maybe after benchmarking is set up)
  • Better handle write errors (both inside Run and outside Run in places like RegisterThreadTracer and RegisterSink)

Misc fixes for tracing

  • Investigate TraceSpan move operation correctness
  • Verify protobuf version and shutdown protobuf library at stop
  • Fix the documentation so that images and links show up properly in Doxygen

Builtin data logging with MPMC queue

The message passing example shows how to pass data around via boost::lockfree::spsc_queue, which is fine as a demo. It would be more flexible to provide a built-in way to log a data struct directly via an MPMC queue.

One idea is to use iceoryx, which can even allow us to pass the data to another process. However, there are currently some latency issues with the latest release of iceoryx (which has since been fixed in production). It is also relatively complex to set up due to the requirement to start RouDi, its service discovery layer.

Another possible library is https://github.com/cameron314/concurrentqueue.

Ideally, the thread has an API that allows the user to directly log data to disk without blocking. The serialization format can be pluggable, but we can have some "sane" defaults (like MCAP, CSV).
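
To give a feel for the queue side of this, here is a minimal bounded MPMC queue sketch. It is neither iceoryx nor concurrentqueue: it follows the well-known per-cell sequence-number design published by Dmitry Vyukov, swapped in purely for illustration. `BoundedMpmcQueue`, `TryPush`, and `TryPop` are hypothetical names. The try-operations never take a lock or allocate, but the compare-exchange loop can retry under contention, so they are not wait-free.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Bounded MPMC queue using one sequence number per cell.
// Capacity must be a power of two (>= 2) and is allocated once, up front.
template <typename T>
class BoundedMpmcQueue {
  static_assert(std::is_trivially_copyable<T>::value, "sketch assumes plain data structs");

 public:
  explicit BoundedMpmcQueue(std::size_t capacity_pow2)
      : mask_(capacity_pow2 - 1), cells_(new Cell[capacity_pow2]) {
    assert(capacity_pow2 >= 2 && (capacity_pow2 & mask_) == 0);
    for (std::size_t i = 0; i < capacity_pow2; ++i) {
      cells_[i].seq.store(i, std::memory_order_relaxed);
    }
  }

  // Returns false if the queue is full.
  bool TryPush(const T& value) noexcept {
    std::size_t pos = enqueue_pos_.load(std::memory_order_relaxed);
    for (;;) {
      Cell& cell = cells_[pos & mask_];
      const std::size_t seq = cell.seq.load(std::memory_order_acquire);
      const auto diff = static_cast<std::ptrdiff_t>(seq) - static_cast<std::ptrdiff_t>(pos);
      if (diff == 0) {
        // Cell is free for this position: try to claim it.
        if (enqueue_pos_.compare_exchange_weak(pos, pos + 1, std::memory_order_relaxed)) {
          cell.value = value;
          cell.seq.store(pos + 1, std::memory_order_release);  // Publish to consumers.
          return true;
        }
      } else if (diff < 0) {
        return false;  // Cell still holds an unconsumed element: full.
      } else {
        pos = enqueue_pos_.load(std::memory_order_relaxed);  // Lost a race: reload.
      }
    }
  }

  // Returns false if the queue is empty.
  bool TryPop(T& out) noexcept {
    std::size_t pos = dequeue_pos_.load(std::memory_order_relaxed);
    for (;;) {
      Cell& cell = cells_[pos & mask_];
      const std::size_t seq = cell.seq.load(std::memory_order_acquire);
      const auto diff = static_cast<std::ptrdiff_t>(seq) - static_cast<std::ptrdiff_t>(pos + 1);
      if (diff == 0) {
        if (dequeue_pos_.compare_exchange_weak(pos, pos + 1, std::memory_order_relaxed)) {
          out = cell.value;
          // Mark the cell free for the producer one lap ahead.
          cell.seq.store(pos + mask_ + 1, std::memory_order_release);
          return true;
        }
      } else if (diff < 0) {
        return false;  // Nothing published at this position yet: empty.
      } else {
        pos = dequeue_pos_.load(std::memory_order_relaxed);
      }
    }
  }

 private:
  struct Cell {
    std::atomic<std::size_t> seq;
    T value;
  };

  const std::size_t mask_;
  std::unique_ptr<Cell[]> cells_;
  std::atomic<std::size_t> enqueue_pos_{0};
  std::atomic<std::size_t> dequeue_pos_{0};
};
```

In this arrangement the real-time threads would call `TryPush` with a plain data struct and count failures as dropped samples, while a non-real-time thread drains the queue with `TryPop` and hands each element to whatever serializer is plugged in.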

Bad variant access in simple example

On graceful termination of examples, the following error occurs:

terminate called after throwing an instance of 'std::bad_variant_access'
  what():  std::get: wrong index for variant
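
For context, this exception is what `std::get` throws when the variant currently holds a different alternative than the one requested. The sketch below is not a diagnosis of the actual bug; `Message`, `UncheckedInt`, and `CheckedInt` are hypothetical names that only contrast the throwing access with a checked one via `std::get_if`.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical message type with two alternatives.
using Message = std::variant<int, std::string>;

// Throws std::bad_variant_access if `m` does not currently hold an int.
inline int UncheckedInt(const Message& m) { return std::get<int>(m); }

// Returns `fallback` instead of throwing when `m` holds another alternative.
inline int CheckedInt(const Message& m, int fallback) noexcept {
  if (const int* p = std::get_if<int>(&m)) return *p;
  return fallback;
}
```

If the termination path hands a consumer a variant holding a different alternative than it expects, a checked access like `CheckedInt` (or `std::visit`) would turn the crash into an explicit branch.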

Benchmark tracing performance

  • Emit latency (min, avg, max, p99) and throughput (max emit per second).
  • TraceAggregator throughput: need to understand the impact of serializing the data and also file write performance
    • Can use a "test sink" that doesn't write to file to test for CPU overhead.
  • Data rate: need to know the data size per event which can be used to extrapolate IO rates when writing data to sinks. Compare results with when string interning is implemented.
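
The emit-latency numbers in the first bullet could be computed along these lines. This is a sketch under stated assumptions: samples are per-call durations in nanoseconds collected elsewhere, p99 uses the nearest-rank definition, and `LatencyStats`, `Summarize`, and `EventsPerSecond` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

struct LatencyStats {
  std::int64_t min_ns;
  double avg_ns;
  std::int64_t max_ns;
  std::int64_t p99_ns;
};

// Summarizes per-call latencies. `samples` must be non-empty.
// Takes the vector by value because it sorts its own copy.
inline LatencyStats Summarize(std::vector<std::int64_t> samples) {
  assert(!samples.empty());
  std::sort(samples.begin(), samples.end());

  const std::size_t n = samples.size();
  const std::int64_t sum = std::accumulate(samples.begin(), samples.end(), std::int64_t{0});

  // Nearest-rank percentile: 1-based rank = ceil(0.99 * n), in integer math.
  const std::size_t rank = (99 * n + 99) / 100;

  return LatencyStats{samples.front(), static_cast<double>(sum) / static_cast<double>(n),
                      samples.back(), samples[rank - 1]};
}

// Throughput for `count` events emitted over `elapsed_ns` nanoseconds.
inline double EventsPerSecond(std::size_t count, std::int64_t elapsed_ns) {
  return static_cast<double>(count) * 1e9 / static_cast<double>(elapsed_ns);
}
```

Sorting and summarizing should happen after the measured run, outside the real-time thread, so that the benchmark harness does not distort the latencies it reports.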

Built-in tracing mega issue

Tracing is an important aspect of developing real-time applications as it allows the developer to identify long-running code blocks. This involves two components: a real-time trace collection system and an offline trace analysis/visualization system. The idea is to integrate trace collection into cactus_rt such that the program is automatically traced during development (either for the entire duration of the run, or started/stopped dynamically via an external signal). The cactus_rt framework should also allow the program to be traced during production runs should the user opt to do so. If the performance impact of trace event emission is low and the number of emissions is kept to a reasonable number, there's no reason why tracing can't be done continuously while the program is running to gain better insights into the program under production conditions.

A trace analysis system that includes Gantt-chart-style visualization should be available for the tracing data. Support for more complex analysis, such as running SQL queries over the trace, would also be useful.

A bonus feature would be to pass log messages out of the RT thread and be able to format and print them in a separate thread/process.

Perfetto

Perfetto is a Google-developed tracing tool with three major components: (1) the tracing SDK, (2) the trace processor, and (3) the trace visualizer. The tracing SDK enables application-specific traces by passing the trace data quickly out of the application process into a tracing service, which can then record the data into a file. It also has the ability to record the data directly in process, via a separate thread. The trace processor allows users to run SQL queries on an existing trace file, which can simplify the trace analysis. The trace visualizer is a web UI that allows for visualization of the trace data in a Gantt-chart-style view, as well as providing a web UI for interactive SQL execution.

This checks all the boxes on paper. My understanding of how it works, based on this document, is as follows:

  1. When trace events are emitted, it grabs a free page in a shared memory buffer and serializes the protobuf-encoded message into it (via a specialized protozero library that has very low overhead).
  2. An async IPC gets sent to the tracing service which instructs the tracing service to copy the shared memory buffer into its own buffer (central buffer) and mark the shared memory buffer as free again for reuse.
  3. From the central buffer, the data is written either periodically to disk, or written at the end of the program, depending on the configuration.

However, a careful reading of the documentation and a quick look through the code base show that the emission of trace events is not real-time safe. Specifically, the documentation states:

At some point one of the set_int_val() calls will hit the slow-path and acquire a new buffer. The overall idea is having a serialization mechanism that is extremely lightweight most of the times and that requires some extra function calls when buffer boundary, so that their [time] cost gets amortized across all trace events.

In the context of the overall Perfetto tracing use case, the slow-path involves grabbing a process-local mutex and finding the next free chunk in the shared memory buffer. Hence writes are lock-free as long as they happen within the thread-local chunk and require a critical section to acquire a new chunk once every 4KB-32KB (depending on the trace configuration).

My understanding is that this occurs during the shared memory buffer write. If a trace event is emitted from the RT thread at the same time as a non-RT thread and the slow-path is triggered (due to the buffer boundary being crossed by the trace packet), a priority inversion problem could occur, which can result in unbounded latency. Further, the documentation suggests that memory allocation occurs in the slow path (not 100% sure on this though), which can also cause problems for real-time.
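
The fast-path/slow-path split described above can be modelled with a toy chunk writer. This is not Perfetto's code: `ChunkPool`, `ChunkWriter`, and the tiny chunk size are assumptions chosen only to show where the mutex sits. Writes that fit in the current chunk touch no shared state; crossing a chunk boundary goes through a lock that any other thread, including a low-priority one, may be holding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>

// Tiny values for illustration; the quoted documentation cites 4KB-32KB chunks.
constexpr std::size_t kChunkSize = 64;
constexpr std::size_t kNumChunks = 8;

struct Chunk {
  std::array<std::uint8_t, kChunkSize> bytes{};
  std::size_t used = 0;
  bool in_use = false;
};

// Shared by all writer threads in the process.
class ChunkPool {
 public:
  // Slow path: takes a process-local mutex to find a free chunk.
  // An RT thread calling this can block behind a preempted non-RT holder.
  Chunk* Acquire() {
    std::lock_guard<std::mutex> lock(mutex_);
    for (Chunk& c : chunks_) {
      if (!c.in_use) {
        c.in_use = true;
        c.used = 0;
        return &c;
      }
    }
    return nullptr;  // Pool exhausted (this toy model never releases chunks).
  }

 private:
  std::mutex mutex_;
  std::array<Chunk, kNumChunks> chunks_{};
};

// One instance per thread; only the owning thread touches `current_`.
class ChunkWriter {
 public:
  explicit ChunkWriter(ChunkPool& pool) : pool_(pool) {}

  bool Write(const void* data, std::size_t len) {
    if (len > kChunkSize) return false;
    if (current_ == nullptr || current_->used + len > kChunkSize) {
      ++slow_path_hits_;
      current_ = pool_.Acquire();  // Chunk boundary crossed: locked slow path.
      if (current_ == nullptr) return false;
    }
    // Fast path: plain copy into the thread-local chunk, no synchronization.
    std::memcpy(current_->bytes.data() + current_->used, data, len);
    current_->used += len;
    return true;
  }

  std::size_t slow_path_hits() const { return slow_path_hits_; }

 private:
  ChunkPool& pool_;
  Chunk* current_ = nullptr;
  std::size_t slow_path_hits_ = 0;
};
```

With 16-byte records and 64-byte chunks, every fifth write lands on the slow path. Amortized cost stays low, but the worst-case latency of a single write is bounded only by how long another thread can hold the mutex.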

Thus, Perfetto is not suitable for real-time production tracing. However, it's possible we can still use Perfetto to trace in development, and use a compile time flag to disable tracing for release builds.
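
A compile-time switch of this kind could look roughly like the following. `SKETCH_ENABLE_TRACING`, `SKETCH_TRACE_EVENT`, and the `sketch` namespace are hypothetical names, and the counter stands in for a real emission call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical build flag, e.g. passed as -DSKETCH_ENABLE_TRACING=1 in
// development builds and left unset (or 0) for release builds.
#ifndef SKETCH_ENABLE_TRACING
#define SKETCH_ENABLE_TRACING 0
#endif

namespace sketch {
// Stand-in for a real tracing backend: just counts emitted events.
inline std::size_t& EmittedCount() {
  static std::size_t count = 0;
  return count;
}
inline void Emit(const char* /*name*/) { ++EmittedCount(); }
}  // namespace sketch

#if SKETCH_ENABLE_TRACING
#define SKETCH_TRACE_EVENT(name) ::sketch::Emit(name)
#else
// Expands to nothing observable, so release builds pay no runtime cost
// and never evaluate the argument.
#define SKETCH_TRACE_EVENT(name) ((void)0)
#endif
```

Call sites then write `SKETCH_TRACE_EVENT("Loop");` unconditionally, and the build configuration decides whether any tracing code is compiled in.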

Even though the Perfetto tracing SDK is unusable in real-time, we might still be able to use the trace processor and visualizer components, if we can emit a Perfetto-compatible data file with a custom tracing solution, perhaps based on LTTng. Since the Perfetto trace processor also takes the Chromium trace JSON format, we can maybe emit that as well.
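
For the JSON route, a custom collector could serialize its spans as "complete" events (`"ph":"X"`, with `ts` and `dur` in microseconds) inside a `traceEvents` array, which is the object form of the Chromium Trace Event Format. The sketch below assumes event names need no JSON escaping; `CompleteEvent` and `ToChromeTraceJson` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One finished span. Timestamps and durations are in microseconds.
struct CompleteEvent {
  std::string name;  // Assumed to contain no characters that need JSON escaping.
  std::int64_t ts_us;
  std::int64_t dur_us;
  int pid;
  int tid;
};

// Serializes spans as Trace Event Format "complete" events (phase "X").
inline std::string ToChromeTraceJson(const std::vector<CompleteEvent>& events) {
  std::ostringstream out;
  out << "{\"traceEvents\":[";
  for (std::size_t i = 0; i < events.size(); ++i) {
    const CompleteEvent& e = events[i];
    if (i != 0) out << ',';
    out << "{\"name\":\"" << e.name << "\",\"ph\":\"X\""
        << ",\"ts\":" << e.ts_us << ",\"dur\":" << e.dur_us
        << ",\"pid\":" << e.pid << ",\"tid\":" << e.tid << '}';
  }
  out << "]}";
  return out.str();
}
```

Serialization like this belongs in the non-real-time consumer, since `std::ostringstream` allocates; the real-time side would only hand over fixed-size event structs.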

Also, the Perfetto tracing SDK can't pass log messages (by default), but can emit counter information which can be plotted in the UI.

LTTng

TBD.

Revisit default heap reservation

Currently we reserve 512MB of heap on startup. Is this necessary? This is not that useful because the default allocator is not O(1), and RT code shouldn't allocate on the heap anyway.
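
For reference, heap reservation on real-time Linux is commonly done with the glibc pattern sketched below: stop malloc from trimming or using mmap, then allocate, touch, and free a block so the pages stay mapped in the process. This is a sketch of that general pattern, not the framework's actual code; `ReserveHeap` and `LockMemory` are hypothetical names, and the code is glibc/Linux-specific.

```cpp
#include <malloc.h>
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdlib>

// Pre-faults `bytes` of heap and keeps it attached to the process.
// Returns the number of pages touched, or 0 if the allocation failed.
inline std::size_t ReserveHeap(std::size_t bytes) {
  // Never give freed heap memory back to the kernel.
  mallopt(M_TRIM_THRESHOLD, -1);
  // Serve large allocations from the heap instead of separate mmap regions.
  mallopt(M_MMAP_MAX, 0);

  auto* buf = static_cast<volatile unsigned char*>(std::malloc(bytes));
  if (buf == nullptr) return 0;

  // Touch one byte per page so every page is actually faulted in.
  const auto page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  std::size_t touched = 0;
  for (std::size_t i = 0; i < bytes; i += page) {
    buf[i] = 0;
    ++touched;
  }

  // With trimming disabled, the freed pages remain part of the heap.
  std::free(const_cast<unsigned char*>(buf));
  return touched;
}

// Locks current and future pages into RAM. May fail without sufficient
// privileges or RLIMIT_MEMLOCK, so callers should check the result.
inline bool LockMemory() { return mlockall(MCL_CURRENT | MCL_FUTURE) == 0; }
```

This only removes page faults from later allocations. It does not make `malloc` itself constant-time, which is the concern raised above, so making the reserved size configurable (down to zero) seems like a reasonable direction.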
