
Framework to evaluate performance of ROS 2

License: BSD 3-Clause "New" or "Revised" License


ros2-performance's Introduction

iRobot ROS 2 Performance Evaluation Framework

This repository contains executables and tools that make it easy to simulate arbitrary ROS 2 systems and measure their performance. The system topology can be provided at runtime through JSON files or command line options.

The framework tracks the following metrics:

  • Latency
  • Reliability
  • CPU usage
  • Memory usage

The core of the framework is written entirely in C++ and has no external dependencies besides the ROS 2 core libraries. This makes it easy to compile and use on embedded platforms. The iRobot cross-compilation framework can be found at: https://github.com/irobot-ros/ros2-cross-compilation.

Note that this framework is mainly meant for evaluating single-process applications. Although it can also measure the performance of multi-process applications, not all metrics are available in that case.

The nodes under test currently do not perform any computation while being tested. This means that most of the measured resource usage is due to the overhead of ROS 2 communication.

Build

The only runtime requirement is ROS 2 Rolling. The build machine additionally requires Python 3, CMake and colcon.

mkdir -p ~/performance_ws/src
cd ~/performance_ws/src
git clone https://github.com/irobot-ros/ros2-performance
cd ros2-performance
git submodule update --init --recursive
cd ../..
colcon build

Run

The irobot_benchmark package contains the main application and examples of graph topologies for evaluation.

source ~/performance_ws/install/setup.bash
cd ~/performance_ws/install/irobot_benchmark/lib/irobot_benchmark
./irobot_benchmark topology/sierra_nevada.json

The results will be printed to screen and also saved in the directory ./sierra_nevada_log.

Extending the performance framework and testing your own system

The irobot_benchmark/topology directory contains some examples of JSON files that can be used to define a system.

If you want to create your own JSON topology, follow the instructions on how to create a new topology. If you want to use your custom ROS 2 message interfaces in the topology, you should look at the performance_test_plugin_cmake.
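For orientation, here is a minimal sketch of what a topology file might look like: one publishing node and one subscribing node on a single topic. The field names below are illustrative assumptions, not the authoritative schema; refer to the bundled examples in irobot_benchmark/topology and the linked instructions for the exact format.

```json
{
  "nodes": [
    {
      "node_name": "talker",
      "publishers": [
        { "topic_name": "chatter", "msg_type": "10b", "period_ms": 100 }
      ]
    },
    {
      "node_name": "listener",
      "subscribers": [
        { "topic_name": "chatter", "msg_type": "10b" }
      ]
    }
  ]
}
```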

Structure of the framework

  • performance_test: this package provides the performance_test::PerformanceNode class, which offers an API for easily adding publishers, subscriptions, clients and services, and for monitoring the performance of the communication. The performance_test::System class allows starting multiple nodes at the same time, while ensuring that they discover each other, and monitoring the performance of the whole system. This package also contains scripts for visualizing the performance of applications.
  • performance_metrics: provides tools to measure and log various performance-related metrics in ROS 2 systems.
  • performance_test_msgs: this package contains basic interface definitions that are directly used by the performance_test package to measure performance.
  • performance_test_factory: this package provides the performance_test_factory::TemplateFactory class, which can be used to create performance_test::PerformanceNode objects with specific publishers and subscriptions according to arguments provided at runtime, either through JSON files or command line options. The interfaces (msg and srv) that can be used in these nodes have to be defined in so-called performance_test_factory_plugins.
  • performance_test_plugin_cmake: this package provides the CMake function used to generate a factory plugin from interfaces definitions.
  • irobot_interfaces_plugin: this package is a performance_test_factory_plugin that provides all the interfaces used in the iRobot system topologies.
  • irobot_benchmark: this package provides our main benchmark application. The executable can load one or multiple JSON topologies, creating a ROS 2 system that runs in a separate process for each of them. It also contains sample JSON topologies.
  • composition_benchmark: this package contains applications and tools that can be used to profile ROS 2 composition concepts.

External tools and resources

Apex AI ROS2 Evaluation tool

ApexAI provides an alternative, valid performance evaluation framework, which allows testing different types of messages. Our implementation is inspired by their work.

Other evaluation tools

DDS Vendors advertised performance

Performances discussions

Papers

ros2-performance's People

Contributors

alsora, betab0t, bpwilcox, dgoel, divya-aggarwal-apex, eliasdc, jfinken, mauropasse, mjcarroll, raghaprasad


ros2-performance's Issues

Question about supporting CPU affinity

I am new to this benchmark application. My question is: does it support CPU affinity? I mean an option so that we can run, for example, 50 nodes on one core and the other 50 nodes on a second core.

Thanks in advance and thanks for such an awesome application.

Errors in performance_test/scripts

Hi, I found some problems when using the scripts for Publisher/Subscriber latency. Would you consider fixing them if the bugs are confirmed? Here is what I found.

  1. PERFORMANCE_TEST_EXAMPLES_PKG should change from performance_test to performance_test_factory in file performances/performance_test/env.sh, line 22.

The app simple_pub_sub_main lives in the performance_test_factory package, so it should be:

PERFORMANCE_TEST_EXAMPLES_PKG="performance_test_factory"

  2. MSG_TYPES should change from 10b to stamped10b in file performances/performance_test/scripts/pub_sub_ros2.sh, line 18.

The message type in the source file is stamped10b; perhaps it was updated at some point?

MSG_TYPES=${MSG_TYPES:="stamped10b"}

I found this on the foxy branch.

Thanks. Hoping for a reply.

has no member named ‘add_on_shutdown_callback’

We run into the following problem when building; does anyone have an idea how we can solve it?

ros@roshost1:~/performance_ws$ colcon build
Starting >>> performance_test_msgs
Finished <<< performance_test_msgs [8.10s]                     
Starting >>> performance_metrics
Finished <<< performance_metrics [7.30s]                      
Starting >>> performance_test
--- stderr: performance_test                              
/home/ros/performance_ws/src/ros2-performance/performance_test/src/executors.cpp: In function ‘void performance_test::sleep_task(std::chrono::milliseconds)’:
/home/ros/performance_ws/src/ros2-performance/performance_test/src/executors.cpp:61:39: error: ‘using element_type = class rclcpp::contexts::DefaultContext’ {aka ‘class rclcpp::contexts::DefaultContext’} has no member named ‘add_on_shutdown_callback’; did you mean ‘get_on_shutdown_callbacks’?
   61 |   auto callback_handle = ros_context->add_on_shutdown_callback(
      |                                       ^~~~~~~~~~~~~~~~~~~~~~~~
      |                                       get_on_shutdown_callbacks
/home/ros/performance_ws/src/ros2-performance/performance_test/src/executors.cpp:74:16: error: ‘using element_type = class rclcpp::contexts::DefaultContext’ {aka ‘class rclcpp::contexts::DefaultContext’} has no member named ‘remove_on_shutdown_callback’; did you mean ‘get_on_shutdown_callbacks’?
   74 |   ros_context->remove_on_shutdown_callback(callback_handle);
      |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                get_on_shutdown_callbacks
make[2]: *** [CMakeFiles/performance_test.dir/build.make:63: CMakeFiles/performance_test.dir/src/executors.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:123: CMakeFiles/performance_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:141: all] Error 2
---
Failed   <<< performance_test [5.24s, exited with code 2]

Summary: 2 packages finished [20.8s]
  1 package failed: performance_test
  1 package had stderr output: performance_test
  4 packages not processed

The command provided to run.sh does not exists

When I try to run `bash scripts/pub_sub_separate_process.sh`, it shows "The command provided to run.sh does not exists!!" for
/home/calvin/performance01_ws/performance_test_factory/lib/performance_test_factory/subscriber_nodes_main.

could not find ament_cmake

When I run colcon build, it fails with:

CMake Error at CMakeLists.txt:13 (find_package):
  By not providing "Findament_cmake.cmake" in CMAKE_MODULE_PATH this project
  has asked CMake to find a package configuration file provided by
  "ament_cmake", but CMake did not find one.

Isn't Benchmark changing thresholds with CL options implemented?

According to README.md, I tried to change the thresholds, but it did not seem to have any effect.

How do I change these thresholds?

$ ./benchmark -t 10 topology/mont_blanc.json
:
$ cat mont_branc_log/events.txt
Time[ms]    Caller                   Code  Description         
91          SYSTEM                   0     [discovery] PDP completed
302         SYSTEM                   0     [discovery] EDP completed
306         amazon->lyon             1     msg 0 late. 3410us > 2000us
306         ganges->hamburg          1     msg 0 late. 3154us > 2000us
307         danube->hamburg          1     msg 0 late. 2744us > 2000us
308         tigris->hamburg          1     msg 0 late. 2340us > 2000us
308         danube->hamburg          1     msg 1 late. 2519us > 2000us
308         tigris->hamburg          1     msg 1 late. 2380us > 2000us
312         parana->osaka            1     msg 0 late. 6824us > 2000us
314         danube->ponce            1     msg 0 late. 9371us > 2000us
314         loire->ponce             1     msg 0 late. 5548us > 5000us
314         danube->ponce            1     msg 1 late. 8757us > 2000us
316         mekong->rotterdam        1     msg 0 late. 5621us > 5000us
316         congo->monaco            1     msg 0 late. 5599us > 5000us
317         columbia->tripoli        1     msg 0 late. 7980us > 5000us
317         columbia->taipei         1     msg 0 late. 8251us > 5000us
318         congo->geneva            1     msg 0 late. 7091us > 5000us
318         parana->geneva           2     msg 0 too late. 13107us > 10000us
318         parana->geneva           1     msg 1 late. 5871us > 2000us
320         salween->mandalay        1     msg 0 late. 10367us > 5000us
320         danube->mandalay         2     msg 0 too late. 15513us > 10000us
320         godavari->mandalay       1     msg 0 late. 10007us > 5000us
:
$ ./benchmark -t 10 --late-absolute 5 --too-late-absolute 50 topology/mont_blanc.json
:
$ cat mont_branc_log/events.txt
Time[ms]    Caller                   Code  Description         
61          SYSTEM                   0     [discovery] PDP completed
242         SYSTEM                   0     [discovery] EDP completed
247         amazon->lyon             1     msg 0 late. 3823us > 2000us
249         parana->osaka            1     msg 0 late. 4976us > 2000us
250         nile->hamburg            1     msg 0 late. 5689us > 2000us
250         parana->osaka            1     msg 1 late. 3039us > 2000us
250         tigris->hamburg          1     msg 0 late. 5790us > 2000us
251         ganges->hamburg          1     msg 0 late. 7584us > 2000us
251         danube->hamburg          1     msg 0 late. 6267us > 2000us
251         nile->hamburg            1     msg 1 late. 4164us > 2000us
251         tigris->hamburg          1     msg 1 late. 2998us > 2000us
251         ganges->hamburg          1     msg 1 late. 6099us > 2000us
252         danube->hamburg          1     msg 1 late. 2388us > 2000us
252         godavari->tripoli        1     msg 0 late. 5227us > 5000us
253         danube->ponce            1     msg 0 late. 8267us > 2000us
254         godavari->ponce          1     msg 0 late. 7056us > 5000us
254         yamuna->ponce            1     msg 0 late. 5389us > 5000us
254         loire->ponce             1     msg 0 late. 7196us > 5000us
254         danube->ponce            1     msg 1 late. 4899us > 2000us
256         congo->monaco            1     msg 0 late. 5968us > 5000us
257         mekong->rotterdam        1     msg 0 late. 5772us > 5000us
259         ganges->hamburg          1     msg 2 late. 2872us > 2000us
:

KeyError: 'time[ms]' in `cpu_ram_plot.py`

I use this command to generate csv files:

$ source env.sh
$ export MAX_PUBLISHERS=1
$ export MAX_SUBSCRIBERS=1
$ export MSG_TYPES="10b 100b 250b 1kb 10kb 100kb 250kb 1mb"
$ export PUBLISH_FREQUENCY="10 100 500 1000"
$ export DURATION=30
$ export NUM_EXPERIMENTS=10
$ export MON_CPU_RAM=true
$ bash scripts/pub_sub_ros2.sh
$ python scripts/plot_scripts/cpu_ram_plot.py <path_to_experiments> --x msg_type --y cpu --separator send_frequency --skip 5
Traceback (most recent call last):
  File "scripts/plot_scripts/cpu_ram_plot.py", line 209, in <module>
    main(sys.argv[1:])
  File "scripts/plot_scripts/cpu_ram_plot.py", line 182, in main
    parsed_csv = parse_csv(file_path, skip)
  File "scripts/plot_scripts/cpu_ram_plot.py", line 56, in parse_csv
    time = int(row_dict['time[ms]'])
KeyError: 'time[ms]'

How to run multi-process applications

Hi all,

As I was running ros2-benchmark on Linux with a ROS 2 Dashing environment, I used this command:
./benchmark topology/sierra_nevada.json -ipc off
But in the JSON file the 'msg_pass_by' is 'shared pointer'. I want to evaluate the benchmarks with all nodes running in separate processes, and with the communication medium being sockets/UDP, which is what I want to observe. Is there some way to do that?

Make `--tracking` off by default

Given CPU performance implications, turn event tracking off by default.

  • I believe that would simply be setting this to false but correct me if I'm wrong
  • If correct, I'd be happy to submit a quick PR. Please let me know the correct destination branch.

[Feature request] wait_pdp_discovery should use the topology to wait for participant discovery.

Hi all!

We have been using this tool to test the performance of Fast-RTPS for some time. Recently, Fast-RTPS added a new feature, a participant white-list, meaning that users can use the XML configuration to set, for each participant, a list of participants that are "allowed" to be discovered. We believe this feature applies very well to the benchmark case, where the topology is known beforehand.

However, we have encountered a problem when using this participant white-listing, which is that we hit the assertion in performance_test::System::wait_pdp_discovery()

// check if maximum discovery time exceeded
auto t = std::chrono::high_resolution_clock::now();
auto duration =
        std::chrono::duration_cast<std::chrono::milliseconds>(t - pdp_start_time - max_pdp_time).count();
if (duration > 0){
        assert(0 && "[discovery] PDP took more than maximum discovery time");
}

The problem is that with the white-list, the nodes do not discover all the other nodes, but only the ones they are interested in. We have tackled this issue by skipping the wait for participant discovery in PR #7, but we believe a better approach would be to refactor performance_test::System::wait_pdp_discovery() so that it uses the topology to check which participant has discovered which, returning once all the necessary connections are established.

Passing two topologies to irobot_benchmark results in a 'std::out_of_range' exception

  • x86_64
  • commit b450132 (HEAD, master)
  • invoking irobot_benchmark and passing in two topologies results in a 'std::out_of_range' exception
$ ros2-performance/irobot_benchmark/irobot_benchmark \
        --topology \
            ros2-performance/irobot_benchmark/topology/white_mountain.json \
            ros2-performance/irobot_benchmark/topology/sierra_nevada.json
...
Start test!
[ResourceUsageLogger]: Logging to sierra_nevada_log/resources.txt
[ResourceUsageLogger]: Logging to white_mountain_log/resources.txt
...
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr: __pos (which is 18446744073709551615) > this->size() (which is 7)
received[#]    mean[us]  late[#]   late[%]   too_late[#]    too_late[%]    lost[#]   lost[%]
12249          162       68        0.5551    8              0.06531        25499220830229100

and

Subscriptions stats:
node           topic          size[b]   received[#]    late[#]   too_late[#]    lost[#]   mean[us]  sd[us]    min[us]   max[us]   freq[hz]  throughput[Kb/s]
lyon           amazon         36        1002           0         0              215177861479566        50        19        554       100       7.04013        
hamburg        danube         8         1002           1         0              2151778614795126       114       23        2967      100       1.56605        
  • Perhaps that invocation of irobot_benchmark is incorrect or isn't supported?
  • If truly an issue then I'll work on it and submit a PR.
