benchi is a benchmarking tool. It runs the tools you describe on your benchmarks and inspects the results to generate neat, paper-ready comparative graphs and other metrics.
Running:
- dispatch benchmarks in parallel
- for each benchmark dispatched, dispatch tools in parallel
- log stdout of the tools, and stderr if non-empty
- validate tool runs individually and against each other with user-defined validators
Analyzing data:
- cumulative plot of some or all of the tools
- comparative plot of two of the tools
- comparative plot distinguishing between validation status
- produce super cool breakdown of the data in markdown
- customizable markdown table generation
Running:
- retrieve and log user-defined data from runs (memory usage, something from the output, etc.)
Analyzing data:
- customizable LaTeX tabular generation
- interactive data exploration
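The two-level dispatch from the feature list (benchmarks in parallel, then tools in parallel for each benchmark) can be sketched with scoped threads. This is only an illustration of the idea, not benchi's actual code: `RunResult`, `dispatch` and the `run_one` callback are hypothetical names, and a real runner would cap the number of threads and spawn actual processes.

```rust
use std::thread;

/// Hypothetical record of one tool run on one benchmark.
#[derive(Debug, Clone, PartialEq)]
pub struct RunResult {
    pub bench: String,
    pub tool: String,
    pub output: String,
}

/// Runs every tool on every benchmark: one thread per benchmark and, inside
/// each of those, one thread per tool. `run_one` is a stand-in (assumption)
/// for actually launching a tool and capturing its stdout.
pub fn dispatch<F>(benchs: &[&str], tools: &[&str], run_one: F) -> Vec<RunResult>
where
    F: Fn(&str, &str) -> String + Sync,
{
    let run_one = &run_one;
    thread::scope(|outer| {
        // Outer level: one thread per benchmark.
        let bench_handles: Vec<_> = benchs
            .iter()
            .map(|&bench| {
                outer.spawn(move || {
                    // Inner level: one thread per tool for this benchmark.
                    thread::scope(|inner| {
                        let tool_handles: Vec<_> = tools
                            .iter()
                            .map(|&tool| {
                                inner.spawn(move || RunResult {
                                    bench: bench.to_string(),
                                    tool: tool.to_string(),
                                    output: run_one(bench, tool),
                                })
                            })
                            .collect();
                        tool_handles
                            .into_iter()
                            .map(|h| h.join().expect("tool thread panicked"))
                            .collect::<Vec<_>>()
                    })
                })
            })
            .collect();
        // Join in spawn order so results come back grouped by benchmark.
        bench_handles
            .into_iter()
            .flat_map(|h| h.join().expect("benchmark thread panicked"))
            .collect()
    })
}
```

Calling `dispatch(&["b1", "b2"], &["t1", "t2"], |b, t| format!("{b}:{t}"))` returns four results, grouped by benchmark and then by tool, because the handles are joined in the order they were spawned.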
Clone this repository, make sure you have a recent version of Rust installed, and build with
cargo build --release
Your binary will be in target/release/benchi.
The most up-to-date documentation is benchi's command-line help.
benchi help
To run benchmarks, you will need a simple configuration file documenting your tools and how to run them. Read about it with
benchi help conf
and generate a nice example with
benchi conf <file_name>
# Typically:
benchi conf example/test.conf
After creating the example configuration file, benchi will let you know about a few things you can do with it: running the "tools" it defines and generating plots of the results.
Running benchmarks is done through the run subcommand. Read more about it with
benchi help run
Here are the two kinds of graphs benchi can generate. The gnuplot file can be tweaked directly; there is currently no easy way to do it from benchi.
# To read more about plots:
benchi help plot
# Learn about cumulative plot options:
benchi plot help cumul
In the keys, the number between parentheses is the number of benchmarks passed.
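For readers unfamiliar with cumulative plots, here is a minimal sketch of one common way to build the data behind them, assuming each run is either a runtime in seconds or a failure. The function names and the exact point layout are assumptions made for illustration; benchi's own output may differ.

```rust
/// Builds cumulative-plot data from per-benchmark runtimes.
/// `None` means the tool did not pass that benchmark (assumed encoding).
/// Returns the number of benchmarks passed and, for each `i`, the point
/// `(i, total time needed to solve the i fastest benchmarks)`.
pub fn cumulative(times: &[Option<f64>]) -> (usize, Vec<(usize, f64)>) {
    // Keep only the successful runs and sort them from fastest to slowest.
    let mut passed: Vec<f64> = times.iter().flatten().copied().collect();
    passed.sort_by(|a, b| a.total_cmp(b));

    // Accumulate the running total of time spent.
    let mut total = 0.0;
    let points: Vec<(usize, f64)> = passed
        .iter()
        .enumerate()
        .map(|(i, t)| {
            total += t;
            (i + 1, total)
        })
        .collect();
    (passed.len(), points)
}

/// Key entry for a tool: its name followed by the number of benchmarks
/// passed, between parentheses.
pub fn key_label(tool: &str, passed: usize) -> String {
    format!("{tool} ({passed})")
}
```

With the runs `[Some(3.0), None, Some(1.0)]`, two benchmarks pass, the points are `(1, 1.0)` and `(2, 4.0)`, and the key for a tool named `tool` reads `tool (2)`.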
# Comparative plot:
benchi plot help compare
Comparative plot between two runs. The timeout used during the runs was 100 seconds. Notice that the timeout line is drawn slightly above the real timeout value, to distinguish timeouts from almost-timeouts.
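The idea of drawing timeouts slightly above the real timeout value can be sketched as follows. The `Outcome` type, the function names and the 10% margin are all assumptions chosen for illustration, not values taken from benchi.

```rust
/// Hypothetical outcome of one run.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Outcome {
    /// The tool finished in this many seconds.
    Done(f64),
    /// The tool hit the timeout.
    Timeout,
}

/// Position of the timeout line on the plot. The 10% margin is an
/// arbitrary choice (assumption) that keeps the line clear of runs that
/// finished just under the timeout.
pub fn timeout_line(timeout: f64) -> f64 {
    timeout * 1.1
}

/// Coordinate used to plot a run: its runtime if it finished, the timeout
/// line otherwise.
pub fn plot_coord(outcome: Outcome, timeout: f64) -> f64 {
    match outcome {
        Outcome::Done(secs) => secs,
        Outcome::Timeout => timeout_line(timeout),
    }
}
```

With a 100 second timeout, a run that finished in 99.9 seconds is plotted at 99.9, while a run that timed out lands on the line above 100, so the two cannot be confused on the graph.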