martinus / nanobench
Simple, fast, accurate single-header microbenchmarking functionality for C++11/14/17/20
Home Page: https://nanobench.ankerl.com
License: MIT License
It is not entirely clear whether mResult is reachable through the public interface, but it does appear necessary for printing results. E.g.:
const Result& Bench::aggregate_result() const noexcept {
    return mResult;
}
C2598 linkage specification must be at global scope nb C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\include\setjmp.h 24
C2624 'WyRng::umul128::__m128d': local classes cannot be used to declare 'extern' variables nb C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\include\emmintrin.h 77
Is there a mechanism (similar to Google Benchmark) to perform manual timing? I.e. not use the clock built into nanobench, but provide our own iteration time to nanobench.
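The requested mechanism can be sketched with plain std::chrono; `manuallyTimed` and `reportIterationTime` are hypothetical names for illustration, not nanobench API:

```cpp
#include <chrono>

// Sketch of "manual timing": the user code measures each iteration itself
// and hands the duration to the framework instead of the framework's clock.
// Both the helper and the report callback are made up for this example.
template <typename Fn, typename Report>
void manuallyTimed(Fn&& op, Report&& reportIterationTime) {
    auto begin = std::chrono::steady_clock::now();
    op();
    auto end = std::chrono::steady_clock::now();
    reportIterationTime(
        std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin));
}
```
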
For some benchmarks, nanoseconds are too small a time unit. On the other hand, using them is unlikely to actually hurt, so this may not provide enough benefit.
The latest v3.1.0 release does not match any of the examples, since Config was renamed to Bench. This is quite a big, currently undocumented, API break and potentially warrants a new release?
I think it would be useful to be able to output to CSV files for some graphing in Excel.
Works great for me on Linux but not on macOS:
FAILED: _deps/nanobench-build/CMakeFiles/nanobench.dir/src/test/app/nanobench.cpp.o
ccache /usr/local/opt/ccache/libexec/c++ -I_deps/nanobench-src/src/include -isysroot /Applications/Xcode_12.4.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk -mmacosx-version-min=10.14 -MD -MT _deps/nanobench-build/CMakeFiles/nanobench.dir/src/test/app/nanobench.cpp.o -MF _deps/nanobench-build/CMakeFiles/nanobench.dir/src/test/app/nanobench.cpp.o.d -o _deps/nanobench-build/CMakeFiles/nanobench.dir/src/test/app/nanobench.cpp.o -c _deps/nanobench-src/src/test/app/nanobench.cpp
In file included from _deps/nanobench-src/src/test/app/nanobench.cpp:2:
Warning: _deps/nanobench-src/src/include/nanobench.h:117:15: warning: alias declarations are a C++11 extension [-Wc++11-extensions]
using Clock = std::conditional<std::chrono::high_resolution_clock::is_steady, std::chrono::high_resolution_clock,
^
Error: _deps/nanobench-src/src/include/nanobench.h:296:19: error: expected function body after function declarator
char const* csv() noexcept;
^
Error: _deps/nanobench-src/src/include/nanobench.h:308:27: error: expected function body after function declarator
char const* htmlBoxplot() noexcept;
^
Error: _deps/nanobench-src/src/include/nanobench.h:319:20: error: expected function body after function declarator
char const* json() noexcept;
^
Error: _deps/nanobench-src/src/include/nanobench.h:347:7: error: function definition does not declare parameters
T pageFaults{};
^
Error: _deps/nanobench-src/src/include/nanobench.h:348:7: error: function definition does not declare parameters
T cpuCycles{};
^
Error: _deps/nanobench-src/src/include/nanobench.h:349:7: error: function definition does not declare parameters
T contextSwitches{};
^
Error: _deps/nanobench-src/src/include/nanobench.h:350:7: error: function definition does not declare parameters
T instructions{};
^
Error: _deps/nanobench-src/src/include/nanobench.h:351:7: error: function definition does not declare parameters
T branchInstructions{};
^
Error: _deps/nanobench-src/src/include/nanobench.h:352:7: error: function definition does not declare parameters
T branchMisses{};
^
Warning: _deps/nanobench-src/src/include/nanobench.h:360:33: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
std::string mBenchmarkTitle = "benchmark";
^
Warning: _deps/nanobench-src/src/include/nanobench.h:361:32: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
std::string mBenchmarkName = "noname";
^
Warning: _deps/nanobench-src/src/include/nanobench.h:362:23: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
std::string mUnit = "op";
^
Warning: _deps/nanobench-src/src/include/nanobench.h:363:19: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
double mBatch = 1.0;
^
Warning: _deps/nanobench-src/src/include/nanobench.h:364:25: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
double mComplexityN = -1.0;
^
Warning: _deps/nanobench-src/src/include/nanobench.h:365:23: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
size_t mNumEpochs = 11;
^
Warning: _deps/nanobench-src/src/include/nanobench.h:366:37: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
size_t mClockResolutionMultiple = static_cast<size_t>(1000);
^
Warning: _deps/nanobench-src/src/include/nanobench.h:367:44: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
std::chrono::nanoseconds mMaxEpochTime = std::chrono::milliseconds(100);
^
Error: _deps/nanobench-src/src/include/nanobench.h:368:30: error: function definition does not declare parameters
std::chrono::nanoseconds mMinEpochTime{};
^
Error: _deps/nanobench-src/src/include/nanobench.h:369:14: error: function definition does not declare parameters
uint64_t mMinEpochIterations{1};
^
Error: _deps/nanobench-src/src/include/nanobench.h:370:14: error: function definition does not declare parameters
uint64_t mEpochIterations{0}; // If not 0, run *exactly* these number of iterations per epoch.
^
Warning: _deps/nanobench-src/src/include/nanobench.h:371:22: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
uint64_t mWarmup = 0;
^
Warning: _deps/nanobench-src/src/include/nanobench.h:372:24: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
std::ostream* mOut = nullptr;
^
Warning: _deps/nanobench-src/src/include/nanobench.h:373:45: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
std::chrono::duration<double> mTimeUnit = std::chrono::nanoseconds{1};
^
Warning: _deps/nanobench-src/src/include/nanobench.h:374:31: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
std::string mTimeUnitName = "ns";
^
Warning: _deps/nanobench-src/src/include/nanobench.h:375:35: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
bool mShowPerformanceCounters = true;
^
Warning: _deps/nanobench-src/src/include/nanobench.h:376:22: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
bool mIsRelative = false;
^
Warning: _deps/nanobench-src/src/include/nanobench.h:381:29: warning: rvalue references are a C++11 extension [-Wc++11-extensions]
Config& operator=(Config&&);
^
Warning: _deps/nanobench-src/src/include/nanobench.h:383:18: warning: rvalue references are a C++11 extension [-Wc++11-extensions]
Config(Config&&) noexcept;
^
Error: _deps/nanobench-src/src/include/nanobench.h:383:21: error: expected ';' at end of declaration list
Config(Config&&) noexcept;
^
;
Error: _deps/nanobench-src/src/include/nanobench.h:373:71: error: expected '(' for function-style cast or type construction
std::chrono::duration<double> mTimeUnit = std::chrono::nanoseconds{1};
~~~~~~~~~~~~~~~~~~~~~~~~^
Warning: _deps/nanobench-src/src/include/nanobench.h:391:10: warning: scoped enumerations are a C++11 extension [-Wc++11-extensions]
enum class Measure : size_t {
^
Warning: _deps/nanobench-src/src/include/nanobench.h:407:29: warning: rvalue references are a C++11 extension [-Wc++11-extensions]
Result& operator=(Result&&);
^
Warning: _deps/nanobench-src/src/include/nanobench.h:409:18: warning: rvalue references are a C++11 extension [-Wc++11-extensions]
Result(Result&&) noexcept;
^
Error: _deps/nanobench-src/src/include/nanobench.h:409:21: error: expected ';' at end of declaration list
Result(Result&&) noexcept;
^
;
Error: _deps/nanobench-src/src/include/nanobench.h:415:61: error: expected ';' at end of declaration list
ANKERL_NANOBENCH(NODISCARD) Config const& config() const noexcept;
^
;
Error: _deps/nanobench-src/src/include/nanobench.h:420:60: error: expected ';' at end of declaration list
ANKERL_NANOBENCH(NODISCARD) double sum(Measure m) const noexcept;
^
;
Error: _deps/nanobench-src/src/include/nanobench.h:421:80: error: expected ';' at end of declaration list
ANKERL_NANOBENCH(NODISCARD) double sumProduct(Measure m1, Measure m2) const noexcept;
^
;
Error: _deps/nanobench-src/src/include/nanobench.h:422:64: error: expected ';' at end of declaration list
ANKERL_NANOBENCH(NODISCARD) double minimum(Measure m) const noexcept;
^
;
fatal error: too many errors emitted, stopping now [-ferror-limit=]
found in bitcoin with
bench.minEpochIterations(10).batch(BATCH_SIZE * BATCHES).unit("job").epochs(100).run([&] {
Does nanobench support filtering benchmark tasks, like ./nanobench --filter=.vector?
So more systems are tested, e.g. Mac
Hi,
I'm using nanobench in some of my projects, everything's good.
Some questions, though. On one of my Linux systems, running Arch Linux, I get all the cycles, IPC, branch etc. measures displayed.
One of my coworkers uses Linux Mint (Ubuntu 18.04) and he gets none of those.
Is there a specific package to install so that the extra perf counters get picked up?
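Missing counters on one distro but not another is often a kernel setting rather than a missing package (this is a hedged guess, not a confirmed diagnosis for this report): Linux gates unprivileged `perf_event_open` access behind `perf_event_paranoid`. A quick check:

```shell
# Values above 2 typically block unprivileged access to CPU performance
# counters, which would make nanobench's extra columns disappear.
cat /proc/sys/kernel/perf_event_paranoid
```

If the value is too high, `sudo sysctl kernel.perf_event_paranoid=1` (until reboot) is a common workaround.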
Ability to store and later compare results. Maybe multiple results, to create a graph with changes over time
Some statistical analysis would be nice. Maybe output in a format that's understood by some good tool
Sfc64, without random seeding
The documentation says
In the remaining examples, I’m using doctest as a unit test framework, which is like Catch2 - but compiles much faster. It pairs well with nanobench.
The benefits I can see from it are benchmark registration and filtering; is there anything else? It may be useful to explicitly mention/show how it allows filtering.
On the other hand, you get the test results table which isn't too useful because they definitely should all pass. It can be suppressed by using
int main(int argc, char** argv) {
    doctest::Context context;
    context.setOption("out", "/dev/null");
    context.setOption("no-version", true);
    context.applyCommandLine(argc, argv);
    return context.run();
}
which may also be worth mentioning.
ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
7,266,190.00 | 137.62 | 3.3% | 4,721,603.00 | 15,556,024.00 | 0.304 | 1,302,758.00 | 13.0% | 0.18 | hopscotch_map |
35,033,938.00 | 28.54 | 0.9% | 4,961,470.00 | 76,638,408.00 | 0.065 | 1,097,837.00 | 23.8% | 0.44 | unordered_map |
6,696,755.00 | 149.33 | 1.7% | 8,040,577.00 | 14,634,752.00 | 0.549 | 767,069.00 | 17.1% | 0.12 | flat_hash_map |
7,676,762.00 | 130.26 | 2.3% | 7,126,320.00 | 16,794,536.00 | 0.424 | 774,163.00 | 17.6% | 0.09 | F14FastMap |
flat_hash_map: ns/op: 6,696,755.00, total: 0.12
F14FastMap: ns/op: 7,676,762.00, total: 0.09
Instructions, branches, branch misses, cache information, ...
https://github.com/lemire/simdjson/blob/master/benchmark/linux/linux-perf-events.h
Benchmark all random number generators, plus my own. This will also be a good example
PERF_EVENT_IOC_ID is not defined, see bitcoin/bitcoin#21549
Benchmark time
CPU?
Uname?
Clock resolution
Warnings
Also in json
E.g. running with
NANOBENCH_RUN_INFINITELY="std::sin(x)" ./benchmark
will run the benchmark with the name std::sin(x) indefinitely, so it is possible to attach a profiler.
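The proposed behaviour boils down to a small environment-variable check; `runInfinitely` is a hypothetical helper sketching the idea, not existing nanobench code:

```cpp
#include <cstdlib>
#include <cstring>

// Returns true when NANOBENCH_RUN_INFINITELY names this benchmark; the
// caller would then loop on the benchmark body forever instead of
// measuring, so a profiler can be attached.
bool runInfinitely(const char* benchmarkName) {
    const char* wanted = std::getenv("NANOBENCH_RUN_INFINITELY");
    return wanted != nullptr && std::strcmp(wanted, benchmarkName) == 0;
}
```
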
It would be nice to have a facility similar to Google Benchmark's PauseTiming() and ResumeTiming() for benchmarks that need a setup phase at each iteration.
I'd like to be able to perform some action before/after an iteration which doesn't get counted towards the runtime. For example, I want to copy an array before each iteration so it starts from a clean slate but I also do not want to measure the runtime of the copying.
There is a CMakeLists.txt in the nanobench repository, but it seems to be for development, not for users of the library. It would be nice if nanobench could be integrated into a project via CMake's FetchContent, or as a Git submodule (if one can't use relatively new versions of CMake). Specifically, the following CMakeLists.txt should work:
cmake_minimum_required(VERSION 3.14)
project(
CMakeNanobenchExample
VERSION 1.0
LANGUAGES CXX)
include(FetchContent)
FetchContent_Declare(
doctest
GIT_REPOSITORY https://github.com/onqtam/doctest.git
GIT_TAG 2.4.0
GIT_SHALLOW TRUE)
FetchContent_Declare(
nanobench
GIT_REPOSITORY https://github.com/martinus/nanobench.git
GIT_TAG v4.0.2
GIT_SHALLOW TRUE)
FetchContent_MakeAvailable(doctest nanobench)
add_executable(MyExample my_example.cpp)
target_link_libraries(MyExample PRIVATE doctest nanobench)
Basically, what FetchContent_MakeAvailable does here is clone the repositories declared by FetchContent_Declare and then include them via add_subdirectory.
The problems are: the CMakeLists.txt in the nanobench repository performs many things, such as searching for ccache and including subdirectories for testing, which are not relevant for library users. In particular, it leads to building many test programs. To avoid this, perhaps looking at the CMakeLists.txt of doctest or Catch2 is helpful; they detect whether the library is being included as a subdirectory. See also here, which is an example on this site.

What does the err% metric measure? I check my application with unit tests, libFuzzer, and many clang flags. I would expect this value to be zero, but it is reporting about a 5% error rate.
Does err% indicate that nanobench observed an iteration crashing? Could you please expand the documentation on what this metric means?
In some cases this significantly improves the precision with very little overhead. Done so in bitcoin's usage of nanobench
Hi, I want to do some microbenchmarking of parallel data structures where I run a bunch of parallel tasks and then time the work that they do in different threads. The problem is that I want to combine the timings somehow of the work done in each task to avoid including the overhead of the task scheduler. Any suggestions for using nanobench for this usage scenario? Thanks!
Though it is a good thing to show warnings for unstable results (:wavy_dash: ... (Unstable with ...)), sometimes one wants to run benchmarks in an environment where the execution is potentially perturbed by other processes, so results are known to be unstable; for example, in continuous integration.
So, it might be handy if an environment variable (NANOBENCH_NO_UNSTABLE_WARNING or something) could change the behaviour and suppress warnings for unstable results.
Changing the behaviour by a command-line option could be an alternative way, but I think an environment variable is more suitable for CI, and I guess it is easier to implement.
The table is labeled ns/op, but the explanation says:
Which means that one x.compare_exchange_strong(y, 0); call takes 7.81s on my machine
Trying to add nanobench via vcpkg manifest yields bizarre results.
vcpkg manifest dependency looks like so,
...
"dependencies": [
"xtensor-blas",
"nanobench",
{
"name": "highfive",
"default-features": false,
"features": [ "xtensor" ]
},
...
It only seems to be happy if I don't add anything to CMake at all.
If I try to find_package(nanobench) or link against nanobench, it complains that it can't be found or that I have to add to CMAKE_PREFIX_PATH, etc.
Is not having anything set up against it in CMake standard practice for this, or is there some wizardry going on?
Hi,
I am confused about the state of the CPU governor. nanobench reports:
Warning, results might be unstable:
- CPU governor is '' but should be 'performance'
- Turbo is enabled, CPU frequency will fluctuate
But I have the following configuration for my CPU frequency:
$ grep GOVERNOR /etc/init.d/cpufrequtils
# GOVERNOR="ondemand"
GOVERNOR="performance"
if [ -f $info ] && grep -q "\<$GOVERNOR\>" $info ; then
if [ -n "$GOVERNOR" ] ; then
CPUFREQ_OPTIONS="$CPUFREQ_OPTIONS --governor $GOVERNOR"
log_action_begin_msg "$DESC: Setting $GOVERNOR CPUFreq governor"
I have disabled the systemd service for ondemand. I regenerated initramfs and rebooted.
Is nanobench correctly detecting the CPU governor state?
Maybe nanobench is mistaken in its parsing. This is an EC2 virtual machine; I do not actually know whether the CPU governor can be changed there. What do you think about hiding this message on virtual machines?
nanobench 4.3.0, Ubuntu 20.04 Focal Fossa, EC2 t2.micro.
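One way to cross-check the report: the cpufrequtils config only takes effect if the kernel actually exposes a cpufreq interface, and (hedged assumption about nanobench's detection) the governor is read from sysfs, not from init scripts. On EC2 guests the cpufreq interface is typically absent, which would explain the empty '' governor:

```shell
# What the kernel itself reports for CPU 0; if the file does not exist,
# no governor can be set at all, which is common on virtual machines.
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor 2>/dev/null \
  || echo "no cpufreq interface (governor cannot be set here)"
```
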
I'm benchmarking protobuf. There are always some warnings about unstable results on the first run, even after increasing the warmup and the minimum iterations:
#include <doctest/doctest.h>
#include <google/protobuf/io/coded_stream.h>
#include <nanobench.h>

namespace nanobench = ankerl::nanobench;

TEST_CASE("varint encode benchmark") {
    nanobench::Bench b;
    b.title("varint encode")
        .warmup(100000)
        .relative(true);
    b.performanceCounters(true);
    uint8_t buf[10] = {};
    b.minEpochIterations(10000000).run("google uint32_t", [&] {
        nanobench::doNotOptimizeAway(
            google::protobuf::io::CodedOutputStream::WriteVarint32ToArray(static_cast<uint32_t>(2961488830), buf));
    });
    b.minEpochIterations(10000000).run("google uint64_t", [&] {
        nanobench::doNotOptimizeAway(
            google::protobuf::io::CodedOutputStream::WriteVarint64ToArray(static_cast<uint64_t>(-41256202580718336), buf));
    });
    // std::ofstream f{"varint encode benchmark.html"};
    // b.render(nanobench::templates::htmlBoxplot(), f);
} // TEST_CASE("varint encode benchmark")

TEST_CASE("varint decode benchmark") {
    nanobench::Bench b;
    b.title("varint decode")
        .warmup(100000)
        .relative(true);
    b.performanceCounters(true);
    std::initializer_list<uint8_t> buf32 = {0xbe, 0xf7, 0x92, 0x84, 0x0b};
    std::initializer_list<uint8_t> buf64 = {0x9b, 0xa8, 0xf9, 0xc2, 0xbb, 0xd6, 0x80, 0x85, 0xa6, 0x01};
    b.minEpochIterations(10000000).run("google uint32_t", [&] {
        uint32_t v;
        nanobench::doNotOptimizeAway(
            google::protobuf::io::CodedInputStream{buf32.begin(), (int)buf32.size()}.ReadVarint32(&v));
    });
    b.minEpochIterations(10000000).run("google uint64_t", [&] {
        uint64_t v;
        nanobench::doNotOptimizeAway(
            google::protobuf::io::CodedInputStream{buf64.begin(), (int)buf64.size()}.ReadVarint64(&v));
    });
    // std::ofstream f{"varint decode benchmark.html"};
    // b.render(nanobench::templates::htmlBoxplot(), f);
} // TEST_CASE("varint decode benchmark")
encode results for 3 runs:
relative | ns/op | op/s | err% | total | varint encode |
---|---|---|---|---|---|
100.0% | 6.95 | 143,943,329.37 | 5.3% | 0.87 | 〰️ google uint32_t (Unstable with ~11,003,388.7 iters. Increase minEpochIterations to e.g. 110033887) |
54.0% | 12.86 | 77,751,471.35 | 2.0% | 1.56 | google uint64_t |
relative | ns/op | op/s | err% | total | varint encode |
---|---|---|---|---|---|
100.0% | 7.26 | 137,743,666.50 | 4.8% | 0.88 | google uint32_t |
55.0% | 13.21 | 75,702,505.04 | 1.6% | 1.62 | google uint64_t |
relative | ns/op | op/s | err% | total | varint encode |
---|---|---|---|---|---|
100.0% | 7.22 | 138,464,875.81 | 2.9% | 0.86 | google uint32_t |
56.4% | 12.80 | 78,102,820.79 | 1.9% | 1.55 | google uint64_t |
decode results for 3 runs:
relative | ns/op | op/s | err% | total | varint decode |
---|---|---|---|---|---|
100.0% | 26.93 | 37,132,748.98 | 1.9% | 3.26 | google uint32_t |
101.0% | 26.67 | 37,490,487.35 | 3.4% | 3.23 | google uint64_t |
relative | ns/op | op/s | err% | total | varint decode |
---|---|---|---|---|---|
100.0% | 26.64 | 37,543,305.23 | 1.4% | 3.23 | google uint32_t |
102.7% | 25.93 | 38,558,548.08 | 1.2% | 3.17 | google uint64_t |
relative | ns/op | op/s | err% | total | varint decode |
---|---|---|---|---|---|
100.0% | 27.45 | 36,434,999.13 | 0.8% | 3.31 | google uint32_t |
103.1% | 26.62 | 37,568,188.30 | 2.8% | 3.23 | google uint64_t |
compiled with vs2019 16.8 msvc /O2
runs on Win10, [email protected]
Can I use nanobench to benchmark C code?
If not, how can this be done?
See if the system is properly configured. See what pyperf actually does under its hood, and then at least check that the proper values are set, much like sudo python3 -m pyperf system does. Warn when something is not set.
Hi,
Is there a reason why does it recommend a fork of pyperf (https://github.com/vstinner/pyperf) instead of pyperf itself? (https://github.com/psf/pyperf)
Thanks!
Run app 100 times, show nice diagram with the accuracy
Perhaps there's already a way? But I couldn't find how to disable the output altogether. The use case is to be able to keep benchmarking with different data sizes so that one can programmatically determine algorithm thresholds using nanobench.
Can give hugely differing results (about 100 times slower) when running AVX2 code:
#define ANKERL_NANOBENCH_IMPLEMENT
#include <nanobench.h>

#include <immintrin.h>
#include <new>

int main() {
    alignas(32) float res[8];
    // note: both buffers are left uninitialized, so the multiply reads
    // indeterminate values (possibly denormals, which are very slow)
    float* mem = static_cast<float*>(operator new(256, std::align_val_t(32)));
    float* mulmem = static_cast<float*>(operator new(256, std::align_val_t(32)));
    ankerl::nanobench::Bench().run("simd", [&]() {
        __m256 simdvec_ = _mm256_loadu_ps(mem);
        __m256 simdvecmul_ = _mm256_loadu_ps(mulmem);
        simdvec_ = _mm256_mul_ps(simdvec_, simdvecmul_);
        _mm256_storeu_ps(res, simdvec_);
    });
    operator delete(mem, std::align_val_t(32));
    operator delete(mulmem, std::align_val_t(32));
    return static_cast<int>(res[0]);
}
Sometimes when I rebuild it reports about 0.5 ns/op, and when I relaunch it reports about 29 ns/op. I think it is related to the Windows 11 thread scheduler and/or to my processor being an Alder Lake i5-12600K with E-cores.
Google Benchmark seems to give more consistent results, about 0.23 ns.
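On hybrid CPUs like Alder Lake, pinning the benchmark thread to a single core prevents runs from landing on different core types. This is a hedged suggestion, not a confirmed fix for this report; the sketch below is for Linux (`sched_setaffinity`), and the Windows equivalent would be SetThreadAffinityMask:

```cpp
#include <sched.h>

// Pin the calling thread to one CPU so repeated benchmark runs execute on
// the same core type (pick a P-core index on hybrid processors).
bool pinToCpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // pid 0 means "the calling thread"
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```
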
Add the number of iterations etc. Make this optional, e.g. with an environment variable.
If the benchmarked code itself uses threads and you want to compare it to single-threaded code, you may want to use real-time ("wallclock") measurements for latency comparisons... Without UseRealTime, CPU time is used by default.
and
By default, the CPU timer only measures the time spent by the main thread. If the benchmark itself uses threads internally, this measurement may not be what you are looking for. Instead, there is a way to measure the total CPU usage of the process, by all the threads.
Does nanobench use wallclock or CPU time? And if CPU, for all threads or main thread only? I'd assume wallclock, but not certain. You may also want to make it configurable.
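The distinction the question turns on can be demonstrated directly (this illustrates wallclock vs. CPU time in general, not nanobench's internals, which are exactly what is being asked):

```cpp
#include <chrono>
#include <ctime>
#include <thread>

struct TimeSample {
    double wallMs; // real elapsed ("wallclock") time
    double cpuMs;  // CPU time consumed by this process
};

// Sleeping advances wallclock time but consumes almost no CPU time, so the
// two clocks diverge sharply for anything that waits (I/O, other threads).
TimeSample sleepAndMeasure() {
    auto wallStart = std::chrono::steady_clock::now();
    std::clock_t cpuStart = std::clock();
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    double cpuMs = 1000.0 * static_cast<double>(std::clock() - cpuStart) / CLOCKS_PER_SEC;
    double wallMs = std::chrono::duration<double, std::milli>(
                        std::chrono::steady_clock::now() - wallStart).count();
    return {wallMs, cpuMs};
}
```
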
All microbenchmarking software runs the code under benchmark in a loop. This leads to the CPU running with everything hot: instructions cached, microcode decoded, and branch predictions trained for 1M iterations ahead.
My feature request is to add some single-run tests, timed with rdtsc, before the common benchmark.
Then run the current benchmark as usual.
The benchmark statistics would then include one cold and two hot TSC readings for each test.
These three numbers (one cold and two hot runs) provide information about the time of a cold run (the first run, when the instructions are not yet in the CPU cache) versus running with cached instructions (and data). The two hot numbers allow evaluating jitter. These numbers will not be as accurate as running the code 1M times and taking the dispersion, but they provide useful information about running times without a trained predictor.
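A single rdtsc-timed call, as requested, could be sketched like this (x86-only; `cyclesForOneCall` is a hypothetical helper, and on other architectures a steady_clock reading would stand in for `__rdtsc()`):

```cpp
#include <cstdint>
#include <x86intrin.h>

// Time exactly one invocation with the CPU's timestamp counter. Called once
// on cold code it gives the "cold" number; called again it gives a "hot" one.
template <typename Fn>
uint64_t cyclesForOneCall(Fn&& fn) {
    uint64_t before = __rdtsc();
    fn();
    uint64_t after = __rdtsc();
    return after - before;
}
```
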
It compiles fine with g++ but fails with clang++, with e.g.:
nb.cpp:(.text+0x2f): undefined reference to `ankerl::nanobench::Config::Config()'
nb.cpp:(.text+0xcf): undefined reference to `ankerl::nanobench::Config::~Config()'
nb.cpp:(.text+0x11d): undefined reference to `ankerl::nanobench::Config::~Config()'
/tmp/nb-dff5f7.o: In function `ankerl::nanobench::Result ankerl::nanobench::Config::run<main::$_0>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, main::$_0)':
nb.cpp:(.text+0x197): undefined reference to `ankerl::nanobench::detail::IterationLogic::IterationLogic(ankerl::nanobench::Config const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
nb.cpp:(.text+0x1af): undefined reference to `ankerl::nanobench::detail::IterationLogic::numIters() const'
nb.cpp:(.text+0x26f): undefined reference to `ankerl::nanobench::detail::IterationLogic::add(std::chrono::duration<long, std::ratio<1l, 1000000000l> >)'
nb.cpp:(.text+0x2a2): undefined reference to `ankerl::nanobench::detail::IterationLogic::result() const'
nb.cpp:(.text+0x2de): undefined reference to `ankerl::nanobench::detail::IterationLogic::result() const'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
This is likely because the definitions of those methods somehow can't be resolved by clang++.
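One known cause of exactly these undefined references (hedged: it may not be what happened here) is that no translation unit compiles the single-header implementation. nanobench expects exactly one .cpp file to do:

```cpp
// In exactly one translation unit, before including the header:
#define ANKERL_NANOBENCH_IMPLEMENT
#include <nanobench.h>
```

Every other file just includes `<nanobench.h>` without the define.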
Recently I've been testing a few different sorting algorithms; the rough setup has a preallocated block of memory filled with random data. However, that means for each benchmark run after a sort, I have to scramble or generate more random data, and this code is shared between all the benchmarks. That roughly translates to a benchmark which is really measuring the cost of those two things together, rather than just the sorting algorithm on its own. Presumably the sort algorithm overwhelms the cost of generating new data, but it's difficult to gauge exactly how much generating data costs without benchmarking that part on its own.
It seems almost as if, with a few edits to add additional callbacks, it'd be possible to add timings or even ignore parts of the code which are not really part of the test. If a second callback doesn't make the code more unstable or slower to test, it'd probably be a handy tool for cases like this.
Might look like:
template <typename Start, typename Op, typename End>
ANKERL_NANOBENCH(NOINLINE)
Bench& run(std::string const& benchmarkName, Start&& start, Op&& op, End&& end);