martinus / nanobench

Simple, fast, accurate single-header microbenchmarking functionality for C++11/14/17/20

Home Page: https://nanobench.ankerl.com

License: MIT License

Languages: CMake 7.27%, Shell 7.01%, C++ 81.78%, Python 3.07%, HTML 0.87%

Topics: benchmark, cpp, microbenchmark, single-header, single-header-lib, single-file, header-only, cpp11


nanobench's People

Contributors

chipot, cj-tommi-rantala, cozycactus, fferflo, jonas-schulze, lectem, martinus, mxmlkzdh, pr8x, tocic, tridacnid, vadi2


nanobench's Issues

Add access to mResult?

It's not entirely clear whether mResult is reachable through the public interface; it does appear necessary for printing results.
E.g.:

const Result &Bench::aggregate_result() noexcept {
    return mResult;
}

Visual Studio Version 16.8.2: example_random_number_generators.cpp fails to build

C2598 linkage specification must be at global scope nb C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\include\setjmp.h 24

C2624 'WyRng::umul128::__m128d': local classes cannot be used to declare 'extern' variables nb C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\include\emmintrin.h 77

Manual Timing

Is there a mechanism (similar to Google Benchmark) to perform manual timing, i.e. not use the clock built into nanobench, but provide our own iteration time to nanobench?
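For reference, this is roughly what the Google Benchmark mechanism referred to above looks like; a sketch of that library's API, not of anything nanobench currently offers:

#include <benchmark/benchmark.h>
#include <chrono>

static void BM_ManualTiming(benchmark::State& state) {
    for (auto _ : state) {
        auto start = std::chrono::high_resolution_clock::now();
        // ... code under test ...
        volatile int sink = 0;
        for (int i = 0; i < 1000; ++i) sink = sink + i;
        auto end = std::chrono::high_resolution_clock::now();
        // Report our own measured duration instead of the framework's clock.
        state.SetIterationTime(std::chrono::duration<double>(end - start).count());
    }
}
BENCHMARK(BM_ManualTiming)->UseManualTime();
BENCHMARK_MAIN();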

Request: CSV output

I think it would be useful to be able to output to CSV files for some graphing in Excel.
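For what it's worth, nanobench's mustache-style render mechanism can already emit CSV via the csv() template declared in nanobench.h (it also shows up in the build log of the macOS issue below). A minimal sketch:

#define ANKERL_NANOBENCH_IMPLEMENT
#include <nanobench.h>
#include <cstdint>
#include <iostream>

int main() {
    uint64_t x = 1;
    ankerl::nanobench::Bench bench;
    bench.run("x += x", [&] { ankerl::nanobench::doNotOptimizeAway(x += x); });
    // Render everything collected so far as CSV, ready for graphing in Excel.
    bench.render(ankerl::nanobench::templates::csv(), std::cout);
}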

Doesn't compile on macOS

Works great for me on Linux but not on macOS:

  FAILED: _deps/nanobench-build/CMakeFiles/nanobench.dir/src/test/app/nanobench.cpp.o 
  ccache /usr/local/opt/ccache/libexec/c++  -I_deps/nanobench-src/src/include -isysroot /Applications/Xcode_12.4.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk -mmacosx-version-min=10.14 -MD -MT _deps/nanobench-build/CMakeFiles/nanobench.dir/src/test/app/nanobench.cpp.o -MF _deps/nanobench-build/CMakeFiles/nanobench.dir/src/test/app/nanobench.cpp.o.d -o _deps/nanobench-build/CMakeFiles/nanobench.dir/src/test/app/nanobench.cpp.o -c _deps/nanobench-src/src/test/app/nanobench.cpp
  In file included from _deps/nanobench-src/src/test/app/nanobench.cpp:2:
  Warning: _deps/nanobench-src/src/include/nanobench.h:117:15: warning: alias declarations are a C++11 extension [-Wc++11-extensions]
  using Clock = std::conditional<std::chrono::high_resolution_clock::is_steady, std::chrono::high_resolution_clock,
                ^
  Error: _deps/nanobench-src/src/include/nanobench.h:296:19: error: expected function body after function declarator
  char const* csv() noexcept;
                    ^
  Error: _deps/nanobench-src/src/include/nanobench.h:308:27: error: expected function body after function declarator
  char const* htmlBoxplot() noexcept;
                            ^
  Error: _deps/nanobench-src/src/include/nanobench.h:319:20: error: expected function body after function declarator
  char const* json() noexcept;
                     ^
  Error: _deps/nanobench-src/src/include/nanobench.h:347:7: error: function definition does not declare parameters
      T pageFaults{};
        ^
  Error: _deps/nanobench-src/src/include/nanobench.h:348:7: error: function definition does not declare parameters
      T cpuCycles{};
        ^
  Error: _deps/nanobench-src/src/include/nanobench.h:349:7: error: function definition does not declare parameters
      T contextSwitches{};
        ^
  Error: _deps/nanobench-src/src/include/nanobench.h:350:7: error: function definition does not declare parameters
      T instructions{};
        ^
  Error: _deps/nanobench-src/src/include/nanobench.h:351:7: error: function definition does not declare parameters
      T branchInstructions{};
        ^
  Error: _deps/nanobench-src/src/include/nanobench.h:352:7: error: function definition does not declare parameters
      T branchMisses{};
        ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:360:33: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      std::string mBenchmarkTitle = "benchmark";
                                  ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:361:32: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      std::string mBenchmarkName = "noname";
                                 ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:362:23: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      std::string mUnit = "op";
                        ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:363:19: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      double mBatch = 1.0;
                    ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:364:25: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      double mComplexityN = -1.0;
                          ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:365:23: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      size_t mNumEpochs = 11;
                        ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:366:37: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      size_t mClockResolutionMultiple = static_cast<size_t>(1000);
                                      ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:367:44: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      std::chrono::nanoseconds mMaxEpochTime = std::chrono::milliseconds(100);
                                             ^
  Error: _deps/nanobench-src/src/include/nanobench.h:368:30: error: function definition does not declare parameters
      std::chrono::nanoseconds mMinEpochTime{};
                               ^
  Error: _deps/nanobench-src/src/include/nanobench.h:369:14: error: function definition does not declare parameters
      uint64_t mMinEpochIterations{1};
               ^
  Error: _deps/nanobench-src/src/include/nanobench.h:370:14: error: function definition does not declare parameters
      uint64_t mEpochIterations{0}; // If not 0, run *exactly* these number of iterations per epoch.
               ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:371:22: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      uint64_t mWarmup = 0;
                       ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:372:24: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      std::ostream* mOut = nullptr;
                         ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:373:45: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      std::chrono::duration<double> mTimeUnit = std::chrono::nanoseconds{1};
                                              ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:374:31: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      std::string mTimeUnitName = "ns";
                                ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:375:35: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      bool mShowPerformanceCounters = true;
                                    ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:376:22: warning: in-class initialization of non-static data member is a C++11 extension [-Wc++11-extensions]
      bool mIsRelative = false;
                       ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:381:29: warning: rvalue references are a C++11 extension [-Wc++11-extensions]
      Config& operator=(Config&&);
                              ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:383:18: warning: rvalue references are a C++11 extension [-Wc++11-extensions]
      Config(Config&&) noexcept;
                   ^
  Error: _deps/nanobench-src/src/include/nanobench.h:383:21: error: expected ';' at end of declaration list
      Config(Config&&) noexcept;
                      ^
                      ;
  Error: _deps/nanobench-src/src/include/nanobench.h:373:71: error: expected '(' for function-style cast or type construction
      std::chrono::duration<double> mTimeUnit = std::chrono::nanoseconds{1};
                                                ~~~~~~~~~~~~~~~~~~~~~~~~^
  Warning: _deps/nanobench-src/src/include/nanobench.h:391:10: warning: scoped enumerations are a C++11 extension [-Wc++11-extensions]
      enum class Measure : size_t {
           ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:407:29: warning: rvalue references are a C++11 extension [-Wc++11-extensions]
      Result& operator=(Result&&);
                              ^
  Warning: _deps/nanobench-src/src/include/nanobench.h:409:18: warning: rvalue references are a C++11 extension [-Wc++11-extensions]
      Result(Result&&) noexcept;
                   ^
  Error: _deps/nanobench-src/src/include/nanobench.h:409:21: error: expected ';' at end of declaration list
      Result(Result&&) noexcept;
                      ^
                      ;
  Error: _deps/nanobench-src/src/include/nanobench.h:415:61: error: expected ';' at end of declaration list
      ANKERL_NANOBENCH(NODISCARD) Config const& config() const noexcept;
                                                              ^
                                                              ;
  Error: _deps/nanobench-src/src/include/nanobench.h:420:60: error: expected ';' at end of declaration list
      ANKERL_NANOBENCH(NODISCARD) double sum(Measure m) const noexcept;
                                                             ^
                                                             ;
  Error: _deps/nanobench-src/src/include/nanobench.h:421:80: error: expected ';' at end of declaration list
      ANKERL_NANOBENCH(NODISCARD) double sumProduct(Measure m1, Measure m2) const noexcept;
                                                                                 ^
                                                                                 ;
  Error: _deps/nanobench-src/src/include/nanobench.h:422:64: error: expected ';' at end of declaration list
      ANKERL_NANOBENCH(NODISCARD) double minimum(Measure m) const noexcept;
                                                                 ^
                                                                 ;
  fatal error: too many errors emitted, stopping now [-ferror-limit=]

the cycles/value output doesn't show on some platforms

Hi,

I'm using nanobench in some of my projects, and everything's good.
One question though: on one of my Linux systems, running Arch Linux, I get all the cycles, IPC, branch, etc. measures displayed.
One of my coworkers uses Linux Mint / Ubuntu 18.04 and he gets none of those.

Is there a specific package to install so that the extra perf counters get picked up?

comparisons of results

Ability to store and later compare results; maybe multiple results, to create a graph of changes over time.

Some statistical analysis would be nice. Maybe output in a format that's understood by some good tool
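Until built-in comparison support exists, one workable interim approach is to render each run with the json() template (visible in the header excerpt above) and feed the files to an external diffing/graphing tool; a minimal sketch:

#define ANKERL_NANOBENCH_IMPLEMENT
#include <nanobench.h>
#include <cstdint>
#include <fstream>

int main() {
    uint64_t x = 1;
    ankerl::nanobench::Bench bench;
    bench.run("to-be-tracked", [&] { ankerl::nanobench::doNotOptimizeAway(x += x); });
    // Write machine-readable results; one file per commit/run, compared
    // later by an external script or plotting tool.
    std::ofstream out("results.json");
    bench.render(ankerl::nanobench::templates::json(), out);
}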

add Rng

Sfc64, without random seeding

Clarify use of doctest in documentation?

The documentation says

In the remaining examples, I’m using doctest as a unit test framework, which is like Catch2 - but compiles much faster. It pairs well with nanobench.

The benefits I can see from it are benchmark registration and filtering; is there anything else? It may also be useful to explicitly mention/show how it allows filtering.

On the other hand, you get the test-results table, which isn't too useful because the tests should all definitely pass. It can be suppressed by using

#define DOCTEST_CONFIG_IMPLEMENT // we supply our own main()
#include <doctest/doctest.h>

int main(int argc, char** argv) {
    doctest::Context context;
    context.setOption("out", "/dev/null");
    context.setOption("no-version", true);
    context.applyCommandLine(argc, argv);
    return context.run();
}

which may also be worth mentioning.

which one is more important, ns/op or total?

|         ns/op |   op/s | err% |       ins/op |        cyc/op |   IPC |       bra/op | miss% | total | benchmark
|--------------:|-------:|-----:|-------------:|--------------:|------:|-------------:|------:|------:|:----------
|  7,266,190.00 | 137.62 | 3.3% | 4,721,603.00 | 15,556,024.00 | 0.304 | 1,302,758.00 | 13.0% |  0.18 | hopscotch_map
| 35,033,938.00 |  28.54 | 0.9% | 4,961,470.00 | 76,638,408.00 | 0.065 | 1,097,837.00 | 23.8% |  0.44 | unordered_map
|  6,696,755.00 | 149.33 | 1.7% | 8,040,577.00 | 14,634,752.00 | 0.549 |   767,069.00 | 17.1% |  0.12 | flat_hash_map
|  7,676,762.00 | 130.26 | 2.3% | 7,126,320.00 | 16,794,536.00 | 0.424 |   774,163.00 | 17.6% |  0.09 | F14FastMap

flat_hash_map: ns/op = 6,696,755.00, total = 0.12
F14FastMap: ns/op = 7,676,762.00, total = 0.09

The two metrics rank these containers differently, hence the question.

Better documentation

  • Separate into multiple files
  • Describe mustache-like templates (with a demonstration graph)
  • Start with simple usage

pre and post actions for benching

I'd like to be able to perform some action before/after an iteration that doesn't get counted towards the runtime. For example, I want to copy an array before each iteration so it starts from a clean slate, but I also do not want to measure the runtime of the copying. A baseline-subtraction workaround is sketched below.
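Until such hooks exist, one can benchmark the untimed setup on its own and subtract it as a baseline; a minimal sketch using only the nanobench API shown elsewhere in these issues:

#define ANKERL_NANOBENCH_IMPLEMENT
#include <nanobench.h>
#include <algorithm>
#include <vector>

int main() {
    std::vector<int> pristine(10000, 42);
    std::vector<int> work(pristine.size());

    ankerl::nanobench::Bench bench;
    // Baseline: just the per-iteration setup (the copy).
    bench.run("copy only", [&] {
        std::copy(pristine.begin(), pristine.end(), work.begin());
        ankerl::nanobench::doNotOptimizeAway(work.data());
    });
    // Setup plus the operation of interest; subtracting the baseline's
    // ns/op from this result estimates the sort alone.
    bench.run("copy + sort", [&] {
        std::copy(pristine.begin(), pristine.end(), work.begin());
        std::sort(work.begin(), work.end());
        ankerl::nanobench::doNotOptimizeAway(work.data());
    });
}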

CMake integration (interface library)

There is a CMakeLists.txt in the nanobench repository, but it seems to be meant for development, not for users of the library. It would be nice if nanobench could be integrated into a project via CMake's FetchContent, or as a Git submodule (if one can't use relatively new versions of CMake). Specifically, the following CMakeLists.txt should work:

cmake_minimum_required(VERSION 3.14)

project(
  CMakeNanobenchExample
  VERSION 1.0
  LANGUAGES CXX)

include(FetchContent)

FetchContent_Declare(
  doctest
  GIT_REPOSITORY https://github.com/onqtam/doctest.git
  GIT_TAG 2.4.0
  GIT_SHALLOW TRUE)

FetchContent_Declare(
  nanobench
  GIT_REPOSITORY https://github.com/martinus/nanobench.git
  GIT_TAG v4.0.2
  GIT_SHALLOW TRUE)

FetchContent_MakeAvailable(doctest nanobench)

add_executable(MyExample my_example.cpp)
target_link_libraries(MyExample PRIVATE doctest nanobench)

Basically, what FetchContent_MakeAvailable does here is clone the repositories declared by FetchContent_Declare and then include them via add_subdirectory.

The problems are:

  1. nanobench is not an interface library. It should be easy to create a header-only library with CMake's interface-library feature (for example, see here).
  2. The CMakeLists.txt in the nanobench repository does many things, such as searching for ccache and including subdirectories for testing, which are not relevant for library users. In particular, it leads to building many test programs. To avoid this, it may help to look at the CMakeLists.txt of doctest or Catch2, which detect whether the library is being included as a subdirectory (see also here for an example).

Clarify err%

What does the err% metric measure? I check my application with unit tests, libFuzzer, and many clang flags, so I would expect this value to be zero, but it is reporting about a 5% error rate.

Does err% indicate that nanobench observed an iteration crashing? Could you please expand the documentation on what this metric means?

Raise default to 1ms runtime

In some cases this significantly improves precision with very little overhead. This has already been done in Bitcoin's usage of nanobench.
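For a single benchmark, the equivalent per-call knob would presumably be the setter matching the mMinEpochTime field in the header excerpt above; a minimal sketch, assuming a minEpochTime setter exists:

#define ANKERL_NANOBENCH_IMPLEMENT
#include <nanobench.h>
#include <chrono>
#include <cstdint>

int main() {
    ankerl::nanobench::Bench bench;
    // Request at least 1 ms of measured runtime per epoch instead of the
    // clock-resolution-based default.
    bench.minEpochTime(std::chrono::milliseconds(1));
    uint64_t x = 1;
    bench.run("x += x", [&] { ankerl::nanobench::doNotOptimizeAway(x += x); });
}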

Suggestions for benchmarking parallel data structures?

Hi, I want to do some microbenchmarking of parallel data structures, where I run a bunch of parallel tasks and then time the work they do in different threads. The problem is that I want to somehow combine the timings of the work done in each task, to avoid including the overhead of the task scheduler. Any suggestions for using nanobench in this scenario? Thanks!

Option to suppress unstable warnings at run-time?

Though it is a good thing to show warnings for unstable results (〰️ ... (Unstable with ...)), sometimes one wants to run benchmarks in an environment where execution is potentially perturbed by other processes, so results are known to be unstable; for example, in continuous integration.

So, it might be handy if an environment variable (NANOBENCH_NO_UNSTABLE_WARNING or something) could change the behaviour and suppress the warnings for unstable results.

Changing the behaviour via a command-line option would be an alternative, but I think an environment variable is more suitable for CI, and I guess it is easier to implement.
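A minimal sketch of how such a switch could be honoured inside the warning path; the name is the one proposed above, and none of this is existing nanobench code:

#include <cstdlib>

// Hypothetical check: emit the "unstable results" warning only when the
// proposed NANOBENCH_NO_UNSTABLE_WARNING environment variable is NOT set.
bool shouldWarnAboutUnstableResults() {
    return std::getenv("NANOBENCH_NO_UNSTABLE_WARNING") == nullptr;
}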

Setup with vcpkg

Trying to add nanobench via a vcpkg manifest yields bizarre results.
The vcpkg manifest dependency looks like this:

...
    "dependencies": [
        "xtensor-blas",
        "nanobench",
        {
            "name": "highfive",
            "default-features": false,
            "features": [ "xtensor" ]
        },
...

It only seems to be happy if I don't add anything to CMake at all.
If I try find_package(nanobench), or link against nanobench, it complains that the package can't be found or that I have to add to CMAKE_PREFIX_PATH, etc.

Is not setting anything up for it in CMake standard practice here, or is there some wizardry going on?

Confusion with CPU governor

Hi,

I am confused about the state of the CPU governor. nanobench reports:

Warning, results might be unstable:

  • CPU governor is '' but should be 'performance'
  • Turbo is enabled, CPU frequency will fluctuate

But I have the following configuration for my CPU frequency:

$ grep GOVERNOR /etc/init.d/cpufrequtils
#       GOVERNOR="ondemand"                                                                                                                        
GOVERNOR="performance"                                                                                                                             
        if [ -f $info ] && grep -q "\<$GOVERNOR\>" $info ; then                                                                                    
if [ -n "$GOVERNOR" ] ; then                                                                                                                       
        CPUFREQ_OPTIONS="$CPUFREQ_OPTIONS --governor $GOVERNOR"                                                                                    
                log_action_begin_msg "$DESC: Setting $GOVERNOR CPUFreq governor"

I have disabled the systemd service for ondemand, regenerated the initramfs, and rebooted.

Is nanobench correctly detecting the CPU governor state?

Maybe its parsing is mistaken. This is an EC2 virtual machine, and I don't actually know whether the CPU governor can be customized there at all. What do you think about hiding this message on virtual machines?

nanobench 4.3.0, Ubuntu 20.04 Focal Fossa, EC2 t2.micro.

how to deal with unstable results?

I'm benchmarking protobuf. There are always some warnings about unstable results on the first run, even after increasing the warmup and the minimum iterations:

TEST_CASE("varint encode benchmark"){

    nanobench::Bench b;
    b.title("varint encode")
        .warmup(100000)
        .relative(true);
    b.performanceCounters(true);

    uint8_t buf[10] = {};

    b.minEpochIterations(10000000).run("google uint32_t", [&]{
        nanobench::doNotOptimizeAway(
            google::protobuf::io::CodedOutputStream::WriteVarint32ToArray(static_cast<uint32_t>(2961488830), buf));
    });

    b.minEpochIterations(10000000).run("google uint64_t", [&]{
        nanobench::doNotOptimizeAway(
            google::protobuf::io::CodedOutputStream::WriteVarint64ToArray(static_cast<uint64_t>(-41256202580718336), buf));
    });

//     std::ofstream f{"varint encode benchmark.html"};
//     b.render(nanobench::templates::htmlBoxplot(), f);

} // TEST_CASE("varint encode benchmark")


TEST_CASE("varint decode benchmark"){

    nanobench::Bench b;
    b.title("varint decode")
        .warmup(100000)
        .relative(true);
    b.performanceCounters(true);

    std::initializer_list<uint8_t> buf32 = {0xbe, 0xf7, 0x92, 0x84, 0x0b};
    std::initializer_list<uint8_t> buf64 = {0x9b, 0xa8, 0xf9, 0xc2, 0xbb, 0xd6, 0x80, 0x85, 0xa6, 0x01};

    b.minEpochIterations(10000000).run("google uint32_t", [&]{
        uint32_t v;
        nanobench::doNotOptimizeAway(
            google::protobuf::io::CodedInputStream{buf32.begin(), (int)buf32.size()}.ReadVarint32(&v));
    });

    b.minEpochIterations(10000000).run("google uint64_t", [&]{
        uint64_t v;
        nanobench::doNotOptimizeAway(
            google::protobuf::io::CodedInputStream{buf64.begin(), (int)buf64.size()}.ReadVarint64(&v));
    });

//     std::ofstream f{"varint decode benchmark.html"};
//     b.render(nanobench::templates::htmlBoxplot(), f);

} // TEST_CASE("varint decode benchmark")

encode results for 3 runs:

| relative | ns/op |           op/s | err% | total | varint encode
|---------:|------:|---------------:|-----:|------:|:--------------
|   100.0% |  6.95 | 143,943,329.37 | 5.3% |  0.87 | 〰️ google uint32_t (Unstable with ~11,003,388.7 iters. Increase minEpochIterations to e.g. 110033887)
|    54.0% | 12.86 |  77,751,471.35 | 2.0% |  1.56 | google uint64_t

| relative | ns/op |           op/s | err% | total | varint encode
|---------:|------:|---------------:|-----:|------:|:--------------
|   100.0% |  7.26 | 137,743,666.50 | 4.8% |  0.88 | google uint32_t
|    55.0% | 13.21 |  75,702,505.04 | 1.6% |  1.62 | google uint64_t

| relative | ns/op |           op/s | err% | total | varint encode
|---------:|------:|---------------:|-----:|------:|:--------------
|   100.0% |  7.22 | 138,464,875.81 | 2.9% |  0.86 | google uint32_t
|    56.4% | 12.80 |  78,102,820.79 | 1.9% |  1.55 | google uint64_t

decode results for 3 runs:

| relative | ns/op |           op/s | err% | total | varint decode
|---------:|------:|---------------:|-----:|------:|:--------------
|   100.0% | 26.93 |  37,132,748.98 | 1.9% |  3.26 | google uint32_t
|   101.0% | 26.67 |  37,490,487.35 | 3.4% |  3.23 | google uint64_t

| relative | ns/op |           op/s | err% | total | varint decode
|---------:|------:|---------------:|-----:|------:|:--------------
|   100.0% | 26.64 |  37,543,305.23 | 1.4% |  3.23 | google uint32_t
|   102.7% | 25.93 |  38,558,548.08 | 1.2% |  3.17 | google uint64_t

| relative | ns/op |           op/s | err% | total | varint decode
|---------:|------:|---------------:|-----:|------:|:--------------
|   100.0% | 27.45 |  36,434,999.13 | 0.8% |  3.31 | google uint32_t
|   103.1% | 26.62 |  37,568,188.30 | 2.8% |  3.23 | google uint64_t

Compiled with VS2019 16.8 MSVC /O2.
Runs on Win10, [email protected].

check pyperf tunings

See if the system is properly configured. See what pyperf actually does under its hood, and then at least check that the proper values are set, much like sudo python3 -m pyperf system. Warn when something is not set.

Request: Ability to disable output

Perhaps there's already a way, but I couldn't find how to disable the output altogether. The use case is to keep benchmarking with different data sizes, so that one can programmatically determine algorithm thresholds using nanobench.
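A minimal sketch, assuming an output() setter backing the mOut member visible in the header excerpt above, plus results()/median() accessors for programmatic inspection:

#define ANKERL_NANOBENCH_IMPLEMENT
#include <nanobench.h>
#include <cstdint>

int main() {
    ankerl::nanobench::Bench bench;
    bench.output(nullptr); // no results table is printed at all
    uint64_t x = 1;
    bench.run("threshold probe", [&] { ankerl::nanobench::doNotOptimizeAway(x += x); });
    // Pull the numbers programmatically instead, e.g. the median elapsed
    // time per op, to drive a threshold search over data sizes.
    double elapsed = bench.results().front().median(
        ankerl::nanobench::Result::Measure::elapsed);
    (void)elapsed;
}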

Randomly unstable results on Alder Lake, Win 11

It can give hugely differing results (about 100 times slower) when running AVX2 code:

// Includes inferred from usage; the original snippet omitted them:
#define ANKERL_NANOBENCH_IMPLEMENT
#include <nanobench.h>
#include <immintrin.h>
#include <new>

using ankerl::nanobench::Bench; // the snippet uses Bench unqualified

int main(int argc, char ** argv)
{
	alignas(32) float res[8];
	float * mem = static_cast<float *>(operator new(256, std::align_val_t(32)));
	float * mulmem = static_cast<float *>(operator new(256, std::align_val_t(32)));

	Bench().run("simd", [&]()
	{
		__m256 simdvec_ = _mm256_loadu_ps(mem);
		__m256 simdvecmul_ = _mm256_loadu_ps(mulmem);
		simdvec_ = _mm256_mul_ps(simdvec_, simdvecmul_);
		_mm256_storeu_ps(res, simdvec_);
	});
	return static_cast<int>(res[0]); // read res so the stores aren't elided
}

Sometimes when I rebuild it reports about 0.5 ns/op, and when I relaunch it reports about 29 ns/op. I think it is related to the Windows 11 thread scheduler and/or to my processor being an Alder Lake i5-12600K with E-cores.

Google Benchmark seems to give more consistent results, about 0.23 ns.

Clarify whether CPU or wallclock time is used

In Google Benchmark docs:

If the benchmarked code itself uses threads and you want to compare it to single-threaded code, you may want to use real-time ("wallclock") measurements for latency comparisons... Without UseRealTime, CPU time is used by default.

and

By default, the CPU timer only measures the time spent by the main thread. If the benchmark itself uses threads internally, this measurement may not be what you are looking for. Instead, there is a way to measure the total CPU usage of the process, by all the threads.

Does nanobench use wallclock or CPU time? And if CPU time, for all threads or the main thread only? I'd assume wallclock, but I'm not certain. You may also want to make it configurable.
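For what it's worth, the Clock alias from nanobench.h quoted in the macOS build log above points at wall-clock time rather than per-thread CPU time. Reconstructed here for reference; the log line is truncated, so the steady_clock fallback is an assumption:

#include <chrono>
#include <type_traits>

// nanobench.h's Clock alias (as quoted in the macOS log): a steady wall
// clock, not a CPU-time clock such as CLOCK_PROCESS_CPUTIME_ID.
using Clock = std::conditional<std::chrono::high_resolution_clock::is_steady,
                               std::chrono::high_resolution_clock,
                               std::chrono::steady_clock>::type;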

[feature] single run benchmarks

All microbenchmarking software runs the code under benchmark in a loop. This leads to the CPU running with everything hot: instructions cached as microcode, branch predictions trained for 1M iterations ahead.

My feature request is to add some single-run tests, timed via rdtsc, before the common benchmark:

  • a first run, while the code is cold, produces a cold timestamp-counter reading
  • a second and third run of the same code (with data and instructions cached) produce two hot TSC readings

Then run the current benchmark. The benchmark statistics should include the cold and the two hot TSC readings for each test.

These three numbers (one cold and two hot runs) provide information about the cold-run time (the first run, when the instructions are not yet in the CPU cache) and about running cached instructions (and data). The two hot numbers allow evaluating jitter. These numbers will not be as accurate as running the code 1M times and taking the dispersion, but they provide useful information about running times without a trained predictor.
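A minimal sketch of the proposed cold/hot measurement, assuming x86 and GCC/Clang's __rdtsc intrinsic (MSVC has the same intrinsic in <intrin.h>):

#include <cstdint>
#include <cstdio>
#include <x86intrin.h> // __rdtsc

// Run `op` once and return the elapsed TSC ticks.
template <typename Op>
uint64_t timeOnce(Op&& op) {
    uint64_t start = __rdtsc();
    op();
    return __rdtsc() - start;
}

int main() {
    volatile int sink = 0;
    auto op = [&] { for (int i = 0; i < 1000; ++i) sink = sink + i; };
    uint64_t cold = timeOnce(op); // instructions/data not yet cached
    uint64_t hot1 = timeOnce(op); // warm caches and branch predictor
    uint64_t hot2 = timeOnce(op); // hot1 vs hot2 shows the jitter
    std::printf("cold=%llu hot=%llu/%llu ticks\n",
                (unsigned long long)cold,
                (unsigned long long)hot1,
                (unsigned long long)hot2);
}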

Won't compile due to undefined-reference errors

It compiles fine with g++ but fails with clang++, with e.g.:

nb.cpp:(.text+0x2f): undefined reference to `ankerl::nanobench::Config::Config()'
nb.cpp:(.text+0xcf): undefined reference to `ankerl::nanobench::Config::~Config()'
nb.cpp:(.text+0x11d): undefined reference to `ankerl::nanobench::Config::~Config()'
/tmp/nb-dff5f7.o: In function `ankerl::nanobench::Result ankerl::nanobench::Config::run<main::$_0>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, main::$_0)':
nb.cpp:(.text+0x197): undefined reference to `ankerl::nanobench::detail::IterationLogic::IterationLogic(ankerl::nanobench::Config const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
nb.cpp:(.text+0x1af): undefined reference to `ankerl::nanobench::detail::IterationLogic::numIters() const'
nb.cpp:(.text+0x26f): undefined reference to `ankerl::nanobench::detail::IterationLogic::add(std::chrono::duration<long, std::ratio<1l, 1000000000l> >)'
nb.cpp:(.text+0x2a2): undefined reference to `ankerl::nanobench::detail::IterationLogic::result() const'
nb.cpp:(.text+0x2de): undefined reference to `ankerl::nanobench::detail::IterationLogic::result() const'
clang: error: linker command failed with exit code 1 (use -v to see invocation)

This is likely because clang++ somehow fails to resolve the definitions of those methods.

Idea? Timing only particular sections of code

Recently I've been testing a few different sorting algorithms. The rough setup has a preallocated block of memory filled with random data. However, that means for each benchmark run after a sort, I have to scramble the data or generate more random data, and this code is shared between all the benchmarks. This roughly translates to a benchmark that really measures the cost of those two things together, rather than just the sorting algorithm on its own. Presumably the sort algorithm overwhelms the cost of generating new data, but it's difficult to gauge exactly how much generating data costs without benchmarking that part on its own.

It seems that with a few edits to add additional callbacks, it would be possible to add timings for, or even ignore, parts of the code which are not really part of the test. If a second callback doesn't make the code more unstable or slower to test, it'd probably be a handy tool for cases like this.

Might look like:

    template <typename Start, typename Op, typename End>
    ANKERL_NANOBENCH(NOINLINE)
    Bench& run(std::string const& benchmarkName, Start&& start, Op&& op, End&& end);
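To make the requested semantics concrete, here is a hand-rolled sketch of what the start/op/end split would measure; plain chrono timing, not nanobench API:

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

int main() {
    std::vector<int> data(100000);
    std::iota(data.begin(), data.end(), 0);
    std::mt19937 rng{42};
    std::chrono::nanoseconds timed{0};
    for (int i = 0; i < 100; ++i) {
        // "start" callback: untimed per-iteration setup (the re-scramble).
        std::shuffle(data.begin(), data.end(), rng);
        // "op" callback: the only part that contributes to the timing.
        auto t0 = std::chrono::steady_clock::now();
        std::sort(data.begin(), data.end());
        timed += std::chrono::steady_clock::now() - t0;
        // "end" callback would run here, also untimed (e.g. verification).
    }
    std::printf("sort only: %.1f ns per run\n", timed.count() / 100.0);
}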
