Some suggestions about xunit-performance (closed)

kouvel commented on June 2, 2024
Some suggestions

Comments (6)

kouvel commented on June 2, 2024

CC @adamsitnik

adamsitnik commented on June 2, 2024

@kouvel here is how to get this with BenchmarkDotNet:

In BDN we have a concept called Jobs, which you use to configure benchmarks. You can define a Job with attributes, with our fluent API, or with our object-initialization API.

1. Provide a way for a test to run for a fixed amount of time rather than for a fixed number of iterations

You can specify the number of iterations and the number of invocations per iteration, or use our heuristics to get stable results without providing any values (the default mode). You can also specify the minimum iteration time, which the heuristics will respect.

DefaultConfig.Instance
    .With(Job.Default
        .WithMinIterationTime(TimeInterval.FromSeconds(1))); // e.g. at least 1 second per iteration (the original snippet omitted the value)
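
The same kind of Job configuration can also be written with attributes or with the object-initialization style mentioned above. A minimal sketch for completeness; the SimpleJob parameter names and the Run/Accuracy characteristic properties shown here may differ slightly between BenchmarkDotNet versions:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Horology;
using BenchmarkDotNet.Jobs;

// Attribute style: pin the run counts directly on the benchmark class.
[SimpleJob(launchCount: 1, warmupCount: 5, targetCount: 20)]
public class AttributeStyleBenchmarks
{
    [Benchmark]
    public int Sum()
    {
        int sum = 0;
        for (int i = 0; i < 1000; i++) sum += i;
        return sum;
    }
}

// Object-initialization style: build a Job by setting its characteristics.
public static class JobFactory
{
    public static Job Create() => new Job
    {
        Run = { LaunchCount = 1, WarmupCount = 5, TargetCount = 20 },
        Accuracy = { MinIterationTime = TimeInterval.FromMilliseconds(250) }
    };
}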

2. Provide another option to just have the test do the measurement and report a result as a time or score,

What do you mean by score? If you mark one of the Benchmarks/Jobs as the baseline, then all the other benchmarks are scaled against it:

public class BaselineExample
{
    // BaseTime is not defined in the original snippet; a 100 ms sleep is assumed here.
    private const int BaseTime = 100;

    [Benchmark(Baseline = true)]
    public void Baseline() => Thread.Sleep(BaseTime);

    [Benchmark]
    public void Slow() => Thread.Sleep(BaseTime * 2);

    [Benchmark]
    public void Fast() => Thread.Sleep(BaseTime / 2);
}

You can set the time units in the following way:

var config = ManualConfig.Create(config: DefaultConfig.Instance);
config.Set(new Reports.SummaryStyle
{
	PrintUnitsInHeader = true,
	PrintUnitsInContent = false,
	TimeUnit = TimeUnit.Microsecond,
	SizeUnit = Columns.SizeUnit.B
});

2. …for some extra flexibility including nontrivial setup/teardown and custom warmup/test. Utilities for taking measurements and doing calculations may be provided in libraries.

We support GlobalSetup/GlobalCleanup (executed once) and IterationSetup/IterationCleanup (executed for every iteration); see the docs:

using System;
using BenchmarkDotNet.Attributes;

public class SetupAndCleanupExample
{
  private int setupCounter;
  private int cleanupCounter;

  [IterationSetup]
  public void IterationSetup() => Console.WriteLine("// " + "IterationSetup" + " (" + ++setupCounter + ")");

  [IterationCleanup]
  public void IterationCleanup() => Console.WriteLine("// " + "IterationCleanup" + " (" + ++cleanupCounter + ")");

  [GlobalSetup]
  public void GlobalSetup() => Console.WriteLine("// " + "GlobalSetup");

  [GlobalCleanup]
  public void GlobalCleanup() => Console.WriteLine("// " + "GlobalCleanup");

  [Benchmark]
  public void Benchmark() => Console.WriteLine("// " + "Benchmark");
}

3. Not sure if there is a way to disable ETW event collection.

We have some ETW diagnosers: Inlining, HardwareCounters, and TailCall. They are not enabled by default. Moreover, diagnosers that introduce overhead run the benchmarks once more, gather their data, and exclude those runs from the timing results (they are skewed by the overhead).
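
Opting in is an explicit configuration step. A minimal sketch, assuming the InliningDiagnoser type from the BenchmarkDotNet.Diagnostics.Windows package, the older With(...) config extension, and a placeholder MyBenchmarks class:

using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnostics.Windows;
using BenchmarkDotNet.Running;

public static class EtwOptInDemo
{
    public static void Main()
    {
        // Without the .With(...) call below, no ETW diagnoser runs and no ETW events are collected.
        var config = ManualConfig.Create(DefaultConfig.Instance)
            .With(new InliningDiagnoser());

        BenchmarkRunner.Run<MyBenchmarks>(config); // MyBenchmarks is a placeholder benchmark class
    }
}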

Btw, I am sure it's possible today with xunit-performance too.

4. Provide error in % (maybe standard error) so that it's easy to tell at a glance which tests were noisy

BenchmarkDotNet provides:

  • Min, Lower Fence, Q1, Median, Mean, Q3, Upper Fence, Max, Interquartile Range, Outliers
  • Standard Error, Variance, Standard Deviation
  • Skewness, Kurtosis
  • Confidence Interval (Mean, Error, Level, Margin, Lower, Upper)
  • Percentiles (P0, P25, P50, P67, P80, P85, P90, P95, P100)

It also removes outliers by default.

@AndreyAkinshin (the main author of BDN) has a PhD and a huge interest in statistics ;)

5. Provide the ability to run multiple iterations of the test. Even better would be a way to specify a minimum and maximum number of iterations over the whole test and a target standard error %, and have the harness run iterations until the error is below the target

That's our default mode (we run the benchmarks until our heuristic is happy with the results). You can configure the accuracy by using some of the Job extension methods; a sketch follows the list:

  • WithMaxRelativeError
  • WithMaxAbsoluteError
  • WithMinIterationTime
  • WithMinInvokeCount
  • WithEvaluateOverhead
  • WithRemoveOutliers
  • WithAnalyzeLaunchVariance
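
A minimal sketch of combining a few of these, assuming the older With(...) config extension, the BenchmarkDotNet.Horology time types, and a placeholder MyBenchmarks class; the 5% target error and 250 ms minimum iteration time are illustrative values, not defaults:

using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Horology;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

public static class AccuracyDemo
{
    public static void Main()
    {
        var job = Job.Default
            .WithMaxRelativeError(0.05)                               // keep iterating until the relative error drops below 5%
            .WithMinIterationTime(TimeInterval.FromMilliseconds(250)) // each iteration should take at least 250 ms
            .WithRemoveOutliers(true);                                // drop statistical outliers before reporting

        BenchmarkRunner.Run<MyBenchmarks>(DefaultConfig.Instance.With(job));
    }
}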

6. I assume a GC.Collect is done between test invocations,

BenchmarkDotNet forces GC.Collect + GC.WaitForPendingFinalizers + GC.Collect for every iteration. This behavior can be disabled by calling Job.WithGcForce(false), as sketched below.
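
A minimal sketch of turning the forced collections off for a job (again assuming the older With(...) config extension and a placeholder MyBenchmarks class):

using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

public static class GcForceDemo
{
    public static void Main()
    {
        // Skip the forced GC.Collect + GC.WaitForPendingFinalizers between iterations.
        var job = Job.Default.WithGcForce(false);

        BenchmarkRunner.Run<MyBenchmarks>(DefaultConfig.Instance.With(job));
    }
}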

Moreover, by default we run every benchmark in a new, dedicated process, so the self-tuning nature of the GC does not affect the final results, and the order in which benchmarks execute (or any other side effect) does not matter thanks to the process isolation.

@kouvel please let me know if you have some more questions!

kouvel commented on June 2, 2024

Sounds very interesting, thanks for the info!

kouvel commented on June 2, 2024

What do you mean by score?

I meant some opaque value where higher is better. It could just be iterations per unit of time, or sometimes a test may measure several things and produce one value by weighting those measurements. For instance, in a test there can be two types of operations happening at the same time (the intent is to test both together), but the perf of one may be more important than the other. When testing a reader-writer lock with many readers and few writers, we'd want to see that readers are getting most of the locks but that writers are also making progress.

If you mark one of the Benchmarks/Jobs as the baseline then all other benchmarks are going to be scaled

This could be useful, but I was hoping for some more flexibility than that example shows. Probably there's already a way, though.

adamsitnik commented on June 2, 2024

I meant some opaque value where higher is better. It could just be iterations per unit of time

We display Operations/s by default.
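
If the Op/s column is not part of the summary in a given setup, it can also be requested explicitly. A minimal sketch, assuming the StatisticColumn.OperationsPerSecond column, the older With(...) config extension, and a placeholder MyBenchmarks class:

using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Running;

public static class OpsColumnDemo
{
    public static void Main()
    {
        // Add an explicit "Op/s" column to the summary table.
        var config = ManualConfig.Create(DefaultConfig.Instance)
            .With(StatisticColumn.OperationsPerSecond);

        BenchmarkRunner.Run<MyBenchmarks>(config);
    }
}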

kouvel commented on June 2, 2024

I see. Another thought: could a test provide multiple results from one measurement sequence? In the RWLock test above, for example, it could produce readers/s and writers/s that could be tracked independently instead of being combined into a single score. Maybe they could appear as child tests of the parent.
