
Comments (7)

illuhad commented on May 27, 2024

I agree with your analysis. Your proposal is much closer to the classical STREAM workload. For measuring throughput, I think we should rather follow STREAM.

This is of course quite a large number, and a much larger size than many other benchmarks can support (if, e.g., it is used as the range in both dimensions of a 2D buffer). One possible solution would be to always multiply the size by some fixed factor, e.g. 1024. However, I generally feel that the mapping of the --size parameter to the actual workload is already too arbitrary. It might be convenient for running lots of benchmarks in batch, but in reality I think we'll have to hand-tune these values (as well as any additional parameters a benchmark might have) for each individual platform anyway. I'm thus wondering whether individual parameters for each benchmark would make more sense (e.g. a buffer-size= for this one).
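
For illustration, a small sketch of how a run script or wrapper might map a generic size to a benchmark-specific parameter. The buffer-size= name and the factor of 1024 are taken from the suggestion above; the mapping function itself is a hypothetical assumption, not anything in sycl-bench today:

```python
# Hypothetical sketch: per-benchmark parameters instead of one global --size meaning.
def benchmark_args(name, size):
    if name == "dram":
        # DRAM benchmark: dedicated flag, derived here from the generic size by a
        # fixed factor so batch runs still work without hand-tuning every value.
        return [f"--buffer-size={size * 1024}"]
    # Default: pass --size through unchanged (e.g. used as a 2D range).
    return [f"--size={size}"]
```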

The run-suite script starts with a very small --size parameter (I think something like 64) and then doubles the problem size until the measured runtime reaches a minimum determined by the test profile in run-suite. I imagine this could already work well for many cases. If it doesn't here, the test profiles can override the tested problem sizes for individual benchmarks; we could simply edit the test profile to start with larger values if necessary.
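
For reference, a minimal sketch of that doubling logic. This is not the actual run-suite code: the --size flag comes from the discussion, while the wall-clock timing around the whole process and the other parameters are simplifying assumptions:

```python
# Minimal sketch of the size-doubling idea (illustrative only).
import subprocess
import time

def find_problem_size(benchmark, min_runtime_s, start_size=64, max_size=2**30):
    size = start_size
    while size < max_size:
        start = time.perf_counter()
        subprocess.run([benchmark, f"--size={size}"], check=True)
        # Stop once a single run takes at least the profile-defined minimum time.
        if time.perf_counter() - start >= min_runtime_s:
            break
        size *= 2
    return size
```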

> In the same vein, having the ability to compute custom metrics would also be great. For example, it would be cool if this benchmark could actually print its achieved throughput, instead of having to manually compute it.

Yes, I agree. The main challenge is that the benchmarks are individual applications that don't know which results the other benchmarks will emit to the CSV file. If the benchmarks emit different columns, correct formatting is impossible. A solution could be to add a standard field that is present in the result of every benchmark, say metric, which benchmarks can choose to implement or report as N/A, similar to verification.
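
One way this could look, sketched here with made-up column names (the real sycl-bench output format may differ; the function and column names are assumptions for illustration only):

```python
# Hypothetical illustration of a fixed column set shared by all benchmarks,
# where benchmarks without a custom metric simply report "N/A".
import csv
import sys

COLUMNS = ["benchmark", "problem-size", "run-time-mean", "metric-name", "metric-value"]

def emit_row(name, size, runtime_s, metric_name="N/A", metric_value="N/A"):
    writer = csv.DictWriter(sys.stdout, fieldnames=COLUMNS)
    writer.writerow({
        "benchmark": name,
        "problem-size": size,
        "run-time-mean": runtime_s,
        "metric-name": metric_name,
        "metric-value": metric_value,
    })

# The DRAM benchmark could fill in its throughput, others fall back to N/A:
#   emit_row("dram", 2**28, 0.42, "throughput-GiB/s", 119.3)
#   emit_row("matmul", 1024, 0.21)
```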


psalz commented on May 27, 2024

> Yes, I agree. The main challenge is that the benchmarks are individual applications that don't know which results the other benchmarks will emit to the CSV file. If the benchmarks emit different columns, correct formatting is impossible. A solution could be to add a standard field that is present in the result of every benchmark, say metric, which benchmarks can choose to implement or report as N/A, similar to verification.

That could work, yes. It is of course a bit limiting and not very descriptive. Hypothetically, if the various benchmarks were to emit additional metrics/columns -- how useful is it really to have all results for all benchmarks in a single file? Couldn't we just generate separate CSVs for each executable?

Additionally, to compute e.g. throughput, the benchmark would need access to the timing results, which, given the current plugin/hook architecture, seems a bit messy. Maybe timing should be considered a "core" functionality of the framework instead, much like verification?


illuhad commented on May 27, 2024

> That could work, yes. It is of course a bit limiting and not very descriptive. Hypothetically, if the various benchmarks were to emit additional metrics/columns -- how useful is it really to have all results for all benchmarks in a single file? Couldn't we just generate separate CSVs for each executable?

The run-suite script invokes each benchmark a dozen times with different combinations of problem size and local size, so we would end up with a lot of very small CSV files, each with only one or a couple of rows. For any meaningful analysis, those files would need to be aggregated into one file again anyway. It could be done, but I don't think it would be pretty.

> Additionally, to compute e.g. throughput, the benchmark would need access to the timing results, which, given the current plugin/hook architecture, seems a bit messy. Maybe timing should be considered a "core" functionality of the framework instead, much like verification?

Hm, good point. There may be more metrics than just bandwidth that require access to one or multiple measurements (e.g. flops/s/watt, once we can measure power consumption). This makes me think that metrics which are transformations of actual measurements might be better implemented in a post-processing step (e.g. in run-suite), for maximum flexibility?
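
As a toy example of that post-processing idea: derive metrics from measured columns in a later step rather than inside each benchmark. The column names ("flop-count", "run-time-mean", "power-mean") are invented; nothing here is actual run-suite code:

```python
# Sketch only: compute derived metrics from already-recorded measurement columns.
def derived_metrics(row):
    flops = float(row["flop-count"])
    seconds = float(row["run-time-mean"])
    watts = float(row["power-mean"])
    return {
        "gflops-per-s": flops / seconds / 1e9,
        "gflops-per-s-per-watt": flops / seconds / watts / 1e9,
    }
```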


sohansharma commented on May 27, 2024

> The run-suite script invokes each benchmark a dozen times with different combinations of problem size and local size, so we would end up with a lot of very small CSV files, each with only one or a couple of rows. For any meaningful analysis, those files would need to be aggregated into one file again anyway. It could be done, but I don't think it would be pretty.

Do we need so many different configurations? If we can restrict their number, keeping everything in one CSV file would not be that messy. For example, we do not need to vary both the problem size and the local size for every experiment.


psalz commented on May 27, 2024

> Hm, good point. There may be more metrics than just bandwidth that require access to one or multiple measurements (e.g. flops/s/watt, once we can measure power consumption). This makes me think that metrics which are transformations of actual measurements might be better implemented in a post-processing step (e.g. in run-suite), for maximum flexibility?

The problem I see with that approach is that, e.g. in this case, for DRAM bandwidth we also need to know the number of bytes copied. If we go with the mapping from --size to 1D/2D/3D ranges as you proposed on my PR, the number of bytes might depend on that parameter, on the dimensionality of the kernel, and on the data type being copied. So either we have the benchmark log this information somewhere (coming back to the issue of per-benchmark metrics), or we hardcode this somehow into run-suite, which seems like a bad idea to me.
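
To make that dependence concrete, a rough sketch of the arithmetic. The read-plus-write factor of two assumes a simple copy kernel, and the function names are illustrative, not part of sycl-bench:

```python
# Rough sketch: the bytes moved depend on --size, the kernel's dimensionality
# and the element type, which only the benchmark itself knows for certain.
def bytes_copied(size, dims, dtype_bytes, accesses_per_element=2):
    # assuming a copy kernel that reads and writes each element once
    return (size ** dims) * dtype_bytes * accesses_per_element

def bandwidth_gib_per_s(size, dims, dtype_bytes, runtime_s):
    return bytes_copied(size, dims, dtype_bytes) / runtime_s / 2**30

# e.g. a 2D float copy with --size=16384 that takes 50 ms:
#   bandwidth_gib_per_s(16384, 2, 4, 0.05)  ->  40.0 GiB/s
```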


psalz commented on May 27, 2024

> The run-suite script invokes each benchmark a dozen times with different combinations of problem size and local size, so we would end up with a lot of very small CSV files, each with only one or a couple of rows. For any meaningful analysis, those files would need to be aggregated into one file again anyway. It could be done, but I don't think it would be pretty.

Since the script knows which runs belong to the same benchmark, it could just concatenate those outputs into a single CSV. Then we'd have one file per benchmark - that sounds manageable to me.
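
A minimal sketch of that concatenation step, assuming each run writes its own small CSV with a common header row (the file layout, glob pattern and names are assumptions, not the actual run-suite behaviour):

```python
# Minimal sketch: merge the per-run CSVs of one benchmark into a single file.
import csv
import glob

def merge_runs(pattern, out_path):
    header, rows = None, []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            rows.extend(reader)
    if header is None:
        return  # no runs found for this benchmark
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)

# e.g. merge_runs("results/dram-*.csv", "results/dram.csv")
```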


psalz commented on May 27, 2024

Closing this as we have merged the simpler version of the DRAM benchmark in #26 and added throughput metrics in #30. Thanks everyone!

