Comments (7)
I agree with your analysis. Your proposal is much closer to the classical STREAM workload. For measuring throughput, I think we should rather follow STREAM.
This is of course quite a large number, and a much larger size than many other benchmarks can support (if, e.g., it is used as the range in both dimensions of a 2D buffer). One possible solution would be to always multiply the size by some fixed factor, e.g. 1024. However, I generally feel that the mapping of the `--size` parameter to the actual workload is too arbitrary already. It might be convenient for running lots of benchmarks in batch, but in reality I think we'll have to hand-tune these values (as well as any additional parameters a benchmark might have) for each individual platform anyway. I'm thus wondering whether individual parameters for each benchmark would make more sense (e.g. a `buffer-size=` for this one).
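As an illustration only (the flag names and the factor of 1024 are hypothetical, not sycl-bench's actual interface), a dedicated per-benchmark parameter with a fallback to a scaled `--size` could look like:

```python
import argparse

# Hypothetical sketch: a dedicated --buffer-size parameter that falls back
# to scaling the generic --size by a fixed factor (1024), as discussed above.
def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--size", type=int, default=64)
    parser.add_argument("--buffer-size", type=int, default=None)
    args = parser.parse_args(argv)
    # If no explicit buffer size is given, derive it from --size.
    if args.buffer_size is None:
        args.buffer_size = args.size * 1024
    return args
```

A benchmark-specific flag would then take precedence, while batch runs driven only by `--size` keep working.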
The `run-suite` script starts with a very small `--size` parameter (I think something like 64) and then doubles the problem size until the runtime reaches a minimum that is determined by the test profile in `run-suite`. I imagine this could already work well for many cases. If this is not the case here, the test profiles can override the tested problem sizes for individual benchmarks. We could simply edit the test profile to start with larger values if necessary.
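A minimal sketch of that doubling scheme (function and parameter names are illustrative, not the actual `run-suite` code):

```python
def find_problem_size(run_benchmark, min_runtime, start_size=64, max_size=2**30):
    """Double the problem size until the benchmark runs long enough.

    run_benchmark(size) -> runtime in seconds, supplied by the caller;
    the starting size of 64 mirrors the value mentioned above.
    """
    size = start_size
    while size <= max_size:
        if run_benchmark(size) >= min_runtime:
            return size
        size *= 2
    return max_size
```

A test profile overriding the start value would simply pass a larger `start_size`.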
In the same vein, having the ability to compute custom metrics would also be great. For example, it would be cool if this benchmark could actually print its achieved throughput, instead of us having to compute it manually.
Yes, I agree. The main challenge is that the benchmarks are individual applications that don't know which results the other benchmarks will emit to the CSV file. If the benchmarks emit different columns to the CSV, this prevents correct formatting. A solution could be adding a standard field that will be present in the result of every benchmark, say `metric`, and benchmarks can choose to implement this or return `N/A`, similar to the verification.
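For illustration, a sketch of what such a fixed schema with a `metric` fallback column could look like (the column names here are made up, not the actual sycl-bench CSV format):

```python
import csv
import io

# Illustrative fixed CSV schema: every benchmark emits a "metric" column,
# defaulting to "N/A" when it implements no custom metric.
FIELDS = ["benchmark", "run-time-mean", "metric"]

def emit_row(writer, benchmark, runtime, metric=None):
    writer.writerow({"benchmark": benchmark,
                     "run-time-mean": runtime,
                     "metric": metric if metric is not None else "N/A"})

buf = io.StringIO()
w = csv.DictWriter(buf, fieldnames=FIELDS)
w.writeheader()
emit_row(w, "DRAM", 0.012, "83.3 GiB/s")
emit_row(w, "nbody", 0.250)  # no custom metric -> N/A
```

With a fixed schema, every benchmark produces the same columns and the combined CSV stays well-formed.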
from sycl-bench.
> Yes, I agree. The main challenge is that the benchmarks are individual applications that don't know which results the other benchmarks will emit to the CSV file. If the benchmarks emit different columns to the CSV, this prevents correct formatting. A solution could be adding a standard field that will be present in the result of every benchmark, say `metric`, and benchmarks can choose to implement this or return `N/A`, similar to the verification.
That could work, yes. It is of course a bit limiting and not very descriptive. Hypothetically, if the various benchmarks were to emit additional metrics/columns, how useful is it really to have all results for all benchmarks in a single file? Couldn't we just generate separate CSVs for each executable?
Additionally, to compute e.g. throughput, the benchmark would need to have access to the timing results, which, given the current plugin/hook architecture, seems a bit messy. Maybe timing should be considered a "core" functionality of the framework instead, much like verification?
from sycl-bench.
> That could work, yes. It is of course a bit limiting and not very descriptive. Hypothetically, if the various benchmarks were to emit additional metrics/columns, how useful is it really to have all results for all benchmarks in a single file? Couldn't we just generate separate CSVs for each executable?
The `run-suite` script invokes each benchmark a dozen times with different combinations of problem size and local size, so we would have a lot of very small CSV files, each with only one or a couple of rows. For any meaningful analysis, those files would need to be aggregated into one file again anyway. It could be done, but I don't think it would be pretty.
> Additionally, to compute e.g. throughput, the benchmark would need to have access to the timing results, which, given the current plugin/hook architecture, seems a bit messy. Maybe timing should be considered a "core" functionality of the framework instead, much like verification?
Hm, good point. There may be more metrics than just bandwidth that require access to one or multiple measurements (e.g., once we can measure power consumption, flops/s/watt). This makes me think that metrics which represent transformations of actual measurements might be better implemented in a post-processing step (e.g. in `run-suite`) for maximum flexibility?
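A rough sketch of what such a post-processing step could look like, assuming the raw CSV already carries the needed measurements (all column names here are hypothetical):

```python
# Hypothetical post-processing: derive metrics from raw measurement columns
# after all runs have finished, instead of inside the benchmark binaries.
def derive_metrics(row):
    t = float(row["run-time-mean"])        # assumed runtime column, seconds
    out = dict(row)
    if "bytes-copied" in row:              # bandwidth from bytes + time
        out["bandwidth-gib-s"] = int(row["bytes-copied"]) / t / 2**30
    if "flops" in row and "watts" in row:  # e.g. flops/s/watt, once power
        out["flops-per-watt"] = float(row["flops"]) / t / float(row["watts"])
    return out
```

The framework would only log raw measurements; every derived metric becomes a pure function over a CSV row.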
from sycl-bench.
> The `run-suite` script invokes each benchmark a dozen times with different combinations of problem size and local size, so we would have a lot of very small CSV files, each with only one or a couple of rows. For any meaningful analysis, those files would need to be aggregated into one file again anyway. It could be done, but I don't think it would be pretty.
Do we need so many different configurations? If we can restrict their number, keeping everything in one CSV file would not be that messy. For example, we do not need to vary both the problem size and the local size for every experiment.
from sycl-bench.
> Hm, good point. There may be more metrics than just bandwidth that require access to one or multiple measurements (e.g., once we can measure power consumption, flops/s/watt). This makes me think that metrics which represent transformations of actual measurements might be better implemented in a post-processing step (e.g. in `run-suite`) for maximum flexibility?
The problem I see with that approach is that, e.g. in this case for DRAM bandwidth, we also need to know the number of bytes copied. If we go with the mapping from `--size` to 1D/2D/3D ranges as you proposed on my PR, the number of bytes might depend on that parameter, the dimensionality of the kernel, and the data type being copied. So either we have the benchmark log this information somewhere (coming back to the issue of per-benchmark metrics), or we hardcode this somehow into `run-suite`, which seems like a bad idea to me.
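To illustrate the dependency, a hypothetical byte-count calculation for a copy kernel (the factor of 2 assumes one read plus one write per element, and using `--size` as the range in every dimension is an assumption from the proposed mapping):

```python
# Only the benchmark knows all three inputs: the --size mapping, the kernel
# dimensionality and the element type. run-suite cannot derive this alone.
def bytes_copied(size, dims, elem_bytes):
    elements = size ** dims   # --size used as the range in each dimension
    return 2 * elements * elem_bytes  # one read + one write per element
```

The same benchmark binary would report, say, very different byte counts for a 1D `float` copy versus a 3D `double` copy at the same `--size`, which is exactly why hardcoding it in `run-suite` is fragile.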
from sycl-bench.
> The `run-suite` script invokes each benchmark a dozen times with different combinations of problem size and local size, so we would have a lot of very small CSV files, each with only one or a couple of rows. For any meaningful analysis, those files would need to be aggregated into one file again anyway. It could be done, but I don't think it would be pretty.
Since the script knows which runs belong to the same benchmark, it could just concatenate those outputs into a single CSV. Then we'd have one file per benchmark - that sounds manageable to me.
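A minimal sketch of that per-benchmark concatenation using Python's `csv` module (the file layout and column set are assumed, not taken from `run-suite`):

```python
import csv

# Merge the CSV outputs of all runs of one benchmark into a single file,
# writing the header from the first run only.
def concat_runs(run_files, out_file):
    with open(out_file, "w", newline="") as out:
        writer = None
        for path in run_files:
            with open(path, newline="") as f:
                reader = csv.DictReader(f)
                if writer is None:
                    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
                    writer.writeheader()
                for row in reader:
                    writer.writerow(row)
```

Since all runs of one benchmark share the same columns, no schema reconciliation is needed at this level.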
from sycl-bench.
Closing this as we have merged the simpler version of the DRAM benchmark in #26 and added throughput metrics in #30. Thanks everyone!
from sycl-bench.