Comments (3)
@rchen20 Updated the example above. The lambda argument itself is `m_worksum`; the target for the final reduction result is `worksum`. These should be different. `m_worksum` is the thread-local value to be used before the actual reduction work is done later.
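To make the distinction concrete, here is a minimal sketch in plain C++. It does not use RAJA: `forall_sum` and `sum_indices` are hypothetical names invented for illustration, and the "workers" run one after another rather than in parallel. It only shows the shape being described, where the lambda touches a local value (`m_worksum`) and the target (`worksum`) is written once, in a final combine step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (NOT RAJA's API): calls body(i, local) for i in [0, n),
// split into num_workers contiguous chunks. Each chunk accumulates into its
// own local value; the locals are combined into *target only at the end.
template <typename T, typename Body>
void forall_sum(std::size_t n, std::size_t num_workers, T* target, Body body)
{
  assert(num_workers > 0);
  std::vector<T> locals(num_workers, T{});  // one private value per worker
  const std::size_t chunk = (n + num_workers - 1) / num_workers;

  // Workers are simulated sequentially to keep the sketch deterministic.
  for (std::size_t w = 0; w < num_workers; ++w) {
    const std::size_t begin = std::min(n, w * chunk);
    const std::size_t end = std::min(n, begin + chunk);
    for (std::size_t i = begin; i < end; ++i) {
      body(i, locals[w]);  // the lambda only ever sees the local value
    }
  }

  // Final reduction: fold every local value into the user's target.
  for (const T& local : locals) {
    *target += local;
  }
}

// Usage: sums 0 + 1 + ... + (n - 1) into worksum.
inline long sum_indices(std::size_t n, std::size_t num_workers)
{
  long worksum = 0;  // target for the final reduction result
  forall_sum(n, num_workers, &worksum,
             [](std::size_t i, long& m_worksum) {  // m_worksum: local value
               m_worksum += static_cast<long>(i);
             });
  return worksum;
}
```

The lambda never names `worksum`, which is why the two identifiers should differ in the example.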
from raja.
Hey @mdavis36, in the implicit lambda case, are there typos where `data_t m_red` ought to be `data_t & worksum`? If so, is this implying that we need to pass the reduced data to each lambda, regardless of whether that lambda actually performs a reduction?
from raja.
@mdavis36 if I'm reading the above correctly, would this essentially collapse all the various reduction types (e.g. `RAJA::ReduceSum<RAJA::seq_reduce, int>`, `RAJA::ReduceSum<RAJA::omp_reduce_ordered, int>`, `RAJA::ReduceSum<RAJA::cuda_reduce, int>`, etc.) down to one single type? So you would only need one data type for all your different execution policies?
If so, I'd just like to say that I'd be very much in favor of such a feature, as the forall loops of mine that have those operations are the only ones I can't abstract away to a single forall abstraction using something like the `RAJA::expt::dynamic_forall` feature for all the execution policies I support in my libraries/apps (CPU, OpenMP, CUDA, HIP, etc.).
Unfortunately, things like `std::variant` or `std::visit` are still not supported on the device, at least to my current knowledge, which would otherwise have allowed a simple-ish solution to the above.
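As a rough illustration of why a policy-independent reduction target would help here, the sketch below uses plain C++ with made-up names (`Policy`, `dynamic_forall_sum`, `count_even`); none of it is RAJA's API, and both "policies" run on the host. The point is only that when the target is a plain `int` and the lambda takes a plain `int&`, one lambda type works for every policy chosen at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical policy tags standing in for different execution backends.
enum class Policy { Sequential, Chunked };

// Hypothetical runtime dispatcher (NOT RAJA's dynamic_forall). The reduction
// target is a plain int*, so its type does not depend on the policy.
template <typename Body>
void dynamic_forall_sum(Policy policy, std::size_t n, int* target, Body body)
{
  switch (policy) {
    case Policy::Sequential: {
      int local = 0;
      for (std::size_t i = 0; i < n; ++i) body(i, local);
      *target += local;
      return;
    }
    case Policy::Chunked: {
      // Two chunks, each with its own local value, combined at the end.
      int locals[2] = {0, 0};
      const std::size_t mid = n / 2;
      for (std::size_t i = 0; i < mid; ++i) body(i, locals[0]);
      for (std::size_t i = mid; i < n; ++i) body(i, locals[1]);
      *target += locals[0] + locals[1];
      return;
    }
  }
  throw std::invalid_argument("unknown policy");
}

// Usage: the same lambda and the same int target for every policy.
inline int count_even(Policy policy, std::size_t n)
{
  int evens = 0;
  dynamic_forall_sum(policy, n, &evens,
                     [](std::size_t i, int& local) {
                       if (i % 2 == 0) ++local;
                     });
  return evens;
}
```

With per-policy reducer object types, `count_even` would need a separate reducer declaration for each branch, which is the obstacle described above.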
from raja.