Comments (5)
The purpose of dst is to preserve the tail elements. Why does this only support the m1 type?
The scalar operand only uses the first element. Why does this only support the m1 type?
I guess it's for historical reasons, because RVV 0.8 did not support fractional LMUL.
I prefer keeping the original design. In your case, we can use `int8_t vmv_x_s_i8m1_i8 (vint8m1_t src);` to extract the first element from the m1 vector (the reduction result), and use `vint8m8_t vmv_s_x_i8m8 (vint8m8_t dst, int8_t src);` to insert the reduction result into any vector type you want while keeping the tail elements.
Don't worry about the performance; ideally the compiler can optimize this for you.
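The two-step pattern described above can be sketched with a plain C scalar model. This is not the intrinsics API: the `model_*` names and struct types are hypothetical stand-ins, VLEN = 128 bits is an assumption, and the reduction signature (dst, vector, scalar, vl) only mirrors the shape discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model only. VLEN = 128 bits is an assumption; the structs are
   hypothetical stand-ins for vint8m1_t and vint8m8_t. */
enum { VLENB = 16 };

typedef struct { int8_t e[VLENB]; } model_i8m1;
typedef struct { int8_t e[8 * VLENB]; } model_i8m8;

/* Models vmv_x_s_i8m1_i8: read element 0 of an m1 value. */
static int8_t model_vmv_x_s_i8m1(model_i8m1 src) {
    return src.e[0];
}

/* Models vmv_s_x_i8m8: write element 0, keep every other element of dst. */
static model_i8m8 model_vmv_s_x_i8m8(model_i8m8 dst, int8_t src) {
    dst.e[0] = src;
    return dst;
}

/* Models a sum reduction with an m1 destination:
   element 0 = scalar.e[0] + sum of src.e[0..vl), other elements come from dst. */
static model_i8m1 model_vredsum_i8m8_i8m1(model_i8m1 dst, model_i8m8 src,
                                          model_i8m1 scalar, size_t vl) {
    assert(vl <= (size_t)(8 * VLENB));
    int8_t acc = scalar.e[0];
    for (size_t i = 0; i < vl; ++i) {
        acc = (int8_t)(acc + src.e[i]);
    }
    dst.e[0] = acc;
    return dst;
}

/* Reduce into m1, then move the result into element 0 of an m8 value
   whose remaining elements must be preserved. */
static model_i8m8 reduce_into_m8(model_i8m8 keep, model_i8m8 src,
                                 int8_t init, size_t vl) {
    model_i8m1 zero = {{0}};
    model_i8m1 scalar = zero;
    scalar.e[0] = init;
    model_i8m1 r = model_vredsum_i8m8_i8m1(zero, src, scalar, vl);
    return model_vmv_s_x_i8m8(keep, model_vmv_x_s_i8m1(r));
}
```

In the model, `reduce_into_m8` leaves every element of `keep` other than element 0 untouched, which is the tail-preserving behavior the original question asked for.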
from rvv-intrinsic-doc.
I prefer to keep the m1 type for the dst and scalar operands. In my opinion, this most closely matches the spec:
The scalar input and output operands are held in element 0 of a single vector register, not a vector register group, so any vector register can be the scalar source or destination of a vector reduction regardless of LMUL setting.
As an aside, another way to access the result of the reduction without loss of decoupling between scalar and vector instruction pipelines (when applicable) is just to use `vrgather.vi dst, src, 0`.
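As a rough illustration of the `vrgather.vi dst, src, 0` idea, the scalar model below copies element 0 of the source into the first `vl` elements of the destination, so the reduction result never leaves the vector domain. The function name is hypothetical, plain arrays stand in for vector registers, and leaving elements at or past `vl` unchanged is an assumption about the tail policy.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of vrgather.vi dst, src, 0 (hypothetical helper name).
   dst[i] = src[0] for i < vl; elements at or past vl are left as they were,
   which assumes a tail-undisturbed policy. */
static void model_vrgather_vi0(int8_t *dst, const int8_t *src, size_t vl) {
    const int8_t first = src[0];   /* the reduction result lives in element 0 */
    for (size_t i = 0; i < vl; ++i) {
        dst[i] = first;
    }
}
```

Because only element 0 of the source is read, the source can be the single register that holds the reduction result while the destination can have any length.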
from rvv-intrinsic-doc.
Just checking in on this open issue. I'm still in favor of the current approach, using m1 for the dst and scalar operands.
I agree with @zakk0610 that this complicates the use-case of writing the reduction to the first element in an LMUL > 1 group. In principle, a compiler could fuse the scalar <-> vector copies by intelligently allocating the reduction's destination register. However, I don't think this is a terribly common use-case, so I suspect that implementers will not prioritize this optimization. If users complain, we could always add additional intrinsics following @HanKuanChen's suggestions.
Alternatively, we could implement register group fission/fusion, e.g.,
vint8m1_t fission_int8m2_m1_0 (vint8m2_t); // return zeroth vector register in group
vint8m1_t fission_int8m2_m1_1 (vint8m2_t); // return first vector register in group
which would make the copies more obvious to both the user and the compiler. I think I proposed something like this many months ago, but was not in favor of it because of the "SLEN issue", which is no longer an issue.
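A minimal scalar model of the proposed fission, plus a matching fusion, might look like the following. The `model_*` names and struct types are hypothetical, VLEN = 128 bits is an assumption, and the contiguous element layout assumes SLEN = VLEN, which is the situation the comment refers to when it says the SLEN issue is gone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VLENB = 16 };  /* assumed VLEN = 128 bits */

/* Hypothetical stand-ins for vint8m1_t and vint8m2_t. */
typedef struct { int8_t e[VLENB]; } model_i8m1;
typedef struct { int8_t e[2 * VLENB]; } model_i8m2;

/* Models fission_int8m2_m1_<idx>: return register idx of the m2 group. */
static model_i8m1 model_fission_int8m2_m1(model_i8m2 group, size_t idx) {
    assert(idx < 2);
    model_i8m1 r;
    memcpy(r.e, group.e + idx * VLENB, VLENB);
    return r;
}

/* Models the inverse: build an m2 group from two m1 registers. */
static model_i8m2 model_fusion_int8m1_m2(model_i8m1 lo, model_i8m1 hi) {
    model_i8m2 g;
    memcpy(g.e, lo.e, VLENB);
    memcpy(g.e + VLENB, hi.e, VLENB);
    return g;
}
```

With these two operations, a reduction result held in an m1 value could be fused back as the zeroth register of a group, leaving the other register as it was.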
from rvv-intrinsic-doc.
This is a good feature to consider. Let us settle on a version first and revisit this in the future.
from rvv-intrinsic-doc.
How about extending `vget_u8m1` to act as `vlmul_ext_u8m1` when receiving mf types, and the complement for `vset_u8m1`? (Don't forget to include identity transforms.)
Then the caller can convert anything to m1 and back around the reduction, in a type-agnostic way.
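The convert-to-m1-and-back idea can be sketched for a fractional type with a scalar model. Everything here is illustrative: the `model_*` names and struct types are hypothetical, VLEN = 128 bits is an assumption, and the model zeroes the upper elements after extension, whereas the real extension should be treated as leaving them unspecified.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VLENB = 16 };  /* assumed VLEN = 128 bits */

/* Hypothetical stand-ins for vint8mf2_t and vint8m1_t. */
typedef struct { int8_t e[VLENB / 2]; } model_i8mf2;
typedef struct { int8_t e[VLENB]; } model_i8m1;

/* Models extension mf2 -> m1: low elements are copied. The model zeroes
   the upper elements; do not rely on that in real code. */
static model_i8m1 model_ext_mf2_m1(model_i8mf2 v) {
    model_i8m1 r = {{0}};
    memcpy(r.e, v.e, sizeof v.e);
    return r;
}

/* Models truncation m1 -> mf2: keep the low elements only. */
static model_i8mf2 model_trunc_m1_mf2(model_i8m1 v) {
    model_i8mf2 r;
    memcpy(r.e, v.e, sizeof r.e);
    return r;
}

/* Round trip around a reduction: widen to m1, place the result in
   element 0, and narrow back to the caller's type. */
static model_i8mf2 model_set_first_via_m1(model_i8mf2 v, int8_t result) {
    model_i8m1 wide = model_ext_mf2_m1(v);
    wide.e[0] = result;
    return model_trunc_m1_mf2(wide);
}
```

For m1 inputs the same pattern would need identity versions of the two conversions, which is the point of the parenthetical note above.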
from rvv-intrinsic-doc.