Comments (8)
I have no preferences, but one comment: if it's part of the name, then it's obvious it has to be known at compile time - neither SEW nor LMUL can be parametrized. If you put them as parameters, then someone (probably me at some point ;-) ) will find a use case where either one (or both) is parametrized and potentially not known at compile time. Which may or may not be legal and/or supported by an implementation...
This is already an open question for VL, as I have a use case where I want/need to play with VL (upper bounded by the return value of vsetvl()) and it's not clear to me from the specifications what is legit and what isn't - i.e., what restrictions an implementation can impose...
from rvv-intrinsic-doc.
The V-extension provides two instructions: vsetvli (immediate form), for when SEW and LMUL are known at compile/assembly time, and vsetvl, for when SEW and LMUL are possibly known only at runtime. The latter instruction is very useful in restoring thread context (e.g., migrating between cores), but I'm not aware of other applications.
I propose the intrinsics library offers both forms. It's tempting to simulate the non-immediate form using the immediate form invoked within a big switch statement, which the compiler could then optimize in the case where SEW and LMUL are compile-time constants. However, this would be inefficient when SEW or LMUL is not known at compile-time, since it would introduce branches in the generated assembly.
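To make the switch-based simulation concrete, here is a minimal sketch in plain C. It does not use the real intrinsics: the names vsetvli_e32m1, vsetvl_dyn, and so on are hypothetical stand-ins, VLEN is assumed to be 128 bits, and vl is modelled simply as min(avl, VLMAX) with VLMAX = VLEN * LMUL / SEW. Only a few SEW/LMUL combinations are listed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration only: 128-bit vector registers. */
enum { VLEN_BITS = 128 };

/* Shared model: vl = min(avl, VLMAX), where VLMAX = VLEN * LMUL / SEW. */
static size_t model_vl(size_t avl, unsigned sew, unsigned lmul) {
    size_t vlmax = (size_t)VLEN_BITS * lmul / sew;
    return avl < vlmax ? avl : vlmax;
}

/* Hypothetical stand-ins for the immediate form: SEW and LMUL are part of
   the name, so they are constants inside each function. */
static size_t vsetvli_e8m1(size_t avl)  { return model_vl(avl, 8, 1); }
static size_t vsetvli_e32m1(size_t avl) { return model_vl(avl, 32, 1); }
static size_t vsetvli_e32m8(size_t avl) { return model_vl(avl, 32, 8); }
static size_t vsetvli_e64m1(size_t avl) { return model_vl(avl, 64, 1); }

/* Hypothetical non-immediate form, simulated with a switch over the
   immediate-form stand-ins. Returns 0 for combinations not listed here.
   With constant sew/lmul a compiler can fold the switch away; with
   run-time values the dispatch remains as branches or a jump table. */
static size_t vsetvl_dyn(size_t avl, unsigned sew, unsigned lmul) {
    assert(lmul < 16u);              /* keeps the combined key unambiguous */
    switch ((sew << 4) | lmul) {
    case (8u << 4) | 1u:  return vsetvli_e8m1(avl);
    case (32u << 4) | 1u: return vsetvli_e32m1(avl);
    case (32u << 4) | 8u: return vsetvli_e32m8(avl);
    case (64u << 4) | 1u: return vsetvli_e64m1(avl);
    default:              return 0;
    }
}
```

Under these assumptions, vsetvl_dyn(100, 32, 8) goes through the e32m8 case and yields 32, the same value a direct call to vsetvli_e32m8(100) would give.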
Clearly, the vsetvl version must input SEW and LMUL, so the question is really about the syntax of the vsetvli version. I'll only consider this version in the rest of this comment.
Having SEW and LMUL as arguments would involve a big switch statement, but presumably the compiler would resolve it statically. Assuming there's no overhead here, I see no technological advantage to having SEW and LMUL as arguments versus baking them into the intrinsic name. I think this is really a matter of syntactic taste.
Some (subjective) advantages of having them as arguments:
- Fewer intrinsics to keep track of
- Same signature as the vsetvl version
Some (subjective) advantages of having them in the intrinsic name:
- Matches the naming scheme of other intrinsics in the proposal
- Less error-prone
- Emphasizes that SEW and LMUL are part of the (static) type system
I vote for the latter approach, based mainly on my belief that very few users will ever use the vsetvl (non-immediate) form.
from rvv-intrinsic-doc.
Actually, I can think of cases where an end-user will use vsetvl() over the immediate form, especially if some implementations are tolerant to arbitrary VL. The obvious two are peel loops & tail loops, where it's a lot more natural in V to use a smaller VL rather than a larger VL + masking...
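As a rough illustration of the tail-loop point, the sketch below models a strip-mined loop in plain C. It is not real intrinsic code: model_vsetvl and vec_add are hypothetical names, VLMAX is assumed to be 4 (SEW=32, LMUL=1, VLEN=128), and the inner scalar loop stands in for a single vector add executed with the current vl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration only: VLMAX for SEW=32, LMUL=1, VLEN=128. */
enum { VLMAX_E32M1 = 4 };

/* Hypothetical model of vsetvl: grant min(avl, VLMAX) elements. */
static size_t model_vsetvl(size_t avl) {
    return avl < VLMAX_E32M1 ? avl : VLMAX_E32M1;
}

/* c[i] = a[i] + b[i] for i in [0, n). Returns the number of strips.
   The last strip simply runs with a smaller vl, so no separate scalar
   tail loop and no mask are needed. */
static size_t vec_add(int32_t *c, const int32_t *a, const int32_t *b, size_t n) {
    size_t strips = 0;
    while (n > 0) {
        size_t vl = model_vsetvl(n);
        for (size_t i = 0; i < vl; i++) {
            c[i] = a[i] + b[i];      /* stands in for one vector add */
        }
        a += vl;
        b += vl;
        c += vl;
        n -= vl;
        strips++;
    }
    return strips;
}
```

With n = 10 this model runs three strips with vl = 4, 4 and 2; the final strip is the tail, handled by the same loop body.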
from rvv-intrinsic-doc.
I can think of cases where an end-user will use vsetvl() over the immediate form,
If I understand correctly, that means dynamic LMUL and SEW are what you want in some situations. However, I think it is hard to write such code under the current type system.
e.g.
int32_t *a;
int32_t *b;
int sew = 32;
int lmul = large_lmul ? 8 : 1;
for (; vl = vsetvl(sew, lmul, n); n -= vl) {
    va = vload(a); // Type for va?
    ...
}
Of course it can be resolved by writing more code, but I think that would be hard to write and maintain:
int32_t *a;
int32_t *b;
int sew = 32;
int lmul = large_lmul ? 8 : 1;
for (; vl = vsetvl(sew, lmul, n); n -= vl) {
    if (large_lmul) {
        vint32m8_t va_32m8 = vload_i32m8(a);
    } else {
        vint32m1_t va_32m1 = vload_i32m1(a);
    }
    ...
}
from rvv-intrinsic-doc.
To be honest, SEW and LMUL are much less likely to change at runtime, except for the one case for SEW mentioned below.
A run-time SEW would be conceivable (e.g. genericity between FP32 and FP64), but you still have to change every intrinsic/mnemonic name - and so it will be known at compile-time anyway in practice (i.e. macro-based generic implementation such as https://github.com/rdolbeau/fftw3/tree/riscv-v). SEW is likely to only change when reinterpreting data, e.g. between 64-bit and 32-bit integers as done in my Chacha20 implementation - and then the VL changes inversely so that SEW*VL remains constant (we don't change the bits or how many there are in a register, we just change how we interpret them).
LMUL should almost always be (number of available registers)/(number of required registers), rounded down to a legal value, if there's enough parallelism in the algorithm. That's unlikely to change for a given implementation of an algorithm.
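The two rules of thumb above can be written down as small helpers. This is only a sketch: pick_lmul and reinterpret_vl are hypothetical names, only the integer LMUL values 1, 2, 4 and 8 are considered (fractional LMUL is ignored), and any register reserved for a mask is assumed to be already subtracted from the available count.

```c
#include <assert.h>

/* Largest LMUL in {1, 2, 4, 8} that does not exceed
   available_regs / required_regs. Returns 0 if the algorithm does not
   fit even at LMUL = 1. */
static unsigned pick_lmul(unsigned available_regs, unsigned required_regs) {
    assert(required_regs > 0);
    unsigned ratio = available_regs / required_regs;
    unsigned lmul = 0;
    for (unsigned candidate = 1; candidate <= 8; candidate *= 2) {
        if (candidate <= ratio) {
            lmul = candidate;
        }
    }
    return lmul;
}

/* When the same bits are reinterpreted at a different SEW, VL scales so
   that SEW * VL stays constant. */
static unsigned reinterpret_vl(unsigned vl, unsigned old_sew, unsigned new_sew) {
    assert(new_sew > 0);
    assert((vl * old_sew) % new_sew == 0);   /* the bits must divide evenly */
    return vl * old_sew / new_sew;
}
```

For example, with 32 vector registers and an algorithm that needs 5 live vector values, the ratio is 6, so this helper picks LMUL = 4; and 4 elements at SEW = 64 become 8 elements at SEW = 32.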
from rvv-intrinsic-doc.
I absolutely agree users will want to dynamically change VL. Both vsetvl and vsetvli versions input an AVL argument, and return VL.
My earlier comment only concerned SEW and LMUL: Since these are part of the (static) type system in the current implementation, it is tricky to write code that changes them dynamically. Kito gave an artificial example, but I don't see any real-world value to it.
@rdolbeau is it possible for you to extract a minimal example from your "Chacha20" code, demonstrating this use-case? I'm concerned that the type-punning you're suggesting can easily lead to non-portable code, since it can expose differences in the architectural parameter SLEN.
If it does prove useful in applications to type-pun on SEW and LMUL, I would prefer adding a reinterpret_cast-style intrinsic to achieve this goal within the type system.
from rvv-intrinsic-doc.
Agree. We designed the type system and intrinsics with static SEW and LMUL. It does not seem useful to keep the variable form for the vsetvl intrinsics. For example, if we have run-time variables sew and lmul,
vl = vsetvl(avl, sew, lmul); // sew and lmul are run-time variables.
vret = vadd_vv_i32m1(va, vb);
Eventually, the compiler will convert the intrinsics into
vsetvl vl, avl, ra // Assume ra stores the run-time sew and lmul settings.
vsetvli x0, x0, e32,m1 // The compiler will generate this one to override the settings.
vadd.vv vret, va, vb
from rvv-intrinsic-doc.
I wrote a draft RFC for it. It follows the naming rules of the other intrinsics, i.e., encoding SEW and LMUL into the intrinsic names.
https://github.com/sifive/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#configuration-setting
Please help me review it. If we all agree to provide the static version only, we could close the issue.
from rvv-intrinsic-doc.