Comments (4)
Hi @PkmX thanks for the proposal. Seems reasonable.
In the EPI project we implemented the following interim calling convention:
Registers `v0` and `v16...v23` can be used to pass vector values.
The process is as follows (parameters are processed in the order of the C declaration):
- mark registers `v0` and `v16...v23` as free.
- if the parameter is a mask type and `v0` is free, then assign the parameter to `v0` (mark the register as used)
- if the parameter does not have mask type, or it does have mask type but `v0` is not free, then use the following algorithm
  - determine the register group size (RGS) of the type
  - find the lowest numbered register `vX` in the set `v16...v23` such that the register is free and the RGS divides `X`. If there are RGS consecutive free registers (starting from and including `vX`) then assign all of them to that parameter (mark all the registers as used)
  - if there are no free registers eligible under the case above, then the vector is passed via the stack.
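The steps above can be sketched as a small Python model. This is only an illustration of the allocation rule as worded in this comment, not the EPI compiler's implementation: `Param`, `assign_vector_args`, and the `"stack"` marker are hypothetical names, and the choice to keep scanning for a higher-numbered candidate when a group does not fit is an assumption the text does not spell out.

```python
from dataclasses import dataclass

MASK_REG = 0                 # v0
VECTOR_POOL = range(16, 24)  # v16...v23


@dataclass
class Param:                 # hypothetical description of one vector parameter
    name: str
    rgs: int = 1             # register group size: 1, 2, 4 or 8
    is_mask: bool = False


def assign_vector_args(params, pool=VECTOR_POOL):
    """Map each parameter name to a list of register numbers or "stack"."""
    free = {MASK_REG, *pool}         # mark v0 and the pool as free
    result = {}
    for p in params:                 # C declaration order
        if p.is_mask and MASK_REG in free:
            free.discard(MASK_REG)   # masks prefer v0
            result[p.name] = [MASK_REG]
            continue
        result[p.name] = "stack"     # fallback if no group fits
        for x in sorted(pool):
            if x not in free or x % p.rgs != 0:
                continue             # vX must be free and RGS must divide X
            group = list(range(x, x + p.rgs))
            if all(r in free and r in pool for r in group):
                free.difference_update(group)
                result[p.name] = group
                break
            # Assumption: otherwise try the next eligible vX.
    return result


args = assign_vector_args([
    Param("mask", is_mask=True),
    Param("a", rgs=1),
    Param("b", rgs=4),
    Param("c", rgs=2),
    Param("d", rgs=4),
    Param("mask2", is_mask=True),
])
print(args)
```

In this run `a` takes `v16`, `b` skips ahead to `v20...v23`, `c` backfills `v18...v19`, `d` goes to the stack, and the second mask, finding `v0` taken, is treated as an ordinary RGS=1 value and lands in the hole at `v17`.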
We initially believed it made sense to have callee-saved registers, hence the very limited range from `v16` to `v23`. However, the experience so far is that it may not be very valuable to have callee-saved registers (we could have them in an alternative, more specialized, calling convention such as one used by a vectorizer). The algorithm also prioritizes the mask to be in `v0` (if any).
If we take the above algorithm and use `v0`, `v1...v31` (instead of `v0`, `v16...v23`) then I think that would be similar to your proposal. It also addresses the question of the holes you make at the end.
I wouldn't be very worried about the 3 `m8`. In general, if there is a mask around, 3 is also the maximum number of usable `m8` register groups (because the mask in `v0` prevents us from using the register group `v0...v7`).
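The count of 3 can be checked with a few lines of Python. This is a sketch under the alignment rule described earlier (a group of size RGS must start at a register number divisible by RGS); `aligned_groups` is a hypothetical helper, not part of any toolchain.

```python
def aligned_groups(pool, rgs):
    """Start registers of every RGS-aligned group that lies fully inside pool."""
    members = set(pool)
    return [
        x for x in sorted(members)
        if x % rgs == 0 and all(r in members for r in range(x, x + rgs))
    ]


# v0 reserved for the mask: only v1...v31 can hold m8 groups.
print(aligned_groups(range(1, 32), 8))   # [8, 16, 24]
# No mask: all 32 registers are available.
print(aligned_groups(range(0, 32), 8))   # [0, 8, 16, 24]
```

With `v0` taken, the group `v0...v7` is no longer complete, so only the groups starting at `v8`, `v16`, and `v24` remain.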
One thing missing from the algorithm above is the handling of segment vectors. In that case I understand `X` needs to divide RGS but we need `X * nf` registers available. Also, as you mentioned, fractional LMULs imply RGS=1.
from rvv-intrinsic-doc.
I find the merger of proposals resulting in `v0` and `v1..31` very interesting, including filling prior registers after an argument with `LMUL > 1`. However, IMO, 3 arguments is the practical minimum and more than this enters "nice to have" territory, which leaves just one `m8` register available for consideration as callee-saved, seemingly too few to make much of a difference.
Therefore, methinks that it's rather premature to nail down the calling convention. I'd prefer to have a working implementation upstream so that we can then model the best calling convention, especially on the issue of callee saved registers.
@PkmX @rofirrim please take a look at the calling convention PR: riscv-non-isa/riscv-elf-psabi-doc#171
This discussion is ongoing in the psABI working group and is out of scope for this repository. Closing the issue.