Comments (10)
I asked this question at riscv-v-spec and, as Krste mentioned, I can combine two vectors as below:
vsetvli t0, 8, e8,m1
vle8.v v1, (src1)
vle8.v v2, (src2)
vsetvli t0, 16, e8,m1
vslideup.vi v1, v2, 64
But now I want to use the RVV intrinsics in C++ to implement this.
The intrinsic for the vslideup instruction is defined as below:
vint8m1_t vslideup_vx_i8m1 (vint8m1_t src, size_t offset);
I cannot pass the destination vector (v1) to vslideup, so the two vectors are not combined. How can I use intrinsics to implement this assembly code? Should a destination vector parameter be added to the vslideup intrinsic?
Sincerely,
Yin
from rvv-intrinsic-doc.
Hi Yin,
We can use the masked vslideup with an all-ones mask, like
vslideup_vx_i8m1_m (vmset_m_b8(), v1, v2, 64);
Ideally the compiler can optimize this to generate vslideup.vi without a mask.
Thanks.
from rvv-intrinsic-doc.
Hi,
There may be some problems with my previous description. In fact, what I want to do is, for example, to combine two vint8mf2_t into one vint8m1_t and to split one vint8m1_t into two vint8mf2_t. It looks like just a register allocation problem that does not need vector instructions. Are there intrinsics for this?
from rvv-intrinsic-doc.
Should the intrinsic for vslide{up,down} take the destination as an argument, since its previous contents are partially preserved by the operation? This is easy to express in assembly, but the intrinsics abstract this away.
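To make the "partially preserved" point concrete, here is a minimal scalar sketch in plain C of what vslideup does to its destination. The function name model_vslideup is hypothetical and only models the unmasked, tail-undisturbed case; it is not part of any intrinsics API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of an unmasked vslideup:
 *   dest[i] is left as it was      for i < offset
 *   dest[i] = src[i - offset]      for offset <= i < vl
 * Elements at index >= vl are not touched by this model.
 * dest and src must not overlap, mirroring the register overlap
 * restriction the instruction has. */
static void model_vslideup(int8_t *dest, const int8_t *src,
                           size_t offset, size_t vl)
{
    assert(dest != NULL && src != NULL);
    for (size_t i = offset; i < vl; ++i) {
        dest[i] = src[i - offset];
    }
}
```

Calling it with offset 8 and vl 16 on two 8-element inputs leaves the first 8 elements of dest alone and fills the next 8 from src, which is why the result depends on what the destination held beforehand.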
from rvv-intrinsic-doc.
@joy2myself Why not combine two vint8m1_t into one vint8m1_t, as in Krste's example?
from rvv-intrinsic-doc.
@zakk0610 Hi Zakk. In fact, the input I have is two vint8mf2_t, not vint8m1_t. If I use vslide{up,down} to combine them, I first have to append zeros to these two vint8mf2_t to convert them into vint8m1_t. This can also be regarded as a kind of combination (combining a vint8mf2_t with an all-zero vint8mf2_t).
Also, a split is hard to implement with vslide{up,down}, right?
In my opinion, this kind of combination is just a register allocation problem. We just need to put them together and rewrite the vtype register. We do not need to use vector instructions.
I think combination and split are very useful functions for converting between multiple short vectors and one long vector, for example two __mf2 and one __m1, or two __m1 and one __m2. These functions are independent of the vector assembly instructions. Should we add these intrinsics? Or is there another way to implement this?
from rvv-intrinsic-doc.
For fractional LMUL, it is not just a register allocation issue, because a fractional-LMUL value still occupies one register. If you want to combine two vector values with fractional LMUL, you need to slide one of the values up and then combine it with the other one.
For the LMUL >= 1 case, it is just a register allocation issue only if vl == vlmax. Consider vlmax = 4 and vl = 3 (for LMUL=1), with v0 (m1) = 1, 2, 3, 4 and v1 (m1) = 5, 6, 7, 8: combining the two registers gives v0 (m2) = 1, 2, 3, 4, 5, 6, 7, 8, not 1, 2, 3, 5, 6, 7.
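The vlmax = 4, vl = 3 case above can be replayed with a small scalar model in plain C. The names VLMAX, concat_as_group and concat_active are made up for this sketch; arrays stand in for vector registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VLMAX = 4 }; /* elements per LMUL=1 register in this toy model */

/* Viewing two m1 registers as one m2 group: every element of each
 * register comes along, whether or not it was active under vl. */
static size_t concat_as_group(const int8_t *v0, const int8_t *v1, int8_t *out)
{
    memcpy(out, v0, VLMAX);
    memcpy(out + VLMAX, v1, VLMAX);
    return 2 * VLMAX;
}

/* Concatenating only the vl active elements of each register, which is
 * what a slide-based combine would have to produce. */
static size_t concat_active(const int8_t *v0, const int8_t *v1,
                            size_t vl, int8_t *out)
{
    assert(vl <= VLMAX);
    memcpy(out, v0, vl);
    memcpy(out + vl, v1, vl);
    return 2 * vl;
}
```

With v0 = {1, 2, 3, 4} and v1 = {5, 6, 7, 8}, concat_as_group yields 8 elements with the inactive 4 still in position 3, while concat_active with vl = 3 yields the 6 elements 1, 2, 3, 5, 6, 7. The two only agree when vl == VLMAX.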
from rvv-intrinsic-doc.
Hi Kito, thanks a lot for the guidance.
Then how can I implement combination and split, for example from two vint8mf2_t to one vint8m1_t? Do I need to store them first and then do the combination in C/C++? vslide{up,down} works with vectors of the same LMUL type, and I can't find intrinsics to convert vint8mf2_t to vint8m1_t.
from rvv-intrinsic-doc.
There are discussions about converting between different LMULs under the same SEW in #37.
from rvv-intrinsic-doc.
We have lmul_trunc, lmul_ext, vget and vset in the intrinsics to match your use case.
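As a rough sketch of how these might fit together for the two-vint8mf2_t case, assuming a toolchain that implements the later `__riscv_`-prefixed intrinsics (v0.12 or newer, where each call takes an explicit vl and vslideup takes the destination as its first argument). These names differ from the ones in use when this thread was written, and the helper combine_halves is hypothetical. A scalar fallback is included so the file also builds without the V extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_v_intrinsic) && __riscv_v_intrinsic >= 12000
#include <riscv_vector.h>
#define HAVE_RVV_INTRINSICS 1
#else
#define HAVE_RVV_INTRINSICS 0
#endif

/* Hypothetical helper: writes lo[0..k) followed by hi[0..k) to out[0..2k)
 * and returns k, the number of elements taken from each input (k <= n).
 * A real loop would call it repeatedly until all n elements are consumed. */
static size_t combine_halves(const int8_t *lo, const int8_t *hi,
                             int8_t *out, size_t n)
{
    assert(lo != NULL && hi != NULL && out != NULL);
#if HAVE_RVV_INTRINSICS
    size_t vl = __riscv_vsetvl_e8mf2(n);           /* vl <= VLMAX of mf2 */
    vint8mf2_t a = __riscv_vle8_v_i8mf2(lo, vl);
    vint8mf2_t b = __riscv_vle8_v_i8mf2(hi, vl);
    /* Widen the type to LMUL=1; the extra elements are unspecified. */
    vint8m1_t wide_a = __riscv_vlmul_ext_v_i8mf2_i8m1(a);
    vint8m1_t wide_b = __riscv_vlmul_ext_v_i8mf2_i8m1(b);
    /* Keep elements [0, vl) of wide_a, place wide_b at [vl, 2*vl). */
    vint8m1_t both = __riscv_vslideup_vx_i8m1(wide_a, wide_b, vl, 2 * vl);
    __riscv_vse8_v_i8m1(out, both, 2 * vl);
    return vl;
#else
    /* Scalar fallback with the same observable result. */
    for (size_t i = 0; i < n; ++i) {
        out[i] = lo[i];
        out[n + i] = hi[i];
    }
    return n;
#endif
}
```

The split direction would presumably be the mirror image: `__riscv_vlmul_trunc_v_i8m1_i8mf2` for the low half, and `__riscv_vslidedown_vx_i8m1` followed by the same truncation for the high half. For LMUL >= 1 with vl == vlmax, `__riscv_vget_v_i8m2_i8m1` and `__riscv_vset_v_i8m1_i8m2` select or replace a whole register of a group without any slide.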
from rvv-intrinsic-doc.