Comments (10)

joy2myself commented on July 18, 2024

I asked this question in riscv-v-spec and, as Krste mentioned, I can combine two vectors as below:

vsetvli      t0, 8, e8,m1
vle8.v       v1, (src1)
vle8.v       v2, (src2)
vsetvli      t0, 16, e8,m1
vslideup.vi  v1, v2, 64

Now I want to implement this with RVV intrinsics in C++.
The intrinsic for the vslideup instruction is defined as below:

vint8m1_t vslideup_vx_i8m1 (vint8m1_t src, size_t offset);

I cannot pass the destination vector (v1) to vslideup, so the two vectors are not combined. How can I implement this assembly code with intrinsics? Should a destination vector parameter be added to the vslideup intrinsic?

Sincerely,
Yin

from rvv-intrinsic-doc.

zakk0610 commented on July 18, 2024

Hi Yin,
We can use the masked vslideup with an all-ones mask, like

vslideup_vx_i8m1_m(vmset_m_b8(), v1, v2, 64);

Ideally, the compiler can optimize this to generate an unmasked vslideup.vi.
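
To make the role of the destination operand concrete, here is a minimal scalar model in C++ of what a masked slide-up computes. It is a sketch, not the real intrinsic: the names `Vec`, `Mask`, `slideup_masked`, `all_ones` and `combine_low_halves` are hypothetical, a register is assumed to hold 16 elements of 8 bits, and the offset is assumed to count elements, so 8 is used here rather than the 64 in the posts above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for one LMUL=1 register of 8-bit elements.
constexpr std::size_t kVlmax = 16;
using Vec = std::array<std::int8_t, kVlmax>;
using Mask = std::array<bool, kVlmax>;

// Models a masked slide-up. Elements below `offset`, elements at or above
// `vl`, and elements whose mask bit is clear keep the value from `dest`.
// Every other element i takes src[i - offset].
Vec slideup_masked(const Mask& mask, const Vec& dest, const Vec& src,
                   std::size_t offset, std::size_t vl) {
    assert(vl <= kVlmax);
    Vec out = dest;  // start from the destination's previous contents
    for (std::size_t i = offset; i < vl; ++i) {
        if (mask[i]) {
            out[i] = src[i - offset];
        }
    }
    return out;
}

// Builds an all-ones mask, the scalar counterpart of the mask in the reply.
Mask all_ones() {
    Mask m;
    m.fill(true);
    return m;
}

// Combines the low 8 elements of `a` with the low 8 elements of `b`.
Vec combine_low_halves(const Vec& a, const Vec& b) {
    return slideup_masked(all_ones(), a, b, 8, kVlmax);
}
```

With an all-ones mask the mask test never rejects an element, so the result depends only on `dest`, `src` and `offset`, which is why an unmasked instruction would be enough.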
Thanks.

joy2myself commented on July 18, 2024

Hi,
My previous description may have been unclear. What I actually want is, for example, to combine two vint8mf2_t values into one vint8m1_t and to split one vint8m1_t into two vint8mf2_t values. This looks like just a register allocation problem that does not need vector instructions. Are there intrinsics for this?

ebahapo commented on July 18, 2024

Should the intrinsic for vslide{up,down} take the destination as an argument, since its previous contents are partially preserved by the operation? This is easy to express in assembly, but the intrinsics abstract this.

zakk0610 commented on July 18, 2024

@joy2myself Why not combine two vint8m1_t into one vint8m1_t, as in Krste's example?

joy2myself commented on July 18, 2024

@zakk0610 Hi Zakk. The inputs I have are two vint8mf2_t, not vint8m1_t. If I use vslide{up,down} to combine them, I first have to append zeros to both vint8mf2_t values to convert them into vint8m1_t. That can itself be regarded as a kind of combination (combining a vint8mf2_t with an all-zero vint8mf2_t).
Also, split is hard to implement with vslide{up,down}, right?

In my opinion, this kind of combination is just a register allocation problem. We only need to put the values together and rewrite the vtype register. We do not need to use vector instructions.

I think combination and split are very useful functions for converting between multiple short vectors and one long vector, for example two __mf2 and one __m1, or two __m1 and one __m2. These functions are independent of the vector assembly instructions. Should we add these intrinsics? Or is there another way to implement this?

kito-cheng commented on July 18, 2024

For fractional LMUL, it is not just a register allocation issue, because a fractional-LMUL value still occupies one register. To combine two vector values with fractional LMUL, you need to slide one of the values up and then combine it with the other.

For LMUL >= 1, it is a pure register allocation issue only if vl == vlmax. Consider vlmax = 4 and vl = 3 (for LMUL=1), with v0 (m1) = 1, 2, 3, 4 and v1 (m1) = 5, 6, 7, 8. Combining the two registers gives v0 (m2) = 1, 2, 3, 4, 5, 6, 7, 8, not 1, 2, 3, 5, 6, 7.
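
The vl versus vlmax example can be reproduced with a small scalar model in C++. This is only an illustration under stated assumptions: registers are modeled as `std::vector<int>` of length `kVlmax`, and the helper names `regroup_as_m2` and `concat_active` are hypothetical, not part of any intrinsics API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed number of elements in one LMUL=1 register for this toy model.
constexpr std::size_t kVlmax = 4;

// Views two adjacent m1 registers as one m2 group. This is pure renaming:
// every element of both registers is kept, whatever vl was.
std::vector<int> regroup_as_m2(const std::vector<int>& v0,
                               const std::vector<int>& v1) {
    assert(v0.size() == kVlmax && v1.size() == kVlmax);
    std::vector<int> m2(v0);
    m2.insert(m2.end(), v1.begin(), v1.end());
    return m2;
}

// Concatenates only the first `vl` elements of each register, which is what
// a logical combine of two vl-element vectors would produce.
std::vector<int> concat_active(const std::vector<int>& v0,
                               const std::vector<int>& v1, std::size_t vl) {
    assert(vl <= kVlmax);
    assert(v0.size() == kVlmax && v1.size() == kVlmax);
    const auto n = static_cast<std::ptrdiff_t>(vl);
    std::vector<int> out(v0.begin(), v0.begin() + n);
    out.insert(out.end(), v1.begin(), v1.begin() + n);
    return out;
}
```

The two helpers agree only when `vl == kVlmax`; for any smaller `vl` the regrouped value still carries the inactive elements in the middle.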

joy2myself commented on July 18, 2024

Hi Kito, thanks a lot for the guidance.
Then how can I implement combination and split, for example from two vint8mf2_t to one vint8m1_t? Do I need to store them first and then do the combination in C/C++? vslide{up,down} only works on vectors with the same LMUL, and I cannot find intrinsics that convert vint8mf2_t to vint8m1_t.

Hsiangkai commented on July 18, 2024

There is a discussion about converting between different LMULs under the same SEW in #37.

eopXD commented on July 18, 2024

We have lmul_trunc, lmul_ext, vget, and vset in the intrinsics to cover your use case.
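
As a rough picture of how the combine and split asked about in this thread could be built from an LMUL extension, an LMUL truncation and slides, here is a scalar C++ model. Everything in it is an assumption made for illustration: an m1 register is taken to hold 16 elements of 8 bits, an mf2 value holds 8, and `lmul_ext_model`, `lmul_trunc_model`, `slide_up`, `slide_down`, `combine` and `split` are hypothetical helpers, not the signatures of the real intrinsics. vget and vset are not modeled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

constexpr std::size_t kFull = 16;         // assumed element count of an m1 value
constexpr std::size_t kHalf = kFull / 2;  // assumed element count of an mf2 value

using Full = std::array<std::int8_t, kFull>;
using Half = std::array<std::int8_t, kHalf>;

// Model of an LMUL extension: the mf2 data lands in the low half. The upper
// half is zeroed here only to keep the model deterministic.
Full lmul_ext_model(const Half& h) {
    Full f{};
    for (std::size_t i = 0; i < kHalf; ++i) f[i] = h[i];
    return f;
}

// Model of an LMUL truncation: keep the low half, drop the rest.
Half lmul_trunc_model(const Full& f) {
    Half h{};
    for (std::size_t i = 0; i < kHalf; ++i) h[i] = f[i];
    return h;
}

// Slide-up: elements below `offset` keep the value from `dest`.
Full slide_up(const Full& dest, const Full& src, std::size_t offset) {
    assert(offset <= kFull);
    Full out = dest;
    for (std::size_t i = offset; i < kFull; ++i) out[i] = src[i - offset];
    return out;
}

// Slide-down: element i takes src[i + offset]; positions past the end are zero.
Full slide_down(const Full& src, std::size_t offset) {
    Full out{};
    for (std::size_t i = 0; i + offset < kFull; ++i) out[i] = src[i + offset];
    return out;
}

// Combine: extend both halves, then slide `hi` up into the upper half of `lo`.
Full combine(const Half& lo, const Half& hi) {
    return slide_up(lmul_ext_model(lo), lmul_ext_model(hi), kHalf);
}

// Split: the low half is a truncation; the high half is slid down first.
std::pair<Half, Half> split(const Full& f) {
    return {lmul_trunc_model(f), lmul_trunc_model(slide_down(f, kHalf))};
}
```

In this model `split(combine(lo, hi))` returns the original pair, which matches the earlier point in the thread that a fractional-LMUL combine needs a slide and is not just a change of register type.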
