
Comments (15)

rdolbeau commented on July 18, 2024

@knightsifive FFTW3 is peculiar, as it forces you to fit in its macro-based SIMD model (one of the reasons it's my go-to test... baptism by fire). I do have DFTs and FFTs using Zvlsseg. I also have an FFTW3 port using it, but it won't compile, as I need to fit two registers in a single datatype, so I need "vfloat64m1x2_t" or an equivalent struct or array :-(

The bad news is that while doing that "split format" works fine with AVX-512, the resulting codelets won't be used by FFTW3 because they are lower-performing: too much spilling compared to the 'interleaved' version...

Edit: typo

from rvv-intrinsic-doc.

ebahapo commented on July 18, 2024

Hopefully, the new V types would be based on aggregates.

Thus, vint32m2x3_t would be defined as:

typedef vint32m2_t vint32m2x3_t[3];


kito-cheng commented on July 18, 2024

Basing it on an aggregate or array type sounds good to me. That means we could use subscripts to access elements, or initialize the tuple like an array, which makes the code clearer, as below:

typedef vint32m2_t vint32m2x3_t[3];
vint32m2x3_t vt;
vint32m2_t va;

vt[0] = va;
va = vt[0];

vint32m2x3_t vt2 = {va, va, va};

My only concern is that this means we would either have to allow users to declare arrays and/or structs with vector types, or disallow that for users while still letting the compiler use it internally. That's the part we haven't discussed much yet.

Some references here:

  • SVE disallows declaring arrays with a scalable element type.
  • SVE disallows user-declared structs with scalable-type fields, but IIRC this was allowed in an earlier ACLE implementation.
  • SVE only allows intrinsics to be used to access values in a scalable vector tuple type.

Note: I'm listing what SVE does as a reference for this discussion, not because I think we should do the same as SVE.


rdolbeau commented on July 18, 2024

@kito-cheng I don't think SVE 'disallows' it. It's just not supported yet... to be confirmed.


kito-cheng commented on July 18, 2024

@rdolbeau yeah, let me check the ACLE/SVE spec again. I've only checked their open-source GCC implementation; IIRC it allows declaring a struct with scalable vector types, with limitations. One limitation is that all fields must be of scalable vector type if at least one field is. Anyway, I'll update after checking the spec.


rdolbeau commented on July 18, 2024

@kito-cheng I've a ticket open with Arm on the subject ;-), as it would be useful in some cases. An example for RVV (less so SVE as it has HW support for complex so interleaved is fine):

struct {
  vfloat64m1_t real;
  vfloat64m1_t imaginary;
};

RVV has terrible support for interleaved complex (i.e. 'real' in even lanes, 'imaginary' in odd lanes, or 'vector of struct'), as in-register data manipulation relies on the very generic 'vrgather'. It's much more efficient to use split complex (two different registers, or 'struct of vector') ... but then some software infrastructure will require a single data construct to hold the complex values, hence the need for either a 'struct' or an 'array' of vectors...

Of course, the split format requires 2x the registers (and 2x the parallelism) which creates additional spilling but that's a different issue :-(

Edit: just prettier code
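
To make the interleaved vs. split distinction concrete, here is a minimal scalar sketch in plain C. It uses no RVV intrinsics: fixed-size arrays stand in for vector registers, and `LANES`, `split_complex`, `split_mul` and `interleaved_mul` are hypothetical names chosen for illustration only. The point it tries to show is that a split-format complex multiply only ever touches lane i of each input, whereas the interleaved version has to pair each even lane with its odd neighbour, which is the cross-lane movement that would fall to something like 'vrgather' in a vectorized version.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 4 }; /* hypothetical lane count standing in for one vector register */

/* Split format ('struct of vector'): one array per member. */
typedef struct {
    double real[LANES];
    double imaginary[LANES];
} split_complex;

/* Each output lane reads only lane i of the inputs: no data moves between lanes. */
split_complex split_mul(const split_complex *a, const split_complex *b) {
    assert(a != NULL && b != NULL);
    split_complex r;
    for (size_t i = 0; i < LANES; i++) {
        r.real[i]      = a->real[i] * b->real[i]      - a->imaginary[i] * b->imaginary[i];
        r.imaginary[i] = a->real[i] * b->imaginary[i] + a->imaginary[i] * b->real[i];
    }
    return r;
}

/* Interleaved format ('vector of struct'): real at even index, imaginary at odd index.
 * Every output element needs both the even and the odd neighbour of each input. */
void interleaved_mul(const double *a, const double *b, double *r, size_t n_complex) {
    assert(a != NULL && b != NULL && r != NULL);
    for (size_t k = 0; k < n_complex; k++) {
        const double ar = a[2 * k], ai = a[2 * k + 1];
        const double br = b[2 * k], bi = b[2 * k + 1];
        r[2 * k]     = ar * br - ai * bi;
        r[2 * k + 1] = ar * bi + ai * br;
    }
}
```

Both functions compute the same products; the difference is only in where the operands live, which is what determines how much shuffling a SIMD implementation would need.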


kito-cheng commented on July 18, 2024

The latest ACLE for SVE I could find on their website says sizeless / scalable vector types can't be used in an array, struct, union or class:

https://developer.arm.com/docs/100987/latest/arm-c-language-extensions-for-sve

Page 18.
Sizeless types may not be used in the following situations:
...
• as the type of an array element;
...
...
In all other respects, sizeless types have the same restrictions as the standard-defined incomplete types.
This specifically includes (but is not limited to) the following:
• Members of unions, structures and classes cannot have sizeless type.
...


kito-cheng commented on July 18, 2024

@rdolbeau
I thought segment load/store in RISC-V should provide a more powerful capability to deal with those issues?

This struct is equivalent to vfloat64m1x2_t in this proposal, and vseg_load_f64m1x2 (vlseg2e.v) can load a vector of structs. Might that be what you want?

struct {
vfloat64m1_t real;
vfloat64m1_t imaginary;
};
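
As a rough illustration of what a two-field segment load does to a 'vector of struct' in memory, here is a scalar model in plain C. It is only a sketch: `seg2_load_model` and `seg2_store_model` are hypothetical helper names, not intrinsics from this proposal, and the two output arrays stand in for the two registers of a vfloat64m1x2_t.

```c
#include <assert.h>
#include <stddef.h>

/* Memory layout: array of structs, i.e. interleaved real/imaginary pairs. */
typedef struct {
    double real;
    double imaginary;
} complex_aos;

/* Scalar model of a 2-field segment load: field 0 of every segment goes to
 * one destination, field 1 to the other, for the first vl segments. */
void seg2_load_model(const complex_aos *mem, size_t vl,
                     double *field0, double *field1) {
    assert(mem != NULL && field0 != NULL && field1 != NULL);
    for (size_t i = 0; i < vl; i++) {
        field0[i] = mem[i].real;
        field1[i] = mem[i].imaginary;
    }
}

/* Scalar model of the matching 2-field segment store (re-interleaves). */
void seg2_store_model(complex_aos *mem, size_t vl,
                      const double *field0, const double *field1) {
    assert(mem != NULL && field0 != NULL && field1 != NULL);
    for (size_t i = 0; i < vl; i++) {
        mem[i].real = field0[i];
        mem[i].imaginary = field1[i];
    }
}
```

In this model the data arrives in split form straight from memory, so per-member computation can proceed without any in-register shuffling.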


rdolbeau commented on July 18, 2024

@kito-cheng Yes, but Zvlsseg is an extension so might not be available.

And yes, the structure is the same, it's a use case to justify we do want to have this ability :-) - and ideally not restricted to vfloat64m1x2_t but potentially more general

Indeed, this example would be well served by vfloat64m1x2_t if you can then access the two halves of the structure/array independently for further per-member computations:

vfloat64m1x2_t conjugate(const vfloat64m1x2_t input, const int requested_vl) {
  vfloat64m1x2_t result = input;
  result[1] = vneg_v_f64m1(result[1], requested_vl); // [0] is real, [1] is imaginary
  return result;
}

Edit: add the requested_vl in the code to be coherent with my own argument in #8 ;-)
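
One point worth keeping in mind for the array-vs-aggregate question: in standard C an array type cannot be returned from a function, passed by value, or initialized by copying another array, so a by-value function like the one above would need either a struct-based tuple or special compiler handling of the tuple type. The following is a portable scalar sketch of the struct-based variant; `vec_model`, `vec_model_x2`, `neg_model` and `conjugate_model` are hypothetical stand-ins, not part of any RVV API, and leaving lanes at or beyond `requested_vl` unchanged is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 4 }; /* hypothetical lane count */

/* Stand-in for vfloat64m1_t: wrapping the array in a struct makes it copyable. */
typedef struct {
    double lane[LANES];
} vec_model;

/* Stand-in for vfloat64m1x2_t: a struct holding an array of two vectors. */
typedef struct {
    vec_model v[2]; /* [0] is real, [1] is imaginary */
} vec_model_x2;

/* Negate the first vl lanes; remaining lanes are left unchanged (model assumption). */
vec_model neg_model(vec_model x, size_t vl) {
    assert(vl <= LANES);
    for (size_t i = 0; i < vl; i++) {
        x.lane[i] = -x.lane[i];
    }
    return x;
}

/* Same shape as the conjugate example, with member access through .v[...]. */
vec_model_x2 conjugate_model(vec_model_x2 input, size_t requested_vl) {
    vec_model_x2 result = input; /* legal: structs can be copied by value */
    result.v[1] = neg_model(result.v[1], requested_vl);
    return result;
}
```

The subscript still selects the member, as in the example above, but it goes through a struct field, so copy, pass and return all work with ordinary C semantics.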


nick-knight commented on July 18, 2024

@kito-cheng Yes, but Zvlsseg is an extension so might not be available.

... says the guy proposing all the SLEN-dependent hacks :P

But seriously, note the following commentary in the V-extension spec:

Note: This set of instructions is intended to be included in the base "V" extension.
https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#78-vector-loadstore-segment-instructions-zvlsseg

Obviously this is non-normative text, and the V-ext has not yet been ratified. However, my understanding is that this commentary is indicative of the task group's momentum and current thinking. I've been fighting hard for this, since last July, for exactly the application we're discussing. And for what it's worth, all of SiFive's vector cores support Zvlsseg.


rdolbeau commented on July 18, 2024

@knightsifive Good, because Zvlsseg is a good (great) thing I believe, and as LMUL already requires some level of 'striping'/'interleaving' ability (at SLEN), I would imagine it's not exceedingly difficult to also do it at SEW (not a hardware implementation guy, so I might be wrong).


nick-knight commented on July 18, 2024

@rdolbeau when I first learned (December 10, 2019) that your FFTW port wasn't using segmented loads and stores, I cried.

To my understanding, the real challenge for Zvlsseg implementation is handling all the odd cases (e.g., RGB values) plus the fact that the "register groups" need not be aligned to powers of two. (LMUL groups are powers of two, and aligned commensurately.)


kito-cheng commented on July 18, 2024

Status update:

I have started implementing this on the GCC side. The plan is to implement the vector tuple type as a primitive type in the first stage, and extend it to an aggregate or array in the next stage.

I think the back-end implementation should be similar whether we define the vector tuple type as a primitive, aggregate or array; the only difference is what kind of syntax / operators we need to support in the front-end.


Hsiangkai commented on July 18, 2024

We could discuss how to represent the tuple types and related C operators in another new issue.


kito-cheng commented on July 18, 2024

https://github.com/sifive/rvv-intrinsic-doc/issues/17 for further discussion on vector tuple type

