This issue was created for discussing development of the z/Architecture Vector Facilit


Vector Facility for z/Architecture,about sdl-hercules-390/hyperion

Comments (83)

Fish-Git commented on July 21, 2024 1

Will you provide the changes for U128, vfetch16, vstore16... from swap128 or should I do it myself?

I will have to review my original implementation. What I originally coded might no longer be correct/appropriate for our current design. Maybe it is. Maybe it isn't. I don't know. I'll have to brush off the dust and take a look at it.

If you want to do it, please feel free to do so! You might actually be able to do it faster than me. Personal issues have been affecting my ability to contribute as of late. (Don't worry, it's nothing serious.)

from hyperion.

mcisho commented on July 21, 2024 1

After several days of testing I haven't discovered any problem with FP instructions using the shared VR/FPR. I would like to pull the FP changes into the develop branch, so that the changes can be exposed to a wider range of environments than I have available. Does anyone object, feel it's premature, etc?

from hyperion.

mcisho commented on July 21, 2024 1

The pull from sharedvfp to develop has been done.

Before the pull feat900.h was changed to say

//efine FEATURE_129_ZVECTOR_FACILITY

so the zVector instructions are still not supported by the develop branch.

from hyperion.

Fish-Git commented on July 21, 2024 1

I meant argue with me regarding the quality of gcc, given that you know how I feel about the product. I'm not saying we should drop support for gcc. Just that IMHO it's far too "chatty". It throws warnings for every little itty bitty thing -- even things which IMHO do not merit a warning -- which IMHO can not only be extremely annoying but also an impediment to development, given the desire to have a product that builds with no warnings or errors, thereby triggering unnecessary coding changes just to silence them.

from hyperion.

Fish-Git commented on July 21, 2024 1

And we need to support it one-way-or-the-other, as much as it pisses us off.

Agreed 100%. But there's nothing wrong with encouraging others to use a product superior to gcc when it comes to building Hercules, is there? I mean, Hercules does not require that it be built using gcc, right? It can be built just as well with clang, yes? So that's all I was doing. I was pointing out the fact that if these annoying warnings that gcc was throwing were bothering the person, then they should at least consider switching to using clang! There's nothing wrong with that! Right?

from hyperion.

mcisho commented on July 21, 2024 1

@salva-rczero I don't think I know much more than you regarding how Hercules works with floats. Changing how the register values were accessed didn't give me much insight into using SoftFloat. However, I'll start looking. Don't expect speedy results, I won't be able to do much this month.

from hyperion.

Fish-Git commented on July 21, 2024 1

For my part, I believe that my contribution to this project has come to an end. I have already warned that I do not have the necessary skills and I find everything related to the discussion/design very difficult. It is better to leave that task to those of you who know it.

We will miss you, Salva! :(

Farewell and thank you very much for your time and advice (especially to @Fish-Git).

You are VERY welcome, Salva! We all thank you from the bottom of our hearts for all of the tremendous contributions you have made to Hercules! You are a true Herculean in my book! If you send me your full real name, I will be very happy to add you to our Herculeans list.

Good luck and long live to Hercules!

Abso-fricking-lutely! :)))

from hyperion.

Fish-Git commented on July 21, 2024

My first approach was to use another area for VR and REFRESH/UPDATE from/to AFPR at every use. Then @Fish-Git proposed the shared area in POC SWAP128 (in this thread).

My development of the zVector instructions relies on big-endian storage and the REFRESH/UPDATE mechanism, so I will absolutely have to change it.

Not necessarily! What I would like to do is determine which way -- yours or mine -- is more efficient. Thus I would be inclined to leave your current implementation as-is for the time being.

Implementing my shared registers proposal might be less efficient. I don't know. Maybe. Maybe not. I proposed it only because I thought (believed) it would be more efficient, but that remains to be seen!

I would prefer to see BOTH designs implemented (controlled via a temporary #define build option) so that we could then compare the performance of each one. It might well be that your current design is more efficient! I don't know! It might be. It might not be. It remains to be seen.

The idea here is I don't want to paint ourselves into a corner. I don't want to commit to one technique or the other until we know which one is best.

from hyperion.

Fish-Git commented on July 21, 2024

... and I don't have access to a real mainframe to test some complex instructions.

That's something else we will need to eventually do too: verify the correctness of each implemented instruction on real hardware. I seem to recall that one of our developers (I forget who) has access to a real mainframe. We will eventually need to test/debug our implementation on a real machine.

Then, once verified, we will of course ALSO need to develop a QA (Quality Assurance) runtime test ("runtest" make check test) as well, to ensure any future changes don't break our implementation.

So there is definitely enough "meat" in this project for multiple people to bite off their own piece of it. The more people we have contributing the greater the chance of our succeeding in our effort.

from hyperion.

salva-rczero commented on July 21, 2024

@Fish-Git I agree with the double implementation design.

@mcisho

Is this your understanding too?

Yes, this is how it was proposed. For byte-sized instructions, some logical operations (AND/OR) and moves it is not necessary, but in general you always have to do: BIG -> LIT -> BIG.

The FP registers contents in the regs structure are currently kept in the endianness of the host. If vector registers must be kept as big endian, then fp registers will also have to be kept as big endian. Which will have an impact on the design and usage of the shared area for vector/fp registers.

I think FP must continue with the current behaviour with regard to endianness. It will be easier to adapt zVector to the double treatment proposed by Fish.

p.s. Is your name Salva, am I addressing you correctly?

Yes! it's a common short name for Salvador.

Ian, tell me what you think!

Regards, salva.

from hyperion.

mcisho commented on July 21, 2024

Can you please have a look at the attached proposal of the Hercules changes for shared zVector/FP registers. All comments, suggestions, etc. are welcome.

Shared-VR-and-FPR-proposal.txt

from hyperion.

Fish-Git commented on July 21, 2024

Can you please have a look at the attached proposal of the Hercules changes for shared zVector/FP registers. All comments, suggestions, etc. are welcome.

Shared-VR-and-FPR-proposal.txt

Nice! I like it!

from hyperion.

mcisho commented on July 21, 2024

Can you please have a look at the revised attached proposal of the Hercules changes for shared zVector/FP registers. I had forgotten we would need to move data between the instruction processors variables and the zVector registers preserving host endianness. Again, all comments, suggestions, etc. are welcome:

Shared-VR-and-FPR-proposal.txt

p.s. Fish, how did you add the bullet point before the link? I can't see it in the Github formatting syntax.

from hyperion.

Peter-J-Jansen commented on July 21, 2024

I too am interested to participate in the Vector Facility and have read the proposal text with interest. So far I only have some probably very basic questions which I'm seeking an answer to:

1. Do I assume correctly that the current FPR registers are only ever used for floating point numbers, but that when overlaid with the VR registers they will, for certain VR instructions, also contain integers, i.e. non-floating point numbers?
2. Can anyone with more historical Hercules information perhaps offer some insight as to why the FPR instructions were implemented using the "softfloat" external package vs. using the host's IEEE 754 floating point support like available on e.g. X86-64 and ARM?
3. Is it the intention to keep using "softfloat" also for the VR instructions (instead of the host's IEEE 754 floating point support)?
4. If the answer to question #3 above is no, would there be any hope of using the host's SIMD instructions to implement (at least some of) the IBM VR instructions?

Thanks !

Cheers,

Peter

P.S.: I'll be off-line next week.

from hyperion.

salva-rczero commented on July 21, 2024

@mcisho: Can you explain to me why this line for LITTLE-ENDIAN?

int iv = ~(_v) & 0x1f;

from hyperion.

mcisho commented on July 21, 2024

@salva-rczero: Whoops, confusion on my part, taking little endian way too far! You are quite right, the register number does not need to be flipped. Well spotted.

@Peter-J-Jansen:

1. At any one instant a 128-bit VR/FPR will contain either 128 bits of vector register data, or 64 bits of floating point data and 64 bits of unpredictable data. If an FPR instruction was the last thing to place a value into the VR/FPR, the VR/FPR will contain a floating point value. If a VR instruction was the last thing to place a value into the VR/FPR, the VR/FPR might contain a string, an integer value, a decimal value, or even a floating point value. I'm not clear whether the vector register elements have to be the same type, or even size.
2. I don't know.
3. I simply intend to change those statements that use regs->fpr to use regs->FPR_x, i.e. Hercules will still use "softfloat".
4. I don't know what @salva-rczero plans for the future, or whether the host's SIMD is part of those plans.

from hyperion.

mcisho commented on July 21, 2024

Can you please have a look at the revised attached proposal of the Hercules changes for shared zVector/FP registers, with the corrections for the errors pointed out by @salva-rczero. Yet again, all comments, suggestions, etc. are welcome.

Shared-VR-and-FPR-proposal.txt

p.s. Fish, how did you add the bullet point before the link? I can't see it in the Github formatting syntax.

from hyperion.

salva-rczero commented on July 21, 2024

@mcisho While I appreciate your effort, I really don't understand the need for all these macros.
Currently the working code only needs:

#define VR_B(_v,_i)     regs->vr[(_v)].B[(_i)]
#define VR_H(_v,_i)     regs->vr[(_v)].H[(_i)]
#define VR_F(_v,_i)     regs->vr[(_v)].F[(_i)]
#define VR_G(_v,_i)     regs->vr[(_v)].G[(_i)]

We would only need to add a little-endian mode:

#define VR_B(_v,_i)     regs->vr[(_v)].B[15-(_i)]
#define VR_H(_v,_i)     regs->vr[(_v)].H[7-(_i)]
#define VR_F(_v,_i)     regs->vr[(_v)].F[3-(_i)]
#define VR_G(_v,_i)     regs->vr[(_v)].G[1-(_i)]

from hyperion.

salva-rczero commented on July 21, 2024

@Peter-J-Jansen The first goal is to get it working, but yes, I have thought about using x86 SIMD for performance. In fact, a couple of Galois arithmetic instructions already use it.

from hyperion.

Fish-Git commented on July 21, 2024

p.s. Fish, how did you add the bullet point before the link? I can't see it in the Github formatting syntax.

Asterisk or dash (minus sign) followed by a blank, which is the markdown code for an unordered list:

* one,
* two,
* buckle...
  - ...my...
    - ...shoe.

from hyperion.

Fish-Git commented on July 21, 2024

2. Can anyone with more historical Hercules information perhaps offer some insight as to why the FPR instructions were implemented using the "SoftFloat" external package vs. using the host's IEEE 754 floating point support like available on e.g. X86-64 and ARM?

I believe Steve Orso (@srorso) would probably be the best person to answer this question, but as I recall, it was basically because of 2 things:

1. A compiler's IEEE 754 floating point instruction/hardware support did not behave the same way as what the architecture (Principles of Operation) required out-of-the-box. I believe it had mostly to do with rounding modes. In order to use the host CPU's IEEE 754 floating point hardware/instructions support, you would have to set the proper rounding mode beforehand, which might prove to be tricky.

   (Usually you define your desired rounding mode as a compiler option and the compiler uses/presumes that rounding mode for all instruction sequences that it generates. To change the rounding mode dynamically (at run time), one would have to insert hardware instructions to change the desired default rounding mode beforehand, which, as I said, might prove to be tricky when the compiler is the one deciding which FP instruction sequences to generate and in which order they are to be executed.)

2. I also seem to recall that IBM's Principles of Operation also defined several new non-standard rounding modes as well. That is to say, the formal specification of how IEEE 754 floating point was to behave (with respect to its defined rounding modes) either differed slightly from IBM's definition, and/or IBM defined in their architecture several new rounding modes that were not formally defined in the official IEEE 754 floating point specification.

So, in order to support those new and/or different rounding modes, SoftFloat would need to be used anyway, so why not just keep it simple and use SoftFloat for everything?

But those are just guesses. The truth is, I don't remember what the real reason(s) was/were. Ask Steve. He might remember the details better than me since I believe he did a lot of work on our SoftFloat code.

from hyperion.

mcisho commented on July 21, 2024

@mcisho While I appreciate your effort, I really don't understand the need for all these macros.

I proposed the macros as an aid for endianness, but if you think they are superfluous that's fine, I'll forget about them.

The most important thing is we all agree on how the shared VR/FPR are defined in REGS.

from hyperion.

mcisho commented on July 21, 2024

Can you please have a look at the fourth and hopefully final revision of the proposal of the Hercules changes for shared zVector/FP registers. The superfluous stuff has been removed, and the suggestions from @salva-rczero have been incorporated. As always, all comments, suggestions, etc. are welcome.

Shared-VR-and-FPR-proposal-4.txt

from hyperion.

mcisho commented on July 21, 2024

As it appears that no one disagrees with the proposal I will proceed. In the next few days I will branch the SDL-Hercules-390 hyperion develop branch into a branch named sharedvfp, where the changes to the floating-point instructions will be implemented.

The z/Architecture Principles of Operation says:

"Whenever a floating-point instruction or floating point support instruction writes to a floating point register, or a floating point instruction that reads a register pair reads from floating-point registers, bits 64-127 of the corresponding vector register are unpredictable.".

However, in a March 2015 presentation to SHARE titled "z13 Vector Extension Facility (SIMD)", IBM said:

"Be very aware that any use of a FPR will change all 16 bytes of the corresponding VR (this includes even LD)".

Empirical evidence from instructions executed on a z15 shows that use of a FPR changes bits 64-127 of the corresponding VR to zero.

So should Hercules set bits 64-127 of the corresponding VR to zero, or leave them unchanged (i.e. unpredictable), when an instruction writes to a FPR? Leaving the bits unchanged is simpler and less prone to coding error, but Hercules wouldn't be emulating the actions of real machines (or at least the machines to date).

from hyperion.

salva-rczero commented on July 21, 2024

@mcisho Great!

As soon as you make the branch and push the changes to esa390.h and structs.h, I'll start changes to zvector.c for endianness independence.

On bits 64-127, I would prefer to leave them unchanged. IMHO, Hercules should mimic z/Arch, not real machines.

Regards, salva.

from hyperion.

Fish-Git commented on July 21, 2024

So should Hercules set bits 64-127 of the corresponding VR to zero, or leave them unchanged (i.e. unpredictable), when an instruction writes to a FPR? Leaving the bits unchanged is simpler and less prone to coding error, but Hercules wouldn't be emulating the actions of real machines (or at least the machines to date).

I agree 100% with Salva. Hercules does not -- and indeed IMHO should not -- try to emulate any particular model of mainframe, whether manufactured by IBM or anyone else. Its sole responsibility is to accurately emulate the published mainframe architecture as defined in the Principles of Operation.

The behavior of mainframes varies from model to model. The behavior of the architecture does not.

Stick to the architecture.

from hyperion.

mcisho commented on July 21, 2024

The sharedvfp branch has been created, and the esa390.h and hstructs.h changes have been pushed.

Please note that the REGS structure still contains the old U32 fpr[32] variable. It will be removed when the numerous references to it have all been changed to the new shared QW vfp[32] variable.

from hyperion.

salva-rczero commented on July 21, 2024

I've just created a pull request for the changes needed for vector instructions (E7xx).

from hyperion.

salva-rczero commented on July 21, 2024

@Fish-Git Will you provide the changes for U128, vfetch16, vstore16... from swap128 or should I do it myself?

Thanks in advance.

from hyperion.

mcisho commented on July 21, 2024

The FP instructions using the shared zVR/FPR are complete, and the tests that we have pass. All of the changes have been committed to the sharedvfp branch.

from hyperion.

Fish-Git commented on July 21, 2024

Does anyone object, feel it's premature, etc?

No objection here! Sounds like a good plan to me!

from hyperion.

JamesWekel commented on July 21, 2024

It's a bit early as I just tried a build on my Raspberry PI 5 (Armbian Jammy Linux) with Bill's Hercules-helper script. The build failed.

there were hundreds of warnings such as:

  CCLD     libhercu.la
In file included from ../hstructs.h:61,
                 from ../hercules.h:102,
                 from ../cardpch.c:14:
../opcode.h:2091:1: warning: multi-line comment [-Wcomment]
 2091 | //    #define FPR2I(_r)     /* Convert fpr to index */                    \
      | ^
../opcode.h:2098:1: warning: multi-line comment [-Wcomment]
 2098 | //    #define FPREX         /* Offset of extended register */             \
      | ^
../opcode.h:2157:1: warning: multi-line comment [-Wcomment]
 2157 | //  #define REFRESH_READ_VR(_vr)                                          \
      | ^
../opcode.h:2166:1: warning: multi-line comment [-Wcomment]
 2166 | //  #define REFRESH_UPDATE_VR(_vr)

undefined references to:

/usr/bin/ld: ./.libs/libherc.so: undefined reference to `z900_vector_multiply_and_add_high'
/usr/bin/ld: ./.libs/libherc.so: undefined reference to `z900_vector_multiply_logical_high'
/usr/bin/ld: ./.libs/libherc.so: undefined reference to `z900_vector_galois_field_multiply_sum_and_accumulate'
/usr/bin/ld: ./.libs/libherc.so: undefined reference to `z900_vector_subtract_compute_borrow_indication'
/usr/bin/ld: ./.libs/libherc.so: undefined reference to `z900_vector_subtract_with_borrow_indication'
/usr/bin/ld: ./.libs/libherc.so: undefined reference to `z900_vector_shift_left'
/usr/bin/ld: ./.libs/libherc.so: undefined reference to `z900_vector_select'

for every instruction in zvector.c. Missing an automake to update the makefiles?

What OS's have been tested? Any big endian systems? Mac OS?

My preference would be to continue with sharedvfp to allow more development and more testing. salva-rczero provided the foundation for the E7 vector instructions. I've started on the E6 instructions.

Jim

from hyperion.

Fish-Git commented on July 21, 2024

Should be fixed now.

from hyperion.

JamesWekel commented on July 21, 2024

Fish,

Hercules-helper builds the sharedvfp branch but I still get a lot of warning messages.

The multi-line comment warnings are still issued; eg.

  CC       transact.lo
In file included from ../machdep.h:40,
                 from ../opcode.h:855,
                 from ../hstructs.h:61,
                 from ../hercules.h:102,
                 from ../trace.c:28:
../opcode.h:2092:1: warning: multi-line comment [-Wcomment]
 2092 | //    #define FPR2I(_r)     /* Convert fpr to index */                    \
      | ^
../opcode.h:2099:1: warning: multi-line comment [-Wcomment]
 2099 | //    #define FPREX         /* Offset of extended register */             \
      | ^
../opcode.h:2160:1: warning: multi-line comment [-Wcomment]
 2160 | //  #define REFRESH_READ_VR(_vr)                                          \
      | ^
../opcode.h:2169:1: warning: multi-line comment [-Wcomment]
 2169 | //  #define REFRESH_UPDATE_VR(_vr)                                        \
      | ^

and lots of new warnings from zvector.c.

In file included from ../zvector.c:2167:
../zvector.c: In function ‘z900_vector_load_element_8’:
../zvector.c:26:17: warning: variable ‘m3’ set but not used [-Wunused-but-set-variable]
   26 |     int     v1, m3, x2, b2;
      |                 ^~
../zvector.c:26:13: warning: variable ‘v1’ set but not used [-Wunused-but-set-variable]
   26 |     int     v1, m3, x2, b2;
      |             ^~
../zvector.c: In function ‘z900_vector_load_element_16’:
../zvector.c:43:17: warning: variable ‘m3’ set but not used [-Wunused-but-set-variable]
   43 |     int     v1, m3, x2, b2;
      |                 ^~
../zvector.c:43:13: warning: variable ‘v1’ set but not used [-Wunused-but-set-variable]
   43 |     int     v1, m3, x2, b2;
      |             ^~

As I'm using GCC on Linux, I've added

/* ============================================= */
/* TEMPORARY while zvector2.c is being developed */

#pragma GCC diagnostic ignored "-Wunused-variable"
#pragma GCC diagnostic ignored "-Wunused-but-set-variable"
#pragma GCC diagnostic ignored "-Wcomment"

/* ============================================= */

to zvector.c and my zvector2.c (E6 vector instructions) to get rid of some of the noise.

Jim

from hyperion.

Fish-Git commented on July 21, 2024

As I'm using GCC on Linux

Stop using gcc, and use clang instead. gcc sucks.

from hyperion.

pnoliveir commented on July 21, 2024

Built this yesterday on Windows 10 and tried to boot z/OS 2.5 but wasn't successful. Got CPU spins when running Java (my installation runs several Java processes at IPL end). Ended up issuing stopall on the Hercules console after getting tired of writing .0,abend to retry the operation.
Still it fixed an issue with IMS also during IPL.
Compiled with MS Studio 22 and with Bill Lewis' scripts.
Machine is Windows 10, 2 E2667v4 CPUs, 256GB DDR4 RAM.
Will try the develop branch (issuing FAC ENA 192 before IPL).

from hyperion.

salva-rczero commented on July 21, 2024

@pnoliveir The zVector facility is being developed as a proof of concept. For now only the infrastructure is complete; it does not yet execute any zVector instructions. In the next few days we will add some of them for testing, but that is still far from stable execution, and even further from running a full z/OS.

from hyperion.

JamesWekel commented on July 21, 2024

salva-rczero:

In htypes.h, you have defined U128 as:
typedef __m128i U128; // unsigned 128-bits

but __m128i is only defined when Intel (X86_64) intrinsic definitions are included. They are included for MSVC and when GCC/Clang recognize that X86_64 SSE2 is available.

Similarly, in inline.h, bswap_128 is defined as:

/*-------------------------------------------------------------------*/
/* Change endianness of a 128bits/16bytes integer                    */
/*-------------------------------------------------------------------*/
inline U128 bswap_128( U128 input )
{
    __m128i swapmask = _mm_set_epi8( 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 );
    __m128i swapped, work = input;
    _mm_storeu_si128( &swapped, _mm_shuffle_epi8( _mm_loadu_si128( &work ), swapmask ));
    return swapped;
}

but __m128i, _mm_set_epi8, etc, are not available on other CPU architectures such as ARM AArch64 (Raspberry PI, newer Macs, etc).

This brings up a whole other discussion of when/where intrinsics are used for optimization (for later).

Jim

Addendum:
Using hercules-helper to build the branch on a Raspberry PI 5, results in:

  CC       hdl.lo
In file included from ../hstdinc.h:312,
                 from ../cckdcdsk.c:13:
../htypes.h:112:10: error: unknown type name ‘__m128i’
  112 | typedef  __m128i    U128;       // unsigned 128-bits
      |          ^~~~~~~
make[2]: *** [Makefile:2621: cckdcdsk.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from ../hstdinc.h:312,
                 from ../hdl.c:10:
../htypes.h:112:10: error: unknown type name ‘__m128i’
  112 | typedef  __m128i    U128;       // unsigned 128-bits
      |          ^~~~~~~
In file included from ../hstdinc.h:312,
                 from ../codepage.c:9:
../htypes.h:112:10: error: unknown type name ‘__m128i’
  112 | typedef  __m128i    U128;       // unsigned 128-bits
      |          ^~~~~~~
In file included from ../hstdinc.h:312,
                 from ../hsys.c:8:
../htypes.h:112:10: error: unknown type name ‘__m128i’
  112 | typedef  __m128i    U128;       // unsigned 128-bits
      |          ^~~~~~~

from hyperion.

Fish-Git commented on July 21, 2024

but __m128i, _mm_set_epi8, etc, are not available on other CPU architectures such as ARM AArch64 (Raspberry PI, newer Macs, etc).

Quite right. That is something that will need to eventually be fixed.

This brings up a whole other discussion of when/where intrinsics are used for optimization (for later).

Agreed. Maintaining Hercules's portability is indeed a challenge.

from hyperion.

salva-rczero commented on July 21, 2024

I also agree, the first goal should be emulation and not performance. I will change __m128i to a compatible type and keep the _mm_* functions as an alternative, used when SSEx is available, alongside direct implementations in C.

Most of the instructions are implemented in C using loops over the vector elements and do not use SIMD, except for two Galois arithmetic instructions, for which I will have to investigate how to implement them in plain C.

Regards, salva.

from hyperion.

salva-rczero commented on July 21, 2024

@JamesWekel Jim, can you please clone https://github.com/salva-rczero/hyperion-sharedvfp.git and try to compile for ARM, Raspberry, Mac or any other cpu.

Thanks!

from hyperion.

JamesWekel commented on July 21, 2024

Salva,

Using Bill's hercules-helper for your https://github.com/salva-rczero/hyperion-sharedvfp.git repository, the build was successful. There were lots of warnings, such as:

../zvector.c: In function ‘z900_vector_load’:
../zvector.c:184:17: warning: variable ‘m3’ set but not used [-Wunused-but-set-variable]
  184 |     int     v1, m3, x2, b2;
      |                 ^~
../zvector.c: In function ‘z900_vector_store’:
../zvector.c:314:17: warning: variable ‘m3’ set but not used [-Wunused-but-set-variable]
  314 |     int     v1, m3, x2, b2;
      |                 ^~
../zvector.c: In function ‘z900_vector_load_multiple’:
../zvector.c:620:25: warning: variable ‘m4’ set but not used [-Wunused-but-set-variable]
  620 |     int     v1, v3, b2, m4, i;
      |                         ^~

To cut down on some noise during my development, I've added the following at the top of your zvector.c and my zvector2.c files:

/* ============================================= */
/* TEMPORARY while zvector2.c is being developed */
#if defined(__GNUC__)
    #pragma GCC diagnostic ignored "-Wunused-variable"
    #pragma GCC diagnostic ignored "-Wunused-but-set-variable"
    #pragma GCC diagnostic ignored "-Wcomment"
#endif
/* ============================================= */

I'll be offline until late tomorrow.

Jim

from hyperion.

salva-rczero commented on July 21, 2024

Great! thanks.

from hyperion.

Fish-Git commented on July 21, 2024

There were lots of warnings, such as:

../zvector.c: In function ‘z900_vector_load’:
../zvector.c:184:17: warning: variable ‘m3’ set but not used [-Wunused-but-set-variable]
  184 |     int     v1, m3, x2, b2;
      |                 ^~

If you attach the complete list, I'm sure we could get them fixed.

Also, as I hope you are aware, such warnings do not occur when clang is used instead of gcc.

from hyperion.

wrljet commented on July 21, 2024

Fish, you aren't really ready to just state that gcc is not supported, are you?

from hyperion.

Fish-Git commented on July 21, 2024

Fish, you aren't really ready to just state that gcc is not supported, are you?

No. Just that it's an inferior product.

from hyperion.

wrljet commented on July 21, 2024

Most of the planet runs on gcc compiled code.

from hyperion.

Fish-Git commented on July 21, 2024

Most of the planet runs on gcc compiled code.

Do you really want to do this?? :)

from hyperion.

wrljet commented on July 21, 2024

I don't want to do anything. But Hercules should continue to build and run from gcc.

from hyperion.

wrljet commented on July 21, 2024

@JamesWekel Jim, can you please clone https://github.com/salva-rczero/hyperion-sharedvfp.git and try to compile for ARM, Raspberry, Mac or any other cpu.

Thanks!

Which of "yous guy" have Raspberry Pi(s) or Macs?

from hyperion.

wrljet commented on July 21, 2024

Fish,

I meant argue with me regarding the quality of gcc, given that you know how I feel about the product. I'm not saying we should drop support for gcc. Just that IMHO it's far too "chatty". It throws warnings for every little itty bitty thing -- even things which IMHO do not merit a warning -- which IMHO can not only be extremely annoying but also an impediment to development, given the desire to have a product that builds with no warnings or errors, thereby triggering unnecessary coding changes just to silence them.

No, I don't need to argue with you. gcc speaks for itself. :-)

That doesn't mean that gcc isn't still the standard compiler on most *nix systems.
And we need to support it one-way-or-the-other, as much as it pisses us off.

Bill

from hyperion.

wrljet commented on July 21, 2024

Yes, right, correct.

from hyperion.

salva-rczero commented on July 21, 2024

There were lots of warnings, such as:
../zvector.c: In function ‘z900_vector_load’:
../zvector.c:184:17: warning: variable ‘m3’ set but not used [-Wunused-but-set-variable]
  184 |     int     v1, m3, x2, b2;
      |                 ^~
If you attach the complete list, I'm sure we could get them fixed.

Most of the unused warnings are due to pending implementation:

//
// TODO: insert code here
//
if (1) ARCH_DEP( program_interrupt )( regs, PGM_OPERATION_EXCEPTION );

So, "pragma ignore" may be a good temporary solution.

But in a few of them (vector load, vector store, ...) m3/m4 are alignment hints for real mainframe hardware, which I believe should not affect Hercules.

I tried:

#if defined(__GNUC__)
    int m3 __attribute__((unused)); // Alignment hint
#else
   int m3;
#endif

and it works. Not sure if too ugly.

Another option, may be to write extra flavors for DECODERS macros. Uglier?

I'll appreciate your comments.

from hyperion.

Fish-Git commented on July 21, 2024

Most of the unused warnings are due to pending implementation

Quite right. So unless they really bother you, I suggest just ignoring them for now. They should all go away once implementation is complete.

So, "pragma ignore" may be a good temporary solution.

Agreed. With emphasis on temporary.

I tried:

#if defined(__GNUC__)
    int m3 __attribute__((unused)); // Alignment hint
#else
   int m3;
#endif

and it works. Not sure if too ugly.

Too ugly.

A better solution would be to simply use our existing UNREFERENCED(x) macro.

Another option may be to write extra flavors of the DECODER macros.

Oh HELL no!

Uglier?

Definitely!

I'll appreciate your comments.

You have them. :)
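
For reference, the UNREFERENCED(x) idea discussed above can be sketched in a few lines. This is only an illustration: `UNREFERENCED_SKETCH`, `vector_load_sketch` and the field layout of the instruction word are made-up names and assumptions, and the real Hercules macro may well be defined differently.

```c
#include <stdint.h>

/* Stand-in for illustration only. Hercules has its own UNREFERENCED(x)
   macro; its actual definition may differ from this one. */
#define UNREFERENCED_SKETCH( x )    ((void)(x))

/* Hypothetical handler: pulls two 4-bit fields out of a made-up
   instruction word. m3 (the alignment hint) is decoded but ignored. */
int vector_load_sketch( uint32_t insn )
{
    int v1, m3;

    v1 = (int)((insn >> 4) & 0x0F);
    m3 = (int)( insn       & 0x0F);

    /* Marks m3 as intentionally unused, so the compiler has no
       "set but not used" complaint to make, with no #if needed. */
    UNREFERENCED_SKETCH( m3 );

    return v1;
}
```

The point of the macro form is that the declaration stays a plain `int v1, m3;` on every compiler, and the intent ("decoded but deliberately ignored") is visible at the point of use.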

from hyperion.

mcisho commented on July 21, 2024

@salva-rczero Well done, you're making good progress. Linux used to panic about VL and VA instructions, now it's moved on to VSLDB (which interestingly it recovers from) and VERLL instructions.

Now that the FP changes appear to be done, I'm volunteering to help with some of the 80-odd instructions still to be completed.

from hyperion.

salva-rczero commented on July 21, 2024

@mcisho What's your testing environment? I tried with Linux x86 and it works until it reaches a PGM_OPERATION_EXCEPTION for an unimplemented instruction.

VSLDB must be working, VERLL should throw a PGM_OPERATION_EXCEPTION.

from hyperion.

JamesWekel commented on July 21, 2024

wrljet Which of "yous guy" have Raspberry Pi(s) or Macs?

I have a Raspberry PI 4, a Raspberry PI 5 and Intel Nuc I5 all running "Armbian 24.2.5 jammy" Ubuntu.

Jim

from hyperion.

mcisho commented on July 21, 2024

@salva-rczero My host is Fedora 40 x86_64 and the guest was Fedora 36 s390x. As you say, it works until it reaches a non implemented instruction, but that non implemented instruction takes longer to get to. However, having looked a little more closely at the log VSLDB appears not to be working, whereas VERLL does.

[    2.229015] Linux version 6.2.15-100.fc36.s390x ([email protected]) (gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4), GNU ld version 2.37-37.fc36) #1 SMP Thu May 11 15:47:55 UTC 2023
[    2.229046] setup: Linux is running natively in 64-bit mode
  ::
[    3.830476] Key type asymmetric registered
[    3.830510] Asymmetric key parser 'x509' registered
HHC00801I Processor CP00: Operation exception interruption code 0001 ilc 6
HHC02324I PSW=0704E00180000000 0000000024636FF0 INST=E72220080077 VSLDB 2,2,2,8,0              vector_shift_left_double_by_byte
HHC02326I V:0000000000FA8008:R:0000000000FA8008:K:06=00000000 00000000 00000000 00000000  ................
HHC02326I V:0000000000000077:R:0000000000000077:K:06=00 00000000 00000000 00602000 000010 ..........-.....
[    5.997707] Freeing initrd memory: 17292K
[    6.061102] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
[    6.061509] io scheduler mq-deadline registered
[    6.061546] io scheduler kyber registered
[    6.061890] io scheduler bfq registered
[    6.086454] illegal operation: 0001 ilc:3 [#1] SMP
[    6.086523] Modules linked in:
[    6.086561] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.15-100.fc36.s390x #1
[    6.086614] Hardware name: HRC 2817 EMULATOR EMULATOR (LPAR)
[    6.086649] Krnl PSW : 0704e00180000000 0000000024636ff6 (chacha20_vx+0x296/0x820)
[    6.086737]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[    6.086811] Krnl GPRS: 0000037f0000000a ffffffffffffff60 0000000000fa8000 000000002600b0a6
[    6.086861]            0000000000000109 0000037fffb1bc68 0000037fffb1bc88 0000000025c71780
[    6.086909]            0000037fffb1bc68 000000002600b0a6 0000000000fa8000 0000000000000109
[    6.086955]            00000000006bc200 0000000000000109 00000000246364c4 0000037fffb1b728
[    6.087058] Krnl Code: 0000000024636fe4: e71100072c33        verll   %v17,%v17,7,2
[    6.087058]            0000000024636fea: e75500072c33        verll   %v21,%v21,7,2
[    6.087058]           #0000000024636ff0: e72220080077        vsldb   %v2,%v2,%v2,8
[    6.087058]           >0000000024636ff6: e76660080077        vsldb   %v6,%v6,%v6,8
[    6.087058]            0000000024636ffc: e7aaa0080077        vsldb   %v10,%v10,%v10,8
[    6.087058]            0000000024637002: e7eee0080077        vsldb   %v14,%v14,%v14,8
[    6.087058]            0000000024637008: e72220080e77        vsldb   %v18,%v18,%v18,8
[    6.087058]            000000002463700e: e76660080e77        vsldb   %v22,%v22,%v22,8
[    6.087508] Call Trace:
[    6.087575]  f6>] chacha20_vx+0x296/0x820
[    6.087634] Last Breaking-Event-Address:
[    6.087663]  e>] chacha20_crypt_s390.constprop.0+0x6e/0xe0
[    6.087724] ---[ end trace 0000000000000000 ]---
[    6.087764] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

from hyperion.

salva-rczero commented on July 21, 2024

@mcisho Ok, VSLDB throws an OPERATION_EXCEPTION. Same for me:

17:27:36 HHC00801I Processor CP00: Operation exception interruption code 0001 ilc 6
17:27:36 HHC02324I PSW=0000000180000000 0000000000000322 INST=E72220080077 VSLDB 2,2,2,8,0              vector_shift_left_double_by_byte
17:27:36 HHC02326I R:0000000000000008:K:06=00000000 00000000 00000000 00000000  ................
17:27:36 HHC02326I R:0000000000000077:K:06=30 000A0000 00000038 00000000 000000 ................
17:27:36 HHC02269I R0=0000000000000000 R1=0000000000000000 R2=0000000000000000 R3=0000000000000000
17:27:36 HHC02269I R4=0000000000000000 R5=0000000000000000 R6=0000000000000000 R7=0000000000000000
17:27:36 HHC02269I R8=0000000000000000 R9=0000000000000000 RA=0000000000000000 RB=0000000000000000
17:27:36 HHC02269I RC=0000000000000200 RD=0000000000001200 RE=0000000000000000 RF=0000000000000000
17:27:36 HHC02266I VR00=0123456789abcdef.fedcba9876543210 VR01=5555555555555555.5555555555555555
17:27:36 HHC02266I VR02=5555555555555555.5555555555555555 VR03=0000000000000000.0000000000000000
...

but not the kernel panic.
What is your test case?
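
For readers following along, here is a minimal sketch of what VSLDB (VECTOR SHIFT LEFT DOUBLE BY BYTE) computes: the second and third operands are concatenated into 32 bytes, shifted left by I4 bytes, and the leftmost 16 bytes become the first operand. The function name `vsldb_sketch` is hypothetical and this is not the Hercules implementation; in particular, masking I4 to 0-15 is an assumption made here for safety, not a statement about how reserved bits must be handled.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only: byte 0 is the leftmost (most significant) byte,
   matching the big-endian register view used in the PoP. */
void vsldb_sketch( uint8_t v1[16], const uint8_t v2[16],
                   const uint8_t v3[16], unsigned i4 )
{
    uint8_t tmp[32];

    /* Build the 32-byte concatenation V2 || V3 in a scratch buffer,
       so that v1 may safely be the same register as v2 or v3. */
    memcpy( tmp,      v2, 16 );
    memcpy( tmp + 16, v3, 16 );

    /* Shifting left by n bytes == taking 16 bytes starting at offset n.
       ASSUMPTION: only the low four bits of I4 are used. */
    memcpy( v1, tmp + (i4 & 0x0F), 16 );
}
```

With all three operands naming the same register and I4 = 8, as in the `vsldb %v2,%v2,%v2,8` from the kernel trace, the result is simply the two doublewords of the register swapped.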

from hyperion.

mcisho commented on July 21, 2024

I have various Linuxes to drive different Hercules network interfaces, and none of the Linuxes use vector instructions. I updated one of them the other day to a kernel that does use vector instructions, just to see what happens. So, not a test case exactly, simply an interest.

from hyperion.

salva-rczero commented on July 21, 2024

@mcisho ok, please update to the latest version.

On the other hand, the zVector facility includes 21 instructions with floating point functionality:

Vector FP Test Data Class Immediate
Vector FP Multiply and Subtract
Vector FP Multiply and Add
Vector FP Convert to Logical 64-bit
Vector FP Convert from Logical 64-bit
Vector FP Convert to Fixed 64-bit
Vector FP Convert from Fixed 64-bit
Vector FP Load Lengthened
Vector FP Load Rounded
Vector Load FP Integer
Vector FP Compare and Signal Scalar
Vector FP Compare Scalar
Vector FP Perform Sign Operation
Vector FP Square Root
Vector FP Subtract
Vector FP Add
Vector FP Divide
Vector FP Multiply
Vector FP Compare Equal
Vector FP Compare High or Equal
Vector FP Compare High

I started with the first one, "Vector FP Test Data Class Immediate", but I can't use the float128_t, get_sbfp, float32_class, get_float32... types and functions from ieee.c in zvector.c. Not sure if this is a good approach. I am very new to how Hercules works with floats.

Can you help me?

from hyperion.

salva-rczero commented on July 21, 2024

Current status

from hyperion.

Fish-Git commented on July 21, 2024

Ian and Salva,

Does it really matter how Hercules's current floating point logic works? All of our current Quality Assurance (runtest tests) for all of our existing non-vector floating point instructions all pass, yes?

I was under the impression that part of the new Vector Facility design was to update our existing floating point registers whenever a corresponding vector register was updated, and vice versa. Yes? That is to say, the only important thing is how the registers in hstructs.h are accessed, yes? And that was fixed (changed) several commits ago, yes? (i.e. the shared registers design: hstructs.h fpr was replaced with vfp instead, such that both normal floating point instructions AND vector instructions now both access the same internal registers storage).

And as I said, all existing floating point instructions are working perfectly! Right?

So what's the problem? What's the concern here? What am I missing?
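
To make the shared-register idea concrete: architecturally, floating point register n (0-15) is bits 0-63 of vector register n. A minimal sketch of that overlay follows. The type and function names (`vr_sketch`, `regs_sketch`, `fpr_get_sketch`, `fpr_set_sketch`) are hypothetical, and the layout is an assumption for illustration; the actual vfp member in hstructs.h may be organized differently (for example, to account for host endianness).

```c
#include <stdint.h>

/* One 128-bit vector register as two doublewords.
   ASSUMPTION: d[0] holds bits 0-63 (the leftmost half). */
typedef struct { uint64_t d[2]; } vr_sketch;

/* Hypothetical register file: 32 vector registers, no separate FPRs. */
typedef struct { vr_sketch vfp[32]; } regs_sketch;

/* An FPR read is just a read of the left half of the same-numbered VR. */
uint64_t fpr_get_sketch( const regs_sketch *regs, unsigned n )
{
    return regs->vfp[ n & 15 ].d[0];
}

/* An FPR write updates the left half of the same-numbered VR, so a
   later vector instruction sees the new value without any copying. */
void fpr_set_sketch( regs_sketch *regs, unsigned n, uint64_t value )
{
    regs->vfp[ n & 15 ].d[0] = value;
}
```

Because there is only one copy of the data, nothing has to be kept in sync: scalar FP instructions and vector instructions are simply two views of the same storage.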

from hyperion.

salva-rczero commented on July 21, 2024

@Fish-Git: FP & vector are working well in the sharedvfp branch.

But there are 21 new instructions that are vector and floating point (mostly BFP) at the same time. My concern is how to reuse the FP functionality (loads, arithmetic operations, format checks...) currently in ieee.c(?) from zvector.c.

Any ideas are welcome!

Thanks for your support.

from hyperion.

Fish-Git commented on July 21, 2024

But there are 21 new instructions that are vector and floating point (mostly BFP) at the same time. My concern is how to reuse the FP functionality (loads, arithmetic operations, format checks...) currently in ieee.c(?) from zvector.c.

Ah! Yes. I understand now.

Hmmmm...

Any ideas are welcome!

If you could identify what code (i.e. what functionality) that you need, then we might be able to tell you which existing floating point functions you need to call.

I would suggest coding something like:

/*-------------------------------------------------------------------*/
/* E7xx VXXX   - Vector whatever...                            [Vxx] */
/*-------------------------------------------------------------------*/
DEF_INST( vector_whatever )
{
    ... VFP stuff...

    /* Call BFP helper function to do whatever... */
    new_or_existing_ieee_bfp_function_to_do_something( ... variables to be passed and returned ... );

    ... continue with VFP stuff...
}

and then documenting the requirements for each of the needed functions (i.e. what they should do, the variables that should be passed to it, the values that should be returned, etc). That is to say, just make up some descriptive name and then define a dummy version of it somewhere. (Maybe with a few simple comments explaining what the function is supposed to accomplish.)

Then hopefully we (one of us, i.e. either Ian or myself or someone else) will hopefully be able to identify which existing FP/BFP helper function you need to call, and/or which new FP/BFP helper function we will need to create for you.

Does that make sense?

from hyperion.

salva-rczero commented on July 21, 2024

@Fish-Git: Absolutely!

E74A = VFTCI (vector_fp_test_data_class_immediate) checks whether the BFP data in a vector register matches one or more floating-point classes (positive infinity, subnormal number, NaN...).

These are exactly the same checks done by the float32_class, float64_class and float128_class functions in ieee.c.

I need to call these (or equivalent) functions from zvector.c.

I also need to be able to use/convert the float32_t, float64_t and float128_t types.

Thanks again.
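
As a rough illustration of the kind of check being asked for, the sketch below classifies a raw 64-bit BFP value into one of twelve class bits and tests it against a 12-bit mask. The names `bfp64_class_sketch` and `vftci64_element_sketch` are hypothetical and this is not the ieee.c code. The bit order (+zero leftmost, then -zero, +/-normal, +/-subnormal, +/-infinity, +/-QNaN, +/-SNaN) follows my understanding of the TEST DATA CLASS mask layout and should be verified against the Principles of Operation before relying on it.

```c
#include <stdint.h>

/* Returns a single class bit within a 12-bit mask (0x800 = +zero ...
   0x001 = -SNaN). ASSUMPTION: layout as described in the lead-in. */
uint32_t bfp64_class_sketch( uint64_t bits )
{
    unsigned neg  = (unsigned)(bits >> 63);          /* sign bit      */
    uint64_t exp  = (bits >> 52) & 0x7FF;            /* 11-bit exp    */
    uint64_t frac = bits & 0x000FFFFFFFFFFFFFULL;    /* 52-bit frac   */
    unsigned pair;   /* 0=zero 1=normal 2=subnormal 3=inf 4=QNaN 5=SNaN */

    if (exp == 0)
        pair = frac ? 2 : 0;
    else if (exp == 0x7FF)
        pair = !frac ? 3 : ((frac >> 51) & 1) ? 4 : 5;
    else
        pair = 1;

    /* Each class owns two adjacent bits: positive first, then negative */
    return 0x800u >> (2 * pair + neg);
}

/* Per-element result: all ones if the element's class is selected
   by the mask, otherwise all zeros. */
uint64_t vftci64_element_sketch( uint64_t bits, uint32_t i3 )
{
    return (bfp64_class_sketch( bits ) & i3) ? UINT64_MAX : 0;
}
```

A real implementation would also have to handle the short and extended formats and set the condition code across all elements; those parts are deliberately left out here.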

from hyperion.

Fish-Git commented on July 21, 2024

Well, the following patch illustrates a VERY Quick and VERY Dirty way to accomplish it, by simply making zvector.c an integral part of ieee.c itself:

--- hyperion-vect-1/ieee.c	2024-04-29 23:15:42.230119500 -0700
+++ hyperion-vect-0/ieee.c	2024-05-11 17:41:00.407165100 -0700
@@ -5587,6 +5587,17 @@
 
 #endif /* defined( FEATURE_BINARY_FLOATING_POINT ) */
 
+// PROGRAMMING NOTE: the following essentially makes source file
+// "zvector.c" an integral part of "ieee.c" (i.e. of ourselves),
+// which allows "zvector.c" to more conveniently directly access
+// any function or type or constant, etc, defined within ourself
+// since it ("zvector.c") is essentially just a part of ourself.
+
+#undef  INCLUDING_FROM_IEEE_C
+#define INCLUDING_FROM_IEEE_C
+#include "zvector.c"
+#undef  INCLUDING_FROM_IEEE_C
+
 /*-------------------------------------------------------------------*/
 /*          (delineates ARCH_DEP from non-arch_dep)                  */
 /*-------------------------------------------------------------------*/
--- hyperion-vect-1/zvector.c	2024-05-09 18:57:31.000483500 -0700
+++ hyperion-vect-0/zvector.c	2024-05-11 17:42:04.367277400 -0700
@@ -1,5 +1,6 @@
 /* ZVECTOR.C    (C) Copyright Jan Jaeger, 1999-2012                  */
 /*              (C) Copyright Roger Bowler, 1999-2012                */
+/*              (C) Copyright Salva rczero(?), 2024                  */
 /*              z/Arch Vector Operations                             */
 /*                                                                   */
 /*   Released under "The Q Public License Version 1"                 */
@@ -9,13 +10,14 @@
 /* Interpretive Execution - (C) Copyright Jan Jaeger, 1999-2012      */
 /* z/Architecture support - (C) Copyright Jan Jaeger, 1999-2012      */
 
-#include "hstdinc.h"
-#define _ZVECTOR_C_
-#define _HENGINE_DLL_
+// PROGRAMMING NOTE: the following essentially makes ourselves
+// ("zvector.c") an integral part of "ieee.c", allowing ourselves
+// ("zvector.c") to more conveniently directly access any function
+// or type or constant, etc, defined in "ieee.c" (since we are
+// essentially a part of it).
 
-#include "hercules.h"
-#include "opcode.h"
-#include "inline.h"
+#include "hstdinc.h"
+#if defined( INCLUDING_FROM_IEEE_C )
 
 #if defined( FEATURE_129_ZVECTOR_FACILITY )
 /*-------------------------------------------------------------------*/
@@ -977,6 +979,21 @@
     //
     // TODO: insert code here
     //
+
+
+// example call to float128_class function defined in ieee.c
+    {
+        float128_t  op1;
+        U32         float_class;
+
+        GET_FLOAT128_OP( op1, v1, regs );
+
+        float_class = float128_class( op1 );
+    }
+
+
+
+
     if (1) ARCH_DEP( program_interrupt )( regs, PGM_OPERATION_EXCEPTION );
     //
     ZVECTOR_END( regs );
@@ -3493,17 +3510,4 @@
 
 #endif /* defined( FEATURE_129_ZVECTOR_FACILITY ) */
 
-#if !defined( _GEN_ARCH )
-
-  #if defined(              _ARCH_NUM_1 )
-    #define   _GEN_ARCH     _ARCH_NUM_1
-    #include "zvector.c"
-  #endif
-
-  #if defined(              _ARCH_NUM_2 )
-    #undef    _GEN_ARCH
-    #define   _GEN_ARCH     _ARCH_NUM_2
-    #include "zvector.c"
-  #endif
-
-#endif /*!defined(_GEN_ARCH)*/
+#endif // defined( INCLUDING_FROM_IEEE_C )

Granted, it's ugly as hell, but hey, it works! :)

from hyperion.

salva-rczero commented on July 21, 2024

I was afraid of that. Given my poor understanding of the include structure in Hercules, I would prefer another approach.

How about splitting ieee.c into ieee.h + ieee.c, exposing the necessary types, macros and function declarations, and then including ieee.h from zvector.c?
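
A minimal sketch of that split, collapsed into one listing for brevity. Everything here is hypothetical (`float32_sketch_t`, `float32_is_nan_sketch`, `vector_element_is_nan_sketch`, and the header name in the comments); it only shows the shape of the change. Whether the existing ieee.c helpers can be exposed this simply depends on how they are declared there and on how the file is compiled per architecture, which this sketch does not address.

```c
#include <stdint.h>

/* ---- would live in a new header (hypothetical "ieee_sketch.h") ---- */
typedef struct { uint32_t v; } float32_sketch_t;    /* raw BFP short   */
int float32_is_nan_sketch( float32_sketch_t op );   /* prototype only  */

/* ---- would stay in the .c file that owns the implementation ------- */
int float32_is_nan_sketch( float32_sketch_t op )
{
    /* NaN: exponent all ones and a non-zero fraction */
    return (op.v & 0x7F800000u) == 0x7F800000u
        && (op.v & 0x007FFFFFu) != 0;
}

/* ---- would live in the vector source file, including the header --- */
int vector_element_is_nan_sketch( uint32_t element )
{
    float32_sketch_t op = { element };
    return float32_is_nan_sketch( op );
}
```

The caller only needs the type and the prototype; the implementation stays where it is, which is what makes this less invasive than textually including one source file in another.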

from hyperion.

mcisho commented on July 21, 2024

Alternatively, the vector fp instructions could be moved from zvector.c to ieee.c? Might be simpler than splitting ieee.c?

As an aside the following instructions probably should have the 64_bit removed from their function names.

DEF_INST( vector_fp_convert_to_logical_64_bit )
DEF_INST( vector_fp_convert_from_logical_64_bit )
DEF_INST( vector_fp_convert_to_fixed_64_bit )
DEF_INST( vector_fp_convert_from_fixed_64_bit )

Vector-enhancements facilities 1 and 2 introduce new versions of the instructions that use 32-bit elements.

Addendum

I've tried moving the vector fp instructions to ieee.c, and it seems to work, the various types and functions are usable.

from hyperion.

Fish-Git commented on July 21, 2024

How about splitting ieee.c into ieee.h + ieee.c, exposing the necessary types, macros and function declarations, and then including ieee.h from zvector.c?

Yes, that IS the correct way.

Alternatively, the vector fp instructions could be moved from zvector.c to ieee.c? Might be simpler than splitting ieee.c?

That would work too.

from hyperion.

Fish-Git commented on July 21, 2024

As an aside the following instructions probably should have the 64_bit removed from their function names.

Agreed.

from hyperion.

mcisho commented on July 21, 2024

I have attached my changes to ieee.c and zvector.c so that you can see what I have done so far, and discuss/decide whether we should continue on this path? The changes to ieee.c add the vector fp instructions and implement some of them, the changes to zvector.c remove the vector fp instructions and add some comments re where they can be found.

ieee_and_zvector.zip

from hyperion.

Fish-Git commented on July 21, 2024

QUICK QUESTION:

Is the sharedvfp branch obsolete now? That is to say, is all current VFP development now being done in the normal develop branch now? Is the sharedvfp branch "finished"? Has the reason (purpose) for its creation been completed now? I just need some clarity on this. Thanks!

from hyperion.

Fish-Git commented on July 21, 2024

I have attached my changes to ieee.c and zvector.c so that you can see what I have done so far, and discuss/decide whether we should continue on this path?

Looks okay to me, Ian! And IMO yes, it seems to be a valid working path that we should probably continue on. I'm thinking the bulk of the Vector instructions should of course continue to be in zvector.c, with the few exceptions to the rule that deal with floating point moved into ieee.c just like you have them in your example .zip.

from hyperion.

mcisho commented on July 21, 2024

Is the sharedvfp branch obsolete now?

No. The develop branch doesn't have zVector support. If you want to try zVector you need to use the sharedvfp branch, and the latest commit of progress by @salva-rczero was to the sharedvfp branch.

from hyperion.

salva-rczero commented on July 21, 2024

QUICK QUESTION:

Is the sharedvfp branch obsolete now? That is to say, is all current VFP development now being done in the normal develop branch now? Is the sharedvfp branch "finished"? Has the reason (purpose) for its creation been completed now? I just need some clarity on this. Thanks!

For my part, I believe that my contribution to this project has come to an end. I have already warned that I do not have the necessary skills and I find everything related to the discussion/design very difficult. It is better to leave that task to those of you who know it.

Farewell and thank you very much for your time and advice (especially to @Fish-Git).

Good luck and long live Hercules!

from hyperion.

JamesWekel commented on July 21, 2024

I'm working on the E6 z/vector instructions, which require a lot of changes to the infrastructure, just as the E7 z/vector instructions did. My work is based on the sharedvfp branch. I'm hoping to be at a stable place late next week for a pull request for your review. It will need more review, as ecpsvm.c implements E6 instructions for S370 which overlap with the new E6 z/vector instructions.

The E6 instructions will be in zvector2.c. Rather than move vector decimal instructions to decimal.c, I was planning on changing some of the functions in decimal.c from static void to void with new function prototypes in opcode.h.

Do we have a consistent type definition for U128? For some instructions, I need to do 128 bit arithmetic.

Jim

from hyperion.

Fish-Git commented on July 21, 2024

I'm working on the E6 z/vector instructions ...

Thank you, James! I still say you should consider becoming an official Hercules developer. Your contributions over the past many months (past year?) have been invaluable.

Do we have a consistent type definition for U128?

AFAIK, type U128 does not exist in Hercules. gcc and clang both support the __int128 type, but unfortunately Microsoft's compiler still does not (even though people have been complaining about it for years now). :(
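
Since `__int128` is not available on every compiler, one portable fallback is to carry a 128-bit value as two 64-bit halves and propagate the carry by hand. The sketch below shows addition only; `u128_sketch` and `u128_add_sketch` are hypothetical names, not an existing or proposed Hercules type.

```c
#include <stdint.h>

/* 128-bit unsigned value as two 64-bit halves (hi = bits 0-63). */
typedef struct { uint64_t hi, lo; } u128_sketch;

/* Adds two 128-bit values modulo 2^128 using only 64-bit arithmetic. */
u128_sketch u128_add_sketch( u128_sketch a, u128_sketch b )
{
    u128_sketch r;

    r.lo = a.lo + b.lo;

    /* Unsigned wraparound: if the low sum is smaller than an input,
       a carry out of bit 63 occurred and must go into the high half. */
    r.hi = a.hi + b.hi + (r.lo < a.lo);

    return r;
}
```

Subtraction follows the same pattern with a borrow; multiplication and division take more work, which is where a compiler-provided 128-bit type is most missed.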

from hyperion.

mcisho commented on July 21, 2024

For my part, I believe that my contribution to this project has come to an end.

That's a pity, I thought you were doing a great job.

... I find everything related to the discussion/design very difficult.

Don't worry, you're not alone there.

from hyperion.

JamesWekel commented on July 21, 2024

mcisho

As part of pull request [https://github.com//pull/661], I have enabled the following features in feat900.h:

#define FEATURE_134_ZVECTOR_PACK_DEC_FACILITY
#define FEATURE_135_ZVECTOR_ENH_FACILITY_1
#define FEATURE_148_VECTOR_ENH_FACILITY_2
#define FEATURE_152_VECT_PACKDEC_ENH_FACILITY
#define FEATURE_165_NNET_ASSIST_FACILITY
#define FEATURE_192_VECT_PACKDEC_ENH_2_FACILITY

as all/most of the E6 instructions are defined as part of or enhanced with these facilities. I suspect that is causing some of the windows build problems, as you are referencing FEATURE_135_ZVECTOR_ENH_FACILITY_1.

Hope I haven't caused too many problems, but I wanted to get the basics in for the E6 instructions to minimize merge conflicts.

Jim

from hyperion.

Fish-Git commented on July 21, 2024

FYI: James's changes to the sharedvfp branch have been merged.

from hyperion.

JamesWekel commented on July 21, 2024

The z/vector E6 instructions, for example VECTOR FP CONVERT TO NNP, reference the NNP-Data-Type-1 format. z/Architecture Principles of Operation, SA22-7832-13, page 26-1 states:

Neural Network Processing Data

The NEURAL NETWORK PROCESSOR ASSIST
instruction, as well as the related convert instructions
described in this chapter, perform operations on
model-dependent data types.

NNP-Data-Type-1 Format

NNP-data-type-1 format represents a 16-bit signed
floating-point number in a proprietary format with a
range and precision tailored toward neural-network
processing. Other models may use other data formats.

But the NNP-data-type-1 format is not described. Does anyone have additional reference information on the format? The closest that I've found is a DLFLOAT presentation: https://pdfs.semanticscholar.org/5359/1b203af986668ca6586f80d30257d3ee52d7.pdf

Thanks,
Jim

from hyperion.

Fish-Git commented on July 21, 2024

Does anyone have additional reference information on the format?

I'm not aware of any, no. But then I haven't tried looking for it either.

The closest that I've found is a DLFLOAT presentation: https://pdfs.semanticscholar.org/5359/1b203af986668ca6586f80d30257d3ee52d7.pdf

THAT looks to me like that's probably it! Great find, James! I say go with it!
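
If the DLFloat guess is pursued, a decoder could start from something like the sketch below. It rests entirely on assumptions: that NNP-data-type-1 really is DLFloat16, and that the layout is 1 sign bit, 6 exponent bits with a bias of 31, and 9 fraction bits with no subnormals, as I read the DLFloat description. Neither is confirmed by the Principles of Operation. The function name `dlfloat16_to_double_sketch` is hypothetical, and special encodings (NaN/infinity) are deliberately not handled.

```c
#include <stdint.h>

/* ASSUMPTION: 1 sign bit, 6 exponent bits (bias 31), 9 fraction bits.
   Unverified for NNP-data-type-1; special values are not decoded. */
double dlfloat16_to_double_sketch( uint16_t bits )
{
    int     sign = (bits >> 15) & 1;
    int     exp  = (bits >>  9) & 0x3F;
    int     frac =  bits        & 0x1FF;
    double  val;
    int     e;

    /* All-zero exponent and fraction is treated as (signed) zero */
    if (exp == 0 && frac == 0)
        return sign ? -0.0 : 0.0;

    /* Implied leading one plus the 9-bit fraction */
    val = 1.0 + (double)frac / 512.0;

    /* Scale by 2^(exp - bias) without needing the math library */
    for (e = exp - 31; e > 0; e--) val *= 2.0;
    for (e = exp - 31; e < 0; e++) val /= 2.0;

    return sign ? -val : val;
}
```

Comparing the output of a decoder like this against values produced on real hardware would be the quickest way to confirm or rule out the DLFloat hypothesis.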

from hyperion.
