Comments (8)
Yes, there are pure SSE2 (non-VEX)/AVX2 (VEX) versions of these. There are "duplicates":
66 0F 3A 44 /r ib PCLMULQDQ xmm1, xmm2/m128, imm8
VEX.128.66.0F3A.WIG 44 /r ib VPCLMULQDQ xmm1, xmm2, xmm3/m128, imm8
EVEX.128.66.0F3A.WIG 44 /r /ib VPCLMULQDQ xmm1, xmm2, xmm3/m128, imm8
VEX.256.66.0F3A.WIG 44 /r /ib VPCLMULQDQ ymm1, ymm2, ymm3/m256, imm8
EVEX.256.66.0F3A.WIG 44 /r /ib VPCLMULQDQ ymm1, ymm2, ymm3/m256, imm8
I assume the VEX/EVEX versions are only selected by the assembler when using the extended registers.
In #146 I was looking for a way to restrict extended register usage.
Example of CPU with GFNI and no AVX
mockcpu_test.go:177: Opening GenuineIntel0090661_ElkhartLake_02_CPUID.txt
mockcpu_test.go:180: Name: Intel Atom(R) x6425RE Processor @ 1.90GHz
mockcpu_test.go:182: Max Function:0x1b
mockcpu_test.go:184: Max Extended Function:0x80000008
mockcpu_test.go:185: VendorString: GenuineIntel
mockcpu_test.go:186: VendorID: Intel
mockcpu_test.go:187: PhysicalCores: 4
mockcpu_test.go:188: ThreadsPerCore: 1
mockcpu_test.go:189: LogicalCores: 4
mockcpu_test.go:190: Family 6 Model: 150 Stepping: 1
mockcpu_test.go:191: Features: AESNI,CLMUL,CMOV,CMPXCHG8,CX16,ERMS,FLUSH_L1D,FXSR,FXSROPT,GFNI,IA32_ARCH_CAP,IA32_CORE_CAP,IBPB,LAHF,MD_CLEAR,MMX,MOVBE,MOVDIR64B,MOVDIRI,NX,OSXSAVE,POPCNT,RDRAND,RDSEED,RDTSCP,SHA,SPEC_CTRL_SSBD,SSE,SSE2,SSE3,SSE4,SSE42,SSSE3,STIBP,SYSCALL,SYSEE,VMX,WAITPKG,X87,XGETBV1,XSAVE,XSAVEC,XSAVEOPT,XSAVES
mockcpu_test.go:192: Microarchitecture level: 2
mockcpu_test.go:193: Cacheline bytes: 64
mockcpu_test.go:194: L1 Instruction Cache: 32768 bytes
mockcpu_test.go:195: L1 Data Cache: 32768 bytes
mockcpu_test.go:196: L2 Cache: 1572864 bytes
mockcpu_test.go:197: L3 Cache: 4194304 bytes
mockcpu_test.go:198: Hz: 1900000000 Hz
mockcpu_test.go:199: Boost: 1900000000 Hz
Actually, I'm unsure how (or if) the Go assembler decides between the VEX and EVEX encodings for a 256-bit wide instruction. For SSE encoding, the omitted "V" prefix (e.g. VPADDQ vs PADDQ) is the signal. Does it always use VEX unless an AVX512 option necessitating EVEX is used (e.g. mask, memory broadcast, ymm > ymm15, etc.)?
And everything above also applies to VPCLMULQDQ, including the misnaming/handling in x/sys/cpu.
Thanks for pointing this out. I could consider a change in avo to match. It would be a very minor break, but since GFNI isn't even in a tagged release I suspect it wouldn't affect anyone.
However, I tend to think that we've got this right in avo and that x/sys/cpu is wrong. I agree the Go project is probably not going to consider this worth a breaking change, though.
If feature checks (#168) are implemented, I don't think a special-case fixup for those specific ISAs would be that bad.
Note that @klauspost's cpuid agrees with avo here: https://pkg.go.dev/github.com/klauspost/cpuid/v2#FeatureID
Ah! Sorry, I'm just now grasping what you're saying. It's not simply an issue of applying a naming transform. x/sys/cpu does not have a flag that indicates the presence of GFNI; it only has one that indicates AVX512F && GFNI.
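To make the difference concrete, here is a minimal sketch in Go. It does not use x/sys/cpu or cpuid; `featureSet`, `combinedGFNI`, and `standaloneGFNI` are hypothetical names invented for illustration, and the feature list is a small subset of the Elkhart Lake dump above. It only models the two kinds of flag described in this thread.

```go
package main

import "fmt"

// featureSet is a hypothetical stand-in for a CPU feature report.
// It is not the type used by x/sys/cpu or klauspost/cpuid.
type featureSet map[string]bool

// newFeatureSet builds a set from a list of feature names.
func newFeatureSet(names ...string) featureSet {
	fs := make(featureSet, len(names))
	for _, n := range names {
		fs[n] = true
	}
	return fs
}

// combinedGFNI models a flag that is only reported when AVX512F
// is present as well (the "AVX512F && GFNI" behaviour described above).
func combinedGFNI(fs featureSet) bool {
	return fs["AVX512F"] && fs["GFNI"]
}

// standaloneGFNI models a flag that reports GFNI on its own.
func standaloneGFNI(fs featureSet) bool {
	return fs["GFNI"]
}

func main() {
	// Subset of the features listed for the Atom x6425RE above:
	// GFNI is present, but no AVX of any kind.
	elkhartLake := newFeatureSet("AESNI", "CLMUL", "GFNI", "SSE2", "SSE42", "SSSE3")

	fmt.Println("combined flag:  ", combinedGFNI(elkhartLake))
	fmt.Println("standalone flag:", standaloneGFNI(elkhartLake))
}
```

On this feature set the combined predicate is false while the standalone one is true, so code that only consults a combined flag would skip the SSE-encoded GFNI path on a CPU that supports it.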
> Actually, I'm unsure how (or if) the Go assembler decides between the VEX and EVEX encodings for a 256-bit wide instruction. For SSE encoding, the omitted "V" prefix (e.g. VPADDQ vs PADDQ) is the signal. Does it always use VEX unless an AVX512 option necessitating EVEX is used (e.g. mask, memory broadcast, ymm > ymm15, etc.)?

If I recall correctly, that's exactly what it does. It will use VEX unless it has to use EVEX.
avo's handling of this is not great.
Lines 786 to 799 in 05ed388
Yes, on reflection this is how it would have to work; otherwise it would be impossible to write valid AVX/AVX2 code and ensure that it would not fault on hardware without AVX512.
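The "VEX unless EVEX is required" rule discussed here can be sketched roughly as follows. This is not the Go assembler's code; the `operands` struct and `chooseEncoding` function are hypothetical, and the sketch assumes the instruction has both a VEX and an EVEX form. The triggers are the ones named in this thread (mask, memory broadcast, registers above 15), plus 512-bit width, which is only encodable with EVEX.

```go
package main

import "fmt"

// encoding names the prefix family chosen for an instruction.
type encoding string

const (
	encVEX  encoding = "VEX"
	encEVEX encoding = "EVEX"
)

// operands is a hypothetical summary of one vector instruction's operands.
type operands struct {
	widthBits   int  // vector width: 128, 256, or 512
	maxRegIndex int  // highest vector register number used, e.g. 17 for Y17
	usesMask    bool // an opmask register (K1..K7) is applied
	broadcast   bool // a memory operand uses embedded broadcast
}

// chooseEncoding prefers VEX and falls back to EVEX only when some
// operand property cannot be expressed in a VEX encoding.
func chooseEncoding(o operands) encoding {
	if o.widthBits == 512 || o.maxRegIndex > 15 || o.usesMask || o.broadcast {
		return encEVEX
	}
	return encVEX
}

func main() {
	// 256-bit operation on Y0..Y3: stays VEX, so it is safe without AVX512.
	fmt.Println(chooseEncoding(operands{widthBits: 256, maxRegIndex: 3}))
	// Same operation using Y17: requires EVEX.
	fmt.Println(chooseEncoding(operands{widthBits: 256, maxRegIndex: 17}))
}
```

Under this rule, code that stays within registers 0 to 15 and avoids masks and broadcasts never picks up an EVEX encoding by accident, which is the property the comment above relies on.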
I just checked and there doesn't appear to be an existing issue on the golang project for this problem. I've already prototyped it and the fix is straightforward, so I'm considering filing a new issue over there this week.