Comments (8)
Hi @gtgtgt1117, based on the verbose output, the system supports only AVX2_VNNI:
onednn_verbose,info,cpu,isa:Intel AVX2 with Intel DL Boost
from onednn.
The output of "lscpu" shows that the system supports "avx_vnni".
Could you tell me which platform supports this ISA, or how to enable it?
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cat_l2 cdp_l3 invpcid_single intel_ppin cdp_l2 ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect avx_vnni avx512_bf16 wbnoinvd dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd fsrm md_clear serialize tsxldtrk pconfig arch_lbr ibt amx_bf16 avx512_fp16 amx_tile amx_int8 flush_l1d arch_capabilities
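A flag list this long is easy to misread, so here is a minimal Python sketch for checking a few names in an lscpu "Flags:" line. The helper names (`parse_cpu_flags`, `report`) and the choice of flags to look for are illustrative assumptions, not part of oneDNN or lscpu. Note that, as the replies in this thread indicate, the kernel flag `avx_vnni` is not the same thing as oneDNN's AVX2_VNNI_2.

```python
# Hypothetical helpers for inspecting an lscpu "Flags:" line.
# The flag names below are taken from the output posted in this thread.
FLAGS_OF_INTEREST = ("avx2", "avx_vnni", "avx512_vnni", "amx_int8")


def parse_cpu_flags(flags_line):
    """Split an lscpu 'Flags:' line into a set of flag names."""
    _, _, rest = flags_line.partition(":")
    # If there is no "Flags:" prefix, treat the whole line as the flag list.
    return set((rest if rest else flags_line).split())


def report(flags_line):
    """Map each flag of interest to True/False depending on its presence."""
    present = parse_cpu_flags(flags_line)
    return {name: name in present for name in FLAGS_OF_INTEREST}


# Abbreviated sample; paste the full "Flags:" line from lscpu instead.
sample = "Flags: fpu avx2 avx512f avx_vnni avx512_vnni amx_tile amx_int8"
for name, found in report(sample).items():
    print(f"{name}: {'yes' if found else 'no'}")
```

Matching whole tokens rather than substrings matters here: a substring search for "avx_vnni" would not be fooled by "avx512_vnni", but a search for "avx2" would also match "avx512_vbmi2".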
from onednn.
This looks like an SPR to me (amx_bf16, avx512_fp16, amx_tile, amx_int8); it doesn't have AVX2_VNNI_2 capabilities.
from onednn.
Thanks.
But which platform supports it (avx_vnni)?
from onednn.
This is what I was able to find so far: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX-VNNI
SPR is included in the list.
from onednn.
Thanks.
I ran the command on Alder Lake as follows:
ONEDNN_MAX_CPU_ISA=AVX2_VNNI_2 ONEDNN_VERBOSE=1 ./benchmark_app -m ../../v3/optimized_int8_0.9626/Model0.xml -shape [1,3,48,100]
onednn_verbose,exec,cpu,inner_product,brgemm:avx2_vnni,forward_inference,src_u8::blocked:ab::f0 wei_s8:ap:blocked:AB4b24a4b::f0 bia_undef::undef::: dst_f32::blocked:ab::f0,attr-scratchpad:user attr-post-ops:binary_add:f32:2+binary_mul:f32:2+binary_add:f32:2 ,,mb12ic64oc6625,0.224854
onednn_verbose,exec,cpu,convolution,jit_uni_int8:avx2_vnni,forward_inference,src_u8::blocked:acdb::f0 wei_s8:a:blocked:ABcd2b8a4b::f0 bia_f32::blocked:a::f0 dst_u8::blocked:acdb::f0,attr-scratchpad:user attr-scales:wei:1 attr-legacy-input-zero-points::2:1024 attr-post-ops:eltwise_swish:1+eltwise_linear:369.9:103 ,alg:convolution_direct,mb1_ic1024oc64_ih1oh1kh3sh1dh0ph1_iw12ow12kw3sw1dw0pw1,0.171875
It only dispatches "avx2_vnni". Why not "avx2_vnni_2"?
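To see at a glance which implementations were dispatched across a long log, the verbose output can be tallied with a short script. This is only a sketch in Python, assuming the comma-separated "exec" line layout shown in the log above (primitive kind in the fourth field, implementation name in the fifth, execution time last); other oneDNN versions may print extra fields. The function names are hypothetical.

```python
from collections import Counter


def parse_exec_line(line):
    """Return (primitive, implementation, time) for an 'exec' line, else None.

    Assumes the field layout of the log lines quoted in this thread.
    """
    fields = line.strip().split(",")
    if len(fields) < 6 or fields[0] != "onednn_verbose" or fields[1] != "exec":
        return None
    primitive, implementation = fields[3], fields[4]
    try:
        elapsed = float(fields[-1])
    except ValueError:
        elapsed = None  # last field was not a number; keep the line anyway
    return primitive, implementation, elapsed


def count_implementations(lines):
    """Count how often each implementation name appears in 'exec' lines."""
    counts = Counter()
    for line in lines:
        parsed = parse_exec_line(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts


# Lines abbreviated from the log above (memory descriptors and attributes shortened).
sample_log = [
    "onednn_verbose,info,cpu,isa:Intel AVX2 with Intel DL Boost",
    "onednn_verbose,exec,cpu,inner_product,brgemm:avx2_vnni,forward_inference,"
    "src_u8::blocked:ab::f0,attr-scratchpad:user,,mb12ic64oc6625,0.224854",
    "onednn_verbose,exec,cpu,convolution,jit_uni_int8:avx2_vnni,forward_inference,"
    "src_u8::blocked:acdb::f0,attr-scratchpad:user,alg:convolution_direct,"
    "mb1_ic1024oc64_ih1oh1kh3sh1dh0ph1_iw12ow12kw3sw1dw0pw1,0.171875",
]

for impl, n in count_implementations(sample_log).items():
    print(impl, n)
```

The ISA suffix after the colon in the implementation name (here `avx2_vnni`) is the part that shows what was actually dispatched.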
from onednn.
Alder Lake has AVX2_VNNI as its top instruction set.
AVX2_VNNI_2 belongs to the Sierra Forest product and adds fp16 conversion instructions on top of the AVX2_VNNI ISA.
from onednn.
Thanks
from onednn.