Comments (8)

dzarukin avatar dzarukin commented on June 1, 2024

Hi @gtgtgt1117, based on the verbose output, the system supports only AVX2_VNNI: onednn_verbose,info,cpu,isa:Intel AVX2 with Intel DL Boost.

from onednn.

gtgtgt1117 avatar gtgtgt1117 commented on June 1, 2024

The output of "lscpu" shows that the system supports "avx_vnni".
Could you tell me which platform supports this ISA, or how to use it?
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cat_l2 cdp_l3 invpcid_single intel_ppin cdp_l2 ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect avx_vnni avx512_bf16 wbnoinvd dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku ospke waitpkg avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid bus_lock_detect cldemote movdiri movdir64b enqcmd fsrm md_clear serialize tsxldtrk pconfig arch_lbr ibt amx_bf16 avx512_fp16 amx_tile amx_int8 flush_l1d arch_capabilities
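
A flags line this long is easy to misread by eye. As a minimal sketch (not part of oneDNN; the helper name `has_flags` is made up for illustration), the line can be split into exact tokens and checked for the features of interest. Exact token matching is used so that, for example, "avx" is not mistaken for "avx2".

```python
from typing import Dict, Iterable


def has_flags(flags_line: str, required: Iterable[str]) -> Dict[str, bool]:
    """Report which of the required CPU flags appear in an lscpu-style flags line."""
    # Drop a leading "Flags:" label if present, then split on whitespace.
    _, sep, rest = flags_line.partition(":")
    tokens = (rest if sep else flags_line).split()
    present = set(tokens)
    # Compare whole tokens, not substrings.
    return {name: name in present for name in sorted(required)}


# Abbreviated excerpt of the flags quoted above (most entries omitted).
sample = "Flags: fpu sse4_2 avx avx2 avx512f avx_vnni avx512_vnni amx_bf16 amx_tile amx_int8"
print(has_flags(sample, ["avx_vnni", "avx512_vnni", "amx_tile"]))
```

Note that the flag names reported by the kernel do not necessarily match the ISA names oneDNN uses, so a check like this only shows what the CPU advertises.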

dzarukin avatar dzarukin commented on June 1, 2024

This looks like SPR (Sapphire Rapids) to me (amx_bf16 avx512_fp16 amx_tile amx_int8); it doesn't have AVX2_VNNI_2 capabilities.

gtgtgt1117 avatar gtgtgt1117 commented on June 1, 2024

Thanks.
But which platform supports it (avx_vnni)?

dzarukin avatar dzarukin commented on June 1, 2024

This is what I was able to find so far: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX-VNNI
SPR is included in the list.

gtgtgt1117 avatar gtgtgt1117 commented on June 1, 2024

Thanks.
I ran the following command on Alder Lake:
ONEDNN_MAX_CPU_ISA=AVX2_VNNI_2 ONEDNN_VERBOSE=1 ./benchmark_app -m ../../v3/optimized_int8_0.9626/Model0.xml -shape [1,3,48,100]

onednn_verbose,exec,cpu,inner_product,brgemm:avx2_vnni,forward_inference,src_u8::blocked:ab::f0 wei_s8:ap:blocked:AB4b24a4b::f0 bia_undef::undef::: dst_f32::blocked:ab::f0,attr-scratchpad:user attr-post-ops:binary_add:f32:2+binary_mul:f32:2+binary_add:f32:2 ,,mb12ic64oc6625,0.224854
onednn_verbose,exec,cpu,convolution,jit_uni_int8:avx2_vnni,forward_inference,src_u8::blocked:acdb::f0 wei_s8:a:blocked:ABcd2b8a4b::f0 bia_f32::blocked:a::f0 dst_u8::blocked:acdb::f0,attr-scratchpad:user attr-scales:wei:1 attr-legacy-input-zero-points::2:1024 attr-post-ops:eltwise_swish:1+eltwise_linear:369.9:103 ,alg:convolution_direct,mb1_ic1024oc64_ih1oh1kh3sh1dh0ph1_iw12ow12kw3sw1dw0pw1,0.171875

It only dispatches "avx2_vnni". Why not "avx2_vnni_2"?
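
To see which implementation each primitive was dispatched to without reading the full lines, the leading comma-separated fields can be pulled out. This is a rough sketch that assumes the field layout seen in the two lines above (marker, event, engine, primitive kind, implementation name); other oneDNN versions may order or extend the fields differently. The names `ExecRecord` and `parse_exec_line` are hypothetical.

```python
from typing import NamedTuple, Optional


class ExecRecord(NamedTuple):
    primitive: str       # e.g. "inner_product"
    implementation: str  # e.g. "brgemm:avx2_vnni"
    isa: str             # text after the last ":" in the implementation name


def parse_exec_line(line: str) -> Optional[ExecRecord]:
    """Parse the leading fields of an 'onednn_verbose,exec,...' line."""
    fields = line.strip().split(",")
    # Only the first five fields are used, so later fields may contain anything.
    if len(fields) < 5 or fields[0] != "onednn_verbose" or fields[1] != "exec":
        return None
    impl = fields[4]
    return ExecRecord(fields[3], impl, impl.rsplit(":", 1)[-1])


# Truncated copies of the two lines above (trailing fields omitted).
lines = [
    "onednn_verbose,exec,cpu,inner_product,brgemm:avx2_vnni,forward_inference",
    "onednn_verbose,exec,cpu,convolution,jit_uni_int8:avx2_vnni,forward_inference",
]
for line in lines:
    print(parse_exec_line(line))
```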

dzarukin avatar dzarukin commented on June 1, 2024

Alder Lake has AVX2_VNNI as its top instruction set.
AVX2_VNNI_2 belongs to the Sierra Forest product and adds fp16 conversion instructions on top of the AVX2_VNNI ISA.
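
This matches ONEDNN_MAX_CPU_ISA acting as an upper bound: it can restrict dispatch to a lower ISA, but it cannot enable instructions the CPU does not have. The toy model below illustrates that rule only. It is not oneDNN's dispatcher, and the three-entry ordering is a simplification covering just the AVX2-family values mentioned in this thread.

```python
# Hypothetical, simplified ordering of the AVX2-family ISAs discussed in this thread.
AVX2_FAMILY = ["AVX2", "AVX2_VNNI", "AVX2_VNNI_2"]


def effective_isa(hardware_top: str, requested_max: str) -> str:
    """Return the lower of the hardware's top ISA and the requested cap."""
    hw = AVX2_FAMILY.index(hardware_top)
    cap = AVX2_FAMILY.index(requested_max)
    return AVX2_FAMILY[min(hw, cap)]


# Alder Lake tops out at AVX2_VNNI, so asking for AVX2_VNNI_2 changes nothing.
print(effective_isa("AVX2_VNNI", "AVX2_VNNI_2"))  # AVX2_VNNI
# A lower cap does take effect.
print(effective_isa("AVX2_VNNI", "AVX2"))  # AVX2
```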

gtgtgt1117 avatar gtgtgt1117 commented on June 1, 2024

Thanks
