Comments (2)

vpirogov commented on June 1, 2024

Hi @ItayLasch,

It looks like there is no optimized implementation for a binary post-op with broadcast mask 1, so the matmul falls back to the reference implementation (ref:any):

$ DNNL_VERBOSE=1 ./benchdnn --matmul --wtag=abc --dt=s8:s8:s8 --attr-post-ops="binary_mul:u8:1+binary_mul:f32:1+eltwise_clip:-128:127" 6x200x16:6x16x200
onednn_verbose,info,oneDNN v3.3.0 (commit dc66df7b18ad12ecd5fa438a5055bbae4628f481)
onednn_verbose,info,cpu,runtime:OpenMP,nthr:48
onednn_verbose,info,cpu,isa:Intel AVX-512 with Intel DL Boost
onednn_verbose,info,gpu,runtime:none
onednn_verbose,info,graph,backend,0:dnnl_backend
onednn_verbose,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,backend,exec_time
onednn_verbose,primitive,exec,cpu,reorder,simple:any,undef,src_f32::blocked:abc::f0 dst_f32::blocked:abc::f0,,,6x1x1,0.0319824
onednn_verbose,primitive,exec,cpu,reorder,rnn_data_reorder,undef,src_f32::blocked:abc::f0 dst_u8::blocked:abc::f0,,,6x1x1,0.0288086
onednn_verbose,primitive,exec,cpu,reorder,rnn_data_reorder,undef,src_f32::blocked:abc::f0 dst_s8::blocked:abc::f0,,,6x16x200,0.026123
onednn_verbose,primitive,exec,cpu,reorder,rnn_data_reorder,undef,src_f32::blocked:abc::f0 dst_s8::blocked:abc::f0,,,6x200x16,0.0249023
onednn_verbose,primitive,exec,cpu,matmul,ref:any,undef,src_s8:a:blocked:abc::f0 wei_s8::blocked:abc::f0 dst_s8:a:blocked:abc::f0,attr-post-ops:binary_mul:u8:1+binary_mul:f32:1+eltwise_clip:-128:127 ,,6x200x16:6x16x200,9.46313
onednn_verbose,primitive,exec,cpu,reorder,simple:any,undef,src_f32::blocked:abc::f0 dst_f32::blocked:abc::f0,,,6x200x200,0.072998
onednn_verbose,primitive,exec,cpu,reorder,jit:uni,undef,src_s8::blocked:abc::f0 dst_f32::blocked:abc::f0,,,6x200x200,0.0688477
0:PASSED __REPRO: --matmul --dt=s8:s8:s8 --wtag=abc --attr-post-ops=mul:u8:1+mul:f32:1+clip:-128:127 6x200x16:6x16x200
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 0.03s; fill: 0.00s (16%); compute_ref: 0.01s (18%); compare: 0.01s (30%);
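
The 6x1x1 operand shape in the reorder lines above is consistent with reading the mask as a bit field over the destination dimensions: a set bit i keeps dimension i at full size, and a cleared bit broadcasts along it. A minimal Python sketch of that reading follows. The interpretation is an assumption inferred from the log, and the helper name `binary_src_dims` is hypothetical, not part of oneDNN or benchdnn.

```python
def binary_src_dims(dst_dims, mask):
    """Return the binary post-op operand shape implied by a broadcast mask.

    Assumption (inferred from the verbose log, not taken from oneDNN sources):
    bit i of `mask` set   -> dimension i is kept at full size,
    bit i of `mask` clear -> the operand is broadcast (size 1) along dimension i.
    """
    return [d if (mask >> i) & 1 else 1 for i, d in enumerate(dst_dims)]


# Destination shape of the 6x200x16:6x16x200 matmul.
dst = [6, 200, 200]
for mask in (0, 1, 2):
    shape = "x".join(str(d) for d in binary_src_dims(dst, mask))
    print(f"mask {mask}: {shape}")
```

Under this reading, mask 1 gives one value per batch (6x1x1), which matches the operand shape reported in the log.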

With mask 0 or 2, the matmul dispatches to an optimized implementation (brg:avx512_core_vnni in the mask 0 run below):

$ DNNL_VERBOSE=1 ./benchdnn --matmul --wtag=abc --dt=s8:s8:s8 --attr-post-ops="binary_mul:u8:0+binary_mul:f32:0+eltwise_clip:-128:127" 6x200x16:6x16x200
onednn_verbose,info,oneDNN v3.3.0 (commit dc66df7b18ad12ecd5fa438a5055bbae4628f481)
onednn_verbose,info,cpu,runtime:OpenMP,nthr:48
onednn_verbose,info,cpu,isa:Intel AVX-512 with Intel DL Boost
onednn_verbose,info,gpu,runtime:none
onednn_verbose,info,graph,backend,0:dnnl_backend
onednn_verbose,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,graph,info,template:operation,engine,partition_id,partition_kind,op_names,data_formats,logical_tensors,fpmath_mode,backend,exec_time
onednn_verbose,primitive,exec,cpu,reorder,simple:any,undef,src_f32::blocked:abc::f0 dst_f32::blocked:abc::f0,,,1x1x1,1.69897
onednn_verbose,primitive,exec,cpu,reorder,rnn_data_reorder,undef,src_f32::blocked:abc::f0 dst_u8::blocked:abc::f0,,,1x1x1,0.0158691
onednn_verbose,primitive,exec,cpu,reorder,rnn_data_reorder,undef,src_f32::blocked:abc::f0 dst_s8::blocked:abc::f0,,,6x16x200,0.0168457
onednn_verbose,primitive,exec,cpu,reorder,rnn_data_reorder,undef,src_f32::blocked:abc::f0 dst_s8::blocked:abc::f0,,,6x200x16,0.0180664
onednn_verbose,primitive,exec,cpu,matmul,brg:avx512_core_vnni,undef,src_s8:a:blocked:abc::f0 wei_s8::blocked:abc::f0 dst_s8:a:blocked:abc::f0,attr-post-ops:binary_mul:u8:0+binary_mul:f32:0+eltwise_clip:-128:127 ,,6x200x16:6x16x200,0.156982
onednn_verbose,primitive,exec,cpu,reorder,simple:any,undef,src_f32::blocked:abc::f0 dst_f32::blocked:abc::f0,,,6x200x200,0.0529785
onednn_verbose,primitive,exec,cpu,reorder,jit:uni,undef,src_s8::blocked:abc::f0 dst_f32::blocked:abc::f0,,,6x200x200,0.0510254
0:PASSED __REPRO: --matmul --dt=s8:s8:s8 --wtag=abc --attr-post-ops=mul:u8:0+mul:f32:0+clip:-128:127 6x200x16:6x16x200
tests:1 passed:1 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:0 listed:0
total: 0.01s; fill: 0.00s (29%); compute_ref: 0.00s (34%); compare: 0.00s (21%);
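
To compare runs like these without reading the logs by eye, the verbose lines can be split into the fields named by the `primitive,info,template` line. The following is a rough Python sketch, not an official tool: the function name `parse_primitive_line` is hypothetical, and the plain comma split assumes no field contains a comma, which holds for the lines in this thread but may not hold in general.

```python
# Field names copied from the "onednn_verbose,primitive,info,template:" line.
FIELDS = (
    "operation,engine,primitive,implementation,prop_kind,"
    "memory_descriptors,attributes,auxiliary,problem_desc,exec_time"
).split(",")

# The two matmul lines from the runs in this thread.
REF_LINE = (
    "onednn_verbose,primitive,exec,cpu,matmul,ref:any,undef,"
    "src_s8:a:blocked:abc::f0 wei_s8::blocked:abc::f0 dst_s8:a:blocked:abc::f0,"
    "attr-post-ops:binary_mul:u8:1+binary_mul:f32:1+eltwise_clip:-128:127 ,,"
    "6x200x16:6x16x200,9.46313"
)
BRG_LINE = (
    "onednn_verbose,primitive,exec,cpu,matmul,brg:avx512_core_vnni,undef,"
    "src_s8:a:blocked:abc::f0 wei_s8::blocked:abc::f0 dst_s8:a:blocked:abc::f0,"
    "attr-post-ops:binary_mul:u8:0+binary_mul:f32:0+eltwise_clip:-128:127 ,,"
    "6x200x16:6x16x200,0.156982"
)


def parse_primitive_line(line):
    """Parse one 'onednn_verbose,primitive,exec,...' line into a dict.

    Returns None for any other line (info lines, benchdnn summary lines) or
    when the number of fields does not match the template.
    """
    parts = line.strip().split(",")
    if parts[:3] != ["onednn_verbose", "primitive", "exec"]:
        return None
    values = parts[2:]  # drop the "onednn_verbose,primitive" prefix
    if len(values) != len(FIELDS):
        return None
    record = dict(zip(FIELDS, values))
    record["exec_time"] = float(record["exec_time"])
    return record


ref = parse_primitive_line(REF_LINE)
brg = parse_primitive_line(BRG_LINE)
for rec in (ref, brg):
    print(rec["primitive"], rec["implementation"], rec["exec_time"])
print(f"slowdown: {ref['exec_time'] / brg['exec_time']:.1f}x")
```

For the two matmul lines above, this reports `ref:any` against `brg:avx512_core_vnni`, with the reference implementation about 60 times slower on this problem.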


vpirogov commented on June 1, 2024

This is documented in the 'Attributes and Post-Ops' section of the oneDNN developer guide.
