Comments (13)
Dear Roma,
Thanks for your quick reply.
In my case, there is little performance change even after applying the latest MKLML version:
-MKLURL="https://github.com/01org/mkl-dnn/releases/download/v0.7/mklml_lnx_2017.0.2.20170209.tgz"
+MKLURL="https://github.com/01org/mkl-dnn/releases/download/v0.7/mklml_lnx_2017.0.3.20170424.tgz"
With regard to the compiler, I have used gcc version 4.8.5 20150623.
Does it make sense to use the Intel compiler? Is there a makefile for the Intel compiler?
For your reference, I have attached my makefile.
Thank you,
Daejin Jung.
from onednn.
Hello Daejin, I did not realize that now IntelCaffe builds MKL-DNN on its own...
I'm not suggesting using the latest mklml, but mkldnn. But if you are relying on IntelCaffe to build MKL-DNN, then you are already using whatever is the latest.
For the best performance, build both IntelCaffe and MKL-DNN with icc/icpc. The reason is not better code generation, but that using icc's OpenMP library tends to result in better performance. To build Caffe with icc it is typically sufficient to run:
CC=icc CXX=icpc make
Hi Daejin,
"I desperately want to improve mkl-dnn to the performance of mkl2017."
That's what the team is currently working on.
Did you use this prototxt? Can you please provide per-layer timings?
Thanks,
Roma
Dear Roma,
I used my modified prototxt, based on the mkl2017-resnet_50 included in IntelCaffe, to test the backward path. The prototxt you attached does not work in the backward path.
I also submitted the per-layer timings measured from my prototxt.
full log here
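For reference, per-layer timings like these can be collected with Caffe's built-in time benchmark and summarized with a short pipeline. The paths and log lines below are illustrative (modeled on the layer names discussed in this thread), not taken from the attached log:

```shell
# Caffe's benchmark mode prints per-layer forward/backward times (path is an example):
#   build/tools/caffe time --model=resnet_50.prototxt --iterations=50 2> timing.log
# Sample log lines of that shape, sorted to surface the slowest layers first:
printf '%s\n' \
  'BW_bn4j_branch2c backward: 250.1 ms.' \
  'conv1 forward: 12.3 ms.' \
  'res2a forward: 5.0 ms.' |
  awk '{print $(NF-1), $1, $2}' | sort -rn | head -3
```

The `awk` step pulls the millisecond value in front of the layer name so `sort -rn` can rank layers by cost.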
Actually, the performance drop of MKLDNN compared to MKL2017 is considerably large, and some layers (e.g., BW_bn4j_branch2c, BW_bn2c_branch2c, BW_bn3h_branch2c) are up to 500 times slower.
I still wonder why this performance difference occurs.
Thank you,
Daejin Jung.
Thanks for the timings. Can you please try the latest version? I see Vadim has just published 0.7.
Also, which compiler do you use to build mkldnn?
Thanks,
Roma
Hello Roma,
I need a little more of your help with the Intel compiler.
When using icc to build mkl-dnn, I encountered the following errors.
src/caffe/mkldnn_memory.cpp(242): error: no suitable constructor exists to convert from "long" to "boost::shared_ptr<caffe::PrvMemDescr>"
      blob->set_prv_diff_descriptor(NULL);
                                    ^
    detected during instantiation of "boost::shared_ptr<mkldnn::primitive> caffe::MKLDNNMemoryDescriptor<Dtype, is_diff>::get_blob_prv_primitive(caffe::Blob<Dtype> *, bool, bool, caffe::MKLDNNMemoryDescriptor<Dtype, is_diff> *) [with Dtype=double, is_diff=true]" at line 394

src/caffe/mkldnn_memory.cpp(247): error: no suitable constructor exists to convert from "long" to "boost::shared_ptr<caffe::PrvMemDescr>"
      blob->set_prv_data_descriptor(NULL);
                                    ^
    detected during instantiation of "boost::shared_ptr<mkldnn::primitive> caffe::MKLDNNMemoryDescriptor<Dtype, is_diff>::get_blob_prv_primitive(caffe::Blob<Dtype> *, bool, bool, caffe::MKLDNNMemoryDescriptor<Dtype, is_diff> *) [with Dtype=double, is_diff=true]" at line 394

src/caffe/layers/mkldnn_inner_product_layer.cpp(70): error: no instance of constructor "boost::shared_ptr<T>::shared_ptr [with T=caffe::MKLDNNDiff<double>]" matches the argument list
    argument types are: (long)
      bwdw_weights_diff(NULL),
      ^
It seems that icc and Boost are not compatible, so I am trying to use a newer Intel compiler.
Do you have any suggestions?
my icc version: 17.0.1 (gcc version 4.8.5 compatibility)
Intel® Parallel Studio version: 2017.1.132
When I updated the Boost version to 1.64, I ran into other errors, shown below.
ld: warning: libimf.so, needed by external/mkldnn/install/lib/libmkldnn.so, not found (try using -rpath or -rpath-link)
ld: warning: libsvml.so, needed by external/mkldnn/install/lib/libmkldnn.so, not found (try using -rpath or -rpath-link)
ld: warning: libirng.so, needed by external/mkldnn/install/lib/libmkldnn.so, not found (try using -rpath or -rpath-link)
ld: warning: libintlc.so.5, needed by external/mkldnn/install/lib/libmkldnn.so, not found (try using -rpath or -rpath-link)
ld: warning: libimf.so, needed by external/mkldnn/install/lib/libmkldnn.so, not found (try using -rpath or -rpath-link)
ld: warning: libsvml.so, needed by external/mkldnn/install/lib/libmkldnn.so, not found (try using -rpath or -rpath-link)
ld: warning: libirng.so, needed by external/mkldnn/install/lib/libmkldnn.so, not found (try using -rpath or -rpath-link)
ld: warning: libintlc.so.5, needed by external/mkldnn/install/lib/libmkldnn.so, not found (try using -rpath or -rpath-link)
ld: .build_release/tools/convert_imageset.bin: hidden symbol `__intel_cpu_features_init_x' in /opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64_lin/libirc.a(cpu_feature_disp.o) is referenced by DSO
ld: final link failed: Bad value
make: *** [.build_release/tools/convert_imageset.bin] Error 1
I have looked through the related discussions but have not found a clear answer yet. :(
I have just tried reproducing this with 1.63.0 and latest IntelCaffe from github and did not run into any issues.
The messages above suggest that icc's libraries are not in LD_LIBRARY_PATH, so I suspect environment setup issues.
Can you please try setting CUSTOM_CXX := icpc in your Makefile.config? Also, please make sure that you do a make clean first.
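The missing libimf.so/libsvml.so/libirng.so/libintlc.so.5 warnings earlier in the thread usually mean the Intel compiler's runtime libraries are not on the linker/loader path. A minimal sketch of the environment setup (the install path matches the Parallel Studio version mentioned above, but adjust it to your install; this is an illustration, not the thread's confirmed resolution):

```shell
# Load the icc environment so libimf.so, libsvml.so, libirng.so, libintlc.so.5
# are found at link time and run time:
source /opt/intel/compilers_and_libraries_2017.1.132/linux/bin/compilervars.sh intel64

# In Makefile.config, point Caffe at icpc:
#   CUSTOM_CXX := icpc
# Then rebuild from scratch:
make clean
CC=icc CXX=icpc make -j"$(nproc)"
```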
Clarification on IntelCaffe: it's locked to a specific commit of Intel(R) MKL-DNN, tracked in mkldnn.commit. Currently it builds the version from March 17th.
@rsdubtso
Hello Roma,
I have successfully built IntelCaffe with your help.
When I checked again, MKLDNN is about three times slower than MKL2017 on my ResNet-152 prototxt.
However, the performance difference between compilers (e.g., GCC vs. ICC) does not seem to be large.
The execution times of MKLDNN-GCC, MKLDNN-ICC, and MKL2017 are attached.
At this point, is it reasonable that MKLDNN is 2 to 3 times slower than MKL2017?
Thank you very much for your help.
Daejin Jung
I briefly looked at the logs, and I see that the top gap is for a 1x1 convolution. If I remember correctly, 1x1s have been improved in the latest MKL-DNN. So as soon as IntelCaffe moves on to the 0.7 mkl-dnn release, you should see at least some speedup. For the March release the performance ratio looks about right. Ultimately, we want MKL-DNN to show the same performance as MKL.
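For readers unfamiliar with the layer type in question: a 1x1 convolution in a Caffe prototxt looks like the fragment below (illustrative, modeled on the ResNet branch2c layers discussed in this thread; names and channel counts are examples):

```
layer {
  name: "res2a_branch2c"
  type: "Convolution"
  bottom: "res2a_branch2b"
  top: "res2a_branch2c"
  convolution_param {
    num_output: 256
    kernel_size: 1   # the 1x1 kernel whose backward pass shows the gap
    stride: 1
    pad: 0
  }
}
```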
@rsdubtso
Dear Roma,
In addition, I confirmed that the performance of MKLDNN and MKL2017 is almost the same for VGG.
In the case of ResNet, the most significant performance degradation occurs in certain layers such as BW_res2c_branch2c, BW_bn2c_branch2c, BW_res3h_branch2c, and BW_bn3h_branch2c.
These are the first convolution and batch normalization layers encountered in the backward path after the eltwise operation when crossing over from convN-1 to convN. I think additional optimization is also needed at this point to improve performance on ResNet.
Thank you for your help and support.
Daejin Jung
Glad to help! I'm closing this issue then. Please feel free to open a new one in case you have any questions or run into any issues.