Comments (9)
void convolution_node::backward2data(const dnnl::memory& diff_dst)
{
    // tag::any lets oneDNN choose the optimal (usually blocked) layouts.
    m_src_diff_md = dnnl::memory::desc(m_src_dims, dt::f32, tag::any);
    m_weights_diff_md = dnnl::memory::desc(m_weights_dims, dt::f32, tag::any);
    m_dst_diff_md = dnnl::memory::desc(m_dst_dims, dt::f32, tag::any);

    m_conv_bwd_data_desc = dnnl::convolution_backward_data::primitive_desc(m_engine,
        dnnl::algorithm::convolution_direct,
        m_src_diff_md, m_weights_md, m_dst_diff_md,
        m_stride_dims, m_dilation_dims, m_padding_dims, m_padding_dims,
        m_conv_fwd_desc);

    // Reorder diff_dst into the layout the primitive expects, if needed.
    m_arg_diff_dst = diff_dst;
    if (diff_dst.get_desc() != m_conv_bwd_data_desc.diff_dst_desc()) {
        m_arg_diff_dst = dnnl::memory(m_conv_bwd_data_desc.diff_dst_desc(), m_engine);
        m_net_bwd_data.push_back(dnnl::reorder(diff_dst, m_arg_diff_dst));
        m_net_bwd_data_args.push_back({ {DNNL_ARG_FROM, diff_dst},
                                        {DNNL_ARG_TO, m_arg_diff_dst} });
    }

    m_arg_diff_src = dnnl::memory(m_conv_bwd_data_desc.diff_src_desc(), m_engine);
    m_net_bwd_data.push_back(dnnl::convolution_backward_data(m_conv_bwd_data_desc));
    m_net_bwd_data_args.push_back(
        { {DNNL_ARG_DIFF_SRC, m_arg_diff_src},
          {DNNL_ARG_DIFF_DST, m_arg_diff_dst},
          // If something does not work, check this: some reordering may be
          // needed, done in a similar fashion to cnn_training_f32.cpp.
          {DNNL_ARG_WEIGHTS, m_arg_weights} });

    // Reorder diff_src back to the user's plain nchw layout, if needed.
    auto user_diff_src_md = dnnl::memory::desc(m_src_dims, dt::f32, tag::nchw);
    m_user_diff_src = m_arg_diff_src;
    if (m_arg_diff_src.get_desc() != user_diff_src_md) {
        m_user_diff_src = dnnl::memory(user_diff_src_md, m_engine);
        m_net_bwd_data.push_back(dnnl::reorder(m_arg_diff_src, m_user_diff_src));
        m_net_bwd_data_args.push_back({ {DNNL_ARG_FROM, m_arg_diff_src},
                                        {DNNL_ARG_TO, m_user_diff_src} });
    }

    assert(m_net_bwd_data.size() == m_net_bwd_data_args.size() && "something is missing");
}
from onednn.
dnnl::convolution_backward_data is quite time-consuming:
infer cost (ms): 10
backward2data cost (ms): 232 (PyTorch/libtorch takes 30~50 ms for the same case)
backward2weights cost (ms): 12
Hi @w1005444804 , could you please run oneDNN with verbose enabled?
Here is the documentation: https://oneapi-src.github.io/oneDNN/dev_guide_verbose.html?highlight=verbose
@igorsafo thanks. Activating ONEDNN_VERBOSE does have some effect, but the timing is very unstable: it went from the previous 230 ms to a dynamic range of 60-200 ms.
onednn_verbose,188439297.948300,exec,cpu,convolution,jit:avx2,backward_data,src_f32:ap:blocked:aBcd8b::f0 wei_f32:ap:blocked:ABcd8a8b::f0 bia_undef::undef::: dst_f32:ap:blocked:aBcd8b::f0,,alg:convolution_direct,mb10_ic3oc6_ih160oh156kh5sh1dh0ph0_iw160ow156kw5sw1dw0pw0,100.937
Hi @igorsafo , Is the problem caused by me?
@w1005444804 Thanks for the additional information! It looks like it is not an integration problem, because the data formats are blocked and an optimized implementation is called. I was also able to reproduce the low performance for this case. It is not running on a single thread, but the optimized implementation seems to have a gap for these kinds of shapes.
Is it the first layer in the model? You usually don't need to compute the backward pass w.r.t. data for the first layer. Unfortunately, if there are other layers before this convolution, then the gradient is required.
If you can provide more details about the use case (model, hardware/ISA), that would be helpful. How much time does this convolution take compared to the overall model time?
@igorsafo Yes, it is the first layer. My model is a single conv layer; I just wanted to test the speed of forward and backward propagation of convolutions, and found this issue when comparing with PyTorch.
Thank you for your reply!
The code is roughly as follows:
...
dnnl::memory::dims conv1_src_tz = { 10, 3, 160, 160 };
auto conv1_src_memory = dnnl::memory({ {conv1_src_tz}, dt::f32, tag::nchw }, engine);
convolution_node conv1(engine, 3, 6, 5, 1, 0, 0, 1, 0, 1, conv1_src_memory);
...
for (size_t i = 0; i < conv1.m_net_fwd.size(); i++) {
    conv1.m_net_fwd[i].execute(s, conv1.m_net_fwd_args[i]);
}
...
conv1.backward2data(top_memory);
for (size_t i = 0; i < conv1.m_net_bwd_data.size(); i++) {
    conv1.m_net_bwd_data[i].execute(s, conv1.m_net_bwd_data_args[i]);
}
Hi @w1005444804 ,
Thank you for the information. I have created an internal tracker for this issue; however, I can't guarantee it will be fixed until we have more requests/use cases for this particular shape.