Hi, when I run a Gemm model with pplnn, I get the wrong output shape.



Got wrong output shape when running a Gemm op (transB=0) with CUDA

zhnin commented on July 18, 2024

Got wrong output shape when running a Gemm op (transB=0) with CUDA
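
For reference, here is a minimal sketch of the output shape that the ONNX Gemm definition (Y = alpha * A' * B' + beta * C) implies, which can be compared against what the runtime reports. The helper name `gemm_output_shape` is hypothetical, and the weight shape (768, 3072) in the usage lines is an assumption for illustration only; it is not taken from the model in this issue.

```python
def gemm_output_shape(a_shape, b_shape, trans_a=0, trans_b=0):
    """Return (M, N), the output shape of an ONNX Gemm node.

    A' is A transposed when trans_a is non-zero, otherwise A.
    B' is B transposed when trans_b is non-zero, otherwise B.
    A' has shape (M, K) and B' has shape (K, N).
    """
    if len(a_shape) != 2 or len(b_shape) != 2:
        raise ValueError("Gemm expects 2-D inputs A and B")

    # Apply the optional transposes to get the effective shapes.
    m, k_a = (a_shape[1], a_shape[0]) if trans_a else tuple(a_shape)
    k_b, n = (b_shape[1], b_shape[0]) if trans_b else tuple(b_shape)

    if k_a != k_b:
        raise ValueError(f"inner dimensions do not match: {k_a} vs {k_b}")
    return (m, n)


# Assumed shapes for illustration: activation (128, 768), weight (768, 3072).
shape_no_trans = gemm_output_shape((128, 768), (768, 3072), trans_b=0)
# The same weight stored transposed, with transB=1, should give the same result.
shape_trans_b = gemm_output_shape((128, 768), (3072, 768), trans_b=1)
print(shape_no_trans, shape_trans_b)
```

With transB=0 the second dimension of B is N, so an output whose last dimension equals the first dimension of B would suggest that B was treated as if it were transposed.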

from ppl.nn.

Comments (4)

zhnin commented on July 18, 2024

ppl.nn commit id : d94dc4a


zhnin commented on July 18, 2024

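@jianfei-wangg Hello, I tested #850 with a single-operator Gemm model, and the output shape now looks correct.

However, when I run the whole model (BERT), there is still an error at this Gemm node. I extracted a subgraph and attached it. Could you help look into the cause?

Build command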

./build.sh -DPPLNN_USE_X86_64=ON -DPPLNN_USE_CUDA=ON -DPPLNN_ENABLE_CUDA_JIT=OFF

Model attachment

bert_squad_bs1_opt_subgraph_gemm.zip

Run command

(py38) $ ./pplnn --use-cuda --onnx-model bert_squad_bs1_opt_subgraph_gemm.onnx
ppl.nn version: [0.0.0], commit: [f0051dcf05d1c106dd6d23d70416ddbc4dc3f418]
[INFO][2023-07-31 18:09:35.975][pplnn.cc:315] ***** register CudaEngine *****
[INFO][2023-07-31 18:09:35.992][utils.cc:369] total partition(s) of graph[Extracted from {tf2onnx}]: 1.
[INFO][2023-07-31 18:09:35.997][opt_graph.cc:312] added 46 new bridge kernels
[INFO][2023-07-31 18:09:36.163][opt_graph.cc:578] deleted 33 bridge kernels
----- input info -----
input[0]:
    name: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add_1:0
    dim(s): 128 768
    data type: FLOAT32
    data format: NDARRAY
    byte(s) excluding padding: 393216
----------------------
[INFO][2023-07-31 18:09:36.165][pplnn.cc:1386] Prepare costs: 1506.97 ms.
[ERROR][2023-07-31 18:09:36.616][cuda_device.cc:169] cudaStreamSynchronize 700, an illegal memory access was encountered
[ERROR][2023-07-31 18:09:36.616][runtime_impl.cc:316] sync device[cuda] failed: other error
[ERROR][2023-07-31 18:09:36.616][pplnn.cc:1392] Run() failed: other error
[ERROR][2023-07-31 18:09:36.616][cuda_device.cc:109] sync stream failed: an illegal memory access was encountered
[ERROR][2023-07-31 18:09:36.616][cuda_device.cc:109] sync stream failed: an illegal memory access was encountered


zhnin commented on July 18, 2024

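The error appears to be related to Gemm. If the extracted subgraph does not include the last Gemm, the subgraph model runs. If it includes that Gemm, it fails with the illegal memory access error shown above.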
