Comments (4)
ppl.nn commit id : d94dc4a
from ppl.nn.
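@jianfei-wangg Hi, I tested #850 with a single-operator Gemm model, and the output shape now looks correct.
However, when I run the full model (BERT), an error still occurs at this Gemm. I extracted a subgraph and attached it below; could you help look into the cause?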
Build command
./build.sh -DPPLNN_USE_X86_64=ON -DPPLNN_USE_CUDA=ON -DPPLNN_ENABLE_CUDA_JIT=OFF
Model attachment
bert_squad_bs1_opt_subgraph_gemm.zip
Run command
(py38) $ ./pplnn --use-cuda --onnx-model bert_squad_bs1_opt_subgraph_gemm.onnx
ppl.nn version: [0.0.0], commit: [f0051dcf05d1c106dd6d23d70416ddbc4dc3f418]
[INFO][2023-07-31 18:09:35.975][pplnn.cc:315] ***** register CudaEngine *****
[INFO][2023-07-31 18:09:35.992][utils.cc:369] total partition(s) of graph[Extracted from {tf2onnx}]: 1.
[INFO][2023-07-31 18:09:35.997][opt_graph.cc:312] added 46 new bridge kernels
[INFO][2023-07-31 18:09:36.163][opt_graph.cc:578] deleted 33 bridge kernels
----- input info -----
input[0]:
name: bert/encoder/layer_0/attention/output/LayerNorm/batchnorm/add_1:0
dim(s): 128 768
data type: FLOAT32
data format: NDARRAY
byte(s) excluding padding: 393216
----------------------
[INFO][2023-07-31 18:09:36.165][pplnn.cc:1386] Prepare costs: 1506.97 ms.
[ERROR][2023-07-31 18:09:36.616][cuda_device.cc:169] cudaStreamSynchronize 700, an illegal memory access was encountered
[ERROR][2023-07-31 18:09:36.616][runtime_impl.cc:316] sync device[cuda] failed: other error
[ERROR][2023-07-31 18:09:36.616][pplnn.cc:1392] Run() failed: other error
[ERROR][2023-07-31 18:09:36.616][cuda_device.cc:109] sync stream failed: an illegal memory access was encountered
[ERROR][2023-07-31 18:09:36.616][cuda_device.cc:109] sync stream failed: an illegal memory access was encountered
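As a quick sanity check on the input info in the log above, the reported byte count can be reproduced from the shape and data type. This is only a minimal sketch: the `DTYPE_SIZES` table and the `tensor_bytes` helper are hypothetical names introduced for illustration and are not part of ppl.nn.

```python
import math

# Hypothetical lookup table (not a ppl.nn API): bytes per element.
DTYPE_SIZES = {"FLOAT32": 4, "FLOAT16": 2, "INT64": 8}


def tensor_bytes(dims, dtype):
    """Return the unpadded size in bytes of a dense tensor."""
    return math.prod(dims) * DTYPE_SIZES[dtype]


# Shape and data type taken from the "input info" section of the log.
print(tensor_bytes((128, 768), "FLOAT32"))  # 128 * 768 * 4 = 393216
```

The result matches the logged "byte(s) excluding padding: 393216", so the input tensor itself appears to be sized as expected.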
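The error appears to be related to Gemm: if the extracted subgraph excludes the last Gemm, the model runs; if it includes that Gemm, it fails with the illegal memory access error shown above.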
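The narrowing-down described here (cutting the graph before and after a node and checking whether it still runs) could be automated as a bisection over the node list. The following is a rough sketch under stated assumptions: `runs_ok` is a hypothetical callback that would export the given prefix of nodes and run it (for example with pplnn), the node names are made up, and the search assumes that once a prefix fails, every longer prefix fails as well.

```python
from typing import Callable, Sequence


def first_failing_prefix(
    nodes: Sequence[str],
    runs_ok: Callable[[Sequence[str]], bool],
) -> int:
    """Return the smallest k such that nodes[:k] fails, or -1 if all of it runs.

    Assumes the empty prefix runs and that failure is monotonic in k.
    """
    if runs_ok(nodes):
        return -1
    lo, hi = 0, len(nodes)  # invariant: nodes[:lo] runs, nodes[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if runs_ok(nodes[:mid]):
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical node list and a stand-in for "export the prefix and run it".
nodes = ["MatMul", "Add", "Mul", "Sub", "Gemm"]


def fake_runs_ok(prefix: Sequence[str]) -> bool:
    return "Gemm" not in prefix


k = first_failing_prefix(nodes, fake_runs_ok)
print(k, nodes[k - 1])
```

Note that the node at which the failure first surfaces is not necessarily the faulty one; in this thread the maintainer traced the root cause to Mul and Sub rather than Gemm.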
In reply to the comment above: the error actually comes from Mul and Sub. It has been fixed; please try again.