Comments (6)
Currently PPLNN CUDA supports fp16 only.
Just want to know if ppl.nn supports half-precision or int8 inference.
Looking forward to your reply.
from ppl.nn.
Can we test the fp16 performance with pplnn-build/tools/pplnn?
Yes, you can test the fp16 performance with pplnn on a Tesla T4.
Benchmark instructions are listed here: Benchmark (Chinese version)
Yeah, I read it before, e.g. running with:
./pplnn --onnx-model model.onnx --inputs input.bin --in-shapes 1_3_224_224 --dims 1_3_224_224 --warmuptimes 100 --runningtimes 100
will run the model on the GPU with fp32 precision, right? It looks like we cannot specify the precision through args.
CUDA only supports fp16 for conv and gemm ops right now. PPL.NN will convert the data type before execution, so the results will show the fp16 performance for CUDA.
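For quick reference, the invocation discussed above can be wrapped in a small script. The model and input paths are placeholders; the script only prints the command, since the pplnn binary must first be built under pplnn-build/tools/:

```shell
# Benchmark invocation from this thread (a sketch; model.onnx and input.bin
# are placeholder paths). --warmuptimes runs untimed warm-up iterations and
# --runningtimes runs the timed iterations. On the CUDA engine, conv and
# gemm execute in fp16 regardless of the model's declared precision.
CMD="./pplnn --onnx-model model.onnx --inputs input.bin \
--in-shapes 1_3_224_224 --dims 1_3_224_224 \
--warmuptimes 100 --runningtimes 100"
echo "$CMD"
```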
Related Issues (20)
- Floating point exception (core dumped) occurred when using cuda engine HOT 1
- Floating point exception (core dumped) HOT 1
- Onnx run error HOT 2
- Is int8 inference supported on the cDSP of Qualcomm chips?
- Slice op question HOT 1
- pplnn run mobilenet v2 model failed. (use cuda) HOT 7
- linux compile error protobuf static assertion failed HOT 3
- malloc_consolidate(): invalid chunk size HOT 2
- The NDARRAY shape obtained from pplnn save-input is incorrect HOT 1
- How to compile ppl.nn together with code that depends on it using cmake? HOT 3
- Segmentation fault at ppl::nn::x86::X86Kernel::DumpOutputTensors HOT 5
- Getting model inference results (GetOutputs) takes a long time HOT 2
- Install Error HOT 1
- The compilation passed, but an error was reported in test phase HOT 2
- Floating point exception (core dumped) ? HOT 4
- Core dump when running a resnet50 fp16 onnx model with the x86 engine
- (Ask) why InferInheritedType handle int8 to fp16 out? HOT 3
- Got wrong output shape when run a Gemm op(transB=0) use cuda HOT 4
- Crash with ONNX Split operator
- Performance degradation caused by other threads referencing the global engine HOT 4