
A tutorial about how to build a TensorRT Engine from a PyTorch Model with the help of ONNX

License: MIT License


pytorch_onnx_tensorrt's Introduction

PyTorch_ONNX_TensorRT

A tutorial that shows how to build a TensorRT engine from a PyTorch model with the help of ONNX. Please star this project if you find it helpful.

News

A dynamic_shape_example (batch-size dimension) has been added. Just run

  python3 dynamic_shape_example.py

This example should be run on TensorRT 7.x. This repo is a bit out of date, since there were several API changes between TensorRT 5.0 and TensorRT 7.x; I will spend some time in the near future making it compatible.
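The idea behind the dynamic-batch example can be illustrated without TensorRT itself: an optimization profile declares a minimum, optimum, and maximum shape for an input, and the engine then accepts any runtime shape inside those bounds (in the real API this is done with `IOptimizationProfile.set_shape`). The shapes and the `shape_is_valid` helper below are hypothetical, just to show the constraint a profile imposes:

```python
# Illustration only: hypothetical min/opt/max shapes such as a TensorRT 7
# optimization profile would declare for an input with a dynamic batch dim.
min_shape = (1, 3, 128, 128)   # smallest shape the engine must accept
opt_shape = (8, 3, 128, 128)   # shape TensorRT tunes its kernels for
max_shape = (32, 3, 128, 128)  # largest shape the engine must accept

def shape_is_valid(shape):
    """A runtime input shape is valid if every dimension lies in [min, max]."""
    return all(lo <= s <= hi for lo, s, hi in zip(min_shape, shape, max_shape))

print(shape_is_valid((16, 3, 128, 128)))  # batch 16 lies within [1, 32]
print(shape_is_valid((64, 3, 128, 128)))  # batch 64 exceeds the profile
```

Performance is best at the optimum shape, so pick `opt_shape` to match the batch size you expect most often.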

Environment

  1. Ubuntu 16.04 x86_64, CUDA 10.0
  2. Python 3.5
  3. PyTorch 1.0
  4. TensorRT 5.0 (if you are using a Jetson TX2, TensorRT is already there once you have installed JetPack)
    4.1 Download TensorRT (pick the package that matches your environment)
    4.2 Debian installation
  $ sudo dpkg -i nv-tensorrt-repo-ubuntu1x04-cudax.x-trt5.x.x.x-ga-yyyymmdd_1-1_amd64.deb # the downloaded file
  $ sudo apt-key add /var/nv-tensorrt-repo-cudax.x-trt5.x.x.x-gayyyymmdd/7fa2af80.pub
  $ sudo apt-get update
  $ sudo apt-get install tensorrt
  
  $ sudo apt-get install python3-libnvinfer

To verify the installation of TensorRT, run

  $ dpkg -l | grep TensorRT

You should see output similar to:

  ii  graphsurgeon-tf	5.1.5-1+cuda10.1	amd64	GraphSurgeon for TensorRT package
  ii  libnvinfer-dev	5.1.5-1+cuda10.1	amd64	TensorRT development libraries and headers
  ii  libnvinfer-samples	5.1.5-1+cuda10.1	amd64	TensorRT samples and documentation
  ii  libnvinfer5		5.1.5-1+cuda10.1	amd64	TensorRT runtime libraries
  ii  python-libnvinfer	5.1.5-1+cuda10.1	amd64	Python bindings for TensorRT
  ii  python-libnvinfer-dev	5.1.5-1+cuda10.1	amd64	Python development package for TensorRT
  ii  python3-libnvinfer	5.1.5-1+cuda10.1	amd64	Python 3 bindings for TensorRT
  ii  python3-libnvinfer-dev	5.1.5-1+cuda10.1	amd64	Python 3 development package for TensorRT
  ii  tensorrt	5.1.5.x-1+cuda10.1	amd64	Meta package of TensorRT
  ii  uff-converter-tf	5.1.5-1+cuda10.1	amd64	UFF converter for TensorRT package

5. Install PyCUDA (the Python TensorRT workflow uses it to manage CUDA memory)

 $ pip3 install pycuda 

If you run into problems with pip, try

$ sudo apt-get install python3-pycuda #(Install for /usr/bin/python3)

For full details, please check the TensorRT Installation Guide.

Usage

Please check the file 'pytorch_onnx_trt.ipynb'

INT8

To run the INT8 optimization demo:

  python3 trt_int8_demo.py

You will see output like

Function forward_onnx called!
graph(%input : Float(32, 3, 128, 128),
%1 : Float(16, 3, 3, 3),
%2 : Float(16),
%3 : Float(64, 16, 5, 5),
%4 : Float(64),
%5 : Float(10, 64),
%6 : Float(10)):
%7 : Float(32, 16, 126, 126) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[1, 1]](%input, %1, %2), scope: Conv2d
%8 : Float(32, 16, 126, 126) = onnx::Relu(%7), scope: ReLU
%9 : Float(32, 16, 124, 124) = onnx::MaxPool[kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[1, 1]](%8), scope: MaxPool2d
%10 : Float(32, 64, 120, 120) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[5, 5], pads=[0, 0, 0, 0], strides=[1, 1]](%9, %3, %4), scope: Conv2d
%11 : Float(32, 64, 120, 120) = onnx::Relu(%10), scope: ReLU
%12 : Float(32, 64, 1, 1) = onnx::GlobalAveragePool(%11), scope: AdaptiveAvgPool2d
%13 : Float(32, 64) = onnx::Flatten[axis=1](%12)
%output : Float(32, 10) = onnx::Gemm[alpha=1, beta=1, transB=1](%13, %5, %6), scope: Linear
return (%output)

Int8 mode enabled
Loading ONNX file from path model_128.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file model_128.onnx; this may take a while...
Completed creating the engine
Loading ONNX file from path model_128.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file model_128.onnx; this may take a while...
Completed creating the engine
Loading ONNX file from path model_128.onnx...
Beginning ONNX file parsing
Completed parsing of ONNX file
Building an engine from file model_128.onnx; this may take a while...
Completed creating the engine
Total time used by engine_int8: 0.0009500550794171857
Total time used by engine_fp16: 0.001466430104649938
Total time used by engine: 0.002231682623709525

This output was produced on a Jetson Xavier.
Please note that INT8 mode is only supported by specific GPU modules, e.g. Jetson Xavier, Tesla P4, etc.
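For reference, timings like those above can be collected with a simple wall-clock harness. The sketch below times a dummy callable in place of the real TensorRT inference call (with a real engine you would time `context.execute_async(...)` followed by a stream synchronize, so the GPU work is actually finished before the clock stops):

```python
import time

def benchmark(infer, n_iters=100, n_warmup=10):
    """Return the average latency of `infer`, excluding warm-up iterations."""
    for _ in range(n_warmup):        # warm-up runs pay one-time startup costs
        infer()
    start = time.perf_counter()
    for _ in range(n_iters):
        infer()
    return (time.perf_counter() - start) / n_iters

# Hypothetical stand-in for a TensorRT inference call.
avg = benchmark(lambda: sum(range(10000)))
print(f"Total time used by engine: {avg:.6f} s")
```

Averaging over many iterations after a warm-up phase is what makes the INT8/FP16/FP32 numbers comparable.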

TensorRT 7 has been released. According to feedback, the code works well with TensorRT 5.0 but may have problems with TensorRT 7.0. I will test against TensorRT 7 and update this repo to make it compatible soon.

Contact

Cai, Rizhao
Email: [email protected]


pytorch_onnx_tensorrt's Issues

AdaptiveAvgPool2d not supported by tensorRT OnnxParser

Env:
Ubuntu 16.04
Python 3.6
PyTorch 1.3.0
TensorRT 7.0.11

Detail:
The AdaptiveAvgPool2d operation is not supported by the ONNX parser, which can cause the ONNX-to-TensorRT conversion to fail. To see the parser errors, you can run

  for error in range(parser.num_errors):
      print(parser.get_error(error))

after the parser has read the ONNX model.

Code incompatible with TensorRT 8.4

I am running the code in a TensorRT 8.4 environment and get the error
'tensorrt.tensorrt.Builder' object has no attribute 'build_cuda_engine' (and similarly for fp16_mode and int8_mode).

In newer TensorRT versions, a config object has taken over these builder settings.

Problems occurred when executing "python trt_int8_demo.py"

  Building an engine from file model_128.onnx; this may take a while...
  Traceback (most recent call last):
    File "trt_int8_demo.py", line 138, in <module>
      main()
    File "trt_int8_demo.py", line 92, in main
      engine_int8 = trt_helper.get_engine(batch_size, onnx_model_path, engine_model_path, fp16_mode=False, int8_mode=True, calibration_stream=calibration_stream, save_engine=True)
    File "/data/zhangyl/PyTorch_ONNX_TensorRT/helpers/trt_helper.py", line 95, in get_engine
      return build_engine(max_batch_size, save_engine)
    File "/data/zhangyl/PyTorch_ONNX_TensorRT/helpers/trt_helper.py", line 76, in build_engine
      engine = builder.build_cuda_engine(network)
  TypeError: read_calibration_cache() missing 1 required positional argument: 'length'
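A likely cause of this TypeError: older examples define `read_calibration_cache(self, length)` on the INT8 calibrator, while newer TensorRT versions invoke the callback without a length argument. Giving the parameter a default value works in either case. The class below is a plain-Python sketch of that fix; a real calibrator would subclass `trt.IInt8EntropyCalibrator2` and also implement `get_batch`:

```python
import os

class Calibrator:
    """Plain-Python sketch; a real one subclasses trt.IInt8EntropyCalibrator2."""

    def __init__(self, cache_file="calibration.cache"):
        self.cache_file = cache_file

    def read_calibration_cache(self, length=None):
        # The default on `length` keeps this compatible whether or not
        # TensorRT passes the argument when it invokes the callback.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None  # no cache yet: TensorRT runs calibration from scratch

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```

Returning None from `read_calibration_cache` tells TensorRT to calibrate from the data stream instead of reusing a cached table.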

Int8 implementation

Hi! Any ideas or resources on how to implement INT8? The documentation from Nvidia is too minimal to understand how it is supposed to work. They mention creating two objects, but it is unclear from which classes and how.

error: Failed to parse ONNX model.

Hello, thank you for your work. I get an error when running the demo, but I just used model_128.onnx and did not make any changes.
What is the reason and how can I solve it?

Error:

  Please check if the ONNX model is compatible
  AssertionError: Failed to parse ONNX model.

Run batch size with inputs[1].host

Hi everyone,
I set batch_size_max = 4 and ONNX input size = 2 in order to run batch_size = 2 with the TRT model.
When I convert the model, the error "list index out of range" is raised on inputs[1].host.
That means inputs only has one element.
How can I fix this when I want to run with batch size > 1?

AssertionError: Failed to parse ONNX model. Please check if the ONNX model is compatible

trtexec --onnx=/home/guohao02/PyTorch_ONNX_TensorRT/model_128.onnx --explicitBatch

[10/16/2021-13:56:29] [I] === Model Options ===
[10/16/2021-13:56:29] [I] Format: ONNX
[10/16/2021-13:56:29] [I] Model: /home/guohao02/PyTorch_ONNX_TensorRT/model_128.onnx
[10/16/2021-13:56:29] [I] Output:
[10/16/2021-13:56:29] [I] === Build Options ===
[10/16/2021-13:56:29] [I] Max batch: explicit
[10/16/2021-13:56:29] [I] Workspace: 16 MB
[10/16/2021-13:56:29] [I] minTiming: 1
[10/16/2021-13:56:29] [I] avgTiming: 8
[10/16/2021-13:56:29] [I] Precision: FP32
[10/16/2021-13:56:29] [I] Calibration: 
[10/16/2021-13:56:29] [I] Safe mode: Disabled
[10/16/2021-13:56:29] [I] Save engine: 
[10/16/2021-13:56:29] [I] Load engine: 
[10/16/2021-13:56:29] [I] Inputs format: fp32:CHW
[10/16/2021-13:56:29] [I] Outputs format: fp32:CHW
[10/16/2021-13:56:29] [I] Input build shapes: model
[10/16/2021-13:56:29] [I] === System Options ===
[10/16/2021-13:56:29] [I] Device: 0
[10/16/2021-13:56:29] [I] DLACore: 
[10/16/2021-13:56:29] [I] Plugins:
[10/16/2021-13:56:29] [I] === Inference Options ===
[10/16/2021-13:56:29] [I] Batch: Explicit
[10/16/2021-13:56:29] [I] Iterations: 10 (200 ms warm up)
[10/16/2021-13:56:29] [I] Duration: 10s
[10/16/2021-13:56:29] [I] Sleep time: 0ms
[10/16/2021-13:56:29] [I] Streams: 1
[10/16/2021-13:56:29] [I] Spin-wait: Disabled
[10/16/2021-13:56:29] [I] Multithreading: Enabled
[10/16/2021-13:56:29] [I] CUDA Graph: Disabled
[10/16/2021-13:56:29] [I] Skip inference: Disabled
[10/16/2021-13:56:29] [I] === Reporting Options ===
[10/16/2021-13:56:29] [I] Verbose: Disabled
[10/16/2021-13:56:29] [I] Averages: 10 inferences
[10/16/2021-13:56:29] [I] Percentile: 99
[10/16/2021-13:56:29] [I] Dump output: Disabled
[10/16/2021-13:56:29] [I] Profile: Disabled
[10/16/2021-13:56:29] [I] Export timing to JSON file: 
[10/16/2021-13:56:29] [I] Export profile to JSON file: 
[10/16/2021-13:56:29] [I] 
----------------------------------------------------------------
Input filename:   /home/guohao02/PyTorch_ONNX_TensorRT/model_128.onnx
ONNX IR version:  0.0.6
Opset version:    9
Producer name:    pytorch
Producer version: 1.9
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
WARNING: ONNX model has a newer ir_version (0.0.6) than this parser was built against (0.0.3).
While parsing node number 0 [Conv]:
ERROR: ModelImporter.cpp:296 In function importModel:
[5] Assertion failed: tensors.count(input_name)
[10/16/2021-13:56:29] [E] Failed to parse onnx file
[10/16/2021-13:56:29] [E] Parsing model failed
[10/16/2021-13:56:29] [E] Engine could not be created
&&&& FAILED TensorRT.trtexec # trtexec --onnx=/home/guohao02/PyTorch_ONNX_TensorRT/model_128.onnx --explicitBatch

AttributeError: 'NoneType' object has no attribute 'create_execution_context'

Connected to pydev debugger (build 181.5540.34)
Loading ONNX file from path ./models/onnx/model.onnx...
Beginning ONNX file parsing
WARNING: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
Successfully casted down to INT32.
Completed parsing of ONNX file
Building an engine from file ./models/onnx/model.onnx; this may take a while...
[TensorRT] ERROR: Network must have at least one output
Failed to create the engine

Does ONNX-to-TensorRT conversion need a data set for calibration?

I saw some quantization tutorials stating that a small part of the training data set is required for calibration, to determine the range of activation values and weights. Why is there no such part in the code you provided, or is it unnecessary? Thank you.
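For context: INT8 entropy calibration does need representative input data (typically a few hundred preprocessed samples from the training set) so TensorRT can estimate activation ranges; in this repo that role is played by the calibration_stream argument passed to the helper's get_engine. Below is a minimal sketch of such a stream using plain Python lists instead of device buffers (the class and method names here are illustrative, not the repo's exact API):

```python
class CalibrationStream:
    """Feeds fixed-size batches of preprocessed samples for INT8 calibration.

    Sketch only: a real stream would copy each batch into pinned/device
    memory and hand a device pointer to the calibrator's get_batch().
    """

    def __init__(self, samples, batch_size=8, max_batches=16):
        self.samples = samples          # a small, representative subset
        self.batch_size = batch_size
        self.max_batches = max_batches  # a few hundred samples usually suffice
        self._idx = 0

    def next_batch(self):
        limit = min(len(self.samples), self.max_batches * self.batch_size)
        if self._idx >= limit:
            return None                 # None tells the calibrator to stop
        batch = self.samples[self._idx:self._idx + self.batch_size]
        self._idx += len(batch)
        return batch
```

The data only needs to be representative of inference-time inputs; labels are not used during calibration.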
