
yolov8_trt's Introduction

YOLOv8 TensorRT Rapid Deployment Tool

What is this

  • Any weight file trained with the official YOLOv8 code can be converted into a quantized TensorRT model with this project; both regular and end-to-end models are supported.
  • Beyond that, the project can also convert ONNX files of other object-detection models whose output shape is one of
#                  batch size   number of preds  x, y, w, h, obj_conf, label_scores
outputs.shape = [batch_size, number_of_preds, 5 + number_of_classes]
outputs.shape = [batch_size, 5 + number_of_classes, number_of_preds]

# or                                            x, y, w, h, scores
outputs.shape = [batch_size, number_of_preds, 4 + number_of_classes]
outputs.shape = [batch_size, 4 + number_of_classes, number_of_preds]

into a TensorRT engine file and a .pt file that is convenient to use from Python.
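The two layouts above differ only in axis order. As a minimal sketch (plain Python, with a hypothetical `to_preds_last` helper and made-up numbers, not this project's code), normalizing a channels-first output to `[batch_size, number_of_preds, attrs]` looks like this:

```python
def to_preds_last(outputs, num_attrs):
    """Normalize a [batch, attrs, preds] output to [batch, preds, attrs].

    `outputs` is a nested list; `num_attrs` is 4 or 5 plus the number of
    classes, depending on whether obj_conf is present. (Ambiguous if the
    prediction count happens to equal num_attrs; fine for illustration.)
    """
    batch = []
    for sample in outputs:
        if len(sample) == num_attrs:           # [attrs, preds]: transpose
            preds = list(map(list, zip(*sample)))
        else:                                  # already [preds, attrs]
            preds = sample
        batch.append(preds)
    return batch

# Example: one image, 5 attrs (x, y, w, h, score), 3 predictions, channels-first
cf = [[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [0.9, 0.8, 0.7]]]
print(to_preds_last(cf, 5)[0][0])  # first prediction: [1, 4, 7, 10, 0.9]
```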

Environment Setup

  • CUDA, cuDNN, TensorRT (Toolkit)
  • All third-party libraries listed in requirements.txt

Notes

  • End-to-end models require TensorRT 8.0.0 or later (the author uses TensorRT-8.0.1.6.Windows10.x86_64.cuda-11.3.cudnn8.2).
  • Some functions of the original ultralytics project have been modified here, so when exporting models do not replace the ultralytics folder in this project with the original one.

Of these, torch2trt is installed as follows:

git clone https://github.com/NVIDIA-AI-IOT/torch2trt.git
cd torch2trt
python setup.py install

The Python tensorrt package is installed from the package provided with the TensorRT Toolkit; then:

git clone https://github.com/LSH9832/yolov8_trt
cd yolov8_trt
pip install -r requirements.txt

Usage

  • Supports exporting ONNX files, TensorRT engine files, and a .pt file that bundles the engine with its metadata for easy use in Python.
  • Supports both regular and end-to-end (End2End) models.

Download Models

Download models the official YOLOv8 way:

python download_all_models.py

Export Models

# mode can be one of the following 5 values

# 1. onnx: export a regular ONNX model only

# 2. trt: export both a regular ONNX model and a regular TensorRT model
# 3. end2end: export a regular ONNX model, build an end-to-end model from it, then convert that to an end-to-end engine

# 4. onnx2trt: convert the ONNX file of another (non-YOLOv8) model without an end-to-end structure into a regular TensorRT model
# 5. onnx2end2end: convert the ONNX file of another (non-YOLOv8) model without an end-to-end structure into an end-to-end ONNX model, then into an end-to-end engine
python export.py --mode trt

                 # for YOLOv8 models (trt or end2end mode), provide these 4 arguments
                 --weights yolov8s.pt
                 --batch 1
                 --img-size 640 640    # use 480 640 for 4:3 video streams, 384 640 for 16:9 streams
                 --opset 11
                 
                 # otherwise (onnx2trt or onnx2end2end mode), provide these 2 arguments
                 --onnx yolov7.onnx   # ONNX file of the other model
                 --cfg yolov7.yaml    # required contents are described below
                 
                 # none of the following arguments are needed in onnx mode
                 --workspace 10   # maximum GPU memory (GB) used while building the TensorRT model
                 --fp16           # precision flag; int8 and best are also accepted
                 
                 # the following arguments (used by Batched NMS) are only needed in end2end and onnx2end2end modes
                 --conf-thres 0.2   # confidence threshold
                 --nms-thres 0.6    # NMS IoU (box-overlap) threshold
                 --topk 2000        # feed the top-K highest-confidence boxes into NMS
                 --keep 100         # keep at most the top-K highest-confidence results after NMS

The yaml file pointed to by --cfg should contain the following information:

batch_size: 1                   # int        batch size
pixel_range: 1                  # int        input pixel value range: 1 means [0, 1] (YOLOv5/6/7, etc.), 255 means [0, 255] (YOLOX, etc.)
obj_conf_enabled: true          # bool       whether a foreground confidence (obj_conf) is predicted; note true/false are lowercase
img_size: [640, 640]            # List[int]  input image [height, width]; be careful not to swap them
input_name: ["input_0"]         # List[str]  names of the ONNX input tensors
output_name: ["output_0"]       # List[str]  names of the ONNX output tensors
names: ["person", "car", ...]   # List[str]  class names
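A quick sanity check of this file before running the export can catch typos early. A minimal sketch (the key names follow the example above; the `check_cfg` helper itself is hypothetical, not part of this project):

```python
def check_cfg(cfg: dict) -> None:
    """Raise ValueError if the export config dict is malformed."""
    rules = {
        "batch_size": int,       # batch size
        "pixel_range": int,      # 1 or 255
        "obj_conf_enabled": bool,
        "img_size": list,        # [height, width]
        "input_name": list,
        "output_name": list,
        "names": list,
    }
    for key, typ in rules.items():
        if key not in cfg:
            raise ValueError(f"missing key: {key}")
        if not isinstance(cfg[key], typ):
            raise ValueError(f"{key} should be {typ.__name__}")
    if len(cfg["img_size"]) != 2:
        raise ValueError("img_size must be [height, width]")

# No exception means the config looks sane
check_cfg({
    "batch_size": 1, "pixel_range": 1, "obj_conf_enabled": True,
    "img_size": [640, 640], "input_name": ["input_0"],
    "output_name": ["output_0"], "names": ["person", "car"],
})
```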

Each mode exports different files; taking yolov8s.pt as an example:

  • onnx mode
./yolo_export/yolov8s/yolov8s.onnx   # regular model
./yolo_export/yolov8s/yolov8s.yaml   # model metadata
./yolo_export/yolov8s/yolov8s.json   # model metadata
  • trt mode
./yolo_export/yolov8s/yolov8s.onnx   # regular model
./yolo_export/yolov8s/yolov8s.yaml   # regular model metadata
./yolo_export/yolov8s/yolov8s.json   # regular model metadata

./yolo_export/yolov8s/yolov8s.engine # for C++ deployment
./yolo_export/yolov8s/yolov8s.pt     # for Python deployment
  • end2end mode
./yolo_export/yolov8s/yolov8s.onnx   # regular model
./yolo_export/yolov8s/yolov8s.yaml   # regular model metadata
./yolo_export/yolov8s/yolov8s.json   # regular model metadata

./yolo_export/yolov8s/yolov8s_end2end.onnx   # end-to-end model
./yolo_export/yolov8s/yolov8s_end2end.json   # end-to-end model metadata
./yolo_export/yolov8s/yolov8s_end2end.engine # for C++ deployment
./yolo_export/yolov8s/yolov8s_end2end.pt     # for Python deployment

The end-to-end model appends a Batched-NMS post-processing structure to the network; the related code is adapted from https://github.com/DataXujing/YOLOv8
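To clarify what that appended structure computes, here is a minimal pure-Python sketch of the post-processing that the export parameters (--conf-thres, --nms-thres, --topk, --keep) control. The box format and helper names are illustrative only, not this project's actual code:

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def end2end_postprocess(dets, conf_thres=0.2, nms_thres=0.6, topk=2000, keep=100):
    """dets: list of (box, score, class_id); returns at most `keep` results."""
    # 1. drop low-confidence boxes, 2. keep only the top-k by score
    dets = sorted((d for d in dets if d[1] >= conf_thres),
                  key=lambda d: d[1], reverse=True)[:topk]
    out = []
    for d in dets:                        # 3. greedy per-class NMS
        if all(d[2] != k[2] or iou(d[0], k[0]) < nms_thres for k in out):
            out.append(d)
        if len(out) == keep:              # 4. cap the result count at `keep`
            break
    return out

dets = [((0, 0, 10, 10), 0.9, 0), ((1, 1, 11, 11), 0.8, 0), ((50, 50, 60, 60), 0.7, 1)]
print(len(end2end_postprocess(dets)))  # overlapping class-0 boxes merge -> 2
```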

Inference

Python

Any model generated by this project, regular or end-to-end, can be used here; it does not depend on the original project's code.

python trt_infer.py --weight ./yolo_export/yolov8s/yolov8s.pt
                    --source path/to/your/video/file or (rtsp/rtmp/http)://xxx.xxx.xxx.xxx/live/xxx or 0
                    --no-label   # if set, only boxes are drawn, without class and confidence labels
                    
                    # the following arguments apply to regular models only; end-to-end models ignore them
                    --conf 0.2     # confidence threshold
                    --nms 0.6      # NMS IoU (box-overlap) threshold

C++

Coming soon; if you need it urgently, you can find an implementation in other projects.


yolov8_trt's Issues

error

yolo.cpp:(.text+0x1de0): undefined reference to `cv::imread(std::string const&, int)'
yolo.cpp:(.text+0x1ed4): undefined reference to `cv::imshow(std::string const&, cv::_InputArray const&)'
yolo.cpp:(.text+0x1ef4): undefined reference to `cv::waitKey(int)'

A question about batch size

Hello, and thanks for your work. When batch > 1 is set, the exported ONNX model still has a 1x3x640x640 input. How can multi-batch export be achieved?

Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64

python export.py --mode end2end --weights yolov8n.pt --batch 1 --img-size 640 640 --opset 11 --onnx yolov8n.onnx --cfg yolov8n.yaml --workspace 10 --fp16 --conf-thres 0.2 --nms-thres 0.45 --topk 300 --keep 100
Fusing layers...
YOLOv8n summary: 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs
Ultralytics YOLOv8.0.3 🚀 Python-3.9.12 torch-1.12.1+cu113 CPU
half=True only compatible with GPU or CoreML export, i.e. use device=0 or format=coreml
Fusing layers...
YOLOv8n summary: 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs

PyTorch: starting from yolov8n.pt with output shape (1, 84, 8400) (6.2 MB)

ONNX: starting export with onnx 1.15.0...
ONNX: export success ✅ 2.3s, saved as ./yolo_export/yolov8n/yolov8n.onnx (12.2 MB)

Export complete (2.5s)
Results saved to /project/trt/yolov8_trt
Predict: yolo task=detect mode=predict model=./yolo_export/yolov8n/yolov8n.onnx -WARNING ⚠️ not yet supported for YOLOv8 exported models
Validate: yolo task=detect mode=val model=./yolo_export/yolov8n/yolov8n.onnx -WARNING ⚠️ not yet supported for YOLOv8 exported models
Visualize: https://netron.app
2023-12-07 10:38:19.932 | INFO | utils.export_utils:info:40 - origin input size: [1, 3, 640, 640]
2023-12-07 10:38:19.932 | INFO | utils.export_utils:info:41 - origin output size: [1, 84, 8400]
2023-12-07 10:38:19.933 | INFO | utils.export_utils:add_postprocess:57 - add transpose layer: [1, 84, 8400] ---→ [1, 8400, 84]
2023-12-07 10:38:19.933 | INFO | utils.export_utils:add_without_obj_conf:68 - add layers without obj conf
2023-12-07 10:38:19.937 | INFO | utils.export_utils:add_nms:241 - start add nms layers
2023-12-07 10:38:19.965 | INFO | utils.export_utils:add_nms:265 - onnx file saved to yolo_export/yolov8n/yolov8n_end2end.onnx
&&&& RUNNING TensorRT.trtexec [TensorRT v8204] # trtexec --onnx=yolo_export/yolov8n/yolov8n_end2end.onnx --saveEngine=yolo_export/yolov8n/yolov8n_end2end.engine --workspace=10240 --fp16
[12/07/2023-10:38:20] [I] === Model Options ===
[12/07/2023-10:38:20] [I] Format: ONNX
[12/07/2023-10:38:20] [I] Model: yolo_export/yolov8n/yolov8n_end2end.onnx
[12/07/2023-10:38:20] [I] Output:
[12/07/2023-10:38:20] [I] === Build Options ===
[12/07/2023-10:38:20] [I] Max batch: explicit batch
[12/07/2023-10:38:20] [I] Workspace: 10240 MiB
[12/07/2023-10:38:20] [I] minTiming: 1
[12/07/2023-10:38:20] [I] avgTiming: 8
[12/07/2023-10:38:20] [I] Precision: FP32+FP16
[12/07/2023-10:38:20] [I] Calibration:
[12/07/2023-10:38:20] [I] Refit: Disabled
[12/07/2023-10:38:20] [I] Sparsity: Disabled
[12/07/2023-10:38:20] [I] Safe mode: Disabled
[12/07/2023-10:38:20] [I] DirectIO mode: Disabled
[12/07/2023-10:38:20] [I] Restricted mode: Disabled
[12/07/2023-10:38:20] [I] Save engine: yolo_export/yolov8n/yolov8n_end2end.engine
[12/07/2023-10:38:20] [I] Load engine:
[12/07/2023-10:38:20] [I] Profiling verbosity: 0
[12/07/2023-10:38:20] [I] Tactic sources: Using default tactic sources
[12/07/2023-10:38:20] [I] timingCacheMode: local
[12/07/2023-10:38:20] [I] timingCacheFile:
[12/07/2023-10:38:20] [I] Input(s)s format: fp32:CHW
[12/07/2023-10:38:20] [I] Output(s)s format: fp32:CHW
[12/07/2023-10:38:20] [I] Input build shapes: model
[12/07/2023-10:38:20] [I] Input calibration shapes: model
[12/07/2023-10:38:20] [I] === System Options ===
[12/07/2023-10:38:20] [I] Device: 0
[12/07/2023-10:38:20] [I] DLACore:
[12/07/2023-10:38:20] [I] Plugins:
[12/07/2023-10:38:20] [I] === Inference Options ===
[12/07/2023-10:38:20] [I] Batch: Explicit
[12/07/2023-10:38:20] [I] Input inference shapes: model
[12/07/2023-10:38:20] [I] Iterations: 10
[12/07/2023-10:38:20] [I] Duration: 3s (+ 200ms warm up)
[12/07/2023-10:38:20] [I] Sleep time: 0ms
[12/07/2023-10:38:20] [I] Idle time: 0ms
[12/07/2023-10:38:20] [I] Streams: 1
[12/07/2023-10:38:20] [I] ExposeDMA: Disabled
[12/07/2023-10:38:20] [I] Data transfers: Enabled
[12/07/2023-10:38:20] [I] Spin-wait: Disabled
[12/07/2023-10:38:20] [I] Multithreading: Disabled
[12/07/2023-10:38:20] [I] CUDA Graph: Disabled
[12/07/2023-10:38:20] [I] Separate profiling: Disabled
[12/07/2023-10:38:20] [I] Time Deserialize: Disabled
[12/07/2023-10:38:20] [I] Time Refit: Disabled
[12/07/2023-10:38:20] [I] Skip inference: Disabled
[12/07/2023-10:38:20] [I] Inputs:
[12/07/2023-10:38:20] [I] === Reporting Options ===
[12/07/2023-10:38:20] [I] Verbose: Disabled
[12/07/2023-10:38:20] [I] Averages: 10 inferences
[12/07/2023-10:38:20] [I] Percentile: 99
[12/07/2023-10:38:20] [I] Dump refittable layers:Disabled
[12/07/2023-10:38:20] [I] Dump output: Disabled
[12/07/2023-10:38:20] [I] Profile: Disabled
[12/07/2023-10:38:20] [I] Export timing to JSON file:
[12/07/2023-10:38:20] [I] Export output to JSON file:
[12/07/2023-10:38:20] [I] Export profile to JSON file:
[12/07/2023-10:38:20] [I]
[12/07/2023-10:38:20] [I] === Device Information ===
[12/07/2023-10:38:20] [I] Selected Device: Tesla T4
[12/07/2023-10:38:20] [I] Compute Capability: 7.5
[12/07/2023-10:38:20] [I] SMs: 40
[12/07/2023-10:38:20] [I] Compute Clock Rate: 1.59 GHz
[12/07/2023-10:38:20] [I] Device Global Memory: 14971 MiB
[12/07/2023-10:38:20] [I] Shared Memory per SM: 64 KiB
[12/07/2023-10:38:20] [I] Memory Bus Width: 256 bits (ECC enabled)
[12/07/2023-10:38:20] [I] Memory Clock Rate: 5.001 GHz
[12/07/2023-10:38:20] [I]
[12/07/2023-10:38:20] [I] TensorRT version: 8.2.4
[12/07/2023-10:38:20] [I] [TRT] [MemUsageChange] Init CUDA: CPU +321, GPU +0, now: CPU 333, GPU 1244 (MiB)
[12/07/2023-10:38:21] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 333 MiB, GPU 1244 MiB
[12/07/2023-10:38:21] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 468 MiB, GPU 1278 MiB
[12/07/2023-10:38:21] [I] Start parsing network model
[12/07/2023-10:38:21] [I] [TRT] ----------------------------------------------------------------
[12/07/2023-10:38:21] [I] [TRT] Input filename: yolo_export/yolov8n/yolov8n_end2end.onnx
[12/07/2023-10:38:21] [I] [TRT] ONNX IR version: 0.0.9
[12/07/2023-10:38:21] [I] [TRT] Opset version: 11
[12/07/2023-10:38:21] [I] [TRT] Producer name: pytorch
[12/07/2023-10:38:21] [I] [TRT] Producer version: 1.12.1
[12/07/2023-10:38:21] [I] [TRT] Domain:
[12/07/2023-10:38:21] [I] [TRT] Model version: 0
[12/07/2023-10:38:21] [I] [TRT] Doc string:
[12/07/2023-10:38:21] [I] [TRT] ----------------------------------------------------------------
[12/07/2023-10:38:21] [W] [TRT] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:773: While parsing node number 124 [Resize -> "onnx::Concat_259"]:
[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:775: input: "onnx::Resize_254"
input: "onnx::Resize_258"
input: "onnx::Resize_420"
output: "onnx::Concat_259"
name: "Resize_120"
op_type: "Resize"
attribute {
name: "coordinate_transformation_mode"
s: "asymmetric"
type: STRING
}
attribute {
name: "cubic_coeff_a"
f: -0.75
type: FLOAT
}
attribute {
name: "mode"
s: "nearest"
type: STRING
}
attribute {
name: "nearest_mode"
s: "floor"
type: STRING
}

[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:3608 In function importResize:
[8] Assertion failed: scales.is_weights() && "Resize scales must be an initializer!"
[12/07/2023-10:38:21] [E] Failed to parse onnx file
[12/07/2023-10:38:21] [I] Finish parsing network model
[12/07/2023-10:38:21] [E] Parsing model failed
[12/07/2023-10:38:21] [E] Failed to create engine from model.
[12/07/2023-10:38:21] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8204] # trtexec --onnx=yolo_export/yolov8n/yolov8n_end2end.onnx --saveEngine=yolo_export/yolov8n/yolov8n_end2end.engine --workspace=10240 --fp16
2023-12-07 10:38:21.586 | ERROR | main:end2end:211 - Convert to engine file failed.
