
yolov8_trt's Introduction

YOLOv8 TensorRT Rapid Deployment Tool

What is this

  • Any weight file trained with the official YOLOv8 code can be converted into a quantized TensorRT model with this project; both regular and end-to-end models are supported.
  • Beyond that, the project can also convert ONNX files of other object-detection models whose output shape is one of
#                  batch size   number of preds  x, y, w, h, obj_conf, label_scores
outputs.shape = [batch_size, number_of_preds, 5 + number_of_classes]
outputs.shape = [batch_size, 5 + number_of_classes, number_of_preds]

# or                                            x, y, w, h, scores
outputs.shape = [batch_size, number_of_preds, 4 + number_of_classes]
outputs.shape = [batch_size, 4 + number_of_classes, number_of_preds]

into a TensorRT engine file and a .pt file that is convenient to use from Python.
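The two layouts above differ only in axis order. As a minimal sketch (plain Python, with a hypothetical `to_preds_last` helper and made-up numbers, not this project's code), normalizing a channels-first output to `[batch_size, number_of_preds, attrs]` looks like this:

```python
def to_preds_last(outputs, num_attrs):
    """Normalize a [batch, attrs, preds] output to [batch, preds, attrs].

    `outputs` is a nested list; `num_attrs` is 4 or 5 plus the number of
    classes, depending on whether obj_conf is present. (Ambiguous if the
    prediction count happens to equal num_attrs; fine for illustration.)
    """
    batch = []
    for sample in outputs:
        if len(sample) == num_attrs:           # [attrs, preds]: transpose
            preds = list(map(list, zip(*sample)))
        else:                                  # already [preds, attrs]
            preds = sample
        batch.append(preds)
    return batch

# Example: one image, 5 attrs (x, y, w, h, score), 3 predictions, channels-first
cf = [[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [0.9, 0.8, 0.7]]]
print(to_preds_last(cf, 5)[0][0])  # first prediction: [1, 4, 7, 10, 0.9]
```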

Environment Setup

  • CUDA, cuDNN, TensorRT (Toolkit)
  • All third-party libraries listed in requirements.txt

Notes

  • End-to-end models require TensorRT 8.0.0 or later (the author uses TensorRT-8.0.1.6.Windows10.x86_64.cuda-11.3.cudnn8.2).
  • Some functions of the original ultralytics project have been modified here, so when exporting models do not replace the ultralytics folder in this project with the original one.

Of these, torch2trt is installed as follows:

git clone https://github.com/NVIDIA-AI-IOT/torch2trt.git
cd torch2trt
python setup.py install

The Python tensorrt package is installed from the package provided with the TensorRT Toolkit; then:

git clone https://github.com/LSH9832/yolov8_trt
cd yolov8_trt
pip install -r requirements.txt

Usage

  • Supports exporting ONNX files, TensorRT engine files, and a .pt file that bundles the engine with its metadata for easy use in Python.
  • Supports both regular and end-to-end (End2End) models.

Download Models

Download models the official YOLOv8 way:

python download_all_models.py

Export Models

# mode can be one of the following 5 values

# 1. onnx: export a regular ONNX model only

# 2. trt: export both a regular ONNX model and a regular TensorRT model
# 3. end2end: export a regular ONNX model, build an end-to-end model from it, then convert that to an end-to-end engine

# 4. onnx2trt: convert the ONNX file of another (non-YOLOv8) model without an end-to-end structure into a regular TensorRT model
# 5. onnx2end2end: convert the ONNX file of another (non-YOLOv8) model without an end-to-end structure into an end-to-end ONNX model, then into an end-to-end engine
python export.py --mode trt

                 # for YOLOv8 models (trt or end2end mode), provide these 4 arguments
                 --weights yolov8s.pt
                 --batch 1
                 --img-size 640 640    # use 480 640 for 4:3 video streams, 384 640 for 16:9 streams
                 --opset 11
                 
                 # otherwise (onnx2trt or onnx2end2end mode), provide these 2 arguments
                 --onnx yolov7.onnx   # ONNX file of the other model
                 --cfg yolov7.yaml    # required contents are described below
                 
                 # none of the following arguments are needed in onnx mode
                 --workspace 10   # maximum GPU memory (GB) used while building the TensorRT model
                 --fp16           # precision flag; int8 and best are also accepted
                 
                 # the following arguments (used by Batched NMS) are only needed in end2end and onnx2end2end modes
                 --conf-thres 0.2   # confidence threshold
                 --nms-thres 0.6    # NMS IoU (box-overlap) threshold
                 --topk 2000        # feed the top-K highest-confidence boxes into NMS
                 --keep 100         # keep at most the top-K highest-confidence results after NMS

The yaml file pointed to by --cfg should contain the following information:

batch_size: 1                   # int        batch size
pixel_range: 1                  # int        input pixel value range: 1 means [0, 1] (YOLOv5/6/7, etc.), 255 means [0, 255] (YOLOX, etc.)
obj_conf_enabled: true          # bool       whether a foreground confidence (obj_conf) is predicted; note true/false are lowercase
img_size: [640, 640]            # List[int]  input image [height, width]; be careful not to swap them
input_name: ["input_0"]         # List[str]  names of the ONNX input tensors
output_name: ["output_0"]       # List[str]  names of the ONNX output tensors
names: ["person", "car", ...]   # List[str]  class names
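A quick sanity check of this file before running the export can catch typos early. A minimal sketch (the key names follow the example above; the `check_cfg` helper itself is hypothetical, not part of this project):

```python
def check_cfg(cfg: dict) -> None:
    """Raise ValueError if the export config dict is malformed."""
    rules = {
        "batch_size": int,       # batch size
        "pixel_range": int,      # 1 or 255
        "obj_conf_enabled": bool,
        "img_size": list,        # [height, width]
        "input_name": list,
        "output_name": list,
        "names": list,
    }
    for key, typ in rules.items():
        if key not in cfg:
            raise ValueError(f"missing key: {key}")
        if not isinstance(cfg[key], typ):
            raise ValueError(f"{key} should be {typ.__name__}")
    if len(cfg["img_size"]) != 2:
        raise ValueError("img_size must be [height, width]")

# No exception means the config looks sane
check_cfg({
    "batch_size": 1, "pixel_range": 1, "obj_conf_enabled": True,
    "img_size": [640, 640], "input_name": ["input_0"],
    "output_name": ["output_0"], "names": ["person", "car"],
})
```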

Each mode exports different files; taking yolov8s.pt as an example:

  • onnx mode
./yolo_export/yolov8s/yolov8s.onnx   # regular model
./yolo_export/yolov8s/yolov8s.yaml   # model metadata
./yolo_export/yolov8s/yolov8s.json   # model metadata
  • trt mode
./yolo_export/yolov8s/yolov8s.onnx   # regular model
./yolo_export/yolov8s/yolov8s.yaml   # regular model metadata
./yolo_export/yolov8s/yolov8s.json   # regular model metadata

./yolo_export/yolov8s/yolov8s.engine # for C++ deployment
./yolo_export/yolov8s/yolov8s.pt     # for Python deployment
  • end2end mode
./yolo_export/yolov8s/yolov8s.onnx   # regular model
./yolo_export/yolov8s/yolov8s.yaml   # regular model metadata
./yolo_export/yolov8s/yolov8s.json   # regular model metadata

./yolo_export/yolov8s/yolov8s_end2end.onnx   # end-to-end model
./yolo_export/yolov8s/yolov8s_end2end.json   # end-to-end model metadata
./yolo_export/yolov8s/yolov8s_end2end.engine # for C++ deployment
./yolo_export/yolov8s/yolov8s_end2end.pt     # for Python deployment

The end-to-end model appends a Batched-NMS post-processing structure to the network; the related code is adapted from https://github.com/DataXujing/YOLOv8
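To clarify what that appended structure computes, here is a minimal pure-Python sketch of the post-processing that the export parameters (--conf-thres, --nms-thres, --topk, --keep) control. The box format and helper names are illustrative only, not this project's actual code:

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def end2end_postprocess(dets, conf_thres=0.2, nms_thres=0.6, topk=2000, keep=100):
    """dets: list of (box, score, class_id); returns at most `keep` results."""
    # 1. drop low-confidence boxes, 2. keep only the top-k by score
    dets = sorted((d for d in dets if d[1] >= conf_thres),
                  key=lambda d: d[1], reverse=True)[:topk]
    out = []
    for d in dets:                        # 3. greedy per-class NMS
        if all(d[2] != k[2] or iou(d[0], k[0]) < nms_thres for k in out):
            out.append(d)
        if len(out) == keep:              # 4. cap the result count at `keep`
            break
    return out

dets = [((0, 0, 10, 10), 0.9, 0), ((1, 1, 11, 11), 0.8, 0), ((50, 50, 60, 60), 0.7, 1)]
print(len(end2end_postprocess(dets)))  # overlapping class-0 boxes merge -> 2
```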

Inference

Python

Any model generated by this project, regular or end-to-end, can be used here; it does not depend on the original project's code.

python trt_infer.py --weight ./yolo_export/yolov8s/yolov8s.pt
                    --source path/to/your/video/file or (rtsp/rtmp/http)://xxx.xxx.xxx.xxx/live/xxx or 0
                    --no-label   # if set, only boxes are drawn, without class and confidence labels
                    
                    # the following arguments apply to regular models only; end-to-end models ignore them
                    --conf 0.2     # confidence threshold
                    --nms 0.6      # NMS IoU (box-overlap) threshold

C++

Coming soon; if you need it urgently, you can find an implementation in other projects.


yolov8_trt's Issues

error

yolo.cpp:(.text+0x1de0): undefined reference to `cv::imread(std::string const&, int)'
yolo.cpp:(.text+0x1ed4): undefined reference to `cv::imshow(std::string const&, cv::_InputArray const&)'
yolo.cpp:(.text+0x1ef4): undefined reference to `cv::waitKey(int)'

A question about batch size

Hello, and thanks for your work. When batch > 1 is set, the exported ONNX model still has a 1x3x640x640 input. How can multi-batch export be achieved?

Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64

python export.py --mode end2end --weights yolov8n.pt --batch 1 --img-size 640 640 --opset 11 --onnx yolov8n.onnx --cfg yolov8n.yaml --workspace 10 --fp16 --conf-thres 0.2 --nms-thres 0.45 --topk 300 --keep 100
Fusing layers...
YOLOv8n summary: 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs
Ultralytics YOLOv8.0.3 🚀 Python-3.9.12 torch-1.12.1+cu113 CPU
half=True only compatible with GPU or CoreML export, i.e. use device=0 or format=coreml
Fusing layers...
YOLOv8n summary: 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs

PyTorch: starting from yolov8n.pt with output shape (1, 84, 8400) (6.2 MB)

ONNX: starting export with onnx 1.15.0...
ONNX: export success ✅ 2.3s, saved as ./yolo_export/yolov8n/yolov8n.onnx (12.2 MB)

Export complete (2.5s)
Results saved to /project/trt/yolov8_trt
Predict: yolo task=detect mode=predict model=./yolo_export/yolov8n/yolov8n.onnx -WARNING ⚠️ not yet supported for YOLOv8 exported models
Validate: yolo task=detect mode=val model=./yolo_export/yolov8n/yolov8n.onnx -WARNING ⚠️ not yet supported for YOLOv8 exported models
Visualize: https://netron.app
2023-12-07 10:38:19.932 | INFO | utils.export_utils:info:40 - origin input size: [1, 3, 640, 640]
2023-12-07 10:38:19.932 | INFO | utils.export_utils:info:41 - origin output size: [1, 84, 8400]
2023-12-07 10:38:19.933 | INFO | utils.export_utils:add_postprocess:57 - add transpose layer: [1, 84, 8400] ---→ [1, 8400, 84]
2023-12-07 10:38:19.933 | INFO | utils.export_utils:add_without_obj_conf:68 - add layers without obj conf
2023-12-07 10:38:19.937 | INFO | utils.export_utils:add_nms:241 - start add nms layers
2023-12-07 10:38:19.965 | INFO | utils.export_utils:add_nms:265 - onnx file saved to yolo_export/yolov8n/yolov8n_end2end.onnx
&&&& RUNNING TensorRT.trtexec [TensorRT v8204] # trtexec --onnx=yolo_export/yolov8n/yolov8n_end2end.onnx --saveEngine=yolo_export/yolov8n/yolov8n_end2end.engine --workspace=10240 --fp16
[12/07/2023-10:38:20] [I] === Model Options ===
[12/07/2023-10:38:20] [I] Format: ONNX
[12/07/2023-10:38:20] [I] Model: yolo_export/yolov8n/yolov8n_end2end.onnx
[12/07/2023-10:38:20] [I] Output:
[12/07/2023-10:38:20] [I] === Build Options ===
[12/07/2023-10:38:20] [I] Max batch: explicit batch
[12/07/2023-10:38:20] [I] Workspace: 10240 MiB
[12/07/2023-10:38:20] [I] minTiming: 1
[12/07/2023-10:38:20] [I] avgTiming: 8
[12/07/2023-10:38:20] [I] Precision: FP32+FP16
[12/07/2023-10:38:20] [I] Calibration:
[12/07/2023-10:38:20] [I] Refit: Disabled
[12/07/2023-10:38:20] [I] Sparsity: Disabled
[12/07/2023-10:38:20] [I] Safe mode: Disabled
[12/07/2023-10:38:20] [I] DirectIO mode: Disabled
[12/07/2023-10:38:20] [I] Restricted mode: Disabled
[12/07/2023-10:38:20] [I] Save engine: yolo_export/yolov8n/yolov8n_end2end.engine
[12/07/2023-10:38:20] [I] Load engine:
[12/07/2023-10:38:20] [I] Profiling verbosity: 0
[12/07/2023-10:38:20] [I] Tactic sources: Using default tactic sources
[12/07/2023-10:38:20] [I] timingCacheMode: local
[12/07/2023-10:38:20] [I] timingCacheFile:
[12/07/2023-10:38:20] [I] Input(s)s format: fp32:CHW
[12/07/2023-10:38:20] [I] Output(s)s format: fp32:CHW
[12/07/2023-10:38:20] [I] Input build shapes: model
[12/07/2023-10:38:20] [I] Input calibration shapes: model
[12/07/2023-10:38:20] [I] === System Options ===
[12/07/2023-10:38:20] [I] Device: 0
[12/07/2023-10:38:20] [I] DLACore:
[12/07/2023-10:38:20] [I] Plugins:
[12/07/2023-10:38:20] [I] === Inference Options ===
[12/07/2023-10:38:20] [I] Batch: Explicit
[12/07/2023-10:38:20] [I] Input inference shapes: model
[12/07/2023-10:38:20] [I] Iterations: 10
[12/07/2023-10:38:20] [I] Duration: 3s (+ 200ms warm up)
[12/07/2023-10:38:20] [I] Sleep time: 0ms
[12/07/2023-10:38:20] [I] Idle time: 0ms
[12/07/2023-10:38:20] [I] Streams: 1
[12/07/2023-10:38:20] [I] ExposeDMA: Disabled
[12/07/2023-10:38:20] [I] Data transfers: Enabled
[12/07/2023-10:38:20] [I] Spin-wait: Disabled
[12/07/2023-10:38:20] [I] Multithreading: Disabled
[12/07/2023-10:38:20] [I] CUDA Graph: Disabled
[12/07/2023-10:38:20] [I] Separate profiling: Disabled
[12/07/2023-10:38:20] [I] Time Deserialize: Disabled
[12/07/2023-10:38:20] [I] Time Refit: Disabled
[12/07/2023-10:38:20] [I] Skip inference: Disabled
[12/07/2023-10:38:20] [I] Inputs:
[12/07/2023-10:38:20] [I] === Reporting Options ===
[12/07/2023-10:38:20] [I] Verbose: Disabled
[12/07/2023-10:38:20] [I] Averages: 10 inferences
[12/07/2023-10:38:20] [I] Percentile: 99
[12/07/2023-10:38:20] [I] Dump refittable layers:Disabled
[12/07/2023-10:38:20] [I] Dump output: Disabled
[12/07/2023-10:38:20] [I] Profile: Disabled
[12/07/2023-10:38:20] [I] Export timing to JSON file:
[12/07/2023-10:38:20] [I] Export output to JSON file:
[12/07/2023-10:38:20] [I] Export profile to JSON file:
[12/07/2023-10:38:20] [I]
[12/07/2023-10:38:20] [I] === Device Information ===
[12/07/2023-10:38:20] [I] Selected Device: Tesla T4
[12/07/2023-10:38:20] [I] Compute Capability: 7.5
[12/07/2023-10:38:20] [I] SMs: 40
[12/07/2023-10:38:20] [I] Compute Clock Rate: 1.59 GHz
[12/07/2023-10:38:20] [I] Device Global Memory: 14971 MiB
[12/07/2023-10:38:20] [I] Shared Memory per SM: 64 KiB
[12/07/2023-10:38:20] [I] Memory Bus Width: 256 bits (ECC enabled)
[12/07/2023-10:38:20] [I] Memory Clock Rate: 5.001 GHz
[12/07/2023-10:38:20] [I]
[12/07/2023-10:38:20] [I] TensorRT version: 8.2.4
[12/07/2023-10:38:20] [I] [TRT] [MemUsageChange] Init CUDA: CPU +321, GPU +0, now: CPU 333, GPU 1244 (MiB)
[12/07/2023-10:38:21] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 333 MiB, GPU 1244 MiB
[12/07/2023-10:38:21] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 468 MiB, GPU 1278 MiB
[12/07/2023-10:38:21] [I] Start parsing network model
[12/07/2023-10:38:21] [I] [TRT] ----------------------------------------------------------------
[12/07/2023-10:38:21] [I] [TRT] Input filename: yolo_export/yolov8n/yolov8n_end2end.onnx
[12/07/2023-10:38:21] [I] [TRT] ONNX IR version: 0.0.9
[12/07/2023-10:38:21] [I] [TRT] Opset version: 11
[12/07/2023-10:38:21] [I] [TRT] Producer name: pytorch
[12/07/2023-10:38:21] [I] [TRT] Producer version: 1.12.1
[12/07/2023-10:38:21] [I] [TRT] Domain:
[12/07/2023-10:38:21] [I] [TRT] Model version: 0
[12/07/2023-10:38:21] [I] [TRT] Doc string:
[12/07/2023-10:38:21] [I] [TRT] ----------------------------------------------------------------
[12/07/2023-10:38:21] [W] [TRT] onnx2trt_utils.cpp:366: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:773: While parsing node number 124 [Resize -> "onnx::Concat_259"]:
[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:774: --- Begin node ---
[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:775: input: "onnx::Resize_254"
input: "onnx::Resize_258"
input: "onnx::Resize_420"
output: "onnx::Concat_259"
name: "Resize_120"
op_type: "Resize"
attribute {
name: "coordinate_transformation_mode"
s: "asymmetric"
type: STRING
}
attribute {
name: "cubic_coeff_a"
f: -0.75
type: FLOAT
}
attribute {
name: "mode"
s: "nearest"
type: STRING
}
attribute {
name: "nearest_mode"
s: "floor"
type: STRING
}

[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:776: --- End node ---
[12/07/2023-10:38:21] [E] [TRT] ModelImporter.cpp:779: ERROR: builtin_op_importers.cpp:3608 In function importResize:
[8] Assertion failed: scales.is_weights() && "Resize scales must be an initializer!"
[12/07/2023-10:38:21] [E] Failed to parse onnx file
[12/07/2023-10:38:21] [I] Finish parsing network model
[12/07/2023-10:38:21] [E] Parsing model failed
[12/07/2023-10:38:21] [E] Failed to create engine from model.
[12/07/2023-10:38:21] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8204] # trtexec --onnx=yolo_export/yolov8n/yolov8n_end2end.onnx --saveEngine=yolo_export/yolov8n/yolov8n_end2end.engine --workspace=10240 --fp16
2023-12-07 10:38:21.586 | ERROR | main:end2end:211 - Convert to engine file failed.
