lch1238 / bevdet-tensorrt-cpp

221 stars · 4 watchers · 34 forks · 7.14 MB

BEVDet implemented with TensorRT in C++, achieving real-time performance on Orin

Languages: CMake 1.87%, C++ 41.08%, C 0.98%, CUDA 38.91%, Python 17.17%
Topics: bev, tensorrt-inference, 3ddetection

bevdet-tensorrt-cpp's Introduction

Hi there 👋

  • 🔭 I take an interest in 3D object detection and autonomous driving perception.
  • 🌱 I’m a third-year graduate student at Jilin University.
  • 📫 How to reach me: [email protected].

bevdet-tensorrt-cpp's People

Contributors

lch1238


bevdet-tensorrt-cpp's Issues

Question about converting the .pth model to an ONNX model

Hello, and thanks for open-sourcing the code. I am converting the BEVDet-R50-4DLongterm-Depth-CBGS model provided by BEVDet to ONNX. Referring to the get_mlp_input function in BEVDet's mmdet3d/models/necks/view_transformer.py, I cannot work out how rot, trans, and bda are tied together.
[Two screenshots from 2023-07-19 showing the converted img_stage inputs]
The screenshots above show part of the inputs of my converted img_stage, which differ noticeably from yours. I also converted bev_stage; compared with yours, its inputs, outputs, and network structure look normal. Both the img_stage and bev_stage ONNX models can be converted to TRT models:

  1. Running inference on the surround-view images you provided yields only 2 objects.
  2. Using your img_stage model with my converted bev_stage model yields only 20 objects, far from the ~100 objects your models produce. It feels like both converted stages have problems.

Now I have two questions:

  1. In the img_stage inputs, how are rot, trans, and bda tied together?
  2. Is there anything to watch out for when converting bev_stage?

I would appreciate some guidance. Thanks!
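For reference, and not as an authoritative answer: as I read get_mlp_input in BEVDet's mmdet3d/models/necks/view_transformer.py, the sensor2ego rot/tran are flattened and appended after 15 scalars taken from the intrinsics, the post-augmentation transform, and bda. A sketch, to be verified against your checkout:

# Sketch of LSSViewTransformerBEVDepth.get_mlp_input as I understand it from
# BEVDet; rot/tran are sensor2ego, bda is the BEV data-augmentation matrix.
import torch

def get_mlp_input(rot, tran, intrin, post_rot, post_tran, bda):
    B, N, _, _ = rot.shape
    bda = bda.view(B, 1, 3, 3).repeat(1, N, 1, 1)
    mlp_input = torch.stack([
        intrin[:, :, 0, 0], intrin[:, :, 1, 1],
        intrin[:, :, 0, 2], intrin[:, :, 1, 2],
        post_rot[:, :, 0, 0], post_rot[:, :, 0, 1], post_tran[:, :, 0],
        post_rot[:, :, 1, 0], post_rot[:, :, 1, 1], post_tran[:, :, 1],
        bda[:, :, 0, 0], bda[:, :, 0, 1],
        bda[:, :, 1, 0], bda[:, :, 1, 1], bda[:, :, 2, 2],
    ], dim=-1)                                        # 15 scalars per camera
    sensor2ego = torch.cat([rot, tran.reshape(B, N, 3, 1)],
                           dim=-1).reshape(B, N, -1)  # 12 more, 27 in total
    return torch.cat([mlp_input, sensor2ego], dim=-1)

The 15 + 12 = 27 layout matches the cam_params binding of shape (1, 6, 27) in the logs further down.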

How can I obtain the mAP?

Thanks for open-sourcing this project. I successfully obtained the ego-frame boxes in txt format on Xavier, but I am not sure how to use these txt files to compute mAP. I noticed you list the models' mAP on the front page; would you be willing to open-source the mAP evaluation code, or briefly describe the post-processing workflow with the mmdet-related tools?
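For reference, a minimal sketch of one possible route, not the author's tooling: wrap the boxes into a nuScenes result JSON and run the official evaluator from nuscenes-devkit. parse_txt_outputs is a hypothetical parser for the repo's txt format, and the boxes must first be transformed from the ego frame to the global frame with ego2global.

# A minimal sketch, assuming nuscenes-devkit is installed; parse_txt_outputs
# is hypothetical and stands in for reading the repo's txt box format.
import json
from nuscenes import NuScenes
from nuscenes.eval.detection.config import config_factory
from nuscenes.eval.detection.evaluate import DetectionEval

results = {}
for sample_token, boxes in parse_txt_outputs('results/'):  # hypothetical
    results[sample_token] = [{
        'sample_token': sample_token,
        'translation': b['xyz'],       # global frame (apply ego2global first)
        'size': b['wlh'],
        'rotation': b['quat'],         # quaternion w, x, y, z
        'velocity': b['vxy'],
        'detection_name': b['label'],
        'detection_score': b['score'],
        'attribute_name': '',
    } for b in boxes]

submission = {'meta': {'use_camera': True, 'use_lidar': False, 'use_radar': False,
                       'use_map': False, 'use_external': False},
              'results': results}
with open('results_nusc.json', 'w') as f:
    json.dump(submission, f)

nusc = NuScenes(version='v1.0-trainval', dataroot='data/nuscenes')
DetectionEval(nusc, config=config_factory('detection_cvpr_2019'),
              result_path='results_nusc.json', eval_set='val',
              output_dir='eval_out').main(plot_examples=0, render_curves=False)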

The unified ONNX exported by export_onx_onnx.py detects only 2 objects after TensorRT conversion and inference, when it should be around 100

Hi @LCH1238, using your unified ONNX on the sample data, only 2 objects are detected when it should be about 100. Is this a configuration problem somewhere?
The exact command and output are as follows:

./bevdemo ../configure.yaml

GPU has cuda devices: 1
----device id: 0 info----
GPU : NVIDIA GeForce RTX 4090
Capbility: 8.9
Global memory: 24209MB
Const memory: 64KB
Shared memory in a block: 48KB
warp size: 32
threads in a block: 1024
block dim: (1024,1024,64)
grid dim: (2147483647,65535,65535)

Successful load config : ../configure.yaml!
1.00 0.70 0.70 0.40 0.55 1.10 1.00 1.00 1.50 3.50
valid_feat_num: 356760
unique_bev_num: 13360
InitVewTransformer cost time : 12.0418 ms
Binding 0 (images): Input.
Binding 1 (mean): Input.
Binding 2 (std): Input.
Binding 3 (cam_params): Input.
Binding 4 (ranks_depth): Input.
Binding 5 (ranks_feat): Input.
Binding 6 (ranks_bev): Input.
Binding 7 (interval_starts): Input.
Binding 8 (interval_lengths): Input.
Binding 9 (adj_feats): Input.
Binding 10 (transforms): Input.
Binding 11 (flag): Input.
Binding 12 (curr_bevfeat): Output.
Binding 13 (reg_0): Output.
Binding 14 (height_0): Output.
Binding 15 (dim_0): Output.
Binding 16 (rot_0): Output.
Binding 17 (vel_0): Output.
Binding 18 (heatmap_0): Output.
images : 6 3 900 400
mean : 3
std : 3
cam_params : 1 6 27
ranks_depth : 356760
ranks_feat : 356760
ranks_bev : 356760
interval_starts : 13360
interval_lengths : 13360
adj_feats : 1 8 80 128 128
transforms : 1 8 6
flag : 1 1
curr_bevfeat : 1 80 128 128
reg_0 : 1 2 128 128
height_0 : 1 1 128 128
dim_0 : 1 3 128 128
rot_0 : 1 2 128 128
vel_0 : 1 2 128 128
heatmap_0 : 1 10 128 128

img num binding : 19
-------------------0-------------------
scenes_token : e7ef871f77f44331aefdebc24ec034b7, timestamp : 1533201470448696
TRT-Engine : 45.86426 ms
Postprocess : 0.23169 ms
Inference : 46.09595 ms
Detect 2 objects

Failed to execute viewer.py

While executing viewer.py, I get the following error:

File "/bevdet-tensorrt-cpp/tools/open3d_vis.py", line 36, in _draw_points
vis.get_render_option().point_size = points_size # set points size

AttributeError: 'NoneType' object has no attribute 'point_size'

Please help.
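For reference: in Open3D, get_render_option() returns None when no window has been created, which typically happens in a headless session or when the display cannot be opened. A minimal, hedged check (illustrative, not the repo's code):

# Guard against the None render option by checking window creation first.
import open3d as o3d

vis = o3d.visualization.Visualizer()
if not vis.create_window():  # returns False when no window could be created
    raise RuntimeError('Open3D could not create a window (headless / no display?)')
vis.get_render_option().point_size = 2.0  # safe once the window exists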

Is the algorithm's input in RGB or BGR format?

In the Python/PyTorch version, the dataloader converts the data to RGB.

But in this C++ implementation, decode_jpeg first produces RGB_HWC, and then convert_RGBHWC_to_BGRCHW is called to convert it to BGR_CHW, so the final input is BGR. Is this a bug?
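For reference, a numpy illustration of what a convert_RGBHWC_to_BGRCHW-style transform does (a sketch of the operation being questioned, not the repo's CUDA kernel):

import numpy as np

def convert_rgbhwc_to_bgrchw(img):
    # flip channel order RGB -> BGR, then move channels first (HWC -> CHW)
    return np.ascontiguousarray(img[..., ::-1].transpose(2, 0, 1))

img = np.random.randint(0, 256, (900, 1600, 3), dtype=np.uint8)
print(convert_rgbhwc_to_bgrchw(img).shape)  # (3, 900, 1600)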

Problem converting the PyTorch model to ONNX

Using the pth file you provided and the export_onnx tool, with pre_process disabled as in your config, I get the following error when converting to ONNX:
load checkpoint from local path: models/new-bevdet-lt-d-ft-nearest.pth
[[1, 2, 128, 128], [1, 1, 128, 128], [1, 3, 128, 128], [1, 2, 128, 128], [1, 2, 128, 128], [1, 10, 128, 128]]
['reg_0', 'height_0', 'dim_0', 'rot_0', 'vel_0', 'heatmap_0']
Traceback (most recent call last):
File "tools/export/export_onnx.py", line 142, in
torch.onnx.export(
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/init.py", line 316, in export
return utils.export(model, args, f, export_params, verbose, training,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 107, in export
_export(model, args, f, export_params, verbose, training, input_names, output_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 724, in _export
_model_to_graph(model, args, verbose, input_names,
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 493, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 437, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/opt/conda/lib/python3.8/site-packages/torch/onnx/utils.py", line 388, in _trace_and_get_graph_from_model
torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 1166, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/opt/conda/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1090, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/wyh/BEVDet/mmdet3d/models/detectors/trt_model.py", line 67, in forward
x = self.img_view_transformer.depth_net(x, mlp_input)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1090, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/wyh/BEVDet/mmdet3d/models/necks/view_transformer.py", line 694, in forward
mlp_input = self.bn(mlp_input.reshape(-1, mlp_input.shape[-1]))
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1090, in _slow_forward
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
return F.batch_norm(
File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 2282, in batch_norm
return torch.batch_norm(
RuntimeError: running_mean should contain 24 elements not 27
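For reference: the 24-vs-27 mismatch suggests a get_mlp_input variant mismatch between the export code and the checkpoint, i.e. the BN at the head of depth_net was trained on a different mlp_input width than the one being built. A hedged way to inspect what the checkpoint expects; the key names follow the usual mmdet3d layout and may differ in your checkout:

# Inspect the BN width the checkpoint's depth_net expects.
import torch

ckpt = torch.load('models/new-bevdet-lt-d-ft-nearest.pth', map_location='cpu')
state = ckpt.get('state_dict', ckpt)
for k, v in state.items():
    if 'img_view_transformer.depth_net.bn' in k:
        print(k, tuple(v.shape))  # (27,) means a 27-wide mlp_input is required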

Failed to convert .pth model to .onnx

Thank you for this great work! I used the bevdet-export repo to convert 'bevdet-lt-d-ft-nearest.pth' or 'bevdet-r50-4dlongterm-depth-cbgs.pth' to ONNX format, but got the following error:

"./BEVDet-export/mmdet3d/models/detectors/trt_model.py", line 67, in forward
x, mlp_bn, con_se, depth_se = self.img_view_transformer.depth_net(r, mlp_input)
ValueError: not enough values to unpack (expected 4, got 2)

Could you help me solve this problem or give me some advice?
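For reference: the error says the modified forward unpacks four values while this depth_net returns two, so the unpack should match the actual return arity of the depth_net variant in use (whether the SE tensors are available at all is a separate question). A standalone illustration, with a stub standing in for the real depth_net:

# Illustrative only: depth_net_stub stands in for the real depth_net, whose
# two return values are not specified here.
def depth_net_stub(x, mlp_input):
    return x, mlp_input  # two values -> "not enough values to unpack"

outs = depth_net_stub('feat', 'depth')
if isinstance(outs, tuple) and len(outs) == 4:
    feat, mlp_bn, con_se, depth_se = outs  # four-output (SE-style) variant
else:
    feat = outs[0] if isinstance(outs, tuple) else outs
print(feat)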

Want to run inference with a real-time camera

Thanks for such great work.
I want to run inference from a camera. Can you suggest the changes needed for that, or add a code section for real-time camera inference?

Thanks in advance.

Calibration parameter file

Hello. In the calibration file sample0000.yaml there is a global set of ego2global rotation/translation parameters, and each camera entry also carries the same parameters. The per-camera values are close to the global ego2global_rotation/ego2global_translation but not identical. Is there a difference between the per-camera pair and the global pair? Thanks.

CMake Error at bevdemo_generated_iou3d_nms.cu

Thanks for sharing this amazing work!
I tried to build the project using cmake .. && make but got this error:

-- /usr/local/cuda/targets/x86_64-linux/lib/libnvjpeg.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/user/colcon_ws/build
[  4%] Building NVCC (Device) object CMakeFiles/bevdemo.dir/src/bevdemo_generated_preprocess_plugin.cu.o
[  8%] Building NVCC (Device) object CMakeFiles/bevdemo.dir/src/bevdemo_generated_alignbev_plugin.cu.o
[ 13%] Building NVCC (Device) object CMakeFiles/bevdemo.dir/src/bevdemo_generated_bevpool_plugin.cu.o
[ 17%] Building NVCC (Device) object CMakeFiles/bevdemo.dir/src/bevdemo_generated_gatherbev_plugin.cu.o
[ 21%] Building NVCC (Device) object CMakeFiles/bevdemo.dir/src/bevdemo_generated_iou3d_nms.cu.o
/usr/local/cuda/targets/x86_64-linux/include/cub/agent/../block/block_merge_sort.cuh(169): error: expected a "," or ">"

1 error detected in the compilation of "/home/user/colcon_ws/src/iou3d_nms.cu".
CMake Error at bevdemo_generated_iou3d_nms.cu.o.RELEASE.cmake:280 (message):
  Error generating file
  /home/user/colcon_ws/build/CMakeFiles/bevdemo.dir/src/./bevdemo_generated_iou3d_nms.cu.o


make[2]: *** [CMakeFiles/bevdemo.dir/build.make:84: CMakeFiles/bevdemo.dir/src/bevdemo_generated_iou3d_nms.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:84: CMakeFiles/bevdemo.dir/all] Error 2
make: *** [Makefile:91: all] Error 2

Could you please provide some suggestions to fix it? I am on the one branch.

Thanks!

License information

Thank you for this great work!
I noticed that the repository is missing the necessary license information.
Could you please add one? Thank you!

Results not as expected after adding a pre-process-net stage

@LCH1238 Hi, thanks for open-sourcing this.
I have a question:
Since your code converts and runs the model with the pre-process-net module removed, while my model includes one, I modified the conversion code to export my model in three stages: image-stage-4d-depth, pre-process-stage-4d-depth, and bev-stage-4d-depth.
In InitEngine(), the input/output dimensions of the three models are:

images : 6 3 512 1408
rot : 1 6 3 3
trans : 1 6 3
intrin : 1 6 3 3
post_rot : 1 6 3 3
post_trans : 1 6 3
bda : 1 3 3
depth : 6 118 32 88
images_feat : 6 32 88 80

pre_process_input : 1 80 128 128
pre_process_output : 1 80 128 128

BEV_feat : 1 160 128 128
reg_0 : 1 2 128 128
height_0 : 1 1 128 128
dim_0 : 1 3 128 128
rot_0 : 1 2 128 128
vel_0 : 1 2 128 128
heatmap_0 : 1 10 128 128

I made the following changes to the inference code:

  1. Feed the output of bev_pool_v2 into the pre-process-net input:
bev_pool_v2(bevpool_channel, unique_bev_num, bev_h * bev_w,
                (float*)imgstage_buffer[imgbuffer_map["depth"]], 
                (float*)imgstage_buffer[imgbuffer_map["images_feat"]], 
                ranks_depth_dev, ranks_feat_dev, ranks_bev_dev,
                interval_starts_dev, interval_lengths_dev,
                // (float*)bevstage_buffer[bevbuffer_map["BEV_feat"]]//out
                (float*)preprocessstage_buffer[preprocessbuffer_map["pre_process_input"]]
                );
  2. Then run the pre-process stage network forward:
if(!preprocessstage_context->enqueueV2(preprocessstage_buffer, stream, nullptr)){
        printf("Pre-peocess stage forward failing!\n");
    }
CHECK_CUDA(cudaDeviceSynchronize());
  3. Copy the pre-process stage output into the current BEV_feat of bevstage_buffer:
CHECK_CUDA(cudaMemcpy((float*)bevstage_buffer[bevbuffer_map["BEV_feat"]], (float*)preprocessstage_buffer[preprocessbuffer_map["pre_process_output"]],
                        bev_h * bev_w * bevpool_channel * sizeof(float), cudaMemcpyDeviceToDevice));

Finally, run align BEV feature, the BEV stage network forward, and post-processing in order.

My trained pth model reaches mAP = 0.43, which is not low, but after converting to TensorRT and running the above inference code on the sample provided by this repository, only 16 objects are output, and when I visualize my results with viewer.py the detection boxes do not land on the point-cloud targets.
Do you know the reason, or is the logic of my modified inference code correct?
Looking forward to your reply!
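For reference, one way to narrow this down is to rule out a broken pre-process export before suspecting the C++ buffer wiring. A minimal sanity check with onnxruntime, assuming the binding names listed above and a hypothetical file name; a near-constant or NaN output would point at the export rather than the wiring:

# Run random input through the exported pre-process ONNX and inspect stats.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('pre_process.onnx',  # hypothetical path
                            providers=['CPUExecutionProvider'])
x = np.random.randn(1, 80, 128, 128).astype(np.float32)
(y,) = sess.run(None, {'pre_process_input': x})
print(y.shape, float(y.mean()), float(y.std()), bool(np.isnan(y).any()))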

The program execution time is a bit long

Thanks for your great work!
I run this program on a Jetson Orin 16G, but the execution time is a bit long:
......
-------------------0-------------------
scenes_token : e7ef871f77f44331aefdebc24ec034b7, timestamp : 1533201470448696
[Preprocess ] cost time: 19.340 ms
[Image stage ] cost time: 126.830 ms
[BEV pool ] cost time: 3.655 ms
[Align Feature] cost time: 10.830 ms
[BEV stage ] cost time: 33.886 ms
[Postprocess ] cost time: 3.629 ms
[Infer total ] cost time: 198.171 ms
Detect 100 objects
-------------------0-------------------
scenes_token : e7ef871f77f44331aefdebc24ec034b7, timestamp : 1533201470448696
[Preprocess ] cost time: 13.267 ms
[Image stage ] cost time: 91.173 ms
[BEV pool ] cost time: 4.079 ms
[Align Feature] cost time: 7.566 ms
[BEV stage ] cost time: 33.450 ms
[Postprocess ] cost time: 4.519 ms
[Infer total ] cost time: 154.055 ms
Detect 100 objects

Does anyone know why?
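For reference: these numbers are well above the real-time figure the README claims for Orin, and one common lever is engine precision. A hedged sketch of building an FP16 engine with the TensorRT Python API; the repo builds its engines in C++, so this is only illustrative, and the ONNX path and plugin-library name are assumptions:

# Build an FP16 TensorRT engine from the ONNX; the repo's custom plugins
# must be loaded first so the parser can resolve the plugin ops.
import ctypes
import tensorrt as trt

ctypes.CDLL('build/libbevdet_plugins.so')  # hypothetical plugin library path
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open('bevdet_one.onnx', 'rb') as f:   # hypothetical ONNX path
    assert parser.parse(f.read())
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)      # enable half precision
plan = builder.build_serialized_network(network, config)
with open('bevdet_fp16.engine', 'wb') as f:
    f.write(plan)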

Problem exporting the model on the one branch: `BEVDepth4DTRT is not in the models registry`

Hello, and thanks for your work. When I test bevdet.yaml on the one branch, a YAML error is raised at runtime; looking at the code, some parameters are missing from bevdet.yaml. When I test the bevdet_lt_depth.yaml config and use the tools/convert_bevdet_to_TRT.py script to convert the pth to an engine, I get KeyError: 'BEVDepth4DTRT is not in the models registry'. The branch used for conversion is BEVDetv2.1. What should I do to convert the bevdet-r50-4dlongterm-depth-cbgs.pth file into an engine file? Thanks.

Failed to build engine parser

Hi, when I execute ./bevdemo ../configure.yaml, I get this error:

ERROR: 3: [runtime.cpp::deserializeCudaEngine::36] Error Code 3: API Usage Error (Parameter check failed at: runtime/rt/runtime.cpp::deserializeCudaEngine::36, condition: (blob) != nullptr
)
ERROR: Failed to build engine parser!
free(): double free detected in tcache 2
Aborted (core dumped)

Please help resolve it.

Question about sizes in the export

img_input = torch.zeros([6 * B, 3, 900, 400], dtype=torch.int32, device=f'cuda:{args.gpu_id}')
Why is the size set to 900*400 when converting to ONNX, instead of 900*1600 directly?
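For what it's worth, the byte counts work out if each int32 element packs four uint8 values: 3 x 900 x 400 int32 elements is exactly 3 x 900 x 1600 bytes, one byte per pixel channel of a 900x1600 image, which would explain the export shape. A numpy illustration of that arithmetic (the actual kernel's packing layout may differ):

import numpy as np

rgb = np.random.randint(0, 256, (900, 1600, 3), dtype=np.uint8)  # H, W, C
chw = np.ascontiguousarray(rgb.transpose(2, 0, 1))               # C, H, W
packed = chw.reshape(3, 900, 400, 4).view(np.uint32)[..., 0]     # 4 uint8 -> 1 int32
print(packed.shape)  # (3, 900, 400): the export shape, same bytes as 900x1600 uint8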

Questions about obtaining bev_feat_prev

Hello @LCH1238 , thank you for the open source code.

I have some questions, or maybe I misunderstood the code; corrections are welcome.

For the BEVDet-r50-longterm-depth model, the multi-frame bev_feat_prev needs to be concatenated and aligned before bev_stage. However, I find that bev_feat_prev in the code is not the real history: the same bev_feat_curr is concatenated num_adj times (see the linked code).

So how should bev_feat_prev be obtained during current-frame inference?
Looking forward to your reply.
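For reference: in a real streaming deployment the previous frames' BEV features would come from a cache filled frame by frame; replicating bev_feat_curr is plausibly just a cold-start placeholder for a single sample with no history. A minimal sketch of such a cache, with a hypothetical align_fn and an illustrative channel-concat layout (the repo's actual adj_feats layout is (1, 8, 80, 128, 128)):

# Ring buffer of (bev_feat, ego_pose) pairs for streaming inference.
from collections import deque
import numpy as np

NUM_ADJ = 8
history = deque(maxlen=NUM_ADJ)  # newest first; old frames drop off the end

def step(curr_feat, curr_pose, align_fn):
    # align every cached frame into the current ego frame
    aligned = [align_fn(f, p, curr_pose) for f, p in history]
    while len(aligned) < NUM_ADJ:    # cold start: pad with the current frame,
        aligned.append(curr_feat)    # which is what the sample data mimics
    history.appendleft((curr_feat, curr_pose))
    return np.concatenate([curr_feat] + aligned, axis=1)

# toy usage: identity alignment, one frame
feat = np.zeros((1, 80, 128, 128), np.float32)
out = step(feat, np.eye(4), lambda f, p, q: f)
print(out.shape)  # (1, 720, 128, 128) = (1 + 8) * 80 channels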
