ouyangjunyuan / pointcloud-3d-detector-tensorrt

The first tensorrt implementation for point-based 3d detector, i.e., 3DSSD,SASA,IA-SSD.

CMake 6.38% C++ 46.44% Cuda 42.27% C 2.51% Python 2.40%
3d-detection pointcloud-detection tensorrt ia-ssd sasa 3dssd

Introduction

This repository provides a ROS wrapper for lightweight yet powerful 3D object detection, with a TensorRT inference backend, for real-time robotic applications.

  1. It is effective and efficient, achieving a 5 ms runtime and 85% 3D Car mAP@R40.
  2. We chose IA-SSD as the baseline because of its high efficiency. In addition, HAVSampler and GridBallQuery are adopted; they are 1000x faster than FPS and the original BallQuery, respectively.
  3. We implement TensorRT plugins for NMS postprocessing and for commonly used operators of point-based point cloud detectors, e.g., sampling, grouping, and gather.

News

  1. [2022/05/01]: We offer a faster version of HAVSampler and have rebuilt all plugins with our auto-declaration header. The updates can be found in the devel branch.
  2. [2022/04/17]: We release the PyTorch models and the ONNX export script. You can retrain or modify our models.
  3. [2022/04/14]: This repository implements GridBallQuery with a computational complexity of $\mathcal{O}(NK^3)$, instead of the $\mathcal{O}(NM)$ of BallQuery.
  4. [2022/04/08]: Added support for INT8 quantization and a profiler.
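
The complexity difference mentioned in the 2022/04/14 item can be illustrated with a small CPU sketch. This is not the repository's CUDA plugin: the names `Point`, `voxelKey`, `cellOf`, and `gridBallQuery` are hypothetical, and the sketch assumes N is the number of query centers, M the number of points, and K the number of grid cells searched per axis (the source does not define these symbols). With a cell size equal to the radius, K = 3, so each query inspects 27 cells instead of all M points.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Point { float x, y, z; };

// Pack integer cell coordinates into one 64-bit key (21 bits per axis).
// Assumption: the scene spans fewer than 2^21 cells along each axis.
inline std::int64_t voxelKey(int ix, int iy, int iz) {
    const std::int64_t mask = (1LL << 21) - 1;
    return ((ix & mask) << 42) | ((iy & mask) << 21) | (iz & mask);
}

// Index of the grid cell that contains coordinate v.
inline int cellOf(float v, float cell) {
    return static_cast<int>(std::floor(v / cell));
}

// For each query, collect up to maxNeighbors point indices within `radius`.
std::vector<std::vector<int>> gridBallQuery(const std::vector<Point>& points,
                                            const std::vector<Point>& queries,
                                            float radius,
                                            std::size_t maxNeighbors) {
    assert(radius > 0.0f);
    // Build the grid once: one pass over the M points.
    std::unordered_map<std::int64_t, std::vector<int>> grid;
    for (int i = 0; i < static_cast<int>(points.size()); ++i) {
        const Point& p = points[i];
        grid[voxelKey(cellOf(p.x, radius), cellOf(p.y, radius),
                      cellOf(p.z, radius))].push_back(i);
    }

    std::vector<std::vector<int>> result(queries.size());
    const float r2 = radius * radius;
    for (std::size_t q = 0; q < queries.size(); ++q) {
        const Point& c = queries[q];
        const int cx = cellOf(c.x, radius);
        const int cy = cellOf(c.y, radius);
        const int cz = cellOf(c.z, radius);
        // Visit only the 3 x 3 x 3 cells around the query (K = 3).
        for (int dx = -1; dx <= 1; ++dx)
        for (int dy = -1; dy <= 1; ++dy)
        for (int dz = -1; dz <= 1; ++dz) {
            auto it = grid.find(voxelKey(cx + dx, cy + dy, cz + dz));
            if (it == grid.end()) continue;
            for (int idx : it->second) {
                if (result[q].size() >= maxNeighbors) break;
                const Point& p = points[idx];
                const float ex = p.x - c.x, ey = p.y - c.y, ez = p.z - c.z;
                if (ex * ex + ey * ey + ez * ez <= r2) result[q].push_back(idx);
            }
        }
    }
    return result;
}
```

A brute-force BallQuery would instead compare every query against every point, which is where the $\mathcal{O}(NM)$ term comes from. The per-query cost here depends on how many points fall into the visited cells, so the $\mathcal{O}(NK^3)$ bound assumes a bounded number of points per cell.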

Build

We tested on the following platform:

  1. Ubuntu 18.0 with a 2080Ti GPU
  2. Python 3.7
  3. PyTorch 1.12
  4. CUDA 11.0
  5. cuDNN 8.4
  6. TensorRT 8.4.0

Follow the official guides to install the dependencies above first, and then build this package.

export CUDNN_DIR=/path/to/cudnn/root
export TENSORRT_DIR=/path/to/tensorrt/root

mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DTRT_QUANTIZE=FP16 -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc
make -j$(nproc)

Alternatively, build it as a normal ROS package.

Test

We test the exported models with TensorRT on the KITTI val set and report AP_3D@R11 / AP_3D@R40 as follows:

iassd_hvcsx2_4x8_80e_kitti_3cls

Precision   Car (R11 / R40)      Pedestrian (R11 / R40)   Cyclist (R11 / R40)   Runtime
FP32        83.8752 / 84.9749    53.9177 / 53.1046        67.2500 / 67.1609     10 ms
FP16        80.2896 / 80.8535    53.0247 / 51.4732        67.8503 / 68.3627     8 ms
INT8        77.7286 / 79.3178    52.2956 / 50.7517        68.3595 / 68.3880     9 ms

Unexpectedly, the runtime in INT8 mode is higher than in FP16 mode. This may be because we did not implement the INT8 format for the custom layers, and because the point cloud model contains few large blocks of computation.

We also profile the model at different precisions; read this for details.

iassd_hvcsx2_gq_4x8_80e_kitti_3cls

Precision   Car   Pedestrian   Cyclist   Runtime
FP32        -     -            -         6 ms
FP16        -     -            -         5 ms
INT8        -     -            -         -

How to use

The node subscribes to sensor_msgs::PointCloud2 messages on /points and publishes visualization_msgs::MarkerArray messages on /objects.

./devel/lib/point_detection/point_detector

We also offer a utility script that publishes point clouds from .bin files.

python src/pcvt.py -s bin -d topic -t /points -p /home/nrsl/Downloads/velodyne_points/data 
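
For readers who want to feed .bin files from C++ rather than through the Python script, the following is a minimal sketch of reading one file. It assumes the standard KITTI velodyne layout (a flat array of float32 values in x, y, z, intensity order); `PointXYZI` and `loadKittiBin` are hypothetical names and are not part of this repository.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct PointXYZI { float x, y, z, intensity; };
static_assert(sizeof(PointXYZI) == 4 * sizeof(float), "unexpected padding");

// Reads a KITTI-style velodyne file into memory.
// Returns an empty vector if the file is missing, empty, or unreadable.
std::vector<PointXYZI> loadKittiBin(const std::string& path) {
    // Open at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return {};
    const std::streamsize bytes = in.tellg();
    if (bytes <= 0) return {};
    in.seekg(0, std::ios::beg);

    // Trailing bytes that do not form a full point are ignored.
    const std::size_t count = static_cast<std::size_t>(bytes) / sizeof(PointXYZI);
    assert(count * sizeof(PointXYZI) <= static_cast<std::size_t>(bytes));

    std::vector<PointXYZI> points(count);
    in.read(reinterpret_cast<char*>(points.data()),
            static_cast<std::streamsize>(count * sizeof(PointXYZI)));
    if (!in) return {};
    return points;
}
```

Calling `loadKittiBin` on one file from the directory passed with `-p` would return its points; converting them to a sensor_msgs::PointCloud2 message is left out because it depends on the ROS setup.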

Plugins

You can easily implement a plugin using just our AUTO-CODES-GENERATION header.

ONNX

We export the model with RobDet3D. Please refer to its manual to export your own ONNX model. Feel free to let me know if you have any questions.

Limitation

  1. When building an engine in INT8 mode, a CUDA configuration error is thrown during calibration. Therefore, only FP32 and FP16 modes can be used.

TODO

  1. Consider using CUDA Graphs to reduce the latency introduced by launching too many kernels.
  2. Use dynamic parallelism to avoid the CPU-based loop in HAVSampling.

Citation

If you find this project useful in your research, please consider citing:

@article{ouyang2023hierarchical,
  title={Hierarchical Adaptive Voxel-guided Sampling for Real-time Applications in Large-scale Point Clouds},
  author={Ouyang, Junyuan and Liu, Xiao and Chen, Haoyao},
  journal={arXiv preprint arXiv:2305.14306},
  year={2023}
}

pointcloud-3d-detector-tensorrt's Issues

Model loading does not finish

After building the workspace, I tried to run the node using:

 rosrun point_detection point_detector

and got these logs:

loading /home/iassd_user/iassd_ws/src/pointcloud-3d-detector-tensorrt/plugins/lib/librd3d_trt_plugin.so ...
try loading TensorRT engine file: /home/iassd_user/iassd_ws/src/pointcloud-3d-detector-tensorrt/config/iassd_hvcsx2_gq_4x8_80e_kitti_3cls(export_fp16).engine
====== model infos ======
-> points(1, 16384, 4)
<- boxes(1, 256, 8)
<- scores(1, 256)
<- nums(1, 1)
=========================
CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars

I set CUDA_MODULE_LOADING=LAZY to get rid of the warning, but the model still does not load any further. I tried to debug it, and I think the problem is here.
Any suggestions?
Thanks

How to deal with Error Code 3: API Usage Error

====== model infos ======
-> points(1, 16384, 4)
<- boxes(1, 256, 8)
<- scores(1, 256)
<- nums(1, 1)

3: [runtime.cpp::~Runtime::346] Error Code 3: API Usage Error (Parameter check failed at: runtime/rt/runtime.cpp::~Runtime::346, condition: mEngineCounter.use_count() == 1. Destroying a runtime before destroying deserialized engines created by the runtime leads to undefined behavior.
)
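
The message says that a runtime was destroyed while an engine it had deserialized was still alive. The issue does not show the calling code, so the following is only a sketch of one common way to get the order right, under the assumption that both objects are owned by the same class. `FakeRuntime`, `FakeEngine`, `Detector`, and `destructionOrder` are hypothetical stand-ins, not TensorRT or repository types. C++ destroys non-static members in reverse declaration order, so declaring the runtime before the engine releases the engine first.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for a runtime and an engine.
// Each one only records its own destruction in a shared log.
struct FakeRuntime {
    std::vector<std::string>* log;
    ~FakeRuntime() { log->push_back("runtime"); }
};

struct FakeEngine {
    std::vector<std::string>* log;
    ~FakeEngine() { log->push_back("engine"); }
};

// Members are destroyed in reverse declaration order:
// `engine` (declared last) goes first, then `runtime`.
struct Detector {
    std::unique_ptr<FakeRuntime> runtime;  // declared first, destroyed last
    std::unique_ptr<FakeEngine> engine;    // declared last, destroyed first

    explicit Detector(std::vector<std::string>* log)
        : runtime(new FakeRuntime{log}), engine(new FakeEngine{log}) {}
};

// Creates and destroys a Detector, returning the observed destruction order.
std::vector<std::string> destructionOrder() {
    std::vector<std::string> log;
    {
        Detector detector(&log);
        assert(detector.runtime && detector.engine);
    }  // detector goes out of scope here
    return log;
}
```

If the two objects live in separate scopes or are held as raw pointers, the same rule applies manually: release the execution context and the engine before the runtime.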

About Catkin_make and CUBLAS_LIB

Hi,
I tried to build this package with catkin_make and got this cuda-related error:

-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- ~~  traversing 1 packages in topological order:
-- ~~  - point_detection
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- +++ processing catkin package: 'point_detection'
-- ==> add_subdirectory(pointcloud-3d-detector-tensorrt)
-- read TENSORRT_DIR from environment variable
-- read CUDNN_DIR from environment variable
CMake Error at pointcloud-3d-detector-tensorrt/plugins/CMakeLists.txt:34 (find_library):
  Could not find CUBLAS_LIB using the following names: cublas


-- Configuring incomplete, errors occurred!
Invoking "cmake" failed

I suspect that the cuDNN directory is not set correctly in my case, but I have tried all the options I could think of without luck.
When I search for libcublas:

sudo find /usr -name "libcublas*"

I got these paths:

/usr/local/lib/python3.8/dist-packages/nvidia/cublas/lib/libcublasLt.so.11
/usr/local/lib/python3.8/dist-packages/nvidia/cublas/lib/libcublas.so.11
/usr/local/cuda-12.1/targets/x86_64-linux/lib/stubs/libcublasLt.so
/usr/local/cuda-12.1/targets/x86_64-linux/lib/stubs/libcublas.so
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublasLt.so
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublas.so
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublasLt.so.12
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublas.so.12
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublas.so.12.1.0.26
/usr/local/cuda-12.1/targets/x86_64-linux/lib/libcublasLt.so.12.1.0.26
/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublasLt.so.11.11.3.6
/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublas.so.11.11.3.6
/usr/local/cuda-11.8/targets/x86_64-linux/lib/stubs/libcublasLt.so
/usr/local/cuda-11.8/targets/x86_64-linux/lib/stubs/libcublas.so
/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublasLt.so
/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublasLt.so.11
/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublas.so.11
/usr/local/cuda-11.8/targets/x86_64-linux/lib/libcublas.so
/usr/share/doc/libcublas-dev-12-1
/usr/share/doc/libcublas-11-8
/usr/share/doc/libcublas-12-1
/usr/share/doc/libcublas-dev-11-8

Could you please direct me to solve this issue?

Thanks

How to run the HAV sampler?

Hi, thank you for your great work! I just want to run the HAV sampler from your paper, but the code is a bit complicated. Can you tell me how to do it? I would like to input a point cloud and get the downsampled point cloud.

Question about datasets

Hi!
Thanks for sharing this great work!
I wonder if you plan to add ONNX files and configs for models trained on other datasets (Waymo, nuScenes, ...). If not, will you provide a PyTorch-to-ONNX conversion script for IA-SSD?

Thanks

cublas error

Hello @OuyangJunyuan, I tried to build your project.

I got the error below in the build step.

pointcloud-3d-detector-tensorrt/./inc/model.h:175:25: error: variable ‘std::ofstream p’ has initializer but incomplete type
175 | std::ofstream p(engine_file.c_str(), std::ios::binary);
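
GCC typically reports "has initializer but incomplete type" for std::ofstream when only a forward declaration of the class is visible, which happens if <fstream> is not included. Whether that is the actual cause in model.h is not confirmed by this issue, so the following is a minimal sketch of the pattern with the include in place. `saveBlob` is a hypothetical helper; only the `std::ofstream p(engine_file.c_str(), std::ios::binary)` line mirrors the reported code.

```cpp
#include <cassert>
#include <fstream>  // full definition of std::ofstream
#include <ios>
#include <string>
#include <vector>

// Writes a serialized blob (for example, engine bytes) to disk.
// Returns true only if the file was opened and every byte was written.
bool saveBlob(const std::string& engine_file, const std::vector<char>& blob) {
    assert(!engine_file.empty());
    std::ofstream p(engine_file.c_str(), std::ios::binary);
    if (!p) return false;
    p.write(blob.data(), static_cast<std::streamsize>(blob.size()));
    return p.good();
}
```

If this is the cause, adding `#include <fstream>` near the top of the header that declares the stream should be enough; other standard headers are not guaranteed to pull it in.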

compile error on ./plugins/src/nms3DPlugin/nms3D.cu

Dear @OuyangJunyuan,

I ran into a problem while building the source.

The error messages are as follows:

[build] /usr/include/c++/10/tuple(582): error: pack "_UElements" does not have the same number of elements as "_Elements"
[build] detected during:
[build] instantiation of "__nv_bool std::tuple<_Elements...>::__nothrow_constructible<_UElements...>() [with _Elements=<const int &>, _UElements=<>]"
[build] /usr/include/c++/10/bits/stl_map.h(502): here
[build] instantiation of "std::map<_Key, _Tp, _Compare, _Alloc>::mapped_type &std::map<_Key, _Tp, _Compare, _Alloc>::operator[](const std::map<_Key, _Tp, _Compare, _Alloc>::key_type &) [with _Key=int, _Tp=cub::CachingDeviceAllocator::TotalBytes, _Compare=std::less, _Alloc=std::allocator<std::pair<const int, cub::CachingDeviceAllocator::TotalBytes>>]"
[build] /usr/local/cuda/include/cub/util_allocator.cuh(418): here
[build]
[build] 1 error detected in the compilation of "/home/ldk/test_ws/src/pointcloud-3d-detector-tensorrt-devel/plugins/src/nms3DPlugin/nms3D.cu".

I googled it, and the suggested cause is the GCC version (the error occurs with gcc-9, so one should downgrade to gcc-8). I downgraded GCC to version 8, but the error was not resolved.

Have you seen this error before, or do you have a solution for it?

My configuration is as follows,
OS: Ubuntu 20.04
Cuda: 11.1
Cudnn: 8.6.0
TensorRT: 8.4.0.6 or 8.4.1.5 (Both cause the error)

Thanks.

IASSD deployment issues on jetson orin

First of all, thank you for your work on the IASSD deployment. I am having problems deploying the IASSD algorithm on a Jetson Orin. The project compiles without errors, but when I publish the point cloud the process hangs: it produces no result or error, and I cannot terminate it. Could you please give me some help?
Output of the cmake run:
-- read TENSORRT_DIR from environment variable
-- read CUDNN_DIR from environment variable
-- Found CUDA: /usr/local/cuda (found suitable version "11.4", minimum required is "11.4")
-- GPU_ARCHS is not defined. Generating CUDA code for default SMs: 53;60;61;70;75;80;86
-- read CUDNN_DIR from environment variable
-- read TENSORRT_DIR from environment variable
-- Using CATKIN_DEVEL_PREFIX: /home/xx/data/pointcloud-3d-detector-tensorrt/build/devel
-- Using CMAKE_PREFIX_PATH: /opt/ros/noetic
-- This workspace overlays: /opt/ros/noetic
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.8.10", minimum required is "3")
-- Using PYTHON_EXECUTABLE: /usr/bin/python3
-- Using Debian Python package layout
-- Found PY_em: /usr/lib/python3/dist-packages/em.py
-- Using empy: /usr/lib/python3/dist-packages/em.py
-- Using CATKIN_ENABLE_TESTING: ON
-- Call enable_testing()
-- Using CATKIN_TEST_RESULTS_DIR: /home/xx/data/pointcloud-3d-detector-tensorrt/build/test_results
-- Forcing gtest/gmock from source, though one was otherwise available.
-- Found gtest sources under '/usr/src/googletest': gtests will be built
-- Found gmock sources under '/usr/src/googletest': gmock will be built
CMake Deprecation Warning at /usr/src/googletest/CMakeLists.txt:4 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.

Update the VERSION argument value or use a ... suffix to tell
CMake that the project does not need compatibility with older versions.

-- The C compiler identification is GNU 9.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
CMake Deprecation Warning at /usr/src/googletest/googlemock/CMakeLists.txt:45 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.

Update the VERSION argument value or use a ... suffix to tell
CMake that the project does not need compatibility with older versions.

CMake Deprecation Warning at /usr/src/googletest/googletest/CMakeLists.txt:56 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.

Update the VERSION argument value or use a ... suffix to tell
CMake that the project does not need compatibility with older versions.

-- Found PythonInterp: /usr/bin/python3 (found version "3.8.10")
-- Using Python nosetests: /usr/bin/nosetests3
-- catkin 0.8.10
-- BUILD_SHARED_LIBS is on
-- Checking for module 'eigen3'
-- Found eigen3, version 3.3.7
-- Found Eigen: /usr/include/eigen3 (Required is at least version "3.1")
-- Eigen found (include: /usr/include/eigen3, version: 3.3.7)
-- Checking for module 'flann'
-- Found flann, version 1.9.1
-- Found FLANN: /usr/lib/aarch64-linux-gnu/libflann_cpp.so
CMake Warning (dev) at /opt/cmake-3.23.0/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to find_package_handle_standard_args (PCL_COMMON)
does not match the name of the calling package (PCL). This can lead to
problems in calling code that expects find_package result variables
(e.g., _FOUND) to follow a certain pattern.
Call Stack (most recent call first):
/usr/lib/aarch64-linux-gnu/cmake/pcl/PCLConfig.cmake:616 (find_package_handle_standard_args)
CMakeLists.txt:17 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.

-- Found PCL_COMMON: /usr/lib/aarch64-linux-gnu/libpcl_common.so
CMake Warning (dev) at /opt/cmake-3.23.0/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to find_package_handle_standard_args (PCL_KDTREE)
does not match the name of the calling package (PCL). This can lead to
problems in calling code that expects find_package result variables
(e.g., _FOUND) to follow a certain pattern.
Call Stack (most recent call first):
/usr/lib/aarch64-linux-gnu/cmake/pcl/PCLConfig.cmake:616 (find_package_handle_standard_args)
CMakeLists.txt:17 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.

-- Found PCL_KDTREE: /usr/lib/aarch64-linux-gnu/libpcl_kdtree.so
CMake Warning (dev) at /opt/cmake-3.23.0/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to find_package_handle_standard_args (PCL_OCTREE)
does not match the name of the calling package (PCL). This can lead to
problems in calling code that expects find_package result variables
(e.g., _FOUND) to follow a certain pattern.
Call Stack (most recent call first):
/usr/lib/aarch64-linux-gnu/cmake/pcl/PCLConfig.cmake:616 (find_package_handle_standard_args)
CMakeLists.txt:17 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.

-- Found PCL_OCTREE: /usr/lib/aarch64-linux-gnu/libpcl_octree.so
CMake Warning (dev) at /opt/cmake-3.23.0/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to find_package_handle_standard_args (PCL_SEARCH)
does not match the name of the calling package (PCL). This can lead to
problems in calling code that expects find_package result variables
(e.g., _FOUND) to follow a certain pattern.
Call Stack (most recent call first):
/usr/lib/aarch64-linux-gnu/cmake/pcl/PCLConfig.cmake:616 (find_package_handle_standard_args)
CMakeLists.txt:17 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.

-- Found PCL_SEARCH: /usr/lib/aarch64-linux-gnu/libpcl_search.so
CMake Warning (dev) at /opt/cmake-3.23.0/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to find_package_handle_standard_args
(PCL_SAMPLE_CONSENSUS) does not match the name of the calling package
(PCL). This can lead to problems in calling code that expects
find_package result variables (e.g., _FOUND) to follow a certain
pattern.
Call Stack (most recent call first):
/usr/lib/aarch64-linux-gnu/cmake/pcl/PCLConfig.cmake:616 (find_package_handle_standard_args)
CMakeLists.txt:17 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.

-- Found PCL_SAMPLE_CONSENSUS: /usr/lib/aarch64-linux-gnu/libpcl_sample_consensus.so
CMake Warning (dev) at /opt/cmake-3.23.0/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
The package name passed to find_package_handle_standard_args
(PCL_FILTERS) does not match the name of the calling package (PCL). This
can lead to problems in calling code that expects find_package result
variables (e.g., _FOUND) to follow a certain pattern.
Call Stack (most recent call first):
/usr/lib/aarch64-linux-gnu/cmake/pcl/PCLConfig.cmake:616 (find_package_handle_standard_args)
CMakeLists.txt:17 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.

-- Found PCL_FILTERS: /usr/lib/aarch64-linux-gnu/libpcl_filters.so
-- ******** Summary ********
-- CMake version : 3.23.0
-- CMake command : /opt/cmake-3.23.0/bin/cmake
-- System : Linux
-- C++ compiler : /usr/bin/c++
-- C++ compiler version : 9.4.0
-- Build type : Release
-- CXX flags :
-- CMAKE_PREFIX_PATH : /home/xx/data/pointcloud-3d-detector-tensorrt/build/devel;/opt/ros/noetic
-- CMAKE_INSTALL_PREFIX : /usr/local
-- CMAKE_MODULE_PATH : /usr/lib/aarch64-linux-gnu/cmake/pcl/Modules

-- CUDA_VERSION : 11.4
-- CUDA_TOOLKIT_ROOT_DIR : /usr/local/cuda
-- CUDA_LIBRARIES : /usr/local/cuda/lib64/libcudart_static.a;Threads::Threads;dl;/usr/lib/aarch64-linux-gnu/librt.so
-- CUDA_INCLUDE_DIRS: : /usr/local/cuda/include
-- CUDART_LIB : /usr/local/cuda/lib64/libcudart_static.a
-- CUBLAS_LIB : /usr/local/cuda/lib64/libcublas.so
-- TENSORRT_LIB : /usr/src/tensorrt/lib/libnvinfer.so;/usr/lib/aarch64-linux-gnu/libnvonnxparser.so
-- TENSORRT_INCLUDE_DIR : /usr/src/tensorrt/include
-- CUDNN_LIB : /usr/lib/aarch64-linux-gnu/libcudnn.so
-- CUBLASLT_LIB : /usr/local/cuda/lib64/libcublasLt.so
-- Configuring done
-- Generating done
-- Build files have been written to: /home/xx/data/pointcloud-3d-detector-tensorrt/build

make -j10 compiled successfully without reporting any errors. The output after running the node is as follows:
./devel/lib/point_detection/point_detector
loading /home/xx/data/pointcloud-3d-detector-tensorrt/plugins/lib/librd3d_trt_plugin.so ...
try loading TensorRT engine file: /home/xx/data/pointcloud-3d-detector-tensorrt/iassd_hvcsx2_gq_4x8_80e_kitti_3cls(export_fp16).engine
====== model infos ======
-> points(1, 16384, 4)
<- boxes(1, 256, 8)
<- scores(1, 256)
<- nums(1, 1)

However, when I publish the point cloud, the program is stuck and can't end. When I do a debug print, I find that it is stuck when the data enters the detector and the GPU occupancy stays at 99%.

My environment is as follows:
CUDA 11.4.14
TensorRT 8.4.1
cuDNN 8.4.1
OpenCV 4.5.4
GCC 9.4.0
