Giter Site home page Giter Site logo

nadeen86 / epnetv2 Goto Github PK

View Code? Open in Web Editor NEW

This project forked from happinesslz/epnetv2

0.0 0.0 0.0 377 KB

EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection (TPAMI-2022)

License: MIT License

Shell 0.95% C++ 3.95% Python 82.84% C 0.38% Cuda 11.88%

epnetv2's Introduction

EPNet++

EPNet++: Cascade Bi-directional Fusion forMulti-Modal 3D Object Detection (TPAMI 2022).

Paper is now available in IEEE Explore or Arxiv EPNet++ , and the code is based on EPNet and PointRCNN.

Abstract

Recently, fusing the LiDAR point cloud and camera image to improve the performance and robustness of 3D object detection has received more and more attention, as these two modalities naturally possess strong complementarity. In this paper, we propose EPNet++ for multi-modal 3D object detection by introducing a novel Cascade Bi-directional Fusion (CB-Fusion) module and a Multi-Modal Consistency (MC) loss. More concretely, the proposed CB-Fusion module enhances point features with plentiful semantic information absorbed from the image features in a cascade bi-directional interaction fusion manner, leading to more powerful and discriminative feature representations. The MC loss explicitly guarantees the consistency between predicted scores from two modalities to obtain more comprehensive and reliable confidence scores. The experimental results on the KITTI, JRDB and SUN-RGBD datasets demonstrate the superiority of EPNet++ over the state-of-the-art methods. Besides, we emphasize a critical but easily overlooked problem, which is to explore the performance and robustness of a 3D detector in a sparser scene. Extensive experiments present that EPNet++ outperforms the existing SOTA methods with remarkable margins in highly sparse point cloud cases, which might be an available direction to reduce the expensive cost of LiDAR sensors.

image

TODO

**Note: We will integrate CB-Fusion into OpenPCDet for training multiple categories. Besides, we will provide voxel-wise fusion based on CB-Fusion on both Waymo and KITTI dataset. We plan to release them in January, 2023. **

Install(Same with PointRCNN)

The Environment:

  • Linux (tested on Ubuntu 16.04)
  • Python 3.7.6
  • PyTorch 1.20 + CUDA-10.0/10.1

a. Clone the EPNet++ repository.

git clone https://github.com/happinesslz/EPNetV2.git

b. Create conda environment.

conda create -n epnet_plus_plus_open python==3.7.6
conda activate epnet_plus_plus_open
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
pip install -r requirements.txt

c. Build and install the pointnet2_lib, iou3d, roipool3d libraries by executing the following command:

sh build_and_install.sh

Dataset preparation

Please download the official KITTI 3D object detection dataset and our provided train mask based on the KINS dataset. Then organize the downloaded files as follows:

EPNetV2
├── data
│   ├── KITTI
│   │   ├── ImageSets
│   │   ├── object
│   │   │   ├──training
│   │   │      ├──calib & velodyne & label_2 & image_2 & (optional: planes) & train_mask
│   │   │   ├──testing
│   │   │      ├──calib & velodyne & image_2
├── lib
├── pointnet2_lib
├── tools

Trained model

The results of Car on Recall 40:

Models Easy Moderate Hard
Car 92.98 83.45 82.44
Pedestrian 77.70 70.20 63.80
Cyclist 86.86 64.11 60.24

To evaluate all these models, please download the above models from Google or Baidu Pan (1rw2). Unzip these models and move them to "./tools". Then run:

bash run_all_eval_epnet_plus_plus_models.sh

Implementation

Training & Inference

bash run_train_and_eval_epnet_plus_plus_car.sh
bash run_train_and_eval_epnet_plus_plus_ped.sh
bash run_train_and_eval_epnet_plus_plus_cyc.sh

Acknowledgement

Thanks for the superior open-source project PointRCNN. Thanks for all co-authors.

Citation

If you find this work useful in your research, please consider cite:

@article{liu2022epnet++,
  title={EPNet++: Cascade bi-directional fusion for multi-modal 3D object detection},
  author={Liu, Zhe and Huang, Tengteng and Li, Bingling and Chen, Xiwu and Wang, Xi and Bai, Xiang},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2022},
  publisher={IEEE}
}
@article{Huang2020EPNetEP,
  title={EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection},
  author={Tengteng Huang and Zhe Liu and Xiwu Chen and Xiang Bai},
  booktitle ={ECCV},
  month = {July},
  year={2020}
}
@InProceedings{Shi_2019_CVPR,
    author = {Shi, Shaoshuai and Wang, Xiaogang and Li, Hongsheng},
    title = {PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2019}
}

epnetv2's People

Contributors

happinesslz avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.