collector-m / epnet

This project is forked from happinesslz/epnet

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection (ECCV 2020)

License: MIT License

Shell 0.51% Python 85.09% C++ 5.30% Cuda 9.10%

EPNet

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection (ECCV 2020). The paper is now available at EPNet, and the code is based on PointRCNN.

Highlights

  1. No extra image annotations are required, e.g. 2D bounding boxes or semantic labels.
  2. A more accurate multi-scale point-wise fusion of image and point cloud features.
  3. The proposed CE (consistency enforcing) loss greatly improves 3D detection performance.
  4. No GT AUG (ground-truth augmentation) is used.

Contributions

This is the PyTorch implementation of EPNet on the KITTI dataset, developed mainly by Liu Zhe and Huang Tengteng. Some parts also benefit from contributions by Chen Xiwu.

Abstract

In this paper, we aim at addressing two critical issues in the 3D detection task, including the exploitation of multiple sensors (namely LiDAR point cloud and camera image), as well as the inconsistency between the localization and classification confidence. To this end, we propose a novel fusion module to enhance the point features with semantic image features in a point-wise manner without any image annotations. Besides, a consistency enforcing loss is employed to explicitly encourage the consistency of both the localization and classification confidence. We design an end-to-end learnable framework named EPNet to integrate these two components. Extensive experiments on the KITTI and SUN-RGBD datasets demonstrate the superiority of EPNet over the state-of-the-art methods.

Network

The architecture of our two-stream RPN is shown below.

[image: two-stream RPN architecture]

The architecture of our LI-Fusion module in the two-stream RPN is shown below.

[image: LI-Fusion module architecture]

Install (same as PointRCNN)

The Environment:

  • Linux (tested on Ubuntu 16.04)
  • Python 3.6+
  • PyTorch 1.0+

a. Clone the EPNet repository.

git clone https://github.com/happinesslz/EPNet.git

b. Install the required Python libraries, such as easydict, tqdm, and tensorboardX.

c. Build and install the pointnet2_lib, iou3d, roipool3d libraries by executing the following command:

sh build_and_install.sh
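
Before building, it can help to confirm that the environment matches the requirements listed above. The following is a minimal sketch, not part of the repository: the helper names `missing_modules` and `python_is_supported` are hypothetical, and the module list only covers the packages named in step b plus PyTorch (import name `torch`). It does not check the compiled pointnet2_lib, iou3d, or roipool3d extensions.

```python
import importlib.util
import sys

# Packages named in step b; "torch" is the import name of PyTorch.
REQUIRED_MODULES = ["torch", "easydict", "tqdm", "tensorboardX"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info, minimum=(3, 6)):
    """Check an interpreter version against the Python 3.6+ requirement."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python 3.6+:", python_is_supported())
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Any module reported as missing can then be installed with the package manager of your choice before running the build script.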

Dataset preparation

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:

EPNet
├── data
│   ├── KITTI
│   │   ├── ImageSets
│   │   ├── object
│   │   │   ├── training
│   │   │   │   ├── calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │   ├── testing
│   │   │   │   ├── calib & velodyne & image_2
├── lib
├── pointnet2_lib
├── tools
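
A quick way to verify this layout before training is to check for the expected directories. The sketch below is an illustration under stated assumptions: `missing_kitti_dirs` is a hypothetical helper (not part of the repository), it only tests that the directories from the tree above exist, and it treats `planes` as optional.

```python
from pathlib import Path

# Sub-directories expected under data/KITTI/object, taken from the tree above.
# "planes" is optional and therefore not checked.
EXPECTED = {
    "training": ["calib", "velodyne", "label_2", "image_2"],
    "testing": ["calib", "velodyne", "image_2"],
}


def missing_kitti_dirs(repo_root):
    """Return the expected KITTI directories that are absent under repo_root."""
    kitti = Path(repo_root) / "data" / "KITTI"
    wanted = [kitti / "ImageSets"]
    for split, subdirs in EXPECTED.items():
        wanted.extend(kitti / "object" / split / name for name in subdirs)
    # Report paths relative to the repository root, with forward slashes.
    return [p.relative_to(repo_root).as_posix() for p in wanted if not p.is_dir()]


if __name__ == "__main__":
    for path in missing_kitti_dirs("."):
        print("missing:", path)
```

Run from the repository root, this prints one line per missing directory and nothing when the layout is complete.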

Trained model

Results for the Car class, evaluated with 40 recall positions:

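| LI Fusion | CE loss | Easy  | Moderate | Hard  | mAP   | Models               |
|-----------|---------|-------|----------|-------|-------|----------------------|
| No        | No      | 88.76 | 78.03    | 76.20 | 80.99 | Google, Baidu (a43t) |
| Yes       | No      | 89.93 | 80.77    | 77.25 | 82.65 | Google, Baidu (dbxy) |
| No        | Yes     | 92.12 | 81.48    | 79.34 | 84.31 | Google, Baidu (hrkv) |
| Yes       | Yes     | 92.17 | 82.68    | 80.10 | 84.99 | Google, Baidu (nasm) |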

In addition, adding an IoU branch to EPNet (the last row of the table above) brings a minor improvement and makes the results more stable. The results are 92.50 (Easy), 82.45 (Moderate), 80.29 (Hard), and 85.08 (mAP); the model checkpoint can be obtained from Google, Baidu (8sir).
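
The mAP values reported here appear to be the mean of the Easy, Moderate, and Hard scores. This is an inference from the numbers, not something the README states. The short sketch below recomputes that mean for each configuration; the row labels and the `mean_ap` helper are hypothetical names introduced for illustration. The recomputed values agree with the reported ones to within about 0.01, which would be consistent with the reported mAP being averaged before rounding.

```python
# (Easy, Moderate, Hard, reported mAP) for each row of the results table,
# plus the IoU-branch variant reported in the text. Row labels are illustrative.
RESULTS = {
    "baseline": (88.76, 78.03, 76.20, 80.99),
    "LI Fusion": (89.93, 80.77, 77.25, 82.65),
    "CE loss": (92.12, 81.48, 79.34, 84.31),
    "LI Fusion + CE loss": (92.17, 82.68, 80.10, 84.99),
    "LI Fusion + CE loss + IoU branch": (92.50, 82.45, 80.29, 85.08),
}


def mean_ap(easy, moderate, hard):
    """Average the three difficulty levels (assumed definition of mAP here)."""
    return (easy + moderate + hard) / 3.0


for name, (easy, moderate, hard, reported) in RESULTS.items():
    computed = mean_ap(easy, moderate, hard)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```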

To evaluate these models, download and unzip them, then place them in "./log/Car/models" (relative to ./tools, the directory created by the commands below):

cd ./tools
mkdir -p log/Car/models
bash run_eval_model.sh

Implementation

Training

Run EPNet on a single GPU:

CUDA_VISIBLE_DEVICES=0 python train_rcnn.py \
  --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml \
  --batch_size 2 --train_mode rcnn_online --epochs 50 --ckpt_save_interval 1 \
  --output_dir ./log/Car/full_epnet_without_iou_branch/ \
  --set LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2 RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2 USE_IOU_BRANCH False TRAIN.CE_WEIGHT 5.0

Run EPNet on two GPUs:

CUDA_VISIBLE_DEVICES=0,1 python train_rcnn.py \
  --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml \
  --batch_size 6 --train_mode rcnn_online --epochs 50 --mgpus --ckpt_save_interval 1 \
  --output_dir ./log/Car/full_epnet_without_iou_branch/ \
  --set LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2 RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2 USE_IOU_BRANCH False TRAIN.CE_WEIGHT 5.0
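
The `--set` option in the commands above takes alternating KEY VALUE pairs, where a dotted key such as `LI_FUSION.ENABLED` addresses a nested entry of the configuration loaded from the YAML file. The sketch below illustrates that idea with plain dictionaries. It is not the repository's parser: `apply_overrides` is a hypothetical helper, and the default values in `config` are made up for the example.

```python
import ast


def apply_overrides(config, pairs):
    """Apply ["A.B", "value", ...] pairs to a nested dict, in place."""
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for dotted_key, raw in zip(pairs[0::2], pairs[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            node = node[part]  # raises KeyError if the section is unknown
        if leaf not in node:
            raise KeyError(dotted_key)
        try:
            value = ast.literal_eval(raw)  # "True" -> True, "0.2" -> 0.2
        except (ValueError, SyntaxError):
            value = raw  # keep plain strings as they are
        node[leaf] = value
    return config


# Illustrative defaults only; the real defaults live in the repository's config.
config = {
    "LI_FUSION": {"ENABLED": False, "ADD_Image_Attention": False},
    "RCNN": {"POOL_EXTRA_WIDTH": 1.0, "SCORE_THRESH": 0.3},
    "USE_IOU_BRANCH": True,
}
apply_overrides(
    config,
    ["LI_FUSION.ENABLED", "True", "RCNN.SCORE_THRESH", "0.2", "USE_IOU_BRANCH", "False"],
)
print(config)
```

Rejecting unknown keys, as this sketch does, is one way to catch a mistyped option name before a long training run starts.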

Testing

CUDA_VISIBLE_DEVICES=2 python eval_rcnn.py \
  --cfg_file cfgs/LI_Fusion_with_attention_use_ce_loss.yaml \
  --eval_mode rcnn_online --eval_all \
  --output_dir ./log/Car/full_epnet_without_iou_branch/eval_results/ \
  --ckpt_dir ./log/Car/full_epnet_without_iou_branch/ckpt \
  --set LI_FUSION.ENABLED True LI_FUSION.ADD_Image_Attention True RCNN.POOL_EXTRA_WIDTH 0.2 RPN.SCORE_THRESH 0.2 RCNN.SCORE_THRESH 0.2 USE_IOU_BRANCH False

Acknowledgement

The code is based on PointRCNN.

Citation

If you find this work useful in your research, please consider citing:

@inproceedings{Huang2020EPNetEP,
  title = {EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection},
  author = {Tengteng Huang and Zhe Liu and Xiwu Chen and Xiang Bai},
  booktitle = {ECCV},
  month = {July},
  year = {2020}
}
@InProceedings{Shi_2019_CVPR,
    author = {Shi, Shaoshuai and Wang, Xiaogang and Li, Hongsheng},
    title = {PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2019}
}
