kaixinbear / cape

(CVPR2023) CAPE: Camera View Position Embedding for Multi-View 3D Object Detection

License: Other

Shell 0.28% Python 99.72%

cape's Introduction

CAPE: Camera View Position Embedding for Multi-View 3D Object Detection (CVPR2023)

This repository is an official implementation of CAPE.

CAPE is a simple yet effective method for multi-view 3D object detection. It forms the 3D position embedding in the local camera-view coordinate system rather than the global coordinate system, which largely reduces the difficulty of learning the view transformation. CAPE also supports temporal modeling by fusing separate queries across multiple frames.
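
To make the "local camera-view system" idea concrete, here is a minimal sketch of the coordinate change alone: a 3D point given in global coordinates is re-expressed in the frame of one camera using that camera's extrinsics. This is not the CAPE embedding code. The function names and the example camera pose are assumptions made for illustration, and the transform is assumed to be rigid (rotation plus translation).

```python
from typing import List, Sequence

Matrix = List[List[float]]


def invert_rigid(T: Matrix) -> Matrix:
    """Invert a 4x4 rigid transform [R | t] as [R^T | -R^T t]."""
    R = [row[:3] for row in T[:3]]
    t = [row[3] for row in T[:3]]
    # Transpose of the rotation block.
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def global_to_camera(point: Sequence[float], cam_to_global: Matrix) -> List[float]:
    """Express a global 3D point in the local frame of one camera."""
    global_to_cam = invert_rigid(cam_to_global)
    p = list(point) + [1.0]  # homogeneous coordinates
    return [sum(global_to_cam[i][k] * p[k] for k in range(4)) for i in range(3)]


# Hypothetical camera: placed at (10, 0, 0) in the global frame and
# rotated 90 degrees about the z axis.
cam_to_global = [
    [0.0, -1.0, 0.0, 10.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(global_to_camera([10.0, 5.0, 0.0], cam_to_global))
```

With one such transform per camera, the same global point gets a different local coordinate in each view, which is the kind of input a camera-view position embedding would consume.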

Preparation

This implementation is built upon PETR, and can be set up by following install.md.

Environments
Linux, Python == 3.7.9, CUDA == 11.2, PyTorch == 1.9.1, mmdet3d == 0.17.1
Detection Data
Follow the mmdet3d instructions to process the nuScenes dataset (https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/data_preparation.md).
Pretrained weights
To verify the performance on the val set, we provide the pretrained V2-99 weights. The V2-99 backbone is pretrained on DDAD15M (weights) and further trained on the nuScenes train set with FCOS3D. For the test-set results in the paper, we use the DD3D pretrained weights. The ImageNet pretrained weights of other backbones can be found here. Please put the pretrained weights into ./ckpts/.
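
The pinned versions listed under Environments can be checked from Python before training. The sketch below is only an assumption about how one might do this and is not part of the repository. It assumes the pip distribution names are `torch` and `mmdet3d`, and it ignores local build tags such as `+cu111` that PyTorch wheels may append to the version string.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport package for Python 3.7

# Versions taken from the Environments line above.
REQUIRED = {"torch": "1.9.1", "mmdet3d": "0.17.1"}


def check_versions(required, lookup=metadata.version):
    """Return {package: found_version_or_None} for every mismatch."""
    problems = {}
    for name, wanted in required.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Compare only the public part, e.g. "1.9.1+cu111" -> "1.9.1".
        if found is None or found.split("+")[0] != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    for name, found in check_versions(REQUIRED).items():
        print(f"{name}: expected {REQUIRED[name]}, found {found}")
```

An empty result means both packages match the pinned versions. It does not verify the CUDA toolkit itself.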

After preparation, you will be able to see the following directory structure:

CAPE
├── mmdetection3d
├── projects
│   ├── configs
│   ├── mmdet3d_plugin
├── tools
├── data
│   ├── nuscenes
│   │   ├── samples
│   │   ├── ...
├── ckpts
├── README.md
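
A quick way to confirm the layout above before training is to test each expected path. This is a small helper sketch, not a script shipped with the repository. The function name `missing_paths` is hypothetical, and the list only covers the entries named in the tree.

```python
from pathlib import Path

# Entries named in the directory tree above, relative to the CAPE root.
EXPECTED = [
    "mmdetection3d",
    "projects/configs",
    "projects/mmdet3d_plugin",
    "tools",
    "data/nuscenes/samples",
    "ckpts",
    "README.md",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected entries that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the CAPE root directory.
    for rel in missing_paths("."):
        print(f"missing: {rel}")
```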

Train & inference

cd CAPE

You can train the model as follows:

sh train.sh

You can evaluate the model as follows:

sh test.sh

Main Results

model	mAP	NDS	config	download
cape_r50_1408x512_24ep_wocbgs_imagenet_pretrain	34.7%	40.6%	config	log / checkpoint
capet_r50_704x256_24ep_wocbgs_imagenet_pretrain	31.8%	44.2%	config	log / checkpoint
capet_VoV99_800x320_24ep_wocbgs_load_dd3d_pretrain	44.7%	54.36%	config	log / checkpoint

Acknowledgement

Many thanks to the authors of mmdetection3d. Special thanks to the authors of PETR.

Citation

If you find this project useful for your research, please consider citing:

@inproceedings{Xiong2023CAPE,
  title={CAPE: Camera View Position Embedding for Multi-View 3D Object Detection},
  author={Xiong, Kaixin and Gong, Shi and Ye, Xiaoqing and Tan, Xiao and Wan, Ji and Ding, Errui and Wang, Jingdong and Bai, Xiang},
  booktitle={Computer Vision and Pattern Recognition},
  year={2023}
}

Contact

If you have any questions, feel free to open an issue or contact us at [email protected] or [email protected] or [email protected].

cape's Issues

Installation issues

I have spent quite some time trying to get everything up and running on my Windows PC, but have not succeeded yet. I switched to Docker so I could use a Linux environment, but I am running into several issues. Have you, by any chance, created a Docker image one could use?

Thanks so much in advance!

Environment cannot be configured successfully

I am very frustrated. I have tried for five days, but I still haven't managed to configure the environment properly.
1. CUDA and PyTorch versions

The versions specified in this project are CUDA 11.2 and PyTorch 1.9.1,
but I can't find a matching install command on the official PyTorch website at all.

I really want to ask: which versions did you actually use?
2. MMDetection3D
The installation commands in this project differ from those on the official MMDetection3D website. I tried multiple combinations of CUDA and PyTorch versions, but could not get the steps in this project to work. I cannot tell whether it is a version issue or a command issue.
The environment configuration of this project is too complex.
Install.md is too brief.
The CUDA and PyTorch versions look very suspicious.
Multiple environments need to be configured, and packages in different environments are likely to have version conflicts.
If anyone has succeeded, I hope they can help. Thank you very much.

Does the network use multi-level features?

Hello,

Thank you for your great work. I am currently looking at your code and I have a question about the use of multi-level features. In CAPE:

forward_train calls:
- extract_feat: outputs a list of several image features using grid masking, the backbone, and the neck: https://github.com/kaixinbear/CAPE/blob/main/projects/mmdet3d_plugin/models/detectors/cape.py#L73
- forward_pts_train: calls the forward of self.pts_bbox_head (i.e. CAPEHead); however, the first line of that forward is
  "x = mlvl_feats[0]" (https://github.com/kaixinbear/CAPE/blob/main/projects/mmdet3d_plugin/models/dense_heads/cape_head.py#L274)

So only the first element of the list is used, isn't it? If I am right, CAPE uses neither multi-level features nor the other FPN output levels.
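
The concern can be reproduced in isolation with a toy stand-in for the head. This is not the CAPEHead code: `toy_head_forward` is a hypothetical function whose only shared trait is that it starts with `x = mlvl_feats[0]`, and plain lists stand in for feature tensors.

```python
def toy_head_forward(mlvl_feats):
    """Toy head that, like the quoted line, reads only the first level."""
    x = mlvl_feats[0]
    # Any further computation depends on x alone.
    return sum(x)


# Two inputs that differ only in the second (coarser) feature level.
feats_a = [[1.0, 2.0, 3.0], [10.0, 20.0]]
feats_b = [[1.0, 2.0, 3.0], [-7.0, 99.0]]

# The outputs match, so the extra levels have no effect on the result.
print(toy_head_forward(feats_a) == toy_head_forward(feats_b))
```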

Inference speed

Thanks for sharing the great work!

May I ask what the inference speed of CAPE is, compared to the original PETR?

About ann_file

Hi, thank you for sharing your work!
I find that the ann_file in configs/CAPE is different from the one in configs/CAPE-T, e.g., nuscenes_infos_train(or val).pkl versus mmdet3d_nuscenes_30f_infos_val.pkl. What's the difference between them? Are they all generated by tools/create_data.py, just with a different --extra-tag?
Looking forward to your reply, thanks!

mAP is 0

I followed your project configuration, and when I train the cape_r50_1408x512_24ep_wocbgs_imagenet_pretrain.py file, the mAP is always 0.

Some issues about installation

I'm very sorry, I have tried many times but still couldn't configure the environment for this project.
My questions are as follows:
I followed the tutorial in install.md and found that the installation of mmdetection and mmdetection3d was not very successful. The process in install.md is not exactly the same as the one in the mmdetection project.
1. About sudo
Do I have to use sudo when installing mmdetection?
2. About versions
I noticed that you pinned specific versions of mmdetection and mmdetection3d in the installation tutorial. May I ask why? Is it because other versions cannot be configured successfully?

Mini-version of Nuscenes Dataset

Hello everyone!

Thanks for sharing your helpful project.

I have started working with your repository. However, I want to run the code with the mini version of the nuScenes dataset. Everywhere in your code, only the full dataset is used, so I have three questions:

1- Is it possible to run the code with the mini version?
2- Is it possible to generate the .pkl files with mmdetection? Or can you provide them in your repository?
3- Is the sweeps data (I mean the non-annotated frames) necessary for training? Or can we run the training code with just the keyframes?

Thank you.

How to visualize the attention maps like in Figure 6?

I want to know how to visualize the attention maps of the decoder layer. Which script do you use?

KFPE and QFPE

Which parts of the code implement KFPE and QFPE? Thank you.

grad norm keeps being nan

Dear author, thank you for your work and for sharing the code.
I tried to train cape_VoV99_1600x640_24ep_wocbgs_load_dd3d_pretrain following your instructions, but the log shows that grad_norm is always nan. Do you know why this happens and how to fix it?