cape's Introduction

CAPE: Camera View Position Embedding for Multi-View 3D Object Detection (CVPR2023)

This repository is the official implementation of CAPE.


CAPE is a simple yet effective method for multi-view 3D object detection. CAPE forms the 3D position embedding in the local camera-view coordinate system rather than the global coordinate system, which largely reduces the difficulty of learning the view transformation. CAPE also supports temporal modeling by fusing the separate queries of multiple frames.
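To make the coordinate-system distinction concrete, here is a minimal sketch of a standard pinhole back-projection. It is not the repository's embedding code: the function names, the intrinsics, and the pose are hypothetical values chosen for illustration. A pixel with a depth hypothesis can be lifted into the camera frame with the intrinsics alone, while placing the same point in the global frame additionally requires that camera's extrinsics.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def pixel_to_camera(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point:
    """Back-project pixel (u, v) at the given depth into the camera frame.

    Only the intrinsics (fx, fy, cx, cy) are needed; no camera pose is involved.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def camera_to_global(point: Point,
                     rotation: Sequence[Sequence[float]],
                     translation: Sequence[float]) -> Point:
    """Apply a rigid camera-to-global transform: R @ point + t.

    This extra step depends on each camera's extrinsics.
    """
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Hypothetical intrinsics and pose, for illustration only.
local_point = pixel_to_camera(u=1000.0, v=450.0, depth=10.0,
                              fx=1000.0, fy=1000.0, cx=800.0, cy=450.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
global_point = camera_to_global(local_point, identity, [1.0, 2.0, 3.0])
```

In this sketch, points in the camera frame look the same for every camera that shares the same intrinsics, whereas global-frame points differ with each camera's pose.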

Preparation

This implementation is built upon PETR and can be set up by following install.md.

  • Environments
    Linux, Python == 3.7.9, CUDA == 11.2, PyTorch == 1.9.1, mmdet3d == 0.17.1

  • Detection Data
    Follow the mmdet3d data preparation guide to process the nuScenes dataset (https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/data_preparation.md).

  • Pretrained weights
    To verify the performance on the val set, we provide the pretrained V2-99 weights. The V2-99 backbone is pretrained on DDAD15M (weights) and further trained on the nuScenes train set with FCOS3D. For the test-set results in the paper, we use the DD3D pretrained weights. The ImageNet pretrained weights of the other backbones can be found here. Please put the pretrained weights into ./ckpts/.

  • After preparation, you will be able to see the following directory structure:

    CAPE
    ├── mmdetection3d
    ├── projects
    │   ├── configs
    │   ├── mmdet3d_plugin
    ├── tools
    ├── data
    │   ├── nuscenes
    │     ├── samples
    │     ├── ...
    ├── ckpts
    ├── README.md
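
    A minimal sketch for checking this layout before training. The helper name `missing_paths` and the `EXPECTED` list are hypothetical; the list only mirrors the entries shown in the tree above and is not part of the repository.

```python
from pathlib import Path
from typing import List, Sequence

# Entries taken from the directory tree above (illustrative, not exhaustive).
EXPECTED = (
    "mmdetection3d",
    "projects/configs",
    "projects/mmdet3d_plugin",
    "tools",
    "data/nuscenes/samples",
    "ckpts",
    "README.md",
)


def missing_paths(root: str, expected: Sequence[str] = EXPECTED) -> List[str]:
    """Return the expected entries that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).exists()]


if __name__ == "__main__":
    # Run from the directory that contains the CAPE checkout.
    for rel in missing_paths("CAPE"):
        print(f"missing: {rel}")
```

    An empty result means every listed entry was found; it does not verify that the dataset or the weights themselves are complete.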
    

Train & inference

cd CAPE

You can train the model with:

sh train.sh

You can evaluate the model with:

sh test.sh

Main Results

config                                             | mAP   | NDS    | config | download
cape_r50_1408x512_24ep_wocbgs_imagenet_pretrain    | 34.7% | 40.6%  | config | log / checkpoint
capet_r50_704x256_24ep_wocbgs_imagenet_pretrain    | 31.8% | 44.2%  | config | log / checkpoint
capet_VoV99_800x320_24ep_wocbgs_load_dd3d_pretrain | 44.7% | 54.36% | config | log / checkpoint
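
For readers unfamiliar with the NDS column: the nuScenes detection score combines mAP with five true-positive error metrics (translation, scale, orientation, velocity, attribute), each clipped to 1 and converted to a score. The sketch below follows that benchmark definition. The function name and the input values are hypothetical and are not taken from the results above.

```python
from typing import Sequence


def nuscenes_detection_score(mean_ap: float, tp_errors: Sequence[float]) -> float:
    """NDS = (5 * mAP + sum(1 - min(1, err) for each of the 5 TP errors)) / 10."""
    if len(tp_errors) != 5:
        raise ValueError("expected exactly five true-positive error metrics")
    # Each error is clipped to 1 so its score stays within [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Hypothetical inputs, for illustration only.
example_nds = nuscenes_detection_score(0.5, [0.5, 0.5, 0.5, 0.5, 0.5])
```

Because mAP carries half of the total weight, two models with similar mAP can still differ noticeably in NDS when their velocity or orientation errors differ.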

Acknowledgement

Many thanks to the authors of mmdetection3d. Special thanks to the authors of PETR.

Citation

If you find this project useful for your research, please consider citing:

@inproceedings{Xiong2023CAPE,
  title={CAPE: Camera View Position Embedding for Multi-View 3D Object Detection},
  author={Xiong, Kaixin and Gong, Shi and Ye, Xiaoqing and Tan, Xiao and Wan, Ji and Ding, Errui and Wang, Jingdong and Bai, Xiang},
  booktitle={Computer Vision and Pattern Recognition},
  year={2023}
}

Contact

If you have any questions, feel free to open an issue or contact us at [email protected] or [email protected] or [email protected].

cape's Issues

Installation issues

I have spent quite some time trying to get everything up and running on my Windows PC, but have not succeeded. I switched to Docker so I could use a Linux environment, but I am still running into several issues. Have you, by any chance, created a Docker image that others could use?

Thanks so much in advance!

Environment cannot be configured successfully

I am very frustrated. I have tried for five days and still have not managed to configure the environment properly.
1. CUDA and PyTorch versions
The versions given in this project are CUDA 11.2 and PyTorch 1.9.1, but I cannot find a matching install command on the official PyTorch website at all.
Which versions did you actually use?
2. MMDetection3D
The installation commands in this project differ from those on the official MMDetection3D website. I tried multiple combinations of CUDA and PyTorch versions, but could not get the steps in this project to work. I cannot tell whether it is a version issue or a command issue.
The environment configuration of this project is too complex.
install.md is too brief.
The CUDA and PyTorch versions look suspicious.
Several environments need to be configured, and packages in different environments are likely to have version conflicts.
If anyone has succeeded, I hope they can offer some help. Thank you very much.

Does the network use multi-level features?

Hello,

Thank you for your great work. I am currently looking at your code and have a question about the use of multi-level features. In CAPE:

So only the first element of the list is used, isn't it? If I am right, CAPE uses neither multi-level features nor the FPN.

Inference speed

Thanks for sharing the great work!

May I ask what the inference speed of CAPE is compared to the original PETR?

About ann_file

Hi, thank you for sharing your work!
I noticed that the ann_file in configs/CAPE is different from the one in configs/CAPE-T, e.g., nuscenes_infos_train(or val).pkl versus mmdet3d_nuscenes_30f_infos_val.pkl. What is the difference between them? Are they all generated by tools/create_data.py, just with a different --extra-tag?
I look forward to your reply, thanks!

mAP is 0

I followed your project configuration, but when I train with the cape_r50_1408x512_24ep_wocbgs_imagenet_pretrain.py config, the mAP is always 0.

some issue about installation

I'm very sorry, I have tried many times but still cannot configure the environment for this project.
My questions are as follows:
I followed the tutorial in install.md, but the installation of mmdetection and mmdetection3d did not succeed. I also noticed that the process in install.md is not exactly the same as the one in the mmdetection project.
1. About sudo
Do I have to use sudo when installing mmdetection?
2. About versions
I noticed that you pin specific versions of mmdetection and mmdetection3d in the installation tutorial. May I ask why? Is it because other versions cannot be configured successfully?

Mini-version of Nuscenes Dataset

Hello everyone!

Thanks for sharing your helpful project.

I have started working with your repository, and I want to run the code with the mini version of the nuScenes dataset. However, the code only seems to use the full dataset, so I have three questions:

1. Is it possible to run the code with the mini version?
2. Is it possible to generate the .pkl files with mmdetection? Or can you provide them in your repository?
3. Are the sweeps (I mean the non-annotated frames) necessary for your training algorithm, or can the training code run with the keyframes alone?

Thank you.

KFPE and QFPE

Which parts of the code implement KFPE and QFPE? Thank you.
