
megvii-research / PETR


[ECCV2022] PETR: Position Embedding Transformation for Multi-View 3D Object Detection & [ICCV2023] PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

License: Other

Languages: Python 99.80%, Shell 0.20%
Topics: multi-camera, multi-task-learning, object-detection, segmentation, 3d-position-embedding

petr's Introduction

[ECCV2022] Position Embedding Transformation for Multi-View 3D Object Detection

[ICCV2023] PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images


This repository is an official implementation of PETR and PETRv2. The flash attention version can be found in the "flash" branch.


PETR develops position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can perceive the 3D position-aware features and perform end-to-end object detection. It can serve as a simple yet strong baseline for future research.
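As a rough illustration of this idea (a sketch under assumed shapes and layer choices, not the repository's actual implementation): each pixel of each camera's feature map is lifted to D points in the camera frustum, projected into the shared 3D world frame, and encoded by a small MLP into a 3D position embedding that is added to the 2D image features.

import torch
import torch.nn as nn

def petr_style_3d_pe(feats, frustum_points, img2world, embed_dim=256, depth_bins=64):
    """Sketch of 3D position-aware features (assumed shapes, not the official code).

    feats:          [N_cams, embed_dim, H, W]      2D image features
    frustum_points: [N_cams, H, W, depth_bins, 4]  homogeneous frustum coordinates per pixel
    img2world:      [N_cams, 4, 4]                 inverse projection (image -> shared 3D frame)
    """
    n, _, h, w = feats.shape
    # Project every frustum point of every camera into the shared 3D coordinate frame.
    pts_3d = torch.einsum('nij,nhwdj->nhwdi', img2world, frustum_points)[..., :3]
    # Flatten the depth axis so each pixel carries depth_bins * 3 coordinates.
    pts_3d = pts_3d.reshape(n, h, w, depth_bins * 3).permute(0, 3, 1, 2)
    # A small MLP (1x1 convs) maps the coordinates to a 3D position embedding.
    pe_mlp = nn.Sequential(
        nn.Conv2d(depth_bins * 3, embed_dim, 1), nn.ReLU(inplace=True),
        nn.Conv2d(embed_dim, embed_dim, 1),
    )
    # 3D position-aware features = 2D features + 3D position embedding.
    return feats + pe_mlp(pts_3d)

Object queries then interact with these 3D position-aware features in a DETR-style decoder to predict boxes end to end.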


PETRv2 is a unified framework for 3D perception from multi-view images. Based on PETR, PETRv2 explores the effectiveness of temporal modeling, which utilizes the temporal information of previous frames to boost 3D object detection. The 3D PE achieves temporal alignment of object positions across different frames. A feature-guided position encoder is further introduced to improve the data adaptability of the 3D PE. To support high-quality BEV segmentation, PETRv2 provides a simple yet effective solution by adding a set of segmentation queries. Each segmentation query is responsible for segmenting one specific patch of the BEV map. PETRv2 achieves state-of-the-art performance on 3D object detection and BEV segmentation.
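A minimal sketch of the patch-wise segmentation-query idea (sizes, patch layout, and the decoder interface are assumptions, not the repository's code): the BEV map is tiled into patches, one learnable query is assigned to each patch, and each updated query is decoded into its own patch mask before the patches are stitched back together.

import torch
import torch.nn as nn

class PatchSegHeadSketch(nn.Module):
    """Sketch: each segmentation query predicts one fixed patch of the BEV map."""

    def __init__(self, embed_dim=256, bev_size=200, patch_size=25):
        super().__init__()
        self.bev_size, self.patch_size = bev_size, patch_size
        self.patches_per_side = bev_size // patch_size
        n_queries = self.patches_per_side ** 2
        # Learnable segmentation queries, one per BEV patch (fed to the transformer decoder).
        self.seg_queries = nn.Embedding(n_queries, embed_dim)
        # Decodes one updated query into one patch_size x patch_size mask.
        self.patch_decoder = nn.Linear(embed_dim, patch_size * patch_size)

    def forward(self, updated_queries):
        # updated_queries: [n_queries, embed_dim], the decoder output for the segmentation queries.
        patches = self.patch_decoder(updated_queries).sigmoid()
        p, s = self.patches_per_side, self.patch_size
        patches = patches.view(p, p, s, s)
        # Stitch the patches back into the full BEV mask: [bev_size, bev_size].
        return patches.permute(0, 2, 1, 3).reshape(self.bev_size, self.bev_size)

head = PatchSegHeadSketch()
print(head(torch.randn(64, 256)).shape)  # torch.Size([200, 200])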

News

2023.10.11 The 3D lane detection of PETRv2 has been released on TopoMLP. It supports OpenLane-V2 and won 1st place in the CVPR2023 workshop!
2023.01.25 Our multi-view 3D detection framework StreamPETR achieves 63.6% NDS and 55.0% mAP without TTA and future frames.
2023.01.04 Our multi-modal detection framework CMT is released on arXiv.
2022.11.04 The code of the multi-scale improvement in PETRv2 is released.
2022.09.21 The code of the query denoise improvement in PETRv2 is released.
2022.09.04 PETRv2 with VoVNet backbone and multi-scale achieves 59.1% NDS and 50.8% mAP.
2022.08.11 PETRv2 with GLOM-like backbone and query denoise achieves 59.2% NDS and 51.2% mAP without extra data.
2022.07.04 PETR has been accepted by ECCV 2022.
2022.06.28 The code of BEV segmentation in PETRv2 is released.
2022.06.16 The code of 3D object detection in PETRv2 is released.
2022.06.10 The code of PETR is released.
2022.06.06 PETRv2 is released on arXiv.
2022.06.01 PETRv2 achieves another SOTA performance on the nuScenes dataset (58.2% NDS and 49.0% mAP) with temporal modeling and also supports BEV segmentation.
2022.03.10 PETR is released on arXiv.
2022.03.08 PETR achieves SOTA performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset.

Preparation

This implementation is built upon detr3d; the environment can be set up following install.md.

  • Environments
    Linux, Python == 3.6.8, CUDA == 11.2, PyTorch == 1.9.0, mmdet3d == 0.17.1

  • Detection Data
    Follow mmdet3d to process the nuScenes dataset (https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/data_preparation.md).

  • Segmentation Data
    Download Map expansion from nuScenes dataset (https://www.nuscenes.org/nuscenes#download). Extract the contents (folders basemap, expansion and prediction) to your nuScenes maps folder.
    Then build the segmentation dataset:

    cd tools
    python build-dataset.py
    

    If you want to train the segmentation task immediately, we provide the processed data (HDmaps-final.tar) at gdrive. The processed info files for segmentation can also be found at gdrive.

  • Pretrained weights
    To verify the performance on the val set, we provide the pretrained V2-99 weights. The V2-99 is pretrained on DDAD15M (weights) and further trained on the nuScenes train set with FCOS3D. For the results on the test set in the paper, we use the DD3D pretrained weights. The ImageNet pretrained weights of the other backbones can be found here. Please put the pretrained weights into ./ckpts/.

  • After preparation, you will be able to see the following directory structure:

    PETR
    ├── mmdetection3d
    ├── projects
    │   ├── configs
    │   ├── mmdet3d_plugin
    ├── tools
    ├── data
    │   ├── nuscenes
    │     ├── HDmaps-nocover
    │     ├── ...
    ├── ckpts
    ├── README.md
    

Train & inference

cd PETR

You can train the model as follows:

tools/dist_train.sh projects/configs/petr/petr_r50dcn_gridmask_p4.py 8 --work-dir work_dirs/petr_r50dcn_gridmask_p4/

You can evaluate the model as follows:

tools/dist_test.sh projects/configs/petr/petr_r50dcn_gridmask_p4.py work_dirs/petr_r50dcn_gridmask_p4/latest.pth 8 --eval bbox

Visualize

You can generate the result JSON as follows:

./tools/dist_test.sh projects/configs/petr/petr_vovnet_gridmask_p4_800x320.py work_dirs/petr_vovnet_gridmask_p4_800x320/latest.pth 8 --out work_dirs/pp-nus/results_eval.pkl --format-only --eval-options 'jsonfile_prefix=work_dirs/pp-nus/results_eval'

You can visualize the 3D object detection results as follows:

python3 tools/visualize.py

Main Results

PETR: We provide some results on the nuScenes val set with pretrained models. These models are trained on 8x 2080Ti without CBGS. Note that the models and logs are also available at Baidu Netdisk with code petr.

config | mAP | NDS | training time | config | download
PETR-r50-c5-1408x512 | 30.5% | 35.0% | 18 hours | config | log / gdrive
PETR-r50-p4-1408x512 | 31.7% | 36.7% | 21 hours | config | log / gdrive
PETR-vov-p4-800x320 | 37.8% | 42.6% | 17 hours | config | log / gdrive
PETR-vov-p4-1600x640 | 40.4% | 45.5% | 36 hours | config | log / gdrive

PETRv2: We provide a 3D object detection baseline and a BEV segmentation baseline with two frames. The models are trained on 8x 2080Ti without CBGS. The processed info files contain 30 previous frames, whose transformation matrices are aligned with the current frame. The info files, models and logs are also available at Baidu Netdisk with code petr.

config | mAP | NDS | training time | config | download
PETRv2-vov-p4-800x320 | 41.0% | 50.3% | 30 hours | config | log / gdrive

config | Drive | Lane | Vehicle | backbone | config | download
PETRv2_BEVseg | 85.6% | 49.0% | 46.3% | V2-99 | config | log / gdrive

config | F-score | X-near | X-far | Z-near | Z-far | backbone | config | download
PETRv2_3DLane | 61.2% | 0.400 | 0.573 | 0.265 | 0.413 | V2-99 | |

StreamPETR: StreamPETR achieves significant performance improvements without introducing extra computation cost, compared to the single-frame baseline.

config | mAP | NDS | FPS (PyTorch) | config | download
StreamPETR-r50-704x256 | 45.0% | 55.0% | 31.7 | |

Acknowledgement

Many thanks to the authors of mmdetection3d and detr3d.

Citation

If you find this project useful for your research, please consider citing:

@article{liu2022petr,
  title={Petr: Position embedding transformation for multi-view 3d object detection},
  author={Liu, Yingfei and Wang, Tiancai and Zhang, Xiangyu and Sun, Jian},
  journal={arXiv preprint arXiv:2203.05625},
  year={2022}
}
@article{liu2022petrv2,
  title={PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images},
  author={Liu, Yingfei and Yan, Junjie and Jia, Fan and Li, Shuailin and Gao, Qi and Wang, Tiancai and Zhang, Xiangyu and Sun, Jian},
  journal={arXiv preprint arXiv:2206.01256},
  year={2022}
}

Contact

If you have any questions, feel free to open an issue or contact us at [email protected], [email protected] or [email protected].

petr's People

Contributors

eltociear, ifzhang, junjie18, kvjia, megvii-model, yingfei1016


petr's Issues

Reproduce PETR result

Thanks for sharing such wonderful & interesting work!!!
I'm trying to reproduce the result of "petr_r50dcn_gridmask_p4.py". At the end of this config file, the result is listed as follows:

mAP: 0.3174
mATE: 0.8397
mASE: 0.2796
mAOE: 0.6158
mAVE: 0.9543
mAAE: 0.2326
NDS: 0.3665

I train with this config file. Because I have only 2 V100 cards, I set the batch size to "samples_per_gpu=2, workers_per_gpu=2", and I tried both with and without "--autoscale-lr". But my result is roughly:
mAP: 0.2103
mATE: 1.0048
mASE: 0.3099
mAOE: 0.8165
mAVE: 1.1984
mAAE: 0.4087
NDS: 0.2516

I also checked the training log you provided (20220606_223059.log): at the end of 24 epochs, your loss is 5.6355, but my loss is about 7.xx. I tested the model you provided, and the result is the same as that in "petr_r50dcn_gridmask_p4.py".

Should I adjust the lr, batch size, or other parameters? Any advice? Thanks!
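Not an official answer, but the usual rule of thumb in this situation is linear scaling of the learning rate with the total batch size (note that, as far as I know, --autoscale-lr in mmdet only accounts for the GPU count, not a changed samples_per_gpu). A sketch with assumed numbers:

# Linear LR scaling sketch; all concrete numbers below are assumptions, read the real
# base lr and samples_per_gpu from the config you are actually training with.
base_lr = 2.0e-4          # hypothetical base learning rate from the reference config
reference_batch = 8 * 2   # assumed reference setup: 8 GPUs x samples_per_gpu
my_batch = 2 * 2          # e.g. 2 V100s x samples_per_gpu=2

scaled_lr = base_lr * my_batch / reference_batch
print(scaled_lr)          # 5e-05 under these assumed numbers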

randomness during inference?

Hello,

I have a problem during inference. When I put the same data into the model, I get different outputs for the predicted 3D bboxes. Could you explain that? Thanks.

Grad norm became Inf and Loss failed to decrease while training

Hi,
Thanks for sharing such wonderful work! I'm trying to reproduce your work now. But according to the log file, the grad norm became Inf and the training loss failed to converge after the third epoch (see the attached grad error screenshots). The only code change I made was the data root path.

I used 4 GPUs of GeForce RTX 3090 for training. My environment is below:
CUDA: 11.2
python: 3.6.9
pytorch: 1.9.0
mmcv: 1.4.0
mmdet: 2.24.1
mmsegmentation: 0.20.2
mmdet3d: 0.17.1
tools/dist_train.sh projects/configs/petr/petr_r50dcn_gridmask_p4.py 4 --work-dir work_dirs/petr_r50dcn_gridmask_p4/

The first Inf happened at 4200 iters of the first epoch, and after 2 epochs the grad norm was almost always Inf or NaN.

Why IoU calculation is (I + I)/(I +U)

The IoU function in mmdet3d_plugin/models/detectors/petr3d_seg.py seems to compute (I + I)/(I + U) rather than the commonly used I/U.

This may lead to higher IoU values than those calculated by the normal IoU metric. So why not use the normal IoU metric?

The source code of the IoU function is shown below:

def IOU (intputs,targets):
    numerator = 2 * (intputs * targets).sum(dim=1)
    denominator = intputs.sum(dim=1) + targets.sum(dim=1)
    loss = (numerator + 0.01) / (denominator + 0.01)
    return loss
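For comparison, here is a sketch (assuming binary mask tensors of shape [N, H*W]) of the plain I/U metric next to the quoted formula; the quoted one is actually the Dice coefficient, 2I/(|A|+|B|) = (I+I)/(I+U), which is always at least as large as IoU.

import torch

def iou_plain(inputs, targets, eps=0.01):
    # Standard intersection-over-union: I / U, with U = |A| + |B| - I.
    intersection = (inputs * targets).sum(dim=1)
    union = inputs.sum(dim=1) + targets.sum(dim=1) - intersection
    return (intersection + eps) / (union + eps)

def dice_like(inputs, targets, eps=0.01):
    # The quoted formula: 2I / (|A| + |B|), i.e. the Dice coefficient.
    intersection = (inputs * targets).sum(dim=1)
    return (2 * intersection + eps) / (inputs.sum(dim=1) + targets.sum(dim=1) + eps)

# Example: |A| = |B| = 2 with 1 overlapping cell -> IoU = 1/3, Dice = 2/4 = 0.5.
a = torch.tensor([[1., 1., 0., 0.]])
b = torch.tensor([[0., 1., 1., 0.]])
print(iou_plain(a, b), dice_like(a, b))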

The data preparation of petrv2

Hi, the train dataset used in the PETRv2 experiments is "mmdet3d_nuscenes_30f_infos_train.pkl"; is there any code to create this data? I followed the readme instructions and used the code offered by official mmdet3d, but it can only generate "nuscenes_infos_train.pkl", not "mmdet3d_nuscenes_30f_infos_train.pkl".

regression of PETRv2 and its label

Hi,
Thanks for sharing your wonderful work! I have two questions I am not certain about.

  1. Does PETRv2 regress the relative offset between two frames rather than velocity?
  2. Is the corresponding label the relative velocity, i.e., the object velocity with the ego vehicle's own current velocity subtracted?

Visualization errors with test.py and visualize_results.py

Following the README, after training the model I ran the command python3 tools/test.py $config $ckpt --show --show-dir $showdir and got the error below. What is the cause?
Traceback (most recent call last):
File "tools/test.py", line 258, in
main()
File "tools/test.py", line 225, in main
outputs = single_gpu_test(model, data_loader, args.show, args.show_dir)
File "/disk5/shc/mmdetection3d/mmdet3d/apis/test.py", line 47, in single_gpu_test
model.module.show_results(data, result, out_dir=out_dir)
File "/disk5/shc/mmdetection3d/mmdet3d/models/detectors/mvx_two_stage.py", line 466, in show_results
if isinstance(data['points'][0], DC):
KeyError: 'points'

I also ran into a problem when using visualize_results.py:
Traceback (most recent call last):
File "tools/misc/visualize_results.py", line 88, in
main()
File "tools/misc/visualize_results.py", line 78, in main
dataset.show(results, args.show_dir, pipeline=eval_pipeline)
File "/disk5/shc/mmdetection3d/mmdet3d/datasets/nuscenes_dataset.py", line 567, in show
show_result(points, show_gt_bboxes, show_pred_bboxes, out_dir,
File "/disk5/shc/mmdetection3d/mmdet3d/core/visualizer/show_result.py", line 99, in show_result
vis = Visualizer(points)
File "/disk5/shc/mmdetection3d/mmdet3d/core/visualizer/open3d_vis.py", line 379, in init
self.pcd, self.points_colors = _draw_points(
File "/disk5/shc/mmdetection3d/mmdet3d/core/visualizer/open3d_vis.py", line 35, in _draw_points
vis.get_render_option().point_size = points_size # set points size
AttributeError: 'NoneType' object has no attribute 'point_size'

Data augmentation with segmentation map/queries

Hi! I noticed that in your petrv2_BEVseg.py config file there's GlobalRotScaleTransImage which applies rotation and scaling to the bev space and correspondingly 3d ground truth boxes (which makes total sense for detection because boxes are also modified accordingly). However, the transformations are not applied on results['gt_map'] or results['maps'].

I'm confused here because the segmentation queries are generated at fixed locations in the BEV space, so in my understanding either:

  • the queries should first be rotated & scaled, or
  • the gt map should be rotated & scaled.

I didn't see this in Petr3D_seg, PETRHeadseg, or GlobalRotScaleTransImage. Am I missing anything? If that's already done could you point me to where it is?

Thank you!

Any report for the computation cost?

Hello,

Thanks for your excellent work!
I'm wondering if you have any analysis of the computation cost (params and MACs) of the PETR models? Or do you have any convenient way to calculate that during inference? I found a "get_flops.py" file under the tools folder, but it seems to only support a simple image input rather than a dict containing data and cam_info.

Thanks.
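Not an official answer, but the parameter count at least does not depend on the input format and can be read straight off the built model; a minimal sketch (the build step is only indicated as a comment, since the exact config and checkpoint are up to you):

import torch

def count_parameters(model: torch.nn.Module) -> int:
    # Total number of trainable parameters, independent of the input dict format.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Hypothetical usage after building the detector from a config (e.g. with mmdet3d's build_model):
# print(f"Params: {count_parameters(model) / 1e6:.1f} M")

MACs are trickier because the model expects a dict input, so a FLOP counter would need a small wrapper that calls forward with the full data dict.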

Could not reproduce results of PETR-vov-p4

Dear author:
Thanks for sharing such excellent work! I am trying to reproduce your results using the official code, config, and pre-trained weights.
I obtained NDS: 0.4480, mAP: 0.3983 for PETR-vov-p4-1600x640, and NDS: 0.4202, mAP: 0.3738 for PETR-vov-p4-800x320. These numbers are lower than what you report in the README.
I could successfully reproduce the results on R50. Could you give me some advice? Thanks~

TypeError: can't pickle dict_keys objects

I am trying to reproduce the PETRv2 on bevseg.
But I face one error:
TypeError: can't pickle dict_keys objects

the whole log is as follow:
2022-08-16 06:30:37,780 - mmdet - INFO - workflow: [('train', 1)], max: 24 epochs
2022-08-16 06:30:37,783 - mmdet - INFO - Checkpoints will be saved to /home/gyang/data/PETR/work_dirs/petrv2_BEVseg by HardDiskBackend.
Traceback (most recent call last):
  File "tools/train.py", line 255, in <module>
    main()
  File "tools/train.py", line 251, in main
    meta=meta)
  File "/home/gyang/data/PETR/mmdetection3d/mmdet3d/apis/train.py", line 351, in train_model
    meta=meta)
  File "/home/gyang/data/PETR/mmdetection3d/mmdet3d/apis/train.py", line 319, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 130, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 47, in train
    for i, data_batch in enumerate(self.data_loader):
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 355, in __iter__
    return self._get_iterator()
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 301, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 914, in __init__
    w.start()
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/home/gyang/anaconda3/envs/petr/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle dict_keys objects
Would you mind giving some suggestions about this?

Cannot reproduce PTERv2 results

Hi @yingfei1016, I tried to reproduce the results with the default config on VoVNet on an A100, but got slightly lower results: 49.6 NDS and 40.3 mAP. The results recorded in the config are 50.3 NDS and 41.0 mAP. Is this normal?

Dose "sweep_range" comply with the regulation of nuscenes?

Thanks for sharing this great work!

I have a question to discuss.
As nuScenes page of detection task mentioned:
"The maximum time window of past sensor data and ego poses that may be used at inference time is approximately 0.5s (at most 6 past camera images, 6 past radar sweeps and 10 past lidar sweeps). At training time there are no restrictions."

Dose "sweep_range=[3, 27]" comply with the regulation?
Is there a standard to use temporal information for camera methods?

About the bev segmentation branch.

Hi, thanks for your great work! I want to ask what the meaning of the reshape operation in the BEV segmentation branch in the paper is, and where is the corresponding code? Thank you!

About the config of petrv2 to achieve the performance

Hello,
Thanks for your excellent work!
I'm wondering which config of PETRv2 achieves the reported performance (mAP 0.490 / NDS 0.582) on the nuScenes test set. Is it https://github.com/megvii-research/PETR/blob/main/projects/configs/petrv2/petrv2_vovnet_gridmask_p4_800x320.py, or has it not been released yet? I also saw the performance described at the bottom of petrv2_vovnet_gridmask_p4_1600x640_trainval_cbgs.py (https://github.com/megvii-research/PETR/blob/main/projects/configs/petrv2/petrv2_vovnet_gridmask_p4_1600x640_trainval_cbgs.py), which reaches mAP 0.8412 / NDS 0.83. Is that a camera-only method? How does it perform on the nuScenes test set, and why not report its result on the nuScenes leaderboard?

Log of loss curves

Hi, can you share the log files of the training process? When I train the models myself with the official configs, the loss does not seem to drop. I want to see what the loss curve should look like when training proceeds properly.

Code of PETRv2.

Thanks for your great work!
Are there plans to release the code of PETRv2?

Do not find the methods proposed in PETR

Hi, in this repo I cannot find the two main contributions of PETR, i.e., the 3D coordinate generator and the 3D position encoder. There is no encoder in the PETR head, and the position encoding appears to be ordinary 2D position encoding.

Segmentation visualization results

Dear authors,

Thanks for sharing this great work. When I tried to visualize the segmentation results on the val set, I found that the visualized segmentation results do not appear to be as good as the paper shows. Although the IoU results are quite good, the visualized lane, vehicle, and driving area are somewhat unrecognizable. Could you please give me some suggestions? One of the visualized segmentation results is attached. Thanks in advance for your help.

(model output: f_lane_show_2900; ground truth: gt_map_show_2900)

About the relationship between depth information and positional encoding

I noticed that you claim "We argue that the 3D PE should be driven by the 2D features since the image feature can provide some informative guidance (e.g., depth)." Is there a possibility that some explicit depth-like supervision could be combined into the PETRv2 series, or have you already obtained experimental results along these lines? Thanks!

Input data for training and inference process

Hi, I have the following 3 questions:

  1. In PETRv2, the size of the img input during training is [12, 3, 320, 800]. Can 12 be understood as 6 images from the current moment and 6 images from the previous frame?
  2. During inference, the input img size is [1, 12, 3, 320, 800]. Does the inference process feed in the 12 images from the two timestamps at the same time?
  3. When reading the dataloader, I did not find the step that adds the 6 images from the previous moment to the 6 images at the current moment. Which pipeline is this reflected in? (See the sketch after this list.)
    Thank you very much!
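Not an official answer, but a minimal sketch of the interpretation described in questions 1 and 2 (all shapes and variable names here are assumptions):

import torch

num_cams, C, H, W = 6, 3, 320, 800

# Hypothetical per-frame camera images: 6 views at the current moment, 6 at the previous one.
imgs_t  = torch.randn(num_cams, C, H, W)   # current frame
imgs_t1 = torch.randn(num_cams, C, H, W)   # previous frame

# Training-style tensor: both frames flattened along the camera axis -> [12, 3, 320, 800].
imgs_train = torch.cat([imgs_t, imgs_t1], dim=0)

# Inference-style tensor: the same 12 images with a batch dimension -> [1, 12, 3, 320, 800].
imgs_test = imgs_train.unsqueeze(0)

print(imgs_train.shape, imgs_test.shape)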

some question about position encoding in depth axis

Hi, thanks for sharing such wonderful work. After reading the paper I have a perhaps naive question: what about setting D=1? In my understanding, for each position in the feature map of each view, the position encoding distinguishes which view the position comes from; so why must D depth bins be used? Even with D=1, after transforming the coordinates with the camera extrinsic matrix, I think it is easy to judge which view a position comes from. So my question is: what is the difference between setting D = 1 and D > 1? I did not see an ablation study on this.

About the config of Swin tiny

Thank you for releasing the code! I tried the Swin-Tiny defined in BEVDet with PETRv2 at 800x320 resolution, getting 40.6 NDS and 29.6 mAP, which is rather low. Could you kindly release the config for Swin-Tiny? Many thanks!

Initialization of ResNet50 Models and Paths in Config

Hi Guys,

Thanks for the nice work! I have some questions and not sure if my understanding is correct.

  1. From the config file, the ResNet-50 models are initialized from a special .pth file. However, it is missing from the provided model weights.
  2. I guess the data_root in the configuration files, such as this line, should be changed to ./data/nuscenes/ to match the directory structure in the README.

Thanks again for the help! Please help me check the pretrained weights.

No loss displayed while training

Just this: no loss is displayed while training.
2022-08-03 17:16:37,279 - mmdet - INFO - workflow: [('train', 1)], max: 24 epochs
2022-08-03 17:16:37,281 - mmdet - INFO - Checkpoints will be saved to /root/paddlejob/workspace/wangguangjie/PETR/work_dirs/petr_r50dcn_gridmask_p4 by HardDiskBackend.
2022-08-03 17:19:34,002 - mmdet - INFO - Saving checkpoint at 1 epochs
2022-08-03 17:22:33,118 - mmdet - INFO - Saving checkpoint at 2 epochs
2022-08-03 17:25:32,982 - mmdet - INFO - Saving checkpoint at 3 epochs

Import problem in train.py

Hello, in train.py the line from mmdet.utils import get_device reports that get_device does not exist, and I could not find it in mmdet either. How can I solve this problem?
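Not an official fix, but a common workaround sketch: get_device only exists in newer mmdet releases, so a small fallback can be patched into train.py (the CUDA check below is an assumption about which device you want):

try:
    from mmdet.utils import get_device  # present in newer mmdet releases
except ImportError:
    import torch

    def get_device() -> str:
        # Minimal fallback: use CUDA when available, otherwise CPU.
        return 'cuda' if torch.cuda.is_available() else 'cpu'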

Cannot reproduce the training result of PETRv2

Hi, thanks for your great work.
I want to train PETRv2 on a single GPU with the default config you provided and nothing else altered, but I get NaN in grad_norm after several iterations during the first epoch, as the log file below shows. Any suggestions?
Thanks in advance!
20220812_090303.log

Questions on extrinsics & intrinsics

Hi! Thank you for releasing a wonderful work. I have a few questions regarding the extrinsics & intrinsics processing in CustomNuScensDataset and the data pipelines, and I hope you could help answer them:

  1. I inspected the values of extrinsics of some samples, and found the translation part a bit confusing. For example, two samples have the translations:
>>> np.array([extr[-1,:].round(2) for extr in extrinsics])
array([[ 0.01, -0.33, -0.48,  1.  ],
       [ 0.05, -0.34, -0.63,  1.  ],
       [ 0.09, -0.33, -0.54,  1.  ],
       [-0.  , -0.28, -1.  ,  1.  ],
       [-0.24, -0.24, -0.44,  1.  ],
       [ 0.08, -0.27, -0.49,  1.  ]])
# step over to another sample
>>> np.array([extr[-1,:].round(2) for extr in extrinsics])
array([[ 0.01, -0.33, -0.58,  1.  ],
       [ 0.11, -0.34, -0.68,  1.  ],
       [-0.02, -0.33, -0.61,  1.  ],
       [-0.  , -0.28, -0.96,  1.  ],
       [-0.23, -0.24, -0.44,  1.  ],
       [ 0.14, -0.27, -0.47,  1.  ]])
>>> info['cams'].keys()
dict_keys(['CAM_FRONT', 'CAM_FRONT_RIGHT', 'CAM_FRONT_LEFT', 'CAM_BACK', 'CAM_BACK_LEFT', 'CAM_BACK_RIGHT'])
>>> info['lidar_path']
'./data/nuscenes/samples/LIDAR_TOP/n015-2018-07-18-11-50-34+0800__LIDAR_TOP__1531885881798485.pcd.bin'
  • Why are the y parts of the 3 FRONT cameras so close to each other? I mean, if the vehicle is moving (roughly forward) and each camera is triggered when the lidar scan passes through its FOV center, shouldn't the y part show the forward movement after a clockwise full scan?
  • Why does the y part stay almost the same while the z part varies noticeably? Is it x-right, y-forward, z-upward?
  2. In the customized transform_3d.py, why is line 527 here commented out? My understanding is that, even though results["lidar2img"] is modified accordingly, the extrinsics should also be updated if anyone wants to make use of the extrinsics.
  3. Similar to 2., in scale_xyz of GlobalRotScaleTransImage, if I want to make use of the intrinsics myself I should also update the intrinsics, right? Currently at line 546 the intrinsics are not modified.

Looking forward to hearing from you. Thanks in advance!

Memory increases and then crashes when starting dist_train

Thank you for sharing the code~
I met this problem when starting dist_train.sh.

The swap memory increased until full, then the training crashed with the following error:

ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 2 (pid: 8530) of binary: /opt/miniconda3/envs/petr/bin/python3
Traceback (most recent call last):
File "/opt/miniconda3/envs/petr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/opt/miniconda3/envs/petr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/miniconda3/envs/petr/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in
main()
File "/opt/miniconda3/envs/petr/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/opt/miniconda3/envs/petr/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/opt/miniconda3/envs/petr/lib/python3.7/site-packages/torch/distributed/run.py", line 692, in run
)(*cmd_args)
File "/opt/miniconda3/envs/petr/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 116, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/miniconda3/envs/petr/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:


         tools/train.py FAILED

================================================
Root Cause:
[0]:
time: 2022-09-15_05:47:52
rank: 2 (local_rank: 2)
exitcode: -9 (pid: 8530)
error_file: <N/A>
msg: "Signal 9 (SIGKILL) received by PID 8530"

Other Failures:
<NO_OTHER_FAILURES>


my environment:

  • C++ Version: 201402 [1847/1847]
  • Intel(R) Math Kernel Library Version 2020.0.1 Product Build 20200208 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_
    75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUS
    E_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-
    missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-type
    defs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-col
    or=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PE
    RF_WITH_AVX512=1, TORCH_VERSION=1.9.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.9.0+cu111
OpenCV: 4.2.0
MMCV: 1.4.0
MMCV Compiler: GCC 5.4
MMCV CUDA Compiler: 10.2
MMDetection: 2.24.1
MMSegmentation: 0.20.2
MMDetection3D: 0.17.1+9a62051
