wzzheng / tpvformer
An academic alternative to Tesla's occupancy network for autonomous driving.
License: Apache License 2.0
Hi! I want to train the semantic occupancy prediction task, but after I run the following command:
bash launcher.sh config/tpv04_occupancy.py out/tpv_occupancy --lovasz-input voxel
I got an OOM error. I'm a little confused because the README says you successfully trained on an RTX 3090. Any suggestions?
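A quick diagnostic sketch (not from the repository) that can help narrow the cause: check how much GPU memory is actually free before launching, since other processes on the card, or using a larger config than tpv04, are common reasons a run that fits on a 24 GB RTX 3090 still goes OOM elsewhere.

```python
import torch

# Diagnostic sketch: report free vs. total memory on GPU 0 before training.
# If "free" is noticeably below the card's capacity, another process is
# already holding memory on the device.
free, total = torch.cuda.mem_get_info(0)
print(f"free: {free / 1e9:.1f} GB / total: {total / 1e9:.1f} GB")
```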
ref_2d_hw = self.ref_2d_hw.clone().expand(bs, -1, -1, -1)
hybird_ref_2d = torch.cat([ref_2d_hw, ref_2d_hw], 0)
First of all, I appreciate your work,
and I wonder why you concatenate the HW-plane reference points twice along dim 0, as shown above.
Thanks a lot.
Hi, I have been trying to use your model for the LiDAR segmentation task. However, I have only downloaded the small nuScenes v1.0-mini dataset, and I am wondering if it is enough to test the performance of the model. Have you tested the model on a small dataset before? If so, could you give me some advice or guidance?
Thank you!
Hi, thanks for sharing the code and paper. After reading both, I'm confused by the many names such as ICA, CVHA, HCAB, and HAB, whereas in the code there seem to be only two kinds of transformer layers: self-attention and cross-attention. Also, why do the transformer configs differ between the lidarseg and occupancy configs? Shouldn't they only differ in the output?
Thanks for your kind reply.
I want to train on the v1.0-mini dataset. How can I generate the nuscenes_infos_train.pkl / nuscenes_infos_val.pkl files for v1.0-mini?
Thanks for sharing the great work.
Regarding cross-view hybrid attention, is it only applied to the HW (top) plane?
The query is the plane itself and both key and value are None, while later in cross-view hybrid attention the value is set to the concatenation of the queries.
Core dump in visualization
(open-mmlab) ~/TPVFormer$ python visualization/vis_scene.py --py-config config/tpv04_occupancy.py --work-dir out/tpv_occupancy --ckpt-path ckpts/tpv04_occupancy_v2.pth --save-path out/tpv_occupancy/videos --scene-name scene-0916
QObject::moveToThread: Current thread (0x49973b0) is not the object's thread (0x53adbe0).
Cannot move to target thread (0x49973b0)
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "/home/qihaoh/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/cv2/qt/plugins" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.
Available platform plugins are: xcb, eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl.
Aborted (core dumped)
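A common workaround sketch for this class of Qt error (an assumption, not the repository's documented fix): cv2 ships its own Qt plugins, which can shadow the system ones; forcing a headless platform and clearing the plugin path before cv2 is imported often avoids the crash.

```python
import os

# Hedged workaround: force a headless Qt platform and ignore the Qt plugin
# directory bundled with cv2. Both environment variables are standard Qt
# settings; whether they fix this particular crash is an assumption.
os.environ.setdefault("QT_QPA_PLATFORM", "offscreen")
os.environ.pop("QT_QPA_PLATFORM_PLUGIN_PATH", None)

import cv2  # import cv2 only after adjusting the environment
```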
We want to follow your exciting work.
When can you release the ssc part?
What are the labels for the network, and how do you handle them? Thanks to the authors.
Has anyone encountered this problem? I ran into it when running the commands in the "2. generate individual video frames" step:
```
visualizing scene-0916
/home/zzh/DataStack_2T/workSpace/TPVFormer/visualization/dataset.py:73: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
sweep_cams = np.array(sweep_cams)
/home/zzh/DataStack_2T/workSpace/TPVFormer/visualization/dataset.py:74: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
sweep_tss = np.array(sweep_tss)
236
processing frame 0 of scene 0
/home/zzh/anaconda3/envs/TPVFormer/lib/python3.8/site-packages/torch/utils/checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
2955
2023-05-01 17:19:17.196 ( 36.717s) [ E9797740]vtkXOpenGLRenderWindow.:266 ERR| vtkXOpenGLRenderWindow (0x11f80320): Could not find a decent config
2023-05-01 17:19:17.196 ( 36.717s) [ E9797740]vtkXOpenGLRenderWindow.:484 ERR| vtkXOpenGLRenderWindow (0x11f80320): Could not find a decent visual
Aborted (core dumped)
```
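The vtkXOpenGLRenderWindow errors usually mean no usable OpenGL context is available (e.g. on a headless server). Below is a hedged sketch of two common workarounds, assuming the visualization stack is mayavi/VTK; both the offscreen switch and the virtual-display option are general techniques, not the repository's documented fix.

```python
# Option 1: render offscreen with mayavi (set before any figure is created).
from mayavi import mlab
mlab.options.offscreen = True

# Option 2 (assumption: xvfb and the pyvirtualdisplay package are installed):
# start a virtual X display with GLX support before running the script.
# from pyvirtualdisplay import Display
# Display(visible=False, size=(1600, 900)).start()
```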
Hello, I have a specific question about the organization of the dataset folders.
Is it correct to also have the following folders inside the lidarseg folder?
├── TPVFormer/data
│   ├── nuscenes
│   └── lidarseg
│       ├── v1.0-trainval
│       ├── v1.0-test
│       └── v1.0-mini
I'm asking you because it keeps producing the following error:
assert table_name in self.table_names, "Table {} not found".format(table_name)
Hi,
I'm trying to run TPVFormer on a vehicle with only 5 cameras for occupancy prediction, but the code throws a size-mismatch error while loading the model weights:
" copying a param with shape torch.Size([6, 256]) from checkpoint, the shape in current model is torch.Size([5, 256])."
Any suggestions on how to bypass this problem, other than creating a dummy image with rgb values being 0?
Many thanks!
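A minimal sketch (not the repository's code) of one way around the mismatch: drop the checkpoint entries whose shapes no longer fit the 5-camera model and load the rest non-strictly. The checkpoint path and the `model` variable are assumptions.

```python
import torch

# Keep only checkpoint tensors whose names and shapes still match the
# 5-camera model; load everything else non-strictly.
ckpt = torch.load("ckpts/tpv04_occupancy_v2.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)

model_state = model.state_dict()  # `model` is the already-built 5-camera TPVFormer
matched = {k: v for k, v in state_dict.items()
           if k in model_state and v.shape == model_state[k].shape}
skipped = sorted(set(state_dict) - set(matched))
print(f"skipping {len(skipped)} mismatched/unknown keys, e.g. {skipped[:3]}")

model.load_state_dict(matched, strict=False)
# The [6, 256] per-camera tensor stays randomly initialized for a 5-camera
# model, so a short fine-tuning run is probably needed to recover accuracy.
```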
Hi, I am trying to train TPVFormer. Can you provide the detailed data structure for nuScenes (including lidarseg)?
Thanks for your great work! I have a question about train/val pickle files, is there any difference between the pickle file you provided and the pickle file generated by mmdet3d? Can I use the mmdet3d pickle files instead?
Maybe I don't understand your paper well. In train.py, why do you run validation first and then (train, val) again? What is the purpose of the validation pass before training?
I need to make improvements to TPVFormer. How can I debug the code?
@huang-yh Thanks for your nice work!
If I just want to run the inference code and visualize the results, is my PC configuration sufficient?
Memory: 32 GB, GPU: RTX 3070 with 8 GB?
Hi, thanks for the great work.
Could you release the weights for LiDAR segmentation in the tpv04 setting? The tpv10 setting is too large for a 3090.
I really like your project, and I believe these weights would be very helpful for my research. If it's convenient, could you provide a weights file (your best model .pth)?
Thanks for your excellent work.
However, I am unable to successfully run any scripts on v1.0-mini.
I made sure to look through previous issues, downloaded the corresponding .pkl file, and followed the instructions to execute the script. However, I always encounter a problem at line 100 of dataset_wrapper.py (now line 91):
processed_label = nb_process_label(np.copy(processed_label), label_voxel_pair)
which raises: TypeError: No matching definition for argument type(s) array(uint8, 3d, C), array(int32, 2d, C).
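One thing worth checking is the dtype of the arrays passed to the numba-compiled function; numba dispatches on exact dtypes. A hedged sketch, assuming the @nb.jit signature of nb_process_label was compiled for (uint8 3-D, int64 2-D) arrays; confirm the expected dtypes against the decorator in dataset_wrapper.py.

```python
import numpy as np

# Cast the inputs to the dtypes the compiled signature expects before calling
# the numba function; a mismatch produces exactly the "No matching definition"
# TypeError shown above. The target dtypes here are an assumption.
processed_label = processed_label.astype(np.uint8)
label_voxel_pair = label_voxel_pair.astype(np.int64)
processed_label = nb_process_label(np.copy(processed_label), label_voxel_pair)
```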
Can you release the code for semantic scene completion?
When running the visualization I get FileNotFoundError: [Errno 2] No such file or directory: 'out/tpv_occupancy/latest.pth'.
If I set the path to pretrain.pth instead, I get an Unexpected key(s) error.
Is there a way to visualize with the pretrained checkpoint? Thanks.
Hi, Firstly, I would like to express my appreciation for the impressive work you have presented in your recent paper on TPVFormer. The concept of utilizing a tri-perspective view (TPV) representation and the proposed CVHA (Cross-View Hybrid Attention) mechanism for information exchange between different views are both novel and intriguing.
After carefully examining the code implementation provided in the TPVFormer repository, I noticed that the CVHA mechanism, as described in the paper, is not fully implemented (this was also asked in #29). The code only includes the self-attention mechanism on the HW plane but does not incorporate the cross-view hybrid attention (TPV self-attention) as outlined in the paper. I would like to kindly inquire about the following questions (different from #29):
Thanks!
I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/A9dCpjHPfE or add me on WeChat (ID: van-sin) and I will invite you to the OpenMMLab WeChat group.
Here are the OpenMMLab 2.0 repos branches:
| | OpenMMLab 1.0 branch | OpenMMLab 2.0 branch |
| --- | --- | --- |
| MMEngine | | 0.x |
| MMCV | 1.x | 2.x |
| MMDetection | 0.x, 1.x, 2.x | 3.x |
| MMAction2 | 0.x | 1.x |
| MMClassification | 0.x | 1.x |
| MMSegmentation | 0.x | 1.x |
| MMDetection3D | 0.x | 1.x |
| MMEditing | 0.x | 1.x |
| MMPose | 0.x | 1.x |
| MMDeploy | 0.x | 1.x |
| MMTracking | 0.x | 1.x |
| MMOCR | 0.x | 1.x |
| MMRazor | 0.x | 1.x |
| MMSelfSup | 0.x | 1.x |
| MMRotate | 0.x | 1.x |
| MMYOLO | | 0.x |
Attention: please create a new virtual environment for OpenMMLab 2.0.
Hello, I have read your message.
I would like to visualize the data following the instructions in the Visualization/readme.md file. I have executed the following command:
python visualization/dump_pkl.py --src-path data/nuscenes_infos_val.pkl --dst-path data/nuscenes_infos_val_scene.pkl --data-path data/nuscenes
I downloaded the nuScenes-lidarseg data from the site you provided and moved it to the "data" folder. However, I am encountering the following error. What should I do?
If there are any additional data files that need to be downloaded, please let me know
[Errno 2] No such file or directory: 'data/nuscenes/v1.0-trainval/visibility.json',
[Errno 2] No such file or directory: 'data/nuscenes/v1.0-trainval/attribute.json'
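For what it's worth, visibility.json and attribute.json belong to the main nuScenes v1.0-trainval metadata archive, not the lidarseg expansion. A small check sketch (the data root path is an assumption) lists which devkit tables are present:

```python
from pathlib import Path

# The first thirteen tables come with the "v1.0-trainval Metadata" download
# from nuscenes.org; lidarseg.json is added by the nuScenes-lidarseg expansion.
tables = ["attribute", "calibrated_sensor", "category", "ego_pose", "instance",
          "log", "map", "sample", "sample_annotation", "sample_data", "scene",
          "sensor", "visibility", "lidarseg"]
root = Path("data/nuscenes/v1.0-trainval")
for name in tables:
    path = root / f"{name}.json"
    print(("OK      " if path.exists() else "MISSING ") + str(path))
```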
Hi @wzzheng,
I'm very interested in your work, as I think it paves the way for more open-source camera-only experimentation in 3D occupancy mapping.
I have two questions for you:
ref_3d_hw = self.get_reference_points(tpv_h, tpv_w, pc_range[5]-pc_range[2], num_points_in_pillar[0], '3d', device='cpu')  # [1, 4, 10000, 3]
ref_3d_zh = self.get_reference_points(tpv_z, tpv_h, pc_range[3]-pc_range[0], num_points_in_pillar[1], '3d', device='cpu')
ref_3d_zh = ref_3d_zh.permute(3, 0, 1, 2)[[2, 0, 1]]
ref_3d_zh = ref_3d_zh.permute(1, 2, 3, 0)  # [1, 32, 800, 3]
ref_3d_wz = self.get_reference_points(tpv_w, tpv_z, pc_range[4]-pc_range[1], num_points_in_pillar[2], '3d', device='cpu')
ref_3d_wz = ref_3d_wz.permute(3, 0, 1, 2)[[1, 2, 0]]
ref_3d_wz = ref_3d_wz.permute(1, 2, 3, 0)  # [1, 32, 800, 3]
The final 3 is xyz, but what do the preceding dimensions (the 4 and the two 32s) represent?
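For illustration, here is a minimal sketch of BEVFormer-style 3D reference-point generation, under the assumption that TPVFormer follows the same convention: the leading dimension (the 4 and the 32s) is num_points_in_pillar, i.e. how many points are sampled along the axis orthogonal to each plane. The helper name and the metric range values below are placeholders, not the repo's code.

```python
import torch

# For a plane of H x W queries, sample D = num_points_in_pillar points along
# the orthogonal axis for every query, giving shape [bs, D, H*W, 3].
def make_reference_points(H, W, depth, D, bs=1):
    zs = torch.linspace(0.5, depth - 0.5, D).view(-1, 1, 1).expand(D, H, W) / depth
    xs = torch.linspace(0.5, W - 0.5, W).view(1, 1, -1).expand(D, H, W) / W
    ys = torch.linspace(0.5, H - 0.5, H).view(1, -1, 1).expand(D, H, W) / H
    ref = torch.stack((xs, ys, zs), -1)                        # [D, H, W, 3]
    ref = ref.flatten(1, 2).unsqueeze(0).repeat(bs, 1, 1, 1)   # [bs, D, H*W, 3]
    return ref

print(make_reference_points(100, 100, 8.0, 4).shape)    # torch.Size([1, 4, 10000, 3])
print(make_reference_points(8, 100, 100.0, 32).shape)   # torch.Size([1, 32, 800, 3])
```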
I noticed that your code was written with the assumption that batch_size = 1, but when I increased the batch_size, it resulted in dimension errors. I want to know why batch_size is limited to 1.
If it cannot be increased, it will not be possible to utilize my device resources more efficiently.
TPVFormer/dataloader/dataset_wrapper.py
Lines 116 to 127 in bbed188
assert table_name in self.table_names, "Table {} not found".format(table_name)
AssertionError: Table lidarseg not found
The error message above appears when I run the code; is there any way to solve it?
Hello, I ran inference of tpv04_occupancy with the following command; the GPU I used is a V100:
python eval.py --py-config config/tpv04_occupancy.py --ckpts ckpt/tpv04_occupancy_v2.pth
The performance on nuScenes is shown in the picture.
The performance does not match what the paper reports. Could you please explain the evaluation metrics mIoU vox/pts? I'm confused by them.
The paper includes a table comparing performance on the SemanticKITTI dataset. Are there plans to release the code for training using the SemanticKITTI dataset?
Hi Author,
For training the 3D occupancy prediction task, do we still need to set ignore_index=0 when initializing the cross-entropy loss function? In the paper you say "pseudo-per-voxel labels were generated from the sparse point cloud by assigning a new label of empty to any voxel that does not contain any point, and we use voxel predictions as input to both lovasz-softmax and cross-entropy losses." Does that mean that for the [100, 100, 8] 3D volume we set all remaining voxels' labels to 0 as the empty label? I found that the occupied voxels, whose labels are generated from the sparse lidar points, only number around 1000-2300 in total. This approach causes serious class imbalance. Could the authors provide more detailed information about empty-voxel label generation for the 3D occupancy prediction task?
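A minimal sketch of pseudo-per-voxel label generation from a sparse labeled point cloud, under the assumptions described above (empty label 0, one label per occupied voxel); the function and parameter names are illustrative, not the repo's.

```python
import numpy as np

# Build [100, 100, 8] pseudo voxel labels: voxels containing at least one
# point take a point label; every other voxel gets the assumed "empty" label 0.
def pseudo_voxel_labels(points_xyz, point_labels, pc_range, grid=(100, 100, 8), empty=0):
    grid = np.array(grid)
    mins, maxs = np.array(pc_range[:3]), np.array(pc_range[3:])
    voxel_size = (maxs - mins) / grid
    labels = np.full(tuple(grid), empty, dtype=np.uint8)
    idx = np.floor((points_xyz - mins) / voxel_size).astype(np.int64)
    keep = np.all((idx >= 0) & (idx < grid), axis=1)
    idx, pt_labels = idx[keep], point_labels[keep]
    # One label per occupied voxel (last write wins here); whether the paper
    # uses this rule or e.g. majority voting is not specified in this sketch.
    labels[idx[:, 0], idx[:, 1], idx[:, 2]] = pt_labels
    return labels

# With a ~34k-point lidar sweep and an 80,000-voxel grid, only a few thousand
# voxels end up non-empty, which is consistent with the imbalance noted above.
```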
Thanks for your great work!
I am particularly interested in the details of how you apply the 2D BEVFormer to the 3D occupancy task.
Hi, could you add a software license to this repo? Thank you!
First of all, thank you for your excellent work.
While reproducing the results, I followed the steps for "Train TPVFormer for lidar segmentation task on 3090 with 24G GPU memory." Training works fine with four GPUs, but with eight 3090s I get "_pickle.UnpicklingError: pickle data was truncated". What could be the problem?
Dear authors,
first, please let me thank you for your great work.
Can I please ask you to share the config files for TPVFormer-Small for the tasks of 3D semantic occupancy prediction?
Thank you very much in advance.
I followed the installation instructions in the README, but I encountered some issues during the dataset installation step.
I downloaded the dataset from the nuScenes website, but I wasn't sure which files I needed to download. I made an educated guess and downloaded the following:
nuimages-v1.0-all-samples.tgz
nuimages-v1.0-all-sweeps-cam-back-left.tgz
nuimages-v1.0-all-sweeps-cam-back-right.tgz
nuimages-v1.0-all-sweeps-cam-back.tgz
nuimages-v1.0-all-sweeps-cam-front-left.tgz
nuimages-v1.0-all-sweeps-cam-front-right.tgz
nuimages-v1.0-all-sweeps-cam-front.tgz
nuplan-maps-v1.0.zip
nuScenes-lidarseg-all-v1.0.tar
I extracted these datasets to the specified directory.
However, when I tried to run either the train or eval.py scripts, I encountered the following issue:
Namespace(gpus=8, py_config='config/tpv04_occupancy.py', resume_from='', work_dir='out/tpv_occupancy')
tcp://127.0.0.1:20506
tcp://127.0.0.1:20506
tcp://127.0.0.1:20506
tcp://127.0.0.1:20506
tcp://127.0.0.1:20506
tcp://127.0.0.1:20506
tcp://127.0.0.1:20506
tcp://127.0.0.1:20506
..................
2023-04-21 00:11:26,596 - mmseg - INFO - Config:
2023-04-21 00:13:02,052 - mmseg - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-04-21 00:13:03,511 - mmseg - INFO - Number of params: 62465552
done ddp model
Loading NuScenes tables for version v1.0-trainval...
Traceback (most recent call last):
File "train.py", line 401, in
torch.multiprocessing.spawn(main, args=(args,), nprocs=args.gpus)
File "/root/miniconda3/envs/cat_TPV/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/root/miniconda3/envs/cat_TPV/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/root/miniconda3/envs/cat_TPV/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: Process 6 terminated with the following error:
Traceback (most recent call last):
File "/root/miniconda3/envs/cat_TPV/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/root/data01/zzy/TPVFormer/train.py", line 99, in main
data_builder.build(
File "/root/data01/zzy/TPVFormer/builder/data_builder.py", line 20, in build
nusc = NuScenes(version=version, dataroot=data_path, verbose=True)
File "/root/miniconda3/envs/cat_TPV/lib/python3.8/site-packages/nuscenes/nuscenes.py", line 70, in init
self.attribute = self.load_table('attribute')
File "/root/miniconda3/envs/cat_TPV/lib/python3.8/site-packages/nuscenes/nuscenes.py", line 136, in load_table
with open(osp.join(self.table_root, '{}.json'.format(table_name))) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'data/nuscenes/v1.0-trainval/attribute.json'
Since I am not entirely certain if I downloaded the correct dataset, I would like to ask if what I did was correct.
I followed the given environment setup, but I keep hitting this error. I could not install mmcv-full==1.4.0 via pip and finally installed it with mim install mmcv==1.4.0, but I keep getting "No module named 'mmcv._ext'". How can I solve this?
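A small diagnostic sketch (not from the repo): "No module named 'mmcv._ext'" usually means a pure-Python mmcv build was installed instead of mmcv-full with compiled CUDA ops. The check below shows what is installed and which torch/CUDA build a prebuilt mmcv-full wheel would need to match.

```python
import torch
import mmcv

# Report the installed versions; a prebuilt mmcv-full wheel must match the
# torch and CUDA versions printed here.
print("mmcv:", mmcv.__version__, mmcv.__file__)
print("torch:", torch.__version__, "cuda:", torch.version.cuda)

try:
    from mmcv import ops  # compiled extensions live here in mmcv-full 1.x
    print("mmcv-full compiled ops are available")
except ImportError as err:
    print("compiled ops missing -> reinstall mmcv-full built for the versions above:", err)
```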
Hello author, I'm very interested in your research, and I want to know when the code will be open-sourced.
Hello author, I'm very interested in your research, and I want to know when this paper will be published and when the code will be open-sourced. Looking forward to seeing the paper and source code.
Hello authors, first of all, thank you very much for this excellent work.
The paper states that the TPV resolution for occupancy training is 200x200x16 with dim 128, yet in tpv04_occupancy.py the TPV resolution is 100x100x8 and dim is 256:
Could you upload a new config, ideally consistent with the paper, so that everyone can reproduce the results?
Thank you very much!
(By the way, the newly uploaded visualization code lives in the visualization folder, but some package imports use the word visualize; a minor bug, just so you know.)
Hello authors, one of the major optimizations in the paper is using three planes (HW, ZH, ZW) instead of a 3D occupancy feature (HWZ). Have you directly compared the performance, memory, and speed?
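A back-of-the-envelope comparison (my own arithmetic, not a measurement from the paper) of how many feature elements the two representations store at the paper's 200x200x16 resolution with 128 channels:

```python
# Dense 3D volume vs. the three TPV planes, counting stored feature elements.
H, W, Z, C = 200, 200, 16, 128
dense_3d = H * W * Z * C                 # 81,920,000 elements
tpv = (H * W + Z * H + W * Z) * C        # (40,000 + 3,200 + 3,200) * 128 = 5,939,200
print(dense_3d / tpv)                    # ~13.8x fewer feature elements for TPV
```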
Hello! I wonder if you can release the code and colormap about the visualization of output 3D occupancy results. Thank you.
File "/opt/conda/lib/python3.7/site-packages/nuscenes/nuscenes.py", line 214, in get
assert table_name in self.table_names, "Table {} not found".format(table_name)
AssertionError: Table lidarseg not found