tsinghua-mars-lab / occ3d
License: MIT License
Hello,
The K-nearest-neighbor (KNN) algorithm is used to assign semantic labels to unlabeled points. This improves label density, but it inevitably introduces additional errors, especially the shadows left by dynamic objects in frames without labels. How do you handle this problem?
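For reference, a minimal sketch of the kind of KNN label assignment being discussed, assuming a labeled reference cloud `ref_xyz` with per-point `ref_labels` and unlabeled `query_xyz` (all names hypothetical, not the authors' code):

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_assign_labels(ref_xyz, ref_labels, query_xyz, k=5):
    """Assign each query point the majority label of its k nearest
    labeled neighbors (illustrative sketch only)."""
    tree = cKDTree(ref_xyz)
    _, idx = tree.query(query_xyz, k=k)     # (N, k) neighbor indices
    neighbor_labels = ref_labels[idx]       # (N, k) neighbor labels
    # Majority vote per query point
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])
```

A scheme like this is exactly where the "shadow" artifacts can come from: points inside a dynamic object's shadow inherit whatever labeled points happen to be nearest.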
I'm curious whether [mask_lidar] and [mask_camera] are generated on the fly during training. If not, are the many voxels marked as "free" but "visible" saved when preparing the data?
Can you share the code for the "mesh reconstruction" in the paper? This should not have a big impact on the Challenge.
Can you share more details about the mesh reconstruction, for example the parameters used for vdbfusion?
Could you provide the TPVFormer code? The results I reproduced differ substantially from those reported in the paper.
I noticed that the voxel size mentioned in the article for Occ3D-Waymo is 0.05 m, which is inconsistent with the [0.1, 0.1, 0.2]/[0.4, 0.4, 0.4] stated on the webpage. Why does this discrepancy exist?
Thank you for your excellent work. I have some questions about the paper:
This is very important to me, thanks for answering 😭
The number of Train/Val/Test splits is listed in the Occ3D main git Readme.
Why is there a difference in the train/test split?
Train/Test splits are 700/150 in the original nuScenes dataset, not 600/250.
And how could you download the annotations of the test split?
Thx.
Hello, when I try to download the waymo-occ data from Google Drive on the command line, I always encounter:
Too many users have viewed or downloaded this file recently. Please try accessing the file again later. If the file you are trying to access is particularly large or is shared with many people, it may take up to 24 hours to be able to view or download the file. If you still can't access a file after 24 hours, contact your domain administrator
This seems to happen because there are so many files. Can you provide a command that downloads them reliably?
In my opinion, when the vehicle moves, both the camera and the lidar change position, so lidar visibility should also differ at each timestamp.
When I run np.unique(semantics[mask_lidar==0]), I find that even unobserved voxels have semantic labels. Since the Occ3D dataset is auto-labeled rather than manually annotated, I'm curious where these labels come from. Should they simply be ignored?
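A minimal sketch of the inspection being described, with toy stand-ins for the dataset arrays (the real ones are loaded from the released label files; label 17 being "free" in Occ3D-nuScenes is my assumption from the dataset's 18-class scheme):

```python
import numpy as np

# Toy stand-ins for the dataset arrays.
semantics = np.array([[17, 3], [17, 5]])   # assumed: 17 = free in Occ3D-nuScenes
mask_lidar = np.array([[1, 0], [1, 1]])    # 1 = observed by lidar

# Labels inside the unobserved region: present, but not directly measured.
print(np.unique(semantics[mask_lidar == 0]))

# Typical handling: restrict evaluation to observed voxels only.
observed = semantics[mask_lidar == 1]
```

The common practice is indeed to mask the loss/metric with the visibility masks, so whatever labels sit in the unobserved region never contribute.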
Hello,
As I delve into the code branch, I'm encountering challenges when attempting to replicate your baseline results. Could you consider updating the README to include instructions on how to set up the 3D occupancy dataset for use with your baseline models (CTF-Occ, BEVFormer, TPVFormer)? This section appears to be missing from the README. It seems that some data conversion step is necessary before the data can be loaded correctly into any model for training, testing, or validation.
I appreciate your help in advance.
It is an awesome work!
Could you add more download sources, e.g. Baidu Yun and OpenDataLab?
Awesome work!
I am confused about the parameter named "spatial_shape" in the ray casting method (it appears in Algorithms 1, 2, and 3) and can't find any explanation in the paper. Could you explain its meaning?
Thanks!
On your webpage https://tsinghua-mars-lab.github.io/Occ3D/ under The Occ3D Dataset / Dataset Downloads there are two links, for the nuScenes and Waymo datasets, but both point to the same Google Drive location for the Waymo dataset. Can you help me get the nuScenes dataset, please? Am I missing something?
According to the dataset generation steps in the paper, point cloud labels are required to produce the occupancy ground truth, and the paper says 900 scenes were produced for Occ3D-nuScenes in total (600 train, 150 val, 150 test). However, nuScenes only releases point labels for the 850 trainval scenes and withholds the labels of the test set.
Thanks for the very interesting work! I want to ask for more details about your label generation pipeline, especially the LiDAR accumulation section. Was the range [-40, -40, -1, 40, 40, 5.4] used during LiDAR accumulation, or did your team use a different range? To my knowledge, that range would filter out most road points during accumulation. Also, in which phase did your team transform the coordinates from LiDAR to IMU? Thanks!
I am trying to download the nuScenes dataset from the Google Drive link:
https://drive.google.com/drive/folders/1wZ-8OI1IJkrXo6BudFSGmaKXBUYQ3ts_?usp=share_link
However, the download failed several times, always at 16.2 GB. I tried different storage locations and it still fails at exactly that point.
Do you have an idea what could be the problem?
The download from Google Drive is unstable and the files always arrive corrupted.
The information provided in the paper is not enough. When will the code be released?
Could you provide us with the visualization code for each camera?
I'm having trouble visualizing using your 3D occupancy annotations on nuscenes.
Thanks
Which coordinate system is your GT expressed in?
When will the code be open-sourced?
Could you specify what is the coordinate system of the voxel annotations?
Since the z range starts at -1 m, I'm assuming it's the ego-vehicle frame (not the LiDAR).
Is this correct?
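As a quick sanity check on the frame question, the range [-40, -40, -1, 40, 40, 5.4] quoted elsewhere in this thread, combined with 0.4 m voxels, reproduces the published 200x200x16 grid; voxel centers then fall in that same (presumably ego) frame. A sketch:

```python
import numpy as np

# Range [x_min, y_min, z_min, x_max, y_max, z_max] and voxel size from the thread.
pc_range = np.array([-40.0, -40.0, -1.0, 40.0, 40.0, 5.4])
voxel_size = 0.4

# Grid dimensions implied by the range.
spatial_shape = np.round((pc_range[3:] - pc_range[:3]) / voxel_size).astype(int)
print(spatial_shape)  # [200 200  16]

def voxel_center(ijk):
    """Center of voxel (i, j, k) in the same frame as pc_range."""
    return pc_range[:3] + (np.asarray(ijk) + 0.5) * voxel_size
```

Voxel (0, 0, 0) then sits at (-39.8, -39.8, -0.8), i.e. the grid is anchored at the range minimum, not at the sensor.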
Hi there, thank you for your excellent work! After inspecting the data, I noticed a strange inconsistency between the mask_camera of nuScenes and Waymo. Can you elaborate on why the camera mask of Occ3D-nuScenes is much denser than Waymo's? Furthermore, there seem to be drivable surfaces behind cars in nuScenes that remain visible after applying the camera mask.
Can you report the performance of TPVFormer on the Occ3D dataset?
After installing mmdet3d, I run
./tools/dist_train.sh projects/configs/bevformer/bevformer_base_occ_nuscene.py 1
but get the following error:
Traceback (most recent call last):
File "./tools/train.py", line 24, in <module>
from mmdet3d.datasets import build_dataset
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/datasets/__init__.py", line 4, in <module>
from .custom_3d import Custom3DDataset
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/datasets/custom_3d.py", line 10, in <module>
from ..core.bbox import get_box_type
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/__init__.py", line 3, in <module>
from .bbox import * # noqa: F401, F403
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/__init__.py", line 5, in <module>
from .iou_calculators import (AxisAlignedBboxOverlaps3D, BboxOverlaps3D,
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/iou_calculators/__init__.py", line 2, in <module>
from .iou3d_calculator import (AxisAlignedBboxOverlaps3D, BboxOverlaps3D,
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/iou_calculators/iou3d_calculator.py", line 6, in <module>
from ..structures import get_box_type
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/structures/__init__.py", line 2, in <module>
from .base_box3d import BaseInstance3DBoxes
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/structures/base_box3d.py", line 6, in <module>
from mmdet3d.ops.iou3d import iou3d_cuda
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/ops/__init__.py", line 20, in <module>
from .roiaware_pool3d import (RoIAwarePool3d, points_in_boxes_batch,
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/ops/roiaware_pool3d/__init__.py", line 1, in <module>
from .points_in_boxes import (points_in_boxes_batch, points_in_boxes_cpu,
File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/ops/roiaware_pool3d/points_in_boxes.py", line 3, in <module>
from . import roiaware_pool3d_ext
ImportError: /opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/ops/roiaware_pool3d/roiaware_pool3d_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c1015SmallVectorBaseIjE8grow_podEPvmm
Does anyone know the solution?
Can you provide some guidance on the visualization of 3D semantic occupancy?
May I ask if you will release the mask_lidar and mask_camera generation code? Thanks!
Thanks for your work! Could you please tell us how we can download the Waymo occupancy dataset?
Thanks for this excellent work.
In the paper, when generating the labels, three occupancy states are produced: free, occupied, and unobserved. However, the network performs a binary classification to determine whether each voxel is occupied or not.
So what is the significance of adding the unobserved state (for camera-based tasks)?
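One common answer, sketched below: the unobserved state is not a prediction target but a mask, so that camera-based methods are neither trained nor scored on voxels the cameras could never have seen. The masking scheme here is illustrative, not the authors' exact code:

```python
import numpy as np

def masked_accuracy(pred, gt, mask_camera):
    """Score predictions only on camera-visible voxels."""
    visible = mask_camera.astype(bool)
    return (pred[visible] == gt[visible]).mean()

pred = np.array([1, 2, 3, 4])
gt   = np.array([1, 0, 3, 0])
mask = np.array([1, 0, 1, 0])   # voxels 1 and 3 are unobserved
print(masked_accuracy(pred, gt, mask))  # 1.0 -- unobserved voxels don't count
```

The same masking is typically applied to the training loss, which is why the state matters even though the per-voxel head is still a plain classifier.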
I want to use the Occ3D-Waymo dataset to support my research. My question is: what should I do to align the Occ3D voxels with the Waymo dataset?
Hi team, thanks for your awesome work.
I have a question about the generation of mask_lidar and mask_camera in the code. The pseudocode in the paper uses ray casting to determine whether a voxel is free or occupied. However, as in scene reconstruction with NeRF, there are areas that are actually free but may never be visible to the lidar, such as the sky or very distant locations. How is this problem addressed in your implementation?
Thanks for your outstanding work.
I have a question about Occ3D (v2). According to Algorithms 1 and 3, when performing camera ray casting, the camera ray stops only when cur_voxel leaves spatial_shape or reaches the last voxel. This does not seem to produce the occlusion effect shown in Fig. 8, because the ray casting does not stop when the camera ray reaches an observed voxel (occupied or free). According to subsection 3.3.2, if the camera ray only sets the first encountered voxel as occupied, then many more voxels should be labeled unobserved. I am therefore very curious about the details of generating the camera visibility mask, especially when a camera ray stops casting.
I'd appreciate it if you could answer my questions.
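To make the occlusion behavior in question concrete, here is a simplified 1-D sketch of a ray that stops at the first occupied voxel (my reading of subsection 3.3.2, not the authors' actual implementation): voxels along the ray are marked observed until the hit, and everything behind it stays unobserved.

```python
import numpy as np

OCCUPIED, FREE = 1, 0

def cast_ray_1d(occ_grid):
    """Mark voxels along a 1-D ray as observed until the first occupied voxel.

    Returns a visibility mask: 1 = observed by the camera, 0 = occluded.
    """
    visible = np.zeros_like(occ_grid)
    for i, v in enumerate(occ_grid):
        visible[i] = 1          # this voxel is reached by the ray
        if v == OCCUPIED:
            break               # ray terminates; everything behind is occluded
    return visible

grid = np.array([FREE, FREE, OCCUPIED, FREE, FREE])
print(cast_ray_1d(grid))  # [1 1 1 0 0]
```

The question above amounts to asking whether the released code implements this early termination, since Algorithms 1 and 3 as written appear to traverse to the grid boundary instead.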
Hello, in the paper the Waymo occ resolution is 0.05 m, but in the Google Drive the resolution is 0.1/0.4?