
occ3d's People

Contributors

hangzhaomit · myc634 · waveleaf27


occ3d's Issues

Question about Generating Occupancy Annotation

Hello,

The K-nearest-neighbor (KNN) algorithm is used to assign semantic labels to each unlabeled point. This improves label density, but it is bound to introduce additional errors, especially the shadows left by dynamic objects (frames without labels). How do you address this problem?
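
For concreteness, here is a minimal sketch of this kind of KNN label transfer, assuming (N, 3) labeled points with integer class ids and (M, 3) unlabeled points; this is my illustration, not the authors' pipeline code:

    import numpy as np
    from scipy.spatial import cKDTree

    def knn_label_transfer(labeled_xyz, labels, unlabeled_xyz, k=5):
        # Build a KD-tree over the labeled points and query the k nearest
        # neighbors of every unlabeled point.
        tree = cKDTree(labeled_xyz)
        _, idx = tree.query(unlabeled_xyz, k=k)        # idx: (M, k)
        neighbor_labels = labels[idx]                  # (M, k) candidate labels
        # Majority vote; assumes non-negative integer class ids.
        return np.array([np.bincount(row).argmax() for row in neighbor_labels])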

about "mesh reconstruction"

Can you share the code for the "mesh reconstruction" described in the paper? This should not have a big impact on the challenge.

About TPVFormer

Could you provide the code for TPVFormer? The results I reproduced differ substantially from those reported in the paper.

Some questions about the paper

Thank you for your excellent work. I have some questions about the paper:

  1. How is the learnable voxel embedding constructed?
  2. How are the image feature tokens selected?
  3. How do the voxel embeddings interact with the image feature tokens? Is it through the projection given by the camera intrinsic and extrinsic parameters? (A sketch of this projection follows the list.)
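
For reference on question 3, here is a minimal sketch of projecting voxel centers into an image with the camera extrinsics and intrinsics; this is a generic illustration, not the authors' implementation, and all names are hypothetical:

    import numpy as np

    def project_voxels_to_image(voxel_centers, lidar2cam, cam_intrinsic):
        # voxel_centers: (N, 3) in the ego/LiDAR frame;
        # lidar2cam: (4, 4) extrinsic matrix; cam_intrinsic: (3, 3).
        homo = np.concatenate([voxel_centers, np.ones((len(voxel_centers), 1))], axis=1)
        cam_pts = (lidar2cam @ homo.T).T[:, :3]     # camera-frame coordinates
        in_front = cam_pts[:, 2] > 0                # only points in front of the camera
        uv = (cam_intrinsic @ cam_pts.T).T
        uv = uv[:, :2] / uv[:, 2:3]                 # perspective division -> pixel coords
        return uv, in_front

Image feature tokens at (or around) the projected pixel locations can then be sampled, e.g. with bilinear interpolation or deformable attention.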

Question about the train/val/test split in the README

The numbers of train/val/test scenes are listed in the README of the main Occ3D repository.
Why is there a difference in the train/test split?
The train/test splits are 700/150 in the original nuScenes dataset, not 600/250.

Also, how can one download the annotations for the test split?

Thanks.

Question about google download limit

Hello, when I download the Waymo occupancy data from Google Drive on the command line, I keep getting:

"Too many users have viewed or downloaded this file recently. Please try accessing the file again later. If the file you are trying to access is particularly large or is shared with many people, it may take up to 24 hours to be able to view or download the file. If you still can't access a file after 24 hours, contact your domain administrator."

Because there are so many files, could you provide a command that downloads them reliably?
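
Not an official answer, but a common workaround for this per-file quota error is to download with the third-party gdown package using an authenticated session; the folder URL below is a placeholder, and note that gdown may cap the number of files fetched per folder:

    import gdown

    # Placeholder: substitute the actual shared-folder link from the README.
    url = "https://drive.google.com/drive/folders/<folder-id>"
    # use_cookies=True reuses your browser's Google cookies, which often
    # helps with the "too many users" quota on popular files.
    gdown.download_folder(url, output="waymo-occ", quiet=False, use_cookies=True)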

Why do unobserved voxels have semantic labels?

When I run np.unique(semantics[mask_lidar == 0]), I find that even unobserved voxels have semantic labels. Considering that the occupancy dataset is not manually annotated but produced by auto-labeling, I'm curious about the origin of these labels. Should they simply be ignored?
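
For what it's worth, here is a minimal sketch of restricting inspection to observed voxels via the released masks; the file name and array keys are my assumptions about the annotation format:

    import numpy as np

    data = np.load("labels.npz")                     # per-frame annotation file
    semantics = data["semantics"]                    # voxel-wise class indices
    mask_lidar = data["mask_lidar"].astype(bool)     # True where the LiDAR observed
    # Keep only observed voxels; labels outside the mask come from the
    # auto-labeling pipeline and are typically excluded from evaluation.
    print(np.unique(semantics[mask_lidar], return_counts=True))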

How to prepare 3D Occupancy Dataset

Hello,

As I delve into the code branch, I'm encountering challenges when attempting to replicate your baseline results. Could you consider updating the README with instructions on how to set up the 3D occupancy dataset for use with your baseline models (CTF-Occ, BEVFormer, TPVFormer)? This section appears to be missing from the README, and it seems that some data conversion step is necessary before the data can be loaded into any model for training, testing, or validation.

I appreciate your help in advance.

the meaning of "spatial_shape" in the visibility calculation

Awesome work!
I am confused about the parameter named "spatial_shape" in the ray casting method (it appears in Algorithms 1, 2, and 3) and can't find any explanation in the paper. Could you explain the meaning of this parameter?
Thanks!
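
For readers with the same question: spatial_shape is most plausibly the integer dimensions of the voxel grid, which would follow from the point-cloud range and voxel size. A sketch under that assumption, using the nuScenes numbers from the dataset description:

    import numpy as np

    point_cloud_range = np.array([-40.0, -40.0, -1.0, 40.0, 40.0, 5.4])
    voxel_size = 0.4
    # Number of voxels along (x, y, z); a ray leaves the grid once its
    # current voxel index falls outside these bounds.
    spatial_shape = np.round(
        (point_cloud_range[3:] - point_cloud_range[:3]) / voxel_size
    ).astype(int)
    print(spatial_shape)  # [200 200 16]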

Questions about Occ3D-nuScenes

According to the dataset generation steps in the paper, point cloud labels are required to produce the occupancy ground truth. The paper says 900 scenes were produced for Occ3D-nuScenes in total (600 train, 150 val, 150 test). However, nuScenes only releases the point labels for the 850 trainval scenes and withholds the labels of the test set.

Question about LiDAR accumulation range

Thanks for the very interesting work! I'd like to ask for more details about your label generation pipeline, especially the LiDAR accumulation step. Was the range [-40, -40, -1, 40, 40, 5.4] used during LiDAR accumulation, or did your team use a different range? To my knowledge, that range would filter out most road points during accumulation. Furthermore, in which phase did your team change coordinates from LiDAR to IMU? Thanks!
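
To make the range question concrete, cropping accumulated points to an axis-aligned range looks like the sketch below (a hypothetical illustration, not the authors' pipeline):

    import numpy as np

    def crop_to_range(points, pc_range=(-40.0, -40.0, -1.0, 40.0, 40.0, 5.4)):
        # points: (N, 3+) array; keep only points inside the axis-aligned box.
        lo, hi = np.array(pc_range[:3]), np.array(pc_range[3:])
        mask = np.all((points[:, :3] >= lo) & (points[:, :3] <= hi), axis=1)
        return points[mask]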

Visualization code

Could you provide us with visualization code for each camera view?
I'm having trouble visualizing your 3D occupancy annotations on nuScenes.
Thanks.
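
While waiting for official code, here is a minimal sketch that renders occupied voxels as a colored point cloud with Open3D; the file name, array keys, free-class index (17), grid origin, and voxel size are my assumptions about the released format:

    import numpy as np
    import open3d as o3d

    data = np.load("labels.npz")
    semantics = data["semantics"]                     # e.g. (200, 200, 16)
    occ = semantics != 17                             # assume class 17 = "free"
    idx = np.argwhere(occ)                            # (K, 3) occupied voxel indices
    voxel_size, origin = 0.4, np.array([-40.0, -40.0, -1.0])
    centers = origin + (idx + 0.5) * voxel_size       # voxel centers in the ego frame

    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(centers)
    # Quick-and-dirty per-class coloring.
    palette = np.random.RandomState(0).rand(18, 3)
    pcd.colors = o3d.utility.Vector3dVector(palette[semantics[occ]])
    o3d.visualization.draw_geometries([pcd])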

code

When will the code be open-sourced?

Coordinate system for annotations

Could you specify the coordinate system of the voxel annotations?
Since the z range starts at -1 m, I assume it is the ego-vehicle frame (not the LiDAR frame).
Is this correct?

Inconsistency between the camera masks of Occ3D-nuScenes and Waymo

Hi there, thank you for your excellent work! After inspecting the data, I noticed some odd inconsistencies between the mask_camera of nuScenes and Waymo. Can you elaborate on why the camera mask of Occ3D-nuScenes is much denser than Waymo's? Furthermore, there seem to be drivable surfaces behind cars in nuScenes that remain visible after the camera mask is applied.
[Attached: two screenshots comparing the nuScenes and Waymo camera masks.]

error when training

After installing mmdet3d, I run

./tools/dist_train.sh projects/configs/bevformer/bevformer_base_occ_nuscene.py 1

but get this error:

Traceback (most recent call last):
  File "./tools/train.py", line 24, in <module>
    from mmdet3d.datasets import build_dataset
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/datasets/__init__.py", line 4, in <module>
    from .custom_3d import Custom3DDataset
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/datasets/custom_3d.py", line 10, in <module>
    from ..core.bbox import get_box_type
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/__init__.py", line 3, in <module>
    from .bbox import *  # noqa: F401, F403
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/__init__.py", line 5, in <module>
    from .iou_calculators import (AxisAlignedBboxOverlaps3D, BboxOverlaps3D,
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/iou_calculators/__init__.py", line 2, in <module>
    from .iou3d_calculator import (AxisAlignedBboxOverlaps3D, BboxOverlaps3D,
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/iou_calculators/iou3d_calculator.py", line 6, in <module>
    from ..structures import get_box_type
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/structures/__init__.py", line 2, in <module>
    from .base_box3d import BaseInstance3DBoxes
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/core/bbox/structures/base_box3d.py", line 6, in <module>
    from mmdet3d.ops.iou3d import iou3d_cuda
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/ops/__init__.py", line 20, in <module>
    from .roiaware_pool3d import (RoIAwarePool3d, points_in_boxes_batch,
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/ops/roiaware_pool3d/__init__.py", line 1, in <module>
    from .points_in_boxes import (points_in_boxes_batch, points_in_boxes_cpu,
  File "/opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/ops/roiaware_pool3d/points_in_boxes.py", line 3, in <module>
    from . import roiaware_pool3d_ext
ImportError: /opt/conda/envs/py382/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/ops/roiaware_pool3d/roiaware_pool3d_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c1015SmallVectorBaseIjE8grow_podEPvmm

Does anyone know the solution?
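
(Not verified against this exact setup, but an undefined C++ symbol such as _ZN3c10... at import time usually means mmdet3d's compiled extensions were built against a different PyTorch version than the one currently installed; rebuilding mmdet3d from source inside the active environment, or matching the PyTorch version it was compiled with, typically resolves it.)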

About visualization

Can you provide some guidance on the visualization of 3D semantic occupancy?

The mask_camera I generated by reproducing the paper's method differs greatly from the dataset's

Thanks to the authors for this excellent method. I reproduced the paper's method to generate mask_camera, but the result differs greatly from the one in the dataset.
The images show slices of the voxel grid at layer 6, the height of the camera.
Left: voxel occupancy; middle: the mask_camera I generated; right: the ground-truth mask_camera.
[Attached: comparison image.]
According to the principle in the paper, mask_camera should be ray-shaped, so how was the ground-truth mask_camera generated? Could you provide the code for generating mask_camera?

Question about the label generation pipeline

Thanks for this excellent work.

In the paper, three occupancy states are generated when producing the labels: free, occupied, and unobserved. However, the network performs a binary classification to determine whether a voxel is occupied.
So what is the significance of adding the unobserved state (for camera-based tasks)?
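
For readers puzzling over the same point: the visibility masks are commonly used to restrict training and evaluation to observed voxels, since the label of an unobserved voxel is undefined. A hedged torch sketch, not the authors' exact loss:

    import torch
    import torch.nn.functional as F

    def masked_occ_loss(logits, target, mask_camera):
        # logits: (B, C, X, Y, Z); target: (B, X, Y, Z) class ids;
        # mask_camera: (B, X, Y, Z), True where the camera observed the voxel.
        loss = F.cross_entropy(logits, target, reduction="none")  # (B, X, Y, Z)
        mask = mask_camera.float()
        # Unobserved voxels contribute nothing to the objective.
        return (loss * mask).sum() / mask.sum().clamp(min=1.0)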

sky or very distant locations in mask_lidar and mask_camera

Hi team, thanks for your awesome work.
I have a question about the generation of mask_lidar and mask_camera in the code. The pseudocode in the paper uses ray casting to determine whether a voxel is free or occupied. However, as in NeRF-style scene reconstruction, there are areas that are actually free but may never be visible to the LiDAR, such as the sky or very distant locations. How is this problem addressed in your implementation?

About unobserved voxels in the camera view

Thanks for your outstanding work.

I have a question about Occ3D (v2). According to Algorithms 1 and 3, when performing camera ray casting, the ray stops only when cur_voxel leaves spatial_shape or reaches the last voxel. This does not seem to produce the occlusion effect shown in Fig. 8, because the ray does not stop when it reaches an observed (occupied or free) voxel. On the other hand, according to subsection 3.3.2, if the camera ray only set the first encountered voxel as occupied, far more voxels would be labeled as unobserved. I am therefore very curious about the details of generating the camera visibility mask, especially about when the camera ray stops casting.

I'd appreciate it if you could answer my questions.
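
To make the expected occlusion behavior concrete, here is a sketch of fixed-step ray marching that marks voxels as observed until the first occupied voxel; this is my reading of subsection 3.3.2, not the released algorithm, and all names are hypothetical:

    import numpy as np

    def march_camera_ray(origin, direction, occ, voxel_size=0.4,
                         grid_origin=(-40.0, -40.0, -1.0), step=0.2, max_steps=2000):
        # occ: boolean occupancy grid of shape spatial_shape.
        pos = np.asarray(origin, dtype=float)
        d = np.asarray(direction, dtype=float)
        d = d / np.linalg.norm(d)
        grid_origin = np.asarray(grid_origin, dtype=float)
        observed = []
        for _ in range(max_steps):
            pos = pos + step * d
            ijk = np.floor((pos - grid_origin) / voxel_size).astype(int)
            if np.any(ijk < 0) or np.any(ijk >= occ.shape):
                break                      # ray left spatial_shape
            observed.append(tuple(ijk))
            if occ[tuple(ijk)]:
                break                      # stop at the first occupied voxel: occlusion
        return observed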
