Hi, Are you still working on this or it is finished? I was get inspired by you

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

Is this project is still in work in progress? about desire HOT 5 OPEN

jagdishbhanushali commented on May 26, 2024

Is this project is still in work in progress?

from desire.

Comments (5)

sujithvemi commented on May 26, 2024 1

@i-chaochen I really wish I could help you here. But I really don't know much about LIDAR and was not able to fully understand what the paper said.

You can check the supplemental material provided here, it might help you.

from desire.

stratomaster31 commented on May 26, 2024

I'm working on this model. I've coded the CVAE and I have good results in training phase, but not for test phase...
Which are the decoder1 inputs? In the paper is not specified...

from desire.

i-chaochen commented on May 26, 2024

I think this work only can handle Stanford Drone Dataset? Do you know how to process KITTI dataset?

In the original paper, as the following

As the dataset does not provide semantic labels for 3D points (which we need for scene context), we first perform semantic segmentations of images and project Velodyne laser scans onto the image plane using the provided camera matrix to label 3D points. The semantically labeled 3D points are then registered into the world coordinates using GPS-IMU tags. Finally we create top-down view feature maps I of size H ×W × C.

If I understood correctly, they did these:

first do the semantic segmentation of all images, get masks of data.
project laser data into 2d image and put mask data on this 2d image. // i.e., opencv's projectPoints() to do the project?

2.1 Since KITTI is bin format, we need to convert it to PCD first.

2.2 Do the registration for all PCD files to fuse as a global frame, and then finally we can use camera matrix (provided by KITTI) and extrinsic matrix (calculated by GPU-IMU) to covert it as a 2d image, and we also will project segmentation mask from step-1 to this projected 2d image.

Anyone can correct me if I'm wrong? Thanks in advance!

from desire.

sujithvemi commented on May 26, 2024

@i-chaochen

I don't work with LiDAR data, so I can't comment on bin and PCD format etc. But the approach that you are taking sounds fine to me. To summarize, this is my understanding:

Project the Velodyne 3D laser scan to 2D image plane
All the points in the third dimension that fall on same point in the 2D image plane get the same label as that is recognised from semantic segmentation
Now the points are converted to world co-ordinate frame
Build a BEV 3D matrix with the third dimension being a one-hot vector corresponding to the class from semantic segmentation (cropping of this feature map can be done before building it)

Feel free to comment if I am wrong in any sense, so we can better understand. Thanks in advance.

from desire.

i-chaochen commented on May 26, 2024

@i-chaochen

I don't work with LiDAR data, so I can't comment on bin and PCD format etc. But the approach that you are taking sounds fine to me. To summarize, this is my understanding:

Project the Velodyne 3D laser scan to 2D image plane

All the points in the third dimension that fall on same point in the 2D image plane get the same label as that is recognised from semantic segmentation

Now the points are converted to world co-ordinate frame

Build a BEV 3D matrix with the third dimension being a one-hot vector corresponding to the class from semantic segmentation (cropping of this feature map can be done before building it)

Feel free to comment if I am wrong in any sense, so we can better understand. Thanks in advance.

@sujithvemi Thanks for the feedback. I am not sure I fully understood what the original paper means for project Velodyne laser scans onto the image plane. What this image plane looks like? Does it look like this one?

Also, since they already project 3D scans to 2D image plane, why they need to register 3D scans to the world coordinate using GPS-IMU tag? 2D image coordinate can be used for the prediction anyway.

If they want to do the register to the world coordinate, I think they will need intrinsic and extrinsic (it can provide by GPS-IMU?) matrices instead of GPS-IMU.

from desire.

Is this project is still in work in progress? about desire HOT 5 OPEN

Comments (5)

Related Issues (4)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent