Comments (5)
@i-chaochen I really wish I could help you here, but I don't know much about LiDAR and was not able to fully understand what the paper said.
You can check the supplemental material provided here; it might help you.
from desire.
I'm working on this model. I've coded the CVAE and I get good results in the training phase, but not in the test phase...
What are the inputs to decoder1? The paper does not specify them...
from desire.
I think this work can only handle the Stanford Drone Dataset? Do you know how to process the KITTI dataset?
The original paper says the following:
As the dataset does not provide semantic labels for 3D points (which we need for scene context), we first perform semantic segmentations of images and project Velodyne laser scans onto the image plane using the provided camera matrix to label 3D points. The semantically labeled 3D points are then registered into the world coordinates using GPS-IMU tags. Finally we create top-down view feature maps I of size H × W × C.
If I understood correctly, they did the following:
1. First, run semantic segmentation on all images to get the masks.
2. Project the laser data onto a 2D image and put the mask data on this 2D image (i.e., use OpenCV's projectPoints() to do the projection?).
2.1 Since KITTI scans are in bin format, we need to convert them to PCD first.
2.2 Register all the PCD files to fuse them into a global frame. Then we can use the camera matrix (provided by KITTI) and the extrinsic matrix (calculated from GPS-IMU) to convert it to a 2D image, and also project the segmentation mask from step 1 onto this projected 2D image.
Can anyone correct me if I'm wrong? Thanks in advance!
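To make the projection and labeling steps above concrete, here is a minimal NumPy sketch. It is not the paper's code. The function name `label_points` and its arguments are hypothetical, and it assumes you already have a 3x4 camera projection matrix `P`, a 4x4 Velodyne-to-camera transform `T_cam_velo` (for KITTI both would have to be assembled from the calibration files), and a per-pixel class-id mask from whatever segmentation model you use.

```python
import numpy as np


def label_points(points_velo, P, T_cam_velo, seg_mask):
    """Give each LiDAR point the class id of the pixel it projects onto.

    points_velo: (N, 3) xyz points in the Velodyne frame.
    P:           (3, 4) camera projection matrix (assumed given).
    T_cam_velo:  (4, 4) rigid transform, Velodyne frame -> camera frame (assumed given).
    seg_mask:    (H, W) integer class ids from a segmentation model.
    Returns (kept_points, labels); points behind the camera or outside
    the image are dropped because they cannot be labeled.
    """
    h, w = seg_mask.shape
    homo = np.hstack([points_velo, np.ones((len(points_velo), 1))])  # (N, 4)
    cam = (T_cam_velo @ homo.T).T                                     # (N, 4)

    # Keep only points in front of the camera (positive depth).
    in_front = cam[:, 2] > 0.0
    points_velo, cam = points_velo[in_front], cam[in_front]

    # Perspective projection to pixel coordinates.
    uvw = (P @ cam.T).T                                               # (M, 3)
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Keep only points that land inside the image.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return points_velo[inside], seg_mask[v[inside], u[inside]]
```

Note that the labels stay attached to the original 3D points, so nothing is lost by projecting: the image is only used as a lookup table for the class ids.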
from desire.
I don't work with LiDAR data, so I can't comment on bin and PCD format etc. But the approach that you are taking sounds fine to me. To summarize, this is my understanding:
- Project the Velodyne 3D laser scan onto the 2D image plane
- Every 3D point that falls on a given pixel of the 2D image plane gets the label that semantic segmentation assigned to that pixel
- The labeled points are then converted to the world coordinate frame
- Build a BEV 3D matrix whose third dimension is a one-hot vector for the class from semantic segmentation (the feature map can be cropped before building it)
Feel free to comment if I am wrong anywhere, so we can understand this better. Thanks in advance.
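As a rough illustration of the last bullet, the sketch below rasterizes labeled points into an H × W × C one-hot top-down map. The function name `build_bev`, the grid ranges, and the cell size are all hypothetical, and resolving cells that contain several classes by majority vote is my own assumption; the paper does not say how such conflicts are handled.

```python
import numpy as np


def build_bev(points_world, labels, num_classes, x_range, y_range, cell):
    """Rasterize labeled points into an (H, W, C) one-hot top-down map.

    points_world: (N, 3) points in the world frame (z is ignored).
    labels:       (N,) integer class ids in [0, num_classes).
    x_range, y_range: (min, max) extent of the crop, in metres.
    cell:         side length of one grid cell, in metres.
    """
    h = int(round((y_range[1] - y_range[0]) / cell))
    w = int(round((x_range[1] - x_range[0]) / cell))

    col = np.floor((points_world[:, 0] - x_range[0]) / cell).astype(int)
    row = np.floor((points_world[:, 1] - y_range[0]) / cell).astype(int)
    keep = (col >= 0) & (col < w) & (row >= 0) & (row < h)

    # Count how many points of each class fall into each cell.
    counts = np.zeros((h, w, num_classes), dtype=np.int64)
    np.add.at(counts, (row[keep], col[keep], labels[keep]), 1)

    # Assumption: the most frequent class wins; empty cells stay all-zero.
    bev = np.zeros((h, w, num_classes), dtype=np.float32)
    rows, cols = np.nonzero(counts.sum(axis=2))
    bev[rows, cols, counts[rows, cols].argmax(axis=1)] = 1.0
    return bev
```

Cropping happens implicitly here: any point outside `x_range` or `y_range` is dropped before the map is filled.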
from desire.
@sujithvemi Thanks for the feedback. I am not sure I fully understood what the original paper means by projecting Velodyne laser scans onto the image plane.
What does this image plane look like? Does it look like this one?
Also, since they already project the 3D scans onto the 2D image plane, why do they need to register the 3D scans into world coordinates using GPS-IMU tags? The 2D image coordinates could be used for the prediction anyway.
If they want to register into world coordinates, I think they will need intrinsic and extrinsic matrices (can the latter be provided by GPS-IMU?) rather than GPS-IMU alone.
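One possible reading of the registration step, sketched under my own assumptions: each scan is expressed in the sensor frame at the moment it was captured, so scans from different timestamps only line up once each is moved by the vehicle pose at that timestamp. In the sketch below, `pose_2d` is a hypothetical helper that builds a planar 4x4 pose from a position and heading (a stand-in for whatever pose you derive from the GPS-IMU data), and `register_scans` applies one pose per scan and stacks the results.

```python
import numpy as np


def pose_2d(x, y, yaw):
    """Hypothetical helper: 4x4 sensor-to-world pose from planar position and heading."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    T[0, 3], T[1, 3] = x, y
    return T


def register_scans(scans, poses):
    """Move every scan into the shared world frame and stack them.

    scans: list of (N_i, 3) arrays, each in its own sensor frame.
    poses: list of (4, 4) sensor-to-world transforms, one per scan.
    """
    world = []
    for pts, T in zip(scans, poses):
        homo = np.hstack([pts, np.ones((len(pts), 1))])  # (N_i, 4)
        world.append((T @ homo.T).T[:, :3])
    return np.vstack(world)
```

Under this reading the camera intrinsics are only needed for the labeling step, while the pose handles the move to the world frame, which would be why the two are mentioned separately.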
from desire.