
avod-ssd's People

Contributors

dependabot[bot], melfm


avod-ssd's Issues

understanding of avod-ssd framework

Thank you so much for making avod-ssd public; it helps a lot.
I have roughly read the code in avod_ssd_model.py. Please check whether my understanding is correct.

  1. AVOD-SSD is not a "real" SSD.
    Because
    (1) It still needs to generate 3D anchors in 3D space before training, as AVOD-FPN does, whereas in SSD the anchors are generated from feature maps such as conv4_3, fc_7 and so on.
    (2) AVOD-SSD only uses the last FPN feature map for box generation, whereas SSD utilizes multiple scales of feature maps.

  2. AVOD-SSD is called SSD only because the RPN of AVOD-FPN is gone.
    AVOD-SSD extracts FPN feature maps from BEV and RGB, and then directly connects the maps to the FC pipelines of (2048, 2048, 2048), which form the second fusion part of AVOD-FPN.

  3. Because of the above two points, the inference time of AVOD-SSD is similar to AVOD's,
    since (1) the FPN feature map generation and (2) the (2048, 2048, 2048) fusion part are unchanged, and the (256, 256) fusion does not take much time compared with those two factors.

If any misunderstanding exists, I would appreciate anyone pointing it out.
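The data flow described in points 1–3 above can be sketched as follows. All shapes, the mean-fusion choice, and the random weights are illustrative assumptions for the sketch, not code from the repo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative per-anchor ROI features from the BEV and RGB feature
# extractors (e.g. 3x3 crops with 32 channels each).
num_anchors = 4
bev_feats = rng.standard_normal((num_anchors, 3, 3, 32))
img_feats = rng.standard_normal((num_anchors, 3, 3, 32))

# Fusion by element-wise mean of the two modalities.
fused = 0.5 * (bev_feats + img_feats)          # (4, 3, 3, 32)
flat = fused.reshape(num_anchors, -1)          # (4, 288)

def fc(x, out_dim):
    """Toy fully-connected layer with random weights (no training)."""
    w = rng.standard_normal((x.shape[-1], out_dim)) * 0.01
    return np.maximum(x @ w, 0.0)              # ReLU

# The (2048, 2048, 2048) FC pipeline fed directly by the fused
# features, i.e. the second fusion stage with the RPN removed.
h = flat
for _ in range(3):
    h = fc(h, 2048)

# Separate output heads; sizes (2 classes, 10 box offsets, 2 angle
# components) follow the numbers quoted in the issues below.
cls_logits = h @ (rng.standard_normal((2048, 2)) * 0.01)
box_offsets = h @ (rng.standard_normal((2048, 10)) * 0.01)
angle_vec = h @ (rng.standard_normal((2048, 2)) * 0.01)

print(cls_logits.shape, box_offsets.shape, angle_vec.shape)
```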

how to run test set

@melfm Hi, thanks for sharing your code. I have just finished training and wonder how to run inference on the KITTI test set. How do I set the ground planes for the test set? Thanks.

How to upload KITTI results?

Hi, I have registered an account on KITTI. How do I upload the results? Do we need to upload only the txt files containing the test results? How does KITTI measure the speed?
How do we distinguish between 2D and 3D object entries on the KITTI leaderboard?
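For reference, the KITTI benchmark expects one `.txt` file per test frame, each line holding one detection in the KITTI label format with a confidence score appended; runtime is, to my knowledge, self-reported on the submission form rather than measured by the server. A minimal formatting sketch, with all values purely illustrative:

```python
# One detection per line, in the KITTI results format: type, truncated,
# occluded, alpha, 2D bbox (x1 y1 x2 y2), 3D dims (h w l), 3D location
# (x y z in camera coordinates), rotation_y, and the score appended.
det = {
    "type": "Car", "truncated": -1, "occluded": -1, "alpha": -1.62,
    "bbox": (614.24, 181.78, 727.31, 284.77),
    "dims": (1.57, 1.73, 4.15),            # h, w, l (metres)
    "loc": (1.00, 1.75, 13.22),            # x, y, z (metres)
    "rotation_y": -1.59, "score": 0.92,
}

line = ("{type} {truncated} {occluded} {alpha:.2f} "
        "{b[0]:.2f} {b[1]:.2f} {b[2]:.2f} {b[3]:.2f} "
        "{d[0]:.2f} {d[1]:.2f} {d[2]:.2f} "
        "{l[0]:.2f} {l[1]:.2f} {l[2]:.2f} "
        "{rotation_y:.2f} {score:.2f}").format(
            b=det["bbox"], d=det["dims"], l=det["loc"], **det)
print(line)
```

The official devkit readme bundled with the KITTI object dataset describes the exact field semantics and the zip layout expected at upload time.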

channel numbers of FC layers?

Hi @melfm,
Thank you for sharing; it is really great work.

Here is my question.
The feature extractor output channel numbers are both 32 for the BEV and image branches, and after the fusion the channel count is still 32. (Is that right?)
However, the FC layer widths are 512, 1024 and 1024 for classification, offset, and orientation regression. At 32 channels per target, that would imply 16 categories and 32-channel encodings for both the offset and the orientation. (Is that right?)
In fact, those three numbers are set in your thesis as 2, 10 and 2 for categories, offsets and orientation.

So this really confuses me.
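One likely resolution of the confusion above: the FC widths (512, 1024, 1024) are hidden-layer hyperparameters and are not tied to the channel count or the number of targets; only each head's final output layer is sized by the task. A hedged arithmetic sketch — the box_4c and angle-vector interpretations below are assumptions based on the AVOD paper, not verified against this repo:

```python
# Hidden FC widths are free design choices; only the *final* layer of
# each head is determined by the prediction targets.
num_classes = 1                      # 'Car' only, in the cars config
cls_out = num_classes + 1            # + background class -> 2
box_out = 4 * 2 + 2                  # assumed box_4c encoding:
                                     # 4 (x, z) corners + 2 heights -> 10
ang_out = 2                          # assumed (cos, sin) angle vector

print(cls_out, box_out, ang_out)     # 2 10 2
```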

training loss convergence problem

@melfm
Hi, sorry to bother you.
I have tried to train/validate avod_ssd using the provided data, and the loss looks like the following.

(loss curve screenshot: 20180426_avod_ssd_loss)

That said, by iteratively evaluating the checkpoints, the best AP (at iter=140000) is as follows, which is very close to yours:
83.17% | 73.49% | 67.58%

Why is the loss so unstable, and how could I improve it?
By the way, the original avod_ssd_cars_example.config and training data were used; no changes were made.
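On reading a noisy curve: per-step detection losses are often very jumpy (small batches, hard-example mining), so a smoothed view such as TensorBoard-style exponential smoothing is a better convergence indicator than the raw trace. A generic sketch, nothing repo-specific:

```python
import random

def ema(values, beta=0.98):
    """Exponential moving average with bias correction, similar to
    TensorBoard's scalar smoothing, to read trends out of a noisy loss."""
    smoothed, avg = [], 0.0
    for t, v in enumerate(values, start=1):
        avg = beta * avg + (1.0 - beta) * v
        smoothed.append(avg / (1.0 - beta ** t))  # bias-corrected
    return smoothed

# Toy loss curve: decreasing trend buried under heavy per-step noise.
random.seed(0)
raw = [1.0 / (1 + 0.001 * i) + random.uniform(-0.3, 0.3)
       for i in range(5000)]
sm = ema(raw)
print(round(sm[0], 3), round(sm[-1], 3))  # smoothed trace reveals the decrease
```

If the smoothed curve still trends down (and the evaluated AP is stable, as it is here), the raw instability is usually not a problem in itself.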

Inference Speed?

Hi, thanks for your excellent work! From the results in the README, it seems your SSD version of AVOD costs more time than the original AVOD while achieving lower AP. I wonder what the pros and cons of the SSD version are. Sorry, I haven't read all the code in the repo. Also, when will you release your Master's thesis? The link in the README does not seem to be working.

Multiclass detection

Thanks for making the code public.

I noticed none of the example config files is set for detecting multiple classes (i.e. car, pedestrian and cyclist as 3 separate classes). I tried modifying one of the included configs by appending to "classes" and "num_clusters" in dataset_config but it crashed building the dataset. I think the related lines are 178-179 of avod.datasets.kitti.kitti_dataset:

            elif self.classes == ['Car', 'Pedestrian', 'Cyclist']:
                self.classes_name = 'All'

then 'All' is passed to get_anchor_info in avod.core.mini_batch_utils, which tries to find the preprocessed dataset files for 'All'. Those are never created during preprocessing; instead it creates three directories, for 'car', 'pedestrian' and 'cyclist'. I think there is either a mismatch between the preprocessing and training code, or I'm missing something unclear in the config file.

Is it possible to include one of the config files from your thesis work where you did multiclass detection on KITTI?
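The mismatch described above can be reproduced in miniature. The function names below are hypothetical stand-ins for the repo's logic, written only to make the naming conflict concrete:

```python
# Hypothetical reconstruction of the reported mismatch (names are
# illustrative, not the repo's actual functions).
def classes_name(classes):
    # Mirrors the quoted kitti_dataset.py logic: a combined 3-class
    # config collapses to the single name 'All'.
    if classes == ['Car', 'Pedestrian', 'Cyclist']:
        return 'All'
    return classes[0]

def preprocessed_dirs(classes):
    # Mirrors the described preprocessor behavior: one output
    # directory per individual class.
    return [c.lower() for c in classes]

cfg = ['Car', 'Pedestrian', 'Cyclist']
print(classes_name(cfg))        # 'All'
print(preprocessed_dirs(cfg))   # ['car', 'pedestrian', 'cyclist']
# Training looks up anchor info under 'All', but only per-class
# directories exist, hence the crash when building the dataset.
```

If this reading is right, the fix would be to run the mini-batch preprocessor with the same combined-class config so it writes an 'All' set, or to make the lookup fall back to the per-class directories.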
