dpod's People

Contributors

zakharos

dpod's Issues

Reproducibility issue with DPOD's refinement network

Hi Zakharov!

Thank you for making your work publicly available. I'm trying to reproduce the results of the DPOD refiner by following the paper, but my scores still fall well short of the reported ones.

Since the description of the network architecture leaves out some details, could you please answer the following questions so that I can reproduce the paper's results?

  • Which ResNet are you using (ResNet18 or ResNet50)?
  • Is the XY location given to the network the center of the bounding box or the projection of the 3D model's center?
  • How do you extract the feature vector from E2? Simply by max pooling?
  • Could you explain the algorithm behind this sentence: "Weights of the fully connected layers are initialized in such a way that for the 0th iteration the network just outputs the input pose"?
  • Regarding this paragraph: "The first layer of the rotation regression head takes the feature vector f produced by ResNet and adds four values, which are the quaternion representing an initial rotation. The second layer takes the output of the previous one, stacks with the initial quaternion and outputs the final rotation". Is the quaternion fed into both the first and second layers? What does "adds four values" mean here? (Concatenation seems to be the correct reading.)

Thanks,
Shun
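
A minimal sketch of one possible reading of the initialization sentence quoted above, not confirmed by the authors: if the last fully connected layer acts on the concatenation of a hidden vector and the initial quaternion, zeroing the weights on the hidden part and using the identity on the quaternion part makes the layer return the input pose before any training. The helper names `make_identity_head` and `linear`, and the layer sizes, are hypothetical.

```python
def make_identity_head(hidden_dim, pose_dim=4):
    """Weights for a linear layer acting on concat([hidden, pose_in]).

    The block multiplying `hidden` is zero and the block multiplying
    `pose_in` is the identity, so the untrained layer returns `pose_in`.
    """
    weight = [
        [0.0] * hidden_dim + [1.0 if r == c else 0.0 for c in range(pose_dim)]
        for r in range(pose_dim)
    ]
    bias = [0.0] * pose_dim
    return weight, bias


def linear(weight, bias, x):
    """Plain matrix-vector product plus bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


hidden = [0.3, -1.2, 0.7]        # stand-in for the feature vector f (assumed size)
q_init = [1.0, 0.0, 0.0, 0.0]    # initial rotation as a quaternion
weight, bias = make_identity_head(len(hidden))
q_out = linear(weight, bias, hidden + q_init)  # list "+" is concatenation here
```

Under this reading, "adds four values" means concatenation, and training then moves the zero block away from zero so the output becomes a learned correction of the input pose.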

Generating data for training

Hello,

I am working on my Master's thesis on pose estimation and am trying to replicate your paper's results. To do so, I need to generate the files required for training the network. Starting from the original LineMOD ply files, I am trying to generate a new file with a one-to-one correspondence between UV values and xyz data. Does this need to be a spherical/cylindrical projection?
Also, could you give any advice on how to generate the files:
synth_XXXX_corr.png
synth_XXXX_dpt.png
synth_XXXX_dpt_vis.png
synth_XXXX_img.png
synth_XXXX_norm.png?
I think I can obtain the _corr file by projecting the UV-mapped model over a black background and the _img file by projecting the original model, but I don't understand how to obtain the rest of the files.
Are you planning to release the pose refiner code, as well?
The pretrained network weights work fine for each object, but how can I train for multiple objects?
Do you recommend any SfM software for generating custom object models that can be used for training DPOD?

Thank you in advance,

Dennis
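
A minimal sketch of what a spherical UV projection of model vertices could look like, assuming the vertices are expressed relative to the model centre. The function names and the two-channel 8-bit quantisation are illustrative assumptions, not the repo's actual file format.

```python
import math


def spherical_uv(x, y, z):
    """Map a 3D point (relative to the model centre) to (u, v) in [0, 1]."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return 0.5, 0.5  # the centre has no direction; arbitrary choice
    u = 0.5 + math.atan2(y, x) / (2.0 * math.pi)          # azimuth
    v = math.acos(max(-1.0, min(1.0, z / r))) / math.pi   # polar angle
    return u, v


def uv_to_colour(u, v):
    """Quantise (u, v) to two 8-bit values (assumed storage format)."""
    return min(255, int(u * 256)), min(255, int(v * 256))


u, v = spherical_uv(1.0, 0.0, 0.0)
colour = uv_to_colour(u, v)
```

Note that a spherical mapping is only one-to-one when each ray from the centre hits the surface once; points lying on the same ray receive the same (u, v).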

3d Bbox Visualization

Hello zakharos,

Thank you for your repo on DPOD. Did you have to do any preprocessing of the LINEMOD dataset? I downloaded the dataset from the official LINEMOD website and trained on all objects. For every object except camera and cat, I get considerable offsets between where the 3D bbox is drawn and where it should be. Did you run into the same issue, and if so, how did you deal with it? I hope you can help me.

These images are GT:
image
image

Thank you,
Dennis
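
A minimal sketch for checking the box geometry independently of the repo: it builds the eight corners of the axis-aligned box around the model vertices and projects them with a pinhole camera. The intrinsics, pose, and vertices below are illustrative values, not LINEMOD's, and the helper names are hypothetical. A mismatch in units or model origin between the model file and the pose annotations would show up as an offset in such a projection, though whether that is the cause here is not confirmed.

```python
def bbox_corners(points):
    """Eight corners of the axis-aligned box around the model vertices."""
    xs, ys, zs = zip(*points)
    return [(x, y, z)
            for x in (min(xs), max(xs))
            for y in (min(ys), max(ys))
            for z in (min(zs), max(zs))]


def project(K, R, t, p):
    """Pinhole projection of model point p under rotation R and translation t."""
    cam = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
    u = K[0][0] * cam[0] / cam[2] + K[0][2]
    v = K[1][1] * cam[1] / cam[2] + K[1][2]
    return u, v


# Illustrative intrinsics and pose (assumptions for the example only).
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 1.0]
vertices = [(-0.1, -0.05, -0.02), (0.1, 0.05, 0.02), (0.0, 0.0, 0.0)]
corners_2d = [project(K, R, t, c) for c in bbox_corners(vertices)]
```

Model vertices and the translation must use the same unit for the projected corners to line up with the image.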

GT annotations for LINEMOD

Hello,

Could you provide gt.yml files for the rest of the objects in LINEMOD please?
Thank you,

Dennis Mendoza

Results on YCB-V dataset

Hello,

Thank you for your work.

Could you provide pose estimation results with refinement on the YCB-V dataset for comparison?

Best,
Rui

training custom dataset

Do I need depth maps (synth_xxxx_dpt.png/synth_xxxx_dpt_vis.png) for training custom datasets?

LineMOD Dataset

Hello,

Are you planning to release the full dataset ready for training? If not, could you give some recommendations on how to preprocess the original LineMOD dataset and obtain the files needed for training?
Thank you,

Dennis

Bug in the ADD(-S)-0.1d, 0.3d, and 0.5d metric calculation?

Based on the definition of the ADD(-S) metric, I'd say the following lines in DPOD/pipelines/test.py should be

add_10 = count_add_10 / len(testloader)
add_30 = count_add_30 / len(testloader)
add_50 = count_add_50 / len(testloader)

rather than

add_10 = count_add_10 / n_detected_correctly
add_30 = count_add_30 / n_detected_correctly
add_50 = count_add_50 / n_detected_correctly

Could you please check whether my understanding is correct when you have time?
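
A minimal sketch of how the two denominators differ, assuming one object instance per test image and an ADD error of `None` for images where the object was not detected. The function name `add_scores` and the numbers are hypothetical; this is not the code from `DPOD/pipelines/test.py`.

```python
def add_scores(errors, diameter, fraction):
    """Return (score over all test images, score over detected images only).

    errors:   ADD distance per test image, or None when detection failed
    diameter: object diameter, in the same unit as the errors
    fraction: threshold as a fraction of the diameter, e.g. 0.1 for ADD-0.1d
    """
    threshold = fraction * diameter
    detected = [e for e in errors if e is not None]
    n_correct = sum(1 for e in detected if e < threshold)
    over_all = n_correct / len(errors) if errors else 0.0
    over_detected = n_correct / len(detected) if detected else 0.0
    return over_all, over_detected


errors = [0.005, 0.02, None, 0.008, None]  # five test images, two missed
over_all, over_detected = add_scores(errors, diameter=0.1, fraction=0.1)
```

Here two of the five images pass the threshold, so the score is 2/5 over all test images but 2/3 over detected images only. Dividing by the number of detections therefore ignores missed objects and can only raise the score.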

Model is not training at all...

Hey,
I was able to run your model, but it is not training at all, and I get an error message because every result is 0. Do you have any idea what I did wrong?

This is the output I got:

Loading yaml...
Train Epoch: 0 [0/8 (0%)] Losses: - Corr: 11.363614, - Mask: 3.527913
Train Epoch: 0 [2/8 (25%)] Losses: - Corr: 11.226030, - Mask: 3.484621
Train Epoch: 0 [4/8 (50%)] Losses: - Corr: 11.116333, - Mask: 3.232947
Train Epoch: 0 [6/8 (75%)] Losses: - Corr: 11.117272, - Mask: 3.169233
Train Epoch: 1 [0/8 (0%)] Losses: - Corr: 11.052947, - Mask: 2.952454
Train Epoch: 1 [2/8 (25%)] Losses: - Corr: 11.025255, - Mask: 2.332255
Train Epoch: 1 [4/8 (50%)] Losses: - Corr: 10.995690, - Mask: 2.576330
Train Epoch: 1 [6/8 (75%)] Losses: - Corr: 10.988077, - Mask: 1.807793
Saved network
Processing model 06
0/5
Recall: 0.0
Precision: 0.0
F1 0
/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
/usr/local/lib/python3.7/dist-packages/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)

distance: nan
ADD 10: 0.000000, ADD 30: 0.000000, ADD 50: 0.000000
N correctly detected: 0

Thank you in advance!
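
The two RuntimeWarnings in the log are what NumPy emits when a mean is taken over an empty array, which is consistent with "N correctly detected: 0": there were no distances to average. A minimal sketch of a guard for that case follows; `safe_mean` is a hypothetical helper, not part of the repo.

```python
import math


def safe_mean(values):
    """Mean of a list, or NaN (without a warning) when the list is empty."""
    return sum(values) / len(values) if values else math.nan


distances = []                 # no correct detections, as in the log above
mean_distance = safe_mean(distances)
```

The warnings are a symptom rather than the cause. The log shows only two epochs over eight samples, so the network may simply not have trained long enough to produce any detections.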

Number of Epochs

Hello,

How many epochs did you train for to achieve the results reported in the paper?
