
GPV-Pose

PyTorch implementation of GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting. (link)

(Figure: pipeline overview)

UPDATE!

The results on NOCS and the trained model on CAMERA can be found here.

A new version of the code, which integrates shape-prior information, has been pushed to the shape-prior-integrated branch of this repo! A brief introduction is given in this file. Since L_PC_(s) is not really useful (as also indicated in the paper), we removed that loss term and turned it into a pre-processing step. You can find it in the updated branch.

Required environment

  • Ubuntu 18.04
  • Python 3.8
  • Pytorch 1.10.1
  • CUDA 11.3

Installing

  • Install the main requirements in 'requirements.txt'.
  • Install Detectron2.

Data Preparation

To generate your own dataset, use the data preprocessing code provided in this git. Download the detection results from this git.

Trained model

Download the trained model from this link.

Training

Please note that some details have been changed from the original paper to make training more efficient.

Specify the dataset directory and run the following command.

python -m engine.train --data_dir YOUR_DATA_DIR --model_save SAVE_DIR

Detailed configurations are in 'config/config.py'.
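
The options in 'config/config.py' are defined as absl flags (see the flag listing quoted in the issues below), so in principle they can also be overridden on the command line. A hypothetical example, reusing the rot_1_w flag that appears in the config:

python -m engine.train --data_dir YOUR_DATA_DIR --model_save SAVE_DIR --rot_1_w 8.0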

Evaluation

python -m evaluation.evaluate --data_dir YOUR_DATA_DIR --detection_dir DETECTION_DIR --resume 1 --resume_model MODEL_PATH --model_save SAVE_DIR

Acknowledgment

Our implementation leverages code from 3dgcn, FS-Net, DualPoseNet, and SPD.


Issues

weird evaluation results

Hello,
I tried to run the evaluation and obtained the following:

average mAP:
3D IoU at 25: 48.1
3D IoU at 50: 0.0
3D IoU at 75: 0.0
5 degree, 2cm: 0.0
5 degree, 5cm: 0.0
10 degree, 2cm: 0.0
10 degree, 5cm: 0.0
10 degree, 10cm: 0.0

per-category mAP (3D IoU at 25; every other per-category metric is likewise 0.0):
bottle: 40.1
bowl: 83.0
camera: 52.1
can: 49.3
laptop: 0.0
mug: 64.3

I think something is wrong here; most of the values are zero. Any suggestions for correcting it? And may I know what the expected evaluation results are, please?

where/how can I find/generate "mug_handle.pkl"?

Hello! Thank you very much for sharing your work.
I have already worked with the SPD, FS-Net, and DualPoseNet repos, but I didn't see "mug_handle.pkl", which is required for training your model.
Please let me know about it!

Thanks!

Question about the training on CAMERA dataset

Thanks for the code sharing and the impressive work.

I am trying to train the code on the REAL275 and CAMERA datasets myself. I trained two networks by setting the dataset to Real and CAMERA in config.py. The result trained on REAL275 is good. However, the one trained on the CAMERA dataset does not seem as good. The following are the results on CAMERA:

average mAP:
3D IoU at 25: 95.0
3D IoU at 50: 92.8
3D IoU at 75: 85.1
5 degree, 2cm: 61.9
5 degree, 5cm: 73.0
10 degree, 2cm: 71.1
10 degree, 5cm: 85.8
10 degree, 10cm: 87.3

I am wondering whether the hyperparameters should be adjusted for training on the CAMERA dataset, or whether there are any other clues for this. Thank you so much!

Please give me some guide for "Paper and Code loss name matching"

Hello. I feel quite confused while trying to read your code.
I think:
3.1 L_Basic_rc == cal_loss_R_con in fsnet_loss.py << sure, based on the k1 value
3.2 L_Basic_Sym == prop_sym_matching_loss in prop_loss.py
3.3 L_Basic_pc == cal_recon_loss_point in recon_loss.py << sure, based on the k2 value
3.4 L_PC_(R,t) == cal_geo_loss_point in geometry_loss.py
L_PC_(s) == cal_geo_loss_face in geometry_loss.py
3.5 L_BB_(R,t,s) == ??

But, maybe I'm wrong.

Plus, can you tell me how you obtained the specific values (13.7, 303.5)?
Please give me some advice~!

Thank you!

Encountering issues while installing dependency packages via requirements.txt.

pip install -r requirements.txt
Collecting Markdown==3.3.7 (from -r tmp_requirements.txt (line 1))
Using cached Markdown-3.3.7-py3-none-any.whl (97 kB)
Collecting MarkupSafe==2.0.1 (from -r tmp_requirements.txt (line 2))
Using cached MarkupSafe-2.0.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (30 kB)
Collecting matplotlib==3.4.3 (from -r tmp_requirements.txt (line 3))
Using cached matplotlib-3.4.3-cp38-cp38-manylinux1_x86_64.whl (10.3 MB)
Collecting matplotlib-inline==0.1.3 (from -r tmp_requirements.txt (line 4))
Using cached matplotlib_inline-0.1.3-py3-none-any.whl (8.2 kB)
Collecting mistune==0.8.4 (from -r tmp_requirements.txt (line 5))
Using cached mistune-0.8.4-py2.py3-none-any.whl (16 kB)
Collecting mkl==2022.1.0 (from -r tmp_requirements.txt (line 6))
Using cached mkl-2022.1.0-py2.py3-none-manylinux1_x86_64.whl (256.4 MB)
Collecting mkl-fft==1.3.1 (from -r tmp_requirements.txt (line 7))
Using cached mkl_fft-1.3.1-17-cp38-cp38-manylinux2014_x86_64.whl (252 kB)
Collecting mkl-random==1.2.2 (from -r tmp_requirements.txt (line 8))
Using cached mkl_random-1.2.2-78-cp38-cp38-manylinux2014_x86_64.whl.metadata (2.5 kB)
Collecting mkl-service==2.4.0 (from -r tmp_requirements.txt (line 9))
Using cached mkl_service-2.4.0-35-cp38-cp38-manylinux2014_x86_64.whl.metadata (2.4 kB)
Collecting mmcv==1.5.0 (from -r tmp_requirements.txt (line 10))
Using cached mmcv-1.5.0.tar.gz (530 kB)
Preparing metadata (setup.py) ... done
Collecting multidict==6.0.2 (from -r tmp_requirements.txt (line 11))
Using cached multidict-6.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (121 kB)
Collecting mypy-extensions==0.4.3 (from -r tmp_requirements.txt (line 12))
Using cached mypy_extensions-0.4.3-py2.py3-none-any.whl (4.5 kB)
Collecting nbclassic==0.3.2 (from -r tmp_requirements.txt (line 13))
Using cached nbclassic-0.3.2-py3-none-any.whl (18 kB)
Collecting nbclient==0.5.4 (from -r tmp_requirements.txt (line 14))
Using cached nbclient-0.5.4-py3-none-any.whl (66 kB)
Collecting nbconvert==6.2.0 (from -r tmp_requirements.txt (line 15))
Using cached nbconvert-6.2.0-py3-none-any.whl (553 kB)
Collecting nbformat==5.1.3 (from -r tmp_requirements.txt (line 16))
Using cached nbformat-5.1.3-py3-none-any.whl (178 kB)
Collecting nest-asyncio==1.5.1 (from -r tmp_requirements.txt (line 17))
Using cached nest_asyncio-1.5.1-py3-none-any.whl (5.0 kB)
Collecting networkx==2.6.3 (from -r tmp_requirements.txt (line 18))
Using cached networkx-2.6.3-py3-none-any.whl (1.9 MB)
Collecting nose==1.3.7 (from -r tmp_requirements.txt (line 20))
Using cached nose-1.3.7-py3-none-any.whl (154 kB)
Requirement already satisfied: importlib-metadata>=4.4 in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from Markdown==3.3.7->-r tmp_requirements.txt (line 1)) (4.11.3)
Requirement already satisfied: cycler>=0.10 in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from matplotlib==3.4.3->-r tmp_requirements.txt (line 3)) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from matplotlib==3.4.3->-r tmp_requirements.txt (line 3)) (1.3.2)
Requirement already satisfied: numpy>=1.16 in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from matplotlib==3.4.3->-r tmp_requirements.txt (line 3)) (1.24.4)
Requirement already satisfied: pillow>=6.2.0 in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from matplotlib==3.4.3->-r tmp_requirements.txt (line 3)) (10.0.1)
Requirement already satisfied: pyparsing>=2.2.1 in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from matplotlib==3.4.3->-r tmp_requirements.txt (line 3)) (3.1.1)
Requirement already satisfied: python-dateutil>=2.7 in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from matplotlib==3.4.3->-r tmp_requirements.txt (line 3)) (2.8.2)
Requirement already satisfied: traitlets in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from matplotlib-inline==0.1.3->-r tmp_requirements.txt (line 4)) (5.13.0)
Requirement already satisfied: intel-openmp==2022.* in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from mkl==2022.1.0->-r tmp_requirements.txt (line 6)) (2022.1.0)
Requirement already satisfied: tbb==2021.* in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from mkl==2022.1.0->-r tmp_requirements.txt (line 6)) (2021.10.0)
Collecting numpy>=1.16 (from matplotlib==3.4.3->-r tmp_requirements.txt (line 3))
Downloading numpy-1.22.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.9/16.9 MB 11.1 MB/s eta 0:00:00
Requirement already satisfied: dpcpp_cpp_rt in /home/nvidia/anaconda3/envs/gpv/lib/python3.8/site-packages (from mkl-fft==1.3.1->-r tmp_requirements.txt (line 7)) (2022.1.0)
INFO: pip is looking at multiple versions of mkl-random to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install -r tmp_requirements.txt (line 3), -r tmp_requirements.txt (line 7) and -r tmp_requirements.txt (line 8) because these package versions have conflicting dependencies.

The conflict is caused by:
matplotlib 3.4.3 depends on numpy>=1.16
mkl-fft 1.3.1 depends on numpy<1.23.0 and >=1.22.3
mkl-random 1.2.2 depends on numpy<1.25.0 and >=1.24.3

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

(Additional information: here, I have only installed lines 81 to 100 of the requirements.txt file.)

Why are different hyperparameters used?

Hello! Thanks for your kind sharing!
I noticed a difference between your implementation and the paper regarding the hyperparameters (lambda 1 through 8, and so on):

flags.DEFINE_float('rot_1_w', 8.0, '')
flags.DEFINE_float('rot_2_w', 8.0, '')
flags.DEFINE_float('rot_regular', 4.0, '')
flags.DEFINE_float('tran_w', 8.0, '')
flags.DEFINE_float('size_w', 8.0, '')
flags.DEFINE_float('recon_w', 8.0, '')
flags.DEFINE_float('r_con_w', 1.0, '')

flags.DEFINE_float('recon_n_w', 3.0, 'normal estimation loss')
flags.DEFINE_float('recon_d_w', 3.0, 'dis estimation loss')
flags.DEFINE_float('recon_v_w', 1.0, 'voting loss weight')
flags.DEFINE_float('recon_s_w', 0.3, 'point sampling loss weight, important')
flags.DEFINE_float('recon_f_w', 1.0, 'confidence loss')
flags.DEFINE_float('recon_bb_r_w', 1.0, 'bbox r loss')
flags.DEFINE_float('recon_bb_t_w', 1.0, 'bbox t loss')
flags.DEFINE_float('recon_bb_s_w', 1.0, 'bbox s loss')
flags.DEFINE_float('recon_bb_self_w', 1.0, 'bb self')

flags.DEFINE_float('mask_w', 1.0, 'obj_mask_loss')

flags.DEFINE_float('geo_p_w', 1.0, 'geo point matching loss')
flags.DEFINE_float('geo_s_w', 10.0, 'geo symmetry loss')
flags.DEFINE_float('geo_f_w', 0.1, 'geo face loss, face must be consistent with the point cloud')

flags.DEFINE_float('prop_pm_w', 2.0, '')
flags.DEFINE_float('prop_sym_w', 1.0, 'important for symmetric objects, can do point aug along reflection plane')
flags.DEFINE_float('prop_r_reg_w', 1.0, 'rot confidence must sum to 1')

Those differ from the paper (1/8.0, 1/8.0, 1/8.0, 1.0, ..., 8.0, 1.0, 1.0), and there also seem to be some unseen, newly added ones like recon_n_w, recon_d_w, recon_s_w, and prop_pm_w.

Could you please give me some advice about it?

Plus, I think I should add the 'Geo_face' loss term for 3.5 Bounding Box - Pose Geometric Consistency. Right?

Thank you very much!

How long is the training?

I'm very interested in your work. I found that it takes an hour to train one epoch on a single 3090, so it takes a very long time to fully train GPV-Pose. How many GPUs and how much time did it take you to train GPV-Pose?

A question regarding the loss.

Thank you for your great work, but I have a small question regarding some loss terms. What is the geometric meaning of the formula enclosed in the red box in the diagram below? Additionally, for the second part of Equation (14), why is it i∈|B|, or in other words, what does |B| represent? Looking forward to your response.
(screenshot of the equation omitted)

How to get 'Real/train_list.txt'?

How do I get the txt files and pkl files? I would be grateful if you could give me these files.
img_list_path = ['CAMERA/train_list.txt', 'Real/train_list.txt',
'CAMERA/val_list.txt', 'Real/test_list.txt']
model_file_path = ['obj_models/camera_train.pkl', 'obj_models/real_train.pkl',
'obj_models/camera_val.pkl', 'obj_models/real_test.pkl']
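
For reference, a minimal sketch of how such a list file could be generated. It assumes the NOCS layout used by object-deformnet (Real/train/scene_*/XXXX_color.png) and that each list line is the per-frame prefix relative to the dataset root; both are assumptions, not confirmed by this repo.

import glob
import os

# Hypothetical generator for 'Real/train_list.txt' under the assumed layout;
# each output line would look like 'train/scene_1/0000'.
data_root = 'Real'
pattern = os.path.join(data_root, 'train', 'scene_*', '*_color.png')
entries = [os.path.relpath(p[:-len('_color.png')], data_root)
           for p in sorted(glob.glob(pattern))]
with open(os.path.join(data_root, 'train_list.txt'), 'w') as f:
    f.write('\n'.join(entries))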

Questions about the code

Hey :)
I am sorry, I have another question.

In recon_loss.py:

res_vote = res_vote / 6 / bs

Are all the divisions by 6 because of the six distance parameters, or is this related to the number of categories?

and another question

if obj_id != 5:

Why is this line not calculated for laptop?

question about the dataset processing

I am kind of confused about several items in datasets/load_data.py. I have some questions about the content in this file.

  1. At line 270, you use: fsnet_scale, mean_shape = self.get_fs_net_scale(self.id2cat_name[str(cat_id + 1)], model, nocs_scale). What do fsnet_scale and mean_shape mean? What was your motivation for doing this? And if I want to build a customized dataset, what am I supposed to do with them?

  2. As I understand it, the nocs_scale in the data_dict means the scale of the normalized 3D model of the observed object, and model_point means a point cloud sampled from the same 3D model. Am I right?

Thanks in advance
Yunlong

About obtaining pose from point clouds

Hi Yan Di, I see that the input to your network is P = points - points.mean(dim=1, keepdim=True), where points is obtained by back-projecting the depth map.

To my limited knowledge, two pieces of information are needed to obtain the pose (discussing only R here): the points P = R @ P_ori after the R transformation, and the original points P_ori.

The network input only contains P = R @ P_ori, so how does the network recover the rotation R without knowing the original points P_ori?
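
For concreteness, a minimal sketch of the centering step quoted above (shapes are illustrative assumptions only): subtracting the per-cloud centroid removes the unknown translation from the input, while the orientation of the observed cloud is left untouched.

import torch

# Illustrative: a batch with one cloud of 1024 back-projected depth points.
points = torch.randn(1, 1024, 3)

# The centering quoted from the issue: the centroid (and with it the unknown
# translation) is removed; the cloud's orientation is unchanged.
P = points - points.mean(dim=1, keepdim=True)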

Question regarding evaluation

So for evaluation, a different dataset is used than for training;

how are the values for the dict obtained?

eval:

network(rgb=data['roi_img'].to(device), depth=data['roi_depth'].to(device),
                          depth_normalize=data['depth_normalize'].to(device),
                          obj_id=data['cat_id_0base'].to(device), 
                          camK=data['cam_K'].to(device),
                          gt_mask=data['roi_mask'].to(device),
                          gt_R=None, gt_t=None, gt_s=None, mean_shape=mean_shape,
                          gt_2D=data['roi_coord_2d'].to(device), sym=sym,
                          def_mask=data['roi_mask'].to(device))

train:

network(rgb=data['roi_img'].to(device), depth=data['roi_depth'].to(device),
                              depth_normalize=data['depth_normalize'].to(device),
                              obj_id=data['cat_id'].to(device), 
                              camK=data['cam_K'].to(device), gt_mask=data['roi_mask'].to(device),
                              gt_R=data['rotation'].to(device), gt_t=data['translation'].to(device),
                              gt_s=data['fsnet_scale'].to(device), mean_shape=data['mean_shape'].to(device),
                              gt_2D=data['roi_coord_2d'].to(device), sym=data['sym_info'].to(device),
                              aug_bb=data['aug_bb'].to(device), aug_rt_t=data['aug_rt_t'].to(device), aug_rt_r=data['aug_rt_R'].to(device),
                              def_mask=data['roi_mask_deform'].to(device),
                              model_point=data['model_point'].to(device), nocs_scale=data['nocs_scale'].to(device), do_loss=True)

RGB --> same
depth --> same
depth_normalize --> same
obj_id --> eval ['cat_id_0base'], train data['cat_id']
camK --> same
gt_mask --> same
gt_R --> not used in eval
gt_t --> not used in eval
gt_s --> not used in eval
mean_shape --> not used in eval
gt_2D --> same
sym --> same
def_mask --> eval def_mask=data['roi_mask'], train def_mask=data['roi_mask_deform']

required for eval:
pred_RT --> obtained from line 84, generate_RT([p_green_R_vec, p_red_R_vec], [f_green_R, f_red_R], p_T, mode='vec', sym=sym)

information not present in GPV Pose --> how to obtain gt_handle_visibility
in mentian/object-deformnet --> https://github.com/mentian/object-deformnet/search?q=gt_handle_visibility
gt_handle_visibility = nocs['gt_handle_visibility']

So my question is: why is the category id definition different? And for one's own dataset, how do you obtain the value gt_handle_visibility?
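
For context, a minimal sketch of a default one might use on a custom dataset, based on how the NOCS/object-deformnet evaluation treats this flag (an assumption about that codebase, not about GPV-Pose): gt_handle_visibility is a per-instance flag that stays 1 unless the instance is a mug whose handle is occluded, in which case the evaluation treats the mug as rotationally symmetric.

import numpy as np

# Hypothetical default for a custom dataset: mark every handle as visible;
# set vis[i] = 0 only for mugs whose handle is known to be occluded, so a
# NOCS-style evaluation treats those mugs as symmetric.
def default_handle_visibility(class_ids):
    return np.ones_like(class_ids)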

No script to visualize results

Hi there,

this is not really an issue. I'm just wondering whether there should be a script that projects the estimated pose onto an RGB image to generate something like Figure 5 in the paper.

I found draw_detections in eval_utils.py and vis_utils.py, but it does not seem to be used anywhere. Would it be possible to add such a script?

Thanks in advance!
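
In the meantime, a minimal sketch of such a projection, independent of the repo's utilities (the helper and all names are hypothetical; it only assumes a predicted rotation R, translation t, box extents s, and camera intrinsics K):

import numpy as np
import cv2

def draw_pose_box(img, R, t, s, K, color=(0, 255, 0)):
    # Hypothetical helper, not part of GPV-Pose: project the 8 corners of an
    # estimated 3D bounding box into the image and draw its edges.
    # R: (3,3) rotation, t: (3,) translation, s: (3,) extents, K: (3,3) intrinsics.
    corners = np.array([[x, y, z]
                        for x in (-s[0] / 2, s[0] / 2)
                        for y in (-s[1] / 2, s[1] / 2)
                        for z in (-s[2] / 2, s[2] / 2)])
    cam_pts = corners @ R.T + t          # object frame -> camera frame
    uv = cam_pts @ K.T                   # pinhole projection
    uv = (uv[:, :2] / uv[:, 2:3]).astype(int)
    for i in range(8):                   # connect corners differing in one axis
        for j in range(i + 1, 8):
            if bin(i ^ j).count('1') == 1:
                cv2.line(img, tuple(map(int, uv[i])), tuple(map(int, uv[j])), color, 2)
    return img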

Queries about the experiment on the LineMOD dataset

Hi, thanks for sharing the code. I am planning to compare my algorithm with GPV-Pose. Regarding the experiment on the LineMOD dataset, is the evaluation mask produced by a trained MaskRCNN? Also, would you be willing to share the resulting masks for a fair comparison? Thanks!

Missing pkl file

I have used DualPoseNet's code for data processing, but I found many pkl files missing during training. How can I obtain these files?

About the pretrained model for synthetic data

Could it be that the pretrained model for the synthetic (CAMERA) data was uploaded incorrectly? I loaded this model in both versions and got wrong results:
(screenshot of the erroneous results omitted)

Performance when using prior

Hi! Thanks for releasing this wonderful work!
I'm interested in the performance when using the shape prior, and I'd like to reproduce it. Thanks!
