
sc6d-pose's Introduction

SC6D: Symmetry-agnostic and Correspondence-free 6D Object Pose Estimation (3DV 2022)

@inproceedings{cai2022sc6d,
  title={SC6D: Symmetry-agnostic and Correspondence-free 6D Object Pose Estimation},
  author={Cai, Dingding and Heikkil{\"a}, Janne and Rahtu, Esa},
  booktitle={2022 International Conference on 3D Vision (3DV)},
  year={2022},
  organization={IEEE}
}

Setup

Please start by installing Miniconda3 with Python 3.8 or above.

git clone https://github.com/dingdingcai/SC6D-pose.git
cd SC6D-pose
conda env create -f environment.yml
conda activate sc6d
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.8/index.html

Dataset

Our evaluation is conducted on three benchmark datasets, all downloaded from the BOP website. Store all three datasets in the same directory, e.g. BOP_Dataset/tless, BOP_Dataset/ycbv, BOP_Dataset/itodd, and set "DATASET_ROOT" (in config.py) to the BOP_Dataset directory.
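The directory layout described above can be checked before running anything else. The following is a minimal sketch, not part of the repository: the helper name `missing_datasets` is hypothetical, and the value assigned to `DATASET_ROOT` is only an example path.

```python
from pathlib import Path

# Dataset names are taken from the README text; the helper itself is hypothetical.
EXPECTED_DATASETS = ("tless", "ycbv", "itodd")


def missing_datasets(dataset_root, names=EXPECTED_DATASETS):
    """Return the dataset sub-directories that are absent under dataset_root."""
    root = Path(dataset_root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    DATASET_ROOT = "BOP_Dataset"  # assumed example value; use your own path
    missing = missing_datasets(DATASET_ROOT)
    print("missing datasets:", missing if missing else "none")
```

An empty list means all three sub-directories exist; it does not verify that their contents are complete.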

Dependencies

This project requires the evaluation code from bop_toolkit.

Pre-trained Models

The pre-trained models can be downloaded here. Save all the models to the checkpoints directory, for example, checkpoints/tless, checkpoints/ycbv, checkpoints/itodd.

Inference

Download the predicted detection results from BOP Challenge 2022 and decompress the archive into the root directory.

  • unzip bop22_default_detections_and_segmentations.zip

Evaluation on the model trained using only PBR images.

  • python inference.py --dataset_name tless --gpu_id 0

Evaluation on the model first trained using the PBR images and then finetuned with the combined synthetic and real images.

  • python inference.py --dataset_name tless --gpu_id 0 --eval_finetune
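To run the two evaluation modes above over several datasets, the command lines can be assembled programmatically. This is a minimal sketch: the helper `build_inference_command` is hypothetical, the flags are the ones shown above, and it is an assumption that `ycbv` and `itodd` are accepted as `--dataset_name` values in the same way as `tless`.

```python
def build_inference_command(dataset_name, gpu_id=0, eval_finetune=False):
    """Build the argument list for inference.py using the flags shown in the README."""
    cmd = [
        "python", "inference.py",
        "--dataset_name", dataset_name,
        "--gpu_id", str(gpu_id),
    ]
    if eval_finetune:
        # Evaluate the model finetuned on synthetic and real images.
        cmd.append("--eval_finetune")
    return cmd


if __name__ == "__main__":
    # Assumption: the other dataset names follow the same pattern as "tless".
    for name in ("tless", "ycbv", "itodd"):
        print(" ".join(build_inference_command(name)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.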

Training

To train SC6D, download the VOC2012 dataset and set "VOC_BG_ROOT" (in config.py) to the VOC2012 directory.

  • bash training.sh # change the "NAME" variable for a new dataset.

Acknowledgement


sc6d-pose's Issues

pXY loss not dropping

Hi! Many thanks for the cool repo and paper. I'm training the network on a custom dataset, but pXY is not dropping, and therefore the results are not satisfactory. While you were training your networks, did you face such an issue? Do you have any recommendations to improve this? Many thanks in advance!

Finetuning configuration

What is the parameter setting for fine-tuning? In particular, what learning rate did you use?

[Question] how to determine Tz's scope?

Hi dingdingcai,
I've read your paper. It's such amazing work!
In your work, you replace regressing Tz with classifying Tz. Could you tell me how you determine the nearest (or farthest) Tz, for which you set a value here? Many thanks.
Best,
Leroy
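For readers unfamiliar with the idea behind this question, treating depth as a classification target generally means discretizing a chosen depth range into bins. The following is a generic sketch of uniform binning, not the SC6D implementation: `z_min`, `z_max`, and `num_bins` are hypothetical parameters, and the repository may define its range and bins differently.

```python
def tz_to_bin(tz, z_min, z_max, num_bins):
    """Map a depth value to a class index using uniform bins over [z_min, z_max]."""
    if not z_min < z_max or num_bins <= 0:
        raise ValueError("need z_min < z_max and num_bins > 0")
    clipped = min(max(tz, z_min), z_max)  # values outside the range fall in the end bins
    idx = int((clipped - z_min) / (z_max - z_min) * num_bins)
    return min(idx, num_bins - 1)  # z_max itself belongs to the last bin


def bin_to_tz(idx, z_min, z_max, num_bins):
    """Decode a class index back to the center of its depth bin."""
    width = (z_max - z_min) / num_bins
    return z_min + (idx + 0.5) * width
```

With this scheme the nearest and farthest depths are simply the range limits chosen by the user, typically from the depth statistics of the training data.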

Finetuning setting

First of all, really nice work and the results are so impressive! I am trying to reproduce the metric on TLESS dataset but I have trouble obtaining the same score as reported in the paper via fine-tuning with real images. I was starting with a learning rate of 1e-4 and gradually annealing it towards 1e-5 after 30 epochs. I also did oversampling of real images to ensure a 50/50 percentage of real / synthetic examples. Any important details that I was missing here? Thanks!
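The setup described in this question can be written down as a small sketch for discussion. It reflects one reading of the post, not the authors' finetuning recipe: the learning rate is interpolated geometrically from 1e-4 to 1e-5 over the first 30 epochs (the post does not say which curve was used), and the per-sample weights give real and synthetic images equal total sampling probability. Both function names are hypothetical.

```python
def annealed_lr(epoch, lr_start=1e-4, lr_end=1e-5, anneal_epochs=30):
    """Geometric interpolation from lr_start to lr_end, constant after anneal_epochs."""
    if anneal_epochs <= 0:
        raise ValueError("anneal_epochs must be positive")
    progress = min(max(epoch, 0), anneal_epochs) / anneal_epochs
    return lr_start * (lr_end / lr_start) ** progress


def balanced_sample_weights(num_real, num_synthetic):
    """Per-sample weights so each domain receives half of the total sampling mass."""
    if num_real <= 0 or num_synthetic <= 0:
        raise ValueError("both sets must be non-empty")
    return 0.5 / num_real, 0.5 / num_synthetic
```

Weights of this form can be supplied to a weighted sampler such as `torch.utils.data.WeightedRandomSampler`, one weight per training example.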

Licensing

Thanks for sharing this great code repo! We found your work interesting and inspiring and we would like to follow up. However, it seems that there's no license associated with it which makes it difficult for us to include any code snippets from you. According to GitHub's help page:

You're under no obligation to choose a license. However, without a license, the default copyright laws apply, meaning that you retain all rights to your source code and no one may reproduce, distribute, or create derivative works from your work. If you're creating an open source project, we strongly encourage you to include an open source license. The Open Source Guide provides additional guidance on choosing the correct license for your project.

Could you please add an appropriate license if possible? We will give acknowledgements and citations. FYI, this instruction may be helpful: https://docs.github.com/en/communities/setting-up-your-project-for-healthy-contributions/adding-a-license-to-a-repository. A less restrictive license like Apache-2.0 or MIT would be preferred and appreciated by us. Thanks!

Question about position encoding

Hey, it's an interesting work!
I notice that you have not implemented position encoding for SO3 rotation input, while it has been done in Implicit-PDF. I am wondering if there are any concerns with implementing position encoding.
Thanks!
