This project forked from zubair-irshad/centersnap

Pytorch code for ICRA'22 paper: "Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation"

Home Page: https://zubair-irshad.github.io/projects/CenterSnap.html

CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation

License: MIT

This repository is the PyTorch implementation of our paper:

CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation
Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone, Zsolt Kira
International Conference on Robotics and Automation (ICRA), 2022

[Project Page] [arXiv] [PDF] [Video] [Poster]

Explore CenterSnap in Colab

Citation

If you find this repository useful, please consider citing:

@inproceedings{irshad2022centersnap,
  title={CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation},
  author={Muhammad Zubair Irshad and Thomas Kollar and Michael Laskey and Kevin Stone and Zsolt Kira},
  booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
  year={2022},
  url={https://arxiv.org/abs/2203.01929},
}

@inproceedings{irshad2022shapo,
  title={ShAPO: Implicit Representations for Multi Object Shape Appearance and Pose Optimization},
  author={Muhammad Zubair Irshad and Sergey Zakharov and Rares Ambrus and Thomas Kollar and Zsolt Kira and Adrien Gaidon},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2022},
  url={https://arxiv.org/abs/2207.13691},
}

Contents

💻 Environment

Create a Python 3.8 virtual environment and install the requirements:

cd $CenterSnap_Repo
conda create -y --prefix ./env python=3.8
conda activate ./env/
./env/bin/python -m pip install --upgrade pip
./env/bin/python -m pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html

The code was built and tested on CUDA 10.2.

📊 Dataset

  1. Download pre-processed dataset

We recommend downloading the preprocessed dataset to train and evaluate the CenterSnap model. Download and untar the Synthetic (868GB) and Real (70GB) datasets. These files contain all the training and validation data you need to replicate our results.

cd $CenterSnap_Repo/data
wget https://tri-robotics-public.s3.amazonaws.com/centersnap/CAMERA.tar.gz
tar -xzvf CAMERA.tar.gz

wget https://tri-robotics-public.s3.amazonaws.com/centersnap/Real.tar.gz
tar -xzvf Real.tar.gz

The data directory structure should follow:

data
├── CAMERA
│   ├── train
│   └── val_subset
└── Real
    ├── train
    └── test

  2. To prepare your own dataset, we provide additional scripts under prepare_data.
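
Before starting a long training run, it can help to confirm that the extracted archives match the directory structure above. The following is a minimal sketch, not part of the repository: the helper names `missing_dirs` and `free_space_gb` are hypothetical, and the expected sub-directories are taken only from the tree shown in this README.

```python
import shutil
from pathlib import Path

# Sub-directories listed in this README's directory tree.
EXPECTED_DIRS = (
    "CAMERA/train",
    "CAMERA/val_subset",
    "Real/train",
    "Real/test",
)


def missing_dirs(data_root):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def free_space_gb(path):
    """Return the free disk space at an existing path, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


# Assumption: the script is run from the repository root, so data lives in ./data.
print("missing directories:", missing_dirs("data"))
print("free space here (GB):", round(free_space_gb(Path.cwd()), 1))
```

An empty list means all four directories are present. The free-space figure is worth checking before downloading, since the archives and their extracted contents need to fit on disk at the same time.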

✨ Training and Inference

  1. Train on NOCS Synthetic (requires 13GB GPU memory):
./runner.sh net_train.py @configs/net_config.txt

Note that runner.sh is equivalent to using python to run the script. Additionally, it sets up the PYTHONPATH and the CenterSnap environment path automatically.
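
The `@configs/net_config.txt` argument in the command above reads options from a file. A minimal sketch of how this kind of argument typically works, assuming the scripts rely on Python's standard `argparse` with `fromfile_prefix_chars="@"` (this README does not confirm that). The `--max_epochs` option and the file name `example_config.txt` are hypothetical; only `--checkpoint` appears in this README.

```python
import argparse
import tempfile
from pathlib import Path


def build_parser():
    # With fromfile_prefix_chars="@", argparse replaces an "@file" argument
    # with the arguments read from that file, one argument per line.
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("--max_epochs", type=int, default=1)  # hypothetical option
    parser.add_argument("--checkpoint", default=None)
    return parser


def demo():
    """Parse a temporary config file plus one extra command-line flag."""
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "example_config.txt"  # hypothetical file name
        config.write_text("--max_epochs=5\n")
        # Arguments given after the @file are parsed later, so they
        # add to or override the values read from the file.
        return build_parser().parse_args([f"@{config}", "--checkpoint", "best.ckpt"])


args = demo()
print(args.max_epochs, args.checkpoint)
```

Under this assumption, flags passed on the command line after the config file, such as `--checkpoint`, are combined with the file's contents and take precedence when the same option appears twice.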

  2. Fine-tune on NOCS Real Train (good results can be obtained after fine-tuning on the Real train set for only a few epochs, i.e. 1-5):
./runner.sh net_train.py @configs/net_config_real_resume.txt --checkpoint /path/to/best/checkpoint
  3. Inference on a NOCS Real Test Subset

Download a small NOCS Real subset from [here]

./runner.sh inference/inference_real.py @configs/net_config.txt --data_dir path_to_nocs_test_subset --checkpoint checkpoint_path_here

You should see the visualizations saved in results/CenterSnap. Change the --output_path in *config.txt to save them to a different folder.

  4. Optional (Shape Auto-Encoder Pre-training)

We provide a pretrained model for the shape auto-encoder to be used for data collection and inference. Although our codebase doesn't require separately training the shape auto-encoder, if you would like to do so, we provide additional scripts under external/shape_pretraining.

๐Ÿ“ FAQ

1. I am not getting good performance on my custom camera images i.e. Realsense, OAK-D or others.

2. I am getting no CUDA GPUs available while running Colab.

  • Ans: Make sure that you have enabled the GPU under Runtime -> Change runtime type.

3. I am getting raise RuntimeError('received %d items of ancdata' % RuntimeError: received 0 items of ancdata

  • Ans: Increase ulimit to 2048 or 8096 via ulimit -n 2048
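
If changing the shell limit is inconvenient, the same open-file limit can be raised from inside Python before the data loaders start. This is a minimal sketch using the standard `resource` module (Unix only). The helper name `raise_open_file_limit` is hypothetical, and the repository itself is not known to do this.

```python
import resource


def raise_open_file_limit(target=2048):
    """Raise the soft open-file limit to `target`, capped at the hard limit.

    Returns the soft limit in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed the hard limit
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, leave it unchanged
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


print("soft open-file limit:", raise_open_file_limit(2048))
```

The change applies to the current process and any worker processes it spawns afterwards, so it would need to run before training begins.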

4. I am getting RuntimeError: CUDA error: no kernel image is available for execution on the device or You requested GPUs: [0] But your machine only has: []

  • Ans: Check that your PyTorch installation matches your CUDA installation. Try one of the following:
  1. Install CUDA 10.2 and rerun the installation commands above with the same requirements.txt.

  2. Install the PyTorch build that matches your CUDA version, i.e. change these lines in requirements.txt:

torch==1.7.1
torchvision==0.8.2
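
To see which CUDA version the installed PyTorch build targets and whether it can see a GPU, a short diagnostic along these lines may help. It is a sketch rather than part of the repository: the function name `describe_torch_cuda` is hypothetical, and it reports `torch_installed: False` instead of failing when PyTorch is missing.

```python
def describe_torch_cuda():
    """Summarise the installed PyTorch build and the GPUs it can see."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "devices": [],
    }
    if info["cuda_available"]:
        for i in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(i)
            # Compute capability, e.g. sm_75, identifies the GPU architecture.
            info["devices"].append((torch.cuda.get_device_name(i), f"sm_{major}{minor}"))
    return info


print(describe_torch_cuda())
```

If `cuda_available` is False, or the reported compute capability is newer than what the installed build supports, installing a PyTorch build for a different CUDA version is the likely fix.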

5. I am seeing zero val metrics in wandb

  • Ans: Make sure you threshold the metrics. Because the metric from PyTorch Lightning's first validation check is high, it makes all other metrics look like zero on the plot. Threshold manually in wandb to remove the outlier and see the actual metrics.

Acknowledgments

  • This code is built upon the implementation from SimNet

Related Work

Licenses
