AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction (ECCV 2022)

This repository is the official implementation of AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction. Project webpage: https://zerchen.github.io/projects/alignsdf.html.

Recent work achieved impressive progress towards joint reconstruction of hands and manipulated objects from monocular color images. Existing methods focus on two alternative representations in terms of either parametric meshes or signed distance fields (SDFs). On one side, parametric models can benefit from prior knowledge at the cost of limited shape deformations and mesh resolutions. Mesh models, hence, may fail to precisely reconstruct details such as contact surfaces of hands and objects. SDF-based methods, on the other side, can represent arbitrary details but are lacking explicit priors. In this work we aim to improve SDF models using priors provided by parametric representations. In particular, we propose a joint learning framework that disentangles the pose and the shape. We obtain hand and object poses from parametric models and use them to align SDFs in 3D space. We show that such aligned SDFs better focus on reconstructing shape details and improve reconstruction accuracy both for hands and objects. We evaluate our method and demonstrate significant improvements over the state of the art on the challenging ObMan and DexYCB benchmarks.
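To give an intuition for what "aligning an SDF with a pose" means, here is a toy sketch. It is not the network in this repository: it assumes a known rigid pose (a rotation by `yaw` about the z axis followed by a translation `t`) and an analytic box SDF standing in for a learned shape. All function names are hypothetical. Query points are moved into the canonical frame with R^T (p - t) before the shape SDF is evaluated, so the SDF itself never has to model the pose.

```python
import math

def box_sdf(p, half):
    """Signed distance from p to an axis-aligned box centred at the origin."""
    q = [abs(c) - h for c, h in zip(p, half)]
    outside = math.sqrt(sum(max(c, 0.0) ** 2 for c in q))
    inside = min(max(q), 0.0)
    return outside + inside

def to_canonical(p, yaw, t):
    """Undo the pose: apply R^T (p - t), with R a rotation by yaw about z."""
    x, y, z = (p[i] - t[i] for i in range(3))
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x + s * y, -s * x + c * y, z)

def posed_box_sdf(p, half, yaw, t):
    """SDF of the posed box, evaluated through the canonical (pose-aligned) frame."""
    return box_sdf(to_canonical(p, yaw, t), half)

if __name__ == "__main__":
    half = (1.0, 0.5, 0.5)              # box half-extents in its canonical frame
    yaw, t = math.pi / 2, (2.0, 0.0, 0.0)
    print(posed_box_sdf((2.0, 0.0, 0.0), half, yaw, t))  # centre of the posed box
    print(posed_box_sdf((2.0, 1.5, 0.0), half, yaw, t))  # point outside the box
```

The sign convention here is negative inside and positive outside the shape.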

Installation

conda create --name alignsdf python=3.9
conda activate alignsdf
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
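After installation, a quick check that the main packages are importable can save a failed training launch. The sketch below uses only the standard library. The helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    # Packages installed by the conda command above.
    missing = missing_packages(["torch", "torchvision", "torchaudio"])
    print("missing packages:", missing if missing else "none")
```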

Dataset preprocessing

I have put the preprocessed data here. You could skip the following steps and use my preprocessed data.

Download ObMan and DexYCB from their official websites. I use the s1 split of DexYCB and organize its data in the same way as ObMan. The data directory structure is shown below.

${alignsdf}
|-- data
|   |-- obman
|   |   |-- train
|   |   |   |-- rgb
|   |   |   |-- meta
|   |   |   |-- mesh_hand
|   |   |   |-- mesh_obj
|   |   |   |-- sdf_hand
|   |   |   |-- sdf_obj
|   |   |   |-- norm
|   |   |-- test
|   |   |   |-- rgb
|   |   |   |-- meta
|   |   |   |-- mesh_hand
|   |   |   |-- mesh_obj
|   |-- dexycb
|   |   |-- train
|   |   |-- test
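Before training, it may help to confirm that the folders listed above are in place. The following is a minimal sketch, assuming the script is run from `${alignsdf}`. The helper `missing_dirs` is hypothetical and not part of this repository. For `dexycb` it assumes the same sub-folders as ObMan, based on the statement above that the two datasets share the same organization.

```python
from pathlib import Path

# Sub-folders taken from the directory tree above.
EXPECTED = {
    "train": ["rgb", "meta", "mesh_hand", "mesh_obj", "sdf_hand", "sdf_obj", "norm"],
    "test": ["rgb", "meta", "mesh_hand", "mesh_obj"],
}

def missing_dirs(data_root, dataset="obman"):
    """Return expected sub-folders (as 'split/name') missing under data_root/dataset."""
    base = Path(data_root) / dataset
    return [
        f"{split}/{name}"
        for split, names in EXPECTED.items()
        for name in names
        if not (base / split / name).is_dir()
    ]

if __name__ == "__main__":
    for path in missing_dirs("data"):
        print("missing:", path)
```

An empty result means every expected folder exists. The check does not inspect the files inside them.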

I first use the tool to generate the mesh_hand and mesh_obj folders. Then, I build the SDF preprocessing code based on Grasping Field and DeepSDF to compute signed distances to both the hand and the object, and to find the associated hand-part label for each point. To set up the SDF preprocessing environment, you first need to install CLI11, Pangolin, Nanoflann and Eigen3. After that, you could cd into scripts/sample_points and follow the standard cmake procedure to build the executables:

mkdir build
cd build
cmake ..
make -j

More details about the compilation process can be found in the DeepSDF repository. Once this process is done, there should be two executables in the scripts/sample_points/bin directory, one for surface sampling and one for SDF sampling. With these binaries, the dataset can be preprocessed by running cd scripts/sample_points && python prep_obman.py, which generates the sdf_hand, sdf_obj and norm folders.

(Optional) To accelerate I/O, you could refer to tools/create_lmdb.py to convert your data into the LMDB format. The resulting data organization looks like:

${alignsdf}
|-- data
|   |-- obman
|   |   |-- train
|   |   |   |-- rgb.lmdb
|   |   |   |-- meta.lmdb
|   |   |   |-- sdf_hand.lmdb
|   |   |   |-- sdf_obj.lmdb
|   |   |   |-- norm.lmdb
|   |   |-- test

Training

Before training, create an outputs directory under ${alignsdf} to save outputs. You could train the baseline (i.e., the re-implementation of Grasping Field) with:

CUDA_VISIBLE_DEVICES=0,1,2,3 bash dist_train.sh 4 6666 -e experiments/obman/30k_1e2d_mlp5.json

You could launch the training of our method through the script shown below:

CUDA_VISIBLE_DEVICES=0,1,2,3 bash dist_train.sh 4 6666 -e experiments/obman/30k_1e2d_mlp5.json --mano --obj_pose --point_size 9 --encode both --ocrw 0

Testing and evaluating

Though the testing process will start automatically after training, you could also launch the testing script explicitly when needed:

CUDA_VISIBLE_DEVICES=0,1,2,3 python dist_reconstruct.py -e ${model_dir} -t ${task}

Once the testing phase has finished, you could use evaluate.py to evaluate the performance.

Citation

If you find this work useful, please consider citing:

@InProceedings{chen2022alignsdf,
author       = {Chen, Zerui and Hasson, Yana and Schmid, Cordelia and Laptev, Ivan},
title        = {{AlignSDF}: {Pose-Aligned} Signed Distance Fields for Hand-Object Reconstruction},
booktitle    = {ECCV},
year         = {2022},
}

Acknowledgements

Some of the code is built upon Grasping Field, DeepSDF, manopth and PoseNet. Thanks to them for their great work! If you find this work interesting, you might also be interested in the following publication, which addresses the task under a similar setting:
