
homan's Introduction

Towards unconstrained hand-object reconstruction from RGB videos


Yana Hasson, Gül Varol, Ivan Laptev and Cordelia Schmid

Table of Contents

Demo

Open In Colab

Setup

Environment setup

Note that you will need a reasonably recent GPU to run this code.

We recommend using a conda environment:

conda env create -f environment.yml
conda activate phosa16
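
Before going further, you can confirm that the environment sees your GPU with a quick check (a minimal sketch, assuming the conda environment installs PyTorch):

# Quick sanity check: confirm PyTorch sees a CUDA-capable GPU.
# Assumes PyTorch is installed by the conda environment above.
import torch

if torch.cuda.is_available():
    print("Found GPU:", torch.cuda.get_device_name(0))
else:
    print("No CUDA GPU detected; fitting will be very slow or fail.")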

External Dependencies

Detectron2, NMR, FrankMocap

mkdir -p external
# Detectron2 (pinned to v0.2.1)
git clone --branch v0.2.1 https://github.com/facebookresearch/detectron2.git external/detectron2
pip install external/detectron2
# Neural renderer and SDF packages from the multiperson repo
git clone https://github.com/hassony2/multiperson.git external/multiperson
pip install external/multiperson/neural_renderer
pip install external/multiperson/sdf
# FrankMocap
git clone https://github.com/hassony2/frankmocap.git external/frankmocap
sh scripts/install_frankmocap.sh
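
To verify the installs, you can try importing the packages (a minimal sketch; the module names are assumptions based on the repository layout, and FrankMocap is used from external/frankmocap rather than imported here):

# Minimal import check for the pip-installed dependencies above.
# Module names are assumptions based on the package folders.
import detectron2            # from external/detectron2
import neural_renderer       # from external/multiperson/neural_renderer
import sdf                   # from external/multiperson/sdf
print("detectron2", detectron2.__version__)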

Install MANO

Follow the instructions below to install MANO; a quick file check is sketched after the list
  • Go to MANO website: http://mano.is.tue.mpg.de/
  • Create an account by clicking *Sign Up* and provide your information
  • Download Models and Code (the downloaded file should have the format mano_v*_*.zip). Note that all code and data from this download falls under the MANO license (see http://mano.is.tue.mpg.de/license).
  • Unzip and copy the content of the *models* folder into the extra_data/mano folder
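
After copying, you can sanity-check the MANO files with a short script (a minimal sketch; only MANO_RIGHT.pkl is listed elsewhere in this README, the other model files may vary):

# Minimal sketch: confirm the MANO model files landed in extra_data/mano.
from pathlib import Path

mano_dir = Path("extra_data/mano")
print(sorted(p.name for p in mano_dir.glob("*.pkl")))
assert (mano_dir / "MANO_RIGHT.pkl").exists(), "MANO_RIGHT.pkl not found"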

Install SMPL-X

Follow the instructions below to install SMPL-X
  • Go to the SMPL-X website: https://smpl-x.is.tue.mpg.de/
  • Create an account and download the SMPL-X models (the download falls under the SMPL-X license, see the website)
  • Unzip and copy SMPLX_NEUTRAL.pkl into the extra_data/smpl folder

Download datasets

HO-3D

Download the dataset following the instructions on the official project webpage.

This code expects to find the ho3d root folder at local_data/datasets/ho3d
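
If you already have HO-3D downloaded elsewhere, a symlink avoids copying (a small sketch; the source path is a placeholder):

# Link an existing HO-3D download to the location this repo expects.
from pathlib import Path

ho3d_src = Path("/path/to/HO3D")             # placeholder: your HO-3D download
ho3d_dst = Path("local_data/datasets/ho3d")  # location the code expects
ho3d_dst.parent.mkdir(parents=True, exist_ok=True)
if not ho3d_dst.exists():
    ho3d_dst.symlink_to(ho3d_src)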

Core50

Follow the instructions below to set up the Core50 dataset; a verification sketch follows the list
  • Download the Object models from ShapeNetCorev2
    • Go to https://shapenet.org and create an account
    • Go to the download ShapeNet page
    • You will need the "Archive of ShapeNetCore v2 release" (~25GB)
    • Unzip to the local_data folder by adapting the command below
      • unzip /path/to/ShapeNetCore.v2.zip -d local_data/datasets/ShapeNetCore.v2/
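
After unzipping, you can check the extraction with a short script (a minimal sketch; depending on how the archive nests its top-level folder, the models may sit one level deeper, e.g. ShapeNetCore.v2/ShapeNetCore.v2/, in which case adjust the path):

# Minimal sketch: count synset folders under the extracted ShapeNet root.
from pathlib import Path

shapenet_root = Path("local_data/datasets/ShapeNetCore.v2")
synsets = [d for d in shapenet_root.iterdir() if d.is_dir()]
print(len(synsets), "synset folders under", shapenet_root)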

Running the Code

Check installation

After completing all the Setup steps, make sure the file structure in the homan folder looks like this.

# Installed datasets
local_data/
  datasets/
    ho3d/
    core50/
    ShapeNetCore.v2/
    epic/
# Auxiliary data needed to run the code
extra_data/
  # MANO data files
  mano/
    MANO_RIGHT.pkl
    ...
  smpl/
    SMPLX_NEUTRAL.pkl
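
You can verify this layout with a short script (a minimal sketch that only checks the paths listed above):

# Minimal sketch: check that the expected folders and files are in place.
from pathlib import Path

expected = [
    "local_data/datasets/ho3d",
    "local_data/datasets/core50",
    "local_data/datasets/ShapeNetCore.v2",
    "extra_data/mano/MANO_RIGHT.pkl",
    "extra_data/smpl/SMPLX_NEUTRAL.pkl",
]
for rel_path in expected:
    status = "ok" if Path(rel_path).exists() else "MISSING"
    print(status, rel_path)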

Start fitting

Core50

Step 1

  • Pre-processing images
  • Joint optimization with coarse interaction terms
python fit_vid_dataset.py --dataset core50 --optimize_object_scale 0 --result_root results/core50/step1

Step 2

  • Joint optimization refinement
python fit_vid_dataset.py --dataset core50 --split test --lw_collision 0.001 --lw_contact 1 --optimize_object_scale 0 --result_root results/core50/step2 --resume results/core50/step1

HO-3D

Step 1

  • Pre-processing images
  • Joint optimization with coarse interaction terms
python fit_vid_dataset.py --dataset ho3d --split test --optimize_object_scale 0 --result_root results/ho3d/step1

Step 2

  • Joint optimization refinement
python fit_vid_dataset.py --dataset ho3d --split test --lw_collision 0.001 --lw_contact 1 --optimize_object_scale 0 --result_root results/ho3d/step2 --resume results/ho3d/step1
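
Outputs are written under the folder passed as --result_root. Judging from the paths reported in the issues below (e.g. step1/samples/00000000/detection_masks.png), each processed clip gets its own folder under samples/. A small sketch to list what was produced (the exact layout is an assumption inferred from those reports):

# Sketch: list per-clip outputs written under --result_root.
# Layout (samples/<clip_id>/...) is inferred from the issue reports below.
from pathlib import Path

result_root = Path("results/ho3d/step2")
for sample_dir in sorted((result_root / "samples").iterdir()):
    print(sample_dir.name, sorted(p.name for p in sample_dir.iterdir()))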

Acknowledgements

PHOSA

The code for this project is heavily based on and influenced by Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild (PHOSA) by Jason Y. Zhang*, Sam Pepose*, Hanbyul Joo, Deva Ramanan, Jitendra Malik, and Angjoo Kanazawa, ECCV 2020.

Consider citing their work!

@InProceedings{zhang2020phosa,
    title = {Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild},
    author = {Zhang, Jason Y. and Pepose, Sam and Joo, Hanbyul and Ramanan, Deva and Malik, Jitendra and Kanazawa, Angjoo},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = {2020},
}

Funding

This work was funded in part by the MSR-Inria joint lab, the French government under management of Agence Nationale de la Recherche as part of the ”Investissements d’avenir” program, reference ANR19-P3IA-0001 (PRAIRIE 3IA Institute) and by Louis Vuitton ENS Chair on Artificial Intelligence.

Other references

If you find this work interesting, take a look at awesome-hand-pose-estimation by Xinghao Chen to keep track of recent hand pose estimation publications.

License

Note that our code depends on other libraries, including SMPL, SMPL-X, and MANO, each of which has its own license that must also be followed.


homan's Issues

About testing on the HO-3D dataset

I want to test this method on the HO-3D dataset on Colab, but when I run evalho3drecons.py it reports a missing file:
[screenshot of the error]
Can you tell me how to create this file?

Is the relative distance between hand and object predicted?

Thanks for your great work! The results are fascinating!
And I have some questions.

  1. Is the relative distance between hand and object predicted? I want to let robots imitate object-lifting videos to lift objects. So I need the relative distance between hand and object.
  2. I haven't seen object scale parameters for HO-3D in homan/dataset/ho3dconstants.py, but Core50 has them, such as the 0.08 in your Colab demo. So how should the scale parameter be set for HO-3D?
  3. How do I obtain the final object and hand pose results?

Really nice work, thanks!

How to get the "translations" and "rotations" parameters from the 'person_parameters' generated by FrankMocap?

In my test, the hand parameters generated by FrankMocap include ['pred_vertices_smpl', 'pred_joints_smpl', 'faces', 'bbox_scale_ratio', 'bbox_top_left', 'pred_camera', 'img_cropped', 'pred_hand_pose', 'pred_hand_betas', 'pred_vertices_img', 'pred_joints_img'], so I cannot find the "translations" and "rotations" parameters needed in jointopt.py. How can I solve this problem?

Issue with prediction masks during step1

I'm attempting to generate results on the HO3D dataset (specifically the MC2 clip within the train split). I am able to run step 1 and step 2 of the process using the commands provided in the README for generating HO3D results. After running step 1, I am noticing incorrect hand and object masks:

step1/samples/00000000/detection_masks.png:

I am able to run step 2 of the process, but I believe the results from step 1 may be limiting the performance of step 2. These are the results I am seeing after running step 2 (I used 60-frame chunks).

step2/samples/00000000/final_points.mp4:
https://user-images.githubusercontent.com/15927999/197063491-8a3d5e5e-77b9-4786-a620-716e8db79629.mp4

The hand appears to get very distorted and the object doesn't align correctly. I'm wondering if this is in part due to the incorrect hand and object masks in the image I attached.

One change I had to make: I couldn't download the PointRend model https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco/164955410/model_final_3c3198.pkl due to an HTTP forbidden error (I'm guessing the file is no longer available). Instead I used https://dl.fbaipublicfiles.com/detectron2/PointRend/InstanceSegmentation/pointrend_rcnn_R_50_FPN_3x_coco/164955410/model_final_edd263.pkl to generate my results.

Is there a reason why these masks look incorrect, and is there a potential solution? Thanks!

ModuleNotFoundError that "No module named 'detectron2.projects'"

Hello, thanks for this amazing repo and for the easy-to-understand, step-by-step installation. When I run this code on Windows 10, I get a ModuleNotFoundError: "No module named 'detectron2.projects'" in "homan\homan\pointrend.py".

I have installed detectron2 0.2.1 in a conda environment. Could you tell me how I can import this module, or where this module lives in detectron2?

About missing annotation data in HO-3D dataset

Thanks for sharing the awesome code!
I got an error when running track_dataset.py on the HO-3D dataset (please see below for details).

cv2.error: OpenCV(4.8.0) /io/opencv/modules/calib3d/src/calibration.cpp:3534: error: (-2:Unspecified error) in function 'void cv::Rodrigues(cv::InputArray, cv::OutputArray, cv::OutputArray)'

> Input matrix must be 1x3 or 3x1 for a rotation vector, or 3x3 for a rotation matrix:
>     'srcSz == Size(3, 1) || srcSz == Size(1, 3) || (srcSz == Size(1, 1) && src.channels() == 3) || srcSz == Size(3, 3)'
> where
>     'srcSz' is [0 x 0]

It seems to be related to missing annotation data for the camera parameters.
As far as I checked, the value was None in the annotation data of some frames ([screenshot of the annotation data]).

I would like to know whether you ran into this missing data in HO-3D as well, and if so, how you dealt with it.

Thank you :)
