
nvlabs / affordance_diffusion


Code for "Affordance Diffusion: Synthesizing Hand-Object Interactions"

Home Page: https://github.com/NVlabs/affordance_diffusion/blob/master

Languages: Python 99.78%, Shell 0.22%
Topics: diffusion-models, vision

affordance_diffusion's Introduction

Affordance Diffusion: Synthesizing Hand-Object Interactions

Yufei Ye, Xueting Li, Abhinav Gupta, Shalini De Mello, Stan Birchfield, Jiaming Song, Shubham Tulsiani, Sifei Liu

in CVPR 2023

TL;DR: Given a single RGB image of an object, hallucinate plausible ways for a human to interact with it.

[Project Page] [Video] [Arxiv] [Data Generation]

Installation

See install.md

Inference

HOI synthesis

python inference.py data.data_dir='docs/demo/*.*g' test_num=3

The inference script first synthesizes test_num HOI images in batch and then extracts 3D hand poses.

[Figure: input image, synthesized HOI images, and extracted 3D hand pose]
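
As a side note, the quoted glob pattern '*.*g' picks up any file whose extension ends in "g", so it covers .png, .jpg, and .jpeg inputs alike. A quick way to preview which demo files will be processed:

```python
import glob

# '*.*g' matches extensions ending in 'g': .png, .jpg, .jpeg, etc.
for path in sorted(glob.glob("docs/demo/*.*g")):
    print(path)
```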

Interpolation

The script takes the layout parameters of the index-th example predicted by inference.py and smoothly interpolates the HOI synthesis toward the horizontally flipped parameters. To run the demo:

python -m scripts.interpolate dir=docs/demo_inter

This should give results similar to:

[Figure: input image, interpolated layouts, and output sequence]
Additional parameters:

```
python -m scripts.interpolate dir=\${output}/release/layout/cascade index=0000_00_s0
```

  • interpolation.len: length of an interpolation sequence
  • interpolation.num: number of interpolation sequences
  • interpolation.test_name: subfolder to save the output
  • interpolation.orient: whether to horizontally flip the approaching direction
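
For intuition about what the interpolation does, the sketch below linearly blends a layout toward its horizontal mirror. The parameter names (x, y, size, angle) are illustrative assumptions for this toy example, not the repo's actual LayoutNet encoding:

```python
import numpy as np

def interpolate_layouts(layout, num_steps=8):
    """Toy linear blend from a layout to its horizontally flipped twin.

    Assumed toy encoding: x/y are normalized palm coordinates and angle
    is the approaching direction in radians; the real layout
    parameterization in this repo may differ.
    """
    flipped = dict(layout)
    flipped["x"] = 1.0 - layout["x"]            # mirror position
    flipped["angle"] = np.pi - layout["angle"]  # mirror approach direction
    for t in np.linspace(0.0, 1.0, num_steps):
        yield {k: (1 - t) * layout[k] + t * flipped[k] for k in layout}

for step in interpolate_layouts({"x": 0.3, "y": 0.5, "size": 0.2, "angle": 0.4}):
    print({k: round(v, 3) for k, v in step.items()})
```

Each intermediate layout can then be fed to the synthesis stage, which is what produces the smooth sequences shown above.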

Heatmap Guidance

The following command runs guided generation with the keypoints in docs/demo_kpts:

python inference.py  mode=hijack data.data_dir='docs/demo_kpts/*.png' test_name=hijack

This should give results similar to:

[Figure: two input/output pairs]
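
Keypoint-based guidance of this kind typically starts by splatting each 2D keypoint into a Gaussian heatmap that the sampler can follow. Below is a minimal numpy sketch of that splatting step; the resolution and sigma are illustrative, not the repo's settings:

```python
import numpy as np

def keypoint_to_heatmap(x, y, size=64, sigma=2.0):
    """Splat one 2D keypoint (pixel coordinates) into a Gaussian heatmap."""
    xs, ys = np.meshgrid(np.arange(size), np.arange(size))
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))

heatmap = keypoint_to_heatmap(20, 40)
print(heatmap.shape, heatmap.max())  # (64, 64); peaks at 1.0 on the keypoint
```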

Training

Data Preprocessing

We provide the script to generate the HO3Pair dataset. Please see preprocess/.

Train your own models

  • LayoutNet:

python -m models.base -m --config-name=train \
  expname=reproduce/\${model.module} \
  model=layout

  • ContentNet-GLIDE:

python -m models.base -m --config-name=train \
  expname=reproduce/\${model.module} \
  model=content_glide

  • ContentNet-LDM: first download the off-the-shelf pretrained model from here and put it at ${environment.pretrain}/stable/inpaint.ckpt, the path specified by resume_ckpt in configs/model/content_ldm.yaml, then run:

python -m models.base -m --config-name=train \
  expname=reproduce/\${model.module} \
  model=content_ldm
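
Before launching the ContentNet-LDM run, it can save time to confirm that the downloaded checkpoint actually sits where resume_ckpt points. A small sanity check; the resolved value of ${environment.pretrain} below ("pretrain") is an assumption for illustration:

```python
import os
import torch

ckpt_path = "pretrain/stable/inpaint.ckpt"  # assumed resolution of ${environment.pretrain}
assert os.path.exists(ckpt_path), f"missing checkpoint: {ckpt_path}"

ckpt = torch.load(ckpt_path, map_location="cpu")
# Latent-diffusion style checkpoints usually nest weights under 'state_dict'.
state = ckpt.get("state_dict", ckpt)
print(f"loaded {len(state)} entries from {ckpt_path}")
```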

Split and test images

Per-category HOI4D instance splits (not used in the paper) and test images for HOI4D and EPIC-KITCHENS (VISOR) can be downloaded here.

License

This project is licensed under CC-BY-NC-SA-4.0. Redistribution and use should follow this license.

Acknowledgement

Affordance Diffusion leverages many amazing open-source projects shared by the research community.

Citation

If you find this work helpful, please consider citing:

@inproceedings{ye2023affordance,
  title={Affordance Diffusion: Synthesizing Hand-Object Interactions},
  author={Yufei Ye and Xueting Li and Abhinav Gupta and Shalini De Mello and Stan Birchfield and Jiaming Song and Shubham Tulsiani and Sifei Liu},
  year={2023},
  booktitle={CVPR},
}

affordance_diffusion's People

Contributors

judyye


affordance_diffusion's Issues

Inconsistency in the environment

I noticed that in the preprocessing phase the PyTorch version is 1.9, but in environment.yaml the PyTorch version is 1.1. Which version should I choose to run the project?

Errors in data generation

Hi, I've encountered an issue similar to the one previously documented as issue #14.

Specifically, in the Text2ImUNet model, the in_channel parameter is set to 3, whereas in the provided checkpoint, the in_channel appears to be 7. I'm uncertain if my approach to resolving this inconsistency is correct. While my adjustments did address the initial model loading problem, they have unfortunately led to a new issue.

Here is my modification: I just changed the checkpoint name in load_base():

if args.base_ckpt is None:
    # model.load_state_dict(load_checkpoint('base-inpaint', device))
    model.load_state_dict(load_checkpoint('base', device))

The dimension mismatch error has been resolved, but I've now encountered a different issue within generate_data:

File "/public/home/v-liuym/projects/affordance_diffusion/preprocess/../glide_text2im/gaussian_diffusion.py", line 413, in p_sample_loop
    for sample in self.p_sample_loop_progressive(
  File "/public/home/v-liuym/projects/affordance_diffusion/preprocess/../glide_text2im/gaussian_diffusion.py", line 465, in p_sample_loop_progressive
    out = self.p_sample(
  File "/public/home/v-liuym/projects/affordance_diffusion/preprocess/../glide_text2im/gaussian_diffusion.py", line 364, in p_sample
    out = self.p_mean_variance(
  File "/public/home/v-liuym/projects/affordance_diffusion/preprocess/../glide_text2im/respace.py", line 116, in p_mean_variance
    return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
  File "/public/home/v-liuym/projects/affordance_diffusion/preprocess/../glide_text2im/gaussian_diffusion.py", line 258, in p_mean_variance
    model_output = model(x, t, **model_kwargs)
  File "/public/home/v-liuym/projects/affordance_diffusion/preprocess/../glide_text2im/respace.py", line 146, in __call__
    return self.model(x, new_ts, **kwargs)
  File "generate_data.py", line 161, in model_fn
    model_out = model(combined, ts, **kwargs)
  File "/public/home/v-liuym/.conda/envs/afford_diff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'inpaint_image'

I think I've narrowed down the problem to the setup args for the diffusion model, at least that's what it looks like from the definition here. But no luck fixing it yet. It would be awesome if you could give me a hand with this! @JudyYe
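
For context, a 7-vs-3 channel mismatch on input_blocks.0.0.weight usually means the UNet was built without its inpainting input stem: in the upstream glide-text2im code, the 'base-inpaint' checkpoint expects the noisy image (3 channels) concatenated with the masked conditioning image (3) and its mask (1). Rather than switching to the plain 'base' weights, the usual fix is to enable the inpaint option at model-creation time. A sketch following the upstream inpainting notebook, assuming the vendored glide_text2im mirrors it:

```python
import torch
from glide_text2im.download import load_checkpoint
from glide_text2im.model_creation import (
    create_model_and_diffusion,
    model_and_diffusion_defaults,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

options = model_and_diffusion_defaults()
options["inpaint"] = True  # builds the 7-channel input stem 'base-inpaint' expects
model, diffusion = create_model_and_diffusion(**options)
model.load_state_dict(load_checkpoint("base-inpaint", device))
model.eval().to(device)
```

With the inpaint stem in place, kwargs such as inpaint_image and inpaint_mask are consumed by the model's forward pass instead of raising the TypeError above.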

RuntimeError: Error(s) in loading state_dict for Text2ImUNet: size mismatch for input_blocks.0.0.weight: copying a param with shape torch.Size([192, 7, 3, 3]) from checkpoint, the shape in current model is torch.Size([192, 3, 3, 3]).

Your project is really impressive, but I encountered some issues while trying to reproduce it and hope to get your help. The following error occurred during the inpainting stage of data generation.

[Error message]:
Traceback (most recent call last):
  File "/affordance_diffusion/preprocess/generate_data.py", line 495, in <module>
    batch_main(args)
  File "/affordance_diffusion/preprocess/generate_data.py", line 301, in batch_main
    glide['base'] = load_base()
  File "/affordance_diffusion/preprocess/generate_data.py", line 88, in load_base
    model.load_state_dict(load_checkpoint('base-inpaint', device))
  File "miniconda3/envs/afford_diff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1482, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Text2ImUNet:
size mismatch for input_blocks.0.0.weight: copying a param with shape torch.Size([192, 7, 3, 3]) from checkpoint, the shape in current model is torch.Size([192, 3, 3, 3]).

Process finished with exit code 1

Could you please help me understand the cause of this? It has been bothering me for a while. By the way, I also wanted to ask about the first stage of data generation, the decoding phase. I couldn't find any data in the output path. Is this normal? I hope to receive your response. Thank you!

Ckpt Loading error

Hi all,

Thanks for your work! I am trying to fine-tune the layout model you provide on my data, but I get the following warning while launching the fine-tuning command:

$ python -m models.base -m  --config-name=train \
  expname=reproduce/\${model.module} \
  model=layout 
[...]
[2023-07-31 12:27:02,222][root][WARNING] - Checkpoint misses key splat_to_mask.template
[2023-07-31 12:27:02,222][root][WARNING] - Checkpoint misses key splat_to_mask.ndcTloll
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key proj_in_param_img.weight
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key proj_in_param_img.bias
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_img.0.norm.weight
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_img.0.norm.bias
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_img.0.qkv.weight
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_img.0.qkv.bias
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_img.0.encoder_kv.weight
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_img.0.encoder_kv.bias
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_img.0.proj_out.weight
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_img.0.proj_out.bias
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_txt.0.norm.weight
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_txt.0.norm.bias
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_txt.0.qkv.weight
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_txt.0.qkv.bias
[2023-07-31 12:27:02,223][root][WARNING] - Checkpoint misses key spatial_txt.0.encoder_kv.weight
[2023-07-31 12:27:02,224][root][WARNING] - Checkpoint misses key spatial_txt.0.encoder_kv.bias
[2023-07-31 12:27:02,224][root][WARNING] - Checkpoint misses key spatial_txt.0.proj_out.weight
[2023-07-31 12:27:02,224][root][WARNING] - Checkpoint misses key spatial_txt.0.proj_out.bias
[2023-07-31 12:27:02,224][root][WARNING] - Checkpoint misses key proj_out_param.weight
[2023-07-31 12:27:02,224][root][WARNING] - Checkpoint misses key proj_out_param.bias

Apparently, some of the weights are not loaded correctly. Is this expected?
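
Warnings of this shape typically come from non-strict checkpoint loading: the loader tolerates parameters that exist in the model but not in the checkpoint and reports them, leaving those modules at their fresh initialization, which is a common pattern when fine-tuning an extended architecture. A generic PyTorch illustration of the mechanism, not this repo's actual loader:

```python
import torch
from torch import nn

# Toy setup: the "checkpoint" comes from a smaller model than the one we build.
old = nn.Linear(4, 4)
new = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))  # extra untrained head

# strict=False reports, rather than raises on, the mismatch.
missing, unexpected = new.load_state_dict(
    {f"0.{k}": v for k, v in old.state_dict().items()}, strict=False
)
for key in missing:
    print(f"Checkpoint misses key {key}")  # 1.weight, 1.bias stay randomly initialized
```

If the missing keys correspond to modules added after the released checkpoint was trained, such warnings are typically expected.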

Module import error

Hi, thanks for sharing this wonderful work!
While following your instructions to execute the inference.py script, I encountered an issue within the three_d_metric method. Specifically, the demo_handmocap function attempts to import a module named jutils. However, this action triggers an error:

ModuleNotFoundError: No module named 'jutils'

I am wondering whether the jutils referenced here corresponds to the same jutils module utilized within the affordance_diffusion project. Could you please confirm whether they are identical, or whether additional steps are required to resolve this import error?

Thanks!

Interpolation error

When I try to run the demo python -m scripts.interpolate dir=docs/demo_inter, it raises FileNotFoundError: No such file: '/home/chen/Projects/affordance_diffusion/docs/demo_inter/inter/superres/0000_01_s0_00_s0.png'.

ImportError: cannot import name 'Ego_Centric_HOI_Detector' from 'handmocap.hand_bbox_detector'

Thank you for sharing such fantastic work! I am interested in the hand contact evaluation shown in inference. However, when running the code, an error appears: cannot import name 'Ego_Centric_HOI_Detector' from 'handmocap.hand_bbox_detector'.

Is it possible to give some hint on how to fix this problem or some details about how to evaluate the contact recall? Thank you so much for your time!

rm error

In the preprocessing stage, I met an error:

rm: cannot remove 'data/hoi4d/HOI4D_release/ZY20210800004/H4/C14/N21/S174/s02/T1/align_frames/': No such file or directory
rm -r data/hoi4d/HOI4D_release/ZY20210800004/H4/C14/N21/S174/s02/T1/align_frames/

rm: cannot remove 'data/hoi4d/HOI4D_release/ZY20210800004/H4/C14/N21/S174/s02/T1/align_frames/*': No such file or directory

My dataset folder structure is preprocess/data/hoi4d/HOI4D_annotations, and my --data_dir is data/hoi4d/. Is the align_frames folder missing, or is something wrong with my setup? Have you met this error before?

manopth module

ModuleNotFoundError: No module named 'manopth'. Where can I download the manopth module? Is it OK to just download the manopth project from GitHub and put it under jutils, or is a step missing from the documentation?

Is config.yaml missed?

After running python inference.py data.data_dir='docs/demo/*.*g' test_num=3, I get FileNotFoundError: [Errno 2] No such file or directory: '/data/PycharmProjects/affordance_diffusion/output/release/layout/config.yaml'. Do you know how to solve this problem? The same error occurs when running python -m scripts.interpolate dir=docs/demo_inter. Looking forward to your reply, thanks.

error in preprocessing

When running python generate_data.py --data_dir data/ --save_dir output/ --inpaint (I put the HOI4D_release and HOI4D_annotations datasets under the data folder), I get the error No such file or directory: data/HOI4D_release/ZY20210800001/.../align_frames/xxx.png. Do you know how to rectify this?
