
silverster98 / humanise

Official implementation of the NeurIPS22 paper "HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes"

Home Page: https://silverster98.github.io/HUMANISE/

License: MIT License

Languages: Python 99.68%, Shell 0.32%
Topics: 3d-scene-understanding, deep-learning, motion-generation

humanise's People

Contributors: silverster98, thusiyuan

humanise's Issues

GPU resource in training

Hi, I noticed that you mentioned using a V100 GPU with a batch size of 32 for training. However, I found it hard to fit a batch size of 32 on a single V100 GPU. Could you share more details about your environment and training setup?
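
Not the authors' training setup, but if GPU memory is the limiting factor, a generic workaround is gradient accumulation, which emulates an effective batch size of 32 from smaller micro-batches; a minimal PyTorch sketch with toy stand-ins for the model, optimizer, and data:

import torch
import torch.nn as nn

# toy stand-ins; the real model and dataloader come from the training script
model = nn.Linear(16, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
dataloader = [torch.randn(8, 16) for _ in range(16)]  # micro-batches of 8

accum_steps = 4  # 4 micro-batches of 8 emulate an effective batch size of 32

optimizer.zero_grad()
for step, batch in enumerate(dataloader):
    loss = model(batch).mean()
    (loss / accum_steps).backward()  # scale so gradients average over the accumulated micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()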

How to render a demo from a different view?

Hi, I noticed that you use a fixed top-down view to visualize the results here. How could I modify the camera pose to render from another viewpoint, such as looking from one corner of the room?
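
Independently of the repository's visualizer (not specified here), one generic approach is to build a look-at camera pose and pass it to the renderer; a minimal numpy sketch for a z-up scene, where look_at and the eye/target values are illustrative (most renderers accept this camera-to-world matrix, or its inverse as the extrinsic):

import numpy as np

def look_at(eye, target, up=(0.0, 0.0, 1.0)):
    # camera-to-world pose (4x4) for a camera at `eye` looking at `target`, z-up world
    eye, target, up = map(np.asarray, (eye, target, up))
    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    pose = np.eye(4)
    pose[:3, 0] = right
    pose[:3, 1] = true_up
    pose[:3, 2] = -forward  # OpenGL-style cameras look down their -z axis
    pose[:3, 3] = eye
    return pose

# e.g. view from a room corner, 2 m high, looking at the room center
cam_pose = look_at(eye=(3.0, 3.0, 2.0), target=(0.0, 0.0, 1.0))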

{scene_id}_vh_clean_2.ply doesn't have a field named 'label'

Hi, I was running your script align_motion.py to align the motions with the scenes, and it seems that in the line labels[:] = plydata['vertex'].data['label'] you read the property 'label' from {scene_id}_vh_clean_2.ply.
However, I cannot find a 'label' property in that ply file.
Looking forward to your reply!
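
In case it helps: ScanNet stores per-vertex semantic labels in the separate {scene_id}_vh_clean_2.labels.ply file, not in {scene_id}_vh_clean_2.ply. A quick way to check which vertex properties a ply file actually exposes (the scene path below is illustrative):

from plyfile import PlyData

plydata = PlyData.read('scene0000_00_vh_clean_2.labels.ply')  # illustrative path
# list the per-vertex properties available in this file
print(plydata['vertex'].data.dtype.names)
# the labels.ply variant carries a per-vertex 'label' field
labels = plydata['vertex'].data['label']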

The process of calculating the generation metrics is extremely slow

The bottleneck is in SMPLX_Util.get_body_vertices_sequence, since it loads the pretrained SMPL-X weights repeatedly. For example, there are 1319 examples of the action walk, and with k == 10 the model gets loaded 13190 times, making the I/O time extremely long. My suggestion: instantiate a single SMPL-X model with batch_size=max_motion_len and select the unmasked SMPL-X parameters after inference (a sketch of this appears after the timing results below).
Here is my code to test the time cost of three modes:

import torch
import smplx
import mmengine as me


test_mode = 'cuda'  # one of: 'cpu', 'cuda', 'cuda_static'

device = 'cpu'
if test_mode in ['cuda', 'cuda_static']:
    device = 'cuda'

# random SMPL-X parameters for a 60-frame sequence
seq_len = 60
torch_param = dict()
torch_param['body_pose'] = torch.randn(seq_len, 63).to(device)
torch_param['betas'] = torch.randn(seq_len, 10).to(device)
torch_param['transl'] = torch.randn(seq_len, 3).to(device)
torch_param['global_orient'] = torch.randn(seq_len, 3).to(device)
torch_param['left_hand_pose'] = torch.randn(seq_len, 45).to(device)
torch_param['right_hand_pose'] = torch.randn(seq_len, 45).to(device)

# 'cuda_static' mode: build the SMPL-X model once and reuse it
static_model = smplx.create(model_path='data/models_smplx_v1_1/models',
                            model_type='smplx',
                            gender='neutral',
                            num_betas=10,
                            use_pca=False,
                            batch_size=seq_len,
                            ext='npz')

static_model = static_model.to(device)

for i in me.track_iter_progress(range(100)):
    if test_mode in ['cpu', 'cuda']:
        # 'cpu'/'cuda' modes re-create the model (reloading its weights) on every iteration
        model = smplx.create(model_path='data/models_smplx_v1_1/models',
                             model_type='smplx',
                             gender='neutral',
                             num_betas=10,
                             use_pca=False,
                             batch_size=seq_len,
                             ext='npz').to(device)
        output = model(return_verts=True, **torch_param)
    elif test_mode == 'cuda_static':
        output = static_model(return_verts=True, **torch_param)

When test_mode = 'cpu':
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 100/100, 1.7 task/s, elapsed: 58s, ETA: 0s
When test_mode = 'cuda':
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 100/100, 1.7 task/s, elapsed: 60s, ETA: 0s
When test_mode = 'cuda_static':
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 100/100, 40.2 task/s, elapsed: 2s, ETA: 0s
That is roughly 30x faster (58-60 s down to 2 s) with seq_len=60, which is half of max_motion_len.
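
A minimal sketch of the proposed fix, where MAX_MOTION_LEN and get_body_vertices_padded are illustrative names rather than the repository's API: build the SMPL-X model once with batch_size equal to the padded motion length, run inference on the padded parameters, and slice off the unmasked frames afterwards.

import torch
import smplx

MAX_MOTION_LEN = 120  # padded motion length (twice the seq_len used above)

# build the SMPL-X body model once and reuse it for every sample
body_model = smplx.create(model_path='data/models_smplx_v1_1/models',
                          model_type='smplx',
                          gender='neutral',
                          num_betas=10,
                          use_pca=False,
                          batch_size=MAX_MOTION_LEN,
                          ext='npz').to('cuda')

def get_body_vertices_padded(params, true_len):
    # params: dict of SMPL-X tensors padded to MAX_MOTION_LEN frames (on the same device)
    # true_len: number of valid (unmasked) frames in this motion
    with torch.no_grad():
        output = body_model(return_verts=True, **params)
    return output.vertices[:true_len]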

About aligning my own motion data with the HUMANISE setting

Wang, this is really great work! I'm new to the project and would like to compare HUMANISE with my method in my paper.
However, I am not sure how to align my motion data, which is generated by my own motion diffusion model, with your setting. Specifically, I don't know how to align the coordinate system of my motions with HUMANISE's: if I run visualize_dataset.py on my motion data directly, the generated motions have the wrong initial orientation in the visualization.
Could you provide details about the motion coordinate system used by HUMANISE? I would appreciate it.
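
While waiting for the official answer, a generic sketch of re-expressing an SMPL-X root trajectory in a different world frame, assuming the only mismatch is a known rigid rotation between up-axis conventions; rotate_root and the 90-degree example are illustrative and this is not HUMANISE's documented convention:

import numpy as np
from scipy.spatial.transform import Rotation as R

def rotate_root(global_orient, transl, R_fix):
    # global_orient: (T, 3) axis-angle root orientations; transl: (T, 3) root translations
    # note: SMPL-X applies global_orient about the pelvis joint, so an exact frame
    # change also involves the pelvis offset; this sketch ignores that detail
    new_orient = (R_fix * R.from_rotvec(global_orient)).as_rotvec()
    new_transl = R_fix.apply(transl)
    return new_orient, new_transl

# example: map a y-up motion into a z-up world (+90 degrees about the x axis)
R_fix = R.from_euler('x', 90, degrees=True)
new_orient, new_transl = rotate_root(np.zeros((60, 3)), np.random.randn(60, 3), R_fix)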

My.mp4

The code implementation for more actions

Hi! In your paper, you show some examples of generalizing to 'jump up', 'turn to', 'open', and 'place' motions. Could you provide the code for generating these types of motions? In data/align_motion.py (lines 1798-1808), I only find support for the four default actions:

## sample valid position and rotation according to sit action
if action == 'sit':
    action_align = SitAlign(self.annotations, self.instance_to_semantic, self.label_mapping, scene_path, static_scene, static_scene_label, translate_mat, body_vertices, joints_traj)
elif action == 'stand up':
    action_align = StandUpAlign(self.annotations, self.instance_to_semantic, self.label_mapping, scene_path, static_scene, static_scene_label, translate_mat, body_vertices, joints_traj)
elif action == 'walk':
    action_align = WalkAlign(self.annotations, self.instance_to_semantic, self.label_mapping, scene_path, static_scene, static_scene_label, translate_mat, body_vertices, joints_traj)
elif action == 'lie':
    action_align = LieAlign(self.annotations, self.instance_to_semantic, self.label_mapping, scene_path, static_scene, static_scene_label, translate_mat, body_vertices, joints_traj)
else:
    raise Exception('Unsupport action: {}'.format(action))

Thanks!

Align motion for ScanNet test split?

Hi @Silverster98

In the paper you mention that you follow ScanNet's original train-test split and get 16.5k motions in 543 scenes for training and 3.1k motions in 100 scenes for testing.

  1. How do you align motions for ScanNet's test split?
    In ScanNet's test set, I see that every scene only has "_00_vh_clean_2.ply" files and does not have the other label files, such as "_00_vh_clean_2.labels.ply", ".aggregation.json", and "_vh_clean_2.0.010000.segs.json", which are needed for motion alignment.
    Am I missing something?

  2. Also, a minor question:
    ScanNet has 707 scenes in its train split. Why does HUMANISE have 543 scenes in its train split? Is it because, while generating aligned motions, some scenes ended up with none of their 10 candidate motions satisfying the alignment constraints?

Looking forward to your reply.
Thank you!

Action-Specific Model path

I ran the evaluation script as follows but got very poor results: the avatar just flies around with meaningless motions. Are there any extra parameter settings or anything else I might have missed?
bash scripts/eval.sh 20220829_194320 "walk"

motion.mp4

Extract motion segments from AMASS with BABEL

Hi @Silverster98,
I was running python dataset/babel_process.py --action "walk" to extract walk motion segments. The total number of segments I get, irrespective of duration, is 6345, and the motions selected within the 1 s to 4 s duration range number 3844 (a duration-filter sketch is included after this list for reference).
However, in the dataset you provide, the pure_motion set for walk has only 777 motion segments.

  1. What duration range did you set for the HUMANISE dataset? Is it also 1 s to 4 s?
  2. Do you perform any manual/automatic checks to remove segments with inconsistent walk motion (since BABEL's annotations contain some errors) in order to go from 3844 to 777?

Thanks!
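
For reference, a minimal duration filter over BABEL frame-level annotations, assuming BABEL's released per-sequence JSON layout (frame_ann labels carrying proc_label, start_t, end_t); walk_segments is an illustrative helper and the exact checks used to build HUMANISE may differ:

import json

def walk_segments(babel_json_path, min_dur=1.0, max_dur=4.0):
    # assumes BABEL's per-sequence JSON with frame-level annotations
    with open(babel_json_path) as f:
        babel = json.load(f)
    segments = []
    for seq_id, ann in babel.items():
        frame_ann = ann.get('frame_ann')
        if frame_ann is None:  # some sequences only have sequence-level labels
            continue
        for seg in frame_ann['labels']:
            dur = seg['end_t'] - seg['start_t']
            if seg['proc_label'] == 'walk' and min_dur <= dur <= max_dur:
                segments.append((seq_id, seg['start_t'], seg['end_t']))
    return segments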

Visualization

Hi Wang,
Nice work! I notice that you visualize the generated human motions as .mp4 and .gif files. Since I want to use your method as one of our baselines and include qualitative comparisons, could you provide the code that renders all generated poses into a single .png file, like the following example?
image

Many Thanks!
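
Not the authors' visualizer, but a minimal offscreen-rendering sketch with trimesh + pyrender that composites several body meshes into one .png; the placeholder spheres stand in for per-frame SMPL-X body meshes, and the camera placement is arbitrary:

import numpy as np
import trimesh
import pyrender
import imageio

# placeholder meshes; in practice build trimesh.Trimesh objects from SMPL-X vertices/faces
body_meshes = [trimesh.creation.icosphere(radius=0.3) for _ in range(5)]
for i, m in enumerate(body_meshes):
    m.apply_translation([i * 0.8, 0.0, 0.0])  # spread the poses out for the overview image

scene = pyrender.Scene()
for m in body_meshes:
    scene.add(pyrender.Mesh.from_trimesh(m))

camera = pyrender.PerspectiveCamera(yfov=np.pi / 3.0)
cam_pose = np.eye(4)
cam_pose[:3, 3] = [1.6, 0.0, 5.0]  # arbitrary viewpoint looking down the -z axis
scene.add(camera, pose=cam_pose)
scene.add(pyrender.DirectionalLight(intensity=3.0), pose=cam_pose)

renderer = pyrender.OffscreenRenderer(1280, 720)
color, _ = renderer.render(scene)
imageio.imwrite('poses.png', color)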

Quantitative Evaluation

Hi, I ran the following command to evaluate the model.

bash scripts/eval_metric.sh 20220829_194320 "walk"

I get a file named recon.json in the folder, and I have a few questions.

  1. How can I obtain a generation.json file with the generation metrics?
  2. How can I reproduce Table 2 in your paper?

ScanNet V2 dataset

Thanks for sharing your code. I notice that you mention the ScanNet V2 dataset in Data Preparation. Do I need to download the whole ScanNet V2 dataset to train a model on your HUMANISE dataset?

Scan2CAD dataset annotations

Hello @Silverster98,
Thanks for sharing this detailed repo for your interesting work.

Can you please provide the Scan2CAD dataset's full_annotations.json, since the download link provided by the authors isn't working?

Thanks!

ModuleNotFoundError: No module named 'pointops_cuda'

Hello,

I am getting an error when executing bash scripts/train.sh sit:

  File "./project/HUMANISE/model/pointtransformer/pointops.py", line 7, in <module>
    import pointops_cuda
ModuleNotFoundError: No module named 'pointops_cuda'

I found the same error reported in POSTECH-CVLab/point-transformer#27.

But when I searched for the file under ./miniconda3 (e.g. find ./miniconda3/ -name "pointops*"), I realized that I don't have it at all.
