

GLoT: Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation (CVPR2023)

Introduction

This repository is the official PyTorch implementation of Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation.

The base code is largely borrowed from VIBE and TCMR.

[Figure: GLoT framework overview]

See our paper for more details.

Results

Here we report the performance of GLoT.

[Table 1: results on 3DPW, Human3.6M, and MPI-INF-3DHP]

[Table 2: results on 3DPW]

Running GLoT

Installation

conda create -n glot python=3.7 -y
pip install torch==1.4.0 torchvision==0.5.0
pip install -r requirements.txt
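
After activating the environment (conda activate glot) and installing the requirements, you can optionally run a quick sanity check that the expected versions are in place. This is just a sketch, not an official script:

# optional check, e.g. saved as check_install.py (hypothetical file name)
import torch
import torchvision

print("torch:", torch.__version__)              # expected 1.4.0
print("torchvision:", torchvision.__version__)  # expected 0.5.0
print("CUDA available:", torch.cuda.is_available())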

Data preparation

  1. Download base_data and the SMPL pkl files (male & female, and neutral), and put them into ${ROOT}/data/base_data/. Rename the SMPL pkl files to the SMPL_{GENDER}.pkl format, e.g., mv basicModel_neutral_lbs_10_207_0_v1.0.0.pkl SMPL_NEUTRAL.pkl.

  2. Download the data provided by TCMR (except the InstaVariety dataset). The pre-processed InstaVariety data is uploaded by the VIBE authors here. Put them into ${ROOT}/data/preprocessed_data/

  3. Download models for testing. Put them into ${ROOT}/data/pretrained_models/

  4. Download images (e.g., 3DPW) for rendering. Put them into ${ROOT}/data/3dpw/

The data directory structure should follow the hierarchy below; a small sanity-check sketch follows the tree.

${ROOT}  
|-- data  
  |-- base_data  
    |-- J_regressor_extra.npy  
    |-- ...
  |-- preprocessed_data
    |-- 3dpw_train_db.pt
    |-- ...
  |-- pretrained_models
    |-- table1_3dpw_weights.pth.tar
    |-- ...
  |-- 3dpw
    |-- imageFiles
      |-- courtyard_arguing_00
      |-- ...
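
Before running evaluation, you can catch misplaced files with a small check like the following. It is a hypothetical helper, not part of the repository, and the file list is only a subset of the tree above; run it from ${ROOT}.

# check_data.py (hypothetical helper)
import os

expected = [
    "data/base_data/J_regressor_extra.npy",
    "data/base_data/SMPL_NEUTRAL.pkl",
    "data/preprocessed_data/3dpw_train_db.pt",
    "data/pretrained_models/table1_3dpw_weights.pth.tar",
    "data/3dpw/imageFiles/courtyard_arguing_00",
]
for path in expected:
    status = "ok" if os.path.exists(path) else "MISSING"
    print(f"{status:7s} {path}")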

Evaluation

  • Run the evaluation code with a corresponding config file to reproduce the performance in the tables of our paper.
# Table1 3dpw
python evaluate.py --dataset 3dpw --cfg ./configs/repr_table1_3dpw.yaml --gpu 0 
# Table1 h36m
python evaluate.py --dataset h36m --cfg ./configs/repr_table1_h36m_mpii3d.yaml --gpu 0
# Table1 mpii3d
python evaluate.py --dataset mpii3d --cfg ./configs/repr_table1_h36m_mpii3d.yaml --gpu 0

# Table2 3dpw
python evaluate.py --dataset 3dpw --cfg ./configs/repr_table2_3dpw.yaml --gpu 0 

# for rendering 
python evaluate.py --dataset 3dpw --cfg ./configs/repr_table1_3dpw.yaml --gpu 0 --render

Reproduction (Training)

  • Run the training code with a corresponding config file to reproduce the performance in the tables of our paper.
# Table1 3dpw
python train_cosine_trans.py --cfg ./configs/repr_table1_3dpw.yaml --gpu 0 

# Table1 h36m & mpii3d
python train_cosine_trans.py --cfg ./configs/repr_table1_h36m_mpii3d.yaml --gpu 0 

# Table2 3dpw
python train_cosine_trans.py --cfg ./configs/repr_table2_3dpw.yaml --gpu 0 
  • After training, set TRAIN.PRETRAINED in the config file to the checkpoint path (either checkpoint.pth.tar or model_best.pth.tar) and follow the evaluation command; a minimal sketch of this edit is shown below.
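
A minimal sketch of that config edit, assuming PyYAML is available; the checkpoint path below is a placeholder for your own experiment directory, and editing the YAML by hand works just as well.

# set_pretrained.py (hypothetical helper)
import yaml

cfg_path = "./configs/repr_table1_3dpw.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# Point TRAIN.PRETRAINED at the checkpoint written during training
# (either checkpoint.pth.tar or model_best.pth.tar).
cfg.setdefault("TRAIN", {})["PRETRAINED"] = "<your_experiment_dir>/model_best.pth.tar"

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)  # note: safe_dump drops comments and reorders keys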

Quick demo

  • Prepare your own video (e.g., demo.mp4) and run the following command.
python demo.py --vid_file demo.mp4 --gpu 0 --cfg ./configs/repr_table1_3dpw.yaml 
  • The results will be saved in ./demo_output/demo/; a small listing sketch follows.
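
If you want to check programmatically what the demo wrote, here is a trivial listing sketch; the output file names depend on the demo script and are not guaranteed.

# list_demo_output.py (optional sketch)
import os

out_dir = "./demo_output/demo/"
for name in sorted(os.listdir(out_dir)):
    size_kb = os.path.getsize(os.path.join(out_dir, name)) / 1024
    print(f"{name}  ({size_kb:.1f} KB)")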

Reference

@inproceedings{shen2023global,
  title={Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation},
  author={Shen, Xiaolong and Yang, Zongxin and Wang, Xiaohan and Ma, Jianxin and Zhou, Chang and Yang, Yi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8887--8896},
  year={2023}
}

License

This project is licensed under the terms of the MIT license.


Issues

When I was validating the model, I couldn't match the parameters. Is there a solution?

D:\tools\anaconda3\envs\GLOT2\python.exe "D:\python Project\GLoT-main\demo.py"
Running "ffmpeg -i ./video/taiji4.mp4 -r 30000/1001 -f image2 -v error ./tmp\taiji4_mp4/%06d.jpg"
Images saved to "./tmp\taiji4_mp4"
Input video number of frames 1507

C:\Users\MIO\.torch/models/yolov3.weights
Running Multi-Person-Tracker
100%|██████████| 126/126 [01:07<00:00, 1.86it/s]
Finished. Detection + Tracking FPS 22.22
=> loaded pretrained model from './data/base_data\spin_model_checkpoint.pth.tar'
Load pretrained weights from './data/pretrained_models/table1_3dpw_weights.pth.tar'
Traceback (most recent call last):
  File "D:\python Project\GLoT-main\demo.py", line 382, in <module>
    main(args, cfgs)
  File "D:\python Project\GLoT-main\demo.py", line 118, in main
    model.load_state_dict(ckpt, strict=False)
  File "D:\tools\anaconda3\envs\GLOT2\lib\site-packages\torch\nn\modules\module.py", line 1052, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for GLoT:
    size mismatch for regressor.smpl.shapedirs: copying a param with shape torch.Size([6890, 3, 10]) from checkpoint, the shape in current model is torch.Size([6890, 3, 300]).
    size mismatch for global_modeling.regressor.smpl.shapedirs: copying a param with shape torch.Size([6890, 3, 10]) from checkpoint, the shape in current model is torch.Size([6890, 3, 300]).

Process finished with exit code 1

ImportError: ('Unable to load EGL library', 22, '找不到指定的模块。', None, 126, None, 'EGL', None)

Hi, I was interested in your model, so I tried to run your demo.py. I configured the environment according to the README and encountered the following error when running demo.py.

Traceback (most recent call last):
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\egl.py", line 70, in EGL
    mode=ctypes.RTLD_GLOBAL
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\ctypesloader.py", line 45, in loadLibrary
    return dllType( name, mode )
  File "C:\Users\meruijingz\.conda\envs\glot\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] 找不到指定的模块。 (The specified module could not be found.)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "demo.py", line 38, in <module>
    from lib.utils.renderer import Renderer
  File "D:\GLoT\lib\utils\renderer.py", line 5, in <module>
    import pyrender
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\pyrender\__init__.py", line 3, in <module>
    from .light import Light, PointLight, DirectionalLight, SpotLight
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\pyrender\light.py", line 11, in <module>
    from .texture import Texture
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\pyrender\texture.py", line 8, in <module>
    from OpenGL.GL import *
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\GL\__init__.py", line 3, in <module>
    from OpenGL import error as _error
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\error.py", line 12, in <module>
    from OpenGL import platform, configflags
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\__init__.py", line 35, in <module>
    _load()
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\__init__.py", line 32, in _load
    plugin.install(globals())
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\baseplatform.py", line 92, in install
    namespace[ name ] = getattr(self,name,None)
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\egl.py", line 93, in GetCurrentContext
    return self.EGL.eglGetCurrentContext
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\egl.py", line 73, in EGL
    raise ImportError("Unable to load EGL library", *err.args)
ImportError: ('Unable to load EGL library', 22, '找不到指定的模块。', None, 126, None, 'EGL', None)

I tried commenting out the line os.environ['PYOPENGL_PLATFORM'] = 'egl', but nothing changed; I still got this error. How can I fix this problem?

Inquiry about the impact of the AMASS dataset

Hello! I noticed that VIBE uses the AMASS dataset. However, it seems that TCMR does not use it, while MPS-Net does. I'm wondering whether using this dataset in GLoT has a significant impact on the results.

Curious about the config

Dear author,

Thank you for your great work! Recently I have been trying to replicate it. I noticed that use_accel is set to False in all three configs provided. Should it be True instead (since it is mentioned in your method)?

Thank you so much.

Exploring the Challenges: Reproducing the Results with repr_table1_3dpw.yaml Configuration

Hello, author. I used the settings from the repr_table1_3dpw.yaml file you provided, but I could not train a model that reproduces the same results. Could there be any other details that I overlooked?

{'CUDNN': CfgNode({'BENCHMARK': True, 'DETERMINISTIC': False, 'ENABLED': True}),
'DATASET': CfgNode({'SEQLEN': 16, 'OVERLAP': 0.5}),
'DEBUG': False,
'DEBUG_FREQ': 5,
'DEVICE': 'cuda',
'EXP_NAME': 'table1_3dpw_exp',
'LOGDIR': '/home/xyh/project/GLot/exp_dir/experiments/21-07-2023_19-27-43_table1_3dpw_exp',
'LOSS': {'D_MOTION_LOSS_W': 0.0,
'KP_2D_W': 300.0,
'KP_3D_W': 300.0,
'POSE_W': 60.0,
'SHAPE_W': 0.06,
'use_accel': False,
'vel_or_accel_2d_weight': 10.0,
'vel_or_accel_3d_weight': 100.0},
'MODEL': {'MODEL_NAME': 'GLoT',
'atten_drop': 0.0,
'd_model': 512,
'drop_path_r': 0.2,
'drop_reg_short': 0.25,
'dropout': 0.1,
'mask_ratio': 0.5,
'n_layers': 2,
'num_head': 8,
'short_atten_drop': 0.0,
'short_d_model': 256,
'short_drop_path_r': 0.2,
'short_dropout': 0.1,
'short_n_layers': 3,
'short_num_head': 8,
'stride_short': 4},
'NUM_WORKERS': 16,
'OUTPUT_DIR': '/home/xyh/project/GLot/exp_dir/experiments',
'SEED_VALUE': 1,
'TITLE': 'repr_table4_3dpw_model',
'TRAIN': {'BATCH_SIZE': 64,
'DATASETS_2D': ['Insta'],
'DATASETS_3D': ['ThreeDPW', 'MPII3D', 'Human36M'],
'DATASET_EVAL': 'ThreeDPW',
'DATA_2D_RATIO': 0.6,
'END_EPOCH': 50,
'GEN_LR': 0.0001,
'GEN_MOMENTUM': 0.9,
'GEN_OPTIM': 'Adam',
'GEN_WD': 0.0,
'LR_PATIENCE': 5,
'MOT_DISCR': {'ATT': {'DROPOUT': 0.2,
'LAYERS': 3,
'SIZE': 1024},
'FEATURE_POOL': 'attention',
'HIDDEN_SIZE': 1024,
'LR': 0.0001,
'MOMENTUM': 0.9,
'NUM_LAYERS': 2,
'OPTIM': 'Adam',
'UPDATE_STEPS': 1,
'WD': 0.0001},
'NUM_ITERS_PER_EPOCH': 1000,
'OVERLAP': 0.0,
'PRETRAINED': '/home/xyh/project/GLot/exp_dir/experiments/21-07-2023_17-08-53_table1_3dpw_exp/checkpoint.pth.tar',
'PRETRAINED_REGRESSOR': '/home/xyh/data/base_data/spin_model_checkpoint.pth.tar',
'RESUME': '',
'START_EPOCH': 0,
'val_epoch': 25},
'render': False}

About the training parameters

The datasets and the pretrained model you used are the same as TCMR's, but why are the shapes of target_3d['w_3d'] and target['w_smpl'] different from TCMR's when loading the datasets? Where can this be modified?

Query About Training & Evaluation Strategy

Hello,

Thank you for your interesting work, and for making the source code public.

Could you please clarify whether you use the same pretrained checkpoint to evaluate your method on all three datasets (3DPW, Human3.6M, and MPI-INF-3DHP), or different models for each evaluation? From the paper, I thought you were using a single checkpoint for all evaluations, but looking at the evaluation instructions and this issue, I understand that you use different checkpoints for different datasets. Hence, I am confused. Kindly clarify.

Question about training dataset in table 1

Hello author, what a great work!

Looking at repr_table1_3dpw.yaml, the datasets you used are Insta, 3DPW, Human3.6M, and 3DHP.

However, your paper says that you also used PennAction and PoseTrack as training databases. I wonder which training datasets you used for the 3DPW performance in Table 1.
