

GLoT: Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation (CVPR2023)

Introduction

This repository is the official PyTorch implementation of Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation.

The base code is largely borrowed from VIBE and TCMR.

[Figure: GLoT framework overview]

See our paper for more details.

Results

Here we report the performance of GLoT.

[Table 1: results on 3DPW, Human3.6M, and MPI-INF-3DHP]

[Table 2: results on 3DPW]

Running GLoT

Installation

conda create -n glot python=3.7 -y
pip install torch==1.4.0 torchvision==0.5.0
pip install -r requirements.txt
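
After activating the environment (conda activate glot) and installing the requirements, you can optionally run a quick sanity check that the expected versions are in place. This is just a sketch, not an official script:

# optional check, e.g. saved as check_install.py (hypothetical file name)
import torch
import torchvision

print("torch:", torch.__version__)              # expected 1.4.0
print("torchvision:", torchvision.__version__)  # expected 0.5.0
print("CUDA available:", torch.cuda.is_available())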

Data preparation

  1. Download base_data and the SMPL pkl files (male & female, and neutral), and put them into ${ROOT}/data/base_data/. Rename the SMPL pkl files to the SMPL_{GENDER}.pkl format, e.g., mv basicModel_neutral_lbs_10_207_0_v1.0.0.pkl SMPL_NEUTRAL.pkl.

  2. Download the data provided by TCMR (except the InstaVariety dataset). The pre-processed InstaVariety data is uploaded by the VIBE authors here. Put them into ${ROOT}/data/preprocessed_data/

  3. Download models for testing. Put them into ${ROOT}/data/pretrained_models/

  4. Download images (e.g., 3DPW) for rendering. Put them into ${ROOT}/data/3dpw/

The data directory structure should follow the hierarchy below; a small sanity-check sketch follows the tree.

${ROOT}  
|-- data  
  |-- base_data  
    |-- J_regressor_extra.npy  
    |-- ...
  |-- preprocessed_data
    |-- 3dpw_train_db.pt
    |-- ...
  |-- pretrained_models
    |-- table1_3dpw_weights.pth.tar
    |-- ...
  |-- 3dpw
    |-- imageFiles
      |-- courtyard_arguing_00
      |-- ...
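
Before running evaluation, you can catch misplaced files with a small check like the following. It is a hypothetical helper, not part of the repository, and the file list is only a subset of the tree above; run it from ${ROOT}.

# check_data.py (hypothetical helper)
import os

expected = [
    "data/base_data/J_regressor_extra.npy",
    "data/base_data/SMPL_NEUTRAL.pkl",
    "data/preprocessed_data/3dpw_train_db.pt",
    "data/pretrained_models/table1_3dpw_weights.pth.tar",
    "data/3dpw/imageFiles/courtyard_arguing_00",
]
for path in expected:
    status = "ok" if os.path.exists(path) else "MISSING"
    print(f"{status:7s} {path}")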

Evaluation

  • Run the evaluation code with a corresponding config file to reproduce the performance in the tables of our paper.
# Table1 3dpw
python evaluate.py --dataset 3dpw --cfg ./configs/repr_table1_3dpw.yaml --gpu 0 
# Table1 h36m
python evaluate.py --dataset h36m --cfg ./configs/repr_table1_h36m_mpii3d.yaml --gpu 0
# Table1 mpii3d
python evaluate.py --dataset mpii3d --cfg ./configs/repr_table1_h36m_mpii3d.yaml --gpu 0

# Table2 3dpw
python evaluate.py --dataset 3dpw --cfg ./configs/repr_table2_3dpw.yaml --gpu 0 

# for rendering 
python evaluate.py --dataset 3dpw --cfg ./configs/repr_table1_3dpw.yaml --gpu 0 --render

Reproduction (Training)

  • Run the training code with a corresponding config file to reproduce the performance in the tables of our paper.
# Table1 3dpw
python train_cosine_trans.py --cfg ./configs/repr_table1_3dpw.yaml --gpu 0 

# Table1 h36m & mpii3d
python train_cosine_trans.py --cfg ./configs/repr_table1_h36m_mpii3d.yaml --gpu 0 

# Table2 3dpw
python train_cosine_trans.py --cfg ./configs/repr_table2_3dpw.yaml --gpu 0 
  • After training, set TRAIN.PRETRAINED in the config file to the checkpoint path (either checkpoint.pth.tar or model_best.pth.tar) and follow the evaluation command; a minimal sketch of this edit is shown below.
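
A minimal sketch of that config edit, assuming PyYAML is available; the checkpoint path below is a placeholder for your own experiment directory, and editing the YAML by hand works just as well.

# set_pretrained.py (hypothetical helper)
import yaml

cfg_path = "./configs/repr_table1_3dpw.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# Point TRAIN.PRETRAINED at the checkpoint written during training
# (either checkpoint.pth.tar or model_best.pth.tar).
cfg.setdefault("TRAIN", {})["PRETRAINED"] = "<your_experiment_dir>/model_best.pth.tar"

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)  # note: safe_dump drops comments and reorders keys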

Quick demo

  • Prepare your own video (e.g., demo.mp4) and run the following command.
python demo.py --vid_file demo.mp4 --gpu 0 --cfg ./configs/repr_table1_3dpw.yaml 
  • The results will be saved in ./demo_output/demo/; a small listing sketch follows.
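
If you want to check programmatically what the demo wrote, here is a trivial listing sketch; the output file names depend on the demo script and are not guaranteed.

# list_demo_output.py (optional sketch)
import os

out_dir = "./demo_output/demo/"
for name in sorted(os.listdir(out_dir)):
    size_kb = os.path.getsize(os.path.join(out_dir, name)) / 1024
    print(f"{name}  ({size_kb:.1f} KB)")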

Reference

@inproceedings{shen2023global,
  title={Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation},
  author={Shen, Xiaolong and Yang, Zongxin and Wang, Xiaohan and Ma, Jianxin and Zhou, Chang and Yang, Yi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8887--8896},
  year={2023}
}

License

This project is licensed under the terms of the MIT license.


Issues

When I was validating the model, I couldn't match the parameters. Is there a solution?

D:\tools\anaconda3\envs\GLOT2\python.exe "D:\python Project\GLoT-main\demo.py"
Running "ffmpeg -i ./video/taiji4.mp4 -r 30000/1001 -f image2 -v error ./tmp\taiji4_mp4/%06d.jpg"
Images saved to "./tmp\taiji4_mp4"
Input video number of frames 1507

C:\Users\MIO\.torch/models/yolov3.weights
Running Multi-Person-Tracker
100%|██████████| 126/126 [01:07<00:00, 1.86it/s]
Finished. Detection + Tracking FPS 22.22
=> loaded pretrained model from './data/base_data\spin_model_checkpoint.pth.tar'
Load pretrained weights from './data/pretrained_models/table1_3dpw_weights.pth.tar'
Traceback (most recent call last):
  File "D:\python Project\GLoT-main\demo.py", line 382, in <module>
    main(args, cfgs)
  File "D:\python Project\GLoT-main\demo.py", line 118, in main
    model.load_state_dict(ckpt, strict=False)
  File "D:\tools\anaconda3\envs\GLOT2\lib\site-packages\torch\nn\modules\module.py", line 1052, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for GLoT:
    size mismatch for regressor.smpl.shapedirs: copying a param with shape torch.Size([6890, 3, 10]) from checkpoint, the shape in current model is torch.Size([6890, 3, 300]).
    size mismatch for global_modeling.regressor.smpl.shapedirs: copying a param with shape torch.Size([6890, 3, 10]) from checkpoint, the shape in current model is torch.Size([6890, 3, 300]).

Process finished with exit code 1

ImportError: ('Unable to load EGL library', 22, '找不到指定的模块。', None, 126, None, 'EGL', None)

Hi, I was interested in your model, so I tried to run your demo.py. I configured the environment according to the README and encountered the following error when running demo.py.

Traceback (most recent call last):
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\egl.py", line 70, in EGL
    mode=ctypes.RTLD_GLOBAL
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\ctypesloader.py", line 45, in loadLibrary
    return dllType( name, mode )
  File "C:\Users\meruijingz\.conda\envs\glot\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] 找不到指定的模块。 (The specified module could not be found.)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "demo.py", line 38, in <module>
    from lib.utils.renderer import Renderer
  File "D:\GLoT\lib\utils\renderer.py", line 5, in <module>
    import pyrender
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\pyrender\__init__.py", line 3, in <module>
    from .light import Light, PointLight, DirectionalLight, SpotLight
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\pyrender\light.py", line 11, in <module>
    from .texture import Texture
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\pyrender\texture.py", line 8, in <module>
    from OpenGL.GL import *
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\GL\__init__.py", line 3, in <module>
    from OpenGL import error as _error
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\error.py", line 12, in <module>
    from OpenGL import platform, configflags
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\__init__.py", line 35, in <module>
    _load()
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\__init__.py", line 32, in _load
    plugin.install(globals())
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\baseplatform.py", line 92, in install
    namespace[ name ] = getattr(self,name,None)
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\egl.py", line 93, in GetCurrentContext
    return self.EGL.eglGetCurrentContext
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "C:\Users\meruijingz\.conda\envs\glot\lib\site-packages\OpenGL\platform\egl.py", line 73, in EGL
    raise ImportError("Unable to load EGL library", *err.args)
ImportError: ('Unable to load EGL library', 22, '找不到指定的模块。', None, 126, None, 'EGL', None)

I tried commenting out the line os.environ['PYOPENGL_PLATFORM'] = 'egl', but nothing changed; I still got this error. How can I fix this problem?

Inquiry about the impact of the AMASS dataset

Hello! I noticed that VIBE uses the AMASS dataset. However, it seems that TCMR does not use it, while MPS-Net does. I'm wondering whether using this dataset in GLoT has a significant impact on the results.

Curious about the config

Dear author,

Thank you for your great work! Recently I have been trying to replicate it. I noticed that use_accel is set to False in all three configs provided. Should it be True instead (since it is mentioned in your method)?

Thank you so much.

Exploring the Challenges: Reproducing the Results with repr_table1_3dpw.yaml Configuration

Hello, author. I used the settings from the repr_table1_3dpw.yaml file you provided, but I could not train a model that reproduces the same results. Could there be any other details that I overlooked?

{'CUDNN': CfgNode({'BENCHMARK': True, 'DETERMINISTIC': False, 'ENABLED': True}),
'DATASET': CfgNode({'SEQLEN': 16, 'OVERLAP': 0.5}),
'DEBUG': False,
'DEBUG_FREQ': 5,
'DEVICE': 'cuda',
'EXP_NAME': 'table1_3dpw_exp',
'LOGDIR': '/home/xyh/project/GLot/exp_dir/experiments/21-07-2023_19-27-43_table1_3dpw_exp',
'LOSS': {'D_MOTION_LOSS_W': 0.0,
'KP_2D_W': 300.0,
'KP_3D_W': 300.0,
'POSE_W': 60.0,
'SHAPE_W': 0.06,
'use_accel': False,
'vel_or_accel_2d_weight': 10.0,
'vel_or_accel_3d_weight': 100.0},
'MODEL': {'MODEL_NAME': 'GLoT',
'atten_drop': 0.0,
'd_model': 512,
'drop_path_r': 0.2,
'drop_reg_short': 0.25,
'dropout': 0.1,
'mask_ratio': 0.5,
'n_layers': 2,
'num_head': 8,
'short_atten_drop': 0.0,
'short_d_model': 256,
'short_drop_path_r': 0.2,
'short_dropout': 0.1,
'short_n_layers': 3,
'short_num_head': 8,
'stride_short': 4},
'NUM_WORKERS': 16,
'OUTPUT_DIR': '/home/xyh/project/GLot/exp_dir/experiments',
'SEED_VALUE': 1,
'TITLE': 'repr_table4_3dpw_model',
'TRAIN': {'BATCH_SIZE': 64,
'DATASETS_2D': ['Insta'],
'DATASETS_3D': ['ThreeDPW', 'MPII3D', 'Human36M'],
'DATASET_EVAL': 'ThreeDPW',
'DATA_2D_RATIO': 0.6,
'END_EPOCH': 50,
'GEN_LR': 0.0001,
'GEN_MOMENTUM': 0.9,
'GEN_OPTIM': 'Adam',
'GEN_WD': 0.0,
'LR_PATIENCE': 5,
'MOT_DISCR': {'ATT': {'DROPOUT': 0.2,
'LAYERS': 3,
'SIZE': 1024},
'FEATURE_POOL': 'attention',
'HIDDEN_SIZE': 1024,
'LR': 0.0001,
'MOMENTUM': 0.9,
'NUM_LAYERS': 2,
'OPTIM': 'Adam',
'UPDATE_STEPS': 1,
'WD': 0.0001},
'NUM_ITERS_PER_EPOCH': 1000,
'OVERLAP': 0.0,
'PRETRAINED': '/home/xyh/project/GLot/exp_dir/experiments/21-07-2023_17-08-53_table1_3dpw_exp/checkpoint.pth.tar',
'PRETRAINED_REGRESSOR': '/home/xyh/data/base_data/spin_model_checkpoint.pth.tar',
'RESUME': '',
'START_EPOCH': 0,
'val_epoch': 25},
'render': False}

About the training parameters

The datasets and the pretrained model you used are the same as TCMR's, but why are the shapes of target_3d['w_3d'] and target['w_smpl'] different from TCMR's when loading the datasets? Where can this be modified?

Query About Training & Evaluation Strategy

Hello,

Thank you for your interesting work, and for making the source code public.

Could you please clarify whether you use the same pretrained checkpoint to evaluate your method on all three datasets (3DPW, Human3.6M, and MPI-INF-3DHP), or different models for each evaluation? From the paper, I thought you were using a single checkpoint for all evaluations, but looking at the evaluation instructions and this issue, I understand that you use different checkpoints for different datasets. Hence, I am confused. Kindly clarify.

Question about training dataset in table 1

Hello author, what a great work!

Looking at repr_table1_3dpw.yaml, the datasets you used are Insta, 3DPW, Human3.6M, and 3DHP.

However, your paper says that you also used PennAction and PoseTrack as training databases. I wonder which training datasets you used for the 3DPW performance in Table 1.
