idea-research / osx

[CVPR 2023] Official implementation of the paper "One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer"

Home Page: https://osx-ubody.github.io/

License: MIT License

Python 97.12% C++ 0.24% Cuda 2.63% Shell 0.01%
3d-body-recovery cvpr2023 human-pose-estimation smpl-model smplx whole-body-pose-estimation

osx's Introduction

One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer

Authors

Jing Lin, Ailing Zeng, Haoqian Wang, Lei Zhang, Yu Li


The proposed UBody dataset

News

  • 2023.10.12 : UBody is now supported in MMPose. Please feel free to use it. 🌟
  • 2023.07.28 : UBody can boost 2D whole-body pose estimation and controllable image generation, especially for in-the-wild hand keypoint detection. The training and test code and pre-trained models are released. See details. 🥳
  • 2023.05.03 : UBody-V1 is released. We will release UBody-V2, which has manually annotated bboxes, later. 🕺
  • 2023.04.17 : We fix a rendering bug on A100/V100 and support YOLOv5 as a person detector in demo.py. 🚀
  • 2023.04.15 : We merge OSX into Grounded-SAM and support promptable 3D whole-body mesh recovery. 🔥


Demo of Grounded-SAM-OSX.

A person with pink clothes
A man with sunglasses

1. Introduction

This repo is the official PyTorch implementation of One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer (CVPR 2023). We propose the first one-stage whole-body mesh recovery method (OSX) and build a large-scale upper-body dataset (UBody). OSX is the top-1 method on the AGORA benchmark SMPL-X leaderboard (as of March 2023).

2. Create Environment

  • PyTorch >= 1.7 + CUDA

    We recommend installing it with the command below (a quick sanity check is sketched after this list):

    pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
  • Python packages:

    bash install.sh
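After installation, a minimal sanity check (a sketch; the printed version depends on what you installed) confirms that PyTorch can see the GPU:

# Quick environment sanity check: verify the installed PyTorch version and CUDA visibility.
import torch

print(torch.__version__)          # e.g. 1.11.0+cu113 if installed as recommended above
print(torch.cuda.is_available())  # should print True on a machine with a working CUDA setup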

3. Quick demo

  • Download the pre-trained OSX from here.
  • Place the pre-trained snapshot in the pretrained_models folder.
  • Prepare the human_model_files folder following the Directory section below and place it at common/utils/human_model_files.
  • Go to the demo folder and run python demo.py --gpu 0 --img_path IMG_PATH --output_folder OUTPUT_FOLDER. Replace IMG_PATH and OUTPUT_FOLDER with your own image path and output folder. For more efficient inference, you can add --decoder_setting wo_decoder --pretrained_model_path ../pretrained_models/osx_l_wo_decoder.pth.tar to use the encoder-only version of OSX.
  • If you run this code in an SSH environment without a display device, follow these steps (see the sketch after this list):
1. Install OSMesa following https://pyrender.readthedocs.io/en/latest/install/
2. Reinstall the specific PyOpenGL fork: https://github.com/mmatl/pyopengl
3. Set OpenGL's backend to egl or osmesa via os.environ["PYOPENGL_PLATFORM"] = "egl"
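A minimal sketch of step 3 (assuming pyrender is installed): the environment variable must be set before pyrender or OpenGL is imported, otherwise the default display-based backend is used.

# Select a headless OpenGL backend before any pyrender/OpenGL import.
import os
os.environ["PYOPENGL_PLATFORM"] = "egl"  # or "osmesa" if you installed OSMesa in step 1

import pyrender  # import only after the backend is selected

# Creating an offscreen renderer should now work without a display.
renderer = pyrender.OffscreenRenderer(viewport_width=512, viewport_height=512)
print("Offscreen renderer created without a display")
renderer.delete()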

4. Directory

(1) Root

The ${ROOT} directory is organized as follows.

${ROOT}  
|-- data  
|-- dataset
|-- demo
|-- main  
|-- pretrained_models
|-- tool
|-- output  
|-- common
|   |-- utils
|   |   |-- human_model_files
|   |   |   |-- smpl
|   |   |   |   |-- SMPL_NEUTRAL.pkl
|   |   |   |   |-- SMPL_MALE.pkl
|   |   |   |   |-- SMPL_FEMALE.pkl
|   |   |   |-- smplx
|   |   |   |   |-- MANO_SMPLX_vertex_ids.pkl
|   |   |   |   |-- SMPL-X__FLAME_vertex_ids.npy
|   |   |   |   |-- SMPLX_NEUTRAL.pkl
|   |   |   |   |-- SMPLX_to_J14.pkl
|   |   |   |   |-- SMPLX_NEUTRAL.npz
|   |   |   |   |-- SMPLX_MALE.npz
|   |   |   |   |-- SMPLX_FEMALE.npz
|   |   |   |-- mano
|   |   |   |   |-- MANO_LEFT.pkl
|   |   |   |   |-- MANO_RIGHT.pkl
|   |   |   |-- flame
|   |   |   |   |-- flame_dynamic_embedding.npy
|   |   |   |   |-- flame_static_embedding.pkl
|   |   |   |   |-- FLAME_NEUTRAL.pkl
  • data contains the data loading code.
  • dataset contains soft links to the image and annotation directories.
  • pretrained_models contains pretrained models.
  • demo contains the demo code.
  • main contains high-level code for training and testing the network.
  • tool contains pre-processing code for AGORA and PyTorch model-editing code.
  • output contains logs, trained models, visualized outputs, and test results.
  • common contains kernel code for Hand4Whole.
  • human_model_files contains the smpl, smplx, mano, and flame 3D model files. Download the files from [smpl] [smplx] [SMPLX_to_J14.pkl] [mano] [flame]. We provide the download links for each file here.

(2) Data

You need to follow the directory structure of the dataset as below.

${ROOT}  
|-- dataset  
|   |-- AGORA
|   |   |-- data
|   |   |   |-- AGORA_train.json
|   |   |   |-- AGORA_validation.json
|   |   |   |-- AGORA_test_bbox.json
|   |   |   |-- 1280x720
|   |   |   |-- 3840x2160
|   |-- EHF
|   |   |-- data
|   |   |   |-- EHF.json
|   |-- Human36M  
|   |   |-- images  
|   |   |-- annotations  
|   |-- MPII
|   |   |-- data
|   |   |   |-- images
|   |   |   |-- annotations
|   |-- MPI_INF_3DHP
|   |   |-- data
|   |   |   |-- images_1k
|   |   |   |-- MPI-INF-3DHP_1k.json
|   |   |   |-- MPI-INF-3DHP_camera_1k.json
|   |   |   |-- MPI-INF-3DHP_joint_3d.json
|   |   |   |-- MPI-INF-3DHP_SMPL_NeuralAnnot.json
|   |-- MSCOCO  
|   |   |-- images  
|   |   |   |-- train2017  
|   |   |   |-- val2017  
|   |   |-- annotations 
|   |-- PW3D
|   |   |-- data
|   |   |   |-- 3DPW_train.json
|   |   |   |-- 3DPW_validation.json
|   |   |   |-- 3DPW_test.json
|   |   |-- imageFiles
|   |-- UBody
|   |   |-- images
|   |   |-- videos
|   |   |-- annotations
|   |   |-- splits
|   |   |   |-- inter_scene_test_list.npy
|   |   |   |-- intra_scene_test_list.npy

(3) Output

You need to follow the directory structure of the output folder as below.

${ROOT}  
|-- output  
|   |-- log  
|   |-- model_dump  
|   |-- result  
|   |-- vis  
  • Creating the output folder as a soft link rather than a regular folder is recommended, since it can take up a large amount of storage (see the sketch after this list).
  • The log folder contains training log files.
  • The model_dump folder contains saved checkpoints for each epoch.
  • The result folder contains final estimation files generated in the testing stage.
  • The vis folder contains visualized results.
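A minimal sketch of creating output as a soft link to a larger storage location (the target path below is hypothetical; adjust it to your machine, or simply use ln -s):

import os

storage_dir = "/mnt/large_disk/osx_output"  # hypothetical location on a large disk
repo_output = "output"                      # run this from ${ROOT}

os.makedirs(storage_dir, exist_ok=True)
if not os.path.exists(repo_output):
    os.symlink(storage_dir, repo_output)    # ${ROOT}/output now points to the large disk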

5. Training OSX

(1) Download Pretrained Encoder

Download the pretrained encoders osx_vit_l.pth and osx_vit_b.pth from here and place them in pretrained_models/.

(2) Setting1: Train on MSCOCO, Human3.6m, MPII and Test on EHF and AGORA-val

In the main folder, run

python train.py --gpu 0,1,2,3 --lr 1e-4 --exp_name output/train_setting1 --end_epoch 14 --train_batch_size 16

After training, run the following commands to evaluate your trained model on EHF and AGORA-val:

# test on EHF
python test.py --gpu 0,1,2,3 --exp_name output/train_setting1/ --pretrained_model_path ../output/train_setting1/model_dump/snapshot_13.pth.tar --testset EHF
# test on AGORA-val
python test.py --gpu 0,1,2,3 --exp_name output/train_setting1/ --pretrained_model_path ../output/train_setting1/model_dump/snapshot_13.pth.tar --testset AGORA

To speed up training, you can use a lightweight version of OSX by changing the encoder setting (add --encoder_setting osx_b) or the decoder setting (add --decoder_setting wo_face_decoder). We recommend adding --decoder_setting wo_face_decoder, which noticeably speeds up training without a significant performance drop. Training takes about 20 hours on one NVIDIA A100.

(3) Setting2: Train on AGORA and Test on AGORA-test

In the main folder, run

python train.py --gpu 0,1,2,3 --lr 1e-4 --exp_name output/train_setting2 --end_epoch 140 --train_batch_size 16  --agora_benchmark --decoder_setting wo_decoder

After training, run the following command to evaluate your trained model on AGORA-test:

python test.py --gpu 0,1,2,3 --exp_name output/train_setting2/ --pretrained_model_path ../output/train_setting2/model_dump/snapshot_139.pth.tar --testset AGORA --agora_benchmark --test_batch_size 64 --decoder_setting wo_decoder

The reconstruction result will be saved at output/train_setting2/result/.

You can zip the predictions folder into predictions.zip and submit it to the AGORA benchmark to obtain the evaluation metrics.
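A minimal packaging sketch (the exact location of the predictions folder under output/train_setting2/result/ is an assumption; check what test.py actually wrote there):

import shutil

# Create predictions.zip from the predictions folder produced by test.py.
shutil.make_archive("predictions", "zip", "output/train_setting2/result/predictions")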

You can use a lightweight version of OSX by adding --encoder_setting osx_b.

(4) Setting3: Train on MSCOCO, Human3.6m, MPII, UBody-Train and Test on UBody-val

In the main folder, run

python train.py --gpu 0,1,2,3 --lr 1e-4 --exp_name output/train_setting3 --train_batch_size 16  --ubody_benchmark --decoder_setting wo_decoder

After training, run the following command to evaluate your trained model on UBody-test:

python test.py --gpu 0,1,2,3 --exp_name output/train_setting3/ --pretrained_model_path ../output/train_setting3/model_dump/snapshot_13.pth --testset UBody --test_batch_size 64 --decoder_setting wo_decoder 

The reconstruction result will be saved at output/train_setting3/result/.

6. Testing OSX

(1) Download Pretrained Models

Download the pretrained models osx_l.pth.tar and osx_l_agora.pth.tar from here and place them in pretrained_models/.

(2) Test on EHF

In the main folder, run

python test.py --gpu 0,1,2,3 --exp_name output/test_setting1 --pretrained_model_path ../pretrained_models/osx_l.pth.tar --testset EHF

(3) Test on AGORA-val

In the main folder, run

python test.py --gpu 0,1,2,3 --exp_name output/test_setting1 --pretrained_model_path ../pretrained_models/osx_l.pth.tar --testset AGORA

(4) Test on AGORA-test

In the main folder, run

python test.py --gpu 0,1,2,3 --exp_name output/test_setting2  --pretrained_model_path ../pretrained_models/osx_l_agora.pth.tar --testset AGORA --agora_benchmark --test_batch_size 64

The reconstruction result will be saved at output/test_setting2/result/.

You can zip the predictions folder into predictions.zip and submit it to the AGORA benchmark to obtain the evaluation metrics.

(5) Test on UBody-test

In the main folder, run

python test.py --gpu 0,1,2,3 --exp_name output/test_setting3  --pretrained_model_path ../pretrained_models/osx_l_wo_decoder.pth.tar --testset UBody --test_batch_size 64

The reconstruction result will be saved at output/test_setting3/result/.

7. Results

(1) AGORA test set

[results table image]

(2) AGORA-val, EHF, 3DPW

[results table images]

Troubleshooting

  • RuntimeError: Subtraction, the '-' operator, with a bool tensor is not supported. If you are trying to invert a mask, use the '~' or 'logical_not()' operator instead.: Go to here

  • TypeError: startswith first arg must be bytes or a tuple of bytes, not str.: Go to here.

Acknowledgement

This repo is mainly based on Hand4Whole. We thank Gyeongsik Moon for the well-organized code and his patient answers in the issues!

Reference

@inproceedings{lin2023one,
  title={One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer},
  author={Lin, Jing and Zeng, Ailing and Wang, Haoqian and Zhang, Lei and Li, Yu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={21159--21168},
  year={2023}
}

osx's People

Contributors

ailingzengzzz, guspan-tanadi, linjing7, osx-ubody, walterhuang23


osx's Issues

about inference time

Thank you for your great work. As shown in the paper, the inference time of OSX is 54.6 ms on an NVIDIA A100 GPU; has accelerated deployment (such as TensorRT) been used? In addition, is it possible to replace the backbone network with a more lightweight network, such as MobileNet?

Is it possible to make this model lighter for use on mobile phones?

Hi! Thank you for sharing your great work.

I have a question about porting to mobile phones.
I'm a newbie in this field, so please bear with me.

Do you think it is possible to make this model lighter for use on mobile phones?

If you think it is possible, could you give me any tips? Thanks!

demo/demo.py: virtual environment requirements for running demo.py

The instructions for demo.py do not specify the required environment, including the versions of mmcv, mmpose, and mmdet. I attempted to install mmcv==1.6.2, mmpose==0.29.0, and mmdet==2.23.0, but the following error is reported:
[error screenshot]
The file related to demo.py contains from config import cfg, which is an incorrect import path.
Fix:
Copying main/config.py to the project root directory solves the bug.

Run interrupted

[screenshots: IMG_20230625_161718, IMG_20230625_164206]
Two problems appeared during the run. The first is OMP: ERROR #15; after deleting libiomp5mb.dll as suggested online, the same problem appeared again on the second run.
The second is a data-crawling problem. As a newcomer to research, I would like to know whether there is a good solution.

Problem loading pre-trained models

When I load osx_l_agora.pth.tar and osx_l_wo_decoder.pth.tar, something goes wrong. This looks like a model-loading problem; which configuration files do I need to modify, or is the released model different from the latest code?

Thank you for your excellent work, and I look forward to your reply!

size mismatch for module.body_position_net.conv.0.weight: copying a param with shape torch.Size([400, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([400, 768, 1, 1]).
size mismatch for module.box_net.deconv.0.weight: copying a param with shape torch.Size([1424, 256, 4, 4]) from checkpoint, the shape in current model is torch.Size([1168, 256, 4, 4]).
size mismatch for module.hand_position_net.conv.0.weight: copying a param with shape torch.Size([320, 1024, 1, 1]) from checkpoint, the shape in current model is torch.Size([320, 384, 1, 1]).
size mismatch for module.face_regressor.expr_out.0.weight: copying a param with shape torch.Size([10, 1024]) from checkpoint, the shape in current model is torch.Size([10, 37592]).
size mismatch for module.face_regressor.jaw_pose_out.0.weight: copying a param with shape torch.Size([6, 1024]) from checkpoint, the shape in current model is torch.Size([6, 37592]).
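A minimal diagnostic sketch (the 'network' key name is an assumption; adjust to whatever torch.load actually returns): inspecting the shapes of the layers named in the error usually reveals whether the checkpoint was trained with a different --encoder_setting or --decoder_setting than the current config.

import torch

ckpt = torch.load("../pretrained_models/osx_l_wo_decoder.pth.tar", map_location="cpu")
state_dict = ckpt.get("network", ckpt)  # 'network' key is an assumption; fall back to the raw dict

# Print the shapes of the layers that the error message complains about.
for name, tensor in state_dict.items():
    if "face_regressor" in name or "body_position_net" in name:
        print(name, tuple(tensor.shape))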

flame_static_embedding.pkl file pickle load issue

I am working on Linux, and there is a pickle load issue with the flame_static_embedding.pkl file:

grounded-sam-osx/utils/smplx/smplx/body_models.py:1860 in __init__

    with open(landmark_bcoord_filename, 'rb') as fp:
        print(f"landmark_bcoord_filename:{landmark_bcoord_filename}")
        landmarks_data = pickle.load(fp, encoding='latin1')

    lmk_faces_idx = landmarks_data['lmk_face_idx'].astype(np.int64)
    self.register_buffer('lmk_faces_idx', ...

The pickle file flame_static_embedding.pkl from the repo https://github.com/soubhiksanyal/RingNet may have been written on Windows, and the file fails to load on Linux.
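A quick check (a sketch) of whether the downloaded file itself is a readable pickle; if this fails on an otherwise working Python install, the file was likely corrupted in transfer (for example by line-ending conversion) and should be re-downloaded as a binary file:

import pickle

with open("flame_static_embedding.pkl", "rb") as fp:
    data = pickle.load(fp, encoding="latin1")

# The FLAME static embedding should expose barycentric landmark data.
print(type(data))
print(list(data.keys()) if hasattr(data, "keys") else data)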

Problem when running the demo on Ubuntu

I have everything ready and run "python demo.py --gpu 0 --img_path input.png --output_folder=../output", but I get the following:

Using GPU: 0
WARNING: You are using a SMPL model, with only 10 shape coefficients.
WARNING: You are using a SMPL model, with only 10 shape coefficients.
WARNING: You are using a SMPL model, with only 10 shape coefficients.
/home/hume/anaconda3/lib/python3.10/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
04-26 17:54:56 Load checkpoint from ../pretrained_models/osx_l.pth.tar
04-26 17:54:56 Creating graph...
Traceback (most recent call last):
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
return obj_cls(**args)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmpose/models/detectors/top_down.py", line 48, in init
self.backbone = builder.build_backbone(backbone)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmpose/models/builder.py", line 19, in build_backbone
return BACKBONES.build(cfg)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmcv/utils/registry.py", line 237, in build
return self.build_func(*args, **kwargs, registry=self)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmcv/utils/registry.py", line 61, in build_from_cfg
raise KeyError(
KeyError: 'ViT is not in the models registry'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/hume/PycharmProjects/OSX/demo/demo.py", line 44, in
demoer._make_model()
File "/home/hume/PycharmProjects/OSX/main/../common/base.py", line 207, in _make_model
model = get_model('test')
File "/home/hume/PycharmProjects/OSX/demo/../main/OSX.py", line 418, in get_model
vit = build_posenet(vit_cfg.model)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmpose/models/builder.py", line 39, in build_posenet
return POSENETS.build(cfg)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmcv/utils/registry.py", line 237, in build
return self.build_func(*args, **kwargs, registry=self)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/home/hume/anaconda3/lib/python3.10/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "TopDown: 'ViT is not in the models registry'"

Problem when running bash install.sh

Traceback (most recent call last):
File "setup.py", line 169, in
long_description=readme(),
File "setup.py", line 11, in readme
with open('README.md', encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'README.md'

How can I deal with this? Thanks.

The cam_trans jitters a lot in z-axis

Hi, thanks for your attention. I assume that the unit of cam_trans is meters, but it jitters a lot along the z-axis. Am I using it correctly? Is there a better way to predict the translation of the root joint?
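Not part of the repo, but a minimal post-processing sketch that is often used to tame per-frame jitter: an exponential moving average over the predicted camera translations (alpha is a hypothetical smoothing factor; larger values mean smoother output but more lag).

import numpy as np

def smooth_cam_trans(cam_trans_seq, alpha=0.8):
    """Exponentially smooth a (T, 3) sequence of per-frame camera translations."""
    smoothed = np.asarray(cam_trans_seq, dtype=np.float64).copy()
    for t in range(1, len(smoothed)):
        smoothed[t] = alpha * smoothed[t - 1] + (1.0 - alpha) * smoothed[t]
    return smoothed

# Usage: collect cam_trans from each frame of a video, then smooth the z-axis jitter.
example = np.array([[0.0, 0.0, 5.0], [0.0, 0.0, 5.4], [0.0, 0.0, 4.8]])
print(smooth_cam_trans(example))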

Problem when running demo.py

[screenshot]

When running python demo.py --gpu 0 --img_path IMG_PATH --output_folder OUTPUT_FOLDER, I get "Unable to load EGL library" and I am not sure where the problem is.

2D joints show

Hi, sorry to bother you. When I run demo.py and get output['smplx_joint_proj'], I visualize the joints with the vis_keypoints() function in vis.py, but I get the following result:
[screenshot]
Can you tell me what is wrong in this situation?

Problems when running demo.py on Windows

D:\applications\envs\osx\lib\site-packages\mmcv\__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
warnings.warn(
Traceback (most recent call last):
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\osmesa.py", line 22, in GL
return ctypesloader.loadLibrary(
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\ctypesloader.py", line 45, in loadLibrary
return dllType( name, mode )
File "D:\applications\envs\osx\lib\ctypes_init_.py", line 373, in init
self._handle = _dlopen(self._name, mode)
FileNotFoundError: ("Could not find module 'OSMesa' (or one of its dependencies). Try using the full path with constructor syntax.", 'OSMesa', None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\Project\OSX-main\OSX-main\main\demo.py", line 15, in
from utils.vis import render_mesh, save_obj
File "D:\Project\OSX-main\OSX-main\main..\common\utils\vis.py", line 8, in
import pyrender
File "D:\applications\envs\osx\lib\site-packages\pyrender_init_.py", line 3, in
from .light import Light, PointLight, DirectionalLight, SpotLight
File "D:\applications\envs\osx\lib\site-packages\pyrender\light.py", line 11, in
from .texture import Texture
File "D:\applications\envs\osx\lib\site-packages\pyrender\texture.py", line 8, in
from OpenGL.GL import *
File "D:\applications\envs\osx\lib\site-packages\OpenGL\GL_init_.py", line 3, in
from OpenGL import error as _error
File "D:\applications\envs\osx\lib\site-packages\OpenGL\error.py", line 12, in
from OpenGL import platform, configflags
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform_init
.py", line 35, in
load()
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform_init
.py", line 32, in _load
plugin.install(globals())
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\baseplatform.py", line 92, in install
namespace[ name ] = getattr(self,name,None)
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in get
value = self.fget( obj )
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\osmesa.py", line 66, in GetCurrentContext
function = self.OSMesa.OSMesaGetCurrentContext
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in get
value = self.fget( obj )
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\osmesa.py", line 60, in OSMesa
def OSMesa( self ): return self.GL
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in get
value = self.fget( obj )
File "D:\applications\envs\osx\lib\site-packages\OpenGL\platform\osmesa.py", line 28, in GL
raise ImportError("Unable to load OpenGL library", *err.args)
ImportError: ('Unable to load OpenGL library', "Could not find module 'OSMesa' (or one of its dependencies). Try using the full path with constructor syntax.", 'OSMesa', None)

We followed the advice at https://github.com/mkocabas/VIBE/issues/101, but it did not work.
I don't know how to fix this issue.

When making annotation files, I encountered the following error

When making annotation files, I encountered the following error:

Traceback (most recent call last):
File "agora2coco.py", line 63, in
smplx_layer = smplx.create(args.human_model_path, 'smplx', use_pca=False)
File "/root/anaconda3/envs/test/lib/python3.8/site-packages/smplx/body_models.py", line 2404, in create
return SMPLX(model_path, **kwargs)
File "/root/anaconda3/envs/test/lib/python3.8/site-packages/smplx/body_models.py", line 969, in init
assert osp.exists(smplx_path), 'Path {} does not exist!'.format(
AssertionError: Path /data/OSX/dataset/AGORA/data/smplx does not exist!

So I downloaded and unzipped models_smplx_v1_1.zip, but it reported the following error:

Traceback (most recent call last):
File "agora2coco.py", line 149, in
joints_2d_lhand = joints_2d[smplx_joint_part['lhand'],:]
IndexError: index 66 is out of bounds for axis 0 with size 45

What files should I place under the human model path?

Question about how to get skeleton image

Hi, sorry to bother you. How can I get the skeleton keypoints drawn on the original image in the demo, which currently only generates the 3D mesh of the original image? Thank you very much.

How can I prepare the "human_model_files" folder?

I am a newbie in pose estimation and want to use OSX to extract the 3D body mesh and run inference with the code. I have signed up for many accounts on xxx.is.tue.mpg.de and downloaded many models, but it is still not enough to prepare the folder according to the README. Here are the downloaded files:
[screenshot]
Current match:
[screenshot]
Any tips on that?

Ubody Dataset

Hi authors, thanks for the great work!

I have a few questions:

  1. Is the Ubody dataset used in the training of the experiments in Table 3?
  2. When will the dataset be released? :)

Results of OSX

Hi, from the results you showed in the README.md (the same person with pink clothes, as in the images below), it seems that a bigger bounding box gives better results. Does the bounding-box region really matter that much?

[screenshots]

"TopDown: 'ViT is not in the models registry'"

When I run demo.py, there seems to be a bug: "TopDown: 'ViT is not in the models registry'". I have run setup.py in main/transformer_utils and my mmpose version is 0.29.0. Hope to get your reply.

Question about Camera Parameters | Project model in Blender

Hello! Thanks for your work.

My question is about projecting a mesh. I am trying to convert your intrinsic camera (with fx, fy, cx, cy) to an FOV camera that overlays correctly on the image, for further use in Blender. Can you suggest how this can be done?
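A minimal sketch of the standard pinhole-to-FOV conversion (the focal length and image size below are hypothetical; it assumes fx, fy are in pixels and that Blender's sensor fit matches the same image axis):

import math

def focal_to_fov(focal_px, size_px):
    """Convert a pinhole focal length in pixels to a field of view in radians."""
    return 2.0 * math.atan(size_px / (2.0 * focal_px))

# Hypothetical values: vertical FOV from fy and the image height.
fy = 5000.0
image_height = 1080.0
fov_y = focal_to_fov(fy, image_height)
print(math.degrees(fov_y), "degrees")  # use this as the camera angle along the fitted axis in Blender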

about pre-trained models

Hello. How did you obtain the pretrained encoders osx_vit_l.pth and osx_vit_b.pth? Are they trained on MSCOCO, Human3.6M, and MPII from scratch?

Problem about load model

Hello, thank you for your excellent work! I trained and saved snapshot_13.pth.tar using the training code you provided, but I encountered the following issue during testing:

raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DataParallel:
size mismatch for module.face_regressor.expr_out.0.weight: copying a param with shape torch.Size([10, 1024]) from checkpoint, the shape in current model is torch.Size([10, 37592]).
size mismatch for module.face_regressor.jaw_pose_out.0.weight: copying a param with shape torch.Size([6, 1024]) from checkpoint, the shape in current model is torch.Size([6, 37592]).

Have I overlooked any details?

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

when "python demo --gpu 0 --img_path xxx --output_folder xxx", error turns to:

06-04 19:28:05 Load checkpoint from ../pretrained_models/osx_l.pth.tar
06-04 19:28:05 Creating graph...
Traceback (most recent call last):
File "autodl-fs/twice/OSX/demo/demo.py", line 44, in
demoer._make_model()
File "/root/autodl-fs/twice/OSX/demo/../common/base.py", line 209, in _make_model
ckpt = torch.load(cfg.pretrained_model_path)
File "/root/twice/lib/python3.8/site-packages/torch/serialization.py", line 780, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/root/twice/lib/python3.8/site-packages/torch/serialization.py", line 285, in init
super(_open_zipfile_reader, self).init(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

torch==1.13.1

How can we solve this? o(╥﹏╥)o
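This error usually means the checkpoint file is incomplete or was corrupted during download. A quick check (a sketch; compare the printed size with the size listed on the release page) before digging further:

import os

path = "../pretrained_models/osx_l.pth.tar"
size_mb = os.path.getsize(path) / (1024 * 1024)
print(f"{path}: {size_mb:.1f} MB")  # a truncated download is noticeably smaller than expected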

GPU out of memory

[screenshot] When running demo.py, I get a GPU out-of-memory error. Clearing the cache from the terminal, as suggested online, does not help. Is there any other way besides changing the GPU?

Training code for obtaining the dataset

Hi, after carefully reading the supplementary materials, I found that the process of generating the pseudo 3D SMPL-X bodies would be very helpful for the community. Would you consider opening up the training code for this part?

New pre-trained model

Hi, thanks for updating. Is the new pre-trained model named osx_l_wo_decoder.pth.tar fine-tuned on UBody? Why is it smaller than the original one?

About the result of Ubody test

To run the UBody test, I used video2image.py to convert the videos into images. Additionally, I changed os.system('... -r 30 ...') to os.system('... -r 1 ...') to make the data smaller, and ran the test code with "python test.py --gpu 0,1,2,3 --exp_name output/test_setting3 --decoder_setting wo_decoder --pretrained_model_path ../pretrained_models/osx_l_wo_decoder.pth.tar --testset UBody --test_batch_size 64". However, my result is quite different from yours. What should I do? Here are my running process and results:

/OSX/main/../data/UBody/UBody.py:661: RuntimeWarning: Mean of empty slice.
np.sqrt(np.sum((mesh_out_face_align - mesh_gt_face) ** 2, 1))[mesh_face_valid].mean() * 1000)
/Anaconda3/envs/OSX1/lib/python3.8/site-packages/numpy/core/_methods.py:190: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
/OSX/main/../data/UBody/UBody.py:666: RuntimeWarning: Mean of empty slice.
np.sqrt(np.sum((mesh_out_face_align - mesh_gt_face) ** 2, 1))[mesh_face_valid].mean() * 1000)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:17<00:00, 8.55s/it]
PA MPVPE (All): 60.88 mm
PA MPVPE (Hands): 11.00 mm
PA MPVPE (Face): nan mm

MPVPE (All): 153.83 mm
MPVPE (Hands): 71.00 mm
MPVPE (Face): nan mm

PA MPJPE (Body): 66.53 mm
PA MPJPE (Hands): 11.25 mm

Temporal Optimization

Hi,

Did you use temporal optimization for your results? It does not seem to be present in the current code. The results are sometimes not very smooth or stable.

Efficiency and infer time

Thank you for your outstanding work. In Supplementary Material B, the inference time is reported as 53 ms on an A100. Was this time measured with PyTorch or TensorRT?

Question about the 3D whole-body mesh recovery annotation.

Hi, I have a question about the '3D whole-body mesh recovery annotation'. As the paper (supplementary) mentions, 'we use our proposed OSX to estimate the SMPL-X parameters from human images as a proper 3D initialization'. This seems like a chicken-and-egg loop; or do you pre-train the OSX model on other datasets and then use it as an initialization for the SMPL-X parameters? I did not find the answer in the paper; I hope to get your clarification on this part, thanks a lot.

UBody data visualization

Hi, I have been trying to render your UBody dataset, but I found that its format is a bit different from the network predictions. Is there a script to do this? I can already visualize the mesh by projection, but I am still having difficulties producing a rendering.
[screenshot]
How can we render something like the teaser below?
[screenshot]

Thank you

Running demo.py gives bad results

Hello, when I run demo.py, I get bad results when visualizing the 3D joints and vertices (it seems the parameters are wrong). Do you know what the problem might be?

[result screenshots]

UBody dataset generation

Thanks for this great work!

May I ask about the plan for the release of the UBody dataset? In the meantime, do you also plan to release the code used to annotate UBody? I have been using your code on some in-the-wild images, but the results are not very precise, and I think that having the fitting pipeline described in your paper would strongly improve the performance! :)

Unable to inference on 32GB V100

Hi:)

When I try to run the demo.py code, I get a CUDA out-of-memory error on a 32GB V100. Just to make sure: is this an error in my setup, or is it not possible to run inference with 32GB of memory?

Many thx

About the pretrained model

Thanks for your great work! I am wondering if you plan to release the model that was fine-tuned on the UBody dataset?

demo with osx_l_wo_decoder.pth.tar ?

Hi,

Thanks for your great work.

I have been trying to run demo.py with osx_l_wo_decoder.pth.tar because it is approximately 2 times faster. Following common/base.py, I added:

if cfg.decoder_setting == 'normal':
    from OSX import get_model
elif cfg.decoder_setting == 'wo_face_decoder':
    from OSX_WoFaceDecoder import get_model
elif cfg.decoder_setting == 'wo_decoder':
    from OSX_WoDecoder import get_model

in demo.py to replace from OSX import get_model, and replaced cfg.set_additional_args(encoder_setting='osx_l', decoder_setting='normal') with cfg.set_additional_args(encoder_setting='osx_l', decoder_setting='wo_decoder').

However, the results are completely wrong even in simple cases, compared to the original osx_l.pth.tar. It seems quite strange:
[screenshots]

The back-left person uses the original osx_l.pth.tar and the front-right person uses osx_l_wo_decoder.pth.tar.

About the result of the dataset

When I visualize the dataset, I run into a problem: even for an empty image, there is still a human annotation. What is the problem?
[screenshot]

Inference speed

Hey, many thanks for this project first!

For others' reference, I successfully ran demo.py on Windows 11 with a 3080 GPU (10 GB).

One question: it turns out that the inference speed is around 2 s for the forward pass alone. Since this is a one-stage regression method, it should be faster. Is it possible to reach real time?

Quick demo result is not good

Hi, I ran the code on Ubuntu 20.04 + RTX 3080 using the model osx_l.pth.tar.
The rendered image results are not good; can you tell me the reason?
Below are my results:

[result screenshots: test0_result, test5_result]

I also tested some videos and the results were not satisfactory.
Can you help me find the reason? Thanks very much!
