facebookresearch / AGRoL
Code release for "Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model", CVPR 2023
License: Other
I have solved this using the fix from eth-siplab/AvatarPoser#17.
Hi, could you please tell me the total training time and total number of steps you used for your model? I am using the default parameters in your code, which set num_steps = 6000000, but training on eight A100 GPUs takes nearly 10 days. Could it be that this code does not support multi-GPU usage?
Could you provide the processing code for the AIST++ dataset?
The code does not seem to support multi-GPU training. Although I found some parts of the code intended to support it, they do not seem to work. Can you fix this?
Hi,
I'm currently facing an issue with preparing the dataset for training and testing.
I added the SMPL-H model file and set it as support_dir, but I'm getting an error that says 'posedirs is not a file in the archive'.
Also, we were unable to find BioMotionLab_NTroje and MPI_HDM05 on the AMASS webpage. Could you let us know how to download them?
The full error is below:
C:\Users\Admin>python E:\AGRoLmain\prepare_data.py --support_dir E:\AGRoLmain\ --save_dir ./dataset/AMASS/ --root_dir E:\AGRoLmain\dataset\AMASS
Traceback (most recent call last):
  File "E:\AGRoLmain\prepare_data.py", line 194, in <module>
    bm_male = BodyModel(
  File "E:\AGRoLmain\human_body_prior\body_model\body_model.py", line 68, in __init__
    npose_params = smpl_dict['posedirs'].shape[2] // 3
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\lib\npyio.py", line 251, in __getitem__
    raise KeyError("%s is not a file in the archive" % key)
KeyError: 'posedirs is not a file in the archive'
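(Not an official answer.) This KeyError usually means the .npz that BodyModel opened is not a full SMPL-H model archive, e.g. a simplified model.npz from a different download. A quick way to diagnose is to list the archive's keys; this is a sketch assuming only numpy, and the `required` set here is illustrative, not the exact list BodyModel checks:

```python
import io
import numpy as np

def check_body_model(npz_path):
    """Return the body-model keys that are missing from a .npz archive.
    The `required` set is illustrative; the exact keys BodyModel reads
    may differ by model variant."""
    required = {"posedirs", "shapedirs", "v_template", "J_regressor",
                "kintree_table", "weights", "f"}
    archive = np.load(npz_path, allow_pickle=True)
    return sorted(required - set(archive.files))

# Demo with an in-memory archive that lacks 'posedirs',
# mimicking the KeyError in the traceback above.
buf = io.BytesIO()
np.savez(buf, v_template=np.zeros((10, 3)), f=np.zeros((4, 3)))
buf.seek(0)
print(check_body_model(buf))  # 'posedirs' appears among the missing keys
```

If 'posedirs' is listed as missing, the file passed via --support_dir is the wrong model archive, not a code bug.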
Thank you !
Hello,
I am trying to run AGRoL in real-time. However, I struggle with generating poses I recorded myself. I record motion data from my VR headset + controllers in Unity and then pass this information to the model by inserting rotation, rotational velocity, position and velocity. I end up with a hovering avatar rotating in space. I was wondering if any of you could provide me with some insight how you transformed the data before passing the information to the model. The position is in meters, I calculate the velocity every frame anew and for the rotation and rotational velocity I follow the instructions of the paper. For that I just cut off from a 4x4 rotation matrix the top left portion to get a 3x3 rotation matrix.
Hi! Thanks for this amazing work!
I am adapting this algorithm to my VR devices so that I can play with it.
However, I've run into an issue: the data from my devices does not look like the data in the dataset. I suspect a coordinate-system or scale difference.
Does the dataset use a right-handed coordinate system with meters as the unit?
(My data shows the controllers' positions are similar along the z-axis rather than the y-axis.)
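I'm not affiliated with the authors, but controller heights landing on the z-axis instead of the y-axis usually points to a z-up vs y-up axis convention mismatch. Below is one plausible remapping as a sketch; whether this particular mapping (rather than, say, (x, z, y) with a handedness flip) is correct for AGRoL's preprocessed data needs checking against the dataset:

```python
import numpy as np

def z_up_to_y_up(positions):
    """Map points from a z-up frame to a y-up frame: (x, y, z) -> (x, z, -y).
    This is one plausible mapping; the correct one depends on the
    handedness and forward-axis conventions of your device and of AMASS."""
    x, y, z = positions[..., 0], positions[..., 1], positions[..., 2]
    return np.stack([x, z, -y], axis=-1)

p = np.array([0.0, 0.0, 1.7])  # head 1.7 m up in a z-up frame
print(z_up_to_y_up(p))         # height is now on the y-axis
```

After remapping, the head position should have its largest component on the y-axis while standing.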
I am currently sending IMU data from a Pico headset to a Python server running the AGRoL model. The process is not real-time; it requires accumulating about 30 frames before a prediction result can be sent to a model in Unity. My initial pose is a T-pose with the backs of the hands facing up, and most movements are accurately predicted from this. However, I encounter issues with certain movements, such as being unable to turn my palms upwards, and I don't understand why. Additionally, when I simulate running, the model fails to predict leg movements correctly. Could this be an issue with how I'm converting coordinates? I would appreciate any insights or suggestions.
Hi,
I have a question about the training process. I retrained the DiffMLP model for about 54 epochs, but its performance is far below that of the pretrained model. Are there any special strategies in the training process? Also, I found that the mean and std computed from AMASS differ from the ones provided. Would you mind giving some hints about this?
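One likely (unconfirmed) reason for differing mean/std values: the statistics depend on exactly which AMASS subsets, splits, and preprocessing steps went into them. For reference, a sketch of the usual per-feature computation; the 132-dim feature size is illustrative, not necessarily AGRoL's:

```python
import numpy as np

def dataset_mean_std(sequences):
    """Per-feature mean/std over all frames of all sequences.
    Which sequences are included changes the result, which likely
    explains statistics differing from the released ones."""
    frames = np.concatenate(
        [np.asarray(s).reshape(-1, s.shape[-1]) for s in sequences], axis=0)
    return frames.mean(axis=0), frames.std(axis=0)

# Toy data: two sequences with an illustrative 132-dim feature vector.
seqs = [np.zeros((100, 132)), np.ones((50, 132))]
mean, std = dataset_mean_std(seqs)
print(mean[0], std[0])
```

With normalization as (x - mean) / std at train time, the same mean/std file must be reused at test time, so a mismatch also degrades evaluation numbers.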
Today I ran into a problem where I couldn't find the model.npz file. Since this is my first time working on such a project, it took some searching, but I finally found the relevant resources. I'd like to share them here in case they help someone. Best wishes for everyone's work!
SMPL model: if you need the SMPL model files, you can download them from the SMPL official download page.
DMPLs model: for those who need DMPLs (Dynamics Models for People in Loose Clothing), you can download them via the MANO official download page.
Both of these are needed and should be downloaded.
Traceback (most recent call last):
  File "/home/yyh_file/AGRol-main/vis.py", line 94, in <module>
    main(opt)
  File "/home/yyh_file/AGRol-main/vis.py", line 75, in main
    avg_error = evaluate(opt, logger, model, test_loader, save_animation=1)
  File "/home/yyh_file/AGRol-main/test.py", line 41, in evaluate
    vis.save_animation(body_pose=predicted_body, savepath=save_video_path_gt, bm=body_model.body_model, fps=fps, resolution=(800, 800))
  File "/home/yyh_file/AGRol-main/utils/utils_visualize.py", line 174, in save_animation
    mv = MeshViewer(width=imw, height=imh, use_offscreen=True)
  File "/home/yyh_file/AGRol-main/body_visualizer/mesh/mesh_viewer.py", line 59, in __init__
    self.viewer = pyrender.OffscreenRenderer(*self.figsize)
  File "/home/.conda/envs/AGRol/lib/python3.9/site-packages/pyrender/offscreen.py", line 31, in __init__
    self._create()
  File "/home/.conda/envs/AGRol/lib/python3.9/site-packages/pyrender/offscreen.py", line 137, in _create
    egl_device = egl.get_device_by_index(device_id)
  File "/home/.conda/envs/AGRol/lib/python3.9/site-packages/pyrender/platforms/egl.py", line 83, in get_device_by_index
    raise ValueError('Invalid device ID ({})'.format(device_id, len(devices)))
ValueError: Invalid device ID (0)
Thanks for your work, but sadly there seems to be a problem with the pose visualization code. Have you encountered this problem, or do you know how to solve it? Thank you for your reply!
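Not an authoritative fix, but this error usually means pyrender's EGL backend cannot find a usable GPU device (common on headless machines or when the NVIDIA EGL driver is missing). One thing worth trying is selecting the backend and device explicitly before pyrender is imported; both environment variables below are read by PyOpenGL/pyrender:

```python
# Workaround sketch for pyrender's "Invalid device ID (0)": choose the
# EGL device explicitly BEFORE pyrender is imported.
import os

os.environ["PYOPENGL_PLATFORM"] = "egl"  # force the EGL offscreen backend
os.environ["EGL_DEVICE_ID"] = "0"        # try other indices if 0 fails

# The pyrender import must come after the variables are set:
# import pyrender
# renderer = pyrender.OffscreenRenderer(800, 800)
```

If no EGL device exists at all, switching PYOPENGL_PLATFORM to "osmesa" (software rendering, requires an OSMesa build) is the usual fallback.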
Hi,
Input: position and orientation of the head and both hands.
The model is trained on an SMPL body with a fixed arm length, but in reality each person has a different body size.
How do you handle pose estimation for people with different body sizes?
Thank you!
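One workaround sometimes discussed for this (not an official answer): rescale the tracked positions so the user's proportions match the canonical training body before inference, then scale the output skeleton back. A hypothetical sketch; the function, the arm-span measurement, and both numeric values are illustrative:

```python
import numpy as np

def scale_to_canonical(positions, user_arm_span, model_arm_span=1.7):
    """Hypothetical normalization: scale tracked positions so the user's
    arm span matches the fixed body the model was trained on.
    `model_arm_span=1.7` is an illustrative value, not the real SMPL number."""
    return np.asarray(positions) * (model_arm_span / user_arm_span)

hands = np.array([[0.9, 1.5, 0.0], [-0.9, 1.5, 0.0]])  # user with a 1.8 m span
print(scale_to_canonical(hands, user_arm_span=1.8, model_arm_span=1.7))
```

This only corrects global scale; differences in limb proportions would still need something like per-user retargeting or shape-conditioned training.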
Hi authors, I have some questions about the overlapping test. Thanks in advance for your help!
Can this work really run in real time? The paper states that the 'AGRoL model achieves real-time inference speed' because it 'produces 196 output frames in 35 ms'. However, for online prediction, given one new observation we only need one new prediction (as AvatarPoser does), so 196 outputs seem redundant. How do you make it work for real-time usage?
When I tried to test with overlapping, I got the following error:
python test.py --model_path /path/to/your/model --timestep_respacing ddim5 --support_dir /path/to/your/smpls/dmpls --dataset_path ./dataset/AMASS/ --overlapping_test
Loading dataset...
100%|██████████| 536/536 [00:00<00:00, 1072.94it/s]
Creating model and diffusion...
Loading checkpoints from [pretrained_weights/diffmlp.pt]...
Overlapping testing...
0%| | 0/536 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "test.py", line 552, in <module>
    main()
  File "test.py", line 516, in main
    output, body_param, head_motion, filename = test_func(
  File "test.py", line 269, in overlapping_test
    memory_end_index = sparse_splits[step_index][1]
IndexError: index 1 is out of bounds for dimension 0 with size 1
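A guess at the cause (I haven't confirmed this against the repo): if a test sequence is shorter than one window, the split list has only one entry, and any code that then indexes a second split (as `sparse_splits[step_index][1]` with step_index past 0 would) raises exactly this IndexError. A sketch reproducing the condition; the window and stride values are illustrative, not the repo's actual defaults:

```python
import numpy as np

def make_splits(seq, window=196, stride=70):
    """Cut a sequence into overlapping windows, roughly as an
    overlapping test would (window/stride values are illustrative)."""
    splits = []
    start = 0
    while start + window <= len(seq):
        splits.append(seq[start:start + window])
        start += stride
    if not splits:               # sequence shorter than one window
        splits.append(seq)
    return splits

short = np.zeros((150, 54))      # shorter than one 196-frame window
splits = make_splits(short)
print(len(splits))               # 1 -> indexing splits[1] would fail
```

If that is the cause, either filtering out sequences shorter than one window or guarding the split indexing would avoid the crash.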
Hi,
Thanks for the amazing work and releasing the models.
I have a question about running the model on real data captured from HMD devices. The network was trained on AMASS data, whose coordinate system differs from that of real-world data. Does the model work in these scenarios, or is there any preprocessing we can apply to real data so that it works with models trained on AMASS data?
Thanks for any help,
Sai
Dear author,
I am currently working on a project that involves the AMASS dataset, particularly its setting2, and I am looking to understand its data processing methods in comparison to FLAG. I have been exploring your project's documentation and codebase, but I find myself in need of further clarification.
My primary objectives are to understand:
Differences in Data Processing Methods: How does the data processing in AMASS setting2 differ from that in FLAG? Are there specific steps or procedures in AMASS setting2 that are not present or are significantly different in FLAG?
Code Implementation Details: Could you provide more insights or examples of how the data processing is implemented in code for AMASS setting2? This would be particularly helpful for understanding any unique aspects of its processing.
Best Practices and Recommendations: Based on your experience, are there any best practices or recommendations you would suggest when working with AMASS setting2 compared to FLAG, especially concerning data preprocessing, handling, and analysis?
I believe this information will not only assist me in my current project but also provide valuable insights for the community working with these datasets.
Thank you for your time and consideration. I appreciate the effort you have put into maintaining this project and look forward to your response.
Hello,
Great work! This has definitely pushed the boundaries of the state of the art. Congratulations!
I would like to ask how to run the model for real-time prediction. Should I create a small buffer (t → 0) of point movements in 6D rotation and feed it to the model? Should I use the non_overlapping_test or the overlapping_test method?
Thank you in advance for your response.
Hi,
Is it possible to generate a single character from a pose sequence for about 5 seconds?
I have a pose video (OpenPose + hands + face), and I was wondering whether it is possible to generate a 5-second output video with a consistent character/avatar that performs the dance (or other motion) from the controlled pose input.
In short: given an OpenPose + hands + face video, I want to generate a human-like animation with a consistent character/avatar, whatever it looks like.
Sample Video
P.S. Any model that supports pose + hands + face can be used!
Thanks
Best regards
Hello,
I'm encountering an error while attempting to load the weights of the MLP model. I'm unsure if I'm misunderstanding something, but it seems that the weights provided at the following link: https://github.com/facebookresearch/AGRoL/releases/download/v0/agrol_AMASS_pretrained_weights.zip (specifically the diffmlp.pt file) are meant for the diffusion model. Since the MLP model has a different state_dict shape upon creation, I believe these weights won't work for it.
If I'm not mistaken, are there any plans to also share the weights for the MLP model?
Thank you for your response and help!
ImportError: ('Unable to load EGL library', "Could not find module 'EGL' (or one of its dependencies). Try using the full path with constructor syntax.", 'EGL', None)
I have been trying to use the library on Windows. There are no EGL dependencies to install that I know of. If someone has encountered the same issue and solved it, please help!
Hi, I have been working with this project for a month, and my current goal is a real-time demo.
However, in the real-time setting, my plotting time per frame is around 0.05 seconds, which is still not as fast as the required 0.0167 seconds (60 Hz).
Do you have any suggestions for accelerating the preprocessing, inference, or visualization?
I get the following error. Is there any solution to make it work on Windows?
ImportError: ('Unable to load EGL library', "Could not find module 'EGL' (or one of its dependencies). Try using the full path with constructor syntax.", 'EGL', None)
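Not a confirmed fix: EGL is effectively a Linux-side API, and on Windows pyrender's offscreen renderer generally cannot use it. Two workarounds I'm aware of are switching PyOpenGL to the OSMesa software backend (which requires an OSMesa build, admittedly awkward on Windows) or avoiding offscreen rendering and using the interactive viewer. The backend switch, sketched:

```python
# Sketch: route pyrender's offscreen rendering through OSMesa instead of
# EGL. The variable must be set before pyrender is imported, and an
# OSMesa-enabled build of the GL libraries must be installed.
import os

os.environ["PYOPENGL_PLATFORM"] = "osmesa"

# import pyrender
# mv = pyrender.OffscreenRenderer(800, 800)  # now uses OSMesa, not EGL
```

OSMesa renders on the CPU, so expect it to be noticeably slower than a GPU-backed EGL context.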