
Comments (18)

Combo1 avatar Combo1 commented on July 19, 2024 1

@StefanoBraghetto This is very surprising to hear. My machine is equipped with an NVIDIA GeForce RTX 2060 Super, but I will look into whether I can speed this process up. Also, I was referring to AvatarPoser, not AGRoL, my bad. Still, since AGRoL seems to be built on a large amount of code from AvatarPoser, I figured I could search here to find more information about it.

```
python main_test_avatarposer.py
export CUDA_VISIBLE_DEVICES=0
number of GPUs is: 1
LogHandlers setup!
-------------------------------number of test data is 329
Dataset [AMASS_Dataset - test_dataset] is created.
Initialization method [kaiming_normal + uniform], gain is [0.20]
Training model [ModelAvatarPoser] is created.
Loading model for G [model_zoo/avatarposer.pth] ...
23-07-17 14:21:53.854 : testing the sample 0/329
None
results/AvatarPoseEstimation/videos/0/None.avi
23-07-17 14:35:23.771 : testing the sample 1/329
None
results/AvatarPoseEstimation/videos/1/None.avi
23-07-17 14:36:34.273 : testing the sample 2/329
None
```

I modified my code a bit, but now that you mention it I guess most of the time is due to drawing the video clip.

@yufu-liu In case you meant me: I will start working with AGRoL now, since AvatarPoser seems to work. If I am able to generate motions without any trembling effect, I will share my results with you.

from agrol.

yufu-liu avatar yufu-liu commented on July 19, 2024 1

Hi, I would like to share some ideas.

Maybe this is a real issue in this algorithm, but some things can still be done.
For example, we can modify the model or post-process the predicted rotation and orientation.
Moreover, according to what we found, window size and trembling level trade off against each other, so maybe we can adjust the window size to an acceptable latency, like 10-30 frames, for real-time cases.
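To put numbers on that latency (simple arithmetic, assuming a 60 Hz tracking rate):

```python
# Latency added by waiting `step_frames` new frames before re-predicting,
# assuming the tracker delivers frames at 60 Hz.
def step_latency_s(step_frames, rate_hz=60):
    return step_frames / rate_hz

for step in (1, 10, 30):
    print(step, round(step_latency_s(step), 3))  # 1 -> 0.017 s, 10 -> 0.167 s, 30 -> 0.5 s
```

So a 10-30 frame step corresponds to roughly 0.17-0.5 seconds of added delay, which may or may not be acceptable depending on the application.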

from agrol.

StefanoBraghetto avatar StefanoBraghetto commented on July 19, 2024

Just to provide more info.

My first attempt at using the model for real-time prediction was to use the overlapping sample function with sld_wind_size=1.
This should create sparse_splits that advance by a single frame. But this is the result:

Recording.2023-06-05.175545.mp4

In the generated animation there is a trembling effect that is most noticeable in the legs. My initial thought on why this happens was that the diffusion model starts its prediction from a randomly generated starting point. However, upon further analysis, even when setting the seed or switching to an MLP model, the result stays the same. This is because, whenever a new frame is added, the model generates a new motion for the entire sequence, from which only the last frame is taken. If that is not clear, I tried to illustrate it in the following drawing.

image
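The loop in the diagram, as a sketch (hypothetical names; `model.predict` stands in for one full diffusion or MLP inference over the window):

```python
import numpy as np
from collections import deque

def realtime_loop(model, frames, window_size=196):
    """For every new input frame, re-run the model over the whole window
    and keep only the newest output pose."""
    window = deque(maxlen=window_size)
    outputs = []
    for frame in frames:
        window.append(frame)
        motion = model.predict(np.stack(window))  # regenerates the ENTIRE sequence
        outputs.append(motion[-1])                # only the last frame is kept
    return outputs
```

Because every call regenerates the whole sequence, consecutive "last frames" come from slightly different motions, which is where I believe the trembling comes from.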

What I'm trying to say with all of this is that this model is not suitable for real-time prediction. Could you confirm this?

Thank you

from agrol.

yufu-liu avatar yufu-liu commented on July 19, 2024

Hi, thanks for sharing!
I also encountered the same issue in real-time prediction. The avatar's legs kept moving with a small sliding window size, like 1-10.
Have you solved it yet?

from agrol.

Combo1 avatar Combo1 commented on July 19, 2024

Hi, in which part did you guys insert your HMD + controller data into the model? I tried to replace "hmd_position_global_full_gt_list" with my own HMD/controller data, but there is no documentation on this. I only know that the first 18 values hold the rotations, the next 18 the rotational velocities, the next 9 the positions, and the last 9 the positional velocities, but I still get very weird results. Can you share how you inserted your data into the AGRoL pipeline?
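For what it's worth, here is how I currently read that layout (a sketch based on my own assumption of 3 devices with 6D rotation features; the actual ordering and rotation convention in AGRoL may differ):

```python
import numpy as np

def pack_sparse_input(rot, rot_vel, pos, pos_vel):
    """Pack HMD + two controllers into a 54-dim sparse vector:
    3 devices x 6D rotation (18) + 3 x 6D rotational velocity (18)
    + 3 x 3D position (9) + 3 x 3D positional velocity (9)."""
    vec = np.concatenate([rot.ravel(), rot_vel.ravel(),
                          pos.ravel(), pos_vel.ravel()])
    assert vec.shape == (54,), "unexpected feature layout"
    return vec

# Example: zero pose for (head, left controller, right controller).
v = pack_sparse_input(np.zeros((3, 6)), np.zeros((3, 6)),
                      np.zeros((3, 3)), np.zeros((3, 3)))
print(v.shape)  # (54,)
```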

from agrol.

StefanoBraghetto avatar StefanoBraghetto commented on July 19, 2024

Hi @Combo1 ,

I don't want to lose sight of the main subject of this issue, which is to understand whether the model is suitable for real-time prediction. I'd love the authors of this paper to clarify this.
But if you want, maybe you could share your script in another issue so we can see whether there is any problem.
I did exactly what you described with the sparse input vector.

from agrol.

yufu-liu avatar yufu-liu commented on July 19, 2024

@Combo1
I think there are many reasons that could leave you stuck with weird results.
(Perhaps we are each stuck on different problems.)
I guess you inserted your data in the right place, but your data isn't transformed correctly.
@StefanoBraghetto
Since the project shows a demo video, it should work.
My current problem is that I can't speed my inference up to a 60 Hz frame rate, which makes my avatar look laggy. Moreover, when collecting data faster than 60 Hz, it starts trembling, which suggests the collected data is too sensitive for the model.
I suggest you examine your sampling rate and inference rate, Stefano; both should be 60 Hz.
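One generic way to sanity-check the capture rate (a standalone sketch; it only assumes you log one timestamp per captured frame):

```python
import numpy as np

def estimate_rate_hz(timestamps):
    """Estimate the effective frame rate from per-frame timestamps in seconds."""
    dts = np.diff(np.asarray(timestamps))
    return 1.0 / np.mean(dts)

ts = np.arange(0, 1, 1 / 60)          # ideal 60 Hz capture over one second
print(round(estimate_rate_hz(ts)))    # 60
```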

from agrol.

StefanoBraghetto avatar StefanoBraghetto commented on July 19, 2024

@yufu-liu thanks for your answer.
In my implementation, the trembling stopped when I gave the model a prediction delay of at least 10 frames (that is, collect 10 new frames before making another prediction).
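Sketched out, the 10-frame delay looks like this (hypothetical names; `model.predict` is one inference over the current window):

```python
import numpy as np
from collections import deque

def delayed_loop(model, frames, window_size=196, delay=10):
    """Re-run inference only after `delay` new frames have arrived,
    then emit the `delay` newest output poses at once."""
    window = deque(maxlen=window_size)
    pending, outputs = 0, []
    for frame in frames:
        window.append(frame)
        pending += 1
        if pending == delay:
            motion = model.predict(np.stack(window))
            outputs.extend(motion[-delay:])   # the `delay` newest poses
            pending = 0
    return outputs
```

This trades latency for smoothness: each batch of emitted poses comes from a single generated motion, so there is no frame-to-frame jump inside a batch.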

Also, I am using the same dataset as AGRoL, so I am pretty sure I am using a 60 Hz frame rate. The demo says "Real Data", not "Real Time", which is different. Are you doing real-time prediction, i.e. making a new prediction with just one new frame? If so, could you please give more details on how you are doing this? Are you using sld_wind_size=1?
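For concreteness, the overlapping split I mean can be sketched as follows (a minimal standalone reimplementation; the parameter names mirror the repo's `sld_wind_size`):

```python
import numpy as np

def overlapping_splits(seq, window_size=196, sld_wind_size=1):
    """Split a (num_frames, num_feats) sequence into windows that
    advance by sld_wind_size frames each step."""
    splits = []
    for start in range(0, len(seq) - window_size + 1, sld_wind_size):
        splits.append(seq[start:start + window_size])
    return splits

# With sld_wind_size=1, consecutive windows overlap by all but one frame.
seq = np.arange(200)[:, None]   # 200 frames, 1 feature
splits = overlapping_splits(seq, window_size=196, sld_wind_size=1)
print(len(splits))              # 5 windows, starting at frames 0..4
```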

Thank you again!

from agrol.

yufu-liu avatar yufu-liu commented on July 19, 2024

@StefanoBraghetto
I see. So you just switched the evaluation function from non_overlap to overlap and set the window size to 10, right?
I haven't tried it yet, but I tried the non_overlap version and got smooth animations.

So your understanding is that the demo was filmed, the data collected, and then the authors used the collected data to generate the animation afterwards?
If that's true, we should stop here, go back to AvatarPoser, and modify it!
The only place I found the animation weird was when the man runs backwards.

Yes! I'm doing real-time inference and ideally want to get the last 196 frames every 16 ms.
Perhaps you can check whether the window step and window size are correct.

from agrol.

StefanoBraghetto avatar StefanoBraghetto commented on July 19, 2024

@yufu-liu thanks again

If you aim to process 196 frames in 16 ms, your frame rate would be 196 / 0.016 = 12,250 frames per second (twelve thousand two hundred and fifty), which is much faster than what the model was trained for. As you can see here:

image

This means the input to the model would barely move; every frame would be almost identical. Am I misunderstanding you?
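Spelling out that arithmetic:

```python
frames = 196
window_seconds = 0.016
# Treating 196 frames as spanning 16 ms implies an input rate of
# ~12,250 fps, versus the 60 Hz rate the model was trained on.
print(round(frames / window_seconds))  # 12250
```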

thank you again

from agrol.

yufu-liu avatar yufu-liu commented on July 19, 2024

@StefanoBraghetto
Yeah, there is a little misunderstanding.

Here is my thought:
[image]

So the window size is 196 frames, the overlap 195 frames, and the sliding step 1. However, it is too hard to implement it this way smoothly. The overlap might need to be smaller. For example, the paper suggests an overlap of 70, which means the window still moves 126 frames forward after every inference. It is also necessary to adjust the visualization part when you adjust the window size.
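In other words (a minimal sketch of the stepping, not code from the repo):

```python
def window_starts(num_frames, window_size=196, overlap=70):
    """Start indices of successive inference windows.
    With overlap=70, each inference advances 196 - 70 = 126 frames."""
    step = window_size - overlap
    return list(range(0, num_frames - window_size + 1, step))

print(window_starts(600))  # [0, 126, 252, 378]
```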

from agrol.

Stefano-retinize avatar Stefano-retinize commented on July 19, 2024

Thank you again,

So the model is not suitable for online prediction? I can't see how to solve the trembling, since each new frame comes from a newly generated motion that is slightly different. I think that to avoid the trembling the model should also receive the last frames of the previous prediction as context, so it could fit the new prediction to the past motion.
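One direction I can imagine (my own idea, not anything implemented in AGRoL): cross-fade each new prediction into the previous one over their overlapping frames, so consecutive generations cannot jump.

```python
import numpy as np

def blend_overlap(prev, new, overlap):
    """Linearly cross-fade `new` into `prev` over their `overlap` frames.
    prev, new: (num_frames, num_feats) pose sequences."""
    w = np.linspace(0.0, 1.0, overlap)[:, None]  # 0 -> keep prev, 1 -> keep new
    blended = (1 - w) * prev[-overlap:] + w * new[:overlap]
    return np.concatenate([prev[:-overlap], blended, new[overlap:]])
```

This only smooths the transition; conditioning the model on its own previous output would be the more principled fix.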

from agrol.

Combo1 avatar Combo1 commented on July 19, 2024

Thanks for your replies! I checked my transformations again and indeed found the error that caused my avatar to behave weirdly; now it looks acceptable. For real-time inference, I have a similar doubt: AGRoL and AvatarPoser might have been recorded first and predicted afterwards, not in real time. When I try to infer a four-second clip, it takes my machine over a minute, but this might be due to my hardware restrictions.

from agrol.

StefanoBraghetto avatar StefanoBraghetto commented on July 19, 2024

@Combo1, that's weird. On a fairly average computer (without even a GPU) it takes less than 0.1 seconds to run inference on a clip of 196 frames at 60 Hz, which is about a 3-second clip. You should be able to improve that time.

from agrol.

yufu-liu avatar yufu-liu commented on July 19, 2024

Hi, nice to hear that!
But I have the same concern with the speed.
Can you share your sliding window size and window step?

Lastly, do you find any trembling effect or jitter in real time inference?

from agrol.

cccvision avatar cccvision commented on July 19, 2024

I found similar issues. Did you manage to get smooth results with the overlapping test? Besides, is it possible to run this model in real time?

from agrol.

asanakoy avatar asanakoy commented on July 19, 2024

Closing.

from agrol.

gb2111 avatar gb2111 commented on July 19, 2024

@Combo1 Do you mind to share this with me as well?
Thanks.

from agrol.
