
Comments (21)

ZheC avatar ZheC commented on May 20, 2024 8

With a large training dataset, model convergence is slow. We used two Titan X GPUs (the old version) and trained for six days. You can change the batch size from 10 to 8 to speed it up a little. We did not try a batch size below 8. I will plot the loss against iterations and post it here.

from realtime_multi-person_pose_estimation.

ZheC avatar ZheC commented on May 20, 2024 4

Hi all, I am really sorry for my late response. I have graduated from CMU, so it is not easy to access the old files again. But I have plotted the loss for the two levels here:

https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/blob/master/training/example_loss/Loss_l1.png

https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/blob/master/training/example_loss/Loss_l2.png

The full terminal output is here: https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/blob/master/training/example_loss/output.txt

The code to plot the loss is here: https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/blob/master/training/example_loss/plotLoss.sh
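For readers who want to compare their own run against these plots, here is a minimal sketch of pulling (iteration, loss) pairs out of a Caffe-style training log. It is not the contents of plotLoss.sh. The regular expression assumes solver lines of the form "Iteration N, loss = X", and the sample log lines are invented for illustration.

```python
import re

# Assumed line shape: "... Iteration 100, loss = 123.4" (Caffe-style solver output).
LOSS_RE = re.compile(r"Iteration (\d+)\b.*?, loss = ([0-9.eE+-]+)")


def parse_losses(lines):
    """Return a list of (iteration, loss) pairs found in the log lines."""
    points = []
    for line in lines:
        match = LOSS_RE.search(line)
        if match:
            points.append((int(match.group(1)), float(match.group(2))))
    return points


# Invented sample lines, only to show what the parser keeps and skips.
sample_log = [
    "I0520 10:00:00.000000 1234 solver.cpp:228] Iteration 0, loss = 1500.5",
    "I0520 10:00:01.000000 1234 solver.cpp:244]     Train net output #0: some_loss = 120.25",
    "I0520 10:05:00.000000 1234 solver.cpp:228] Iteration 5, loss = 900.25",
]

points = parse_losses(sample_log)
print(points)
```

The resulting pairs can be passed to any plotting tool to draw loss against iterations.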


Shaswat27 avatar Shaswat27 commented on May 20, 2024 3

@ZheC Can you post the loss vs iterations curve again?


yq1011 avatar yq1011 commented on May 20, 2024

Thanks a lot!


ZheC avatar ZheC commented on May 20, 2024

I deleted the wrong figure I posted earlier; it will be updated soon.


yq1011 avatar yq1011 commented on May 20, 2024

Hi, is this the loss of L1 or L2?


yq1011 avatar yq1011 commented on May 20, 2024

hi, any updates? :D


guanxiongsun avatar guanxiongsun commented on May 20, 2024

@yq1011 What was your final result? Did it converge? How long did you train for? I have been training this model for days, and it is not as fast as ZheC said: I used 4 GPUs for 2 days and reached only about 20,000 iterations...


ds2268 avatar ds2268 commented on May 20, 2024

I think we should talk in terms of epochs instead (the training output prints that). @ZheC, when you mentioned that you were using 2 x Titan X with a batch size of 10, did that mean the actual batch size was 20 (10 per GPU)?
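A small sketch of the arithmetic behind this question, assuming the configured batch size applies per GPU (which is what is being asked here, not something confirmed in this thread). The dataset size below is a hypothetical placeholder, not the real size of the training set.

```python
def effective_batch_size(per_gpu_batch, num_gpus):
    """Samples consumed per iteration if every GPU processes its own batch."""
    return per_gpu_batch * num_gpus


def iterations_to_epochs(iterations, batch, dataset_size):
    """Number of passes over the dataset after the given number of iterations."""
    return iterations * batch / dataset_size


batch = effective_batch_size(per_gpu_batch=10, num_gpus=2)

# Hypothetical dataset size, chosen only to make the arithmetic easy to follow.
HYPOTHETICAL_DATASET_SIZE = 100_000
epochs = iterations_to_epochs(20_000, batch, HYPOTHETICAL_DATASET_SIZE)
print(batch, epochs)
```

Comparing runs by epochs rather than iterations removes the dependence on GPU count and batch size.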


wujiyoung avatar wujiyoung commented on May 20, 2024

I have the same problem as you @yq1011 .
Did you get a converged loss finally?


jricheimer avatar jricheimer commented on May 20, 2024

@ZheC Hi, would you be able to post the loss curve again? I would like to compare with the model when I train it locally to make sure it is performing comparably. Thanks!


ildoonet avatar ildoonet commented on May 20, 2024

@ZheC Can you post the loss curve?


ildoonet avatar ildoonet commented on May 20, 2024

@ZheC at least you can tell the last loss value so that we can compare.


ildoonet avatar ildoonet commented on May 20, 2024

@ZheC Thanks for sharing!!


Nestarneal avatar Nestarneal commented on May 20, 2024

@ZheC Hi, I set the parameters based on your terminal output and started training, but it terminated at about iteration 1200 without any log shown on the screen. Do you have any idea why? Thanks.


Ai-is-light avatar Ai-is-light commented on May 20, 2024

Hi @ZheC, thanks for your great work! I have some questions about training the pose model. Your loss plots,

https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/blob/master/training/example_loss/Loss_l1.png

https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation/blob/master/training/example_loss/Loss_l2.png

only show results up to 250,000 iterations. However, in the OpenPose repository
https://github.com/CMU-Perceptual-Computing-Lab/openpose
the model at
https://github.com/CMU-Perceptual-Computing-Lab/openpose/tree/master/models/pose/coco
is named pose_iter_440000.caffemodel. I would like to know whether the model was trained for 250,000 or 440,000 iterations.
Thanks for your attention.


Ai-is-light avatar Ai-is-light commented on May 20, 2024

Has anyone reproduced the results in the paper, or tried training a smaller model that uses only 2 stages?


Ai-is-light avatar Ai-is-light commented on May 20, 2024

@yq1011 Did you get the same results as the paper? Thanks.


ZheC avatar ZheC commented on May 20, 2024

@Ai-is-light I use the model from iteration 440000. I pick the best iteration based on the evaluation score on a validation set, testing the accuracy of the trained models at different iterations. The best iteration is not fixed across different models, so you probably want to pick your trained model the same way.
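The selection procedure described above can be sketched roughly as follows. This is an illustration, not the author's evaluation script: the scores are made up, and `pick_best_iteration` is a hypothetical helper name. In practice each score would come from evaluating one saved snapshot on the validation set.

```python
def pick_best_iteration(scores):
    """Return the snapshot iteration with the highest validation score.

    `scores` maps snapshot iteration -> validation score (higher is better).
    """
    if not scores:
        raise ValueError("no snapshots were evaluated")
    return max(scores, key=scores.get)


# Made-up validation scores for three hypothetical snapshots.
validation_scores = {100_000: 0.50, 200_000: 0.56, 300_000: 0.54}

best = pick_best_iteration(validation_scores)
print(best)
```

Note that the latest snapshot is not necessarily the best one, which is why every saved iteration is scored.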


yw155 avatar yw155 commented on May 20, 2024

Hi @ZheC, I have a question: how much does the number of images used in the evaluation affect the evaluation score? I saw in your paper that you chose 1160 images randomly. Why not use the whole validation set?


soans1994 avatar soans1994 commented on May 20, 2024

Hello,
How should I choose the stepsize for a given number of iterations?

The author's default parameters are stepsize=13106 for 600000 iterations.

Thank You
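As background for this question, a sketch of how a Caffe-style "step" learning-rate policy uses stepsize, assuming that policy is the one in use: the rate is multiplied by gamma once every stepsize iterations. The stepsize and iteration count are the values quoted above; `base_lr` and `gamma` are hypothetical values chosen for illustration.

```python
def step_lr(iteration, base_lr, gamma, stepsize):
    """Learning rate under a step policy: base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


STEPSIZE = 13106      # value quoted in the comment above
MAX_ITER = 600000     # value quoted in the comment above
BASE_LR = 1.0         # hypothetical, for illustration only
GAMMA = 0.5           # hypothetical, for illustration only

# Number of times the learning rate is reduced before training ends.
num_drops = MAX_ITER // STEPSIZE

for it in (0, STEPSIZE - 1, STEPSIZE, 2 * STEPSIZE):
    print(it, step_lr(it, BASE_LR, GAMMA, STEPSIZE))
print(num_drops)
```

Under this policy, a smaller stepsize means more frequent reductions, so the choice trades off how long the rate stays high against how many reductions fit into the planned number of iterations.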

