Comments (10)

dluvizon commented on July 23, 2024

Hi @ashar-ali ,

  1. For MPII single-person, there is a standard split between training and validation samples.
    I uploaded the file mpii_annotations.mat with this split.

  2. We use soft-argmax to regress joint coordinates directly, so the method does not rely on generating GT heatmaps. Please take a look at the paper for more details.
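As a rough illustration of the idea (a NumPy sketch, not the repository's actual implementation), soft-argmax is a softmax over the heatmap followed by the expected coordinate under that distribution, which makes the arg-max differentiable:

```python
import numpy as np

def soft_argmax_2d(heatmap):
    """Softmax over the heatmap, then the expected (x, y) coordinate."""
    h, w = heatmap.shape
    probs = np.exp(heatmap - heatmap.max())   # stable softmax
    probs /= probs.sum()
    ys, xs = np.mgrid[0:h, 0:w]               # per-pixel row/col indices
    return float((probs * xs).sum()), float((probs * ys).sum())

# A heatmap sharply peaked at (x=12, y=5) yields coordinates near (12, 5).
hm = np.zeros((32, 32))
hm[5, 12] = 50.0
x, y = soft_argmax_2d(hm)
```

Because the output is an expectation rather than a hard index, gradients flow through the coordinates, so no GT heatmaps are needed for supervision.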

Best,

from deephar.

ashar-ali commented on July 23, 2024

Oh great,

Figured out the soft-argmax thing. But could you please re-verify the download link you provided above? It throws a 404 Not Found error when I click on it. Also, I remember you released the weights for MPII yesterday, but I am not able to access them today.

Would be a great help.

Thanks @dluvizon

dluvizon commented on July 23, 2024

The links should work now!

ashar-ali commented on July 23, 2024

Great,

Thanks a ton for providing these. Do you also plan to release models for pose estimation and/or activity recognition on the Penn Action dataset anytime soon?

Thanks,

dluvizon commented on July 23, 2024

I am planning to release the weights finetuned for action, for both Penn and NTU.

ashar-ali commented on July 23, 2024

Thanks @dluvizon ,

As a sanity check experiment, I was also trying to train the action recognition nets independently for a few epochs.

By independently, I mean I just took the pose ground truth and tried to learn actions with categorical cross-entropy.
Similarly, I extracted the appearance features and pose heat maps offline and tried to learn action categories with the hyperparameters mentioned in Appendix B of the paper.
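To make that "independent" setup concrete, here is a toy sketch of such a baseline: a linear softmax classifier trained with categorical cross-entropy on random stand-in features (all shapes, names, and data are illustrative, not deephar's API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: N clips, each summarized by a flattened pose feature vector
# (T frames x J joints x 2 coords), classified into the 15 Penn Action classes.
N, T, J, C = 64, 16, 13, 15
X = rng.normal(size=(N, T * J * 2))
y = rng.integers(0, C, size=N)

W = np.zeros((X.shape[1], C))
for _ in range(200):                                 # plain gradient descent
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)                # softmax probabilities
    loss = -np.log(p[np.arange(N), y]).mean()        # categorical cross-entropy
    onehot = np.eye(C)[y]
    W -= 0.1 * X.T @ (p - onehot) / N                # gradient of the loss

acc = (np.argmax(X @ W, axis=1) == y).mean()
```

Even this trivial model should drive the training loss well below the chance-level value of log(15) within a few hundred steps, which is the kind of sanity signal one would expect from the experiment described above.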

Questions-

  1. For both of the above cases, I could only get accuracy close to ~8% on the training data itself within 3-4 epochs. Is this performance expected, or should I get at least some prior accuracy with this kind of offline independent learning?

  2. Do you suggest it is better to jump directly to learning jointly with the pose estimator network (after 2 epochs), as mentioned in the paper?

P.S.: all the discussion above is based on experiments I did on the Penn Action dataset. Features and probability heat maps were extracted from the 2D pose estimator (trained on MPII) that you provided.

dluvizon commented on July 23, 2024

Hi @ashar-ali ,

  1. Considering that PennAction contains 15 classes, you are getting random predictions.
    In my experience, after 3-4 epochs it should be close to 80% using visual features and a bit lower using only pose.

  2. If your net is not learning at all, I guess that the problem is not with the pose data.

PennAction is a pretty easy dataset, so even a naive method should attain 80% relatively fast.
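For context, the ~8% figure reported earlier sits right at chance level for this dataset, which is easy to verify:

```python
# Uniform random guessing over the 15 Penn Action classes.
n_classes = 15
chance = 1.0 / n_classes   # ~0.067, i.e. about 6.7%
# A training accuracy of ~8% is consistent with random predictions.
```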

ashar-ali commented on July 23, 2024

Hi @dluvizon ,

Thanks for sharing these insights. It could also be because I am not using any kind of data augmentation for now, since I was doing a proof of concept, and the architecture is not converging because of that.

Now that you have uploaded the full code for the action model as well as the weights, I will try to reproduce its results.

Can you please point me to the annotations.mat file for the Penn Action dataset?
If you could just verify that I am encoding the labels correctly, that would be great:

'baseball_pitch', - 0
'baseball_swing', - 1
'bench_press', - 2
'bowl', - 3
'clean_and_jerk', - 4
'golf_swing', - 5
'jump_rope', - 6
'jumping_jacks', - 7
'pullup', - 8
'pushup', - 9
'situp', - 10
'squat', - 11
'strum_guitar', - 12
'tennis_forehand', - 13
'tennis_serve' - 14
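Assuming that ordering, the mapping above can be written as a lookup table (the constant names here are mine, not deephar's):

```python
# Index encoding for the 15 Penn Action classes, in the order listed above.
PENN_ACTIONS = [
    'baseball_pitch', 'baseball_swing', 'bench_press', 'bowl',
    'clean_and_jerk', 'golf_swing', 'jump_rope', 'jumping_jacks',
    'pullup', 'pushup', 'situp', 'squat',
    'strum_guitar', 'tennis_forehand', 'tennis_serve',
]
LABEL_TO_INDEX = {name: i for i, name in enumerate(PENN_ACTIONS)}
```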

dluvizon commented on July 23, 2024

Hi,

The file should be OK now at https://github.com/dluvizon/deephar/releases/download/v0.3/penn_annotations.mat

You can check the penn action labels by doing:

    print(penn_seq.action_labels)

just after loading the dataset. That gives:

['baseball_pitch' 'baseball_swing' 'bench_press' 'bowl' 'clean_and_jerk'
 'golf_swing' 'jump_rope' 'jumping_jacks' 'pullup' 'pushup' 'situp' 'squat'
 'strum_guitar' 'tennis_forehand' 'tennis_serve']

which corresponds to your list.

ashar-ali commented on July 23, 2024

Sounds great,

Thanks a lot for all your help @dluvizon
