
project-actionposemotion's People

Contributors

kulbear, mayoudong1993

project-actionposemotion's Issues

Refinement of detected 2D keypoints

The quality of the 2D detections fed to the pose2motion model is crucial. Should we first try to improve the 2D detection results by exploiting the sequential nature of the poses?
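One simple way to exploit the sequence structure is temporal smoothing of each joint. The sketch below is only an illustration under assumptions, not the refinement method this issue settles on: the function name `smooth_keypoints`, the `radius` parameter, and the list-of-(x, y) pose layout are all hypothetical, and a centered moving average stands in for whatever filter or learned refinement is eventually chosen.

```python
from typing import List, Tuple

Pose2D = List[Tuple[float, float]]  # one (x, y) pair per joint (assumed layout)


def smooth_keypoints(sequence: List[Pose2D], radius: int = 1) -> List[Pose2D]:
    """Centered moving average over time, applied to each joint independently.

    Windows are truncated at the sequence boundaries, so the output has the
    same number of frames as the input.
    """
    num_frames = len(sequence)
    smoothed = []
    for t in range(num_frames):
        lo, hi = max(0, t - radius), min(num_frames, t + radius + 1)
        window = sequence[lo:hi]
        frame = []
        for j in range(len(sequence[t])):
            x = sum(pose[j][0] for pose in window) / len(window)
            y = sum(pose[j][1] for pose in window) / len(window)
            frame.append((x, y))
        smoothed.append(frame)
    return smoothed


# One joint whose x coordinate has a detection spike at frame 2.
noisy = [[(0.0, 0.0)], [(1.0, 0.0)], [(8.0, 0.0)], [(3.0, 0.0)], [(4.0, 0.0)]]
smoothed = smooth_keypoints(noisy, radius=1)
```

A larger `radius` suppresses more jitter but also blurs fast motion, so the window size would need to be tuned against the frame rate of the data.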

Transfer learning (self-supervised?)

Might be a meta issue.

  1. Our intuition for adding action recognition was that it gives (weak?) conditional evidence for the motion prediction task. It also makes sense if we view it as a pretext task in transfer/self-supervised learning. I would say it sits more on the transfer learning side, since action type annotations can't be generated automatically and require human annotation.
  • First train the RNN encoder to predict actions from 2D pose sequences
  • First train the RNN encoder to predict both 3D pose sequences and actions from 2D pose sequences
  2. Build pretext tasks for the RNN decoder part as well.
  • At each time step t, the input is a dropout version of the pose (some joints randomly dropped) and the task is to recover the full pose
  • At each time step t, the input is a dropout version of the pose (some joints randomly dropped) and the task is to still predict the pose at time t+1
  3. Other pretext/auxiliary tasks for transfer and self-supervised methods...
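The two decoder pretext tasks above can be sketched as a data-preparation step. This is a minimal illustration under assumptions: `drop_joints`, `make_pretext_pairs`, the `drop_prob` parameter, and the choice to replace a dropped joint with zeros are all hypothetical, and poses are assumed to be lists of coordinate tuples.

```python
import random
from typing import List, Sequence, Tuple

Joint = Tuple[float, ...]  # (x, y) or (x, y, z); assumed layout
Pose = List[Joint]


def drop_joints(pose: Sequence[Joint], drop_prob: float,
                rng: random.Random) -> Tuple[Pose, List[bool]]:
    """Return a corrupted copy of `pose` and a mask marking the dropped joints.

    Dropped joints are replaced with zeros (an assumption; a mask token or
    the mean pose would be alternatives).
    """
    dropped = [rng.random() < drop_prob for _ in pose]
    corrupted = [
        tuple(0.0 for _ in joint) if is_dropped else joint
        for joint, is_dropped in zip(pose, dropped)
    ]
    return corrupted, dropped


def make_pretext_pairs(sequence: Sequence[Pose], drop_prob: float, seed: int = 0):
    """Build (input, target) pairs for both pretext tasks.

    The last frame is skipped so that both task lists stay aligned
    (it has no t+1 target).
    """
    rng = random.Random(seed)
    recover, predict_next = [], []
    for t in range(len(sequence) - 1):
        corrupted, _ = drop_joints(sequence[t], drop_prob, rng)
        recover.append((corrupted, sequence[t]))           # target: pose at t
        predict_next.append((corrupted, sequence[t + 1]))  # target: pose at t+1
    return recover, predict_next


sequence = [[(1.0, 2.0), (3.0, 4.0)], [(1.5, 2.5), (3.5, 4.5)], [(2.0, 3.0), (4.0, 5.0)]]
recover, predict_next = make_pretext_pairs(sequence, drop_prob=0.5)
```

Returning the drop mask alongside the corrupted pose leaves room to restrict the reconstruction loss to the dropped joints, should that turn out to work better.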

Action recognition with pose and motion

Use the predicted 3D pose sequence (time t - n to t) and the predicted motion (time t + 1 to t + m) to perform action recognition.

See the multitask pose estimation and action recognition paper for details.

Dataloader for NTU-RGBD dataset

We already have a dataloader for the H3.6M dataset, which is mostly used for pose estimation. We'd like to add NTU-RGBD for evaluating our action recognition component.

I suggest adding a keyword argument to control the dataset mode. It could look something like this:

from torch.utils.data import Dataset

class NTUDataset(Dataset):
    # keyword-only arguments must come before **kwargs
    def __init__(self, *args, include_image: bool = False, **kwargs):
        assert isinstance(include_image, bool), '"include_image" must be a boolean type.'
        ...

Depending on a control argument, the __getitem__ method of the dataset class should return the following data:

# this is just an example
ACTION_INDEX_MAP = {'Directions': 0,
                    'Discussion': 1,
                    'Eating': 2,
                    'Greeting': 3}
...

...
    def __getitem__(self, index):
        return {
            'pose_2d': out_2d_past, # the 2d ground truth pose from t-n to t
            'pose_3d': out_3d_past, # the 3d ground truth pose from t-n to t
            'future_pose_3d': out_3d_future, # the 3d ground truth pose from t+1 to t+m
            'action_type': ACTION_INDEX_MAP[self._actions[index]],  # action type; an integer index is fine
        }
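
The past and future windows returned above (t-n to t, and t+1 to t+m) could be cut from a per-frame sequence along these lines. This is a sketch under assumptions, not the project's loader: `split_window` and `valid_centers` are hypothetical helper names, and both windows are treated as inclusive at both ends.

```python
from typing import List, Sequence, Tuple


def split_window(frames: Sequence, t: int, n: int, m: int) -> Tuple[List, List]:
    """Return (past, future): frames t-n..t and frames t+1..t+m, both inclusive."""
    if t - n < 0 or t + m >= len(frames):
        raise IndexError("window does not fit in the sequence")
    past = list(frames[t - n : t + 1])
    future = list(frames[t + 1 : t + m + 1])
    return past, future


def valid_centers(num_frames: int, n: int, m: int) -> range:
    """All time steps t for which both windows fit inside the sequence."""
    return range(n, num_frames - m)


# Integers stand in for per-frame poses.
frames = list(range(10))
past, future = split_window(frames, t=4, n=2, m=3)
centers = list(valid_centers(len(frames), n=2, m=3))
```

With this convention the past window holds n + 1 frames and the future window holds m frames, and the dataset length per sequence would be `len(valid_centers(...))`.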
