project-actionposemotion's Issues
Visualization for Pose and Motion
Refinement of detected 2D keypoints
The quality of the 2D detections fed into the pose2motion model is crucial. Should we first try to improve the 2D detection results by exploiting the sequential nature of the poses?
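One simple baseline for exploiting the sequence structure is to smooth each joint's trajectory over time before it reaches the model. The sketch below is a minimal example, assuming the keypoints arrive as a NumPy array of shape `(T, J, 2)`. The helper name `smooth_keypoints` and the centered moving-average filter are illustrative choices, not something the project has settled on.

```python
import numpy as np


def smooth_keypoints(keypoints: np.ndarray, window: int = 5) -> np.ndarray:
    """Centered moving average over time for a (T, J, 2) keypoint array.

    Hypothetical helper: the array layout and the filter are assumptions.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Repeat the first and last frame so the output keeps all T frames.
    padded = np.pad(keypoints, ((half, half), (0, 0), (0, 0)), mode="edge")
    num_frames = keypoints.shape[0]
    # Sum `window` time-shifted copies of the sequence, then normalize.
    total = sum(padded[i:i + num_frames] for i in range(window))
    return total / window
```

A learned refinement module could replace the fixed filter later; the moving average mainly gives a cheap reference point to compare against.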
Lie group skeleton representation and dataset preparation
Re-projection of 3D pose sequence to 2D keypoints?
In the encoder, when we lift the 2D keypoints to 3D poses, can we re-project the predicted 3D pose back to 2D, compute a loss against the input keypoints, and backpropagate through it?
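A rough sketch of what the re-projection term could look like, assuming a pinhole camera with known intrinsics and 3D poses expressed in camera coordinates. It uses NumPy to show the computation only; in training, the same expression would be written with the framework's differentiable tensor operations so gradients reach the predicted 3D pose. The names `project_to_2d`, `reprojection_loss`, `focal`, and `center` are hypothetical.

```python
import numpy as np


def project_to_2d(pose_3d, focal, center):
    """Pinhole projection of (..., J, 3) camera-space joints to (..., J, 2) pixels.

    focal and center are length-2 arrays (fx, fy) and (cx, cy); both are assumed known.
    """
    xy = pose_3d[..., :2]
    z = pose_3d[..., 2:3]  # keep the last axis so the division broadcasts
    return focal * xy / z + center


def reprojection_loss(pred_3d, target_2d, focal, center):
    """Mean Euclidean distance between re-projected and detected 2D keypoints."""
    pred_2d = project_to_2d(pred_3d, focal, center)
    return np.linalg.norm(pred_2d - target_2d, axis=-1).mean()
```

If the 3D poses are root-relative rather than in camera space, a global translation would have to be estimated or supplied before projecting.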
Non-overlapping evaluation for Pose and Motion
In the sliding-window setting, the current evaluation of the pose sequence and the motion sequence covers overlapping regions.
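To make the idea concrete, the sketch below enumerates evaluation windows whose stride equals the full window length (past frames plus future frames), so no frame is scored twice. The function name, the argument names, and the half-open index convention are assumptions for illustration.

```python
def non_overlapping_windows(num_frames, n_past, m_future):
    """Return (past_start, split, future_end) index triples.

    Past frames are [past_start, split); future frames are [split, future_end).
    Consecutive windows are placed back to back, so they never overlap.
    """
    length = n_past + m_future
    if length <= 0:
        raise ValueError("n_past + m_future must be positive")
    windows = []
    for start in range(0, num_frames - length + 1, length):
        windows.append((start, start + n_past, start + length))
    return windows
```

Frames at the tail that do not fill a complete window are dropped in this sketch; whether to pad or discard them is a separate decision.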
Trajectory refinement over the entire sequence (pose + motion).
Drifting prevention by using temporal weighted loss for motion generation
The generated motion can start drifting after a few frames, or become static.
Using a temporally weighted loss for motion generation might help with this issue.
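The issue does not say which frames should receive more weight, so the sketch below leaves that as a parameter: it uses normalized exponential weights proportional to `gamma ** k` for the k-th predicted frame. Both the exponential schedule and the function names are assumptions, one possible reading of "temporal weighted loss" rather than an agreed design.

```python
def temporal_weights(num_steps, gamma):
    """Normalized weights proportional to gamma**k for k = 0..num_steps-1."""
    if num_steps <= 0 or gamma <= 0:
        raise ValueError("num_steps and gamma must be positive")
    raw = [gamma ** k for k in range(num_steps)]
    total = sum(raw)
    return [w / total for w in raw]


def temporally_weighted_loss(per_frame_errors, gamma=1.1):
    """Weighted mean of per-frame errors.

    gamma > 1 emphasizes later frames, gamma < 1 emphasizes earlier frames,
    and gamma == 1 reduces to the plain mean.
    """
    weights = temporal_weights(len(per_frame_errors), gamma)
    return sum(w * e for w, e in zip(weights, per_frame_errors))
```

With `gamma > 1` the later frames, where drifting shows up, dominate the loss; with `gamma < 1` the model is pushed to get the first frames right so that errors do not compound.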
Mean angle error evaluation
Transfer learning (self-supervised?)
Might be a meta issue.
- The intuition behind adding action recognition was to give (weak?) conditional evidence for the motion prediction task. If we instead think of it as a pretext task in transfer/self-supervised learning, it also works. I would say this is closer to transfer learning, since the action-type annotations cannot be generated automatically and require human labeling.
- First train the RNN encoder to predict actions from 2D pose sequences.
- First train the RNN encoder to predict both 3D pose sequences and actions from 2D pose sequences.
- Build pretext tasks for the RNN decoder as well.
  - At each time step t, the input is a dropout version of the pose (some joints randomly dropped), and the model tries to recover the pose.
  - At each time step t, the input is a dropout version of the pose (some joints randomly dropped), and the model still tries to predict the pose at time t+1.
- Other pretext/auxiliary tasks for transfer and self-supervised methods...
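The two decoder pretext tasks in the list above share the same corrupted input and differ only in the target. A minimal sketch of the data side, assuming poses are NumPy arrays of shape `(J, D)` and that a dropped joint is represented by zeros. The zero-fill convention and the names `drop_joints` and `make_pretext_batch` are hypothetical.

```python
import numpy as np


def drop_joints(pose, drop_prob, rng):
    """Zero out each joint of a (J, D) pose independently with probability drop_prob.

    Returns the corrupted pose and a boolean keep-mask of shape (J,).
    """
    keep = rng.random(pose.shape[0]) >= drop_prob
    return pose * keep[:, None], keep


def make_pretext_batch(sequence, drop_prob, rng):
    """Build inputs and targets from a (T, J, D) pose sequence.

    Inputs are corrupted poses at t = 0..T-2. The targets are the clean pose at
    time t (recovery task) and the clean pose at time t+1 (prediction task).
    """
    inputs = np.stack([drop_joints(p, drop_prob, rng)[0] for p in sequence[:-1]])
    recover_targets = sequence[:-1]
    predict_targets = sequence[1:]
    return inputs, recover_targets, predict_targets
```

Passing an explicit generator such as `np.random.default_rng(0)` keeps the corruption reproducible across runs.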
Action recognition with pose and motion
Use the predicted 3D pose sequence (time `t - n` to `t`) and the predicted motion (time `t + 1` to `t + m`) to perform action recognition.
See the multitask pose estimation and action recognition paper for details.
Try attention-based recurrent model?
Use Transformer (or self-attention architecture) for sequence processing
Update the H36M dataloader for fetching data in different mode
For example, whether to include images or not, as mentioned in #9.
Add normal and mask prediction module to the 2D detection phase
Siamese network architecture on distinguishing targets
Potential targets:
- subject
- view
- action
Use `argparse` for handling experiment configurations
Using an extra file creates another layer of complexity for experiment version control.
Decided to make it a Python `argparse` thing.
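A minimal sketch of what an `argparse`-based configuration might look like. Every option name and default below is a placeholder chosen for illustration; none of them comes from the project.

```python
import argparse


def build_parser():
    """Hypothetical experiment options; names and defaults are placeholders."""
    parser = argparse.ArgumentParser(description="Pose and motion experiment")
    parser.add_argument("--dataset", choices=["h36m", "ntu"], default="h36m")
    parser.add_argument("--include-image", action="store_true",
                        help="also load image frames in the dataloader")
    parser.add_argument("--past-frames", type=int, default=10)
    parser.add_argument("--future-frames", type=int, default=10)
    parser.add_argument("--lr", type=float, default=1e-3)
    return parser


# Parse an explicit argument list here; a training script would call
# parse_args() with no arguments to read the command line instead.
args = build_parser().parse_args(["--dataset", "ntu", "--include-image"])
```

Since every run is then fully described by its command line, logging `vars(args)` next to the results is enough to reproduce an experiment without a separate config file.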
Dataloader for NTU-RGBD dataset
We already have a dataloader for the H3.6M dataset, which is mostly used for pose estimation. We'd like to add NTU-RGBD for evaluating our action recognition component.
I suggest we add a keyword argument to control the dataset mode. It could be something like:
# assuming PyTorch's Dataset base class
from torch.utils.data import Dataset

class NTUDataset(Dataset):
    # keyword-only arguments must come before **kwargs
    def __init__(self, *args, include_image: bool = False, **kwargs):
        assert isinstance(include_image, bool), '"include_image" must be a boolean type.'
        ...
The `__getitem__` method in the dataset class should return the following data, depending on a control argument:
# this is just an example
ACTION_INDEX_MAP = {'Directions': 0,
                    'Discussion': 1,
                    'Eating': 2,
                    'Greeting': 3}
...
...
def __getitem__(self, index):
    return {
        'pose_2d': out_2d_past,  # the 2d ground truth pose from t-n to t
        'pose_3d': out_3d_past,  # the 3d ground truth pose from t-n to t
        'future_pose_3d': out_3d_future,  # the 3d ground truth pose from t+1 to t+m
        'action_type': ACTION_INDEX_MAP[self._actions[index]]  # action type, represented by a number is fine.
    }