
This project forked from chengeyang/human-pose-estimation-benchmarking-and-action-recognition


Deep Learning Project.


Human Pose Estimation Benchmarking and Action Recognition

Deep Learning Project, Winter 2019, Northwestern University

Group members: Chenge Yang, Zhicheng Yu, Feiyu Chen


Results

1. Human Pose Estimation Benchmarking

Multi-Person (left: AlphaPose, right: OpenPose)

Single-Person (left: AlphaPose, right: OpenPose)

2. Action Recognition


Introduction

This project contains two main parts:

1. Human Pose Estimation Benchmarking

In this part, we benchmarked two state-of-the-art human pose estimation models, OpenPose and AlphaPose. We tested different modes of each model in both single-person and multi-person scenarios.

2. Online Skeleton-Based Action Recognition

Real-time multi-person human action recognition based on tf-pose-estimation. The pipeline is as follows:

  • Real-time multi-person pose estimation via tf-pose-estimation
  • Feature Extraction
  • Multi-person action recognition using TensorFlow / Keras

Dependencies and Installation

1. Human Pose Estimation Benchmarking

See installation_benchmarking.md.

2. Online Skeleton-Based Action Recognition

See installation_action_recognition.md.


Usage

Human Pose Estimation Benchmarking

Training Action Recognition Model

  • Copy your dataset (must be a .csv file) into the /data folder
  • Run training.py with the following command:
python3 src/training.py --dataset [dataset_filename]
  • The trained model is saved in the /model folder

Real-time Action Recognition

  • To see our multi-person action recognition result using your webcam, run run_detector.py with the following command:
python3 src/run_detector.py --images_source webcam

Benchmarking Results

Requirements

  • OS: Ubuntu 18.04
  • CPU: AMD Ryzen Threadripper 1920X (12-core / 24-thread)
  • GPU: Nvidia GTX 1080 Ti - 11 GB
  • RAM: 64GB
  • Webcam: Creative 720p Webcam

1. Multi-person

Benchmark on a 1920x1080 video with 902 frames, 30fps
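For reading the benchmark figures, the helper below is a minimal sketch of how throughput can be derived from a run on this 902-frame, 30 fps video. The function name and the 90.2 s processing time are hypothetical and only illustrate the arithmetic; they are not measured results from this project.

```python
def benchmark_summary(num_frames, video_fps, processing_seconds):
    """Derive throughput figures for one benchmark run."""
    video_seconds = num_frames / video_fps           # playback length of the clip
    processed_fps = num_frames / processing_seconds  # frames handled per second
    realtime_factor = processed_fps / video_fps      # 1.0 means real-time speed
    return {
        "video_seconds": video_seconds,
        "processed_fps": processed_fps,
        "realtime_factor": realtime_factor,
    }


# Hypothetical timing (not a measured result): 902 frames in 90.2 seconds.
summary = benchmark_summary(num_frames=902, video_fps=30.0, processing_seconds=90.2)
print(summary)
```

With this made-up timing the model would run at roughly 10 frames per second, about one third of real-time speed for a 30 fps source.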

2. Single-person

Benchmark on a 1920x1080 video with 902 frames, 30fps


Implementation (Action Recognition)

Collecting training data

We collected 3916 training images from our laptop's webcam to train the model to classify five actions: squat, stand, punch, kick, and wave. Each training image contains only one person performing one of these five actions. The videos were recorded at 10 fps with a frame size of 640 x 480 and then saved as images.

The examples and the numbers of training images for each action class are shown below:

squat stand punch kick wave

Get Skeleton from Image

We used tf-pose-estimation to detect the human pose in each training image. The output skeleton format of OpenPose can be found at OpenPose Demo - Output.

The generated training data files are located in the data folder.

Feature Extraction

To convert the raw skeleton data into input for our neural network, we implemented three feature extraction methods in data_preprocessing.py:

  1. Head reference: all joint positions are converted to the x-y coordinates relative to the head joint.
  2. Pose to angle: the 18 joint positions are converted to 8 joint angles: left / right shoulder, left / right elbow, left / right hip, left / right knee.
  3. Normalization: all joint positions are converted to the x-y coordinates relative to the skeleton bounding box.
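The first two features can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code in data_preprocessing.py: the function names are hypothetical, a skeleton is assumed to be a list of (x, y) tuples, and the head joint is assumed to sit at index 0.

```python
import math


def to_head_reference(joints, head_index=0):
    """Express every (x, y) joint relative to the head joint.

    head_index=0 is an assumption; use whichever index holds the head.
    """
    head_x, head_y = joints[head_index]
    return [(x - head_x, y - head_y) for x, y in joints]


def joint_angle(a, b, c):
    """Angle in radians at joint b, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        return 0.0  # degenerate case: two joints coincide
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


# Example: elbow angle from shoulder, elbow, and wrist positions.
shoulder, elbow, wrist = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
relative = to_head_reference([(2.0, 3.0), (4.0, 7.0)])
```

Applying joint_angle to each of the eight joints listed above (with its two neighbouring joints) would yield the 8-value angle feature.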

We use the third feature (normalization), which gave the best results and robustness.
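Bounding-box normalization can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name is hypothetical, and it assumes every joint was detected, so missing joints would need to be filtered out first.

```python
def normalize_to_bounding_box(joints):
    """Rescale (x, y) joints so the skeleton's bounding box spans [0, 1] x [0, 1]."""
    xs = [x for x, _ in joints]
    ys = [y for _, y in joints]
    x_min, y_min = min(xs), min(ys)
    # Fall back to 1.0 to avoid dividing by zero for a flat bounding box.
    width = (max(xs) - x_min) or 1.0
    height = (max(ys) - y_min) or 1.0
    return [((x - x_min) / width, (y - y_min) / height) for x, y in joints]


# Example with three joints given in pixel coordinates.
normalized = normalize_to_bounding_box([(100.0, 50.0), (200.0, 250.0), (150.0, 150.0)])
print(normalized)
```

Because the output no longer depends on where the person stands in the frame or how large they appear, the same pose produces similar features at different distances from the camera.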

Deep Learning model

We built our deep learning model with reference to Online-Realtime-Action-Recognition-based-on-OpenPose. The model is implemented in training.py using Keras and TensorFlow. It consists of three hidden layers and a softmax output layer that performs 5-class classification.
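To show the shape of such a classifier, here is a framework-free forward pass in plain Python. It is not the Keras model from training.py. The layer widths, the ReLU activation, the 36-value input (18 joints with x and y), and the random untrained weights are all assumptions made for illustration.

```python
import math
import random

ACTIONS = ["squat", "stand", "punch", "kick", "wave"]


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Convert logits to probabilities; subtracting the max keeps exp() stable."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random small weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(features, layers):
    """Run the hidden layers with ReLU, then the softmax output layer."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))


rng = random.Random(0)
sizes = [36, 64, 32, 16, 5]  # hypothetical: input, three hidden layers, 5 classes
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
probabilities = predict([0.5] * 36, layers)
label = ACTIONS[probabilities.index(max(probabilities))]
```

The predicted action is the class with the highest softmax probability. With untrained weights the label is meaningless, so the sketch only demonstrates the data flow.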

The trained model is saved in the model folder.


Acknowledgement
