varun-tandon14 / implementation-of-cross-view-tracking-for-multi-human-3d-pose-estimation-at-over-100-fps

Implementation of the paper "Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS"

(Unofficial) implementation of the paper "Cross-View Tracking for Multi-Human 3D Pose Estimation at over 100 FPS" by Chen et al. Paper link. Before starting, please visit the original repo, where the authors provide 2D pose keypoints (detections), 3D pose (targets) tracking results, camera calibration, and visualization functions. This repo extends that functionality to implement the algorithm itself. For 3D pose visualization we use matplotlib rather than vispy, due to installation difficulties with the latter. Thanks to the original authors for their excellent work.

Get dataset:

Currently, the code has only been tested on the Campus dataset, but since values are not hard-coded it should ideally run without errors on other datasets as well. Campus dataset info - OneDrive download link

Please download the dataset from OneDrive and extract the zip into a folder named Campus_Seq1. The dataset folder should then look like this.

Dependencies:

  1. numpy
  2. pandas
  3. opencv
  4. scipy
  5. tqdm
  6. matplotlib

Implementation details:

  1. Thanks to the author of the original repo for making the visualization and calibration code publicly available. The graph partitioning problem solver was also provided by the author here. Kudos.
  2. I have extended the original camera.py and calibration.py to support my implementation. To keep the code easy to use, everything from the config, helper functions, and algorithms to the visualization lives in a single notebook.
  3. The original repo suggests vispy, but its installation can be complicated. Using matplotlib animation is more convenient, so we need not worry about vispy here.
  4. At the end of a run, the algorithm saves details to a log file. Please go through the logs after running the code to get a feel for the algorithm.
  5. The code currently runs below 100 FPS, as it is severely unoptimized. It was written to quickly implement the paper to the best of my ability.
  6. The authors provide the original IDs for the 2D keypoints (detections) and 3D poses (targets) in the files annotation_2d and annotation_3d respectively. This implementation uses only the 2D pose keypoints, not the IDs from annotation_2d (the IDs are preassigned, so you could use them directly for triangulation).
  7. Looking at the IDs in annotation_3d, the authors probably used ReID to obtain their results. I have not implemented ReID, since it is not part of Algorithm 1 in the paper.
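The triangulation step referenced in point 6 can be sketched with the standard linear "eigen" (DLT) method over matched 2D keypoints. This is only an illustrative sketch, not code from the notebook; the function name and the toy projection matrices below are my own:

```python
import numpy as np

def triangulate_linear_eigen(points_2d, proj_mats):
    """Triangulate one 3D point from its 2D observations in several views.

    points_2d: list of (x, y) pixel coordinates, one per camera
    proj_mats: list of 3x4 camera projection matrices (same order)

    Linear eigen (DLT) method: each view contributes two rows to A,
    and the homogeneous 3D point is the right singular vector of A
    with the smallest singular value.
    """
    rows = []
    for (x, y), P in zip(points_2d, proj_mats):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.asarray(rows)
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]            # homogeneous solution
    return X[:3] / X[3]   # dehomogenize

# Toy example with two synthetic cameras (illustrative values only)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])                  # camera at origin
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])  # shifted along x
X_true = np.array([0.5, 0.2, 4.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X_est = triangulate_linear_eigen([x1, x2], [P1, P2])
```

With noiseless observations the estimate recovers the true point exactly; with real detections the solution minimizes an algebraic (not geometric) error, which is exactly the limitation noted under "Possible improvements".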

Qualitative results (Campus dataset):

Screenshot for a single timestamp:

Please see the tracking results for a single timestamp (at 41.72 sec) from three different camera angles below:

[Screenshots: tracking results from the three camera views]

Complete animation:

cross_view_tracking_3D_plot_campus_dataset_rotate_30_fps.mp4

Possible improvements:

  1. Implement ReID to handle a person re-entering the scene.
  2. Optimize the codebase for higher execution speed.
  3. Improve velocity estimation: it is currently computed from a two-point difference rather than a multi-point linear regression.
  4. Refine the triangulation method: we currently use the linear eigen method; ideally we should at least use non-iterative L2 methods. Please help us, mrcal.
  5. Enhance the visualization to include skeletons or SMPL models.
  6. Add quantitative results.
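The velocity-estimation improvement in point 3 is straightforward to sketch: instead of differencing the last two samples, fit a least-squares line over a short window of recent joint positions. A minimal sketch (function names and window values are my own, not from this repo):

```python
import numpy as np

def velocity_two_point(times, positions):
    """Velocity from the last two samples (the current approach)."""
    dt = times[-1] - times[-2]
    return (positions[-1] - positions[-2]) / dt

def velocity_regression(times, positions):
    """Velocity as the slope of a least-squares line fit over all
    samples in the window; less sensitive to per-frame jitter."""
    t = np.asarray(times, dtype=float)
    p = np.asarray(positions, dtype=float)  # shape (n, 3) for a 3D joint
    t_c = t - t.mean()
    # OLS slope, computed independently per coordinate
    return (t_c @ (p - p.mean(axis=0))) / (t_c @ t_c)

# A joint moving at constant velocity, sampled over a 6-frame window at 25 FPS
times = np.arange(6) / 25.0
true_v = np.array([1.0, 0.0, -0.5])
positions = times[:, None] * true_v
v2 = velocity_two_point(times, positions)   # exact here; noisy in practice
vr = velocity_regression(times, positions)  # averages over the whole window
```

On noiseless constant-velocity data both estimators agree; with detection noise, the regression averages the jitter across the window while the two-point difference amplifies it by 1/dt.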

Additional Note:

The repo is currently unlicensed because the license information in the original repo is unclear to me. I will update this repo when that becomes clear. I have no affiliation with AiFi Inc.
