shaw-git / lightweight-head-pose-estimation

Code for TMM 2022 "Accurate Head Pose Estimation Using Image Rectification and Lightweight Convolutional Neural Network"

License: MIT License

Python 100.00%
deep-learning pytorch head-pose-estimation lightweight-neural-network

Lightweight Head Pose Estimation

This is the official implementation of "Accurate Head Pose Estimation Using Image Rectification and Lightweight Convolutional Neural Network".

Abstract

Head pose estimation is an important step for many human-computer interaction applications such as face detection, facial recognition, and facial expression classification. Accurate head pose estimation benefits these applications that require face image as the input. Most head pose estimation methods suffer from perspective distortion because the users do not always align their face perfectly with the camera. This paper presents a new approach that uses image rectification to reduce the negative effect of perspective distortion and a lightweight convolutional neural network to obtain highly accurate head pose estimation. The proposed method calculates the angle between the camera optical axis and the projection vector of the face center. The face image is rectified using this estimated angle through perspective transformation. A lightweight network with the size of only 0.88 MB is designed to take the rectified face image as the input to perform head pose estimation. The output of the network, the head pose estimation of the rectified face image, is transformed back to the camera coordinate system as the final head pose estimation. Experiments on public benchmark datasets show that the proposed image rectification and the newly designed lightweight network remarkably improve the accuracy of head pose estimation. Compared with state-of-the-art methods, our approach achieves both higher accuracy and faster processing speed.
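The rectification geometry described in the abstract can be illustrated with a minimal sketch. It is not the code from this repository: it assumes a pinhole camera with a known intrinsic matrix `K`, models the rectification as a pure rotation of the camera that brings the ray through the face center onto the optical axis, and uses hypothetical function names (`rotation_aligning`, `rectifying_rotation`, `rectifying_homography`, `pose_in_camera_frame`). The paper's exact formulation may differ.

```python
import numpy as np


def rotation_aligning(a, b):
    """Rotation matrix that turns unit vector a onto unit vector b (Rodrigues formula)."""
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    s = float(np.linalg.norm(v))
    if s < 1e-12:
        # Vectors already aligned. The anti-parallel case is not handled here;
        # it does not occur for a face in front of the camera.
        return np.eye(3)
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx * ((1.0 - c) / s ** 2)


def rectifying_rotation(K, face_center):
    """Return (R, angle): R rotates the face-center ray onto the optical axis,
    angle is the angle in radians between that ray and the optical axis."""
    u, v = face_center
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    ray /= np.linalg.norm(ray)
    z_axis = np.array([0.0, 0.0, 1.0])
    angle = float(np.arccos(np.clip(ray @ z_axis, -1.0, 1.0)))
    return rotation_aligning(ray, z_axis), angle


def rectifying_homography(K, R):
    """Homography induced by rotating the camera by R (assumed rectification model)."""
    return K @ R @ np.linalg.inv(K)


def pose_in_camera_frame(R, R_head_rectified):
    """Map a head rotation estimated in the rectified view back to the original camera frame."""
    return R.T @ R_head_rectified
```

Under these assumptions, the homography moves the face center to the principal point, so the network sees the face as if the camera were pointed straight at it. The matrix returned by `rectifying_homography` could be passed to `cv2.warpPerspective` to warp the frame, and `pose_in_camera_frame` undoes the rotation on the network's output.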

Result

Platform

  • GTX-1080Ti
  • Ubuntu

Dependencies

  • Anaconda
  • OpenCV
  • PyTorch
  • NumPy

How to run the code

python test_network.py [--input INPUT_VIDEO_PATH] [--output OUTPUT_VIDEO_PATH]

If you want to use your webcam, please set [--input "0"].
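As an illustration of why `"0"` selects the webcam, here is a minimal sketch of how a script might turn the `--input` string into a capture source. It is an assumption about the approach, not the actual contents of `test_network.py`: the helper `parse_video_source`, the function `build_parser`, and the default values are all hypothetical.

```python
import argparse


def parse_video_source(value):
    """Map a CLI string to a capture source.

    A string of digits such as "0" becomes an integer device index (webcam);
    anything else is treated as a path to a video file.
    """
    value = value.strip()
    return int(value) if value.isdigit() else value


def build_parser():
    # Hypothetical parser; the defaults here are assumptions for illustration.
    parser = argparse.ArgumentParser(description="Head pose estimation demo (sketch)")
    parser.add_argument("--input", default="0",
                        help='video file path, or a device index such as "0" for a webcam')
    parser.add_argument("--output", default=None,
                        help="optional path for the annotated output video")
    return parser
```

The value returned by `parse_video_source` could then be handed to `cv2.VideoCapture`, which accepts either an integer device index or a file path.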

If you want to use MTCNN for face detection, please install MTCNN and TensorFlow and run the following command.

python test_network_mtcnn.py [--input INPUT_VIDEO_PATH] [--output OUTPUT_VIDEO_PATH]

Datasets

The model provided in this repo is trained on 300W-LP. For more datasets used in our paper, please refer to the following links.
