Computer Pointer Controller

This project uses gaze estimation to control the computer's mouse pointer. The Intel OpenVINO Gaze Estimation model estimates the direction of the user's gaze, and the mouse pointer position is changed accordingly. The application uses four pre-trained models from the Model Zoo: face detection, head pose estimation, facial landmark detection, and gaze estimation.
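As a rough illustration of the final step, the sketch below turns a gaze vector into a relative pointer movement. It assumes the gaze estimation output is a three-component (x, y, z) direction vector with positive y pointing up, and that screen y grows downward. The function name `gaze_to_pointer_delta` and the `speed` scaling factor are hypothetical and are not taken from this project's source.

```python
def gaze_to_pointer_delta(gaze_vector, speed=100):
    """Map a gaze direction vector to a relative pointer move in pixels.

    Assumptions (not from the project source): gaze_vector is (x, y, z) with
    positive y pointing up, while screen coordinates grow downward, so the
    y component is flipped. `speed` is an arbitrary pixels-per-unit factor.
    """
    x, y, _z = gaze_vector
    dx = int(round(x * speed))
    dy = int(round(-y * speed))
    return dx, dy


if __name__ == "__main__":
    # Example: gaze slightly to the right and downward.
    print(gaze_to_pointer_delta((0.25, -0.5, -0.8)))
```

The resulting (dx, dy) pair could then be handed to a mouse-control library to move the pointer relative to its current position.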

Project Set Up and Installation (for Windows 10)

  • Install the OpenVINO™ toolkit and its dependencies to run the application. This project uses OpenVINO 2020.4. See the installation documentation here: https://docs.openvinotoolkit.org/latest/openvino_docs_install_guides_installing_openvino_windows_fpga.html

  • Create a virtual environment by running the following command (your-env is a placeholder for the environment name):

    python3 -m venv your-env

    Activate the virtual environment by running:

    your-env\Scripts\activate

  • Clone this repository.

  • Download the following models from the Model Zoo:

    Face Detection Model

    python /intel/openvino/deployment_tools/tools/model_downloader/downloader.py --name "face-detection-adas-binary-0001"

    Facial Landmark Detection Model

    python /intel/openvino/deployment_tools/tools/model_downloader/downloader.py --name "landmarks-regression-retail-0009"

    HeadPose Estimation Model

    python /intel/openvino/deployment_tools/tools/model_downloader/downloader.py --name "head-pose-estimation-adas-0001"

    Gaze Estimation Model

    python /intel/openvino/deployment_tools/tools/model_downloader/downloader.py --name "gaze-estimation-adas-0002"

    Model Zoo: https://download.01.org/opencv/2020/openvinotoolkit/2020.1/open_model_zoo/models_bin/1/
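The four download commands differ only in the model name, so they can be generated in a loop. The following is a minimal sketch that builds the argument lists without running them. The downloader path and model names are copied from the commands above, while the helper name `build_download_commands` is hypothetical.

```python
import sys

# Path and model names are copied from the download commands in this README.
DOWNLOADER = "/intel/openvino/deployment_tools/tools/model_downloader/downloader.py"
MODELS = [
    "face-detection-adas-binary-0001",
    "landmarks-regression-retail-0009",
    "head-pose-estimation-adas-0001",
    "gaze-estimation-adas-0002",
]


def build_download_commands(downloader=DOWNLOADER, models=MODELS):
    """Return one argument list per model, using the current Python interpreter."""
    return [[sys.executable, downloader, "--name", name] for name in models]


if __name__ == "__main__":
    # Only print the commands here; each list could instead be passed to
    # subprocess.run(command, check=True) to perform the download.
    for command in build_download_commands():
        print(" ".join(command))
```

Adjust `DOWNLOADER` if OpenVINO is installed in a different location.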

Demo

To run the project demo, execute the following from the 'src' folder:

python gaze_mouse_control.py -f face-detection-adas-binary-0001.xml -fl landmarks-regression-retail-0009.xml -hp head-pose-estimation-adas-0001.xml -g gaze-estimation-adas-0002.xml -i demo.mp4

If needed, replace the model paths with your own (see the Documentation section for descriptions of the command line arguments).

Documentation

OpenVINO documentation: https://docs.openvinotoolkit.org/latest/index.html

Application Command Line Arguments:

-h : help

-fl (required) : path to Facial Landmark Detection model's xml file

-f (required) : path to Face Detection model's xml file

-g (required) : path to Gaze Estimation model's xml file

-hp (required) : path to Head Pose Estimation model's xml file

-i (required) : path to input video file OR 'cam' for taking input from webcam

-l (optional) : absolute path to the CPU extension, needed if some model layers are not supported on the device.

-prob (optional) : probability threshold the face detection model uses to accept a detected face in the video frame.

-d (optional) : target device to run the models on. Options are: CPU, GPU, FPGA, MYRIAD.

-v (optional) : visualization flags for displaying the model outputs: fd for Face Detection, fld for Facial Landmark Detection, hp for Head Pose Estimation, ge for Gaze Estimation. Separate multiple flags with spaces.
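The argument list above could be expressed with `argparse` roughly as follows. This is a sketch reconstructed from the descriptions only, not the parser in gaze_mouse_control.py. The function name `build_argparser` and the default values for -prob and -d are assumptions.

```python
import argparse


def build_argparser():
    """Build a parser mirroring the documented arguments (a sketch, not the project's code)."""
    parser = argparse.ArgumentParser(description="Gaze-based mouse pointer controller")
    parser.add_argument("-f", required=True, help="path to Face Detection model's xml file")
    parser.add_argument("-fl", required=True, help="path to Facial Landmark Detection model's xml file")
    parser.add_argument("-hp", required=True, help="path to Head Pose Estimation model's xml file")
    parser.add_argument("-g", required=True, help="path to Gaze Estimation model's xml file")
    parser.add_argument("-i", required=True, help="path to input video file, or 'cam' for webcam")
    parser.add_argument("-l", default=None, help="absolute path to the CPU extension")
    # The 0.6 default is an assumption; the README does not state one.
    parser.add_argument("-prob", type=float, default=0.6, help="face detection probability threshold")
    # The CPU default is an assumption; the README only lists the options.
    parser.add_argument("-d", default="CPU", help="target device: CPU, GPU, FPGA or MYRIAD")
    parser.add_argument("-v", nargs="+", default=[], choices=["fd", "fld", "hp", "ge"],
                        help="visualization flags, separated by spaces")
    return parser


if __name__ == "__main__":
    example = ["-f", "face.xml", "-fl", "landmarks.xml", "-hp", "pose.xml",
               "-g", "gaze.xml", "-i", "cam", "-v", "fd", "ge"]
    print(build_argparser().parse_args(example))
```

Using `nargs="+"` for -v is one way to accept several space-separated flags in a single option, matching the usage described above.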

Benchmarks

Performance Analysis of FP32 Precision Models (in seconds):

Face detection:

  • Loading: 0.46
  • Inference time: 1.8

Facial Landmark detection:

  • Loading: 0.97
  • Inference time: 0.21

Head Pose detection:

  • Loading: 0.21
  • Inference time: 0.19

Gaze Estimation:

  • Loading: 0.52
  • Inference time: 0.27

Performance Analysis of FP16 Precision Models (in seconds):

Face detection: N/A

Facial Landmark detection:

  • Loading: 1.7
  • Inference time: 0.21

Head Pose detection:

  • Loading: 0.4
  • Inference time: 0.15

Gaze Estimation:

  • Loading: 0.3
  • Inference time: 0.26

Results

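In these measurements, the FP16 facial landmark and head pose models took longer to load than their FP32 counterparts, while the FP16 gaze estimation model loaded faster. Inference times with FP16 were equal or slightly lower.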
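The comparison can be checked directly from the numbers in the Benchmarks section. The sketch below computes the FP16 minus FP32 difference per model and metric. Face detection is left out because no FP16 figures are reported for it, and the function name `compare` is hypothetical.

```python
# Timings in seconds, copied from the Benchmarks section.
FP32 = {
    "Facial Landmark detection": {"loading": 0.97, "inference": 0.21},
    "Head Pose detection": {"loading": 0.21, "inference": 0.19},
    "Gaze Estimation": {"loading": 0.52, "inference": 0.27},
}
FP16 = {
    "Facial Landmark detection": {"loading": 1.7, "inference": 0.21},
    "Head Pose detection": {"loading": 0.4, "inference": 0.15},
    "Gaze Estimation": {"loading": 0.3, "inference": 0.26},
}


def compare(fp32, fp16):
    """Return FP16 minus FP32 per model and metric (negative means FP16 was faster)."""
    return {
        model: {
            metric: round(fp16[model][metric] - fp32[model][metric], 2)
            for metric in fp32[model]
        }
        for model in fp32
    }


if __name__ == "__main__":
    for model, diffs in compare(FP32, FP16).items():
        print(model, diffs)
```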

Contributors

alex01001
