
keyposs's Introduction

KeyPosS: Facial Landmark Detection through GPS-Inspired True-Range Multilateration

KeyPosS is a facial landmark detection method inspired by GPS technology. It addresses the limitations of traditional heatmap and coordinate regression techniques with an efficient and accurate approach.

KeyPosS uses a fully convolutional network to predict distance maps between points of interest (POIs) on a face and multiple anchor points. The predicted distances to these anchors are then decoded into precise POI positions via true-range multilateration.
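The decoding step can be sketched as a small least-squares problem: given anchor coordinates and predicted distances, the range equations are linearized against the first anchor and solved for the POI position. This is an illustrative NumPy implementation of generic true-range multilateration (our sketch, not the code released in this repository):

```python
import numpy as np

def multilaterate(anchors, dists):
    """Estimate a 2D point from distances to >= 3 non-collinear anchors.

    Subtracting the first range equation from the others eliminates the
    quadratic terms, leaving a linear system A p = b solved by least squares.
    """
    anchors = np.asarray(anchors, dtype=float)
    dists = np.asarray(dists, dtype=float)
    x0, d0 = anchors[0], dists[0]
    # (p - x_i).(p - x_i) = d_i^2; subtract the i = 0 equation:
    # 2 (x_i - x0) . p = d0^2 - d_i^2 + |x_i|^2 - |x0|^2
    A = 2.0 * (anchors[1:] - x0)
    b = (d0**2 - dists[1:]**2
         + np.sum(anchors[1:]**2, axis=1) - np.sum(x0**2))
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p
```

With exact distances the system is consistent and the least-squares solution recovers the point exactly; with noisy predicted distance maps, using more anchors averages out the error.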


Figure 1: A comparison of four decoding methods. Our KeyPosS excels with minimal overhead.

Figure 2: The KeyPosS pipeline, encompassing the Distance Encoding Model, Station Anchor Sampling Strategy, and True-range Multilateration. It is suitable for any distance encoding-based approach.

Key Features

  • GPS-inspired: Applies proven concepts from GPS technology to facial analysis, enabling more precise localization.

  • True-Range Multilateration: Decodes predicted distances into landmark coordinates through multilateration with anchoring points.

  • Versatile: Can be built upon any distance encoding-based model for enhanced performance.

  • Efficient: Avoids computational burdens of heatmap-based methods.

For more details, please see our ACM MM 2023 paper.

Performance Overview


Table 1: Performance comparison with state-of-the-art methods. Results are reported in NME (%); best results in bold.

Quick Start Guide

Get started with the KeyPosS facial landmark detection system in a few simple steps:

1. Installation:

  • Environment Setup: Begin by setting up the necessary environment. For this, refer to the instructions provided by mmpose.

  • Datasets: Our experiments utilize the COCO, WFLW, 300W, COFW, and AFLW datasets.

2. Training:

  • Pre-trained Models: We leverage ImageNet models from mmpose as our starting point.

  • Training Command: To start the training process, execute the following command:

    CUDA_VISIBLE_DEVICES=0,1,2,3 sh tools/dist_train.sh \
        configs/face/2d_kpt_sview_rgb_img/topdown_heatmap/coco_wholebody_face/hrnetv2_w18_coco_wholebody_face_256x256_dark.py \
        4 \
        --work-dir exp/exp889

3. Evaluation:

Step 1: Obtain the Models

  • Download: Retrieve the pre-trained and trained models for each dataset and heatmap resolution from Google Drive.

Step 2: Model Setup

  • Placement: After downloading, move the "exp" directory into the root of your codebase.

Step 3: Resolution Configuration

  • Supported Resolutions: The model in the "exp" directory is compatible with five resolutions: 64, 32, 16, 8, and 4.

  • Configuration: Before running a test script, set the "heatmap_size" field under "data_cfg" in the configuration file to your chosen resolution.
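For illustration, the relevant fragment of an mmpose-style config looks roughly like this (field names follow the mmpose 0.x convention; the exact layout of the config file may differ):

```python
# Sketch of the relevant fragment of an mmpose-style config file.
# Only heatmap_size needs to change between the five supported resolutions.
data_cfg = dict(
    image_size=[256, 256],
    heatmap_size=[64, 64],  # one of 64, 32, 16, 8, or 4 per side
)

# For example, to evaluate at the 32x32 resolution (then use run_test_32.sh):
data_cfg['heatmap_size'] = [32, 32]
```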

Step 4: Test Execution

  • Script Selection: Based on your chosen resolution, run the appropriate test script:

    • run_test_64.sh
    • run_test_32.sh
    • run_test_16.sh
    • run_test_8.sh
    • run_test_4.sh

    These scripts evaluate the model's efficacy across various face datasets: WFLW, COCO, 300W, AFLW, and COFW.

Step 5: Evaluation Command

  • Command Execution: To run the evaluation, execute the following command:

    CUDA_VISIBLE_DEVICES=0,1,2,3 sh tools/dist_test.sh \
        configs/face/2d_kpt_sview_rgb_img/topdown_heatmap/wflw/hrnetv2_w18_wflw_256x256_dark.py \
        exp/exp_v1.3.0/best_NME_epoch_60.pth \
        4 

Acknowledgment

Our work is primarily based on mmpose. We express our gratitude to the authors for their invaluable contributions.

Citation

If you find this work beneficial, please cite our paper:

@inproceedings{bao2023keyposs,
  title={KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration},
  author={Bao, Xu and Cheng, Zhi-Qi and He, Jun-Yan and Xiang, Wangmeng and Li, Chenyang and Sun, Jingdong and Liu, Hanbing and Liu, Wei and Luo, Bin and Geng, Yifeng and others},
  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
  pages={5746--5755},
  year={2023}
}

License

This repository is licensed under the Apache 2.0 license. For more details, please refer to the LICENSE file.


keyposs's Issues

ImportError: cannot import name 'inference_top_down_pose_model' from 'mmpose.apis'

mmcv                      2.0.1                    pypi_0    pypi
mmdet                     3.1.0                    pypi_0    pypi
mmengine                  0.8.4                    pypi_0    pypi
mmpose                    1.1.0                    pypi_0    pypi
torch                     2.0.1+cu117              pypi_0    pypi
torchaudio                2.0.2+cu117              pypi_0    pypi
torchmetrics              0.11.4                   pypi_0    pypi
torchvision               0.15.2+cu117             pypi_0    pypi

Exception has occurred: ImportError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
cannot import name 'inference_top_down_pose_model' from 'mmpose.apis' (D:\CH\Anaconda\envs\KeyPosS\lib\site-packages\mmpose\apis\__init__.py)
  File "D:\CH\Projects\Driver State Detection\testing\me\Fabian\facial landmarks\KeyPosS\demo\face_video_demo_CH.py", line 9, in <module>
    from mmpose.apis import (inference_top_down_pose_model, init_pose_model,
  File "D:\CH\Anaconda\envs\KeyPosS\Lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\CH\Anaconda\envs\KeyPosS\Lib\runpy.py", line 196, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
ImportError: cannot import name 'inference_top_down_pose_model' from 'mmpose.apis' (D:\CH\Anaconda\envs\KeyPosS\lib\site-packages\mmpose\apis\__init__.py)

Please advise on the above error.
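For context, and as an assumption not confirmed by the maintainers: mmpose 1.x refactored its Python API, and names such as `inference_top_down_pose_model` belong to the 0.x series, so an environment with mmpose 1.1.0 (as listed above) cannot import them. A defensive check for which API family is installed might look like:

```python
# Hypothetical compatibility probe: reports whether the legacy mmpose 0.x
# inference API (which names like inference_top_down_pose_model belong to)
# is importable in the current environment.
def mmpose_api_family():
    try:
        from mmpose.apis import inference_top_down_pose_model  # noqa: F401
        return "0.x"       # legacy API available; the demo scripts should run
    except ImportError:
        return "not 0.x"   # mmpose >= 1.0, or mmpose not installed
```

If this returns "not 0.x", installing an mmpose 0.x release (e.g. `pip install "mmpose<1.0"`) is the likely remedy.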

How to visualize the output after testing

I ran the official test command:

    CUDA_VISIBLE_DEVICES=0,1,2,3 sh tools/dist_test.sh \
        configs/face/2d_kpt_sview_rgb_img/topdown_heatmap/wflw/hrnetv2_w18_wflw_256x256_dark.py \
        exp/exp_v1.3.0/best_NME_epoch_60.pth \
        4

and successfully obtained result_keypoints.json, which contains the coordinates of each keypoint, but the run produced no visualization images. What should I modify to get visualizations like the ones shown in the paper?
