
unipose's Introduction

🤩 News

  • 2024.02.14: We updated a file to list all 1,237 classes in the UniKPT dataset.
  • 2023.11.28: We highlight UniPose's ability to detect the 68 face keypoints across arbitrary categories in this figure. The face keypoint definition follows this dataset.
  • 2023.11.9: Thanks to OpenXLab, you can try a quick online demo. Looking forward to your feedback!
  • 2023.11.1: We release the inference code, demo, checkpoints, and the annotations of the UniKPT dataset.
  • 2023.10.13: We release the arXiv version.

In-the-wild Test via UniPose

UniPose has strong fine-grained localization and generalization abilities across image styles, categories, and poses.


Detecting any Face Keypoints:


🗒 TODO

  • Release inference code and demo.
  • Release checkpoints.
  • Release UniKPT annotations.
  • Release training code.

💡 Overview

• UniPose is the first end-to-end prompt-based keypoint detection framework.


• It supports multi-modal prompts, including textual and visual prompts, to detect arbitrary keypoints on any object, e.g., articulated, rigid, and soft objects.

Visual Prompts as Inputs:


Textual Prompts as Inputs:


🔨 Environment Setup

  1. Clone this repo
git clone https://github.com/IDEA-Research/UniPose.git
cd UniPose
  2. Install the required packages
pip install -r requirements.txt
  3. Compile the CUDA operators
cd models/UniPose/ops
python setup.py build install
# unit test (all checks should print True)
python test.py
cd ../../..
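
Before running the demo, it can help to confirm that PyTorch actually sees your GPU; a mismatch here is a common cause of the "no kernel image is available for execution on the device" error reported in the issues further below. This check is only a suggestion, not part of the original setup steps:

# Optional sanity check: verify the PyTorch/CUDA/GPU combination used to build the ops.
import torch

print(torch.__version__, torch.version.cuda)   # PyTorch version and the CUDA toolkit it was built with
print(torch.cuda.is_available())               # should print True on a usable GPU machine
print(torch.cuda.get_device_capability(0))     # compute capability of GPU 0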

▶ Demo

1. Guidelines

• We have released the textual prompt-based branch for inference. As the visual prompt involves a substantial amount of user input, we are currently exploring more user-friendly platforms to support this functionality.

• Since UniPose has learned strong structural priors, it is best to use one of the predefined skeletons as the keypoint textual prompt; these are defined in predefined_keypoints.py (see the quick check after this list).

• If users don't provide a keypoint prompt, we try to match an appropriate skeleton based on the instance category. If that fails, we default to the animal skeleton, which covers a wider range of categories and testing requirements.
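
A quick way to see which predefined skeletons are available is to list the names defined in predefined_keypoints.py. This is a hedged sketch that only assumes the file is importable from the repository root; the exact structure of its entries is not shown here:

# List the top-level names defined in predefined_keypoints.py.
# Assumption: run from the UniPose repository root so the module is importable.
import predefined_keypoints

print([name for name in dir(predefined_keypoints) if not name.startswith("_")])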

2. Run

Replace {GPU ID}, image_you_want_to_test.jpg, and "dir you want to save the output" with appropriate values in the following command:

CUDA_VISIBLE_DEVICES={GPU ID} python inference_on_a_image.py \
  -c config/UniPose_SwinT.py \
  -p weights/unipose_swint.pth \
  -i image_you_want_to_test.jpg \
  -o "dir you want to save the output" \
  -t "instance categories" \
  -k "keypoint_skeleton_text"

• -t takes the instance category, e.g., "person", "face", "left hand", "horse", "car", "skirt", "table".
• -k is optional; if needed, select an entry from the predefined_keypoints.py file.

We also support inference through a Gradio app:

python app.py

Checkpoints

#  Name     Backbone  Keypoint AP on COCO  Checkpoint               Config
1  UniPose  Swin-T    74.4                 Google Drive / OpenXLab  GitHub Link
2  UniPose  Swin-L    76.8                 Coming Soon              Coming Soon

The UniKPT Dataset


Dataset          Keypoints  Classes  Images   Instances  Unified Images  Unified Instances
COCO             17         1        58,945   156,165    58,945          156,165
300W-Face        68         1        3,837    4,437      3,837           4,437
OneHand10K       21         1        11,703   11,289     2,000           2,000
Human-Art        17         1        50,000   123,131    50,000          123,131
AP-10K           17         54       10,015   13,028     10,015          13,028
APT-36K          17         30       36,000   53,006     36,000          53,006
MacaquePose      17         1        13,083   16,393     2,000           2,320
Animal Kingdom   23         850      33,099   33,099     33,099          33,099
AnimalWeb        9          332      22,451   21,921     22,451          21,921
Vinegar Fly      31         1        1,500    1,500      1,500           1,500
Desert Locust    34         1        700      700        700             700
Keypoint-5       55/31      5        8,649    8,649      2,000           2,000
MP-100           561/293    100      16,943   18,000     16,943          18,000
UniKPT           338        1,237    -        -          226,547         418,487

• UniKPT is a unified dataset built from 13 existing datasets and is intended only for non-commercial research purposes.

• All images in the UniKPT dataset originate from the datasets listed in the table above. To access these images, please download them from their original repositories.

• We provide annotations with precise textual descriptions of the keypoints for effective training. For convenience, the text annotations are available at the link.
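
The released annotations follow a COCO-style JSON layout with an 'images' list (as the issue example further below shows). A minimal sketch for inspecting a file, where unikpt_annotations.json is a placeholder name rather than the actual filename in the release:

import json

# Placeholder path: substitute the annotation file downloaded from the link above.
with open("unikpt_annotations.json") as f:
    data = json.load(f)

print(len(data["images"]), "images")
print(data["images"][0])  # fields such as file_name, width, height, id

# COCO-style files usually also carry an "annotations" list with keypoints;
# this key is an assumption based on the COCO format, so check before relying on it.
print(data.get("annotations", [])[:1])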

Citing UniPose

If you find this repository useful for your work, please consider citing it as follows:

@article{yang2023unipose,
  title={UniPose: Detecting Any Keypoints},
  author={Yang, Jie and Zeng, Ailing and Zhang, Ruimao and Zhang, Lei},
  journal={arXiv preprint arXiv:2310.08530},
  year={2023}
}
@inproceedings{yang2023neural,
  title={Neural Interactive Keypoint Detection},
  author={Yang, Jie and Zeng, Ailing and Li, Feng and Liu, Shilong and Zhang, Ruimao and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={15122--15132},
  year={2023}
}
@inproceedings{yang2022explicit,
  title={Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation},
  author={Yang, Jie and Zeng, Ailing and Liu, Shilong and Li, Feng and Zhang, Ruimao and Zhang, Lei},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023}
}

unipose's People

Contributors

ailingzengzzz · yangjie-cv · guoqincode


unipose's Issues

Method name is copied from CVPR published Pose Estimation paper

Dear authors,

The name of your method "UniPose" refers to a Human Pose Estimation method that already exists and was published at CVPR 2020 (https://openaccess.thecvf.com/content_CVPR_2020/html/Artacho_UniPose_Unified_Human_Pose_Estimation_in_Single_Images_and_Videos_CVPR_2020_paper.html). In addition to using the exact same name for the same application (which is not allowed), there is also no proper citation of previous work in this area.

Please rename your method, respecting copyright and plagiarism rules for scientific publications.

Does the online demo support image prompt?

I noticed the online demo only supports the instance prompt. Does it provide an image prompt?
Also, what is the difference between "instance prompt" and "keypoint example"? When I upload a face image, enter "face" as the instance prompt, and "the left eye of the face" as the keypoint example, the result still shows all face keypoints.

Regarding Licence

Hello! Thanks for open-sourcing the great work! Could you please confirm the Licence associated with this repository and the models?

only input images

In the inference stage, can I get the keypoints of an object by providing only the input image?

How to distinguish which dataset an image belongs to in the UniKPT annotations

UniKPT is a collection of multiple datasets, but apart from COCO, there seems to be no way to tell from the annotations which dataset an image belongs to, as in the following example.
COCO images can be identified by "coco_url":

print(json['images'][0])
{'license': 3, 'file_name': '000000391895.jpg', 'coco_url': 'http://images.cocodataset.org/train2017/000000391895.jpg', 'height': 360, 'width': 640, 'date_captured': '2013-11-14 11:18:45', 'flickr_url': 'http://farm9.staticflickr.com/8186/8119368305_4e622c8349_z.jpg', 'id': 391895}

others:

print(p_json['images'][120050])
{'flickr_url': 'unknown', 'coco_url': 'unknown', 'file_name': 'car_fifth2/images_jpg/2_07961.jpg', 'id': 200207961, 'license': 1, 'date_captured': 'unknown', 'width': 1920, 'height': 1080}
print(json['images'][150050])
{'coco_url': '', 'date_captured': '', 'file_name': '017071.jpg', 'flickr_url': '', 'id': 17071, 'license': 0, 'width': 752, 'height': 752}

error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device

Hello, I followed the steps you published on GitHub to run a test. The input image is a chair, and both the instance prompt and the keypoint prompt follow the predefined ones, but at test time the "pred_logits" variable contains many inf values, so no result can be produced. The run is also accompanied by the warning error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device. I would like to know whether these two are related and how to solve the problem.
Thank you very much in advance for your reply!

Code Release

Thanks for this very nice work!
I am wondering if there is a code release planned?

error loading the checkpoint

Hi @yangjie-cv, thanks for updating your work. I am running the inference_on_a_image.py file and hit this error:
[error screenshot]
I also checked the number of layers in the model and in the checkpoint; they are 1165 and 1015, respectively.
Can you double-check the code? Thanks!

About sketch style images

I notice that your paper demonstrates images across different image styles. How can I test the sketch elephant shown in your paper's demo? What are the specific instance prompt and other parameters?

About training

May I ask when the open-source training code will be available?
