microsoft / human-pose-estimation.pytorch

The project is the official implementation of our ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208).

License: MIT License

Makefile 0.10% Python 91.72% C++ 0.14% Cuda 4.76% Cython 3.28%

human-pose-estimation deep-learning coco-keypoints-detection mpii-dataset mscoco-keypoint

human-pose-estimation.pytorch's Introduction

Simple Baselines for Human Pose Estimation and Tracking

News

Our new work High-Resolution Representations for Labeling Pixels and Regions is available at HRNet. HRNet has been applied to a wide range of vision tasks, such as image classification, object detection, semantic segmentation and facial landmark detection.
Our new work Deep High-Resolution Representation Learning for Human Pose Estimation has been released at https://github.com/leoxiaobin/deep-high-resolution-net.pytorch. The best single HRNet model obtains an AP of 77.0 on the COCO test-dev2017 dataset and 92.3% PCKh@0.5 on the MPII test set. The new repository also supports the SimpleBaseline method, and you are welcome to try it.
Our entry using this repo won the PoseTrack2018 Multi-person Pose Tracking Challenge!
Our entry using this repo ranked 2nd in the keypoint detection task of COCO 2018!

Introduction

This is the official PyTorch implementation of Simple Baselines for Human Pose Estimation and Tracking. This work provides baseline methods that are surprisingly simple and effective, and therefore helpful for inspiring and evaluating new ideas in the field. State-of-the-art results are achieved on challenging benchmarks. On the COCO keypoints validation set, our best single model achieves an mAP of 74.3. You can reproduce our results using this repo. All models are provided for research purposes.

Main Results

Results on MPII val

Arch	Head	Shoulder	Elbow	Wrist	Hip	Knee	Ankle	Mean	Mean@0.1
256x256_pose_resnet_50_d256d256d256	96.351	95.329	88.989	83.176	88.420	83.960	79.594	88.532	33.911
384x384_pose_resnet_50_d256d256d256	96.658	95.754	89.790	84.614	88.523	84.666	79.287	89.066	38.046
256x256_pose_resnet_101_d256d256d256	96.862	95.873	89.518	84.376	88.437	84.486	80.703	89.131	34.020
384x384_pose_resnet_101_d256d256d256	96.965	95.907	90.268	85.780	89.597	85.935	82.098	90.003	38.860
256x256_pose_resnet_152_d256d256d256	97.033	95.941	90.046	84.976	89.164	85.311	81.271	89.620	35.025
384x384_pose_resnet_152_d256d256d256	96.794	95.618	90.080	86.225	89.700	86.862	82.853	90.200	39.433

Note:

Flip test is used.

Results on COCO val2017 with detector having human AP of 56.4 on COCO val2017 dataset

Arch	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
256x192_pose_resnet_50_d256d256d256	0.704	0.886	0.783	0.671	0.772	0.763	0.929	0.834	0.721	0.824
384x288_pose_resnet_50_d256d256d256	0.722	0.893	0.789	0.681	0.797	0.776	0.932	0.838	0.728	0.846
256x192_pose_resnet_101_d256d256d256	0.714	0.893	0.793	0.681	0.781	0.771	0.934	0.840	0.730	0.832
384x288_pose_resnet_101_d256d256d256	0.736	0.896	0.803	0.699	0.811	0.791	0.936	0.851	0.745	0.858
256x192_pose_resnet_152_d256d256d256	0.720	0.893	0.798	0.687	0.789	0.778	0.934	0.846	0.736	0.839
384x288_pose_resnet_152_d256d256d256	0.743	0.896	0.811	0.705	0.816	0.797	0.937	0.858	0.751	0.863

Results on Caffe-style ResNet

Arch	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
256x192_pose_resnet_50_caffe_d256d256d256	0.704	0.914	0.782	0.677	0.744	0.735	0.921	0.805	0.704	0.783
256x192_pose_resnet_101_caffe_d256d256d256	0.720	0.915	0.803	0.693	0.764	0.753	0.928	0.821	0.720	0.802
256x192_pose_resnet_152_caffe_d256d256d256	0.728	0.925	0.804	0.702	0.766	0.760	0.931	0.828	0.729	0.806

Note:

Flip test is used.
Person detector has person AP of 56.4 on COCO val2017 dataset.
The difference between PyTorch-style and Caffe-style ResNet is the position of the stride=2 convolution.

Environment

The code is developed using python 3.6 on Ubuntu 16.04. NVIDIA GPUs are needed. The code is developed and tested using 4 NVIDIA P100 GPU cards. Other platforms or GPU cards are not fully tested.

Quick start

Installation

Install PyTorch >= v0.4.0 following the official instructions.

Disable cudnn for batch_norm:

# PYTORCH=/path/to/pytorch
# for pytorch v0.4.0
sed -i "1194s/torch\.backends\.cudnn\.enabled/False/g" ${PYTORCH}/torch/nn/functional.py
# for pytorch v0.4.1
sed -i "1254s/torch\.backends\.cudnn\.enabled/False/g" ${PYTORCH}/torch/nn/functional.py

Note that instructions like # PYTORCH=/path/to/pytorch indicate that you should pick a path where you'd like to have pytorch installed and then set an environment variable (PYTORCH in this case) accordingly.

Clone this repo. We'll refer to the directory that you cloned as ${POSE_ROOT}.
Install dependencies:
```
pip install -r requirements.txt
```
Make libs:
```
cd ${POSE_ROOT}/lib
make
```

Install COCOAPI:

# COCOAPI=/path/to/clone/cocoapi
git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
cd $COCOAPI/PythonAPI
# Install into global site-packages
make install
# Alternatively, if you do not have permissions or prefer
# not to install the COCO API into global site-packages
python3 setup.py install --user

Note that instructions like # COCOAPI=/path/to/clone/cocoapi indicate that you should pick a path where you'd like to have the software cloned and then set an environment variable (COCOAPI in this case) accordingly.

Download the PyTorch ImageNet pretrained models from the PyTorch model zoo and the Caffe-style pretrained models from GoogleDrive.

Download the MPII and COCO pretrained models from OneDrive or GoogleDrive. Please place them under ${POSE_ROOT}/models/pytorch so that the directory looks like this:

${POSE_ROOT}
 `-- models
     `-- pytorch
         |-- imagenet
         |   |-- resnet50-19c8e357.pth
         |   |-- resnet50-caffe.pth.tar
         |   |-- resnet101-5d3b4d8f.pth
         |   |-- resnet101-caffe.pth.tar
         |   |-- resnet152-b121ed2d.pth
         |   `-- resnet152-caffe.pth.tar
         |-- pose_coco
         |   |-- pose_resnet_101_256x192.pth.tar
         |   |-- pose_resnet_101_384x288.pth.tar
         |   |-- pose_resnet_152_256x192.pth.tar
         |   |-- pose_resnet_152_384x288.pth.tar
         |   |-- pose_resnet_50_256x192.pth.tar
         |   `-- pose_resnet_50_384x288.pth.tar
         `-- pose_mpii
             |-- pose_resnet_101_256x256.pth.tar
             |-- pose_resnet_101_384x384.pth.tar
             |-- pose_resnet_152_256x256.pth.tar
             |-- pose_resnet_152_384x384.pth.tar
             |-- pose_resnet_50_256x256.pth.tar
             `-- pose_resnet_50_384x384.pth.tar

Create the output directory (for trained models) and the log directory (for TensorBoard logs):

mkdir output 
mkdir log

Your directory tree should look like this:

${POSE_ROOT}
├── data
├── experiments
├── lib
├── log
├── models
├── output
├── pose_estimation
├── README.md
└── requirements.txt
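
Before moving on to data preparation, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the repo: the helper name `missing_entries` is hypothetical, and it only checks the top-level entries listed in the tree.

```python
from pathlib import Path

# Top-level entries taken from the directory tree above.
EXPECTED_DIRS = ["data", "experiments", "lib", "log", "models", "output", "pose_estimation"]
EXPECTED_FILES = ["README.md", "requirements.txt"]


def missing_entries(pose_root):
    """Return the expected directories and files that are absent under pose_root."""
    root = Path(pose_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Run from ${POSE_ROOT}; an empty list means the layout matches.
    print("missing:", missing_entries("."))
```

An empty result only says that the top-level entries exist; it does not verify the contents of data or models.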

Data preparation

For MPII data, please download from MPII Human Pose Dataset. The original annotation files are in MATLAB format. We have converted them into JSON format; you also need to download them from OneDrive or GoogleDrive. Extract them under ${POSE_ROOT}/data so that the directory looks like this:

${POSE_ROOT}
|-- data
`-- |-- mpii
    `-- |-- annot
        |   |-- gt_valid.mat
        |   |-- test.json
        |   |-- train.json
        |   |-- trainval.json
        |   `-- valid.json
        `-- images
            |-- 000001163.jpg
            |-- 000003072.jpg

For COCO data, please download from COCO download; 2017 Train/Val is needed for COCO keypoints training and validation. We also provide the person detection results on COCO val2017 to reproduce our multi-person pose estimation results. Please download them from OneDrive or GoogleDrive. Download and extract everything under ${POSE_ROOT}/data so that the directory looks like this:

${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        |-- person_detection_results
        |   |-- COCO_val2017_detections_AP_H_56_person.json
        `-- images
            |-- train2017
            |   |-- 000000000009.jpg
            |   |-- 000000000025.jpg
            |   |-- 000000000030.jpg
            |   |-- ... 
            `-- val2017
                |-- 000000000139.jpg
                |-- 000000000285.jpg
                |-- 000000000632.jpg
                |-- ...

Valid on MPII using pretrained models

python pose_estimation/valid.py \
    --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml \
    --flip-test \
    --model-file models/pytorch/pose_mpii/pose_resnet_50_256x256.pth.tar

Training on MPII

python pose_estimation/train.py \
    --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml

Valid on COCO val2017 using pretrained models

python pose_estimation/valid.py \
    --cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml \
    --flip-test \
    --model-file models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar

Training on COCO train2017

python pose_estimation/train.py \
    --cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml

Other Implementations

TensorFlow [Version1]
PaddlePaddle [Version1]
Gluon [Version1]

Citation

If you use our code or models in your research, please cite with:

@inproceedings{xiao2018simple,
    author={Xiao, Bin and Wu, Haiping and Wei, Yichen},
    title={Simple Baselines for Human Pose Estimation and Tracking},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = {2018}
}

human-pose-estimation.pytorch's Issues

Why "imagenet pretrained model dose not exist" even if I put models in "models/pytorch/imagenet"

As shown in the picture

During training, the validation performance is always 0.

I've been trying to train on coco. The training loss seems normal. However, the validation performance is always 0. Any idea what is happening?

Training suspends after validation

Hi, I ran into a problem during training.
After evaluating the current results, training stopped. The log is below:

Accumulating evaluation results...
DONE (t=0.06s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.339
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.665
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.300
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.329
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.360
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.391
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.693
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.377
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.370
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.420
=> coco eval results saved to output/coco/pose_resnet_50/256x192_d256x3_adam_lr1e-3/results/keypoints_val2017_results.pkl
Traceback (most recent call last):
  File "pose_estimation/train.py", line 206, in <module>
    main()
  File "pose_estimation/train.py", line 180, in main
    writer_dict)
  File "/media/human-pose-estimation.pytorch/pose_estimation/../lib/core/function.py", line 182, in validate
    filenames, imgnums)
  File "/media/human-pose-estimation.pytorch/pose_estimation/../lib/dataset/coco.py", line 329, in evaluate
    name_value = OrderedDict(info_str)
TypeError: 'NoneType' object is not iterable

During evaluation, the center and scale is used?

Hi, I see that center and scale are used when fetching the validation dataset. Where do the center and scale come from? Are they predicted by the human detector, or taken from the ground-truth annotations?
https://github.com/Microsoft/human-pose-estimation.pytorch/blob/8fc31b4dad3dee31b5d69f8cb2e8b781f811ab23/lib/dataset/JointsDataset.py#L103

About the eval() function in train.py

In the main() function of 'train.py', lines 92-94:

    model = eval('models.'+config.MODEL.NAME+'.get_pose_net')(
        config, is_train=True
    )

I can't find where this eval function is defined, and I don't understand what the parentheses after eval() do.
In PyCharm, when I "ctrl-click" on eval, it takes me to an eval definition that only has "pass" in it.
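
For readers with the same question: `eval` here is Python's built-in function, which evaluates a string as an expression and returns the resulting object, so the trailing parentheses are an ordinary function call on whatever it returns. The sketch below illustrates this with stand-ins; the `models` namespace and the dummy `get_pose_net` are hypothetical substitutes for the repo's package, used only to make the example self-contained.

```python
import types


def get_pose_net(config, is_train):
    """Dummy factory standing in for the real network constructor."""
    return {"name": config["MODEL_NAME"], "is_train": is_train}


# Hypothetical stand-in for the repo's `models` package.
models = types.SimpleNamespace(
    pose_resnet=types.SimpleNamespace(get_pose_net=get_pose_net)
)
config = {"MODEL_NAME": "pose_resnet"}

# Step 1: eval() turns the string "models.pose_resnet.get_pose_net"
# into the function object it names.
factory = eval("models." + config["MODEL_NAME"] + ".get_pose_net")

# Step 2: the parentheses simply call that function object.
model_a = factory(config, is_train=True)

# The same lookup without eval(), using getattr on the namespace.
model_b = getattr(models, config["MODEL_NAME"]).get_pose_net(config, is_train=True)
```

The `getattr` form avoids evaluating arbitrary strings, which is why it is often preferred when the name comes from a config file.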

There is some explanation mistake in README

Valid on COCO val2017 using pretrained models

python pose_estimation/valid.py
--cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml
--flip-test
--model-file models/pytorch/pose_coco/pose_resnet_50_256x256.pth.tar

I think that in the first and third argument lines, 256x256 has to be changed to 256x192.
Otherwise, both directories must be changed to mpii.

The result I get from valid.py is different from what I get from cocoapi.

I ran validation via valid.py and then fed the JSON file generated in the process to cocoapi. The two gave rather different results: cocoapi's result is more than 20% lower than valid.py's. Do you use a different coordinate system? If so, could you kindly tell us the conversion formula between your system and cocoapi's? Thanks!

Why does the DataLoader get stuck during training?

During training, the DataLoader (i.e., torch.utils.data.DataLoader) always gets stuck, resulting in low GPU utilization. Have you encountered such a situation?

How to predict my own image?

I read your code carefully and wrote the following code, but I still get wrong results.
Could you help me?

# config
from lib.models.pose_resnet import get_pose_net
from lib.core.config import config
from lib.core.config import update_config
config.TEST.FLIP_TEST = True
config.TEST.MODEL_FILE = 'pose_resnet_50_256x256.pth.tar'
update_config('experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml')
model = get_pose_net(config, is_train=False)

import torch
import torchvision.transforms as transforms
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
toTensor = transforms.Compose([transforms.ToTensor(), 
                               transforms.Normalize(mean, std)])

def getpoint(mat):
    height, width = mat.shape
    mat = mat.reshape(-1)
    idx = np.argmax(mat)
    return idx % width, idx // width

# load image and predict
import cv2
import numpy as np
img = cv2.imread('0.png', cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION)
img = cv2.resize(img, (256, 256))
x = toTensor(img).unsqueeze(0)
with torch.no_grad():
    res = model.forward(x)
res = np.array(res.detach().squeeze())
print(img.shape)
print(res.shape)

(256, 256, 3)
(16, 64, 64)

# plot
image = cv2.resize(img, (64, 64))
print(image.shape)
for mat in res:
    x, y = getpoint(mat)
    print(x, y)
    cv2.circle(image, (x, y), 2, (255, 0, 0), 2)
import matplotlib.pyplot as plt
plt.imshow(image)

(64, 64, 3)
10 46
8 37
27 29
13 37
33 7
30 7
25 18
17 31
31 22
29 21
15 32
12 51
23 15
36 18
13 40
12 41
<matplotlib.image.AxesImage at 0x7f14625c1160>

LoadNet.pdf

Could you tell me the detection speed?

Hello! Thank you for the good work. Could you tell me the detection speed per image on the COCO test dataset?

Error Result when predicting my own image.

This is my code for using your model to predict the pose in my image.
Is there anything wrong with it?
Can you provide a correct version?

from lib.models.pose_resnet import get_pose_net
from lib.core.config import config
from lib.core.config import update_config
config.TEST.FLIP_TEST = True
config.TEST.MODEL_FILE = 'pose_resnet_50_256x256.pth.tar'
update_config('experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml')
model = get_pose_net(config, is_train=False)

import torch
import cv2
import numpy as np
img = cv2.imread('0.png')
img = cv2.resize(img, (256, 256))
img = np.transpose(img, (2, 0, 1)).astype(np.float32)
img = (img / 255 - 0.5) * 2
img = torch.tensor(img).unsqueeze(0)
with torch.no_grad():
    res = model.forward(img)
print(img.shape)
print(res.shape)

torch.Size([1, 3, 256, 256])
torch.Size([1, 16, 64, 64])

def getpoint(mat):
    height, width = mat.shape
    mat = mat.reshape(-1)
    idx = np.argmax(mat)
    return idx // height, idx % width

raw = ((np.array(img.squeeze()) / 2 + 0.5) * 255).astype(np.uint8)
raw = cv2.resize(np.transpose(raw, (1, 2, 0)), (64, 64))
poi = np.array(res.detach().squeeze())
print(raw.shape)
print(poi.shape)

(64, 64, 3)
(16, 64, 64)

image = raw
for mat in poi:
    x, y = getpoint(mat)
    print(x, y)
    cv2.circle(image, (x, y), 1, (255, 0, 0))
import matplotlib.pyplot as plt
plt.imshow(image)

19 30
12 37
22 28
13 37
18 30
13 36
7 32
8 31
18 28
8 34
12 36
7 22
12 38
11 40
18 27
15 30





<matplotlib.image.AxesImage at 0x7fa42088ab00>
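
One detail that is easy to get wrong in snippets like the two above is the order of the coordinates returned from a flattened argmax. The sketch below shows only that index arithmetic on a plain list-of-rows heatmap; it is not the repo's decoding routine, and the helper names `peak_xy` and `to_image_coords` are hypothetical.

```python
def peak_xy(heatmap):
    """Return (x, y) of the maximum of a row-major 2-D heatmap (a list of rows)."""
    width = len(heatmap[0])
    flat = [v for row in heatmap for v in row]
    idx = max(range(len(flat)), key=flat.__getitem__)
    row, col = divmod(idx, width)  # row-major layout: idx = row * width + col
    return col, row                # x is the column, y is the row


def to_image_coords(x, y, heatmap_size, image_size):
    """Scale a heatmap coordinate (x, y) to the network input resolution."""
    hm_w, hm_h = heatmap_size
    im_w, im_h = image_size
    return x * im_w / hm_w, y * im_h / hm_h


# Toy heatmap with 3 rows and 4 columns; the peak sits at row 1, column 2.
toy = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.2, 0.9, 0.1],
    [0.0, 0.0, 0.3, 0.0],
]
x, y = peak_xy(toy)
x_img, y_img = to_image_coords(x, y, heatmap_size=(4, 3), image_size=(16, 12))
```

Both the quotient and the remainder use the width. With NumPy arrays, `np.unravel_index(np.argmax(mat), mat.shape)` returns the same (row, column) pair, which then has to be swapped to get (x, y) for drawing functions such as `cv2.circle`.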

I trained on my own dataset, but I was unable to use model_best.pth.tar to validate

I noticed that when I trained on my own dataset, your code produced three models: checkpoint.pth.tar, final_state.pth.tar, and model_best.pth.tar. Unfortunately, I was unable to load model_best.pth.tar in validation. It reports a "Missing key(s)" error in state_dict.

I'm curious, why don't you use ResNet in torchvision?

Is it because ResNet in torchvision is not up-to-date? Say, it may not have BN within the Residual Module?

Always got zeros during validation

Hi, I got the same issue as #17. Though I have disabled cudnn for batch_norm as mentioned in Installation, I got the same results. I'm using pytorch 0.5.0. Shall I do anything else to fix this problem? Thank you!

What kind of score strategy was used in posetrack evaluation?

Still kpt_score*bbox_score? or other strategies? Thanks!

During training, process interrupted by RuntimeError: cuda runtime error (4)

Thanks for your excellent work. I would like to ask about a problem (it might not be a bug in your code).
When training on the COCO dataset following your instructions with
python pose_estimation/train.py --cfg experiments/coco/resnet50/384x288_d256x3_adam_lr1e-3.yaml
the process is interrupted at an unpredictable moment (not always in epoch 1; it has happened at various points). The terminal log is as follows:

Epoch: [1][2000/4682] Time 0.499s (0.496s) Speed 64.1 samples/s Data 0.000s (0.007s) Loss 0.00067 (0.00070) Accuracy 0.648 (0.578)
Epoch: [1][2100/4682] Time 0.476s (0.496s) Speed 67.2 samples/s Data 0.000s (0.007s) Loss 0.00062 (0.00070) Accuracy 0.571 (0.580)
Epoch: [1][2200/4682] Time 0.479s (0.497s) Speed 66.8 samples/s Data 0.000s (0.007s) Loss 0.00068 (0.00070) Accuracy 0.624 (0.582)
THCudaCheck FAIL file=/pytorch/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu line=265 error=4 : unspecified launch failure
Traceback (most recent call last):
  File "pose_estimation/train.py", line 206, in <module>
    main()
  File "pose_estimation/train.py", line 174, in main
    final_output_dir, tb_log_dir, writer_dict)
  File "/home/vision01/Workspace/Python/Pose/human-pose-estimation.pytorch-master/pose_estimation/../lib/core/function.py", line 53, in train
    optimizer.step()
  File "/home/vision01/anaconda3/lib/python3.6/site-packages/torch/optim/adam.py", line 92, in step
    exp_avg.mul_(beta1).add_(1 - beta1, grad)
RuntimeError: cuda runtime error (4) : unspecified launch failure at /pytorch/aten/src/THC/generated/../generic/THCTensorMathPointwise.cu:265
THCudaCheckWarn FAIL file=/pytorch/aten/src/THC/THCStream.cpp line=50 error=4 : unspecified launch failure

I disabled cudnn for batch_norm as you instructed.
My platform is Ubuntu 16.04, GTX 1080 Ti, CUDA 9.0.176, cuDNN 7.0.5, Python 3.6, PyTorch 0.4.0.
I don't think this is a bug in your code, but I am quite confused and still have no clue after searching on Google.
I hope you can help me. Thanks a lot!

What's the relation between IMAGE_SIZE and HEATMAP_SIZE?

My card (Nvidia Titan XP) cannot run 384x288 (out of memory). I want to use a size smaller than 384x288 but larger than 256x192. Thanks!
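
The shapes printed in other issues on this page (a 256x256 input producing 64x64 heatmaps) are consistent with a fixed ratio of 4 between input and heatmap size. The sketch below computes a matching heatmap size under that reading. It assumes a backbone with an overall stride of 32 followed by three stride-2 upsampling layers, and it assumes sizes are given as (width, height); neither assumption is confirmed by the authors here, and `heatmap_size` is a hypothetical helper.

```python
def heatmap_size(image_size, backbone_stride=32, num_deconv_layers=3):
    """Heatmap (width, height) for an input (width, height).

    Assumes the backbone downsamples by `backbone_stride` and each of the
    `num_deconv_layers` upsampling layers doubles the resolution.
    """
    upsample = 2 ** num_deconv_layers
    width, height = image_size
    if width % backbone_stride or height % backbone_stride:
        # Kept strict in this sketch so that the division is exact.
        raise ValueError("image size should be a multiple of %d" % backbone_stride)
    return (width // backbone_stride * upsample, height // backbone_stride * upsample)


# An intermediate size between 256x192 and 384x288 (height x width).
print(heatmap_size((256, 320)))
```

Under these assumptions, an intermediate input such as 320x256 (height x width) would pair with an 80x64 heatmap, and both values would need to be set consistently in the experiment config.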

Error(s) in loading state_dict for PoseResNet: Missing key(s) in state_dict

When I run "python pose_estimation/valid.py --cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml --flip-test --model-file models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar", it reported an error:
=> loading model from models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar
Traceback (most recent call last):
  File "pose_estimation/valid.py", line 165, in <module>
    main()
  File "pose_estimation/valid.py", line 123, in main
    model.load_state_dict(torch.load(config.TEST.MODEL_FILE))
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 721, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PoseResNet:
Missing key(s) in state_dict: "bn1.num_batches_tracked", "layer1.0.bn1.num_batches_tracked", "layer1.0.bn2.num_batches_tracked", "layer1.0.bn3.num_batches_tracked", "layer1.0.downsample.1.num_batches_tracked", "layer1.1.bn1.num_batches_tracked", "layer1.1.bn2.num_batches_tracked", "layer1.1.bn3.num_batches_tracked", "layer1.2.bn1.num_batches_tracked", "layer1.2.bn2.num_batches_tracked", "layer1.2.bn3.num_batches_tracked", "layer2.0.bn1.num_batches_tracked", "layer2.0.bn2.num_batches_tracked", "layer2.0.bn3.num_batches_tracked", "layer2.0.downsample.1.num_batches_tracked", "layer2.1.bn1.num_batches_tracked", "layer2.1.bn2.num_batches_tracked", "layer2.1.bn3.num_batches_tracked", "layer2.2.bn1.num_batches_tracked", "layer2.2.bn2.num_batches_tracked", "layer2.2.bn3.num_batches_tracked", "layer2.3.bn1.num_batches_tracked", "layer2.3.bn2.num_batches_tracked", "layer2.3.bn3.num_batches_tracked", "layer3.0.bn1.num_batches_tracked", "layer3.0.bn2.num_batches_tracked", "layer3.0.bn3.num_batches_tracked", "layer3.0.downsample.1.num_batches_tracked", "layer3.1.bn1.num_batches_tracked", "layer3.1.bn2.num_batches_tracked", "layer3.1.bn3.num_batches_tracked", "layer3.2.bn1.num_batches_tracked", "layer3.2.bn2.num_batches_tracked", "layer3.2.bn3.num_batches_tracked", "layer3.3.bn1.num_batches_tracked", "layer3.3.bn2.num_batches_tracked", "layer3.3.bn3.num_batches_tracked", "layer3.4.bn1.num_batches_tracked", "layer3.4.bn2.num_batches_tracked", "layer3.4.bn3.num_batches_tracked", "layer3.5.bn1.num_batches_tracked", "layer3.5.bn2.num_batches_tracked", "layer3.5.bn3.num_batches_tracked", "layer4.0.bn1.num_batches_tracked", "layer4.0.bn2.num_batches_tracked", "layer4.0.bn3.num_batches_tracked", "layer4.0.downsample.1.num_batches_tracked", "layer4.1.bn1.num_batches_tracked", "layer4.1.bn2.num_batches_tracked", "layer4.1.bn3.num_batches_tracked", "layer4.2.bn1.num_batches_tracked", "layer4.2.bn2.num_batches_tracked", "layer4.2.bn3.num_batches_tracked", 
"deconv_layers.1.num_batches_tracked", "deconv_layers.4.num_batches_tracked", "deconv_layers.7.num_batches_tracked".

Environment:
Pytorch version 0.5.0
CUDA 8.0 and cudnn 6
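
Every key listed in the error above ends in `num_batches_tracked`, which suggests (without confirming) that the checkpoint was saved by a PyTorch release whose BatchNorm layers did not yet store that counter. A quick way to check that nothing else is missing is to diff the two key sets. The sketch below does this on plain lists of key names; `split_missing_keys` is a hypothetical helper and the key lists are shortened illustrations.

```python
def split_missing_keys(model_keys, checkpoint_keys, suffix=".num_batches_tracked"):
    """Split keys the model expects but the checkpoint lacks into two groups:
    BatchNorm step counters (ending in `suffix`) and everything else."""
    have = set(checkpoint_keys)
    missing = [k for k in model_keys if k not in have]
    counters = [k for k in missing if k.endswith(suffix)]
    others = [k for k in missing if not k.endswith(suffix)]
    return counters, others


# Shortened, illustrative key lists.
model_keys = [
    "conv1.weight",
    "bn1.weight",
    "bn1.bias",
    "bn1.running_mean",
    "bn1.running_var",
    "bn1.num_batches_tracked",
]
checkpoint_keys = [k for k in model_keys if not k.endswith("num_batches_tracked")]

counters, others = split_missing_keys(model_keys, checkpoint_keys)
```

In real use the two lists would come from `model.state_dict().keys()` and the keys of the loaded checkpoint. If `others` is empty, passing `strict=False` to `load_state_dict` is one commonly used option; that is a suggestion under the assumption above, not the authors' documented fix.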

Are your results on COCO val2017 based on OKS or PCKh?

I noticed that one of the requirements is installing cocoapi. So I'm curious: are your results on COCO val2017 based on OKS [0.5~0.95] or PCKh?
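
For context, the AP and AR columns in the COCO table follow the COCO keypoint evaluation, which thresholds Object Keypoint Similarity (OKS) rather than PCKh. The sketch below shows the OKS formula for a single person as a plain function. The per-keypoint constants passed as `k` are placeholders chosen for illustration, not the official COCO values, and the function is not taken from cocoapi.

```python
import math


def oks(pred, gt, visible, area, k):
    """Object Keypoint Similarity for one person.

    pred, gt : lists of (x, y) keypoints
    visible  : list of visibility flags (only flags > 0 are scored)
    area     : object segment area, used as the squared scale s^2
    k        : per-keypoint falloff constants
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, ki in zip(pred, gt, visible, k):
        if v > 0:
            d2 = (px - gx) ** 2 + (py - gy) ** 2
            total += math.exp(-d2 / (2.0 * area * ki ** 2))
            count += 1
    return total / count if count else 0.0


gt = [(10.0, 10.0), (20.0, 20.0)]
pred = [(10.0, 10.0), (21.0, 21.0)]  # second keypoint is off by (1, 1)
score = oks(pred, gt, visible=[2, 2], area=100.0, k=[0.1, 0.1])
```

A perfect prediction scores 1.0, and the score decays with distance relative to object scale, which is why OKS plays the role for keypoints that IoU plays for boxes.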

Why do you disable cudnn for batch_norm?

Thank you for releasing the code. The README reads that it is required to disable cudnn for batch_norm. Would you please explain why we should do this?

Different AP using the same model on COCO dataset

In your paper, you achieved 70.4 AP using 256x192_pose_resnet_50.
Results on COCO val2017 with detector having human AP of 56.4 on COCO val2017 dataset:

Arch	AP	Ap .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
256x192_pose_resnet_50_d256d256d256	0.704	0.886	0.783	0.671	0.772	0.763	0.929	0.834	0.721	0.824

However, using the same method and the same model, and even after training on our own GPUs, the AP we get is
| Arch | AP | Ap .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
2018-11-14 18:15:51,575 |---|---|---|---|---|---|---|---|---|---|---|
2018-11-14 18:15:51,575 | 256x192_pose_resnet_50_d256d256d256 | 0.723 | 0.925 | 0.794 | 0.697 | 0.765 | 0.755 | 0.932 | 0.820 | 0.723 | 0.802 |

That is nearly 2 percent higher. Would you be so kind as to explain why?

Also, the validation command is not right. It should be:
python pose_estimation/valid.py
--cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml
--flip-test
--model-file models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar

Rather than:
python pose_estimation/valid.py
--cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml
--flip-test
--model-file models/pytorch/pose_coco/pose_resnet_50_256x256.pth.tar

The work is really remarkable, and I am looking forward to your comments.

some questions about select_data

Hi @leoxiaobin
Thanks for sharing your excellent work!
I am curious about the following line in the select_data function in your code:

metric = (0.2 / 16) * num_vis + 0.45 - 0.2 / 16

num_vis is the number of labelled joints.

Would you please explain how to calculate the metric?
Is 16 the number of symmetric joints in coco dataset?
Do you use select_data only on the COCO dataset?
What is 0.45 ?
Why minus 0.2 / 16?
Why select those data which are greater than the metric?

if ks > metric:
    db_selected.append(rec)

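
To make the quoted expression easier to discuss, it can be rewritten as 0.45 + (0.2 / 16) * (num_vis - 1), a threshold that grows linearly with the number of labelled joints. The sketch below tabulates it. The parameter names `base`, `span`, and `num_joints` are labels chosen for this sketch and are not the authors' explanation of the constants.

```python
def selection_threshold(num_vis, num_joints=16, base=0.45, span=0.2):
    """Threshold from the issue: (span / num_joints) * num_vis + base - span / num_joints."""
    return (span / num_joints) * num_vis + base - span / num_joints


def keep_record(ks, num_vis):
    """Mirror of the quoted check: keep a record only if ks exceeds the threshold."""
    return ks > selection_threshold(num_vis)


for n in (1, 8, 16):
    print(n, round(selection_threshold(n), 4))
```

With these constants the threshold rises from 0.45 for one labelled joint to 0.6375 for sixteen, so records with more labelled joints must score higher on `ks` to be kept.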

How do I test the detection and tracking on sample images/video?

I was not able to find a python file to test inference. Could you please upload a test.py file to test inference on a batch of images/video?
The AP results in the paper look very promising.

Would you release the tracking code?

Thanks for your pose estimation code. According to your paper, your tracking method is also great, so will you release the tracking code?

ImportError: No module named core.config

I am on Torch v0.5 with the requirements installed via pip install -r requirements.txt, and I am trying to run valid.py with the suggested model/config for COCO:

python pose_estimation/valid.py --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml     --flip-test     --model-file models/pytorch/pose_coco/pose_resnet_50_256x256.pth.tar
Traceback (most recent call last):
  File "pose_estimation/valid.py", line 25, in <module>
    from core.config import config
ImportError: No module named core.config

I saw in the paper that this was based on the latest Mask R-CNN, and a similar issue was raised here. Is Facebook's Detectron required to run this code?
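
Tracebacks elsewhere on this page show modules being loaded from `pose_estimation/../lib/...`, which suggests that the scripts expect the `lib` directory to be on `sys.path` before `core.config` is imported. The sketch below shows that mechanism in isolation. The helper names `add_path` and `lib_path_for` are hypothetical, and whether the error above comes from a missing path entry or from something else (for example the Python version in use) is not established here.

```python
import os
import sys


def add_path(path):
    """Put `path` at the front of sys.path unless it is already present."""
    if path not in sys.path:
        sys.path.insert(0, path)


def lib_path_for(script_file):
    """Return the `lib` directory that sits beside the script's own directory."""
    this_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.normpath(os.path.join(this_dir, "..", "lib"))


# Example: for a script at <root>/pose_estimation/valid.py this yields <root>/lib.
example = lib_path_for(os.path.join("pose_estimation", "valid.py"))
```

Printing `sys.path` at the top of the failing script is a quick way to see whether the expected `lib` entry is actually there.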

What's the correspondence of joints in Posetrack format to those in COCO format?

In your paper, you used pretrained model on COCO and then fine-tuned on Posetrack. But their points format are different, so there must be a conversion otherwise the model could not be used. How did you convert COCO format to Posetrack 17 in your experiment?

ModuleNotFoundError: No module named 'nms.gpu_nms'

Traceback (most recent call last):
  File "pose_estimation/train.py", line 37, in <module>
    import dataset
  File "/home/sunbin/my_project/simple_baselines/human-pose-estimation/pose_estimation/../lib/dataset/__init__.py", line 12, in <module>
    from .coco import COCODataset as coco
  File "/home/sunbin/my_project/simple_baselines/human-pose-estimation/pose_estimation/../lib/dataset/coco.py", line 23, in <module>
    from nms.nms import oks_nms
  File "/home/sunbin/my_project/simple_baselines/human-pose-estimation/pose_estimation/../lib/nms/nms.py", line 14, in <module>
    from .gpu_nms import gpu_nms

Which Faster RCNN repo do you use during testing and validation?

Hi, @leoxiaobin
Thanks for sharing your excellent work! It has very good results. I am curious about which bounding box detector you use.

I evaluated your validation bbox results and they reach 56.8 on the detection task, while I get 49.5 with a Mask R-CNN detector. Could you share the code or repo you referred to, and have you released code for training Faster RCNN?

Thanks a lot!

MaskRCNN

What is the speed in fps?

Hello,

Thank you for releasing the code! I have a quick question: what is the speed per image on an NVIDIA GPU, at least approximately?

Thanks!

CUDA driver version is insufficient for CUDA runtime version

When I try to run the training on COCO, as shown in the README.md file, I encounter this error.

THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=74 error=35 : CUDA driver version is insufficient for CUDA runtime version
Traceback (most recent call last):
  File "pose_estimation/train.py", line 206, in <module>
    main()
  File "pose_estimation/train.py", line 115, in main
    model = torch.nn.DataParallel(model, device_ids=gpus).cuda()
  File "/home/zhouzhou/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/zhouzhou/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/home/zhouzhou/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/home/zhouzhou/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 191, in _apply
    param.data = fn(param.data)
  File "/home/zhouzhou/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at /pytorch/aten/src/THC/THCGeneral.cpp:74

My CUDA version is 8.0, my cuDNN version is 7.2.1, and my driver version is 381.22.

Windows 10 installing pycocotools fails with make.

I followed the instructions, and I have "make" on Windows 10 (from GNU, I believe). The error I got is:

# install pycocotools to the Python site-packages
process_begin: CreateProcess(NULL, # install pycocotools to the Python site-packages, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [install] Error 2

Not sure what file I am missing.

I can't download pretrained model from OneDrive

I got the following error when I tried to download the pretrained model from OneDrive:

Something went wrong
Please try again or refresh the page

Is the deepcopy necessary?

https://github.com/Microsoft/human-pose-estimation.pytorch/blob/a515cd38d33fad926501980c27ecafc4eeff9937/lib/dataset/JointsDataset.py#L64

Is the copy.deepcopy() necessary here? Can we directly adopt the value of the index position?

Is there any trick in training?

I tried to train the model myself, but I cannot reach the results you released no matter how I change the parameters.
Is there any trick in training? Or how should the parameters be set?
Thank you!

Can't replicate your performance of 72.4 when testing with gt_box

Thanks for releasing your code.
I ran your code for 140 epochs and only reached 71.6 mAP.
Does this have something to do with the random seed in the dataset?

Could you provide the COCO_test2017_detections_person.json?

Hi, I have trained my own model. Could you provide the detections_person.json you used, so that I can compare against the COCO leaderboard? Thank you.

Could you release the result on validation set?

Since this is a more convenient way to compare the results.

import tensorboardX, Segmentation fault (core dumped)

When I run 'import tensorboardX', it reports an error: Segmentation fault (core dumped). The same error occurs when I run train.py. I tried tensorboardX 1.2 and 1.4, installed via 'conda install'.
Environment:
pytorch: v0.4.0
python: v3.6 (anaconda3)
cuda: 8.0
cudnn: v6

MPII visible annotation format is a little different from the original one

For image 008734041.jpg, the visible annotation is a little different.
Current:

# annot/train.json
"joints_vis": [
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1,
   1
],
"joints":[
[
   373.0, 
   374.0
],
...
,
]

Original:

{"is_visible": {"11": 1, "10": 0, "13": 1, "12": 0, "15": 1, "14": 1, "1": 0, "0": 0, "3": 1, "2": 0, "5": 1, "4": 1, "7": 0, "6": 0, "9": 0, "8": 0}, "head_rect": [353.0, 97.0, 391.0, 145.0],
 "train": 1, 
"joint_pos": {"11": [349.0, 113.0], "10": [344.0, 95.0], "13": [370.0, 147.0], "12": [384.0, 140.0], "15": [334.0, 96.0], "14": [332.0, 132.0], "1": [361.0, 311.0], "0": [373.0, 374.0], "3": [370.0, 236.0], "2": [375.0, 235.0], "5": [371.0, 378.0], "4": [356.0, 314.0], "7": [377.0, 144.0], "6": [373.0, 236.0], "9": [366.7979, 97.0705], "8": [377.2021, 144.9295]}, "filename": "008734041.jpg"}

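The two formats can be compared directly by flattening the index-keyed dict of the original annotation into an ordered list. This is a minimal sketch using the values quoted above; `to_ordered_list` is a hypothetical helper, and it assumes the joint order in annot/train.json follows the numeric keys of the original (joint "0" matches the first entry of "joints" in the excerpt).

```python
# Visibility flags copied from the original annotation for 008734041.jpg.
original_is_visible = {
    "11": 1, "10": 0, "13": 1, "12": 0, "15": 1, "14": 1, "1": 0, "0": 0,
    "3": 1, "2": 0, "5": 1, "4": 1, "7": 0, "6": 0, "9": 0, "8": 0,
}

# Flags as they appear in annot/train.json for the same image.
current_joints_vis = [1] * 16


def to_ordered_list(index_keyed, num_joints=16):
    """Hypothetical helper: order the values by their numeric string key."""
    return [index_keyed[str(i)] for i in range(num_joints)]


original_joints_vis = to_ordered_list(original_is_visible)

# Indices of joints whose visibility flag differs between the two files.
mismatched = [i for i, (a, b) in
              enumerate(zip(original_joints_vis, current_joints_vis)) if a != b]

print(original_joints_vis)
print(mismatched)
```

For this image, nine of the sixteen joints are marked invisible in the original but visible in the current file.
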
ImportError: dynamic module does not define module export function (PyInit_cpu_nms)

Traceback (most recent call last):
  File "pose_estimation/valid.py", line 32, in <module>
    import dataset
  File "/home/liangjian/pose/pose_estimation/../lib/dataset/__init__.py", line 12, in <module>
    from .coco import COCODataset as coco
  File "/home/liangjian/pose/pose_estimation/../lib/dataset/coco.py", line 23, in <module>
    from nms.nms import oks_nms
  File "/home/liangjian/pose/pose_estimation/../lib/nms/nms.py", line 13, in <module>
    from .cpu_nms import cpu_nms
ImportError: dynamic module does not define module export function (PyInit_cpu_nms)

any clue?

Different validation AP/AR results for COCO

I'm running valid.py for the 256x192 ResNet-50 network on the COCO dataset.
I get a different validation result from the one in your README.
This is your result:

Arch	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
256x192_pose_resnet_50_d256d256d256	0.704	0.886	0.783	0.671	0.772	0.763	0.929	0.834	0.721	0.824

and this is what I get:

Arch	AP	AP .5	AP .75	AP (M)	AP (L)	AR	AR .5	AR .75	AR (M)	AR (L)
256x192_pose_resnet_50_d256d256d256	0.724	0.915	0.804	0.697	0.765	0.756	0.930	0.823	0.723	0.804

I downloaded all pretrained models from Google Drive and ran the command below:

python3 pose_estimation/valid.py --cfg experiments/coco/resnet50/256x192_d256x3_adam_lr1e-3.yaml --flip-test --model-file models/pytorch/pose_coco/pose_resnet_50_256x192.pth.tar

Has the model or its weights been updated?

Which PoseTrack standard do you use during evaluation?

There are two papers regarding PoseTrack. In the paper "PoseTrack: Joint Multi-Person Pose Estimation and Tracking", the PCKh threshold is 20% of the head bounding box's diagonal. In the paper "PoseTrack: A Benchmark for Human Pose Estimation and Tracking", the PCKh threshold is 30% of the head bounding box's diagonal. Which one did you use? On the PoseTrack GitHub site, the evaluation toolkit "poseval" uses the 30% version.
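
To make the difference concrete, here is a minimal sketch of a PCKh-style hit test, assuming only what is described above: a prediction counts as correct when it lies within a fraction `alpha` of the head box diagonal from the ground truth. The function name, box layout (x1, y1, x2, y2) and sample numbers are hypothetical and are not taken from poseval.

```python
import math


def pckh_hit(pred, gt, head_box, alpha):
    """Return True if pred lies within alpha * head-box diagonal of gt."""
    x1, y1, x2, y2 = head_box
    diagonal = math.hypot(x2 - x1, y2 - y1)
    distance = math.hypot(pred[0] - gt[0], pred[1] - gt[1])
    return distance <= alpha * diagonal


head_box = (0.0, 0.0, 12.0, 16.0)   # diagonal is 20 px
gt = (100.0, 100.0)
pred = (103.0, 104.0)               # 5 px away from the ground truth

hit_at_20 = pckh_hit(pred, gt, head_box, alpha=0.2)   # threshold 4 px
hit_at_30 = pckh_hit(pred, gt, head_box, alpha=0.3)   # threshold 6 px

print(hit_at_20, hit_at_30)
```

The same prediction is a miss under the 20% rule and a hit under the 30% rule, so reported numbers are only comparable when the threshold is known.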

Error occurs, No graph saved

I ran into an error: no graph was saved.
I downloaded the model from https://s3.amazonaws.com/pytorch/models/resnet50-19c8e357.pth

python pose_estimation/train.py \
    --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml

writer_dict['writer'].add_graph(model, (dump_input, )) <--- found reason here
=> loading pretrained model models/pytorch/imagenet/resnet50-19c8e357.pth
Error occurs, No graph saved
Traceback (most recent call last):
  File "/home/user1/miniconda2/envs/env3.6-torch4-conda/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 84, in graph
    trace, _ = torch.jit.get_trace_graph(model, args)
  File "/home/user1/miniconda2/envs/env3.6-torch4-conda/lib/python3.6/site-packages/torch/jit/__init__.py", line 255, in get_trace_graph
    return LegacyTracedModule(f, nderivs=nderivs)(*args, **kwargs)
  File "/home/user1/miniconda2/envs/env3.6-torch4-conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__

Why is the score for COCO evaluation the product of the kpt score and the bbox score?

I'm a little confused here. It seems Detectron has a different strategy...

How to use the model to predict on my own dataset

Hello, thank you for your great project. I don't know how to use your model to predict on my own dataset. I would just like to use a file such as predict.py whose only inputs are the model path and my dataset path. Thank you.

COCO OKS Metrics Usage

Hi, I am unable to understand how OKS is calculated in the experiments using the COCO dataset.
In the train function in lib/core/function.py you seem to call accuracy from lib/core/evaluate.py. But that accuracy is PCKh, right? So how do you calculate OKS?

Could you please explain the steps to calculate OKS, given that I use your dataloader? Thanks a lot in advance!
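
For reference, here is a minimal sketch of object keypoint similarity for a single person, following the published COCO keypoint definition: each labeled keypoint contributes exp(-d^2 / (2 * s^2 * k^2)), where s^2 is the object area and k is twice the per-keypoint sigma, and the contributions are averaged. The function name and the sigma values below are placeholders, and this is not the code path the repo uses.

```python
import math


def oks(pred, gt, visibility, area, sigmas):
    """OKS between one predicted and one ground-truth pose.

    pred, gt:    lists of (x, y) keypoints
    visibility:  ground-truth flags; keypoints with flag 0 are skipped
    area:        ground-truth object area, used as the scale s^2
    sigmas:      per-keypoint constants (placeholder values in this sketch)
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, sigmas):
        if v <= 0:
            continue
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    return total / count if count else 0.0


gt = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
vis = [2, 2, 0]                      # third keypoint is unlabeled
sigmas = [0.1, 0.1, 0.1]             # placeholder constants
area = 400.0

perfect = oks(gt, gt, vis, area, sigmas)
shifted = oks([(12.0, 10.0), (20.0, 20.0), (99.0, 99.0)], gt, vis, area, sigmas)

print(perfect, shifted)
```

The AP and AR figures are then obtained by thresholding this similarity over the whole validation set, which is separate from the per-batch accuracy printed during training.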

No module named 'nms.cpu_nms'

When I followed the instructions on readme, running "python pose_estimation/valid.py --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml --flip-test --model-file models/pytorch/pose_mpii/pose_resnet_50_256x256.pth.tar", it reported an error:
ImportError: No module named 'nms.cpu_nms'

I noticed there is a cpu_nms.pyx under 'human-pose-estimation.pytorch/lib/nms'. I ran the 'setup.py' in the same directory by typing 'python setup.py install'. Unfortunately, it didn't solve my problem. I also tried adding various paths, including cpu_nms.py, cpu_nms.so..., to the environment, to no avail.

Confusion in inference.py

Hi @leoxiaobin
I am confused about the following code:

px and py

    coords, maxvals = get_max_preds(batch_heatmaps)
    ....
    # post-processing
    if config.TEST.POST_PROCESS:
        for n in range(coords.shape[0]):
            for p in range(coords.shape[1]):
                hm = batch_heatmaps[n][p]
                px = int(math.floor(coords[n][p][0] + 0.5))
                py = int(math.floor(coords[n][p][1] + 0.5))
                if 1 < px < heatmap_width-1 and 1 < py < heatmap_height-1:
                    diff = np.array([hm[py][px+1] - hm[py][px-1],
                                     hm[py+1][px]-hm[py-1][px]])
                    coords[n][p] += np.sign(diff) * .25

coords[n][p][0] is an integer (get_max_preds returns integer coordinates).
Here we add 0.5 and then apply a floor function.
This makes the result equal to coords[n][p][0] again.
So is the +0.5 here useless?
Am I right?
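
The observation can be checked with a small sketch. It assumes, as stated above, that get_max_preds returns integer-valued coordinates; `refine_1d` is a hypothetical one-dimensional stand-in for the quarter-pixel shift, not code from the repo.

```python
import math

# For integer-valued floats, floor(x + 0.5) gives back the same integer.
integer_coords = [0.0, 1.0, 17.0, 47.0]
round_trip = [int(math.floor(x + 0.5)) for x in integer_coords]

# The +0.5 only matters once a coordinate carries a fractional part.
rounded_fraction = int(math.floor(17.6 + 0.5))


def refine_1d(x, left, right):
    """Hypothetical 1-D quarter-pixel shift toward the larger neighbour."""
    sign = (right > left) - (right < left)   # -1, 0 or +1
    return x + 0.25 * sign


refined = refine_1d(17.0, left=0.2, right=0.6)

print(round_trip, rounded_fraction, refined)
```

So with integer inputs the expression is a no-op that only converts the value to int for indexing; it would act as rounding if the coordinates were ever fractional.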

In addition, I find that the hourglass code is different.
hourglass

 -- Very simple post-processing step to improve performance at tight PCK thresholds
    for i = 1,p:size(1) do
        for j = 1,p:size(2) do
            local hm = tmpOutput[i][j]
            local pX,pY = p[i][j][1], p[i][j][2]
            scores[i][j] = hm[pY][pX]
            if pX > 1 and pX < opt.outputRes and pY > 1 and pY < opt.outputRes then
               local diff = torch.Tensor({hm[pY][pX+1]-hm[pY][pX-1], hm[pY+1][pX]-hm[pY-1][pX]})
               p[i][j]:add(diff:sign():mul(.25))
            end
        end
    end
    p:add(0.5)

They add 0.5 after getting the sub-pixel coordinates.

Would you please explain the difference?
Thanks!

Can I run the pose estimation part on the PoseTrack dataset?

I noticed in the source code that your project supports the PoseTrack dataset. I plan to write a yaml file to run it on PoseTrack, but I'm stumped on the annotation format. Do you support the original PoseTrack annotations (in MATLAB format), or a JSON version of those annotations? Or do I have to convert them into COCO format?

I'm sorry that I'm not very familiar with Python, so I can't find those answers just by reading the source code.

Many thanks for any help you can offer.

Tracking Code

Hi,
Thanks for your great paper.
I am also interested in your tracking code.
How can we get access to it?

Thanks,
