liruilong940607 / pose2seg

Code for the paper "Pose2Seg: Detection Free Human Instance Segmentation" @ CVPR2019.

Home Page: http://www.liruilong.cn/projects/pose2seg/index.html

License: MIT License

Languages: Python 73.29%, Jupyter Notebook 26.71%
Topics: segmentation, pose-estimation, cvpr2019, human

pose2seg's Introduction

Pose2Seg

Official code for the paper "Pose2Seg: Detection Free Human Instance Segmentation" [Project Page] [arXiv] @ CVPR 2019.

The OCHuman dataset proposed in our paper is released here.

Pipeline of our pose-based instance segmentation framework.

Setup environment

pip install cython matplotlib tqdm opencv-python scipy pyyaml numpy
pip install torchvision torch

cd ~/github-public/cocoapi/PythonAPI/
python setup.py build_ext install
cd -

Download data

Note: person_keypoints_(train/val)2017_pose2seg.json is a subset of person_keypoints_(train/val)2017.json (from the COCO2017 Train/Val annotations). We keep only the instances that have both keypoint and segmentation annotations for our experiments.
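
For readers who want to rebuild or adapt this subset themselves, the following is a minimal sketch of how such a filter could be applied with the standard json module; the output file name and the exact filtering criteria (labeled keypoints, non-crowd, non-empty segmentation) are assumptions based on the note above, not the authors' original script.

# Sketch: build a pose2seg-style subset of the COCO person keypoint annotations,
# keeping only instances that have both keypoints and a segmentation (assumed criteria).
import json

with open('data/coco2017/annotations/person_keypoints_val2017.json') as f:
    coco = json.load(f)

kept_anns = [a for a in coco['annotations']
             if a.get('num_keypoints', 0) > 0      # has labeled keypoints
             and a.get('segmentation')             # has a segmentation
             and not a.get('iscrowd', 0)]          # skip crowd regions

kept_img_ids = {a['image_id'] for a in kept_anns}
coco['annotations'] = kept_anns
coco['images'] = [im for im in coco['images'] if im['id'] in kept_img_ids]

# Placeholder output name; the official *_pose2seg.json may have been built differently.
with open('data/coco2017/annotations/person_keypoints_val2017_subset.json', 'w') as f:
    json.dump(coco, f)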

Setup data

The data folder should be like this:

data  
├── coco2017
│   ├── annotations  
│   │   ├── person_keypoints_train2017_pose2seg.json 
│   │   ├── person_keypoints_val2017_pose2seg.json 
│   ├── train2017  
│   │   ├── ####.jpg  
│   ├── val2017  
│   │   ├── ####.jpg  
├── OCHuman 
│   ├── annotations  
│   │   ├── ochuman_coco_format_test_range_0.00_1.00.json   
│   │   ├── ochuman_coco_format_val_range_0.00_1.00.json   
│   ├── images  
│   │   ├── ####.jpg 

How to train

python train.py

Note: Currently we only support single-GPU training.

How to test

This allows you to test the model on (1) the COCOPersons val set and (2) the OCHuman val and test sets.

python test.py --weights last.pkl --coco --OCHuman

We retrained our model using this repo and obtained results similar to those in our paper. The final weights can be downloaded here.

About Human Pose Templates in COCO

Pose templates clustered using K-means on COCO.

This repo already contains the template file modeling/templates.json used in our paper, but you are free to explore different clustering parameters as discussed in the paper. See visualize_cluster.ipynb for an example.
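
As a rough illustration of the clustering step, here is a minimal sketch using scikit-learn's KMeans on normalized keypoints; the normalization and the number of clusters are assumptions, so treat it only as a starting point for exploring your own templates rather than the exact procedure from the notebook.

# Sketch: cluster normalized poses into K templates with scikit-learn's KMeans.
# Assumption: `poses` is an (N, 17, 3) array of COCO keypoints already normalized
# (e.g. translated and scaled into a unit box); the paper's exact setup may differ.
import numpy as np
from sklearn.cluster import KMeans

def cluster_pose_templates(poses, n_templates=3, seed=0):
    xy = poses[:, :, :2].reshape(len(poses), -1)   # flatten (x, y) pairs per pose
    km = KMeans(n_clusters=n_templates, n_init=10, random_state=seed).fit(xy)
    templates = km.cluster_centers_.reshape(n_templates, 17, 2)
    return templates, km.labels_

# templates, labels = cluster_pose_templates(normalized_poses, n_templates=3)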


pose2seg's Issues

Testing with OpenPose Keypoints

Thank you for releasing your code.

I have generated keypoints using OpenPose and converted them to COCO format (OpenPose has an extra neck keypoint that I removed). Each person's keypoints form a 17x3 array: the first two columns are the X and Y coordinates, and the last column is the confidence score, ranging from 0.0 to 1.0.

Is this keypoint format suitable for running inference on the pretrained pose2seg model?
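
For reference, a commonly used index remapping from OpenPose's 18-keypoint COCO-model output to the 17-keypoint COCO order looks like the sketch below; the OpenPose joint ordering is an assumption here, so verify it against your OpenPose version before relying on it.

# Sketch: reorder one OpenPose (18-keypoint COCO model) person into COCO-17 order.
# Assumed OpenPose order: [nose, neck, Rsho, Relb, Rwri, Lsho, Lelb, Lwri, Rhip,
# Rkne, Rank, Lhip, Lkne, Lank, Reye, Leye, Rear, Lear]; verify for your version.
import numpy as np

OPENPOSE18_TO_COCO17 = [0, 15, 14, 17, 16, 5, 2, 6, 3, 7, 4, 11, 8, 12, 9, 13, 10]

def openpose_to_coco17(op_kpts):
    """op_kpts: (18, 3) array of [x, y, confidence]; returns a (17, 3) COCO-order array."""
    op_kpts = np.asarray(op_kpts, dtype=np.float32)
    return op_kpts[OPENPOSE18_TO_COCO17]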

Visualize Masks

Is there a way to visualize the masks (as seen in the paper) once training is done?

Posture alignment

Hello, I am using AlphaPose to estimate keypoints and want to perform pose alignment without segmentation. I don't know how to modify the code. Do I need to change the size of my picture?

About the training error

I set up the dataset as the repo describes, but when I run python train.py, I get the error shown below.

(error screenshot omitted)

What should I do to solve this?

Can you share your config for Mask R-CNN? I can't reproduce the results with Mask R-CNN

In the paper you say that you "use the author's released code and configurations from [11]", which is Detectron. I am currently training on the same dataset with maskrcnn-benchmark, but I cannot get anywhere near the result in your paper (0.532 | 0.433 | 0.648).

I was wondering whether you could shed some light on this; that would be great!

By the way, I've visualized your results on COCO and they are amazing. Great work!

Visualizing the result JSON

Hello, I generate a json file with the method below, but then I get an error when visualizing it.
The code is as follows:
imgIds.append(image_id)
filename = "./seg_images/annotations/" + "instances_val2017.json"
f_obj = open(filename, 'w')
json.dump(results_segm, f_obj)
After generating the json file this way, part of it looks like this:
[{"category_id": 1, "segmentation": {"counts": "hX]52V=3N1O1O1O100O2N1O1O1O1O1iNC^E?_:I[E9d:IVE<i:GQE=n:EmD?S;l001N3N3N3L2N2OO10O1001O1l0UO<D7H3MO0O2N2N1O101N3M4L5K7I5K5J7J4L4K7VOgDeN86W;f0S1DQmV2", "size": [426, 640]}, "image_id": 139, "score": 1.0}, {"category_id": 1, "segmentation": {"counts": "cQf31U=5K5L3N2N2O1N2O0O2OO10O10000O010000O010O010O010O01O00100O0010O010O10O10O010O0100O010O10OO2O001O1O010O1O1N1O2O1O001M3M2O2O1L3K6L4M3N1N3iNSNoFP2n8[NgFg1W9bNaF`1Z9Z1L4N101O1eIdLh2\3SMmLh2V3SMQMi2P3TMUMi2l2VMVMh2k2VMYMh2g2VM]Mg2d2VMbMg2^2XMfMe2[2XMiMe2X2ZMlMc2U2

But the content of the ochuman.json file you generated does not look like this, so I don't know where the problem is.
What I want now is to generate the json file and then visualize it.
Looking forward to your reply, thank you!

Got wrong test results about AP(area=medium)

Hi, I just cloned the code and installed cocoAPI from GitHub, but I got wrong test results for AP (area=medium) when I ran test.py following your instructions on the OCHuman dataset. The model is 'pose2seg_release.pkl' and the test results are as follows.

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.573
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.945
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.637
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.073
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.580
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.422
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.682
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.682
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.550
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.682
[POSE2SEG] AP|.5|.75| S| M| L| AR|.5|.75| S| M| L|
[segm_score] OCHumanVal 0.573 0.945 0.637 -1.000 0.073 0.580 0.422 0.682 0.682 -1.000 0.550 0.682

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.547
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.937
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.582
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.064
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.549
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.379
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.649
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.649
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.300
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.650
[POSE2SEG] AP|.5|.75| S| M| L| AR|.5|.75| S| M| L|
[segm_score] OCHumanTest 0.547 0.937 0.582 -1.000 0.064 0.549 0.379 0.649 0.649 -1.000 0.300 0.650

Why is the AP (area=medium) extremely low?

I also have another question about the OCHuman dataset. I updated the dataset to the latest version, but I found that the validation set contains 4291 instances and the test set contains 3819 instances, which is still not consistent with the description in the paper. Is there a problem with my dataset?

Request the dataset address

Hi, dear researcher,
Thank you for your excellent work!
I can't access the dataset address provided by the dataset repo. Could you download it for me or give me a link to download it?

Or can anyone else help me? Thanks a lot! :)

Training on smaller dataset

Hello @liruilong940607, thank you very much for this repo. I wanted to know if it is possible to train on a smaller COCO dataset than what is originally provided in the readme.

I tried looking into the keypoints JSON file and train2017 folder of images. But I'm not sure which data to modify.

During training, the progress line (snippet below) indicates there are 14150 images in the dataloader.

Pose2Seg/train.py

Lines 75 to 85 in 64fcc5e

if i % 10 == 0:
    logger.info('Epoch: [{0}][{1}/{2}]\t'
                'Lr: [{3}]\t'
                'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
                'Data {data_time.val:.3f} ({data_time.avg:.3f})\t'
                'loss {loss.val:.5f} ({loss.avg:.5f})\t'
                .format(
                    epoch, i, len(dataloader), lr,
                    batch_time=averMeters['batch_time'], data_time=averMeters['data_time'],
                    loss=averMeters['loss'])
                )

The train2017 directory contains 118288 images (found with ls -1 | wc -l in the train2017 directory).
person_keypoints_train2017_pose2seg.json has 149813 items in its "annotations" field (using Python's json module).
person_keypoints_train2017_pose2seg.json has 56599 items in its "images" field (using Python's json module).

I suppose that if I want to reduce the number of images the training process uses, I need to reduce the 14150 indicated by the dataloader, but I'm not sure how.

Thank you
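
One simple way to shrink the training set without touching the dataloader code is to shrink the annotation file it reads; the following is a minimal sketch under the assumption that the 14150 count comes from that json (the output file name is a placeholder, and train.py would then need to be pointed at it).

# Sketch: keep only the first N annotated images (and their annotations) in the json.
import json

N = 1000  # illustrative size
with open('data/coco2017/annotations/person_keypoints_train2017_pose2seg.json') as f:
    coco = json.load(f)

keep_ids = {im['id'] for im in coco['images'][:N]}
coco['images'] = [im for im in coco['images'] if im['id'] in keep_ids]
coco['annotations'] = [a for a in coco['annotations'] if a['image_id'] in keep_ids]

# Placeholder output name; point the AnnoFile path in train.py at this file afterwards.
with open('data/coco2017/annotations/person_keypoints_train2017_pose2seg_small.json', 'w') as f:
    json.dump(coco, f)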

Ski2DPose dataset

Hi, Loving your work!
I'm using the Ski2DPose dataset, which I have already converted to COCO format. However, I have run into an issue; please check:

in train(model, dataloader, optimizer, epoch, iteration)
49 averMeters.clear()
50 end = time.time()
---> 51 for i, inputs in enumerate(dataloader):
52 averMeters['data_time'].update(time.time() - end)
53 iteration += 1

/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py in __next__(self)
628 # TODO(pytorch/pytorch#76750)
629 self._reset() # type: ignore[call-arg]
--> 630 data = self._next_data()
631 self._num_yielded += 1
632 if self._dataset_kind == _DatasetKind.Iterable and \

/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py in _next_data(self)
1343 else:
1344 del self._task_info[idx]
-> 1345 return self._process_data(data)
1346
1347 def _try_put_index(self):

/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py in _process_data(self, data)
1369 self._try_put_index()
1370 if isinstance(data, ExceptionWrapper):
-> 1371 data.reraise()
1372 return data
1373

/usr/local/lib/python3.10/dist-packages/torch/_utils.py in reraise(self)
692 # instantiate since we don't know how to
693 raise RuntimeError(msg) from None
--> 694 raise exception
695
696

KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "", line 104, in getitem
rawdata = self.datainfos[idx]
File "/content/Pose2Seg/datasets/CocoDatasetInfo.py", line 130, in getitem
return self.getitem(idx)
File "/content/Pose2Seg/datasets/CocoDatasetInfo.py", line 216, in getitem
if isinstance(obj['segmentation'], list):
KeyError: 'segmentation'

A few questions about the paper

Hello, after reading the paper I have a few questions; I hope you can answer them.
1) Section 4.1 says that image + pose is the input, so:
1.1) How is the pose represented? Is it a heatmap, or is it drawn on the original image as in Figure 4?
1.2) What does the "+" operation mean? Is it a concat, combining the heatmaps (if so) and the image along the channel dimension?

2) About obtaining the poses: I read in another issue that you use the method from "Associative Embedding: End-to-end Learning for Joint Detection and Grouping" to extract poses. Have you also tried MSRA's "Simple Baselines for Human Pose Estimation and Tracking"? Do bottom-up vs. top-down pose results affect the segmentation quality here?

3) Since the skeleton feature is the joint heatmaps stacked with the PAFs, why not run OpenPose directly on the original image and then apply the affine-align operation to the skeleton feature map? That would save one pose prediction. As it stands, the pipeline actually predicts the pose twice (once for the input, once for the skeleton feature); if I understand correctly, isn't that somewhat redundant?

4) Is the whole system (as described in Figure 4) trained end-to-end?

Visualizing the results

Hello,
I ran your model on OCHuman and it produced some numbers.
How can I visualize the results, like the figures in your paper?

How to visualize the output?

Hi,
I am getting the segmentation output after running test.py on my in-the-wild images.
Output:
{"image_id": 6, "category_id": 1, "score": 1.0, "segmentation": {"size": [720, 1280], "counts": "obh`01^f02N2OLiYO0\f0000kUW;"}

How are we supposed to visualize this?
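
The "segmentation" field is a COCO run-length encoding, so one way to inspect it is to decode it with pycocotools and overlay it on the image; a minimal sketch, assuming the test output was saved as a json list of entries like the one above (file paths are placeholders).

# Sketch: decode one RLE segmentation from the saved results and overlay it on the image.
import json
import cv2
import numpy as np
from pycocotools import mask as maskUtils
import matplotlib.pyplot as plt

with open('results.json') as f:          # placeholder: list of result dicts as above
    results = json.load(f)

rle = dict(results[0]['segmentation'])
if isinstance(rle['counts'], str):       # pycocotools expects bytes for 'counts'
    rle['counts'] = rle['counts'].encode('ascii')
mask = maskUtils.decode(rle)             # (H, W) uint8 binary mask

img = cv2.cvtColor(cv2.imread('my_image.jpg'), cv2.COLOR_BGR2RGB)  # placeholder path
overlay = img.copy()
overlay[mask > 0] = (0.5 * overlay[mask > 0] + 0.5 * np.array([255, 0, 0])).astype(np.uint8)

plt.imshow(overlay)
plt.axis('off')
plt.show()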

Question about the input

I have recently been following your work. Is the input a single image, from which the skeleton is estimated and then affine-aligned, or are there two inputs, one being the image and the other an already existing skeleton? I am slightly confused by the paper.

The instance segmentation performance relies heavily on keypoint quality

The Pose2Seg model provided by the author was used to generate instance segmentations on the CrowdHuman dataset, with keypoints generated by the AlphaPose model as input. According to the visualization results, the instance segmentation is not as good as a Mask R-CNN model, because the instance segmentation performance of Pose2Seg depends heavily on keypoint quality.

The keypoints visualization: (attached image 273275,8192f000acfb8e7b)

The instance segmentation visualization: (attached image 273275,8192f000acfb8e7b)

I was amazed when I saw the visualization of instance segmentation on the COCO and OCHuman datasets. But when using another model to generate the keypoints, the result is not good.

Reproducing results without GT Keypoints

Hi @liruilong940607, we are trying to reproduce the results of Pose2Seg.

For the test results without using GT keypoints (2nd row of Tables 2a and 2b in your paper), i.e., using another pose estimator to predict the keypoints instead, I understand that you use Associative Embedding (https://github.com/princeton-vl/pose-ae-train) to generate the predicted keypoints first.

Are you able to share the same json file that contains the predicted keypoints from AE so we can reproduce the results?

Thank you!

This repo needs an "inference.py/ipynb"

First of all, impressive work @liruilong940607.

Secondly, this repo needs an extra file explaining how the input should look, how to feed inputs and get outputs (and maybe the pre-processing steps as well). I am posting this because many issues in this repo ask the same thing (including mine).

What is the goal for aligning the keypoints?

Hi, thanks for your contribution!
I have a question: why should we align all the keypoints? For RoIAlign, the features must be fed into a fully connected layer, so their sizes have to match. But in the segmentation task there is no FC layer, and I cannot figure out the purpose of the affine-align operation.

pose point

Dear author, I have a question: how can I get the 17 pose keypoints using your model? I found that the model output has no pose keypoint parameter. Can you explain? Thank you.

How can I generate a COCO-format keypoints json file?

I used pose-ae to get the 17 keypoints of one person. But how can I generate the COCO-format json file from these 17 keypoints? Like this: {"keypoints": [176.7578125, 111.5390625, 0.9444317817687988, 182.6171875, 105.6796875, 0.9576615691184998, 170.8984375, 105.6796875, 0.9330981969833374, 192.3828125, 111.5390625, 0.9314539432525635, 166.9921875, 111.5390625, 0.6315260529518127, 0, 0, 0, 155.2734375, 140.8359375, 0.8885959982872009, 217.7734375, 177.9453125, 0.9087645411491394, 151.3671875, 179.8984375, 0.8826849460601807, 219.7265625, 207.2421875, 0.8257762789726257, 145.5078125, 215.0546875, 0.6683874130249023, 194.3359375, 218.9609375, 0.7143456935882568, 161.1328125, 217.0078125, 0.7429389357566833, 192.3828125, 285.3671875, 0.7802388072013855, 159.1796875, 279.5078125, 0.7996304035186768, 190.4296875, 330.2890625, 0.831089973449707, 159.1796875, 336.1484375, 0.8201078176498413], "score": 0.7800430655479431, "image_path": "test1.png"}
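
A minimal sketch of wrapping such a flat keypoint list into a COCO-style keypoint result entry is shown below; the image_id mapping and the exact fields your downstream code expects are assumptions, so adjust them to your pipeline.

# Sketch: wrap a flat list of 17 [x, y, confidence] triples into a COCO-style result entry.
import json

def to_coco_keypoint_result(kpts, image_id, score):
    """kpts: iterable of 17 (x, y, confidence) triples; image_id is your own mapping."""
    flat = [float(v) for kp in kpts for v in kp]   # 51 numbers: x1, y1, v1, x2, y2, v2, ...
    return {
        "image_id": image_id,       # placeholder: map from your own image list
        "category_id": 1,           # person
        "keypoints": flat,
        "score": float(score),
    }

# results = [to_coco_keypoint_result(person_kpts, image_id=1, score=0.78)]
# with open('keypoint_results.json', 'w') as f:
#     json.dump(results, f)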

How to save the visualized mask information from the forward pass

Hello, thank you very much for helping solve the problem I raised in my previous issue. I now have another question: with the code you provide, how can I output a COCO-annotation-format json file from the forward pass, so that the masks and skeletons can be displayed with the OCHuman API you provide?
Or is the COCO-annotation-format json produced by other code? Could you share it?
Thank you!

RuntimeError: CUDA out of memory in training

I followed all the steps, but when I run python train.py I get this error: RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 4.00 GiB total capacity; 2.57 GiB already allocated; 74.77 MiB free; 2.85 GiB reserved in total by PyTorch) (malloc at ..\c10\cuda\CUDACachingAllocator.cpp:289) (no backtrace available)

What is in this work's COCO dataset

Hi, thank you for the nice work here.

Could you explain how you obtained person_keypoints_train2017_pose2seg.json and person_keypoints_val2017_pose2seg.json? Were they obtained by keeping only the "person" class from the COCO dataset and also removing "small" persons?

Thanks!

How to run the test on a single image?

Hi,
I installed Pose2Seg and it runs perfectly on your OCHuman dataset.
How can I test it on a single image using your trained model?

BTW, I cannot open the link to visualize_cluster.ipynb.
Thanks.

Not able to download the images

How do I download the images [667MB] and annotations that belong to OCHuman? The link presented there does not redirect to a download; please help.

What's the environment setup and requirement?

Your work is fascinating.

Could you share some information on the environment you used, such as the OS (and version), Python version, and library versions (especially PyTorch)?

Also, when it comes to inference, how much GPU memory is required for a single image (say 512x512, the same as in your experiments)?

Thanks

Data Augmentation

In training, I noticed that all the input matrices obtained by get_aug_matrix are identical. This means there is no translation/rotation/scale/etc. for any input image and pose pair.

Actually, there is a small rotation angle in the default input arguments of get_aug_matrix, and it's interesting to see that these small changes are all suppressed when it's called. Could you please explain the pros and cons of doing so? As stated in another issue, the released model does not work that well when we pass in poor human pose estimation results. Can we improve this with some active data augmentation during training (see the sketch below)? Many thanks!
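
As a rough illustration of what stronger augmentation could look like, here is a minimal sketch that applies the same random affine to an image and its keypoints with OpenCV; it is independent of the repo's get_aug_matrix, and the parameter ranges are arbitrary.

# Sketch: apply one shared random rotation/scale/shift to an image and its keypoints.
import cv2
import numpy as np

def random_affine(img, kpts, max_rot=15, scale_range=(0.8, 1.2), max_shift=0.05):
    """img: (H, W, 3) uint8; kpts: (N, 17, 3) with [x, y, v]; returns augmented copies."""
    h, w = img.shape[:2]
    angle = np.random.uniform(-max_rot, max_rot)
    scale = np.random.uniform(*scale_range)
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle, scale)       # 2x3 affine matrix
    M[:, 2] += np.random.uniform(-max_shift, max_shift, 2) * (w, h)     # random translation

    img_aug = cv2.warpAffine(img, M, (w, h))
    kpts_aug = kpts.copy()
    xy1 = np.concatenate([kpts[..., :2], np.ones_like(kpts[..., :1])], axis=-1)
    kpts_aug[..., :2] = xy1 @ M.T                                       # same transform on the joints
    return img_aug, kpts_aug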

How to run model for new images ?

I want to run the model on my own images. My data has no ground-truth annotations or keypoints.
How can I process my images?

import argparse
import numpy as np
from tqdm import tqdm

from modeling.build_model import Pose2Seg
from datasets.CocoDatasetInfo import CocoDatasetInfo, annToMask
from pycocotools import mask as maskUtils
import cv2, os
import matplotlib
import matplotlib.pyplot as plt

from skimage import data, io, filters

import os
import cv2
import json
model = Pose2Seg().cuda()
model.init('pose2seg_release.pkl')
ImageRoot = './data/coco2017/val2017'
AnnoFile = './data/coco2017/annotations/person_keypoints_val2017_pose2seg.json'
datainfos = CocoDatasetInfo(ImageRoot, AnnoFile, onlyperson=True, loadimg=True)
model.eval()
results_segm = []
imgIds = []
rawdata = datainfos[1]
img = rawdata['data']
print(rawdata['image'])
image_id = rawdata['id']
height, width = img.shape[0:2]
gt_kpts = np.float32(rawdata['gt_keypoints']).transpose(0, 2, 1)  # (N, 17, 3)
gt_segms = rawdata['segms']
gt_masks = np.array([annToMask(segm, height, width) for segm in gt_segms])
output = model([img], [gt_kpts], [gt_masks])
for mask in output[0]:
    maskencode = maskUtils.encode(np.asfortranarray(mask))
    maskencode['counts'] = maskencode['counts'].decode('ascii')
    results_segm.append({
        "image_id": image_id,
        "category_id": 1,
        "score": 1.0,
        "segmentation": maskencode
    })
imgIds.append(image_id)

The model needs keypoints and masks as inputs:

output = model([img], [gt_kpts], [gt_masks])

I need to create masks and keypoints from my own image and visualize them.
Great work, thanks.
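
Building on the snippet above, a possible way to run the released weights on a new image is to feed keypoints from an external pose estimator instead of the ground-truth annotations; whether the masks argument can be dropped (or must be passed as dummy zeros) depends on the model's forward signature in your checkout, so the sketch below is only an assumption, and all file paths are placeholders.

# Sketch: run the released weights on one new image with externally estimated keypoints.
import json
import cv2
import numpy as np
from modeling.build_model import Pose2Seg
from pycocotools import mask as maskUtils

model = Pose2Seg().cuda()
model.init('pose2seg_release.pkl')
model.eval()

img = cv2.imread('my_image.jpg')                          # placeholder path
with open('my_keypoints.json') as f:                      # placeholder: (N, 17, 3) COCO keypoints
    kpts = np.float32(json.load(f)).reshape(-1, 17, 3)

masks = model([img], [kpts])[0]                           # may require a dummy masks argument
results_segm = []
for mask in masks:
    rle = maskUtils.encode(np.asfortranarray(mask))
    rle['counts'] = rle['counts'].decode('ascii')
    results_segm.append({"category_id": 1, "score": 1.0, "segmentation": rle})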
