gengzigang / pct
This is an official implementation of our CVPR 2023 paper "Human Pose as Compositional Tokens" (https://arxiv.org/pdf/2303.11638.pdf)
License: MIT License
Traceback (most recent call last):
  File "./tools/train.py", line 15, in <module>
    from mmpose.apis import train_model
  File "/root/miniconda3/lib/python3.8/site-packages/mmpose/apis/__init__.py", line 2, in <module>
    from .inference import (collect_multi_frames, inference_bottom_up_pose_model,
  File "/root/miniconda3/lib/python3.8/site-packages/mmpose/apis/inference.py", line 17, in <module>
    from mmpose.datasets.dataset_info import DatasetInfo
  File "/root/miniconda3/lib/python3.8/site-packages/mmpose/datasets/__init__.py", line 7, in <module>
    from .datasets import ( # isort:skip
  File "/root/miniconda3/lib/python3.8/site-packages/mmpose/datasets/datasets/__init__.py", line 2, in <module>
    from ...deprecated import (TopDownFreiHandDataset, TopDownOneHand10KDataset,
  File "/root/miniconda3/lib/python3.8/site-packages/mmpose/deprecated.py", line 5, in <module>
    from .datasets.datasets.base import Kpt2dSviewRgbImgTopDownDataset
  File "/root/miniconda3/lib/python3.8/site-packages/mmpose/datasets/datasets/base/__init__.py", line 2, in <module>
    from .kpt_2d_sview_rgb_img_bottom_up_dataset import
  File "/root/miniconda3/lib/python3.8/site-packages/mmpose/datasets/datasets/base/kpt_2d_sview_rgb_img_bottom_up_dataset.py", line 8, in <module>
    from xtcocotools.coco import COCO
  File "/root/miniconda3/lib/python3.8/site-packages/xtcocotools/coco.py", line 58, in <module>
    from . import mask as maskUtils
  File "/root/miniconda3/lib/python3.8/site-packages/xtcocotools/mask.py", line 3, in <module>
    import xtcocotools._mask as _mask
  File "xtcocotools/_mask.pyx", line 1, in init xtcocotools._mask
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
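An error of this kind generally indicates that a compiled extension (here `xtcocotools._mask`) was built against a different NumPy binary interface than the NumPy installed in the environment. As a first diagnostic step, the sketch below lists the installed versions of the packages involved, using only the standard library. The helper name `installed_versions` is hypothetical, and the distribution names are assumptions taken from the traceback.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions based on the traceback above.
    report = installed_versions(["numpy", "xtcocotools", "mmpose"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

If NumPy is older than the version the extension was compiled against, upgrading NumPy, or reinstalling xtcocotools so that it is rebuilt against the installed NumPy, is the usual remedy.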
When I ran the command ./tools/dist_train.sh configs/pct_[base/large/huge]_classifier.py 8, I got this error: AssertionError: MMCV==1.7.0 is used but incompatible. Please install mmcv>=2.0.0rc4, <=2.1.0.
I then uninstalled the current mmcv and installed version 2.0.0rc4, but hit another error: ImportError: cannot import name 'Config' from 'mmcv'
How can I resolve this?
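The second error is consistent with how the libraries are organized: mmcv 1.x exposes `Config` directly, while in mmcv 2.x it is provided by the separate mmengine package, so code written for mmcv 1.x fails on `from mmcv import Config` after the upgrade. Another report in this list notes that the code works with mmcv 1.7.0 and mmpose 0.29.0, which suggests the first assertion comes from a newer mmpose being installed. The sketch below only illustrates the version split; the helper name `config_module_for` is hypothetical.

```python
def config_module_for(mmcv_version: str) -> str:
    """Return the package expected to provide `Config` for an mmcv version string."""
    major = int(mmcv_version.split(".")[0])
    # mmcv 1.x: `from mmcv import Config`
    # mmcv 2.x: `from mmengine.config import Config`
    return "mmcv" if major < 2 else "mmengine"


if __name__ == "__main__":
    for version in ("1.7.0", "2.0.0rc4"):
        print(version, "->", config_module_for(version))
```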
Hi, could I get more information about the Swin backbone you used? Is it a modified version of Swin Transformer? I counted the parameters of the backbone alone and got only about 8,000, which is very different from the roughly one billion parameters of the version originally released by Microsoft. Could you share more details about the backbone you used?
Number of parameters?
FLOPs (backbone only)?
Can this model only be trained with distributed training? How can I use it without distributed training?
Hello, thanks for the interesting idea in this paper.
In the paper, the dimension of each token feature is H and the dimension of each codebook embedding is N.
Is H equal to N?
Eq. 2 computes the distance between these two vectors, so their dimensions should be the same.
Looking forward to your reply!
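To make the dimension question concrete, here is a minimal sketch of a nearest-entry lookup of the kind Eq. 2 describes, written in plain Python. It is a generic illustration, not the authors' implementation: the function name `nearest_code`, the Euclidean distance, and the toy codebook are all assumptions. The lookup is only defined when the feature and every codebook entry have the same length.

```python
import math


def nearest_code(feature, codebook):
    """Return the index of the codebook entry closest to `feature` (Euclidean)."""
    best_index, best_dist = -1, math.inf
    for index, entry in enumerate(codebook):
        if len(entry) != len(feature):
            raise ValueError("feature and codebook entry must share a dimension")
        dist = math.dist(feature, entry)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


if __name__ == "__main__":
    # Toy 2-D codebook with three entries (illustrative values only).
    toy_codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
    print(nearest_code([0.9, 1.2], toy_codebook))
```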
Hi, I've been reading your paper, good work!
However, there are two terms I don't quite understand. In Section 4.5 (ablation study), you mention Image Guidance and an auxiliary Pose Reconstruction Loss. I don't know what they refer to since I'm new to this field. Could you explain?
Is conversion to ONNX supported?
Hi, I've been trying to use your checkpoint for a private project. I wanted to see whether I could train my own PCT, so I trained it on COCO only. When I plugged the resulting model into my project, it yielded subpar results.
Which datasets was the publicly available checkpoint trained on, @Gengzigang?
Looking forward to your reply.
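Why are all the functions in builder.py commented out? I spent a long time setting up the environment, thinking the problem was on my side.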
Hi, thanks for your insightful work. I was able to reproduce the paper's results on the COCO dataset, and I am now attempting to reproduce the results on H36M. However, I cannot find any sample code or instructions for that. Could you share the code, or explain how you conducted the experiments on H36M?
Thank you for sharing the code. I am a beginner in human pose estimation. I noticed that your method achieves very good results on the MPII dataset, so I would like to use it there. But COCO and MPII are not quite the same. How can I train the model on MPII data? I would appreciate any advice, thank you!
I am having a problem installing the mmpose module. It requires chumpy, whose installation does not recognize pip. What can I do to solve this? Has anybody else run into it?
I'm trying to train the model with Python 3.10.10 on Kaggle. Can anyone help me choose the right versions of the packages in requirement.txt? Thanks.
Hi, thanks for opening such a great project.
I ran multiple inferences using your pretrained swin_large model, and the scores of all keypoints are the same, which doesn't seem right.
How can I fix this so that I can identify keypoints that are not in the image?
Hi, could you tell me how you initialize a new codebook? Also, your paper says the codebook is updated using an exponential moving average of previous tokens. Does this step happen in Stage I (i.e., training the encoder), or is it an additional stage before training the tokenizer?
Have you added the CrowdPose, OCHuman, and SyncOCC datasets when training the PCT model, to try to improve 2D pose estimation performance?
My idea is to just change the config and data directory to the MPII dataset. Is this right? Is the MPII dataset almost the same as the COCO dataset?
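For readers unfamiliar with the term, the sketch below shows one common form of an exponential-moving-average codebook update, in which each entry drifts toward the mean of the features currently assigned to it. This is a generic illustration and not necessarily what the authors' code does: the function name `ema_update`, the `decay` value, and the choice to leave unused entries unchanged are all assumptions.

```python
def ema_update(codebook, features, assignments, decay=0.9):
    """Move each codebook entry toward the mean of the features assigned to it.

    codebook:    list of entries, each a list of floats
    features:    list of feature vectors with the same dimension as the entries
    assignments: assignments[i] is the codebook index chosen for features[i]
    """
    dim = len(codebook[0])
    updated = []
    for index, entry in enumerate(codebook):
        assigned = [f for f, a in zip(features, assignments) if a == index]
        if not assigned:
            updated.append(list(entry))  # assumption: unused entries stay unchanged
            continue
        mean = [sum(f[d] for f in assigned) / len(assigned) for d in range(dim)]
        updated.append([decay * entry[d] + (1 - decay) * mean[d] for d in range(dim)])
    return updated


if __name__ == "__main__":
    book = [[0.0, 0.0], [1.0, 1.0]]
    feats = [[2.0, 2.0], [4.0, 4.0]]
    print(ema_update(book, feats, assignments=[0, 0], decay=0.5))
```

Because the update is a running statistic rather than a gradient step, it can run inside the same training loop that trains the encoder; whether the paper does so is exactly the question asked above.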
Hi, nice work. Thank you for sharing code,
I have run the demo, but it produces only 2D poses.
I am trying to get whole-body (e.g., COCO-WholeBody) and 3D poses.
How can I get whole-body or 3D poses?
Thank you
Hi, when training the classifier, are the weights of the decoder and codebook not frozen? I didn't find any steps in the code for freezing weights.
I tried changing the backbone to see how the model performs. I retrained my backbone (ResNet) with heatmap supervision on COCO, as described in the repository documentation, and then trained the tokenizer and the classifier, but the final model has poor precision (AP = 0.150 and AP@0.5 = 0.4). Is there something I missed, or does the model only work with Swin?
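I have now finished training the model and successfully ran dist_test.sh. What should I do to get the result shown in the demo, i.e., how do I output images with the extracted keypoints drawn on them?
I would appreciate any advice.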
When I run your demo, I got that error. How can I solve it?
The output score of a keypoint can be greater than 1. What does the output score mean?
Hello @Gengzigang
I tried using a lightweight backbone to train the tokenizer and classifier, but I get bad results. Is there something I missed?
The result of tokenizer model:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.965
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.990
The result of classifier model:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.365
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.733
I changed the backbone and optimizer in the config of the classifier model:
optimizer = dict(type='AdamW', lr=8e-4, betas=(0.9, 0.999), weight_decay=0.05)
When I start training the classifier model, the AP only goes from 0.005 to 0.007. So how should the classifier model be trained?
Thank you for taking the time to answer.
@Gengzigang how did you run the model's inference on a single image? Is there a file missing from your repository? I tried the one from mmpose, but it didn't work for me. Could you share your working environment for inference?
According to the paper and code, the backbone should be frozen for faster training. However, the provided classifier model does not have the same parameters as the provided backbone, which makes it confusing how to train the final classifier model.
The final classifier model reaches only 0.343 mAP on the COCO dataset with "pct_base_woimgguide_classifier.py".
{"mode": "val", "epoch": 210, "iter": 67, "lr": 0.0, "AP": 0.3434, "AP .5": 0.68824, "AP .75": 0.30028, "AP (M)": 0.33237, "AP (L)": 0.36357, "AR": 0.38416, "AR .5": 0.71678, "AR .75": 0.36288, "AR (M)": 0.3646, "AR (L)": 0.41226}.
It would be very helpful if this question could be resolved. Looking forward to your response.
  File "user/PCT/tools/train.py", line 19, in <module>
    from models import build_posenet
ModuleNotFoundError: No module named 'models'
I run into this issue even though there is a models folder under PCT.
How can I fix it?
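When a script is run as `python tools/train.py`, Python puts the script's own directory (`tools`) on the import path rather than the repository root, so a top-level `models` folder is not found. One common workaround is to add the repository root to `sys.path` before the import, sketched below. The helper name `add_repo_root` is hypothetical, and the layout (`tools/train.py` one level below the root that contains `models`) is an assumption based on the traceback above.

```python
import sys
from pathlib import Path


def add_repo_root(script_path, levels_up=1):
    """Put the directory `levels_up` above the script's folder on sys.path.

    For ".../PCT/tools/train.py" and levels_up=1 this is ".../PCT".
    """
    root = Path(script_path).resolve().parents[levels_up]
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root


if __name__ == "__main__":
    # In tools/train.py this would be called with __file__ before
    # `from models import build_posenet`.
    print(add_repo_root(__file__, levels_up=0))
```

Setting the `PYTHONPATH` environment variable to the repository root before launching the script has the same effect without editing any file.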
When I ran demo_img_with_mmdet.py, I checked the output pose_results from inference_top_down_pose_model(). All 17 keypoints share the same confidence value, which seems to be the aggregated confidence of all 17 keypoints. Could you fix the issue so that the demo script shows the individual confidence of each keypoint?
Hello @Gengzigang and team,
The idea of representing human pose as compositional tokens (PCT) is both unique and compelling. By modeling the relationship between keypoints in such a structured manner, it's pretty inspiring.
However, I have a question regarding your model choice. I noticed that you opted for a Swin-based model for implementation. Given the current success and traction of ViTPose, I'm curious as to why you didn't choose to integrate PCT directly with ViTPose. Was there a specific reason or advantage for preferring the Swin-based model over ViTPose when incorporating PCT?
Thank you for taking the time to answer. I'm eager to delve deeper into your work and truly appreciate the effort you've put into this research. Looking forward to your insights!
Warm regards,
Jia-Yau
@Gengzigang in the paper, you say that when training the classifier you fix the backbone to save computation, so only the classification head is updated.
My problem is that I changed the backbone, I have enough compute, and I want to update the backbone during classifier training. I looked through your code but couldn't find where this is specified so that I can change it. Could you please give me some hints?
First, thank you for this awesome project.
According to issue #1 , mmcv and mmpose are respectively set to 1.7.0 and 0.29.0. With these versions, the code works perfectly well, but since I need to use other models provided by mmpose (using latest version, i.e. > 1.0.0), I have compatibility issues with these two projects.
So, I was wondering whether you intend to upgrade the current project to the newest versions of mmcv/mmpose in the near future. I know this is a really painful task (that's why I'm not willing to do it myself), so I'm not trying to pressure you; I just want to know in advance whether you are planning to make these changes.