bearpaw / pytorch-pose
A PyTorch toolkit for 2D Human Pose Estimation.
License: GNU General Public License v3.0
Hi, thanks for your code!
I see that the mean PCKh@0.5 of the 4-stack hourglass model trained with this code is 83.58, which is 4-5 points lower than the model trained with the original Torch code (87.8, as reported in the paper). So is there any difference between this implementation and the original code, such as the data augmentation or the intermediate supervision?
Thank you!
When I run gen_coco.m, I get a "cocoScale not found" error. It does not seem to be in the official COCO MATLAB API. Where can I find it?
Hi,
Thanks for your code. I have one question about some code in the "transform" function.
new_pt = np.array([pt[0] - 1, pt[1] - 1, 1.]).T
new_pt = np.dot(t, new_pt)
return new_pt[:2].astype(int) + 1
According to the above code, you first subtract 1 from the coordinates and then add 1 back after the transformation. I don't see the reason for doing this. There are two places that call this "transform" function. The first is in datasets/mpii.py:
tpts[i, 0:2] = to_torch(transform(tpts[i, 0:2]+1, c, s, [self.out_res, self.out_res], rot=r))
target[i] = draw_labelmap(target[i], tpts[i]-1, self.sigma, type=self.label_type)
Here you add 1 before and subtract 1 after calling the "transform" function, which just offsets what is done inside it. In this case, we could remove the plus 1 and minus 1 for clarity.
Second, the function "final_preds" calls "transform_preds", which then calls "transform" as follows:
coords[p, 0:2] = to_torch(transform(coords[p, 0:2], center, scale, res, 1, 0))
In this case, I also read the original torch code:
https://github.com/anewell/pose-hg-demo/blob/master/util.lua
It seems they don't add 1 beforehand and subtract 1 afterwards. I think adding 1 before the transformation is not equivalent to subtracting 1 after it. Could you please explain your reasoning?
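For concreteness, here is a standalone sketch (with a made-up affine matrix, not the repo's) showing that the -1/+1 wrap is not a no-op:

import numpy as np

# hypothetical 3x3 affine in homogeneous form: scale by 0.5, translate by 10
t = np.array([[0.5, 0.0, 10.0],
              [0.0, 0.5, 10.0],
              [0.0, 0.0, 1.0]])

pt = np.array([33.0, 65.0])  # a 1-based input coordinate

# as in the repo: shift to 0-based, apply t, shift back to 1-based
wrapped = (t @ np.array([pt[0] - 1, pt[1] - 1, 1.0]))[:2].astype(int) + 1

# without the shift
naive = (t @ np.array([pt[0], pt[1], 1.0]))[:2].astype(int)

print(wrapped, naive)  # [27 43] vs [26 42]: the results differ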
Thanks,
Hi, @bearpaw
I read and ran your script https://github.com/bearpaw/pytorch-pose/blob/master/miscs/gen_mpii.m.
I find that the sizes of the training and validation sets in the generated json file are 22246 and 2958. However, the script https://github.com/shihenw/convolutional-pose-machines-release/blob/master/training/genJSON.m creates a json file containing 25925 training and 2958 validation samples. I found that the difference in the training set is due to your code:
loc = find(sum(~bsxfun(@minus, tompson_i_p, [i;p]))==2, 1);
loc2 = find(tompson.RELEASE_img_index == i);
if(~isempty(loc))
    validationCount = validationCount + 1;
    isValidation = 1;
elseif (isempty(loc2))
    trainCount = trainCount + 1;
    isValidation = 0;
else
    continue;
end
In file https://github.com/shihenw/convolutional-pose-machines-release/blob/master/training/genJSON.m, the corresponding code is
loc = find(sum(~bsxfun(@minus, tompson_i_p, [i;p]))==2, 1);
if(~isempty(loc))
    validationCount = validationCount + 1;
    %fprintf('Tomspon''s validation! %d\n', validationCount);
    isValidation = 1;
else
    isValidation = 0;
end
The difference means that if one person in an image belongs to the validation set, you don't use any of the other persons in that image for training, which results in fewer training samples. Do you have a specific reason for doing this? I think a larger training set should be more beneficial. Are the results posted at https://github.com/bearpaw/pytorch-pose/ based on the smaller training set of size 22246?
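For clarity, here is a toy Python sketch (with hypothetical indices) of the two split policies being compared:

val_pairs = {(3, 1)}   # (image_index, person_index) pairs in Tompson's validation list
val_images = {3}       # images containing any validation person

samples = [(3, 1), (3, 2), (5, 1)]

# gen_mpii.m: skip non-validation persons that appear on a validation image
train_a = [s for s in samples if s not in val_pairs and s[0] not in val_images]

# genJSON.m: keep those persons for training
train_b = [s for s in samples if s not in val_pairs]

print(train_a)  # [(5, 1)]          -> smaller training set
print(train_b)  # [(3, 2), (5, 1)]  -> larger training set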
Thanks!
Hi bearpaw, I tried the code with the default settings on a small dataset of, say, 4 images, and found that it trains no better than on the full MPII dataset in terms of training loss: on the full dataset the training loss gets as low as ~1e-4, while on the 4-image set it plateaus at ~1e-3.
This is very strange. Since verifying that the model and code can overfit a small set seems necessary before any further step, could you provide some advice on this issue? Thanks.
I just ran a very simple hg8 architecture; the log is as follows.
==> creating model 'hg', stacks=8, blocks=1
Total params: 25.59M
Mean: 0.4404, 0.4440, 0.4327
Std: 0.2458, 0.2410, 0.2468
Epoch: 1 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000182s | Batch: 2.038s | Total: 0:20:48 | ETA: 0:00:01 | Loss: 0.0063 | Acc: 0.1950
Processing |################################| (493/493) Data: 0.000143s | Batch: 0.154s | Total: 0:01:15 | ETA: 0:00:01 | Loss: 0.0077 | Acc: 0.3638
Epoch: 2 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000244s | Batch: 0.281s | Total: 0:20:44 | ETA: 0:00:01 | Loss: 0.0052 | Acc: 0.3876
Processing |################################| (493/493) Data: 0.000139s | Batch: 0.140s | Total: 0:01:09 | ETA: 0:00:01 | Loss: 0.0072 | Acc: 0.5017
Epoch: 3 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000260s | Batch: 0.279s | Total: 0:20:39 | ETA: 0:00:01 | Loss: 0.0048 | Acc: 0.5024
Processing |################################| (493/493) Data: 0.000133s | Batch: 0.141s | Total: 0:01:09 | ETA: 0:00:01 | Loss: 0.0064 | Acc: 0.5538
Epoch: 4 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000276s | Batch: 0.277s | Total: 0:20:33 | ETA: 0:00:01 | Loss: 0.0046 | Acc: 0.5604
Processing |################################| (493/493) Data: 0.000131s | Batch: 0.133s | Total: 0:01:05 | ETA: 0:00:01 | Loss: 0.0055 | Acc: 0.6337
Epoch: 5 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000401s | Batch: 0.286s | Total: 0:20:35 | ETA: 0:00:01 | Loss: 0.0044 | Acc: 0.6009
Processing |################################| (493/493) Data: 0.000134s | Batch: 0.130s | Total: 0:01:04 | ETA: 0:00:01 | Loss: 0.0049 | Acc: 0.6572
Epoch: 6 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000247s | Batch: 0.283s | Total: 0:20:32 | ETA: 0:00:01 | Loss: 0.0043 | Acc: 0.6289
Processing |################################| (493/493) Data: 0.000095s | Batch: 0.131s | Total: 0:01:04 | ETA: 0:00:01 | Loss: 0.0046 | Acc: 0.6670
Epoch: 7 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000177s | Batch: 0.252s | Total: 0:20:42 | ETA: 0:00:01 | Loss: 0.0042 | Acc: 0.6469
Processing |################################| (493/493) Data: 0.000153s | Batch: 0.128s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 0.0046 | Acc: 0.6934
Epoch: 8 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000222s | Batch: 0.282s | Total: 0:20:23 | ETA: 0:00:01 | Loss: 0.0041 | Acc: 0.6661
Processing |################################| (493/493) Data: 0.000157s | Batch: 0.130s | Total: 0:01:04 | ETA: 0:00:01 | Loss: 0.0056 | Acc: 0.6942
Epoch: 9 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000262s | Batch: 0.276s | Total: 0:20:26 | ETA: 0:00:01 | Loss: 0.0040 | Acc: 0.6812
Processing |################################| (493/493) Data: 0.000144s | Batch: 0.149s | Total: 0:01:13 | ETA: 0:00:01 | Loss: 0.0068 | Acc: 0.6930
Epoch: 10 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000206s | Batch: 0.254s | Total: 0:20:36 | ETA: 0:00:01 | Loss: 0.0039 | Acc: 0.6923
Processing |################################| (493/493) Data: 0.000211s | Batch: 0.149s | Total: 0:01:13 | ETA: 0:00:01 | Loss: 0.0076 | Acc: 0.7049
Epoch: 11 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000268s | Batch: 0.290s | Total: 0:20:27 | ETA: 0:00:01 | Loss: 0.0039 | Acc: 0.7046
Processing |################################| (493/493) Data: 0.000173s | Batch: 0.142s | Total: 0:01:09 | ETA: 0:00:01 | Loss: 0.0095 | Acc: 0.7010
Epoch: 12 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000269s | Batch: 0.288s | Total: 0:20:57 | ETA: 0:00:01 | Loss: 0.0038 | Acc: 0.7108
Processing |################################| (493/493) Data: 0.000177s | Batch: 0.136s | Total: 0:01:07 | ETA: 0:00:01 | Loss: 0.0138 | Acc: 0.6223
Epoch: 13 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000226s | Batch: 0.253s | Total: 0:20:23 | ETA: 0:00:01 | Loss: 0.0037 | Acc: 0.7201
Processing |################################| (493/493) Data: 0.000237s | Batch: 0.140s | Total: 0:01:08 | ETA: 0:00:01 | Loss: 0.0221 | Acc: 0.5394
Epoch: 14 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000273s | Batch: 0.293s | Total: 0:20:31 | ETA: 0:00:01 | Loss: 0.0037 | Acc: 0.7289
Processing |################################| (493/493) Data: 0.000174s | Batch: 0.130s | Total: 0:01:04 | ETA: 0:00:01 | Loss: 0.0315 | Acc: 0.3212
Epoch: 15 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000453s | Batch: 0.321s | Total: 0:20:47 | ETA: 0:00:01 | Loss: 0.0036 | Acc: 0.7355
Processing |################################| (493/493) Data: 0.000147s | Batch: 0.149s | Total: 0:01:13 | ETA: 0:00:01 | Loss: 0.0528 | Acc: 0.0971
Epoch: 16 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000270s | Batch: 0.280s | Total: 0:20:42 | ETA: 0:00:01 | Loss: 0.0036 | Acc: 0.7417
Processing |################################| (493/493) Data: 0.000178s | Batch: 0.129s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 0.0900 | Acc: 0.0151
Epoch: 17 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000268s | Batch: 0.289s | Total: 0:20:28 | ETA: 0:00:01 | Loss: 0.0035 | Acc: 0.7481
Processing |################################| (493/493) Data: 0.000145s | Batch: 0.148s | Total: 0:01:13 | ETA: 0:00:01 | Loss: 0.1890 | Acc: 0.0089
Epoch: 18 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000220s | Batch: 0.276s | Total: 0:20:24 | ETA: 0:00:01 | Loss: 0.0035 | Acc: 0.7525
Processing |################################| (493/493) Data: 0.000082s | Batch: 0.136s | Total: 0:01:06 | ETA: 0:00:01 | Loss: 0.3065 | Acc: 0.0000
Epoch: 19 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000186s | Batch: 0.275s | Total: 0:20:02 | ETA: 0:00:01 | Loss: 0.0035 | Acc: 0.7589
Processing |################################| (493/493) Data: 0.000080s | Batch: 0.135s | Total: 0:01:06 | ETA: 0:00:01 | Loss: 1.0547 | Acc: 0.0015
Epoch: 20 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000127s | Batch: 0.240s | Total: 0:20:19 | ETA: 0:00:01 | Loss: 0.0034 | Acc: 0.7641
Processing |################################| (493/493) Data: 0.000147s | Batch: 0.130s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 1.7841 | Acc: 0.0019
Epoch: 21 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000245s | Batch: 0.271s | Total: 0:20:13 | ETA: 0:00:01 | Loss: 0.0034 | Acc: 0.7690
Processing |################################| (493/493) Data: 0.000080s | Batch: 0.128s | Total: 0:01:02 | ETA: 0:00:01 | Loss: 4.3475 | Acc: 0.0000
Epoch: 22 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000315s | Batch: 0.295s | Total: 0:20:08 | ETA: 0:00:01 | Loss: 0.0034 | Acc: 0.7716
Processing |################################| (493/493) Data: 0.000086s | Batch: 0.136s | Total: 0:01:07 | ETA: 0:00:01 | Loss: 11.9544 | Acc: 0.0029
Epoch: 23 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000165s | Batch: 0.261s | Total: 0:20:04 | ETA: 0:00:01 | Loss: 0.0034 | Acc: 0.7757
Processing |################################| (493/493) Data: 0.000140s | Batch: 0.141s | Total: 0:01:09 | ETA: 0:00:01 | Loss: 22.9730 | Acc: 0.0000
Epoch: 24 | LR: 0.00025000
Processing |################################| (3708/3708) Data: 0.000216s | Batch: 0.267s | Total: 0:20:06 | ETA: 0:00:01 | Loss: 0.0034 | Acc: 0.7793
Processing |################################| (493/493) Data: 0.000123s | Batch: 0.137s | Total: 0:01:07 | ETA: 0:00:01 | Loss: 141.7624 | Acc: 0.0000
Hi, thanks for your code!
Does this implementation have the intermediate supervision mechanism?
Thanks for your code! Could you give me some advice?
sun@sunwin:$ cd /home/sun/pytorch-pose
sun@sunwin:/pytorch-pose$ ln -s PATH_TO_MPII_IMAGES_DIR data/mpii/images
sun@sunwin:/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg4 --checkpoint checkpoint/mpii/hg4 --resume checkpoint/mpii/hg4/model_best.pth.tar -e -d
Traceback (most recent call last):
  File "example/mpii.py", line 14, in <module>
    from pose import Bar
ImportError: No module named pose
sun@sunwin:/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg1 --checkpoint checkpoint/mpii/hg1 -j 4
Traceback (most recent call last):
  File "example/mpii.py", line 14, in <module>
    from pose import Bar
ImportError: No module named pose
I really need help! Thank you!
I cannot find where the "features" parameter is used. What is its meaning?
Hi,
when I run the testing and training example following the Usage instructions, there is an error. I don't know how to fix it.
==> creating model 'hg', stacks=2, blocks=1
=> loading checkpoint 'checkpoint/mpii/hg_s2_b1/model_best.pth.tar'
=> loaded checkpoint 'checkpoint/mpii/hg_s2_b1/model_best.pth.tar' (epoch 185)
Total params: 6.73M
Mean: 0.4404, 0.4440, 0.4327
Std: 0.2458, 0.2410, 0.2468
Evaluation only
Traceback (most recent call last):
File "example/mpii.py", line 352, in
main(parser.parse_args())
File "example/mpii.py", line 92, in main
loss, acc, predictions = validate(val_loader, model, criterion, args.num_classes, args.debug, args.flip)
File "example/mpii.py", line 222, in validate
for i, (inputs, target, meta) in enumerate(val_loader):
File "/home/amax/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 201, in next
return self._process_next_batch(batch)
File "/home/amax/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 221, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
File "/home/amax/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 40, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/amax/pytorch-hourglass/pytorch-pose/pose/datasets/mpii.py", line 119, in getitem
target[i] = draw_labelmap(target[i], tpts[i]-1, self.sigma, type=self.label_type)
File "/home/amax/pytorch-hourglass/pytorch-pose/pose/utils/imutils.py", line 72, in draw_labelmap
g = np.exp(- ((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
ValueError: A global iterator flag was passed as a per-operand flag to the iterator constructor
Hi Dr. Yang,
I read the data loading file of the LSP dataset:
https://github.com/bearpaw/pytorch-pose/blob/master/pose/datasets/lsp.py
I have two questions:
Thanks!
Hi,
I was wondering if there is an extra residual module in your model compared to the original implementation. In your code, the output of the hourglass is followed by:
y = self.hg[i](x)
y = self.res[i](y)
y = self.fc[i](y)
score = self.score[i](y)
However, in the original implementation there is a sequence of three 1x1 convolutions (the first two followed by batch norm and ReLU):
-- First hourglass
local hg1 = hourglass(4,256,512,r6)
local l1 = lin(512,512,hg1)
local l2 = lin(512,256,l1)
-- First predicted heatmaps
local out1 = nnlib.SpatialConvolution(256,outputDim[1][1],1,1,1,1,0,0)(l2)
Can you please comment on that?
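For reference, here is my reading of the two heads as a PyTorch sketch (the channel widths and the 16 MPII output joints are assumptions):

import torch.nn as nn

def lin(in_ch, out_ch):
    # the original Torch lin(): 1x1 conv followed by BatchNorm and ReLU
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, kernel_size=1),
                         nn.BatchNorm2d(out_ch),
                         nn.ReLU(inplace=True))

# original implementation: hourglass -> lin -> lin -> plain 1x1 conv
original_head = nn.Sequential(lin(512, 512), lin(512, 256),
                              nn.Conv2d(256, 16, kernel_size=1))

# this repo (as quoted above): hourglass -> residual block -> conv-bn-relu
# "fc" (equivalent to one lin) -> 1x1 score conv, i.e. the first lin is
# replaced by a full residual module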
Hi @bearpaw, very clear code!
In pose/datasets/mpii.py, line 95:
r = torch.randn(1).mul_(rf).clamp(-2*rf, 2*rf)[0] if random.random() <= 0.6 else 0
The rotation angles are in [-60, 60]. This is not consistent with the Hourglass work and your paper (Multi-Context Attention for Human Pose Estimation), which use [-30, 30]. I also see similar code in pose-attention.
Does larger angle variation help the training of pose detection?
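For reference, the sampled range can be checked in isolation (rf = 30 for MPII is my assumption here):

import random
import torch

rf = 30  # rotation factor (assumed)
r = torch.randn(1).mul_(rf).clamp(-2 * rf, 2 * rf)[0] if random.random() <= 0.6 else 0
# randn(1) * 30 clamped to [-60, 60]: rotations up to +/-60 degrees, while
# clamp(-rf, rf) would give the paper's [-30, 30]
print(float(r))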
I trained with the original code and dataset on two different machines, one with a 1060 GPU and the other with two 1080Ti, but I never got an accuracy over 70%, and it grows quite slowly (some people reported 20% after 2 epochs, but mine is still well below 10%). I noticed that someone in another issue said they couldn't get good performance on PyTorch 0.4.0 either, so I wonder if anyone has. I really don't want to downgrade my PyTorch version, since I have been modifying the code to implement parts of a paper that don't work on older PyTorch versions.
If I use my own dataset, where every image is 640 x 480, what value should I set the scale to?
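A note on the convention (my understanding, not from the repo's docs): in the MPII-style annotations used here, scale is the person's height divided by 200 pixels, and center is the person's center in the image. A sketch:

def rough_scale(person_height_px):
    # MPII-style scale: person height relative to 200 px
    return person_height_px / 200.0

print(rough_scale(480))  # 2.4 for a person spanning the full 480 px height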
Does it use heatmaps? I could not find an NMS function.
I noticed that your log here uses a larger learning rate, starting from 0.001 with schedule=[150, 175, 200]; below is part of your log:
Epoch LR Train Loss Val Loss Train Acc Val Acc
1.000000 0.001000 0.001369 0.000828 0.070879 0.138562
2.000000 0.001000 0.000856 0.001058 0.158208 0.200655
3.000000 0.001000 0.000758 0.000854 0.213208 0.208725
4.000000 0.001000 0.000699 0.000596 0.281929 0.384714
5.000000 0.001000 0.000635 0.000575 0.337208 0.440630
6.000000 0.001000 0.000582 0.000541 0.421062 0.487058
7.000000 0.001000 0.000559 0.000521 0.467490 0.538204
8.000000 0.001000 0.000536 0.000495 0.514954 0.582253
9.000000 0.001000 0.000520 0.000483 0.549438 0.609111
10.000000 0.001000 0.000506 0.000469 0.574788 0.634015
11.000000 0.001000 0.000497 0.000475 0.595450 0.629678
12.000000 0.001000 0.000488 0.000458 0.610554 0.655569
13.000000 0.001000 0.000481 0.000464 0.621428 0.642120
14.000000 0.001000 0.000475 0.000444 0.634942 0.674910
15.000000 0.001000 0.000470 0.000445 0.643844 0.672073
16.000000 0.001000 0.000465 0.000457 0.649695 0.644244
17.000000 0.001000 0.000461 0.000434 0.657655 0.692058
18.000000 0.001000 0.000455 0.000432 0.669486 0.699718
19.000000 0.001000 0.000451 0.000431 0.675828 0.704502
20.000000 0.001000 0.000450 0.000427 0.676318 0.705441
21.000000 0.001000 0.000447 0.000423 0.685184 0.715312
22.000000 0.001000 0.000444 0.000439 0.687975 0.685048
23.000000 0.001000 0.000440 0.000420 0.694823 0.718964
24.000000 0.001000 0.000439 0.000423 0.697721 0.718909
25.000000 0.001000 0.000435 0.000417 0.704000 0.727210
26.000000 0.001000 0.000433 0.000420 0.706374 0.718607
27.000000 0.001000 0.000432 0.000414 0.706610 0.727208
28.000000 0.001000 0.000429 0.000415 0.713337 0.726208
29.000000 0.001000 0.000426 0.000414 0.718950 0.731994
but my training diverges drastically with the same lr and schedule as yours, with momentum=0 (the default) or 0.1 (your model's internal parameter):
1.000000 0.001000 0.000911 0.001155 0.144576 0.245235
2.000000 0.001000 0.000635 6.696480 0.292642 0.002924
3.000000 0.001000 0.000599 79.269006 0.368526 0.000000
4.000000 0.001000 0.000577 342.079974 0.411786 0.001092
5.000000 0.001000 0.000560 1973.556534 0.447012 0.000176
Is there anything else that should be changed from your default parameters?
The data loader seems extremely slow every few batches: after every 10 or 20 batches, it takes several seconds (up to 15 s) to load the data. I have tried increasing the number of data loader workers (via the option -j 12) and increasing the train batch size, but the issue persists. Is this expected? Is it because of the data transforms? The issue becomes severe when I run the code on more than one GPU: most of the time the GPUs remain idle, which increases the overall time for one epoch (for me, 1 hr 20 min).
My machine configuration is:
4x 1080Ti, Intel Xeon E5-2640, and I am loading the data from an SSD.
Thanks for your code! Could you give me some advice? Thanks a lot!
sun@sunwin:$ cd /home/sun/pytorch-pose
sun@sunwin:/pytorch-pose$ CUDA_VISIBLE_DEVICES=0 python example/mpii.py -a hg --stacks 2 --blocks 1 --checkpoint checkpoint/mpii/hg_s2_b1 --resume checkpoint/mpii/hg_s2_b1/model_best.pth.tar -e -d
usage: mpii.py [-h] [--arch ARCH] [-j N] [--epochs N] [--start-epoch N]
[--train-batch N] [--test-batch N] [--lr LR] [--momentum M]
[--weight-decay W] [--print-freq N] [-c PATH] [--resume PATH]
[-e] [-d] [-f]
mpii.py: error: argument --arch/-a: invalid choice: 'hg' (choose from 'hg1', 'hg2', 'hg4', 'hg8', 'preresnet110', 'preresnet1202', 'preresnet20', 'preresnet32', 'preresnet44', 'preresnet56')
pytorch-pose/pose/utils/transforms.py
Line 18 in b8dc3b3
There seems to be a bug in color_normalization:
t.sub_(m).div_(s)
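For comparison, a minimal out-of-place version of per-channel normalization (a sketch, not the repo's exact function):

import torch

def color_normalize(x, mean, std):
    # x: 3xHxW float tensor in [0, 1]; mean/std: one value per channel
    mean = torch.as_tensor(mean).view(-1, 1, 1)
    std = torch.as_tensor(std).view(-1, 1, 1)
    return (x - mean) / std

out = color_normalize(torch.rand(3, 4, 4),
                      [0.4404, 0.4440, 0.4327],
                      [0.2458, 0.2410, 0.2468])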
Hi everyone, I have read the code and trained the models in pytorch-pose. I am confused about why the training and validation losses are so small (always around 1e-3); it looks different from training other deep learning models. Below is my training log for stacks=8, blocks=1. Thank you for the help!
Epoch LR Train Loss Val Loss Train Acc Val Acc
1.000000 0.000100 0.040519 0.006604 0.066519 0.154301
2.000000 0.000100 0.005778 0.005863 0.196297 0.295626
3.000000 0.000100 0.005601 0.007943 0.249709 0.333515
4.000000 0.000100 0.005433 0.006463 0.292266 0.382962
5.000000 0.000100 0.005278 0.011357 0.340348 0.433160
6.000000 0.000100 0.005151 0.013878 0.379125 0.458898
7.000000 0.000100 0.005042 0.011793 0.414802 0.508453
8.000000 0.000100 0.004942 0.006337 0.445820 0.515834
9.000000 0.000100 0.004853 0.008923 0.476248 0.553932
10.000000 0.000100 0.004755 0.005029 0.510092 0.593099
11.000000 0.000100 0.004662 0.004275 0.542758 0.619362
12.000000 0.000100 0.004590 0.004533 0.563967 0.634098
13.000000 0.000100 0.004530 0.004842 0.581655 0.632911
14.000000 0.000100 0.004477 0.004521 0.595700 0.646019
15.000000 0.000100 0.004435 0.004461 0.606501 0.667473
16.000000 0.000100 0.004382 0.004361 0.619863 0.679526
17.000000 0.000100 0.004341 0.004131 0.630043 0.691884
18.000000 0.000100 0.004302 0.004430 0.638425 0.689532
19.000000 0.000100 0.004270 0.004337 0.645869 0.695146
20.000000 0.000100 0.004236 0.004448 0.652779 0.694888
21.000000 0.000100 0.004204 0.004129 0.658970 0.716643
22.000000 0.000100 0.004180 0.004079 0.664444 0.710762
23.000000 0.000100 0.004145 0.004864 0.673128 0.702015
24.000000 0.000100 0.004116 0.004012 0.677755 0.719457
25.000000 0.000100 0.004089 0.004043 0.683418 0.726996
26.000000 0.000100 0.004075 0.004792 0.687907 0.717509
27.000000 0.000100 0.004049 0.004221 0.691811 0.729067
28.000000 0.000100 0.004026 0.004235 0.698158 0.728886
29.000000 0.000100 0.004002 0.003933 0.703373 0.743552
30.000000 0.000100 0.003983 0.003977 0.707411 0.738715
31.000000 0.000100 0.003959 0.003856 0.712823 0.749882
32.000000 0.000100 0.003947 0.004130 0.716247 0.750792
33.000000 0.000100 0.003936 0.004193 0.717466 0.743339
34.000000 0.000100 0.003905 0.004147 0.722726 0.748920
35.000000 0.000100 0.003887 0.003915 0.727711 0.749792
36.000000 0.000100 0.003873 0.004049 0.727161 0.759734
37.000000 0.000100 0.003859 0.004056 0.731356 0.740315
38.000000 0.000100 0.003834 0.003991 0.737066 0.766997
39.000000 0.000100 0.003833 0.004084 0.736956 0.761718
40.000000 0.000100 0.003819 0.003690 0.739787 0.769920
41.000000 0.000100 0.003797 0.003952 0.745627 0.760283
42.000000 0.000100 0.003791 0.003901 0.746524 0.772981
43.000000 0.000100 0.003773 0.004090 0.748357 0.766894
44.000000 0.000100 0.003758 0.003963 0.751360 0.774523
45.000000 0.000100 0.003753 0.004113 0.752556 0.775152
46.000000 0.000100 0.003728 0.004195 0.758551 0.782942
47.000000 0.000100 0.003711 0.003826 0.759776 0.781414
48.000000 0.000100 0.003717 0.003780 0.758977 0.783643
49.000000 0.000100 0.003705 0.004291 0.762976 0.783146
50.000000 0.000100 0.003684 0.003696 0.765159 0.782737
51.000000 0.000100 0.003675 0.003813 0.768794 0.788034
52.000000 0.000100 0.003665 0.003854 0.770016 0.793802
53.000000 0.000100 0.003661 0.003855 0.770352 0.787637
54.000000 0.000100 0.003640 0.003734 0.774723 0.790245
55.000000 0.000100 0.003636 0.003884 0.775233 0.794752
56.000000 0.000100 0.003624 0.003924 0.776930 0.785818
57.000000 0.000100 0.003613 0.003705 0.779447 0.796602
58.000000 0.000100 0.003604 0.003853 0.781621 0.795611
59.000000 0.000100 0.003594 0.003764 0.782702 0.798791
60.000000 0.000100 0.003591 0.003811 0.783326 0.797562
61.000000 0.000010 0.003265 0.003198 0.801716 0.814483
62.000000 0.000010 0.003238 0.003191 0.806262 0.815575
63.000000 0.000010 0.003230 0.003192 0.808299 0.814489
64.000000 0.000010 0.003216 0.003176 0.811464 0.815816
65.000000 0.000010 0.003214 0.003177 0.809532 0.817346
66.000000 0.000010 0.003201 0.003169 0.813702 0.817257
67.000000 0.000010 0.003202 0.003175 0.814226 0.814987
68.000000 0.000010 0.003195 0.003175 0.814527 0.816386
69.000000 0.000010 0.003190 0.003159 0.815462 0.819750
70.000000 0.000010 0.003196 0.003170 0.814338 0.817474
71.000000 0.000010 0.003187 0.003166 0.816677 0.820087
72.000000 0.000010 0.003186 0.003158 0.817255 0.820635
73.000000 0.000010 0.003181 0.003165 0.816634 0.818904
74.000000 0.000010 0.003184 0.003162 0.817163 0.819837
75.000000 0.000010 0.003176 0.003158 0.818142 0.818684
76.000000 0.000010 0.003172 0.003158 0.819772 0.820288
77.000000 0.000010 0.003169 0.003159 0.820208 0.819323
78.000000 0.000010 0.003167 0.003153 0.820191 0.820840
79.000000 0.000010 0.003164 0.003160 0.821483 0.820544
80.000000 0.000010 0.003164 0.003154 0.820176 0.820650
81.000000 0.000010 0.003160 0.003155 0.821318 0.822272
82.000000 0.000010 0.003157 0.003146 0.822337 0.823513
83.000000 0.000010 0.003163 0.003163 0.821482 0.819525
84.000000 0.000010 0.003157 0.003154 0.822139 0.822554
85.000000 0.000010 0.003152 0.003150 0.823072 0.824411
86.000000 0.000010 0.003151 0.003149 0.824456 0.823629
87.000000 0.000010 0.003150 0.003155 0.822820 0.822656
88.000000 0.000010 0.003152 0.003143 0.824378 0.824872
89.000000 0.000010 0.003146 0.003139 0.824860 0.823768
90.000000 0.000010 0.003136 0.003139 0.826952 0.825458
91.000000 0.000001 0.003135 0.003131 0.825520 0.825208
92.000000 0.000001 0.003128 0.003131 0.826755 0.824568
93.000000 0.000001 0.003131 0.003133 0.827131 0.825824
94.000000 0.000001 0.003126 0.003132 0.827385 0.824603
95.000000 0.000001 0.003128 0.003133 0.826717 0.824171
96.000000 0.000001 0.003131 0.003135 0.828059 0.824281
97.000000 0.000001 0.003127 0.003130 0.827289 0.824826
98.000000 0.000001 0.003121 0.003131 0.828627 0.823672
99.000000 0.000001 0.003127 0.003132 0.827220 0.825334
100.000000 0.000001 0.003126 0.003133 0.828195 0.823772
101.000000 0.000001 0.003122 0.003133 0.828492 0.825362
102.000000 0.000001 0.003123 0.003134 0.827998 0.825257
103.000000 0.000001 0.003120 0.003131 0.829216 0.825391
104.000000 0.000001 0.003131 0.003134 0.826828 0.824552
105.000000 0.000001 0.003121 0.003130 0.828140 0.826133
106.000000 0.000001 0.003124 0.003133 0.826996 0.824674
107.000000 0.000001 0.003125 0.003131 0.827876 0.826003
108.000000 0.000001 0.003122 0.003129 0.827146 0.826141
109.000000 0.000001 0.003118 0.003126 0.829902 0.827371
110.000000 0.000001 0.003120 0.003130 0.828066 0.825935
111.000000 0.000001 0.003116 0.003127 0.828986 0.825615
112.000000 0.000001 0.003125 0.003123 0.827381 0.826918
113.000000 0.000001 0.003123 0.003127 0.828666 0.824989
114.000000 0.000001 0.003119 0.003130 0.827995 0.825304
115.000000 0.000001 0.003121 0.003125 0.828034 0.825756
116.000000 0.000001 0.003119 0.003129 0.829143 0.825079
117.000000 0.000001 0.003122 0.003127 0.829562 0.826239
118.000000 0.000001 0.003120 0.003125 0.828884 0.825675
119.000000 0.000001 0.003118 0.003129 0.829933 0.824718
120.000000 0.000001 0.003122 0.003132 0.827937 0.823845
121.000000 0.000000 0.003118 0.003127 0.828167 0.826449
122.000000 0.000000 0.003114 0.003124 0.830013 0.826401
123.000000 0.000000 0.003120 0.003126 0.828216 0.826354
124.000000 0.000000 0.003115 0.003124 0.828879 0.826535
125.000000 0.000000 0.003113 0.003128 0.828978 0.826570
126.000000 0.000000 0.003124 0.003129 0.827313 0.824960
127.000000 0.000000 0.003120 0.003130 0.828555 0.825859
128.000000 0.000000 0.003121 0.003130 0.828872 0.825361
129.000000 0.000000 0.003122 0.003129 0.829600 0.824821
130.000000 0.000000 0.003112 0.003127 0.830041 0.826295
131.000000 0.000000 0.003110 0.003128 0.830714 0.825847
132.000000 0.000000 0.003117 0.003129 0.830742 0.824571
133.000000 0.000000 0.003118 0.003126 0.829283 0.827287
134.000000 0.000000 0.003123 0.003127 0.828451 0.825599
135.000000 0.000000 0.003118 0.003127 0.828554 0.826196
136.000000 0.000000 0.003119 0.003130 0.827563 0.825281
@bearpaw thank you for your wonderful work,
I have trained my network. However, when I validated my best model with different test batch sizes, the validation accuracy varied, while I think the accuracy should be independent of the test batch size.
test batch size    Val Acc
1                  0.8743
6                  0.8660
16                 0.8685
And what test batch size did you use for the results you published?
Hi,
Thanks for sharing your code! Does this code reproduce the results of the 8-stack hourglass in the original paper? If not, what are your results for the 8-stack hourglass, and what are the possible reasons for the gap?
Best,
Why is the initial learning rate kept so low in the implementation? A learning rate of 1e-3 is not yielding NaNs. Is it an observation that low learning rates work well for pose estimation tasks?
What is the accuracy that is being printed ? Is it just the distance between target and predicted masks? In that case lower accuracy should mean closer to gt.
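My understanding (a sketch, not the repo's exact code): the printed accuracy is PCK-style, i.e. the fraction of keypoints whose distance to the ground truth, normalized by a tenth of the heatmap size, falls below a threshold, so higher is better:

import torch

def pck(pred, gt, norm, thr=0.5):
    # pred, gt: Kx2 keypoint coordinates; norm: normalization length
    dists = torch.norm(pred - gt, dim=1) / norm
    return (dists < thr).float().mean().item()

pred = torch.tensor([[10.0, 10.0], [30.0, 31.0]])
gt = torch.tensor([[10.0, 12.0], [30.0, 30.0]])
print(pck(pred, gt, norm=64 / 10))  # 1.0: both joints within the threshold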
Hi,
When I try to train hg8, I get a very large validation error (it looks almost as if it is not working), while in training mode the loss is very small (and looks to be working very well). I did some googling and found that the BatchNorm layer behaves differently in eval() mode and train() mode.
So I adjusted the code in dataset/mpii.py's __getitem__ function, so that color_normalization is only used for training as a pre-processing step.
After adjusting it, the model worked well.
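For reference, the BatchNorm train/eval difference described above can be observed in isolation with a minimal sketch:

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)
x = torch.randn(8, 4, 16, 16) * 5 + 3   # statistics far from N(0, 1)

bn.train()
y_train = bn(x)                         # normalized with batch statistics

bn.eval()
y_eval = bn(x)                          # normalized with running statistics

# the two outputs differ until running_mean/running_var converge to the
# data statistics, which is why train and eval losses can diverge
print(y_train.mean().item(), y_eval.mean().item())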
Traceback (most recent call last):
File "example/mpii.py", line 14, in
from pose import Bar
ImportError: No module named pose
when I run the training code
I noticed that there are some differences in the dataset loaders between pytorch-pose and PyraNet, and I wonder whether they matter. For example, the order of cropping and data augmentation.
Hi,
I was wondering how you generated some of the data files in the evaluation folder? I feel like I can sort of guess what's inside these files, but it would be great if you could tell us how they are generated and what has actually been done.
For example:
data/mpii/mpii_annotations.json
evaluation/data/detections_out_format.mat
evaluation/data/detections.mat
Thanks in advance!
The result in the paper is 90.09, but my reproduced result is 88.78. Where do you think the problem occurred?
Hey, I have been reading your code recently, and I found this in models/preresnet.py:
def preresnet56(**kwargs):
    model = PreResNet(Bottleneck, [9, 9, 9], **kwargs)
As far as I know, your Bottleneck class implements three conv layers, so the depth should be 9*3*3+2 = 83, but the name says 56 (which corresponds to 9*2*3+2 = 56).
I am not sure which one is wrong; could you check this class?
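For reference, the usual depth bookkeeping: a BasicBlock has two 3x3 convs, so depth = 2 convs x 3 stages x 9 blocks + 2 = 56, while a Bottleneck has three convs, giving 3 x 3 x 9 + 2 = 83. So the name preresnet56 matches BasicBlock, and passing Bottleneck with [9, 9, 9] yields an 83-layer network.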
Thanks.
I was wondering: is there any special reason why BN comes before Conv in the pose/models/hourglass.py Bottleneck class?
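If it follows the pre-activation ResNet design (He et al., "Identity Mappings in Deep Residual Networks"), the BN-ReLU-Conv ordering keeps the skip path an identity. A minimal sketch of one pre-activation unit:

import torch
import torch.nn as nn

class PreActUnit(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.bn = nn.BatchNorm2d(ch)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(ch, ch, kernel_size=3, padding=1)

    def forward(self, x):
        # pre-activation: normalize and activate before the convolution,
        # leaving the residual connection untouched
        return x + self.conv(self.relu(self.bn(x)))

y = PreActUnit(64)(torch.randn(1, 64, 8, 8))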
Hi @bearpaw
Great work!
In the file gen_lsp.m,
https://github.com/bearpaw/pytorch-pose/blob/master/miscs/gen_lsp.m#L37-L48
https://github.com/bearpaw/pytorch-pose/blob/master/miscs/gen_lsp.m#L54-L60
oriTrTe.joints(7, 1:2, :) = mean(oriTrTe.joints(3:4,1:2,:));
oriTrTe.joints(8, 1:2, :) = mean(oriTrTe.joints(13:14,1:2,:));
I find that the coordinates of the thorax and the pelvis are calculated from the locations of the shoulders and hips.
If the shoulder or hip locations are wrong, this produces wrong thorax or pelvis locations.
The calculation could be improved by adding a condition to check whether the shoulder and hip locations are valid.
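One possible guard, sketched in Python (the per-joint visibility flags are an assumption about the annotation format):

import numpy as np

def midpoint_if_valid(joints, vis, a, b):
    # joints: Kx2 array; vis: K boolean flags; a, b: parent joint indices
    if vis[a] and vis[b]:
        return joints[[a, b]].mean(axis=0), True
    return np.zeros(2), False  # mark the synthesized joint as not annotated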
Hi,
I tried the training code, but I am confused about one detail: could you tell me why the size of the training image set is 3708?
Thanks!
We can get the point locations from the heatmaps, but how can we get the confidence score of each point?
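A common convention (an assumption on my side, not something the repo documents) is to read the heatmap's peak value as the confidence:

import torch

heatmaps = torch.rand(16, 64, 64)               # one 64x64 map per joint
scores, _ = heatmaps.view(16, -1).max(dim=1)    # peak activation per joint
# scores[k] can be treated as the confidence of joint k; values near the
# Gaussian peak (~1.0) indicate a confident detection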
Hi,
It looks like it's from the HourglassNet __init__() function, where there's no pretrained argument.
Did I do something wrong? Or should I just use the --test-batch argument in the command line?
I'm a newbie to PyTorch :(
Thanks in advance.
Hi,
With the transform function before generating the target, and transform_preds after prediction, I've got several negative numbers on the x-axis.
Is the transformation necessary? What do I need to change in this case to get positive prediction points?
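If negative coordinates are the only concern, one pragmatic option (a sketch, not the repo's behaviour) is to clamp predictions to the image bounds:

import torch

def clamp_preds(preds, width, height):
    # preds: Kx2 predicted (x, y) in image coordinates, modified in place
    preds[:, 0].clamp_(0, width - 1)
    preds[:, 1].clamp_(0, height - 1)
    return preds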
Hi! I ran the testing command and I got this error:
Traceback (most recent call last):
  File "example/mpii.py", line 309, in <module>
    main(parser.parse_args())
  File "example/mpii.py", line 88, in main
    loss, acc, predictions = validate(val_loader, model, criterion, args.debug, args.flip)
  File "example/mpii.py", line 238, in validate
    gt_batch_img = batch_with_heatmap(inputs, target)
  File "/home/minghua/Codes/intel-contest/pytorch-pose/pose/utils/imutils.py", line 169, in batch_with_heatmap
    sample_with_heatmap(inp.clamp(0, 1), outputs[n], num_rows=num_rows, parts_to_show=parts_to_show)
  File "/home/minghua/Codes/intel-contest/pytorch-pose/pose/utils/imutils.py", line 144, in sample_with_heatmap
    full_img = np.zeros((img.shape[0], size * (num_cols + num_rows), 3), np.uint8)
TypeError: 'numpy.float64' object cannot be interpreted as an index
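A likely fix, assuming size ends up as a numpy float (newer numpy versions refuse float array dimensions): cast it before building the canvas, e.g.

full_img = np.zeros((img.shape[0], int(size) * (num_cols + num_rows), 3), np.uint8)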
Thank you!
First of all thanks for the great repo!
In pose/datasets/mscoco.py, when the flipping happens, there is:
pts = shufflelr(pts, width=img.size(2), dataset='mpii')
Does the flipping / mscoco training script work as-is, or should one implement the matchedParts for mscoco?
Thanks!
Thanks for the PyTorch version! It seems that the result of adding the intermediate loss to the single hourglass hasn't been reproduced?
By the way, are all the intermediate losses added together at the end before doing backprop?
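My reading of the training loop (a sketch with dummy tensors, not verbatim from the repo): each stack's heatmaps are scored against the same target, the losses are summed, and a single backward runs on the sum:

import torch
import torch.nn as nn

criterion = nn.MSELoss()
target = torch.rand(2, 16, 64, 64)                        # ground-truth heatmaps
outputs = [torch.rand(2, 16, 64, 64, requires_grad=True)  # one output per stack
           for _ in range(8)]

loss = sum(criterion(o, target) for o in outputs)         # summed intermediate losses
loss.backward()                                           # single backward pass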
Hi,
would you mind briefly explaining the purpose of block.expansion in the Bottleneck class? I found it's used several times, for example:
if stride != 1 or self.inplanes != planes * block.expansion:
    downsample = ...
I am a little bit confused: why does this have anything to do with adding a downsampling layer to self.layer?
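My understanding, as a sketch of the standard torchvision-style convention (not verbatim from this repo):

# expansion is the ratio output_channels / planes for a block:
# 4 for Bottleneck, 1 for BasicBlock
expansion = 4
inplanes, planes, stride = 64, 64, 1
needs_projection = stride != 1 or inplanes != planes * expansion
# True here: the skip tensor carries 64 channels while the block outputs
# 256, so the "downsample" 1x1 conv exists to match channel counts (and
# stride when needed), not only to reduce spatial resolution
print(needs_projection)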
Thank you!
Epoch: 54 | LR: 0.00005000
Processing |################################| (3708/3708) Data: 0.000265s | Batch: 0.286s | Total: 0:20:19 | ETA: 0:00:01 | Loss: 0.0033 | Acc: 0.7968
Processing |################################| (493/493) Data: 0.000184s | Batch: 0.127s | Total: 0:01:02 | ETA: 0:00:01 | Loss: 0.0035 | Acc: 0.8025
Epoch: 55 | LR: 0.00005000
Processing |################################| (3708/3708) Data: 0.000226s | Batch: 0.283s | Total: 0:20:22 | ETA: 0:00:01 | Loss: 0.0033 | Acc: 0.7994
Processing |################################| (493/493) Data: 0.000168s | Batch: 0.128s | Total: 0:01:02 | ETA: 0:00:01 | Loss: 0.0036 | Acc: 0.7947
Epoch: 56 | LR: 0.00005000
Processing |################################| (3708/3708) Data: 0.000174s | Batch: 0.249s | Total: 0:20:24 | ETA: 0:00:01 | Loss: 0.0033 | Acc: 0.7997
Processing |################################| (493/493) Data: 0.000158s | Batch: 0.128s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 0.0038 | Acc: 0.8001
Epoch: 57 | LR: 0.00005000
Processing |################################| (3708/3708) Data: 0.000286s | Batch: 0.309s | Total: 0:20:31 | ETA: 0:00:01 | Loss: 0.0033 | Acc: 0.8026
Processing |################################| (493/493) Data: 0.000217s | Batch: 0.128s | Total: 0:01:03 | ETA: 0:00:01 | Loss: 0.0035 | Acc: 0.7993
Epoch: 58 | LR: 0.00005000
Processing |################################| (3708/3708) Data: 0.000279s | Batch: 0.296s | Total: 0:20:16 | ETA: 0:00:01 | Loss: 0.0033 | Acc: 0.8038
Processing |################################| (493/493) Data: 0.000223s | Batch: 0.122s | Total: 0:01:00 | ETA: 0:00:01 | Loss: 0.0036 | Acc: 0.7977
Epoch: 59 | LR: 0.00005000
Processing |###### | (789/3708) Data: 0.000346s | Batch: 0.328s | Total: 0:04:18 | ETA: 0:15:57 | Loss: 0.0033 | Acc: 0.8042
It just stays here and doesn't move any more.
I'm sorry if this is a simple question or has been asked in another issue; I am new to the pose estimation task.
With this code, can I compute keypoints for an image containing only one person? I also cannot find a "test.py" that outputs keypoints; instead, there are some demos in the ./example folder for evaluation. However, the evaluation code uses keypoint information to compute the "meta" variable, and in my application I do not know the exact "meta" variable. Could you please give me some advice? Thank you!
My server has four 1080Ti GPUs. When I run the code on multiple GPUs, it always gets stuck, so which number of GPUs works well?
Hi,
I ran the training code and met the following error:
Traceback (most recent call last):
File "example/mpii.py", line 318, in <module>
main(parser.parse_args())
File "example/mpii.py", line 104, in main
valid_loss, valid_acc, predictions = validate(val_loader, model, criterion, args.debug, args.flip)
File "example/mpii.py", line 233, in validate
acc = accuracy(score_map.cuda(), target, idx)
File "~/pytorch-pose/pose/utils/evaluation.py", line 61, in accuracy
acc[i+1] = dist_acc(dists[:, idxs[i]-1, :])
IndexError: index 10 is out of range for dimension 1 (of size 6)
Is this a bug?
In _hour_glass_forward:
out = up1 + up2
RuntimeError: The size of tensor a (15) must match the size of tensor b (14) at non-singleton dimension 3
This happens because the second-smallest layer of my hourglass, up1, has torch.Size([1, 512, 10, 15]); after max pool(2) it becomes [1, 512, 5, 7], but after up2 = self.upsample(low3) it becomes [1, 512, 10, 14]. Then up1 and up2 can't be added because of the different sizes.
So, is this a bug?
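One common workaround (a sketch under the assumption that some level has an odd spatial size, not a patch from this repo) is to upsample to the exact size of the skip branch instead of a fixed factor of 2:

import torch
import torch.nn.functional as F

up1 = torch.randn(1, 512, 10, 15)               # skip branch with an odd width
low3 = torch.randn(1, 512, 5, 7)                # max pool floors 10x15 to 5x7
up2 = F.interpolate(low3, size=up1.shape[-2:])  # upsample to 10x15, not 10x14
out = up1 + up2                                 # shapes now match for odd sizes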