mks0601 / v2v-posenet_release

Official Torch7 implementation of "V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map", CVPR 2018

Home Page: https://arxiv.org/abs/1711.07399

License: MIT License

MATLAB 53.11% Lua 41.55% Python 5.33%

hand-pose-estimation human-pose-estimation v2v-posenet deep-learning computer-vision 3d-pose-estimation 3d-hand-pose 3d-human-pose cvpr2018

v2v-posenet_release's Issues

How to use the function 'generate_heatmap_gt'

Hi, it's very convenient to use your code.
I found a tool function called "generate_heatmap_gt(heatmap,jointWorld,refPt,newSz,angle,trans)" at line 141.
What exactly do 'heatmap', 'refPt', 'newSz', 'angle', and 'trans' mean? How should I preprocess the data to generate these parameters?

I am also confused about the DeepPrior++ step in Section 4 of the paper. The original code predicts the 3D location of the hand; how can it be adapted for the human body?
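
For readers unfamiliar with heatmap targets, the general idea of a per-joint 3D Gaussian heatmap can be sketched as follows. This is a minimal Python sketch and not the repository's generate_heatmap_gt: the function name gaussian_heatmap_3d, the grid size, and the sigma value are assumptions chosen for illustration, and the roles of refPt, newSz, angle, and trans are left out because their meaning is exactly the open question above.

```python
import math

def gaussian_heatmap_3d(center, size, sigma):
    """Return a size x size x size nested list with a Gaussian peak at `center`.

    `center` is a joint position already expressed in voxel-grid coordinates
    (hypothetical input; converting world coordinates to the grid is not shown).
    """
    ci, cj, ck = center
    heatmap = []
    for i in range(size):
        plane = []
        for j in range(size):
            row = []
            for k in range(size):
                dist_sq = (i - ci) ** 2 + (j - cj) ** 2 + (k - ck) ** 2
                row.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
            plane.append(row)
        heatmap.append(plane)
    return heatmap

# Illustrative values only: an 8^3 grid with one joint at voxel (3, 4, 2).
hm = gaussian_heatmap_3d(center=(3, 4, 2), size=8, sigma=1.7)
print(hm[3][4][2])  # 1.0 at the joint voxel, decaying with distance
```

One such volume would be produced per joint, so the target for J joints is a stack of J volumes.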

Why do I get an image of a hand gesture standing on its head?

Why do I get an image of a hand gesture standing on its head? (All the results are the same!)

Which hdf5 lua package do you use to load ITOP dataset?

I think that the current hdf5 version cannot read 16-bit floats.

source code: https://github.com/deepmind/torch-hdf5/blob/master/luasrc/ffi.lua
This is from line 325.

 elseif className == 'FLOAT' then
        if size == 4 then
            return 'torch.FloatTensor'
        end
        if size == 8 then
            return 'torch.DoubleTensor'
        end
        error("Cannot support reading float data with size = " .. size .. " bytes")

ITOP uses 16-bit floats for the real_world_coordinates label.
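
One possible workaround is to store the labels as 4-byte floats before loading them in Torch, since 4 and 8 are the only sizes the branch above accepts. The sketch below illustrates only the widening step itself on raw bytes, using Python's standard struct module (format 'e' is IEEE 754 half precision). It does not read or write HDF5 files, the function name is hypothetical, and the sample values are made up; a real conversion would need an HDF5-capable tool.

```python
import struct

def widen_half_to_single(raw_half_bytes):
    """Decode little-endian half floats (2 bytes each) and re-encode them as 4-byte floats."""
    count = len(raw_half_bytes) // 2
    values = struct.unpack("<%de" % count, raw_half_bytes)
    return struct.pack("<%df" % count, *values)

# Made-up label values; all are exactly representable in half precision.
half_bytes = struct.pack("<3e", 0.5, -1.25, 2.0)
single_bytes = widen_half_to_single(half_bytes)

print(len(half_bytes), len(single_bytes))   # 6 12
print(struct.unpack("<3f", single_bytes))   # (0.5, -1.25, 2.0)
```

Widening from 2 to 4 bytes is lossless, so the label values are unchanged by this step.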

Video-based hand estimation

Hi, Gyeongsik
First of all, thank you very much for your outstanding work on V2V! I would like to discuss something beyond V2V. So far, most discriminative methods for hand pose estimation from a single depth map operate at the frame level, and most of them struggle with self-occlusion or otherwise unseen hand joints. However, in practice hand pose estimation is mostly applied to video (continuous frames). For discriminative methods, why can't we use the previous frames as constraints when inferring the current hand pose, for example with an RNN or LSTM? Unfortunately, I have only found LSTMs and RNNs used for video classification. Is it possible to take the constraints between consecutive frames into account for hand pose estimation? More directly, can we introduce an LSTM or RNN into discriminative methods?

I look forward to your opinion.

Hoping for your reply, thanks very much.

The MSRA dataset is incomplete

Thank you for sharing this nice research work.
I find that the MSRA dataset from the provided link is incomplete. Where did you get the complete dataset?

Request for PyTorch implementation

Do you plan to implement this work with PyTorch? Thank you!

no visible label 'INVALIDFRAME_TRAIN'

Thanks for sharing your code! When I try to run "th run_me.lua" I get an error message:

./data/ICVL/data.lua:234: no visible label 'INVALIDFRAME_TRAIN' for <goto> at line 163
stack traceback:
[C]: in function 'dofile'
run_me.lua:18: in main chunk
[C]: in function 'dofile'
...sluo/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: in ?

Do you know what the issue could be? Thanks!

ITOP dataset

Any chance I could get the ITOP dataset? The link provided is down.
Thanks.

Pre-computed centers

I want to run V2V for hand pose estimation on some RGBD data that I have, but I don't have any ground truth labels. What exactly is the format of the center_trainset and center_testset text files, and how did you get that information? Thanks!

BatchSize setting

Hi,

What batchSize values did you use in the experiments on the different datasets (e.g. MSRA, NYU)? I see batchSz=1 in config.lua; does that apply to all your experiments?

I trained a few models on the MSRA hand dataset and on my own smaller dataset (not about hands) with V2V-PoseNet-pytorch, and I found that smaller batch sizes often achieve much better results. The models perform badly when batchSize is as large as 32 or 64. Have you seen a similar situation in your training (with your own Torch7 implementation)?

Thanks.

CUDA versions?

Thank you for sharing this nice research work.
Does the code depend on specific versions of CUDA/cuDNN or any other tools?

Help for testing the approach using depth map created using my kinect sensor

Hello,

First of all, thanks for making this awesome project public.

I am wondering whether I can use this approach to get human pose joints from a depth image captured by my Kinect sensor. If yes, can you give me a minimal example of how to do it? Since I am new to Torch7 and Lua programming, it's a little hard for me to understand the code.

Any help will be much appreciated.

Thanks and Regards,
Shafeeq E

A PyTorch implementation

Hi,
I have recently implemented your excellent work with PyTorch (https://github.com/dragonbook/V2V-PoseNet-pytorch). After some tests, it should work well now. I think it may be useful for other PyTorch users, so I posted it here. You can optionally add this to the README :)

Thanks.

can not get train dataset

Hi,
I cannot get the dataset "ITOP Human Pose Dataset [link] [paper]" because I can't open the "link" URL. Is there any other way to get this dataset?

"bad argument #2 to '?' (out of range at /home/miruware/sdc/linh/torch/pkg/torch/generic/Tensor.c:913)" Error

Hi, I've been trying to train and test on the NYU hand dataset. I followed the instructions, including changing the data directory and converting PNG to bin, but I still get the same error:

Found Environment variable CUDNN_PATH = /home/miruware/anaconda3/envs/linh/lib/libcudnn.so.5
db: NYU mode: train
training data loading...
testing data loading...
invalid frame in test set: 0
==> training start
epoch: 0/10 batch: 40000/72757 loss: 0.00031782130122301
==> testing:
/home/miruware/sdc/linh/torch/install/bin/luajit: bad argument #2 to '?' (out of range at /home/miruware/sdc/linh/torch/pkg/torch/generic/Tensor.c:913)
stack traceback:
[C]: at 0x7f4574773b30
[C]: in function '__index'
test.lua:28: in function 'test'
run_me.lua:58: in main chunk
[C]: in function 'dofile'
...linh/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405e90

Do you have any ideas what is going on? And what should I do?
Thanks in advance.

I have a problem in testing ITOP

HDF5-DIAG: Error detected in HDF5 (1.10.2) thread 140034839357248:
#000: H5F.c line 511 in H5Fopen(): unable to open file
major: File accessibilty
minor: Unable to open file
#1: H5Fint.c line 1452 in H5F_open(): unable to open file: name = '/media/zhengyuezhi/ren/3D_ human_pose/V2V-PoseNet_RELEASE/data/ITOP/side_view/ITOP_side_nil_depth_map.h5', tent_flags = 0
major: File accessibilty
minor: Unable to open file
#2: H5FD.c line 733 in H5FD_open(): open failed
major: Virtual File Layer
minor: Unable to initialize object
#3: H5FDsec2.c line 346 in H5FD_sec2_open(): unable to open file: name = '/media/zhengyuezhi/ren/3D_ human_pose/V2V-PoseNet_RELEASE/data/ITOP/side_view/ITOP_side_nil_depth_map.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0
major: File accessibilty
minor: Unable to open file
/home/zhengyuezhi/torch/install/bin/luajit: /home/zhengyuezhi/torch/install/share/lua/5.1/hdf5/file.lua:12: attempt to concatenate 'int64_t' and 'string'
stack traceback:
/home/zhengyuezhi/torch/install/share/lua/5.1/hdf5/file.lua:12: in function '__init'
...e/zhengyuezhi/torch/install/share/lua/5.1/torch/init.lua:91: in function <...e/zhengyuezhi/torch/install/share/lua/5.1/torch/init.lua:87>
[C]: in function 'open'
./data/ITOP/data.lua:62: in function 'load_depthmap'
test.lua:28: in function 'test'
run_me.lua:79: in main chunk
[C]: in function 'dofile'
...ezhi/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

Why does the file '/media/zhengyuezhi/ren/3D_human_pose/V2V-PoseNet_RELEASE/data/ITOP/side_view/ITOP_side_nil_depth_map.h5' appear? Shouldn't it be an ITOP_side_test_depth_map.h5 file?

disagreement between MSRA dataset and the provided precomputed center

The first line of each joint.txt is a number X, which indicates the number of frames; the remaining lines contain float numbers.
For every joint.txt in the MSRA dataset, I have verified that X equals len(remaining_lines).

However, the number of precomputed centers does not match the dataset.
Am I mistaken about anything?

# number of frames  file
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/1/joint.txt
499 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/2/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/3/joint.txt
499 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/4/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/5/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/6/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/7/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/8/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/9/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/I/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/IP/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/L/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/MP/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/RP/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/T/joint.txt
500 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/TIP/joint.txt
499 /home/dexter/git/deep-prior-pp/data/cvpr15_MSRAHandGestureDB/P6/Y/joint.txt
8497 (sum over the above)
8498 center_test_6_refined.txt (extracted from http://cv.snu.ac.kr/research/V2V-PoseNet/MSRA/center/center.tar.gz)

The last two numbers (8497, 8498) should be the same.
This also happens in P1, P2, P3, P6, P8.

P.S.

md5sum cvpr15_MSRAHandGestureDB.zip 
535cd77bd651326453fd75cb4bec4b6c  cvpr15_MSRAHandGestureDB.zip
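
The comparison described in this issue can be scripted. The following is a minimal Python sketch under two assumptions taken from the text: the first line of each joint.txt declares the frame count, and the center file holds one line per frame. The function names and the tiny demo files are made up for illustration; on a real copy of the dataset the same two functions would be pointed at the actual joint.txt files and at a file such as center_test_6_refined.txt.

```python
import os
import tempfile

def count_declared_frames(joint_paths):
    """Sum the frame counts declared on the first line of each joint.txt."""
    total = 0
    for path in joint_paths:
        with open(path) as f:
            total += int(f.readline().strip())
    return total

def count_center_lines(center_path):
    """Count non-empty lines in a center file (assumed: one line per frame)."""
    with open(center_path) as f:
        return sum(1 for line in f if line.strip())

# Self-contained demo with made-up files: 3 + 2 declared frames, 6 center lines.
with tempfile.TemporaryDirectory() as tmp:
    joint_paths = []
    for name, n in (("1", 3), ("2", 2)):
        path = os.path.join(tmp, "joint_%s.txt" % name)
        with open(path, "w") as f:
            f.write("%d\n" % n)
            f.write("0.0 0.0 0.0\n" * n)
        joint_paths.append(path)
    center_path = os.path.join(tmp, "center.txt")
    with open(center_path, "w") as f:
        f.write("1.0 2.0 3.0\n" * 6)

    frames = count_declared_frames(joint_paths)
    centers = count_center_lines(center_path)

print(frames, centers, centers - frames)  # 5 6 1
```

A non-zero difference, as in the 8497 versus 8498 case above, indicates that the two sources disagree by that many frames.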

error in train.lua

In train.lua:59: "tot_err=r = 0". I guess it should be "tot_error = 0"?

How to use the pre-trained models?

Could you please give me some instructions?

A confusion about the ITOP test code

Hi, thanks for your help. I'm beginning to run the code now, but a few things confuse me.
1. In line 28 of src/test.lua, the code is:

local depthimage = load_depthmap(input_name)

but in line 59 of src/data/ITOP/data.lua, load_depthmap is defined as:

function load_depthmap(fileId,db_type)
local file = hdf5.open(db_dir .. 'ITOP_' .. db .. '_' .. tostring(db_type) .. 'depth_map.h5','r')

This cannot work, because db_type is nil when the function is called with a single argument.

2. I downloaded the [center_trainset] [center_testset] [models] of the ITOP Human Pose Dataset (front-view).
2.1. The models archive has ten folders, named 'epoch1' through 'epoch10'. Every folder has a 'model.net', but the test process needs just one file called 'model.net'. I cannot tell which one is the right one, so how should I use them?
2.2. In line 93 of src/data/ITOP/data.lua, I found this code:
for line in io.lines(center_dir .. "center_" .. tostring(db_type) .. ".txt") do
table.insert(refPt_,line)
end
I put 'center_test.txt' in the corresponding path and it works. Is this the right way to use this data?

Pre-trained models

Hello,
We are trying to test the pre-trained models, but the link is not working. Can you give us a hand?

https://image.ibb.co/mnGFvf/Captura-de-pantalla-de-2018-10-16-08-53-21.png

MSRA test list

Thanks for sharing.
Could you provide the MSRA test list that shows which images correspond to the estimations you have provided?

Hands 2017 challenge dataloader

Hi,
Thanks for sharing your code. Could you please upload the Hands 2017 dataloader?

error in data.lua of MSRA dataset

In src/data/MSRA/data.lua, lines 136-137: "jointWorld = torch.Tensor(trainSz,jointNum,worldDim):zero() ...". I guess trainSz should be changed when running in 'test' mode?

mrsa_center_refined

What's the difference between center_test_1_refined.txt and center_train_1_refined.txt? How are the *_refined.txt files built? (I mean, what data is used and in which order, for example to build center_train_1_refined.txt?)

MSRA consists of 76500 frames, so why is the sum of lines in center_test_1_refined and center_train_1_refined equal to 76391?

The Detail of mAP

Hi. The network outputs 'xyzOutput', which represents the predicted coordinates of the key points, and I extract the corresponding 'joint_real' from the ITOP dataset. I'm writing code to evaluate these results but ran into a question.
In section 8.2 the paper states:
"For 3D human pose estimation, we used mean average precision (mAP) that is defined as the detected ratio of all human body joints based on 10 cm rule following"
Does that mean counting the ratio of joints whose errors between X_predict and X_real, Y_predict and Y_real, and Z_predict and Z_real are each less than 10 cm? Or the ratio of joints whose Euclidean distance between Predict_XYZ and Key_XYZ is less than 10 cm?
Or is it calculated some other way? I tried both, but neither comes close to the result in the paper.
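
The two readings described above can be written down side by side, which makes it easy to see that they give different numbers. This is a minimal Python sketch with hypothetical function names and made-up joint coordinates; it assumes the coordinates are in meters, so the 10 cm threshold is 0.1. Which reading the paper intends is for the authors to confirm; the Euclidean one corresponds to the usual wording of a joint being detected when it lies within 10 cm of the ground truth in 3D space.

```python
import math

def detection_ratio_euclidean(pred, gt, threshold):
    """Fraction of joints whose 3D Euclidean error is within `threshold`."""
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return hits / len(gt)

def detection_ratio_per_axis(pred, gt, threshold):
    """Fraction of joints whose error is within `threshold` on every axis separately."""
    hits = sum(
        1 for p, g in zip(pred, gt)
        if all(abs(a - b) <= threshold for a, b in zip(p, g))
    )
    return hits / len(gt)

# Made-up ground truth and predictions for three joints, in meters.
gt = [(0.0, 0.0, 2.0), (0.3, 0.1, 2.1), (0.5, -0.2, 2.2)]
pred = [(0.02, 0.01, 2.03), (0.38, 0.18, 2.18), (0.5, -0.2, 2.5)]

euclid = detection_ratio_euclidean(pred, gt, threshold=0.1)
per_axis = detection_ratio_per_axis(pred, gt, threshold=0.1)
print(euclid, per_axis)
```

The second joint is off by about 8 cm on each axis, so it passes the per-axis test but fails the Euclidean one (about 13.9 cm); the per-axis ratio is therefore never lower than the Euclidean ratio.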

Tensorflow model.py

Hi Moon,

I'm implementing a TensorFlow version of V2V-PoseNet. The code below is my implementation of the V2V-PoseNet model for training on the ITOP dataset. Could you please have a look at it and give me some feedback? I'm not entirely sure it is a correct model. Thanks in advance!

model.py

import numpy as np
from keras.models import Sequential
from keras.layers import Conv3D, MaxPool3D, Dropout, BatchNormalization
from keras.layers import Conv3DTranspose, Input, Conv2D, MaxPool2D, Flatten, Dense
from keras import layers, models
from keras.initializers import Zeros, TruncatedNormal

def build_3DBlock(y, next_fDim=16, kernelSz=1):
    y = Conv3D(next_fDim, (kernelSz, kernelSz, kernelSz), padding="same", # activation='relu',
                      use_bias=True, bias_initializer=Zeros(),
                      kernel_initializer=TruncatedNormal(mean=0, stddev=0.001)
                      )(y)
    y = add_common_layers(y)
    # module.add(Dropout(_dropout_rate))
    return y

def add_common_layers(y):
    y = layers.BatchNormalization()(y)
    y = layers.LeakyReLU()(y)
    return y

def build_3DResBlock(y, next_fDim, _strides=(1, 1, 1), _project_shortcut=False):
    shortcut = y

    y = layers.Conv3D(next_fDim, kernel_size=(3, 3, 3), padding="same", strides=_strides,
                      use_bias=True, bias_initializer=Zeros(),
                      kernel_initializer=TruncatedNormal(mean=0, stddev=0.001)
                      )(y)
    y = layers.BatchNormalization()(y)
    y = layers.LeakyReLU()(y)

    y = layers.Conv3D(next_fDim, kernel_size=(3, 3, 3), padding="same", strides=(1, 1, 1),
                      use_bias=True, bias_initializer=Zeros(),
                      kernel_initializer=TruncatedNormal(mean=0, stddev=0.001)
                      )(y)
    y = layers.BatchNormalization()(y)

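    # NOTE: _strides is a 3-tuple, so comparing it with (1, 1) is always True
    # and the shortcut below is projected in every block.
    if _project_shortcut or _strides != (1, 1):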
        shortcut = layers.Conv3D(next_fDim, kernel_size=(1, 1, 1), strides=_strides, padding="same",
                                 use_bias=True, bias_initializer=Zeros(),
                                 kernel_initializer=TruncatedNormal(mean=0, stddev=0.001)
                                 )(shortcut)
        shortcut = layers.BatchNormalization()(shortcut)
    y = layers.add([shortcut, y])
    y = layers.LeakyReLU()(y)

    return y

def build_3DpoolBlock(y, poolSz):
    y = MaxPool3D(pool_size=(poolSz, poolSz, poolSz), strides=(poolSz, poolSz, poolSz), padding="same")(y)
    return y

def build_3DupsampleBlock(y, next_fDim, kernelSz, stride):
    # Parameter named `stride` rather than `str` to avoid shadowing the builtin.
    y = Conv3DTranspose(next_fDim, (kernelSz, kernelSz, kernelSz), padding="same", # activation='relu',
                        use_bias=True, bias_initializer=Zeros(), strides=stride,
                        kernel_initializer=TruncatedNormal(mean=0, stddev=0.001))(y)
    y = BatchNormalization()(y)
    y = layers.LeakyReLU()(y)
    return y

def build_branch1(y):
    y = build_3DpoolBlock(y, 2)
    y = build_3DResBlock(y, 64)
    y = build_branch2(y)
    y = build_3DResBlock(y, 64)
    y = build_3DupsampleBlock(y, 32, 2, 2)
    return y

def build_branch2(y):
    x = build_3DResBlock(y, 64)
    y = build_3DpoolBlock(y, 2)
    for i in range(3):
        proj_scut = True if i == 0 else False
        y = build_3DResBlock(y, 128, _project_shortcut=proj_scut)
    y = build_3DupsampleBlock(y, 64, 2, 2)
    y = layers.add([y, x])
    return y


def build_V2VModel(x):
    x = build_3DBlock(x, next_fDim=16, kernelSz=7)
    x = build_3DpoolBlock(x, 2)

    for i in range(3):
        proj_scut = True if i == 0 else False
        x = build_3DResBlock(x, 32, _project_shortcut=proj_scut)

    y = build_3DResBlock(x, 32)
    b1 = build_branch1(x)

    x = layers.add([b1, y])
    x = build_3DResBlock(x, next_fDim=32)
    x = build_3DBlock(x, next_fDim=32, kernelSz=1)
    x = build_3DBlock(x, next_fDim=32, kernelSz=1)
    x = Conv3D(15, kernel_size=(1, 1, 1), strides=(1, 1, 1), padding="valid",
               use_bias=True, bias_initializer=Zeros(),
               kernel_initializer=TruncatedNormal(mean=0, stddev=0.001)
               )(x)
    return x

inputDim = 88

# Create V2V model
voxel_input = Input(shape=(inputDim, inputDim, inputDim, 1), dtype=np.float32, name='input_layer')

heatmap_output = build_V2VModel(voxel_input)

model = models.Model(inputs=voxel_input, outputs=heatmap_output)

print(model.summary())

model.compile(optimizer='RMSprop', loss='mean_squared_error')

# hist = model.fit(voxel, heatmap, batch_size=2, validation_split=0.2, epochs=10, verbose=1)
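
The parameter counts in the model summary of this post can be cross-checked by hand, which is a quick way to confirm which layers were actually built. The sketch below is plain Python arithmetic and needs no Keras; the helper names are hypothetical. It assumes cubic kernels with a bias term, and four values per channel for batch normalization (gamma, beta, moving mean, moving variance), which is how Keras counts them.

```python
def conv3d_params(kernel, in_channels, out_channels, use_bias=True):
    """Parameters of a 3D convolution with a cubic kernel."""
    weights = kernel ** 3 * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)

def batchnorm_params(channels):
    """Parameters reported for batch normalization: 4 values per channel."""
    return 4 * channels

print(conv3d_params(7, 1, 16))    # 5504  -> conv3d_1, the 7x7x7 input block
print(conv3d_params(3, 16, 32))   # 13856 -> conv3d_2, first 3x3x3 residual conv
print(conv3d_params(1, 16, 32))   # 544   -> conv3d_4, 1x1x1 shortcut projection
print(conv3d_params(3, 32, 32))   # 27680 -> conv3d_3, second 3x3x3 residual conv
print(batchnorm_params(16))       # 64    -> batch_normalization_1
```

The same arithmetic gives 1056 for a 1x1x1 convolution from 32 to 32 channels, which matches conv3d_7 and conv3d_10 and shows that a projection shortcut is present in those residual blocks as well.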

Model Summary:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
===================================================================
input_layer (InputLayer)        (None, 88, 88, 88, 1 0
__________________________________________________________________________________________________
conv3d_1 (Conv3D)               (None, 88, 88, 88, 1 5504        input_layer[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 88, 88, 88, 1 64          conv3d_1[0][0]
__________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU)       (None, 88, 88, 88, 1 0           batch_normalization_1[0][0]
__________________________________________________________________________________________________
max_pooling3d_1 (MaxPooling3D)  (None, 44, 44, 44, 1 0           leaky_re_lu_1[0][0]
__________________________________________________________________________________________________
conv3d_2 (Conv3D)               (None, 44, 44, 44, 3 13856       max_pooling3d_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 44, 44, 44, 3 128         conv3d_2[0][0]
__________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU)       (None, 44, 44, 44, 3 0           batch_normalization_2[0][0]
__________________________________________________________________________________________________
conv3d_4 (Conv3D)               (None, 44, 44, 44, 3 544         max_pooling3d_1[0][0]
__________________________________________________________________________________________________
conv3d_3 (Conv3D)               (None, 44, 44, 44, 3 27680       leaky_re_lu_2[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 44, 44, 44, 3 128         conv3d_4[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 44, 44, 44, 3 128         conv3d_3[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 44, 44, 44, 3 0           batch_normalization_4[0][0]
                                                                 batch_normalization_3[0][0]
__________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU)       (None, 44, 44, 44, 3 0           add_1[0][0]
__________________________________________________________________________________________________
conv3d_5 (Conv3D)               (None, 44, 44, 44, 3 27680       leaky_re_lu_3[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 44, 44, 44, 3 128         conv3d_5[0][0]
__________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU)       (None, 44, 44, 44, 3 0           batch_normalization_5[0][0]
__________________________________________________________________________________________________
conv3d_7 (Conv3D)               (None, 44, 44, 44, 3 1056        leaky_re_lu_3[0][0]
__________________________________________________________________________________________________
conv3d_6 (Conv3D)               (None, 44, 44, 44, 3 27680       leaky_re_lu_4[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 44, 44, 44, 3 128         conv3d_7[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 44, 44, 44, 3 128         conv3d_6[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 44, 44, 44, 3 0           batch_normalization_7[0][0]
                                                                 batch_normalization_6[0][0]
__________________________________________________________________________________________________
leaky_re_lu_5 (LeakyReLU)       (None, 44, 44, 44, 3 0           add_2[0][0]
__________________________________________________________________________________________________
conv3d_8 (Conv3D)               (None, 44, 44, 44, 3 27680       leaky_re_lu_5[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 44, 44, 44, 3 128         conv3d_8[0][0]
__________________________________________________________________________________________________
leaky_re_lu_6 (LeakyReLU)       (None, 44, 44, 44, 3 0           batch_normalization_8[0][0]
__________________________________________________________________________________________________
conv3d_10 (Conv3D)              (None, 44, 44, 44, 3 1056        leaky_re_lu_5[0][0]
__________________________________________________________________________________________________
conv3d_9 (Conv3D)               (None, 44, 44, 44, 3 27680       leaky_re_lu_6[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 44, 44, 44, 3 128         conv3d_10[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 44, 44, 44, 3 128         conv3d_9[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 44, 44, 44, 3 0           batch_normalization_10[0][0]
                                                                 batch_normalization_9[0][0]
__________________________________________________________________________________________________
leaky_re_lu_7 (LeakyReLU)       (None, 44, 44, 44, 3 0           add_3[0][0]
__________________________________________________________________________________________________
max_pooling3d_2 (MaxPooling3D)  (None, 22, 22, 22, 3 0           leaky_re_lu_7[0][0]
__________________________________________________________________________________________________
conv3d_14 (Conv3D)              (None, 22, 22, 22, 6 55360       max_pooling3d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 22, 22, 22, 6 256         conv3d_14[0][0]
__________________________________________________________________________________________________
leaky_re_lu_10 (LeakyReLU)      (None, 22, 22, 22, 6 0           batch_normalization_14[0][0]
__________________________________________________________________________________________________
conv3d_16 (Conv3D)              (None, 22, 22, 22, 6 2112        max_pooling3d_2[0][0]
__________________________________________________________________________________________________
conv3d_15 (Conv3D)              (None, 22, 22, 22, 6 110656      leaky_re_lu_10[0][0]
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 22, 22, 22, 6 256         conv3d_16[0][0]
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 22, 22, 22, 6 256         conv3d_15[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 22, 22, 22, 6 0           batch_normalization_16[0][0]
                                                                 batch_normalization_15[0][0]
__________________________________________________________________________________________________
leaky_re_lu_11 (LeakyReLU)      (None, 22, 22, 22, 6 0           add_5[0][0]
__________________________________________________________________________________________________
max_pooling3d_3 (MaxPooling3D)  (None, 11, 11, 11, 6 0           leaky_re_lu_11[0][0]
__________________________________________________________________________________________________
conv3d_20 (Conv3D)              (None, 11, 11, 11, 1 221312      max_pooling3d_3[0][0]
__________________________________________________________________________________________________
batch_normalization_20 (BatchNo (None, 11, 11, 11, 1 512         conv3d_20[0][0]
__________________________________________________________________________________________________
leaky_re_lu_14 (LeakyReLU)      (None, 11, 11, 11, 1 0           batch_normalization_20[0][0]
__________________________________________________________________________________________________
conv3d_22 (Conv3D)              (None, 11, 11, 11, 1 8320        max_pooling3d_3[0][0]
__________________________________________________________________________________________________
conv3d_21 (Conv3D)              (None, 11, 11, 11, 1 442496      leaky_re_lu_14[0][0]
__________________________________________________________________________________________________
batch_normalization_22 (BatchNo (None, 11, 11, 11, 1 512         conv3d_22[0][0]
__________________________________________________________________________________________________
batch_normalization_21 (BatchNo (None, 11, 11, 11, 1 512         conv3d_21[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 11, 11, 11, 1 0           batch_normalization_22[0][0]
                                                                 batch_normalization_21[0][0]
__________________________________________________________________________________________________
leaky_re_lu_15 (LeakyReLU)      (None, 11, 11, 11, 1 0           add_7[0][0]
__________________________________________________________________________________________________
conv3d_23 (Conv3D)              (None, 11, 11, 11, 1 442496      leaky_re_lu_15[0][0]
__________________________________________________________________________________________________
batch_normalization_23 (BatchNo (None, 11, 11, 11, 1 512         conv3d_23[0][0]
__________________________________________________________________________________________________
leaky_re_lu_16 (LeakyReLU)      (None, 11, 11, 11, 1 0           batch_normalization_23[0][0]
__________________________________________________________________________________________________
conv3d_25 (Conv3D)              (None, 11, 11, 11, 1 16512       leaky_re_lu_15[0][0]
__________________________________________________________________________________________________
conv3d_24 (Conv3D)              (None, 11, 11, 11, 1 442496      leaky_re_lu_16[0][0]
__________________________________________________________________________________________________
batch_normalization_25 (BatchNo (None, 11, 11, 11, 1 512         conv3d_25[0][0]
__________________________________________________________________________________________________
batch_normalization_24 (BatchNo (None, 11, 11, 11, 1 512         conv3d_24[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 11, 11, 11, 1 0           batch_normalization_25[0][0]
                                                                 batch_normalization_24[0][0]
__________________________________________________________________________________________________
leaky_re_lu_17 (LeakyReLU)      (None, 11, 11, 11, 1 0           add_8[0][0]
__________________________________________________________________________________________________
conv3d_26 (Conv3D)              (None, 11, 11, 11, 1 442496      leaky_re_lu_17[0][0]
__________________________________________________________________________________________________
batch_normalization_26 (BatchNo (None, 11, 11, 11, 1 512         conv3d_26[0][0]
__________________________________________________________________________________________________
leaky_re_lu_18 (LeakyReLU)      (None, 11, 11, 11, 1 0           batch_normalization_26[0][0]
__________________________________________________________________________________________________
conv3d_28 (Conv3D)              (None, 11, 11, 11, 1 16512       leaky_re_lu_17[0][0]
__________________________________________________________________________________________________
conv3d_27 (Conv3D)              (None, 11, 11, 11, 1 442496      leaky_re_lu_18[0][0]
__________________________________________________________________________________________________
conv3d_17 (Conv3D)              (None, 22, 22, 22, 6 110656      leaky_re_lu_11[0][0]
__________________________________________________________________________________________________
batch_normalization_28 (BatchNo (None, 11, 11, 11, 1 512         conv3d_28[0][0]
__________________________________________________________________________________________________
batch_normalization_27 (BatchNo (None, 11, 11, 11, 1 512         conv3d_27[0][0]
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 22, 22, 22, 6 256         conv3d_17[0][0]
__________________________________________________________________________________________________
add_9 (Add)                     (None, 11, 11, 11, 1 0           batch_normalization_28[0][0]
                                                                 batch_normalization_27[0][0]
__________________________________________________________________________________________________
leaky_re_lu_12 (LeakyReLU)      (None, 22, 22, 22, 6 0           batch_normalization_17[0][0]
__________________________________________________________________________________________________
leaky_re_lu_19 (LeakyReLU)      (None, 11, 11, 11, 1 0           add_9[0][0]
__________________________________________________________________________________________________
conv3d_19 (Conv3D)              (None, 22, 22, 22, 6 4160        leaky_re_lu_11[0][0]
__________________________________________________________________________________________________
conv3d_18 (Conv3D)              (None, 22, 22, 22, 6 110656      leaky_re_lu_12[0][0]
__________________________________________________________________________________________________
conv3d_transpose_1 (Conv3DTrans (None, 22, 22, 22, 6 65600       leaky_re_lu_19[0][0]
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 22, 22, 22, 6 256         conv3d_19[0][0]
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 22, 22, 22, 6 256         conv3d_18[0][0]
__________________________________________________________________________________________________
batch_normalization_29 (BatchNo (None, 22, 22, 22, 6 256         conv3d_transpose_1[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 22, 22, 22, 6 0           batch_normalization_19[0][0]
                                                                 batch_normalization_18[0][0]
__________________________________________________________________________________________________
leaky_re_lu_20 (LeakyReLU)      (None, 22, 22, 22, 6 0           batch_normalization_29[0][0]
__________________________________________________________________________________________________
leaky_re_lu_13 (LeakyReLU)      (None, 22, 22, 22, 6 0           add_6[0][0]
__________________________________________________________________________________________________
add_10 (Add)                    (None, 22, 22, 22, 6 0           leaky_re_lu_20[0][0]
                                                                 leaky_re_lu_13[0][0]
__________________________________________________________________________________________________
conv3d_29 (Conv3D)              (None, 22, 22, 22, 6 110656      add_10[0][0]
__________________________________________________________________________________________________
batch_normalization_30 (BatchNo (None, 22, 22, 22, 6 256         conv3d_29[0][0]
__________________________________________________________________________________________________
leaky_re_lu_21 (LeakyReLU)      (None, 22, 22, 22, 6 0           batch_normalization_30[0][0]
__________________________________________________________________________________________________
conv3d_31 (Conv3D)              (None, 22, 22, 22, 6 4160        add_10[0][0]
__________________________________________________________________________________________________
conv3d_30 (Conv3D)              (None, 22, 22, 22, 6 110656      leaky_re_lu_21[0][0]
__________________________________________________________________________________________________
conv3d_11 (Conv3D)              (None, 44, 44, 44, 3 27680       leaky_re_lu_7[0][0]
__________________________________________________________________________________________________
batch_normalization_32 (BatchNo (None, 22, 22, 22, 6 256         conv3d_31[0][0]
__________________________________________________________________________________________________
batch_normalization_31 (BatchNo (None, 22, 22, 22, 6 256         conv3d_30[0][0]
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 44, 44, 44, 3 128         conv3d_11[0][0]
__________________________________________________________________________________________________
add_11 (Add)                    (None, 22, 22, 22, 6 0           batch_normalization_32[0][0]
                                                                 batch_normalization_31[0][0]
__________________________________________________________________________________________________
leaky_re_lu_8 (LeakyReLU)       (None, 44, 44, 44, 3 0           batch_normalization_11[0][0]
__________________________________________________________________________________________________
leaky_re_lu_22 (LeakyReLU)      (None, 22, 22, 22, 6 0           add_11[0][0]
__________________________________________________________________________________________________
conv3d_13 (Conv3D)              (None, 44, 44, 44, 3 1056        leaky_re_lu_7[0][0]
__________________________________________________________________________________________________
conv3d_12 (Conv3D)              (None, 44, 44, 44, 3 27680       leaky_re_lu_8[0][0]
__________________________________________________________________________________________________
conv3d_transpose_2 (Conv3DTrans (None, 44, 44, 44, 3 16416       leaky_re_lu_22[0][0]
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 44, 44, 44, 3 128         conv3d_13[0][0]
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 44, 44, 44, 3 128         conv3d_12[0][0]
__________________________________________________________________________________________________
batch_normalization_33 (BatchNo (None, 44, 44, 44, 3 128         conv3d_transpose_2[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 44, 44, 44, 3 0           batch_normalization_13[0][0]
                                                                 batch_normalization_12[0][0]
__________________________________________________________________________________________________
leaky_re_lu_23 (LeakyReLU)      (None, 44, 44, 44, 3 0           batch_normalization_33[0][0]
__________________________________________________________________________________________________
leaky_re_lu_9 (LeakyReLU)       (None, 44, 44, 44, 3 0           add_4[0][0]
__________________________________________________________________________________________________
add_12 (Add)                    (None, 44, 44, 44, 3 0           leaky_re_lu_23[0][0]
                                                                 leaky_re_lu_9[0][0]
__________________________________________________________________________________________________
conv3d_32 (Conv3D)              (None, 44, 44, 44, 3 27680       add_12[0][0]
__________________________________________________________________________________________________
batch_normalization_34 (BatchNo (None, 44, 44, 44, 3 128         conv3d_32[0][0]
__________________________________________________________________________________________________
leaky_re_lu_24 (LeakyReLU)      (None, 44, 44, 44, 3 0           batch_normalization_34[0][0]
__________________________________________________________________________________________________
conv3d_34 (Conv3D)              (None, 44, 44, 44, 3 1056        add_12[0][0]
__________________________________________________________________________________________________
conv3d_33 (Conv3D)              (None, 44, 44, 44, 3 27680       leaky_re_lu_24[0][0]
__________________________________________________________________________________________________
batch_normalization_36 (BatchNo (None, 44, 44, 44, 3 128         conv3d_34[0][0]
__________________________________________________________________________________________________
batch_normalization_35 (BatchNo (None, 44, 44, 44, 3 128         conv3d_33[0][0]
__________________________________________________________________________________________________
add_13 (Add)                    (None, 44, 44, 44, 3 0           batch_normalization_36[0][0]
                                                                 batch_normalization_35[0][0]
__________________________________________________________________________________________________
leaky_re_lu_25 (LeakyReLU)      (None, 44, 44, 44, 3 0           add_13[0][0]
__________________________________________________________________________________________________
conv3d_35 (Conv3D)              (None, 44, 44, 44, 3 1056        leaky_re_lu_25[0][0]
__________________________________________________________________________________________________
batch_normalization_37 (BatchNo (None, 44, 44, 44, 3 128         conv3d_35[0][0]
__________________________________________________________________________________________________
leaky_re_lu_26 (LeakyReLU)      (None, 44, 44, 44, 3 0           batch_normalization_37[0][0]
__________________________________________________________________________________________________
conv3d_36 (Conv3D)              (None, 44, 44, 44, 3 1056        leaky_re_lu_26[0][0]
__________________________________________________________________________________________________
batch_normalization_38 (BatchNo (None, 44, 44, 44, 3 128         conv3d_36[0][0]
__________________________________________________________________________________________________
leaky_re_lu_27 (LeakyReLU)      (None, 44, 44, 44, 3 0           batch_normalization_38[0][0]
__________________________________________________________________________________________________
conv3d_37 (Conv3D)              (None, 44, 44, 44, 15 495         leaky_re_lu_27[0][0]
==================================================================================================
Total params: 3,461,615
Trainable params: 3,456,847
Non-trainable params: 4,768
__________________________________________________________________________________________________
None
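
The parameter counts in the summary above can be cross-checked by hand. The sketch below assumes that the truncated shape column hides channel widths of 128, 64 and 32 at the 11, 22 and 44 resolutions, that the convolutions use 3x3x3 and 1x1x1 kernels, and that the transposed convolutions use 2x2x2 kernels. None of this is stated in the log, and the helper names are hypothetical.

```python
def conv3d_params(kernel, c_in, c_out):
    """Weights of a cubic 3D kernel plus one bias per output channel."""
    return kernel ** 3 * c_in * c_out + c_out


def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance for each channel."""
    return 4 * channels


print(conv3d_params(3, 128, 128))  # 442496, e.g. conv3d_23
print(conv3d_params(1, 128, 128))  # 16512,  e.g. conv3d_25
print(conv3d_params(3, 64, 64))    # 110656, e.g. conv3d_17
print(conv3d_params(2, 128, 64))   # 65600,  conv3d_transpose_1
print(conv3d_params(2, 64, 32))    # 16416,  conv3d_transpose_2
print(conv3d_params(1, 32, 15))    # 495,    conv3d_37 (15 output channels)
print(batchnorm_params(128))       # 512,    e.g. batch_normalization_21
```

If these assumptions hold, the non-trainable total would be consistent with the moving mean and variance of the batch-normalization layers, which are not updated by gradient descent.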

cannot open param.lua: No such file or directory

db: NYU mode: train
training data loading...
testing data loading...
invalid frame in test set: 0
/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/threads/threads.lua:183: [thread 6 callback] cannot open param.lua: No such file or directory

About the algorithm flow

Thank you for your research. I read your code line by line and have two questions:
(1) Why do you voxelize the whole image first and then discretize the extracted hand, instead of the other way around? Doesn't this increase the amount of computation?
(2) In the function "scattering()", why are lower_mask and upper_mask set to select coords between 1 and 88 instead of between -44 and 43?
Thanks again for your patience.
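
On the second question, one possible reading is that the range is only an indexing convention: Lua tensors are 1-based, so a grid of 88 voxels per axis is addressed with indices 1 to 88, and a centered range of -44 to 43 maps onto it by adding 45. The sketch below illustrates that shift and the matching bounds check in Python. The function names are hypothetical and this is not the repository's scattering() code.

```python
GRID = 88  # voxels per axis, as mentioned in the question


def to_one_based(centered_index):
    """Shift a centered index in [-44, 43] to a 1-based index in [1, 88]."""
    return centered_index + GRID // 2 + 1


def in_bounds(index):
    """True if a 1-based index addresses a voxel inside the grid."""
    return 1 <= index <= GRID


def keep_valid(points):
    """Drop points with any coordinate outside the 1..88 grid."""
    return [p for p in points if all(in_bounds(i) for i in p)]


print(to_one_based(-44), to_one_based(43))           # 1 88
print(keep_valid([(1, 1, 1), (0, 5, 5), (89, 1, 1)]))  # [(1, 1, 1)]
```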

reference point

Hi @mks0601,
Can you tell me how to get the reference points for the MSRA dataset? The number of samples in the MSRA dataset I downloaded is smaller than the number you provide, so I need to produce new reference points for the dataset. Thanks.

Different Sz in Generate_cubic_input() and Discretize()

Hi, thanks for your open-source code. I read it carefully and tried to implement it, and I am confused by the different sizes in generate_cubic_input() and discretize(): originalSz = 96, croppedSz = 88, cubicSz = 250 (NYU). There seem to be two kinds of normalization, the first in generate_cubic_input() by "/(cubicSz/2)" and the second in discretize() by "/croppedSz". What is the difference between them, and especially between cubicSz and originalSz?
Hoping for your reply, thanks very much.
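
One way to read the two normalizations is as two stages of the same mapping: dividing by cubicSz/2 rescales metric offsets from the reference point so that the cube spans [-1, 1], and the step tied to croppedSz then turns that unitless range into voxel indices. The sketch below illustrates this reading in Python with the NYU numbers from the question (cubicSz = 250, croppedSz = 88). The function names, the sample reference depth and the exact rounding are assumptions, not the repository's code.

```python
import math


def normalize(coord, ref, cubic_sz):
    """Metric offset from the reference point, rescaled so the cube spans [-1, 1]."""
    return (coord - ref) / (cubic_sz / 2.0)


def discretize(norm, cropped_sz):
    """Map a normalized value in [-1, 1) to a 1-based voxel index in [1, cropped_sz]."""
    return math.floor((norm + 1.0) / 2.0 * cropped_sz) + 1


ref_z = 700.0                       # hypothetical reference depth
n = normalize(ref_z, ref_z, 250.0)  # the reference point itself
print(n, discretize(n, 88))         # 0.0 45
print(discretize(-1.0, 88))         # 1
```

In this reading cubicSz is a physical size and croppedSz is a grid resolution, so the two divisions are not redundant.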

Pre-computed centers for ITOP dataset seems problematic

Hi Moon,

I came across a problem while inspecting the depth map and pre-computed center from ITOP dataset. Here is the result:

The center is denoted by a red dot. However, it seems that the red dots are not the real center of the human body. Why is that? Did I miss anything?

Lua files not found

Hi there,

First of all, thank you for providing this code and for replying to previous issues.
For a graduation project, I am trying to get this repository running. I installed all dependencies and cloned the repository. I think there is a minor typo in README.md, so I run the following from my src directory:
th run_me.lua
This results in an error:
/usr/bin/luajit: /usr/share/lua/5.1/trepl/init.lua:389: module 'cunn' not found:
no field package.preload['cunn']
no file './cunn.lua'
no file '/usr/share/luajit-2.1.0-beta3/cunn.lua'
no file '/usr/local/share/lua/5.1/cunn.lua'
no file '/usr/local/share/lua/5.1/cunn/init.lua'
no file '/usr/share/lua/5.1/cunn.lua'
no file '/usr/share/lua/5.1/cunn/init.lua'
no file './cunn.so'
no file '/usr/local/lib/lua/5.1/cunn.so'
no file '/usr/lib/x86_64-linux-gnu/lua/5.1/cunn.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
[C]: in function 'error'
/usr/share/lua/5.1/trepl/init.lua:389: in function 'require'
run_me.lua:2: in main chunk
[C]: in function 'dofile'
/usr/lib/torch-trepl/th:149: in main chunk
[C]: at 0x559c7c0dc1d0

Lua is installed, however: $ luajit works just fine. What am I missing?

How background removal works for train data?

Hi Moon,

I'm having a problem with the voxelization part. I understand how you generate the refPoint for background removal. However, the result of my implementation doesn't seem to be correct.
Here is the Point cloud:

Resulting Voxels.

Why is the background not removed? What is the mechanism that controls how the point cloud is cropped based on the refPoint?
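
A common way to remove background with a reference point is to keep only the points that fall inside a cube of fixed side length centered on it. The sketch below is a generic illustration under that assumption. crop_to_cube is a hypothetical name, the sample values are made up, and the repository's own cropping may differ.

```python
def crop_to_cube(points, ref_point, cubic_sz):
    """Keep points whose offset from ref_point is within cubic_sz / 2 on every axis."""
    half = cubic_sz / 2.0
    return [
        p for p in points
        if all(abs(c - r) <= half for c, r in zip(p, ref_point))
    ]


ref = (0.0, 0.0, 700.0)            # hypothetical reference point, in mm
cloud = [
    (10.0, -20.0, 690.0),          # near the reference point: kept
    (50.0, 40.0, 1500.0),          # far behind it (background): dropped
    (300.0, 0.0, 700.0),           # too far to the side: dropped
]
print(crop_to_cube(cloud, ref, 250.0))  # [(10.0, -20.0, 690.0)]
```

If background survives a crop like this, the reference point and the units of the cube size (millimetres versus metres) would be the first things to check.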

TensorFlow code

Hello, do you have a TensorFlow implementation of this project?

Generate cubic input and data augmentation

I have two questions as follows:
(1) I know the cropped image size is 96 pixels in DeepPrior++ and REN, but I am confused about why you set originalSz = 96 in generate_cubic_input(). And why do you use both the "originalSz" and "croppedSz" variables?
(2) In generate_cubic_input(), why do you use the following code to resize the data instead of multiplying by a coefficient directly?
if newSz < 100 then
    coord = coord / originalSz * math.floor(originalSz*newSz/100) + math.floor(originalSz/2 - originalSz/2*newSz/100)
elseif newSz > 100 then
    coord = coord / originalSz * math.floor(originalSz*newSz/100) - math.floor(originalSz/2*newSz/100 - originalSz/2)
end
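
The quoted branch can be ported to Python to see what it does numerically. The sketch below assumes the multiplication signs that the page rendering appears to have dropped (originalSz*newSz/100 and originalSz/2*newSz/100) and treats newSz as a percentage. resize_coord is a hypothetical name.

```python
import math


def resize_coord(coord, original_sz, new_sz):
    """Rescale a coordinate to new_sz percent of the grid, keeping it roughly centered."""
    scaled = math.floor(original_sz * new_sz / 100)
    if new_sz < 100:
        offset = math.floor(original_sz / 2 - original_sz / 2 * new_sz / 100)
        return coord / original_sz * scaled + offset
    elif new_sz > 100:
        offset = math.floor(original_sz / 2 * new_sz / 100 - original_sz / 2)
        return coord / original_sz * scaled - offset
    return coord


# Shrinking to 80 %: the 0..96 range becomes 9..85.
print(resize_coord(0, 96, 80), resize_coord(96, 96, 80))    # 9.0 85.0
# Growing to 120 %: the 0..96 range becomes -9..106.
print(resize_coord(0, 96, 120), resize_coord(96, 96, 120))  # -9.0 106.0
```

With these numbers the content shrinks or grows while staying roughly centered in the 96-voxel window. A bare multiplication would instead scale about the origin and shift the content toward one corner.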

Opposite sign of Z value

Hi, I found that the Z values of the joints (shown in joints.txt) in the MSRA dataset are negative. But the depth image pixel values are all positive, and when transformed to 3D, their Z values (which are the pixel values) stay unchanged in the current code. So they have opposite signs in Z and cannot match in 3D space. Did I miss something?

PS. I also tried flipping the sign of Z in the ground-truth joints, and then they match well.

Thanks!

Clarity about pixel and world coordinates

Hi, I'm trying to use the pretrained model to estimate pose on my custom images. For center estimation, I used a different quick technique for now to estimate the (x,y) image coordinates of the center of the head (since I'm doing top-view ITOP, I estimate the center of the head), instead of DeepPrior++. While trying to convert that to the world coordinates, which the model requires, I tried using the pixel2world function in the data.lua file for ITOP. But the function takes in (x,y,z) while I have only (x,y) from the image coordinates. Hence I want some clarity on what's the z in pixel coordinates. I assumed pixel coordinates are the same as image's coordinates (row,column), is that correct?

I tried different values of z and found the resulting joint coordinates to be spread apart with lower z and grouped closer together with higher z (the estimation still wasn't correct).

Can you give me a brief explanation about pixel coordinates, world coordinates and how to go about my current problem here, concerning them?

Thanks a lot in advance!
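
Under a pinhole camera model, a pixel (x, y) is the column and row in the image and z is the depth value stored at that pixel, which is why a back-projection needs all three numbers. The sketch below is a generic illustration that assumes the principal point sits at the image center and the image y axis points down. The function names, the sign convention and the sample intrinsics are assumptions, not taken from data.lua.

```python
def pixel2world(x, y, z, fx, fy, width, height):
    """Back-project pixel (x, y) with depth z to camera-space coordinates."""
    wx = (x - width / 2.0) * z / fx
    wy = (height / 2.0 - y) * z / fy
    return wx, wy, z


def world2pixel(wx, wy, wz, fx, fy, width, height):
    """Project a camera-space point back to pixel coordinates."""
    x = width / 2.0 + wx * fx / wz
    y = height / 2.0 - wy * fy / wz
    return x, y, wz


# Hypothetical 320x240 camera with a focal length of 300 pixels.
center = pixel2world(160, 120, 2.0, 300.0, 300.0, 320, 240)
print(center)  # (0.0, 0.0, 2.0)
```

For a fixed pixel, world x and y grow linearly with z, so a guessed z changes the lateral position of the reference point as well as its depth. Reading z from the depth map at the estimated (x, y) avoids that guess.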

Python or TensorFlow code, or use on Windows

Hi, does this program depend on other code, such as Python? Is there a TensorFlow version? I'm trying to rebuild it on Windows, but the framework may not work well there.

Steps to run the ITOP dataset

Hello, thank you for your open-source program. I read your README, but I still don't know the steps to run the program and visualize the human body pose estimation. Can you tell me?
How do I generate 'result_', db, '_pixel.txt' and 'result_', db, '_world.txt'?

In world2pixel.py:
labels = np.loadtxt('result.txt')
labels = np.reshape(labels, (-1, jointNum, 3))
labels[:,:,0],labels[:,:,1] = world2pixel(labels,fx,fy,imgWidth,imgHeight)
labels = np.reshape(labels,(-1,jointNum*3))

np.savetxt('result_pixel.txt', labels, fmt='%0.3f')
Where is the result.txt file?

Training with custom dataset

Hi, I have created my own custom dataset. I have disparity maps and want to fine-tune V2V-PoseNet. Can you explain what the following values are and how you obtained them for the provided datasets?

fx = 241.42
fy = 241.42
cubicSz = 200

minDepth = 100
maxDepth = 700
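
fx and fy are conventionally camera focal lengths in pixels, and minDepth and maxDepth appear to bound a depth range, so a disparity map would normally be converted to depth first. The sketch below shows the standard stereo relation depth = focal length * baseline / disparity, and how a focal length in pixels follows from a field of view. All numeric values are made up for illustration, and the function names are hypothetical.

```python
import math


def disparity_to_depth(disparity_px, focal_px, baseline_mm):
    """Stereo depth in mm; returns None where disparity is not positive."""
    if disparity_px <= 0:
        return None
    return focal_px * baseline_mm / disparity_px


def focal_from_fov(image_width_px, horizontal_fov_deg):
    """Focal length in pixels for a pinhole camera with the given horizontal FOV."""
    return (image_width_px / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)


# Hypothetical rig: 500 px focal length, 60 mm baseline.
print(disparity_to_depth(50.0, 500.0, 60.0))  # 600.0
print(disparity_to_depth(0.0, 500.0, 60.0))   # None
```

Depth values obtained this way could then be compared against a range such as minDepth and maxDepth, provided the units agree.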

Experimental Comparison

Hi Gyeongsik,
My research is on human pose estimation. Would it be convenient for you to share your training code or pre-trained model for human pose estimation?

HAND 2017

Could you provide HAND2017 code and centroid files?

How to read the output of test?

Hi, thanks for your help. I have rebuilt the system, but I cannot understand the output of the test on the ITOP dataset.
(1,.,.) =
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 0 2
0 -1 2
0 -1 3
[torch.CudaLongTensor of size 1x15x3]
The code says this is the 3D joint coordinates (final output) in the world coordinate system. I'm confused about why the results are integers in the range [-1, 3], why the x coordinate is always zero, and how to display them like Figure 13 or 14 in the paper. Am I doing something wrong that makes the result strange, or do I need another system to evaluate the result?
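
One possible explanation, offered only as a guess from the printout, is that the values are metric coordinates on a scale of a few units (for example metres) stored in an integer tensor, so the fractional part is lost. The sketch below shows how truncation toward zero would produce numbers like the ones above. The joint values are invented for illustration and truncate_like_long is a hypothetical name.

```python
def truncate_like_long(joints):
    """Convert float coordinates to integers by truncating toward zero."""
    return [[int(c) for c in joint] for joint in joints]


# Invented coordinates on a metre-like scale.
joints = [
    (0.12, -0.35, 2.81),
    (-0.40, 0.55, 2.10),
    (0.30, -1.20, 3.05),
]
print(truncate_like_long(joints))  # [[0, 0, 2], [0, 0, 2], [0, -1, 3]]
```

If this is what happens, keeping the output in a floating-point tensor before printing or saving would preserve the actual coordinates.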

Testing with Pre-trained Model and Data

Hi, I've been trying to test the pre-trained ICVL model (from GitHub) with the ICVL data (linked from GitHub) and keep running into this error.

Found Environment variable CUDNN_PATH = /usr/local/cuda-8.0/lib64/libcudnn.so.5
db: ICVL mode: test
model loading...	
testing data loading...	
invalid frame in test set: 0	
==> testing:
/home/diego/research/torch/install/bin/luajit: invalid arguments: FloatTensor nil 
expected arguments: [*FloatTensor*] FloatTensor FloatTensor
stack traceback:
	[C]: at 0x7fb7dd940bd0
	[C]: in function 'cmul'
	./data/ICVL/data.lua:46: in function 'pixel2world'
	util.lua:96: in function 'generate_cubic_input'
	test.lua:42: in function 'test'
	run_me.lua:78: in main chunk
	[C]: in function 'dofile'
	...arch/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

Any ideas what is going on? I haven't changed anything other than where the model, centers, and data are located.

Change output heatmap's resolution up to 88x88x88

Hi,
I want to increase the network's output resolution to 88x88x88 (that is, double the current heatmap size) to improve estimation precision. Since your current network already works well, I chose to adjust it simply by inserting one more decoder block (with an upsampling layer) after the original encoder-decoder block to double its output (which is 44x44x44). I also tried adding a longer skip/residual connection at the original scale (88x88x88). But these variants do not seem to work as well as your original network in my experiments (although some of them do work).

I wonder how you designed your current network, beyond common practices such as residual blocks and U-Net-like skip connections. For example, you used a U-Net-style encoder-decoder sub-block in the middle of the architecture, after one basic conv layer, one pooling layer and some residual blocks. What were your considerations?

Besides, did you consider the receptive field of the feature map cells (in order to capture a larger 3D context) when you designed the network? Did you try any experiments or network designs with an 88x88x88 output resolution? Could you share some experience or give me some suggestions?

Thanks!
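
As a rough size check for this change, doubling each axis from 44 to 88 multiplies the number of output voxels by eight. The sketch below counts heatmap elements per sample. The joint count of 15 and the 4-byte float size are illustrative assumptions, and the function name is hypothetical.

```python
def heatmap_elements(side, joints):
    """Number of values in one sample's stack of cubic heatmaps."""
    return side ** 3 * joints


small = heatmap_elements(44, 15)
large = heatmap_elements(88, 15)
print(small, large)   # 1277760 10222080
print(large // small)  # 8
print(large * 4)       # 40888320 bytes per sample at 4 bytes per value
```

The same factor of eight applies to every activation kept at the added resolution, which may be worth weighing against the expected gain in precision.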

Visualization of ITOP

Hi, I found the visualization code for ICVL, MSRA and NYU in the vis program. Could the code for ITOP visualization be shared? Or do I just need to modify some parameters of the existing code?

Real-time inference on ITOP

I'm trying to perform real-time inference on the ITOP dataset. I would appreciate any help in that direction.
I did load the pretrained model and run the test. But do I have to use MATLAB/Octave to visualize? I want to run the model on a depth camera feed and visualize at the same time. Do you think this is possible, and if yes, can you point me in the right direction?

The Details of ITOP points.

Hi, at line 54 of "test.lua", for the ITOP dataset, the model outputs a tensor named "xyzOutput" which has 15 rows.
Unfortunately, the ITOP website has been closed.
I'm wondering which human body part each of these rows represents.
For example, maybe row 1 represents 'Head'.

Estimate 3D hand pose in ASL Finger Spelling Dataset

Can I estimate 3D hand pose (uvd) in ASL Finger Spelling Dataset using the pretrained models?
Note that the resolution of the given depth images is different and there is no camera configuration.
Thanks in advance~
