zhaoj9014 / multi-human-parsing

🔥🔥Official Repository for Multi-Human-Parsing (MHP)🔥🔥

Home Page: http://lv-mhp.github.io/

License: MIT License

Python 34.63% MATLAB 3.24% HTML 0.24% PowerShell 0.01% CSS 0.66% JavaScript 61.19% Objective-C 0.03%
human-parsing multi-human-parsing instance-segmentation parsing segmentation detection scene-understanding group-behavior-analysis human-centric-analysis mhp

multi-human-parsing's Introduction

Multi-Human-Parsing (MHP)

⭐ ACM MM'18 Best Student Paper

Originality

  • To the best of our knowledge, we are the first to propose the Multi-Human Parsing task, together with corresponding datasets, evaluation metrics and baseline methods.

Task Definition

  • Multi-Human Parsing refers to partitioning a crowd scene image into semantically consistent regions belonging to body parts or clothing items while differentiating different identities, such that each pixel in the image is assigned both a semantic part label and the identity it belongs to. Many higher-level applications can be built upon Multi-Human Parsing, such as virtual reality, automatic product recommendation, video surveillance and group behavior analysis.

Motivation

  • The Multi-Human Parsing project of the Learning and Vision (LV) Group, National University of Singapore (NUS), aims to push the frontiers of fine-grained visual understanding of humans in crowd scenes.

  • Multi-Human Parsing is significantly different from traditional well-defined object recognition tasks: object detection only provides coarse-level predictions of object locations (bounding boxes); instance segmentation only predicts instance-level masks without any detailed information on body parts and fashion categories; and human parsing performs category-level pixel-wise prediction without differentiating identities.

  • In real-world scenarios, settings with multiple interacting persons are more realistic and common. Thus a task, corresponding datasets and baseline methods that consider both the fine-grained semantic information of each individual person and the relationships and interactions of the whole group of people are highly desired.

Multi-Human Parsing (MHP) v1.0 Dataset

  • Statistics: The MHP v1.0 dataset contains 4,980 images, each with at least two persons (three on average). We randomly choose 980 images and their corresponding annotations as the testing set. The rest form a training set of 3,000 images and a validation set of 1,000 images. For each instance, 18 semantic categories are defined and annotated in addition to the "background" category, i.e. "hat", "hair", "sunglasses", "upper clothes", "skirt", "pants", "dress", "belt", "left shoe", "right shoe", "face", "left leg", "right leg", "left arm", "right arm", "bag", "scarf" and "torso skin". Each instance has a complete set of annotations whenever the corresponding category appears in the current image.

  • WeChat News.

  • Download: The MHP v1.0 dataset is available on Google Drive and Baidu Drive (password: cmtp).

  • Please refer to our MHP v1.0 paper (submitted to IJCV) for more details.

Multi-Human Parsing (MHP) v2.0 Dataset

  • Statistics: The MHP v2.0 dataset contains 25,403 images, each with at least two persons (three on average). We randomly choose 5,000 images and their corresponding annotations as the testing set. The rest form a training set of 15,403 images and a validation set of 5,000 images. For each instance, 58 semantic categories are defined and annotated in addition to the "background" category, i.e. "cap/hat", "helmet", "face", "hair", "left-arm", "right-arm", "left-hand", "right-hand", "protector", "bikini/bra", "jacket/windbreaker/hoodie", "t-shirt", "polo-shirt", "sweater", "singlet", "torso-skin", "pants", "shorts/swim-shorts", "skirt", "stockings", "socks", "left-boot", "right-boot", "left-shoe", "right-shoe", "left-highheel", "right-highheel", "left-sandal", "right-sandal", "left-leg", "right-leg", "left-foot", "right-foot", "coat", "dress", "robe", "jumpsuit", "other-full-body-clothes", "headwear", "backpack", "ball", "bats", "belt", "bottle", "carrybag", "cases", "sunglasses", "eyewear", "glove", "scarf", "umbrella", "wallet/purse", "watch", "wristband", "tie", "other-accessary", "other-upper-body-clothes" and "other-lower-body-clothes". Each instance has a complete set of annotations whenever the corresponding category appears in the current image. Moreover, 2D human poses with 16 dense key points ("right-shoulder", "right-elbow", "right-wrist", "left-shoulder", "left-elbow", "left-wrist", "right-hip", "right-knee", "right-ankle", "left-hip", "left-knee", "left-ankle", "head", "neck", "spine" and "pelvis"), each with a flag indicating whether it is visible (0), occluded (1) or out of image (2), together with head and instance bounding boxes, are also provided to facilitate Multi-Human Pose Estimation research (a minimal annotation-loading sketch follows this list).

  • Download: The MHP v2.0 dataset is available on Google Drive and Baidu Drive (password: uxrb).

  • Please refer to our MHP v2.0 paper (ACM MM'18 Best Student Paper) for more details.
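A minimal loading sketch for the v2.0 annotations, based on the file layout mentioned in the issues below (one parsing PNG per person under parsing_annos, one pose .mat per image under pose_annos with person_0, person_1, ... keys); the exact shape of the pose arrays is an assumption, so verify it against the dataset README.

    # sketch only: paths and array layout are taken from the issue reports below
    import numpy as np
    from PIL import Image
    from scipy.io import loadmat

    # one parsing map per person; pixel values are the semantic category ids
    parsing = np.array(Image.open("train/parsing_annos/1360_05_01.png"))
    if parsing.ndim == 3:                      # in case the PNG decodes as RGB
        parsing = parsing[:, :, 0]
    print("semantic ids present:", np.unique(parsing))

    # all poses for an image are stored in a single .mat, keyed per person
    pose = loadmat("train/pose_annos/1360.mat")
    person_0 = pose["person_0"]                # assumed: one row per key point (x, y, flag)
    print("pose annotation shape:", person_0.shape)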

Evaluation Metrics

  • Multi-Human Parsing: We use two human-centric metrics for multi-human parsing evaluation, which were first introduced in our MHP v1.0 paper: Average Precision based on part (APp) (%) and Percentage of Correctly parsed semantic Parts (PCP) (%). For the evaluation code, please refer to the "Evaluation" folder in our "Multi-Human-Parsing_MHP" repository (a rough illustration of the two metrics is sketched after this list).

  • Multi-Human Pose Estimation: Following MPII, we use the mAP (%) evaluation measure.
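The sketch below is only a rough NumPy illustration of the quantity underlying both parsing metrics: the per-part IoU between a predicted person mask and its matched ground-truth person mask. It is not the official evaluation code. APp additionally ranks predicted instances by confidence and computes a standard average precision, using the part-averaged IoU as the true-positive criterion, while PCP averages the per-person fraction of correctly parsed parts.

    # illustrative sketch only; masks are 2-D arrays of part ids, 0 = background
    import numpy as np

    def part_ious(pred, gt, num_parts):
        """IoU for every part id that appears in either mask."""
        ious = []
        for part in range(1, num_parts + 1):
            pm, gm = pred == part, gt == part
            if not pm.any() and not gm.any():
                continue
            inter = np.logical_and(pm, gm).sum()
            union = np.logical_or(pm, gm).sum()
            ious.append(inter / union)
        return np.asarray(ious)

    def pcp_for_person(pred, gt, num_parts, thr=0.5):
        """Fraction of the ground-truth person's annotated parts with IoU above thr."""
        correct = total = 0
        for part in range(1, num_parts + 1):
            gm = gt == part
            if not gm.any():
                continue
            pm = pred == part
            iou = np.logical_and(pm, gm).sum() / np.logical_or(pm, gm).sum()
            total += 1
            correct += iou > thr
        return correct / total if total else 0.0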

CVPR VUHCS2018 Workshop

  • We have organized the CVPR 2018 Workshop on Visual Understanding of Humans in Crowd Scene (VUHCS 2018), a collaboration between NUS, CMU and SYSU. Building on VUHCS 2017, we have further strengthened the workshop by augmenting it with five competition tracks: single-person human parsing, multi-person human parsing, single-person pose estimation, multi-person pose estimation and fine-grained multi-human parsing.

  • Result Submission & Leaderboard.

  • WeChat News.

Citation

  • Please consult and consider citing the following papers:

    @article{zhao2018understanding,
    title={Understanding Humans in Crowded Scenes: Deep Nested Adversarial Learning and A New Benchmark for Multi-Human Parsing},
    author={Zhao, Jian and Li, Jianshu and Cheng, Yu and Zhou, Li and Sim, Terence and Yan, Shuicheng and Feng, Jiashi},
    journal={arXiv preprint arXiv:1804.03287},
    year={2018}
    }
    
    
    @article{li2017towards,
    title={Multi-Human Parsing in the Wild},
    author={Li, Jianshu and Zhao, Jian and Wei, Yunchao and Lang, Congyan and Li, Yidong and Sim, Terence and Yan, Shuicheng and Feng, Jiashi},
    journal={arXiv preprint arXiv:1705.07206},
    year={2017}
    }
    

multi-human-parsing's People

Contributors

ddddwee1, sbrugman, zhaoj9014


multi-human-parsing's Issues

TypeError: Value passed to parameter 'labels' has DataType float32 not in list of allowed values: int32, int64

Running python train_step1.py gives the following error:

Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
Conv_bias: False
BN training: False
Conv_bias: True
Conv_bias: True
Traceback (most recent call last):
  File "train_step1.py", line 137, in <module>
    net = network()
  File "train_step1.py", line 25, in __init__
    self.build_loss(seg_layer,lab_holder)
  File "train_step1.py", line 45, in build_loss
    seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=seg_layer,labels=lab_reform))
  File "/home/skyuuka/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 2084, in sparse_softmax_cross_entropy_with_logits
    precise_logits, labels, name=name)
  File "/home/skyuuka/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 7515, in sparse_softmax_cross_entropy_with_logits
    labels=labels, name=name)
  File "/home/skyuuka/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 609, in _apply_op_helper
    param_name=input_name)
  File "/home/skyuuka/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 60, in _SatisfiesTypeConstraint
    ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'labels' has DataType float32 not in list of allowed values: int32, int64
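A minimal workaround for the dtype error itself, assuming the tensor passed as labels is meant to hold integer class indices, is to cast it before the loss call. Note, though, that in build_loss this tensor comes out of tf.image.resize_images (which returns floats), so the nearest-neighbour label resize sketched under the "train_step1: all pixels are classified as background (black)" issue further down is probably the more complete fix.

    # hypothetical quick fix: sparse_softmax_cross_entropy_with_logits requires
    # int32/int64 labels, so cast the (integer-valued) label map explicitly
    lab_reform = tf.cast(lab_reform, tf.int32)
    seg_loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=seg_layer, labels=lab_reform))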

Not able to download from Google Drive

First, thank you so much for open-sourcing such a useful project. I was trying to download the dataset, but I got the following error:

"Sorry, you can't view or download this file at this time.

Too many users have viewed or downloaded this file recently. Please try accessing the file again later. If the file you are trying to access is particularly large or is shared with many people, it may take up to 24 hours to be able to view or download the file. If you still can't access a file after 24 hours, contact your domain administrator."

Then I tried Baidu, but everything was in Chinese, so I wasn't able to understand it. I wonder if there is another way to download the dataset without getting this error. Thanks a lot again!

Will you upload an available train_e2e.py?

Dear Mr. Zhao,

As mentioned in the title, the script train_e2e.py has some syntax errors, such as missing commas on line 211 and inconsistent tabs and spaces on line 247. I need a human parsing tool for a downstream application, but these problems, as well as the missing adversarial loss mentioned in the paper, make me unconfident about the results and discourage me from training a model myself.

I am sincerely looking forward to your fixing the code and releasing a pre-trained model in the future.

Thank you.
Zhiyong

Add pretrained models

Please add pretrained models for use and maybe also for transfer learning. Thanks.

incorrect kpts order

The README says:

Moreover, 2D human poses with 16 dense key points ("right-shoulder", "right-elbow", "right-wrist", "left-shoulder", "left-elbow", "left-wrist", "right-hip", "right-knee", "right-ankle", "left-hip", "left-knee", "left-ankle", "head", "neck", "spine" and "pelvis". Each key point has a flag indicating whether it is visible-0/occluded-1/out-of-image-2)

but this is not the correct order if we visualize the points. In case someone is interested, the correct order is the following:
["right-ankle", "right-knee", "right-hip", "left-hip", "left-knee", "left-ankle", "pelvis", "spine", "neck", "head", "right-wrist", "right-elbow", "right-shoulder", "left-shoulder", "left-elbow", "left-wrist"]

It should also be mentioned that the visibility flags don't seem to be correct, because I see keypoints with negative coordinates whose flag is set to 0 (visible), so I manually set those to 2 (out of image).

I hope this info will be useful to someone.
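Building only on the observation above, here is a hypothetical NumPy helper that reorders a (16, 3) keypoint array (x, y, flag) from the visualized order back to the README order and marks negative-coordinate keypoints as out of image; the (16, 3) layout is an assumption about how the pose annotations are stored.

    import numpy as np

    README_ORDER = ["right-shoulder", "right-elbow", "right-wrist",
                    "left-shoulder", "left-elbow", "left-wrist",
                    "right-hip", "right-knee", "right-ankle",
                    "left-hip", "left-knee", "left-ankle",
                    "head", "neck", "spine", "pelvis"]
    OBSERVED_ORDER = ["right-ankle", "right-knee", "right-hip",
                      "left-hip", "left-knee", "left-ankle",
                      "pelvis", "spine", "neck", "head",
                      "right-wrist", "right-elbow", "right-shoulder",
                      "left-shoulder", "left-elbow", "left-wrist"]

    def to_readme_order(kpts):
        """kpts: (16, 3) array in the observed order -> same keypoints in README order."""
        idx = [OBSERVED_ORDER.index(name) for name in README_ORDER]
        kpts = np.asarray(kpts, dtype=np.float64)[idx]
        out_of_image = (kpts[:, 0] < 0) | (kpts[:, 1] < 0)
        kpts[out_of_image, 2] = 2      # force flag to out-of-image, as suggested above
        return kpts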

error

Hi,
I don't know what's wrong.
I use Python 2.7, TensorFlow 1.2 and the VOC2012 dataset.
2018-11-18 21:48:15.769563: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:15.769595: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:15.769600: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:15.769604: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:15.769608: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:16.205096: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-18 21:48:16.205568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.8475
pciBusID 0000:01:00.0
Total memory: 7.93GiB
Free memory: 7.12GiB
2018-11-18 21:48:16.205601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2018-11-18 21:48:16.205615: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2018-11-18 21:48:16.205632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
('loading from model:', u'./savings_bgfg/pretrain.ckpt')
Traceback (most recent call last):
File "train_step1.py", line 146, in
loss = net.train(img_batch,lab_batch)
File "train_step1.py", line 62, in train
ls,_ = self.sess.run([self.loss,self.train_op],feed_dict={self.inp_holder:img_batch, self.lab_holder:lab_batch})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [3364,2] and labels shape [423200]
[[Node: bg_fg/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"](bg_fg/SparseSoftmaxCrossEntropyWithLogits/Reshape, bg_fg/SparseSoftmaxCrossEntropyWithLogits/Reshape_1)]]
[[Node: bg_fg/Mean/_263 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_8656_bg_fg/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Caused by op u'bg_fg/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits', defined at:
File "train_step1.py", line 140, in
net = network()
File "train_step1.py", line 26, in init
self.build_loss(seg_layer,lab_holder)
File "train_step1.py", line 47, in build_loss
seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=lab_reform,logits=seg_layer))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1703, in sparse_softmax_cross_entropy_with_logits
precise_logits, labels, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 2486, in _sparse_softmax_cross_entropy_with_logits
features=features, labels=labels, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in init
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): logits and labels must have the same first dimension, got logits shape [3364,2] and labels shape [423200]
[[Node: bg_fg/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"](bg_fg/SparseSoftmaxCrossEntropyWithLogits/Reshape, bg_fg/SparseSoftmaxCrossEntropyWithLogits/Reshape_1)]]
[[Node: bg_fg/Mean/_263 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_8656_bg_fg/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Not found: Key bg_fg/SegLayer/conv_0/conv_0/bias not found in checkpoint

This error occurs when running 'output_sample.py'. I found that there are variable scopes when building the network in 'out_model.py', such as:

    with tf.variable_scope('SegLayer'):
        mod = M.Model(feature)
        mod.convLayer(3, 2, dilation_rate=dilation)

but after I run 'deploy_pretrain.py', no corresponding values or definitions are found in the checkpoint files when I check them with Netron: everything is WideRes, with no SegLayer, MergingLayer, or inst_layer.

Code mistake and class_num value

There seems to be a small mistake (a trailing comma) in the code when calling tf.nn.softmax at line 144 of /Multi-Human-Parsing_MHP/Nested_Adversarial_Networks/output_model.py:

    tf.nn.softmax(tf.image.resize_images(net_bgfg.seg_layer,tf.shape(img_holder)[1:3]),1)[:,:,:,1],
    ^ SyntaxError: invalid syntax

Also, what value of class_num should be passed from out_sample?

The label order does not match the README

I read the label map, but I find that the order of the labels does not match the description in the README; for example, pixel value 4 doesn't represent Hair but Singlet. Can you give me the right label order?

Connections between poses and parsings

Hi, thanks for the dataset.
It seems there is no connection between poses and parsings.
For example, train/parsing_annos/1360_05_01.png and person_0 in train/pose_annos/1360.mat represent different persons.
Could you provide connections between poses and parsings?

Also, the dataset on Google Drive seems to have been zipped on macOS. It can be unzipped without problems on macOS; however, Linux and Windows report that the zip file is corrupted.

Code for Evaluation

Hi, @ZhaoJ9014
The current code for multi-person parsing evaluation does not support the PCP metric.
I wonder whether the code will support PCP in the future.

About evaluation code

@ZhaoJ9014
Hi, Jian. Very nice job on the multi-human parsing dataset.
I'm here to ask whether there is any official evaluation code or online server for this task.

reform_img?

Where are /data/reform_img/ and /data/reform_anno/, shown here in train_step2.py?

mask = read_label(fname[0].replace('reform_img', 'reform_annot').replace('jpg', 'png'))

Pretrained models?

The previous issue for this #9 was closed, though I still don't see any link to download pretrained models. Any update on this, please?

MHP v2.0 dataset Baidu link is cancelled?

Firstly, thanks for your great work! I'm sorry to report that when I tried to download the MHP v2.0 dataset, I found that the Baidu link has been cancelled.
Unfortunately, I have no VPN to download it through Google Drive.
So, could someone share a link that can be accessed from China without a VPN?
Many thanks.

annotations errors

Thank you for your publication. However, there are many confusing annotations, such as instance bounding boxes with value [-1,-1,1], or a bottom-right coordinate smaller than the upper-left coordinate, for example in 17608.jpg and 8865.jpg. I think this may be a mistake.
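As a stopgap while such annotations remain unfixed, one can skip obviously malformed boxes before training or evaluation. The helper below is hypothetical and assumes boxes are stored as [x1, y1, x2, y2] in pixel coordinates.

    def is_valid_box(box, img_w, img_h):
        """Reject boxes with the wrong length, negative corners, or inverted extents."""
        if len(box) != 4:
            return False
        x1, y1, x2, y2 = box
        return 0 <= x1 < x2 <= img_w and 0 <= y1 < y2 <= img_h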

train_step1: all pixels are classified as background (black)

I made some small changes and now train step 1 can run, but all pixels are classified as background (black).
Has anyone had the same issue?

Initially, the build_loss function is:

    def build_loss(self,seg_layer,lab_holder):
        lab_reform = tf.expand_dims(lab_holder,-1)
        lab_reform = tf.image.resize_images(seg_layer,tf.shape(lab_reform)[1:3])
        lab_reform = tf.squeeze(lab_reform)
        seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=seg_layer,labels=lab_reform))

I changed it to:

    def build_loss(self,seg_layer,lab_holder):
        # lab_reform = tf.expand_dims(lab_holder, -1)
        # lab_reform = tf.image.resize_images(seg_layer, tf.shape(lab_reform)[1:3]) # z00445456
        seg_reform = tf.image.resize_images(seg_layer, tf.shape(lab_holder)[1:3]) # z00445456
        # lab_reform = tf.squeeze(lab_reform)
        seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=seg_reform,labels=lab_holder))

and it can run without error. But the result is weird: all pixels are classified as background.
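A hedged sketch of what build_loss probably intends, which would also explain the all-background result: downsample the integer label map (not the logits) to the logits' spatial size with nearest-neighbour interpolation, since bilinear resizing corrupts class ids, then cast it to int32 before the sparse cross-entropy. This is my reading of the code, not an official fix; storing the result on self.loss mirrors how train() uses it.

    # minimal sketch (TF 1.x), assuming lab_holder is an integer label map of
    # shape [N, H, W] and seg_layer holds logits of shape [N, h, w, C]
    def build_loss(self, seg_layer, lab_holder):
        lab_reform = tf.expand_dims(lab_holder, -1)
        lab_reform = tf.image.resize_images(
            lab_reform, tf.shape(seg_layer)[1:3],
            method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
        lab_reform = tf.cast(tf.squeeze(lab_reform, -1), tf.int32)
        self.loss = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(
                logits=seg_layer, labels=lab_reform))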

LV-MHP-v1 analysis

The LV-MHP-v1 README.txt says:
One image corresponds to multiple annotation files with the same prefix, one file per person. In each annotation file, the label values represent:

0:  background
1:  hat
2:  hair
3:  sunglass
4:  upper-clothes
5:  skirt
6:  pants
7:  dress
8:  belt 
9:  left-shoe
10: right-shoe
11: face
12: left-leg
13: right-leg
14: left-arm 
15: right-arm
16: bag
17: scarf
18: torso-skin

For example, the annotation for a photo containing a hat, hair and so on appears as an almost entirely black picture when opened directly. How can I parse it and obtain the detailed information it contains?
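A minimal inspection sketch, assuming each annotation file is a paletted/grayscale PNG whose pixel values are the class ids listed above (0–18): such small ids render as nearly black when the file is viewed directly, so listing the unique values and colour-mapping them makes the parts visible. The file name and directory are only hypothetical examples.

    import numpy as np
    from PIL import Image

    anno = np.array(Image.open("LV-MHP-v1/annotations/1_01_01.png"))   # hypothetical file name
    if anno.ndim == 3:                  # in case the PNG decodes as RGB, keep one channel
        anno = anno[:, :, 0]
    print("labels present:", np.unique(anno))

    # map each class id (0-18) to a distinct colour for visualisation
    palette = np.random.RandomState(0).randint(0, 256, (19, 3)).astype(np.uint8)
    palette[0] = 0                      # keep background black
    Image.fromarray(palette[anno]).save("1_01_01_vis.png")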

output_sample.py

TypeError: __init__() missing 1 required positional argument: 'class_num'

Problem with net = network()

File "/Users/rida/Documents/Virtual Retail/External Repos/Multi-Human-Parsing/Nested_Adversarial_Networks/train_e2e.py", line 437, in
net = network()
TypeError: init() takes exactly 2 arguments (1 given)

[question] Compatibility with Python 2.7 or 3.6?

Which version is supported?

I faced an issue in train_step1.py where my input image JPEG path was actually two paths in a single string, which failed with:

Reading data...
Data file= train.list
f= <_io.TextIOWrapper name='train.list' mode='r' encoding='UTF-8'>
i= VOC2012/JPEGImages/2007_000032.jpg	VOC2012/SegmentationClass/2007_000032.png

i= VOC2012/JPEGImages/2007_000032.jpg	VOC2012/SegmentationClass/2007_000032.png
rest= VOC2012/JPEGImages/2007_000032.jpg
jpgfile= VOC2012/JPEGImages/2007_000032.jpg	VOC2012/JPEGImages/2007_000032.jpg
Image path:  VOC2012/JPEGImages/2007_000032.jpg	VOC2012/JPEGImages/2007_000032.jpg
Traceback (most recent call last):
  File "train_step1.py", line 142, in <module>
    reader = data_provider('train.list')
  File "train_step1.py", line 98, in __init__
    jpg = img_reader.read_img(jpgfile,500,padding=True)
  File "/home/pegasus/proj/human-parts-parsing/Multi-Human-Parsing_MHP/Nested_Adversarial_Networks/img_reader.py", line 26, in read_img
    raw_im = np.array(Image.open(im_path).convert('RGB'), np.uint8)
  File "/home/pegasus/anaconda3/envs/mhp_nan/lib/python3.6/site-packages/PIL/Image.py", line 2609, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'VOC2012/JPEGImages/2007_000032.jpg\tVOC2012/JPEGImages/2007_000032.jpg'

Now I removed the part "\tVOC2012/JPEGImages/2007_000032.jpg", which results in:

Reading data...
Data file= train.list
f= <_io.TextIOWrapper name='train.list' mode='r' encoding='UTF-8'>
i= VOC2012/JPEGImages/2007_000032.jpg	VOC2012/SegmentationClass/2007_000032.png

i= VOC2012/JPEGImages/2007_000032.jpg	VOC2012/SegmentationClass/2007_000032.png
rest= VOC2012/JPEGImages/2007_000032.jpg
jpgfile= VOC2012/JPEGImages/2007_000032.jpg
Image path:  VOC2012/JPEGImages/2007_000032.jpg
Traceback (most recent call last):
  File "train_step1.py", line 142, in <module>
    reader = data_provider('train.list')
  File "train_step1.py", line 100, in __init__
    seg = img_reader.pad(seg,np.uint8,False)
  File "/home/pegasus/proj/human-parts-parsing/Multi-Human-Parsing_MHP/Nested_Adversarial_Networks/img_reader.py", line 43, in pad
    res[start_point:start_point+a] = img
ValueError: could not broadcast input array from shape (281,500,3) into shape (281,500)

I use Python 3, so not the PIL package (which only supports 2.7) but Pillow, which could be causing this issue.
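For what it's worth, a hypothetical workaround for the doubled path (rather than editing train.list by hand) is to split each line on its tab separator before passing the paths on; the snippet below only demonstrates the parsing, not the rest of data_provider.

    # each train.list line appears to be "<image path>\t<label path>"
    with open("train.list") as f:
        pairs = [line.rstrip("\n").split("\t") for line in f if line.strip()]

    for jpg_path, png_path in pairs:
        print(jpg_path, "->", png_path)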

When will the rework finish?

The network class in the old code does not set default values, so there will be problems with training. Will the NAN rework come out soon?
I have implemented the first- and second-stage networks at https://github.com/Windaway/Pytorch-Multi-Human-Parsing. But does the third-stage net have two versions? There are an RPN and a GCN net in the arXiv paper.
The NAN rework defines another version; which one is the baseline method?

output_sample.py not working

output_sample.py doesn't seem to work. class_num is missing at first, and even if a number is given here, loading the pretrained model will fail.

loading from model: ./savings_bgfg/pretrain.ckpt
2019-03-07 18:40:09.127626: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key bg_fg/SegLayer/conv_0/conv_0/bias not found in checkpoint

FileNotFoundError

When I run train_step1.py, the error info shows:
FileNotFoundError: [Errno 2] No such file or directory: 'VOC2012/JPEGImages/2012_001955.jpg VOC2012/JPEGImages/2012_001955.jpg'

Training your model with MHPv2

Hello.

I'm trying to reproduce your model using the provided dataset, but I can't find in your code how to properly adapt the parsing annotations from MHP v2 for the Nested Adversarial Networks, as shown in your paper, especially given that each image has multiple parsing images that must be loaded for training.
How do I use your dataset in your model?

Thanks beforehand

dataset worked

But I still lack well-trained NAN / MH-Parser models for comparison.
Does anyone have them and can share with me?

Error when unzipping MHP v2 dataset

There are no errors while downloading the dataset; however, it says "the data is damaged" when I unzip it on Windows (the error occurs whether I download from Google Drive or BaiduYun).
As a result, the folder /train/images is empty.
Is there anything wrong?
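One thing that may help, sketched below: extract the archive with Python's zipfile module instead of the Windows built-in extractor, which sometimes rejects archives created on macOS. The archive file name is an assumption, and this will not help if the download itself is truncated.

    import zipfile

    with zipfile.ZipFile("LV-MHP-v2.zip") as zf:
        print("first corrupt member:", zf.testzip())   # None means all CRCs check out
        zf.extractall("LV-MHP-v2")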

quality of the GTs

Hi, thanks for this great dataset.

I found that some GTs are quite noisy. Were there any bugs while encoding the GTs? Some of the noise looks quite buggy.

Could not find the discriminators (the adversarial losses)

As written in the paper, the NAN model is composed of three sub-nets, each trained with a prediction loss (compared with ground-truth labels) and an adversarial loss. However, I could not find the adversarial losses in the code. Are they not included, or are they somewhere that I missed? BTW, thanks for sharing the code.

error on runing train_step1.py

Can you help me analyze this error?

Reading data...
Traceback (most recent call last):
File "train_step1.py", line 134, in
reader = data_provider('train.list')
File "train_step1.py", line 92, in init
seg = img_reader.pad(seg,np.uint8,False)
File "/Multi-Human-Parsing/Nested_Adversarial_Networks/img_reader.py", line 42, in pad
res[start_point:start_point+a] = img
ValueError: could not broadcast input array from shape (281,500,3) into shape (281,500)
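The shapes in the error suggest the segmentation PNG was decoded as 3-channel RGB while the padding buffer is 2-D. A minimal workaround, assuming the labels are VOC-style paletted PNGs of class ids, is to read them without converting to RGB:

    import numpy as np
    from PIL import Image

    def read_label(im_path):
        # do NOT .convert('RGB') here: paletted annotation PNGs should stay a
        # 2-D (H, W) map of class ids rather than become an (H, W, 3) array
        return np.array(Image.open(im_path), np.uint8)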
