victordibia / handtracking

Building a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow

Home Page: https://medium.com/@victor.dibia/how-to-build-a-real-time-hand-detector-using-neural-networks-ssd-on-tensorflow-d6bac0e4b2ce

License: MIT License

Python 88.97% Starlark 11.03%
tensorflow hand-detector detector hand-detection neural-network computer-vision ssd

handtracking's Issues

How did you get your mAP?

Excuse me.
I want to know how you got your mAP.
I evaluated my model with object_detection/eval.py, but I can't get the value.
I hope you can tell me. Thank you.
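
For readers wondering what the number means: below is a minimal sketch of how average precision at a fixed IoU threshold is commonly computed for one class. With a single "hand" class, mAP equals this AP. It is not the Object Detection API's eval.py and not the author's method; the function names, the (xmin, ymin, xmax, ymax) box layout, and the 0.5 threshold are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one class.

    detections: list of (image_id, score, box)
    ground_truth: dict mapping image_id -> list of boxes
    """
    total_gt = sum(len(boxes) for boxes in ground_truth.values())
    if total_gt == 0:
        return 0.0

    # Each ground-truth box may be matched by at most one detection.
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}
    tp_flags = []
    for image_id, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt_box in enumerate(ground_truth.get(image_id, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = (best_idx >= 0 and best_iou >= iou_threshold
                 and not matched[image_id][best_idx])
        if is_tp:
            matched[image_id][best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each detection, in descending score order.
    precisions, recalls, tp = [], [], 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += int(is_tp)
        precisions.append(tp / rank)
        recalls.append(tp / total_gt)

    # Make precision non-increasing in recall, then integrate the curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap
```

Evaluation tools differ in details such as the interpolation scheme and how duplicate detections are treated, so values from this sketch may not match a given tool exactly.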

Network is not learning. mAP stays the same and Total Loss does not go below 6

Hey Victor, I had the same idea of building a real-time hand detector.
At first I also tried the Oxford dataset, but as you said, it is not really good.
Then I saw your repo and tried the egohands dataset, which looks really promising.

But with both datasets my mAP always stays at 0 and the total loss never goes below ~6.
If you look at the weights, you can see that they stay around zero, so there is no learning at all.

Do you have an idea why this could be?

Here are screenshots of my current training run, which uses exactly the same config as yours.
(By the way, why do you use a batch size of only 6? Does this have any benefit other than saving memory?)
screenshot from 2018-01-19 13-21-37
screenshot from 2018-01-19 13-21-11

I would really appreciate it if you could help me with this.

And yeah, your egohands_dataset_clean.py code is really nice 👍

EDIT: Are you sure the csv files that you create contain the correct bounding boxes? Could you share the csv files you used to create the tf record files? Thanks!

Polygons are incorrectly used for BoundingBox calculation

Hey Victor,

I decided to open a new issue for this.

I changed the cv2.waitKey() delay to 1000 to see each frame with its polygons and bounding boxes.

It appears that the polygons do not match the image frames they are assigned to.

Here are some screenshots that visualize the problem pretty well:
screenshot from 2018-01-23 17-27-11
screenshot from 2018-01-23 17-26-51
screenshot from 2018-01-23 17-26-45

The network won't be able to learn anything from that data.

Maybe this could be solved by sorting the images correctly before feeding them into the rename_files() function. See this thread: https://stackoverflow.com/questions/13122005/files-from-directory-being-pulled-in-wrong-order-with-python

Best,
Gustav
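
The sorting fix suggested above could look roughly like the sketch below. Directory listings in Python come back in arbitrary order, so the file names need an explicit sort before they are paired with annotations. The helper names and the `frame_N.jpg` naming pattern are assumptions for illustration, not code from the repository.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so 'frame_10' sorts after 'frame_2'."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r"(\d+)", name)]


def sorted_image_names(names):
    """Return file names in natural (human) order."""
    return sorted(names, key=natural_key)
```

A typical call would be `sorted_image_names(os.listdir(folder))` before the list is handed to the renaming step. If the frame numbers are zero-padded to a fixed width, a plain `sorted()` gives the same order.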

Accuracy decreased after re-freezing the model with tf-nightly 1.13

Hi Victor,

I used export_inference_graph.py from the newest version of the TensorFlow Object Detection API to generate frozen_inference_graph.pb (TensorFlow version: tf-nightly 1.13..., Ubuntu 18.04).
But the accuracy was very bad with both ssd_mobile_v1 and ssdlite_mobile_v2.
The confidences were inaccurate and the bounding boxes were not where they should be.
(A few boxes were in the right place, but their sizes were not accurate, and some of the incorrect boxes had higher confidences than the real hand boxes.)
The results were worse than those produced by the frozen_inference_graph.pb in the hand_inference_graph folder.

Can you please help me analyse the possible problems?
One weird thing is that running export_inference_graph.py to generate the frozen pb file also generated a group of other files, including pipeline.config, model.ckpt.data-00000-of-00001, and its index and meta files, which are different from the original .config and .ckpt files.

Comparison with YOLO on the same dataset

Hi Victor and others,

Has anyone trained a recent YOLO (v1, v2 or v3) on the egohands dataset who could provide a basic comparison in terms of speed and accuracy?

Thanks in advance,
shreeraman

Help Needed to convert the frozen graph to Tensorflow lite model

Hi @victordibia,
I am currently trying to convert the frozen inference graph in the repo to .tflite so I can integrate it into an Android application. However, I am running into errors when I do that.

Below is the script I am using to convert the frozen graph to tflite, followed by the error log. It would be great if you could share some insights.

Thanks,
Suraj

Code
`
import tensorflow as tf

localpb = 'frozen_inference_graph.pb'
tflite_file = 'hand_detector.lite'

print("{} -> {}".format(localpb, tflite_file))

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    # converter = tf.contrib.lite.TocoConverter.from_frozen_graph(
    localpb,
    ['Const'],           # input_arrays
    ['num_detections'])  # output_arrays

tflite_model = converter.convert()
`

Error Log

`frozen_inference_graph.pb -> hand_detector.lite
2019-07-18 14:50:50.053195: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-18 14:50:50.077006: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3192000000 Hz
2019-07-18 14:50:50.077666: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4b47c90 executing computations on platform Host. Devices:
2019-07-18 14:50:50.077680: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
2019-07-18 14:50:50.515219: I tensorflow/core/grappler/devices.cc:60] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA support)
2019-07-18 14:50:50.515312: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2019-07-18 14:50:50.782857: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: graph_to_optimize
2019-07-18 14:50:50.782885: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] constant folding: Graph size after: 978 nodes (-1201), 1257 edges (-1368), time = 192.627ms.
2019-07-18 14:50:50.782890: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] constant folding: Graph size after: 978 nodes (0), 1257 edges (0), time = 33.785ms.
Traceback (most recent call last):
File "HandTfliteConverter.py", line 14, in
tflite_model = converter.convert()
File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/lite/python/lite.py", line 983, in convert
**converter_kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/lite/python/convert.py", line 437, in toco_convert_impl
enable_mlir_converter=enable_mlir_converter)
File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/lite/python/convert.py", line 188, in toco_convert_protos
raise ConverterError("See console for info.\n%s\n%s\n" % (stdout, stderr))
tensorflow.lite.python.convert.ConverterError: See console for info.
2019-07-18 14:50:52.037299: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037338: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037351: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037359: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037402: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.037412: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.037419: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.037425: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.037432: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.037437: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.037443: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037451: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037457: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.037462: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037468: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037473: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.037478: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037484: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037489: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.037496: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayScatterV3
2019-07-18 14:50:52.037507: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.037513: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: LoopCond
2019-07-18 14:50:52.037528: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Exit
2019-07-18 14:50:52.037536: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Exit
2019-07-18 14:50:52.037553: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayReadV3
2019-07-18 14:50:52.037560: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArraySizeV3
2019-07-18 14:50:52.037566: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArraySizeV3
2019-07-18 14:50:52.037573: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayWriteV3
2019-07-18 14:50:52.037592: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayGatherV3
2019-07-18 14:50:52.037600: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayGatherV3
2019-07-18 14:50:52.037612: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayWriteV3
2019-07-18 14:50:52.038263: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.038274: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038282: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.038288: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038294: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.038300: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038306: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.038311: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038317: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.038323: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038329: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.038335: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038341: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.038346: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038352: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayV3
2019-07-18 14:50:52.038358: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038364: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038374: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038379: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038385: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayScatterV3
2019-07-18 14:50:52.038392: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038397: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038403: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayScatterV3
2019-07-18 14:50:52.038409: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038431: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038437: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038442: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038448: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038455: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038460: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038466: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038473: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038478: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038484: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038491: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038496: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038502: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038509: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038514: I tensorflow/lite/toco/import_tensorflow.cc:193] Unsupported data type in placeholder op: 20
2019-07-18 14:50:52.038522: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayScatterV3
2019-07-18 14:50:52.038544: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038551: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038564: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038572: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: LoopCond
2019-07-18 14:50:52.038594: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Exit
2019-07-18 14:50:52.038797: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayReadV3
2019-07-18 14:50:52.038806: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayReadV3
2019-07-18 14:50:52.038813: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayReadV3
2019-07-18 14:50:52.038819: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArraySizeV3
2019-07-18 14:50:52.038826: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayScatterV3
2019-07-18 14:50:52.038851: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Enter
2019-07-18 14:50:52.038860: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayGatherV3
2019-07-18 14:50:52.038868: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayReadV3
2019-07-18 14:50:52.038901: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: NonMaxSuppressionV3
2019-07-18 14:50:52.038918: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: Size
2019-07-18 14:50:52.039017: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayWriteV3
2019-07-18 14:50:52.039091: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayWriteV3
2019-07-18 14:50:52.039107: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayWriteV3
2019-07-18 14:50:52.039114: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: TensorArrayWriteV3
2019-07-18 14:50:52.039127: F tensorflow/lite/toco/tooling_util.cc:1462] Check failed: batch == 1 (2 vs. 1)
Fatal Python error: Aborted

Current thread 0x00007fefa7cb0700 (most recent call first):
File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/lite/toco/python/toco_from_protos.py", line 52 in execute
File "/home/nxgdl/.local/lib/python3.5/site-packages/absl/app.py", line 251 in _run_main
File "/home/nxgdl/.local/lib/python3.5/site-packages/absl/app.py", line 300 in run
File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/platform/app.py", line 40 in run
File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/lite/toco/python/toco_from_protos.py", line 89 in main
File "/usr/local/bin/toco_from_protos", line 11 in
Aborted (core dumped)
`

frozen window

Hi, I get the error below, and the display window freezes as soon as the program starts.

ResourceExhaustedError (see above for traceback): OOM when allocating tensor of shape [1024] and type float
[[Node: FeatureExtractor/MobilenetV1/Conv2d_13_pointwise/BatchNorm/gamma = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [1024] values: 6.0167079 5.9263978 6.43220091...>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

loading frozen model for worker
====== loading HAND frozen graph into memory
2019-02-22 16:47:36.197699: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-02-22 16:47:36.261777: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-02-22 16:47:36.262137: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1060 with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.48
pciBusID: 0000:01:00.0
totalMemory: 5.94GiB freeMemory: 44.25MiB
2019-02-22 16:47:36.262154: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-02-22 16:47:36.262774: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 44.25M (46399488 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2019-02-22 16:47:36.263103: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 39.83M (41759744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
====== Hand Inference graph loaded.
2019-02-22 16:47:36.269881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)

How to train?

Could you point us to how to train on our own images? The article and README only seem to cover inference, not training. How can we perform the transfer learning and retrain with our own images like you did?
Thanks!

tensorflow error

ValueError: No op named NonMaxSuppressionV2 in defined operations.

How to load this model by java?

I just tested handtrack.js and got roughly 5~8 fps on my computer. I suspect the performance would improve if I built a Java server on a node with strong compute power to process the canvas image frames.
Unfortunately, I am not very familiar with the TensorFlow Java API. Could you show me how to load the pb file and manifest.json in Java?
Thanks so much

using scripts/retrain.py how to proceed?

I have a naive question. I followed the Google Collab tutorial for transfer learning on images of 5 categories of flowers. However, how can I do the same for the images of hands from egohands? What are the changes I should perform?

mona@Mona:~/code/handpose/handtracking$ IMAGE_SIZE=224
mona@Mona:~/code/handpose/handtracking$ ARCHITECTURE="mobilenet_0.50_${IMAGE_SIZE}"
mona@Mona:~/code/handpose/handtracking$ python -m scripts.retrain   --bottleneck_dir=tf_files/bottlenecks   --how_many_training_steps=500   --model_dir=tf_files/models/   --summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}"   --output_graph=tf_files/retrained_graph.pb   --output_labels=tf_files/retrained_labels.txt   --architecture="${ARCHITECTURE}"   --image_dir=images/train
/home/mona/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
>> Downloading mobilenet_v1_0.50_224_frozen.tgz 100.1%
--- Logging error ---
Traceback (most recent call last):
  File "/home/mona/anaconda3/lib/python3.6/logging/__init__.py", line 992, in emit
    msg = self.format(record)
  File "/home/mona/anaconda3/lib/python3.6/logging/__init__.py", line 838, in format
    return fmt.format(record)
  File "/home/mona/anaconda3/lib/python3.6/logging/__init__.py", line 575, in format
    record.message = record.getMessage()
  File "/home/mona/anaconda3/lib/python3.6/logging/__init__.py", line 338, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "/home/mona/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/mona/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mona/code/handpose/handtracking/scripts/retrain.py", line 1326, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/mona/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/mona/code/handpose/handtracking/scripts/retrain.py", line 982, in main
    maybe_download_and_extract(model_info['data_url'])
  File "/home/mona/code/handpose/handtracking/scripts/retrain.py", line 339, in maybe_download_and_extract
    'bytes.')
  File "/home/mona/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/tf_logging.py", line 116, in info
    _get_logger().info(msg, *args, **kwargs)
Message: 'Successfully downloaded'
Arguments: ('mobilenet_v1_0.50_224_frozen.tgz', 6308169, 'bytes.')
ERROR:tensorflow:No valid folders of images found at images/train

Judging by other issues here, I am not the only one who needs help with this. Some guidance would be really appreciated.

Keypoints regression

Hi Victor,

Can we add an additional feature, such as hand keypoint regression, when detecting the hand? If so, how? I want to do hand pose detection. Thanks.

Problem with egohands_dataset_clean.py

When I run the script "egohands_dataset_clean.py", it should give me two folders (train and test) containing the generated CSV files, right? But the script only downloads egohands_data.zip and extracts it. How can I fix this?
P.S.: I'm not a Python programmer.
P.P.S.: Sorry for the bad English.

Is there a problem in egohands_dataset_clean.py?

In the function get_bbox_visualize(base_path_dir), you use the image_path_array list to store the image paths, but you don't sort it from smallest to largest. The entries in polygons.mat are ordered by image label from smallest to largest, so the two may not line up. Is that right?
Finally, thank you for sharing!

Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'T' not in Op

My system environment is:
Python: Anaconda3
Tensorflow-gpu : 1.4.0
GPU: nvidia gtx 1070

When I run the project, I get this error:

====== loading HAND frozen graph into memory
2017-12-06 11:59:38.773044: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feat
ure_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2017-12-06 11:59:39.066517: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gp
u\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7715
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.64GiB
2017-12-06 11:59:39.066767: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gp
u\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000
:01:00.0, compute capability: 6.1)
====== Hand Inference graph loaded.
2017-12-06 11:59:41.530536: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\ex
ecutor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'T' not in Op<name=Where; signa
ture=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppr
ession/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/
BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether you
r GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/
Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSupp
ression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]
Traceback (most recent call last):
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1323, in _do_call
return fn(*args)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1302, in _run_fn
status, run_metadata)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=
input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppressio
n/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/Batch
MultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your Gra
phDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/
Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSupp
ression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "detect_single_threaded.py", line 52, in
image_np, detection_graph, sess)
File "D:\PythonProjects\handtracking\utils\detector_utils.py", line 90, in detect_objects
feed_dict={image_tensor: image_np_expanded})
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 889, in run
run_metadata_ptr)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _do_run
options, run_metadata)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=
input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppressio
n/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/Batch
MultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your Gra
phDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/
Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSupp
ression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

Caused by op 'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Whe
re', defined at:
File "detect_single_threaded.py", line 8, in
detection_graph, sess = detector_utils.load_inference_graph()
File "D:\PythonProjects\handtracking\utils\detector_utils.py", line 45, in load_inference_graph
tf.import_graph_def(od_graph_def, name='')
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\importer.py", line 313, in import_graph_def
op_def=op_def)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2956, in create_op
op_def=op_def)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

Could anyone give me a solution? Thanks!

Find x and y position

I would like to get the position of the centre of the hand. Is there an easy way to do this from your code?

Thanks for the code! Great project
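
One possible way to get the centre, sketched under the assumption that the detector returns boxes in the TensorFlow Object Detection API convention (normalized `[ymin, xmin, ymax, xmax]`) together with a score per box. The helper names `box_center` and `hand_centers` are hypothetical and not part of the repo:

```python
def box_center(box, im_width, im_height):
    """Return the (x, y) pixel centre of one box.

    Assumes `box` is normalized [ymin, xmin, ymax, xmax] with values in 0..1.
    """
    ymin, xmin, ymax, xmax = box
    cx = (xmin + xmax) / 2.0 * im_width
    cy = (ymin + ymax) / 2.0 * im_height
    return int(round(cx)), int(round(cy))


def hand_centers(boxes, scores, im_width, im_height, score_thresh=0.2):
    """Return the centres of all boxes whose score exceeds `score_thresh`."""
    return [
        box_center(box, im_width, im_height)
        for box, score in zip(boxes, scores)
        if score > score_thresh
    ]
```

Calling `hand_centers(boxes, scores, im_width, im_height)` once per frame would give one `(x, y)` pixel pair per confident detection.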

Oxford Dataset

Hey Victor,

Do you have a script to convert the .mat files of the Oxford dataset to .xml or .csv files?
I have one, but I am not sure the final annotations I get are correct, so I would like to cross-check them against your solution.

Just sending a note

Hello Dibia,

I have created a project based on your repo and wanted to let you know about it.

It would be my pleasure if you added my project to the list. 😀

Here is the link to my project: Gesture Recognition

Do I have to modify detector_utils.py?

Hi, I'm trying to use your handtracking model, and I'm a beginner with TensorFlow.
I started following your repo from the 'Using Hand Detector' section and executed detector_utils.py:
python3 detector_utils.py
I got this error:
Traceback (most recent call last):
  File "detector_utils.py", line 10, in <module>
    from utils import label_map_util
ModuleNotFoundError: No module named 'utils'
I ran this code in
~/venv/lib/python3.6/site-packages/tensorflow/models/research/handtracking-master/utils
I have already installed Object_detection.
Do I have to modify the sys.path.append("..") part in detector_utils.py?
If so, what path should go there?

I have googled this for a long time but still have the problem.
How can I solve it?

run detect_*.py errors: Invalid argument: NodeDef mentions attr 'T' not in Op<name=Where

When I run the command python3 detect_single_threaded.py --source ~/Documents/test.mp4
I get these errors:

> ====== loading HAND frozen graph into memory
2018-03-26 22:47:54.129357: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-03-26 22:47:54.278568: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-03-26 22:47:54.278914: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce MX150 major: 6 minor: 1 memoryClockRate(GHz): 1.5315
pciBusID: 0000:01:00.0
totalMemory: 1.95GiB freeMemory: 1.57GiB
2018-03-26 22:47:54.278933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce MX150, pci bus id: 0000:01:00.0, compute capability: 6.1)
>  ====== Hand Inference graph loaded.
2018-03-26 22:47:54.651759: E tensorflow/core/common_runtime/executor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
	 [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1323, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
    status, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
	 [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "detect_single_threaded.py", line 52, in <module>
    image_np, detection_graph, sess)
  File "/home/rosrobot/git/handtracking/utils/detector_utils.py", line 90, in detect_objects
    feed_dict={image_tensor: image_np_expanded})
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
	 [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

Caused by op 'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where', defined at:
  File "detect_single_threaded.py", line 8, in <module>
    detection_graph, sess = detector_utils.load_inference_graph()
  File "/home/rosrobot/git/handtracking/utils/detector_utils.py", line 45, in load_inference_graph
    tf.import_graph_def(od_graph_def, name='')
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 313, in import_graph_def
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
	 [[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

It's hard to get this project running. Can anyone help me solve this problem?

Share the CSV files to retrain the model in Caffe

I'm trying to retrain on the data to produce a caffemodel, but I'm having issues installing the dependencies needed to run the egohands_dataset_clean.py script. Could you please share the train_labels.csv and test_labels.csv files you have, so I can avoid having to run the script?

Thanks

config file

Hi, thanks for your code!
I am trying to train a model on the EgoHands dataset, and my config file follows this one, but I cannot reproduce your results in TensorBoard. Which config file did you use? Can you share it? Thank you.

TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Hi, I am getting the error below. My TensorFlow version is 1.4.0-rc0. How can I solve this? Please help me.

Traceback (most recent call last):
File "detect_single_threaded.py", line 52, in <module>
image_np, detection_graph, sess)
File "/home/khin/models/research/object_detection/utils/detector_utils.py", line 90, in detect_objects
feed_dict={image_tensor: image_np_expanded})
File "/home/khin/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/home/khin/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1089, in _run
np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
File "/home/khin/anaconda3/lib/python3.6/site-packages/numpy/core/numeric.py", line 492, in asarray
return array(a, dtype, copy=False, order=order)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Training commands

Hi Victor,

Thank you for putting this project so that others can learn from it.

I am trying to understand how you trained the model.
What commands would you use to train it?

Thank you again,
Alex

speed issue, 1FPS

It takes 0.7-0.8 s per image, which amounts to roughly 1 FPS and looks quite slow to me. What baffles me is that I get this speed with both the single-threaded and the multi-threaded version. I am using a Core i7-5500U CPU at 2.4 GHz with 8 GB RAM. Is this behavior normal? If not, what can I do to improve it?

How to compute mAP

I want to know how you compute the mAP and get it to show in TensorBoard. I would be glad to receive a reply from you. Thank you.

TypeError: Descriptors should not be created directly, but only retrieved from their parent.

I am using TensorFlow 1.9.0 and I get an error when executing the command below in the Anaconda prompt:
python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config

Here's the error:
Traceback (most recent call last):
File "train.py", line 44, in <module>
options=None),
File "C:\Users\keshav\Anaconda3\envs\neuralnets\lib\site-packages\google\protobuf\descriptor.py", line 530, in __new__
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors should not be created directly, but only retrieved from their parent.

Any suggestions for improving detection across different viewpoints with additional data?

Hi there! Thanks for sharing this. I want to detect hands from both egocentric and non-egocentric viewpoints, so I plan to add new data to the existing EgoHands dataset before training. Is there anything I have to take into consideration? For example, should the newly added images be the same size as the images in the EgoHands dataset? Should the number of newly added non-egocentric images be comparable to the number in the EgoHands dataset? Are there other important factors?

Why are some labels not correct?

image

Lots of samples do not seem correct at all, but some are right. Why is that?

image

The strange thing is that, in the same picture, one hand is labeled correctly and the other is not.

Low fps

I ran your code but I get at most 3 FPS. How do I increase the FPS?

Running detect_multi_threaded.py fails with KeyError: u'TensorArrayV3'. How to resolve it?

Hello guys,
When I run this project I get some errors. My environment:
ubuntu16.04
cuda8.0
tensorflow: tensorflow-0.11.0rc1-cp27-none-linux_x86_64.whl
python 2.7.12

KeyError: u'TensorArrayV3'
Process PoolWorker-648:
Process PoolWorker-647:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 97, in worker
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 97, in worker
    initializer(*initargs)
  File "detect_multi_threaded.py", line 21, in worker
    initializer(*initargs)
  File "detect_multi_threaded.py", line 21, in worker
    detection_graph, sess = detector_utils.load_inference_graph()
    detection_graph, sess = detector_utils.load_inference_graph()
  File "/home/rosrobot/git/handtracking/utils/detector_utils.py", line 45, in load_inference_graph
  File "/home/rosrobot/git/handtracking/utils/detector_utils.py", line 45, in load_inference_graph
    tf.import_graph_def(od_graph_def, name='')
    tf.import_graph_def(od_graph_def, name='')
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py", line 258, in import_graph_def
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py", line 258, in import_graph_def
    op_def = op_dict[node.op]
    op_def = op_dict[node.op]

Speed on GeForce GTX 1080

Hi

I am running on a GeForce GTX 1080 (notebook GPU). I get a stable 15 FPS with image display and about 30 FPS without display. I have a feeling that performance should be better with my graphics card. Does anyone have any comments on this? I have another object detector trained with YOLO, and there too the speeds I achieve are lower than widely reported.

Any help would be greatly appreciated.

thanks
shreeraman

I can't identify the hand in the picture

I was very happy to read the post on how to implement real-time hand detection with an SSD neural network. When I run the program on Ubuntu 16.04 LTS, I hit this problem:
2018-02-08 14:34:22.983867: E tensorflow/core/common_runtime/executor.cc:643] Executor failed to create kernel. Invalid argument: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1323, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
status, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/sun/untitled/hhand_rrecognition.py", line 93, in <module>
boxes,scores=detect_objects(image_np,detection_graph,sess)
File "/home/sun/untitled/hhand_rrecognition.py", line 81, in detect_objects
feed_dict={image_tensor: image_np_expanded})
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]

Caused by op 'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where', defined at:
File "/home/sun/untitled/hhand_rrecognition.py", line 92, in <module>
detection_graph,sess=load_inference_graph()
File "/home/sun/untitled/hhand_rrecognition.py", line 48, in load_inference_graph
tf.import_graph_def(od_graph_def, name='')
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 313, in import_graph_def
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
[[Node: Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where = Where[T=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/FilterGreaterThan/Where/Cast)]]
My TensorFlow version is 1.4.0rc0 and my GPU is a GeForce GTX 1050. Can you give me some advice?

Training slow with GTX 1070

Hi, thank you for the tutorial!

I'm having an issue training my own model. I'm using a GTX 1070 but only get a training speed of 3.5-4.5 s/step. I think this is quite slow compared to your reference with a GPU. Is this normal?

My setup:

OS: Windows 10
TF version: 1.5.0
Config base: ssd_mobilenet_v1_pets.config
Existing model for transfer learning: ssd_mobilenet_v1_coco_2017_11_17
CPU: E3 1231 v3 16G
GPU: GTX 1070 8G

Thank you!

Set up model

Hi!
Could you also release the scripts for building the initial model?
Actually, I have no clue how to make a tensorflow model from the ssd_mobilenet_v1_coco.config.

What I have so far reads all trainable variables from the model.ckpt file:

import tensorflow as tf

MODEL_DIR = 'C:/MYPATH/'
MODEL_NAME = 'model.ckpt-200002'
MODEL_NAME_OUTPUT = 'raw_ssd_mobilenet_v1_coco'

with tf.Session() as sess:
    # Importing the .meta file adds the saved graph to the default graph.
    new_saver = tf.train.import_meta_graph(MODEL_DIR + '/' + MODEL_NAME + '.meta')

# Get all trainable variables from the default graph.
tf_variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)

# Write one tab-separated line per variable: name, dtype, shape.
with open(MODEL_NAME_OUTPUT + '_vars.txt', 'w') as f:
    for var in tf_variables:
        f.write(str(var.name) + "\t" + str(var.dtype) + "\t" + str(var.shape) + "\n")

From that I could rebuild most of the ANN, but now I am stuck at the BoxPredictor layers. It is hard to figure out what they are and what their input layers are.

Any hints?

My aim is to rebuild the net in Caffe, but I cannot figure out the exact layer structure from TensorFlow.

Problem while training: zero mAP.

Hello @victordibia, and thank you for your work! I have almost the same issue with training my custom hand detector as in Issue #5. I use the model you provided, pre-trained on the EgoHands dataset, and do transfer learning on the EgoFinger dataset. This dataset contains hand images taken from an egocentric viewpoint. The image size is 640x480 and the format is PNG. For testing purposes I train on a small subset, but after every epoch I get zero mAP.
I also tried training a vanilla SSD on my dataset, but the result is the same.
I am pretty sure that I have generated the CSV files and TFRecords correctly. On the TensorBoard "Image" tab I can view the ground-truth boxes, and they are correct.
Are there any thoughts on what can be done? Thank you in advance!

Threading to "improve" visual performance

Hi, @victordibia
Thanks for your work! I've been experimenting with TensorFlow object detection models (yours included) and I came across a trick to raise the FPS. I run the model inference on a separate thread, but I don't feed every frame into the model. While the child thread is still processing the previous frame, I simply skip model inference and reuse the previous bounding box results. I understand that this isn't an actual improvement in model performance, but it helped me render the camera feed at 34 FPS while the actual model was running at 11 FPS. Is there any way I can adapt my code to this project? It's my first time contributing.
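
For readers following along, the frame-skipping idea described above might look roughly like the sketch below. It is not the poster's code. `AsyncDetector` and the `infer` callable are hypothetical names, and `infer` stands in for whatever function runs the model on one frame:

```python
import threading


class AsyncDetector:
    """Run `infer(frame)` on a background thread, one frame at a time.

    Frames submitted while a previous frame is still being processed are
    skipped, and the caller keeps using the most recent finished result.
    """

    def __init__(self, infer):
        self._infer = infer          # hypothetical: callable(frame) -> result
        self._lock = threading.Lock()
        self._busy = False
        self._latest = None          # most recent finished result
        self._thread = None

    def submit(self, frame):
        """Start inference on `frame`; return False if the frame was skipped."""
        with self._lock:
            if self._busy:
                return False
            self._busy = True
        self._thread = threading.Thread(
            target=self._run, args=(frame,), daemon=True)
        self._thread.start()
        return True

    def _run(self, frame):
        try:
            result = self._infer(frame)
            with self._lock:
                self._latest = result
        finally:
            with self._lock:
                self._busy = False

    def latest(self):
        """Return the last finished result, or None if none is ready yet."""
        with self._lock:
            return self._latest

    def wait(self):
        """Block until the inference currently in flight (if any) finishes."""
        thread = self._thread
        if thread is not None:
            thread.join()
```

In a capture loop, one would call `submit(frame)` on every frame and draw whatever `latest()` returns, so the display rate is decoupled from the model rate. The boxes drawn may lag the displayed frame by a few frames.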

How to save output video?

Dear Sir,

I would like to use the hand bounding boxes detected in the scene in a later step. How can I save the values of these bounding boxes so that I can feed them to the next step?
I would like to save the detected bounding box values in real time and also save an output video file with the detections drawn on it.
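
A minimal sketch of one way to do both, assuming the detector returns normalized `[ymin, xmin, ymax, xmax]` boxes with one score each (the TensorFlow Object Detection API convention). The helper names are hypothetical. `cv2.VideoWriter` and `cv2.VideoWriter_fourcc` are standard OpenCV calls, but the `XVID` codec choice is only an assumption and may need changing on your system:

```python
import csv


def boxes_to_rows(frame_idx, boxes, scores, im_width, im_height, score_thresh=0.2):
    """Convert one frame's detections to pixel-coordinate rows for logging."""
    rows = []
    for box, score in zip(boxes, scores):
        if score < score_thresh:
            continue
        ymin, xmin, ymax, xmax = box
        rows.append([
            frame_idx,
            int(xmin * im_width), int(ymin * im_height),
            int(xmax * im_width), int(ymax * im_height),
            round(float(score), 4),
        ])
    return rows


def write_rows(fileobj, rows, header=True):
    """Write detection rows as CSV to an already opened text file object."""
    writer = csv.writer(fileobj)
    if header:
        writer.writerow(["frame", "xmin", "ymin", "xmax", "ymax", "score"])
    writer.writerows(rows)


def open_video_writer(path, fps, im_width, im_height):
    """Open an OpenCV video writer; frames must match (im_width, im_height)."""
    import cv2  # imported here so the logging helpers work without OpenCV
    fourcc = cv2.VideoWriter_fourcc(*"XVID")  # assumed codec
    return cv2.VideoWriter(path, fourcc, fps, (im_width, im_height))
```

Inside the detection loop, one would call `writer.write(frame)` on each BGR frame after drawing the boxes, append the output of `boxes_to_rows` to the CSV, and call `writer.release()` when the loop ends.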

VIDEOIO ERROR: V4L2: Pixel format of incoming image is unsupported by OpenCV Unable to stop the stream: Device or resource busy OpenCV(3.4.1) Error: Assertion failed (scn == 3 || scn == 4) in cvtColor, file /io/opencv/modules/imgproc/src/color.cpp, line 11115 Error converting to RGB

When I run detect_single_threaded.py, I get the following error:

mona@Mona:~/code/handpose/handtracking$ python detect_single_threaded.py 
/home/mona/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
> ====== loading HAND frozen graph into memory
2018-06-25 15:41:14.073745: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-25 15:41:14.166765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
totalMemory: 11.90GiB freeMemory: 10.81GiB
2018-06-25 15:41:14.166792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 15:41:14.357558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 15:41:14.357588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 15:41:14.357593: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 15:41:14.357801: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10461 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
>  ====== Hand Inference graph loaded.
VIDEOIO ERROR: V4L2: Pixel format of incoming image is unsupported by OpenCV
Unable to stop the stream: Device or resource busy
OpenCV(3.4.1) Error: Assertion failed (scn == 3 || scn == 4) in cvtColor, file /io/opencv/modules/imgproc/src/color.cpp, line 11115
Error converting to RGB
Traceback (most recent call last):
  File "detect_single_threaded.py", line 53, in <module>
    image_np, detection_graph, sess)
  File "/home/mona/code/handpose/handtracking/utils/detector_utils.py", line 90, in detect_objects
    feed_dict={image_tensor: image_np_expanded})
  File "/home/mona/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/mona/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1104, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
  File "/home/mona/anaconda3/lib/python3.6/site-packages/numpy/core/numeric.py", line 492, in asarray
    return array(a, dtype, copy=False, order=order)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

I have
$ ls /dev/video*
/dev/video1

and have set the video capture to use device 1. Why do I get this error, and how do I fix it?
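
The final TypeError is raised when `np.asarray` receives `None`. That is consistent with the capture handing back no frame after the V4L2 pixel-format error, although the log alone does not confirm it. A small guard like the sketch below would at least surface the problem before the frame reaches the detector. `read_valid_frame` is a hypothetical helper, and `cap` is assumed to behave like `cv2.VideoCapture`, whose `read()` returns an `(ok, frame)` pair:

```python
def read_valid_frame(cap, max_attempts=5):
    """Return the next usable frame from `cap`, or raise if none arrives.

    `cap` is any object with a read() -> (ok, frame) method,
    such as cv2.VideoCapture.
    """
    for _ in range(max_attempts):
        ok, frame = cap.read()
        if ok and frame is not None:
            return frame
    raise RuntimeError(
        "No frame received from the video source after %d attempts; "
        "check the device index and the camera's pixel format." % max_attempts)
```

This does not fix the unsupported pixel format itself. It only turns the confusing `NoneType` error into an explicit message about the video source.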

mAP failed

I tried to evaluate mAP, but it failed, and I have no idea why.
Does anyone have ideas? Thanks!

Problem with tensorflow serving

When I create a TensorFlow Serving model from the frozen PB, I get an empty variables folder. Could you share the unfrozen graph file?

Please help me on this issue.

2018-06-25 10:34:50.968306: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 8.56M (8971776 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

When I run the command below, I get CUDA_ERROR_OUT_OF_MEMORY:

mona@Mona:~/code/handpose/handtracking$ python detect_multi_threaded.py 
/home/mona/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
{'im_width': 320.0, 'im_height': 180.0, 'score_thresh': 0.2, 'num_hands_detect': 2} Namespace(display=1, fps=1, height=200, num_hands=2, num_workers=4, queue_size=5, video_source=0, width=300)
>> loading frozen model for worker
> ====== loading HAND frozen graph into memory
>> loading frozen model for worker
> ====== loading HAND frozen graph into memory
>> loading frozen model for worker
> ====== loading HAND frozen graph into memory
>> loading frozen model for worker
> ====== loading HAND frozen graph into memory
2018-06-25 10:34:50.323937: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-25 10:34:50.341322: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-25 10:34:50.345967: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-25 10:34:50.352734: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-25 10:34:50.634127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
totalMemory: 11.90GiB freeMemory: 10.78GiB
2018-06-25 10:34:50.634172: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:50.702435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
totalMemory: 11.90GiB freeMemory: 10.44GiB
2018-06-25 10:34:50.702479: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:50.703274: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
totalMemory: 11.90GiB freeMemory: 10.44GiB
2018-06-25 10:34:50.703303: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:50.710124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
totalMemory: 11.90GiB freeMemory: 10.30GiB
2018-06-25 10:34:50.710171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:50.903013: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:50.903050: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:50.903056: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:50.903294: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9864 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-25 10:34:50.948216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:50.948250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:50.948255: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:50.948392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 262 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-25 10:34:50.950178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:50.950208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:50.950214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:50.950366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 223 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-25 10:34:50.951312: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 223.88M (234749952 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
>  ====== Hand Inference graph loaded.
>  ====== Hand Inference graph loaded.
2018-06-25 10:34:50.957782: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:50.957830: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:50.957838: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:50.957844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:50.957961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 262 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-25 10:34:50.959096: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:50.959128: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:50.959135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:50.959140: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:50.959247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 223 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-25 10:34:50.962115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:50.962147: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:50.962156: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:50.962296: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 19 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-25 10:34:50.963309: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 19.88M (20840448 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-25 10:34:50.963921: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 17.89M (18756608 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-25 10:34:50.964525: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 16.10M (16881152 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-25 10:34:50.965136: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 14.49M (15193088 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-25 10:34:50.965748: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 13.04M (13673984 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-25 10:34:50.966395: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 11.74M (12306688 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-25 10:34:50.967035: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 10.56M (11076096 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-25 10:34:50.967671: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 9.51M (9968640 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-25 10:34:50.968306: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 8.56M (8971776 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
>  ====== Hand Inference graph loaded.
2018-06-25 10:34:50.978706: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:50.978781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:50.978801: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:50.978818: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:50.979073: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 19 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
>  ====== Hand Inference graph loaded.
2018-06-25 10:34:51.014461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:51.014504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:51.014512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:51.014517: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:51.014617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9864 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-25 10:34:51.542720: E tensorflow/stream_executor/cuda/cuda_dnn.cc:455] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2018-06-25 10:34:51.542755: E tensorflow/stream_executor/cuda/cuda_dnn.cc:427] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
2018-06-25 10:34:51.542773: F tensorflow/core/kernels/conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms) 
>> loading frozen model for worker
> ====== loading HAND frozen graph into memory
2018-06-25 10:34:52.231795: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-25 10:34:52.321816: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
totalMemory: 11.90GiB freeMemory: 331.88MiB
2018-06-25 10:34:52.321844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:52.548327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:52.548361: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:52.548370: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:52.548511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 50 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
>  ====== Hand Inference graph loaded.
2018-06-25 10:34:52.554345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-25 10:34:52.554370: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-25 10:34:52.554377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 
2018-06-25 10:34:52.554382: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-06-25 10:34:52.554472: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 50 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:01:00.0, compute capability: 6.1)
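
To make a log like the one above easier to read, it can help to tally how much memory each "Created TensorFlow device" line reports and how many allocations failed. The following is only a minimal sketch: the function name `summarize_gpu_log` and the abbreviated sample lines are illustrative and are not part of this repository.

```python
import re

# Matches the "Created TensorFlow device (... with N MB memory)" log lines.
DEVICE_RE = re.compile(r"Created TensorFlow device \(.*? with (\d+) MB memory\)")
# Matches the "failed to allocate ... CUDA_ERROR_OUT_OF_MEMORY" log lines.
OOM_RE = re.compile(r"failed to allocate .*CUDA_ERROR_OUT_OF_MEMORY")


def summarize_gpu_log(lines):
    """Return (MB reported per device creation, number of failed allocations)."""
    granted_mb = []
    oom_count = 0
    for line in lines:
        match = DEVICE_RE.search(line)
        if match:
            granted_mb.append(int(match.group(1)))
        elif OOM_RE.search(line):
            oom_count += 1
    return granted_mb, oom_count


# Three lines abbreviated from the log above (timestamps and paths trimmed).
sample = [
    "gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9864 MB memory) -> physical GPU (device: 0, name: TITAN Xp)",
    "gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 223 MB memory) -> physical GPU (device: 0, name: TITAN Xp)",
    "cuda_driver.cc:936] failed to allocate 223.88M (234749952 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY",
]
granted, ooms = summarize_gpu_log(sample)
print(granted, ooms)
```

In the full log, the first device creation reports 9864 MB while later ones report as little as 19 MB, and the `freeMemory` value drops from 10.44GiB to 331.88MiB. That pattern is consistent with several worker processes opening sessions on the same GPU after the first one has reserved most of the memory, though the log alone does not confirm the cause.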

What is regularization_loss_1?

Hey Victor,
you added a screenshot of regularization_loss_1, which appears to rise steadily during your training.

What is this loss? I know TensorBoard displays this scalar and I know what regularization is, but its meaning is not fully clear to me.

In my training case it looks totally different:
[Image: screenshot from 2018-01-25 14-28-04]

Can you help me out?
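
The thread does not include an answer. As a rough illustration only, and assuming the scalar is an L2 weight penalty (a common form of regularization loss; this thread does not confirm which regularizer the training config uses), the value depends only on the size of the weights, not on the training data. The function name, the `scale` value, and the toy weights below are all hypothetical.

```python
def l2_regularization_loss(weight_tensors, scale):
    """Sum over tensors of scale * 0.5 * sum(w^2); tensors are flat lists here."""
    return sum(
        scale * 0.5 * sum(w * w for w in weights)
        for weights in weight_tensors
    )


# Toy weights early in training (small) and later (larger magnitudes).
early_weights = [[0.1, -0.2], [0.3]]
later_weights = [[1.0, -2.0], [3.0]]

scale = 0.01  # hypothetical regularization strength
early_loss = l2_regularization_loss(early_weights, scale)
later_loss = l2_regularization_loss(later_weights, scale)
print(early_loss < later_loss)
```

Under this assumption, a curve that rises during training would simply mean the weight magnitudes are growing, while a different shape would mean the weights are evolving differently in that run.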
