
chatbot-retrieval's People

Contributors

dennybritz avatar e-budur avatar hailiang-wang avatar j-min avatar wan-wei avatar


chatbot-retrieval's Issues

Estimated Time for training

Hello Denny,

What is the estimated time to train the model?
Could you give a rough number of hours/days, and which GPU was used for the computation?

Getting error while running the training script.

Hi,
I am a newbie and was going through your amazing tutorial on chatbots. While training the model with the udc_train.py script, I get the following error:
InvalidArgumentError (see above for traceback): Incompatible shapes: [80,1] vs. [160,1] [[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](prediction/Squeeze, prediction/ToFloat)]]

Please recommend what steps I can take to train the model properly.
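
For context, the node prediction/logistic_loss/mul multiplies logits by labels elementwise inside the sigmoid cross-entropy, so the message means those two tensors arrived with different batch dimensions (80 vs. 160, off by exactly a factor of two; a plausible but unconfirmed suspect is a mismatch between the training and evaluation batch sizes in udc_hparams.py). The constraint itself can be reproduced with NumPy:

```python
import numpy as np

# sigmoid_cross_entropy_with_logits requires logits and targets to have
# the same shape; the failing graph node is the elementwise multiply.
logits = np.zeros((80, 1), dtype=np.float32)    # shape from the error message
targets = np.zeros((160, 1), dtype=np.float32)  # twice as many labels

try:
    _ = logits * targets  # broadcasting [80,1] against [160,1] fails
except ValueError as err:
    print("Incompatible shapes:", err)

# With matching batch dimensions the product is well defined.
ok = np.zeros((160, 1), dtype=np.float32) * np.zeros((160, 1), dtype=np.float32)
print(ok.shape)  # (160, 1)
```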

How to prepare training dataset structure

I am trying to use the code with my own dataset (training now runs).
To do this, I am replicating the structure of the Ubuntu Dialog Corpus (UDC) from https://arxiv.org/pdf/1506.08909v3.pdf

Your article states that "The training data consists of 1,000,000 examples, 50% positive (label 1) and 50% negative (label 0)" (http://www.wildml.com/2016/07/deep-learning-for-chatbots-2-retrieval-based-model-tensorflow/).
So I made a copy of the dataset with Pandas and took a random Utterance with flag 0.
The result is a doubled dataset: each Context appears twice, once with the correct Utterance and once with an incorrect one.

Is that right for training?

The paper https://arxiv.org/pdf/1506.08909v3.pdf states: "In our experiments below, we consider both the case of 1 wrong response and 10 wrong responses." This is a completely different approach.

Should I add 11 copies of each Context to the training set, with randomly selected Utterances, to get more accurate results?
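
For what it's worth, the 10-distractor setup in the paper describes the ranking evaluation (the validation/test sets pair each context with the ground-truth response plus distractors), while training in this repo follows the 1:1 positive/negative scheme you describe. A toy sketch of that 1:1 negative sampling, with hypothetical data and a hypothetical helper, in plain Python rather than Pandas:

```python
import random

# Hypothetical miniature corpus: each pair is (context, correct utterance).
pairs = [
    ("how do I install pip?", "sudo apt-get install python-pip"),
    ("my wifi driver is broken", "try reinstalling bcmwl-kernel-source"),
    ("how to mount an iso?", "use mount -o loop file.iso /mnt"),
    ("apt-get is locked", "remove the dpkg lock file"),
]

random.seed(0)

def make_training_set(pairs, num_negatives=1):
    """For each (context, utterance), emit one positive row (label 1) and
    num_negatives rows pairing the context with a randomly drawn utterance
    from a *different* pair (label 0)."""
    rows = []
    utterances = [u for _, u in pairs]
    for ctx, utt in pairs:
        rows.append((ctx, utt, 1))
        for _ in range(num_negatives):
            neg = random.choice([u for u in utterances if u != utt])
            rows.append((ctx, neg, 0))
    return rows

train = make_training_set(pairs, num_negatives=1)
print(len(train))  # 8 rows: 4 positive + 4 negative
```

With num_negatives=1 this reproduces the doubled 50/50 dataset from the blog post; raising it changes the class balance, so the loss weighting would need to account for that.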

Also, I don't have an 'EOS' tag in my Context, so the context is not a merged dialog; it is one big problem post. How might this affect training?

Error running with TF 0.12.1, Python 3.4.3, Ubuntu 14.04

Hi. I installed all the required dependencies with pip3, made sure I can import them in Python 3 with no issues, and downloaded the dataset as described. Now running $ python3 udc_train.py produces the following error:

InvalidArgumentError (see above for traceback): Incompatible shapes: [80,1] vs. [160,1]
         [[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
         [[Node: recall_at_2/ToInt64/_91 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_217_recall_at_2/ToInt64", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Any idea why this is happening? Any idea how to fix it?

Thanks.

OOM on Ubuntu 14.04 with GTX670

Hi,

I tried to run the demo on the GPU, and it suddenly raised an error at step 68001.

I also ran this on my MacBook Pro with the CPU-only version, and there was no error over 70000 steps. I believe this error is due to the small memory of the GPU. Is there any way to fix this problem?

Last part of the Log (excluding the memory part):

InUse:                  1890070528
MaxInUse:               1890070528
NumAllocs:              1307072597
MaxAllocSize:             73287680

W tensorflow/core/common_runtime/bfc_allocator.cc:270] ****************************************************************************************************
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 1.00MiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:909] Resource exhausted: OOM when allocating tensor with shape[256,1024]
INFO:tensorflow:Saving checkpoint for step 68001 to checkpoint: /home/harmanjohll/SaveTheDepressed/chatbot-retrieval/runs/1471341127/model.ckpt.
Traceback (most recent call last):
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 715, in _do_call
    return fn(*args)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 697, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/framework/errors.py", line 450, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors.ResourceExhaustedError: OOM when allocating tensor with shape[256,1024]
     [[Node: rnn/RNN/while/LSTMCell/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NHWC", _device="/job:localhost/replica:0/task:0/gpu:0"](rnn/RNN/while/LSTMCell/MatMul, rnn/RNN/while/LSTMCell/BiasAdd/Enter)]]
     [[Node: OptimizeLoss/train/update/_201717 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1043_OptimizeLoss/train/update", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "udc_train.py", line 70, in <module>
    tf.app.run()
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "udc_train.py", line 67, in main
    estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 182, in fit
    monitors=monitors)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 484, in _train_model
    monitors=monitors)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 328, in train
    reraise(*excinfo)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/six.py", line 686, in reraise
    raise value
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 257, in train
    session, last_step + 1, [train_op, loss_op], feed_dict, monitors)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 111, in _run_with_monitors
    outputs = session.run(tensors, feed_dict=feed_dict)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 372, in run
    run_metadata_ptr)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 636, in _run
    feed_dict_string, options, run_metadata)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 708, in _do_run
    target_list, options, run_metadata)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 728, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.ResourceExhaustedError: OOM when allocating tensor with shape[256,1024]
     [[Node: rnn/RNN/while/LSTMCell/BiasAdd = BiasAdd[T=DT_FLOAT, data_format="NHWC", _device="/job:localhost/replica:0/task:0/gpu:0"](rnn/RNN/while/LSTMCell/MatMul, rnn/RNN/while/LSTMCell/BiasAdd/Enter)]]
     [[Node: OptimizeLoss/train/update/_201717 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1043_OptimizeLoss/train/update", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op 'rnn/RNN/while/LSTMCell/BiasAdd', defined at:
  File "udc_train.py", line 70, in <module>
    tf.app.run()
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "udc_train.py", line 67, in main
    estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 182, in fit
    monitors=monitors)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 449, in _train_model
    train_op, loss_op = self._get_train_ops(features, targets)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 673, in _get_train_ops
    _, loss, train_op = self._call_model_fn(features, targets, ModeKeys.TRAIN)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 656, in _call_model_fn
    features, targets, mode=mode)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/udc_model.py", line 40, in model_fn
    targets)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/models/dual_encoder.py", line 56, in dual_encoder_model
    dtype=tf.float32)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/rnn.py", line 580, in dynamic_rnn
    swap_memory=swap_memory, sequence_length=sequence_length)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/rnn.py", line 709, in _dynamic_rnn_loop
    swap_memory=swap_memory)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1873, in while_loop
    result = context.BuildLoop(cond, body, loop_vars)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1749, in BuildLoop
    body_result = body(*vars_for_body_with_tensor_arrays)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/rnn.py", line 692, in _time_step
    skip_conditionals=True)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/rnn.py", line 326, in _rnn_step
    new_output, new_state = call_cell()
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/rnn.py", line 680, in <lambda>
    call_cell = lambda: cell(input_t, state)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/rnn_cell.py", line 552, in __call__
    lstm_matrix = nn_ops.bias_add(math_ops.matmul(cell_inputs, concat_w), b)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/nn_ops.py", line 313, in bias_add
    return gen_nn_ops._bias_add(value, bias, data_format=data_format, name=name)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 279, in _bias_add
    data_format=data_format, name=name)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/ops/op_def_library.py", line 704, in apply_op
    op_def=op_def)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 2260, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/harmanjohll/SaveTheDepressed/chatbot-retrieval/venv/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 1230, in __init__
    self._traceback = _extract_stack()

By the way, will udc_train.py automatically resume training from the checkpoint? And do I need to process all the training data?
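
Two notes, hedged since they are inferences from the log rather than confirmed fixes: the allocator dump shows roughly 1.76 GiB already in use, which is close to the limit of a 2 GB GTX 670, so reducing batch_size (or rnn_dim) in udc_hparams.py is the usual mitigation; and tf.contrib.learn's Estimator restores from the latest checkpoint in the run directory on the next fit() call, so rerunning udc_train.py should resume from the step 68001 checkpoint it just saved. The arithmetic behind the failing allocation:

```python
# Back-of-envelope check on the failing allocation and the log's memory use.
batch, width = 256, 1024          # shape[256,1024] from the OOM message
bytes_needed = batch * width * 4  # float32 is 4 bytes per element
print(bytes_needed)               # 1048576, i.e. the "1.00MiB" in the log

# 1024 is plausibly 4 * rnn_dim (the four LSTM gate blocks) with
# rnn_dim = 256; this mapping is an assumption, not read from the code.
assert width == 4 * 256

in_use = 1890070528               # "InUse" from the allocator dump
print(round(in_use / 2**30, 2))   # ~1.76 GiB, near a 2 GB card's ceiling
```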

Couldn't find trained model at .........

andy1028@andy1028-Envy:/github/chatbot-retrieval$ python udc_train.py
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
WARNING:tensorflow:Setting feature info to {'utterance_len': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False), 'context_len': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False), 'utterance': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(160)]), is_sparse=False), 'context': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(160)]), is_sparse=False)}
WARNING:tensorflow:Setting targets info to TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False)
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
INFO:tensorflow:Create CheckpointSaver
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 950M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.08GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 5742 get requests, put_count=3297 evicted_count=1000 eviction_rate=0.303306 and unsatisfied allocation rate=0.617381
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.contrib.learn.python.learn.estimators._sklearn.NotFittedError'>, Couldn't find trained model at /home/andy1028/github/chatbot-retrieval/runs/1476255016.
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
[[Node: read_batch_features_train/random_shuffle_queue_EnqueueMany = QueueEnqueueMany[Tcomponents=[DT_STRING, DT_STRING], _class=["loc:@read_batch_features_train/random_shuffle_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_train/random_shuffle_queue, read_batch_features_train/read/ReaderReadUpTo, read_batch_features_train/read/ReaderReadUpTo:1)]]
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
[[Node: read_batch_features_train/file_name_queue/file_name_queue_EnqueueMany = QueueEnqueueMany[Tcomponents=[DT_STRING], _class=["loc:@read_batch_features_train/file_name_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_train/file_name_queue, read_batch_features_train/file_name_queue/RandomShuffle)]]
Traceback (most recent call last):
  File "udc_train.py", line 70, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "udc_train.py", line 67, in main
    estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 240, in fit
    max_steps=max_steps)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 578, in _train_model
    max_steps=max_steps)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 280, in _supervised_train
    None)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/supervised_session.py", line 270, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/recoverable_session.py", line 54, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/coordinated_session.py", line 70, in run
    self._coord.join(self._coordinated_threads_to_join)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/coordinator.py", line 357, in join
    six.reraise(*self._exc_info_to_raise)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/coordinated_session.py", line 66, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 107, in run
    induce_stop = monitor.step_end(monitors_step, monitor_outputs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 396, in step_end
    return self.every_n_step_end(step, output)
  File "udc_train.py", line 64, in every_n_step_end
    steps=None)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 356, in evaluate
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 620, in _evaluate_model
    % checkpoint_path)
tensorflow.contrib.learn.python.learn.estimators._sklearn.NotFittedError: Couldn't find trained model at /home/andy1028/github/chatbot-retrieval/runs/1476255016.
andy1028@andy1028-Envy:/github/chatbot-retrieval$

Incompatible shapes: [80,1] vs. [160,1]

I am using tensorflow-0.10.0rc0-cp35-cp35m-linux_x86_64.whl, CUDA Toolkit 7.5, and cuDNN v4. I am running into this error during training. Thanks.

INFO:tensorflow:Results after 1220 steps (0.081 sec/batch): recall_at_10 = 1.0, recall_at_2 = 0.333043032787, recall_at_1 = 0.184579918033, recall_at_5 = 0.663370901639, loss = 0.698521.
W tensorflow/core/framework/op_kernel.cc:968] Invalid argument: Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors.InvalidArgumentError'>, Incompatible shapes: [80,1] vs. [160,1]
     [[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
     [[Node: recall_at_10/ToInt64/_145 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_145_recall_at_10/ToInt64", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op 'prediction/logistic_loss/mul', defined at:
  File "udc_train.py", line 70, in <module>
    tf.app.run()
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "udc_train.py", line 67, in main
    estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 252, in fit
    max_steps=max_steps)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 584, in _train_model
    max_steps=max_steps)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 281, in _monitored_train
    None)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 333, in run
    run_metadata=run_metadata)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 490, in run
    run_metadata=run_metadata)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 537, in run
    return self._sess.run(*args, **kwargs)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 602, in run
    hook in outputs else None))
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 1148, in after_run
    induce_stop = m.step_end(self._last_step, result)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 398, in step_end
    return self.every_n_step_end(step, output)
  File "udc_train.py", line 64, in every_n_step_end
    steps=None)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 318, in evaluate
    name=name)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 636, in _evaluate_model
    eval_dict = self._get_eval_ops(features, targets, metrics)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 855, in _get_eval_ops
    predictions, loss, _ = self._call_model_fn(features, targets, ModeKeys.EVAL)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 809, in _call_model_fn
    return self._model_fn(features, targets, mode=mode)
  File "/home/cent/play/chatbot-retrieval/udc_model.py", line 84, in model_fn
    tf.concat(0, all_targets))
  File "/home/cent/play/chatbot-retrieval/models/dual_encoder.py", line 81, in dual_encoder_model
    losses = tf.nn.sigmoid_cross_entropy_with_logits(logits, tf.to_float(targets))
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/nn.py", line 445, in sigmoid_cross_entropy_with_logits
    return math_ops.add(relu_logits - logits * targets,
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 760, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 909, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1464, in mul
    result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
    op_def=op_def)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2334, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1253, in __init__
    self._traceback = _extract_stack()

Traceback (most recent call last):
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 972, in _do_call
    return fn(*args)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 954, in _run_fn
    status, run_metadata)
  File "/usr/local/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/cent/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/errors.py", line 450, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors.InvalidArgumentError: Incompatible shapes: [80,1] vs. [160,1]
     [[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
     [[Node: recall_at_10/ToInt64/_145 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_145_recall_at_10/ToInt64", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Poor results / training time

Hi

I'm getting poor results at validation, after 2000 steps I have recall_at_1 between 0.12 and 0.14, which is not much better than chance.

However the original blog has recall_at_1 of 0.50 after only 270 steps.

If I leave it running overnight, I reach a recall_at_1 of 0.50 after 28000 steps which seems to take 100x as long to converge as the original blog post.

I'm using a GTX 980, Ubuntu 14.04, and TensorFlow 0.12.1.

I've also tried other forks with no better results. The only modifications I have made so far are to the batch_size as advised elsewhere here.

Is anybody else getting better results than this? Any tricks/gotchas?

Incompatible parameter initialization

As mentioned in the original paper:

The RNN architecture is set to 1 hidden layer with 50 neurons. The W_h matrix is initialized using orthogonal weights [23], while W_x is initialized using a uniform distribution with values between -0.01 and 0.01. We use Adam as our optimizer [15], with gradients clipped to 10. We found that weight initialization as well as the choice of optimizer were critical for training the RNNs.

In this implementation, however, W_x is initialized uniformly between -0.25 and 0.25, and W_h is initialized with tf.truncated_normal_initializer.

Matching the paper's initialization might measurably affect performance and would make the results more comparable.
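
For anyone wanting to match the paper, here is a NumPy sketch of the two initializers it describes: orthogonal via QR decomposition for W_h (in the spirit of Saxe et al.) and uniform in [-0.01, 0.01] for W_x. In TensorFlow these would presumably map to tf.orthogonal_initializer and tf.random_uniform_initializer, though which versions ship the former is an assumption worth checking.

```python
import numpy as np

def orthogonal(shape, seed=None):
    """Orthogonal initializer: QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(shape)
    q, r = np.linalg.qr(a)
    # Fix the sign ambiguity of QR so the decomposition is unique;
    # multiplying columns by +/-1 keeps q orthogonal.
    q *= np.sign(np.diag(r))
    return q

def uniform(shape, scale=0.01, seed=None):
    """Uniform initializer in [-scale, scale], the paper's choice for W_x."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-scale, scale, size=shape)

# Hidden size 50, as in the paper's RNN.
W_h = orthogonal((50, 50), seed=0)
W_x = uniform((50, 50), scale=0.01, seed=0)

# Orthogonality check: W_h.T @ W_h should be (close to) the identity.
print(np.allclose(W_h.T @ W_h, np.eye(50)))  # True
```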

Problem with TensorFlow 0.10

After running the training script in a virtual environment using Python 3.5 and TensorFlow 0.10, I get the following error:

$ python udc_train.py 
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
WARNING:tensorflow:Setting feature info to {'context_len': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False), 'context': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(160)]), is_sparse=False), 'utterance_len': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False), 'utterance': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(160)]), is_sparse=False)}
WARNING:tensorflow:Setting targets info to TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False)
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
INFO:tensorflow:Create CheckpointSaver
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:02:00.0
Total memory: 12.00GiB
Free memory: 11.87GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:02:00.0)
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 5858 get requests, put_count=3524 evicted_count=1000 eviction_rate=0.283768 and unsatisfied allocation rate=0.586207
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.contrib.learn.python.learn.estimators._sklearn.NotFittedError'>, Couldn't find trained model at /home/admin1/exp/projects/chatbot-retrieval/runs/1471797260.
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
     [[Node: read_batch_features_train/file_name_queue/file_name_queue_EnqueueMany = QueueEnqueueMany[Tcomponents=[DT_STRING], _class=["loc:@read_batch_features_train/file_name_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_train/file_name_queue, read_batch_features_train/file_name_queue/RandomShuffle)]]
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
     [[Node: read_batch_features_train/random_shuffle_queue_EnqueueMany = QueueEnqueueMany[Tcomponents=[DT_STRING, DT_STRING], _class=["loc:@read_batch_features_train/random_shuffle_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_train/random_shuffle_queue, read_batch_features_train/read/ReaderReadUpTo, read_batch_features_train/read/ReaderReadUpTo:1)]]
Traceback (most recent call last):
  File "udc_train.py", line 70, in <module>
    tf.app.run()
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "udc_train.py", line 67, in main
    estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 240, in fit
    max_steps=max_steps)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 578, in _train_model
    max_steps=max_steps)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 280, in _supervised_train
    None)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/supervised_session.py", line 270, in run
    run_metadata=run_metadata)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/recoverable_session.py", line 54, in run
    run_metadata=run_metadata)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/coordinated_session.py", line 70, in run
    self._coord.join(self._coordinated_threads_to_join)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/python/training/coordinator.py", line 357, in join
    six.reraise(*self._exc_info_to_raise)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/six.py", line 686, in reraise
    raise value
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/coordinated_session.py", line 66, in run
    return self._sess.run(*args, **kwargs)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 107, in run
    induce_stop = monitor.step_end(monitors_step, monitor_outputs)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 396, in step_end
    return self.every_n_step_end(step, output)
  File "udc_train.py", line 64, in every_n_step_end
    steps=None)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 356, in evaluate
    name=name)
  File "/home/admin1/exp/projects/chatbot-retrieval/venv/local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 620, in _evaluate_model
    % checkpoint_path)
tensorflow.contrib.learn.python.learn.estimators._sklearn.NotFittedError: Couldn't find trained model at /home/admin1/exp/projects/chatbot-retrieval/runs/1471797260.
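One plausible cause of the NotFittedError above is a race between the ValidationMonitor and checkpointing: the monitor evaluates from the latest checkpoint, and if it fires before the first checkpoint has been written there is no trained model to load. A minimal sketch of a workaround, against the deprecated tf.contrib.learn API (an assumption, not a confirmed fix):

```python
import tensorflow as tf

# Sketch for the old tf.contrib.learn API (TF 0.x). Saving checkpoints
# frequently means one exists by the time the ValidationMonitor first
# tries to evaluate; model_fn and MODEL_DIR stand in for the repo's
# own values.
config = tf.contrib.learn.RunConfig(save_checkpoints_secs=60)
estimator = tf.contrib.learn.Estimator(
    model_fn=model_fn,
    model_dir=MODEL_DIR,
    config=config)
```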

Leverage 'accuracy' as a metric in ValidationMonitor

I want to add 'accuracy' to my validation metrics in udc_train.py:
eval_metrics = {"accuracy": tf.contrib.metrics.streaming_accuracy}
The prediction result is a float32 tensor while the label is int64, which leads to:
TypeError: Input 'y' of 'Equal' Op has type int64 that does not match type float32 of argument 'x'.

How can I cast the prediction tensor before feeding it to the ValidationMonitor?
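One way around the dtype mismatch, sketched against the old tf.contrib.metrics API (not a confirmed fix), is to wrap the metric so the float32 probabilities are thresholded and cast to int64 before the comparison:

```python
import tensorflow as tf

def streaming_accuracy_int64(predictions, labels, **kwargs):
    # Threshold the float32 probabilities at 0.5 and cast to the
    # labels' dtype (int64) so the underlying Equal op type-checks.
    predictions = tf.cast(tf.greater(predictions, 0.5), tf.int64)
    return tf.contrib.metrics.streaming_accuracy(predictions, labels, **kwargs)

eval_metrics = {"accuracy": streaming_accuracy_int64}
```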

Upgraded to Python 3, but still can't run udc_test.py

I upgraded to Python 3 but the errors persist. Do any parameters in udc_test.py need to change in order to evaluate the model?

Traceback (most recent call last):
File "udc_test.py", line 39, in
estimator.evaluate(input_fn=input_fn_test, steps=None, metrics=eval_metrics)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 356, in evaluate
name=name)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 641, in _evaluate_model
max_steps=steps)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 773, in evaluate
_write_summary_results(output_dir, eval_results, current_global_step)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 613, in _write_summary_results
_eval_results_to_str(eval_results))
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 607, in _eval_results_to_str
return ', '.join('%s = %s' % (k, v) for k, v in eval_results.items())
AttributeError: 'NoneType' object has no attribute 'items'

AttributeError: 'NoneType' object has no attribute 'get_shape'

When I try to run python3 udc_predict.py --model_dir=runs/1481407347, I get the following error:

Context: Example context
Traceback (most recent call last):
File "udc_predict.py", line 57, in
prob = estimator.predict(input_fn=lambda: get_features(INPUT_CONTEXT, r))
File "/home/ubuntu/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 191, in new_func
return func(*args, **kwargs)
File "/home/ubuntu/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 469, in predict
as_iterable=as_iterable)
File "/home/ubuntu/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 839, in _infer_model
infer_ops = self._get_predict_ops(features)
File "/home/ubuntu/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1107, in _get_predict_ops
return self._call_model_fn(features, labels, model_fn_lib.ModeKeys.INFER)
File "/home/ubuntu/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1021, in _call_model_fn
model_fn_results = self._model_fn(features, labels, mode=mode)
File "/home/ubuntu/neural_net/chatbot-retrieval/udc_model.py", line 29, in model_fn
batch_size = targets.get_shape().as_list()[0]
AttributeError: 'NoneType' object has no attribute 'get_shape'

Any ideas what might be causing this? Thanks!
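In INFER mode the Estimator calls model_fn with targets=None, so the `targets.get_shape()` call in udc_model.py fails. A minimal sketch of a guard (assuming the features dict contains the 'context' tensor shown in the logs above):

```python
import tensorflow as tf

def model_fn(features, targets, mode):
    if targets is None:
        # Prediction mode: there are no labels, so derive the batch
        # size from a feature tensor instead.
        batch_size = tf.shape(features["context"])[0]
    else:
        batch_size = targets.get_shape().as_list()[0]
    ...  # rest of the model unchanged
```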

udc_predict.py error on Tensorflow 1.0 with Python 3

After slightly altering a number of things, I have managed to get train and test running in TF 1.0, but can't figure out how to get around this:

File "chatbot-retrieval/udc_predict.py" line 53, in
estimator._targets_info = tf.contrib.learn.estimators.tensor_signature.TensorSignature(tf.constant(0, shape=[1,1]))
AttributeError: module 'tensorflow.contrib.learn' has no attribute 'estimators'
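In TF 1.0 the `estimators` submodule is no longer re-exported on `tf.contrib.learn`, so the attribute lookup fails. Importing the module by its full path may work instead (an assumption; this touches a private API that could change again):

```python
import tensorflow as tf
# Direct module import, since tf.contrib.learn.estimators is no longer
# exposed at the top level in TF 1.0 (assumed path).
from tensorflow.contrib.learn.python.learn.estimators import tensor_signature

estimator._targets_info = tensor_signature.TensorSignature(
    tf.constant(0, shape=[1, 1]))
```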

I ran udc_train.py on the training data, and the loss tensor had NaN values at step 194001:

E tensorflow/core/kernels/check_numerics_op.cc:157] abnormal_detected_host @0x208a7b900 = {1, 0} Loss is inf or nan

tensorflow.python.framework.errors.InvalidArgumentError: Loss is inf or nan : Tensor had NaN values
[[Node: OptimizeLoss/CheckNumerics = CheckNumericsT=DT_FLOAT, message="Loss is inf or nan", _device="/job:localhost/replica:0/task:0/gpu:0"]]
[[Node: OptimizeLoss/train/update/_201717 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1043_OptimizeLoss/train/update", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]
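A NaN loss late in training often points to exploding gradients or a log(0) in the loss. Since the node names above show the graph already goes through OptimizeLoss, one hedged mitigation is to enable its gradient clipping (the call site and hyperparameters here are assumptions, not the repo's actual code):

```python
import tensorflow as tf

# Sketch using the old tf.contrib.layers API: clip_gradients caps the
# global gradient norm, which can keep the loss from blowing up to NaN.
train_op = tf.contrib.layers.optimize_loss(
    loss=loss,
    global_step=tf.contrib.framework.get_global_step(),
    learning_rate=0.001,
    optimizer="Adam",
    clip_gradients=10.0)
```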

InvalidArgumentError (see above for traceback): Incompatible shapes: [80,1] vs. [160,1]

Hello,

I ran python udc_train.py on a CUDA-enabled EC2 instance. After 2000 steps, when the first evaluation began, I got the following error that I have no idea how to correct:

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
INFO:tensorflow:Using config: {'task': 0, 'save_summary_steps': 100, 'master': '', 'num_ps_replicas': 0, 'tf_config': gpu_options {
per_process_gpu_memory_fraction: 1
}
, 'save_checkpoints_secs': 600, 'keep_checkpoint_max': 5, 'cluster_spec': None, 'is_chief': True, 'tf_random_seed': None, 'keep_checkpoint_every_n_hours': 10000, 'job_name': None, 'evaluation_master': ''}
WARNING:tensorflow:parser_num_threads is deprecated, it will be removed onSept 3 2016
INFO:tensorflow:Setting feature info to {'utterance_len': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False), 'context_len': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False), 'context': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(160)]), is_sparse=False), 'utterance': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(160)]), is_sparse=False)}
INFO:tensorflow:Setting targets info to TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False)
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
INFO:tensorflow:Create CheckpointSaverHook
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GRID K520
major: 3 minor: 0 memoryClockRate (GHz) 0.797
pciBusID 0000:00:03.0
Total memory: 3.94GiB
Free memory: 3.91GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K520, pci bus id: 0000:00:03.0)
INFO:tensorflow:loss = 2.621, step = 1
INFO:tensorflow:Saving checkpoints for 1 into /home/ubuntu/bin/chatbot-retrieval/runs/1477342660/model.ckpt.
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:245] PoolAllocator: After 8115 get requests, put_count=7404 evicted_count=1000 eviction_rate=0.135062 and unsatisfied allocation rate=0.223167
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:257] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:245] PoolAllocator: After 8940 get requests, put_count=8571 evicted_count=1000 eviction_rate=0.116673 and unsatisfied allocation rate=0.155705
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:257] Raising pool_size_limit_ from 256 to 281
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:245] PoolAllocator: After 14654 get requests, put_count=14834 evicted_count=1000 eviction_rate=0.0674127 and unsatisfied allocation rate=0.0599836
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:257] Raising pool_size_limit_ from 655 to 720
INFO:tensorflow:loss = 0.842207, step = 101
INFO:tensorflow:loss = 0.809799, step = 201
INFO:tensorflow:loss = 0.787452, step = 301
INFO:tensorflow:loss = 0.791614, step = 401
INFO:tensorflow:loss = 0.691451, step = 501
INFO:tensorflow:loss = 0.681929, step = 601
INFO:tensorflow:loss = 0.659081, step = 701
INFO:tensorflow:loss = 0.681274, step = 801
INFO:tensorflow:loss = 0.679236, step = 901
INFO:tensorflow:loss = 0.689731, step = 1001
INFO:tensorflow:loss = 0.659311, step = 1101
INFO:tensorflow:loss = 0.683635, step = 1201
INFO:tensorflow:loss = 0.672502, step = 1301
INFO:tensorflow:loss = 0.672189, step = 1401
INFO:tensorflow:loss = 0.682205, step = 1501
INFO:tensorflow:loss = 0.667743, step = 1601
INFO:tensorflow:loss = 0.648141, step = 1701
INFO:tensorflow:loss = 0.638145, step = 1801
INFO:tensorflow:loss = 0.662156, step = 1901
WARNING:tensorflow:parser_num_threads is deprecated, it will be removed onSept 3 2016
WARNING:tensorflow:Given features: {'distractor_7_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:17' shape=(?, 1) dtype=int64>, 'distractor_2_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:7' shape=(?, 1) dtype=int64>, 'distractor_5_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:13' shape=(?, 1) dtype=int64>, 'distractor_0_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:3' shape=(?, 1) dtype=int64>, 'distractor_1_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:5' shape=(?, 1) dtype=int64>, 'distractor_3_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:9' shape=(?, 1) dtype=int64>, 'distractor_2': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:6' shape=(?, 160) dtype=int64>, 'distractor_1': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:4' shape=(?, 160) dtype=int64>, 'distractor_4': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:10' shape=(?, 160) dtype=int64>, 'distractor_7': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:16' shape=(?, 160) dtype=int64>, 'distractor_6_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:15' shape=(?, 1) dtype=int64>, 'distractor_6': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:14' shape=(?, 160) dtype=int64>, 'distractor_8': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:18' shape=(?, 160) dtype=int64>, 'distractor_0': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:2' shape=(?, 160) dtype=int64>, 'distractor_8_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:19' shape=(?, 1) dtype=int64>, 'distractor_3': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:8' shape=(?, 160) dtype=int64>, 'context': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:0' shape=(?, 160) dtype=int64>, 'utterance_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:21' shape=(?, 1) dtype=int64>, 'context_len': <tf.Tensor 
'read_batch_features_eval/fifo_queue_Dequeue:1' shape=(?, 1) dtype=int64>, 'distractor_5': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:12' shape=(?, 160) dtype=int64>, 'distractor_4_len': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:11' shape=(?, 1) dtype=int64>, 'utterance': <tf.Tensor 'read_batch_features_eval/fifo_queue_Dequeue:20' shape=(?, 160) dtype=int64>}, required signatures: {'utterance_len': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False), 'context_len': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False), 'context': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(160)]), is_sparse=False), 'utterance': TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(160)]), is_sparse=False)}.
WARNING:tensorflow:Given targets: Tensor("zeros:0", shape=(16, 1), dtype=int64), required signatures: TensorSignature(dtype=tf.int64, shape=TensorShape([Dimension(128), Dimension(1)]), is_sparse=False).
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
WARNING:tensorflow:Please specify metrics using MetricSpec. Using bare functions or (key, fn) tuples is deprecated and support for it will be removed on Oct 1, 2016.
WARNING:tensorflow:Please specify metrics using MetricSpec. Using bare functions or (key, fn) tuples is deprecated and support for it will be removed on Oct 1, 2016.
WARNING:tensorflow:Please specify metrics using MetricSpec. Using bare functions or (key, fn) tuples is deprecated and support for it will be removed on Oct 1, 2016.
WARNING:tensorflow:Please specify metrics using MetricSpec. Using bare functions or (key, fn) tuples is deprecated and support for it will be removed on Oct 1, 2016.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K520, pci bus id: 0000:00:03.0)
INFO:tensorflow:Restored model from /home/ubuntu/bin/chatbot-retrieval/runs/1477342660
W tensorflow/core/framework/op_kernel.cc:968] Out of range: Reached limit of 1
[[Node: read_batch_features_eval/file_name_queue/limit_epochs/CountUpTo = CountUpToT=DT_INT64, _class=["loc:@read_batch_features_eval/file_name_queue/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/cpu:0"]]
INFO:tensorflow:Eval steps [0,inf) for training step 1.
W tensorflow/core/framework/op_kernel.cc:968] Out of range: Reached limit of 1
[[Node: read_batch_features_eval/file_name_queue/limit_epochs/CountUpTo = CountUpToT=DT_INT64, _class=["loc:@read_batch_features_eval/file_name_queue/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/cpu:0"]]
INFO:tensorflow:Results after 10 steps (0.123 sec/batch): recall_at_2 = 0.1875, recall_at_5 = 0.50625, recall_at_1 = 0.10625, recall_at_10 = 1.0, loss = 3.59806.
INFO:tensorflow:Results after 20 steps (0.120 sec/batch): recall_at_2 = 0.184375, recall_at_5 = 0.478125, recall_at_1 = 0.109375, recall_at_10 = 1.0, loss = 3.98682.
INFO:tensorflow:Results after 30 steps (0.120 sec/batch): recall_at_2 = 0.2, recall_at_5 = 0.48125, recall_at_1 = 0.127083333333, recall_at_10 = 1.0, loss = 3.92182.
INFO:tensorflow:Results after 40 steps (0.120 sec/batch): recall_at_2 = 0.2078125, recall_at_5 = 0.490625, recall_at_1 = 0.1265625, recall_at_10 = 1.0, loss = 4.01366.
INFO:tensorflow:Results after 50 steps (0.120 sec/batch): recall_at_2 = 0.205, recall_at_5 = 0.47875, recall_at_1 = 0.12, recall_at_10 = 1.0, loss = 4.06847.
INFO:tensorflow:Results after 60 steps (0.120 sec/batch): recall_at_2 = 0.213541666667, recall_at_5 = 0.4875, recall_at_1 = 0.127083333333, recall_at_10 = 1.0, loss = 4.07837.
INFO:tensorflow:Results after 70 steps (0.120 sec/batch): recall_at_2 = 0.208035714286, recall_at_5 = 0.489285714286, recall_at_1 = 0.121428571429, recall_at_10 = 1.0, loss = 4.11944.
INFO:tensorflow:Results after 80 steps (0.120 sec/batch): recall_at_2 = 0.2046875, recall_at_5 = 0.48515625, recall_at_1 = 0.121875, recall_at_10 = 1.0, loss = 4.12956.
INFO:tensorflow:Results after 90 steps (0.120 sec/batch): recall_at_2 = 0.2, recall_at_5 = 0.479166666667, recall_at_1 = 0.120138888889, recall_at_10 = 1.0, loss = 4.1555.
INFO:tensorflow:Results after 100 steps (0.119 sec/batch): recall_at_2 = 0.2025, recall_at_5 = 0.480625, recall_at_1 = 0.120625, recall_at_10 = 1.0, loss = 4.15988.
INFO:tensorflow:Results after 110 steps (0.120 sec/batch): recall_at_2 = 0.203977272727, recall_at_5 = 0.479545454545, recall_at_1 = 0.123295454545, recall_at_10 = 1.0, loss = 4.16756.
INFO:tensorflow:Results after 120 steps (0.120 sec/batch): recall_at_2 = 0.2015625, recall_at_5 = 0.477604166667, recall_at_1 = 0.122916666667, recall_at_10 = 1.0, loss = 4.1667.
INFO:tensorflow:Results after 130 steps (0.120 sec/batch): recall_at_2 = 0.2, recall_at_5 = 0.479326923077, recall_at_1 = 0.120673076923, recall_at_10 = 1.0, loss = 4.14795.
INFO:tensorflow:Results after 140 steps (0.119 sec/batch): recall_at_2 = 0.196875, recall_at_5 = 0.478125, recall_at_1 = 0.120535714286, recall_at_10 = 1.0, loss = 4.14746.
INFO:tensorflow:Results after 150 steps (0.119 sec/batch): recall_at_2 = 0.199166666667, recall_at_5 = 0.480416666667, recall_at_1 = 0.122083333333, recall_at_10 = 1.0, loss = 4.12638.
INFO:tensorflow:Results after 160 steps (0.120 sec/batch): recall_at_2 = 0.19609375, recall_at_5 = 0.48046875, recall_at_1 = 0.1203125, recall_at_10 = 1.0, loss = 4.12208.
INFO:tensorflow:Results after 170 steps (0.120 sec/batch): recall_at_2 = 0.196323529412, recall_at_5 = 0.481985294118, recall_at_1 = 0.120588235294, recall_at_10 = 1.0, loss = 4.111.
INFO:tensorflow:Results after 180 steps (0.120 sec/batch): recall_at_2 = 0.196527777778, recall_at_5 = 0.483333333333, recall_at_1 = 0.121180555556, recall_at_10 = 1.0, loss = 4.10636.
INFO:tensorflow:Results after 190 steps (0.120 sec/batch): recall_at_2 = 0.195723684211, recall_at_5 = 0.484868421053, recall_at_1 = 0.121052631579, recall_at_10 = 1.0, loss = 4.12293.
INFO:tensorflow:Results after 200 steps (0.119 sec/batch): recall_at_2 = 0.196875, recall_at_5 = 0.486875, recall_at_1 = 0.12125, recall_at_10 = 1.0, loss = 4.12455.
INFO:tensorflow:Results after 210 steps (0.119 sec/batch): recall_at_2 = 0.196428571429, recall_at_5 = 0.486904761905, recall_at_1 = 0.120238095238, recall_at_10 = 1.0, loss = 4.11937.
INFO:tensorflow:Results after 220 steps (0.120 sec/batch): recall_at_2 = 0.196875, recall_at_5 = 0.488068181818, recall_at_1 = 0.121022727273, recall_at_10 = 1.0, loss = 4.09725.
INFO:tensorflow:Results after 230 steps (0.120 sec/batch): recall_at_2 = 0.196195652174, recall_at_5 = 0.488858695652, recall_at_1 = 0.119565217391, recall_at_10 = 1.0, loss = 4.10444.
INFO:tensorflow:Results after 240 steps (0.121 sec/batch): recall_at_2 = 0.195572916667, recall_at_5 = 0.488541666667, recall_at_1 = 0.119270833333, recall_at_10 = 1.0, loss = 4.09378.
INFO:tensorflow:Results after 250 steps (0.119 sec/batch): recall_at_2 = 0.19475, recall_at_5 = 0.48775, recall_at_1 = 0.1185, recall_at_10 = 1.0, loss = 4.09283.
INFO:tensorflow:Results after 260 steps (0.119 sec/batch): recall_at_2 = 0.194711538462, recall_at_5 = 0.488221153846, recall_at_1 = 0.11875, recall_at_10 = 1.0, loss = 4.08133.
INFO:tensorflow:Results after 270 steps (0.119 sec/batch): recall_at_2 = 0.19375, recall_at_5 = 0.488425925926, recall_at_1 = 0.118287037037, recall_at_10 = 1.0, loss = 4.0882.
INFO:tensorflow:Results after 280 steps (0.120 sec/batch): recall_at_2 = 0.196205357143, recall_at_5 = 0.487723214286, recall_at_1 = 0.120535714286, recall_at_10 = 1.0, loss = 4.09415.
INFO:tensorflow:Results after 290 steps (0.120 sec/batch): recall_at_2 = 0.197198275862, recall_at_5 = 0.489655172414, recall_at_1 = 0.120689655172, recall_at_10 = 1.0, loss = 4.09119.
INFO:tensorflow:Results after 300 steps (0.120 sec/batch): recall_at_2 = 0.197083333333, recall_at_5 = 0.490416666667, recall_at_1 = 0.120416666667, recall_at_10 = 1.0, loss = 4.0915.
INFO:tensorflow:Results after 310 steps (0.119 sec/batch): recall_at_2 = 0.198185483871, recall_at_5 = 0.494959677419, recall_at_1 = 0.119758064516, recall_at_10 = 1.0, loss = 4.08625.
INFO:tensorflow:Results after 320 steps (0.120 sec/batch): recall_at_2 = 0.198828125, recall_at_5 = 0.4951171875, recall_at_1 = 0.11953125, recall_at_10 = 1.0, loss = 4.07738.
INFO:tensorflow:Results after 330 steps (0.120 sec/batch): recall_at_2 = 0.199621212121, recall_at_5 = 0.495075757576, recall_at_1 = 0.119696969697, recall_at_10 = 1.0, loss = 4.06898.
INFO:tensorflow:Results after 340 steps (0.119 sec/batch): recall_at_2 = 0.200183823529, recall_at_5 = 0.495404411765, recall_at_1 = 0.120036764706, recall_at_10 = 1.0, loss = 4.08386.
INFO:tensorflow:Results after 350 steps (0.120 sec/batch): recall_at_2 = 0.199285714286, recall_at_5 = 0.49625, recall_at_1 = 0.119642857143, recall_at_10 = 1.0, loss = 4.0845.
INFO:tensorflow:Results after 360 steps (0.120 sec/batch): recall_at_2 = 0.2, recall_at_5 = 0.496006944444, recall_at_1 = 0.120486111111, recall_at_10 = 1.0, loss = 4.08962.
INFO:tensorflow:Results after 370 steps (0.119 sec/batch): recall_at_2 = 0.199831081081, recall_at_5 = 0.496790540541, recall_at_1 = 0.120777027027, recall_at_10 = 1.0, loss = 4.08704.
INFO:tensorflow:Results after 380 steps (0.119 sec/batch): recall_at_2 = 0.200328947368, recall_at_5 = 0.496710526316, recall_at_1 = 0.120394736842, recall_at_10 = 1.0, loss = 4.08253.
INFO:tensorflow:Results after 390 steps (0.121 sec/batch): recall_at_2 = 0.199358974359, recall_at_5 = 0.497275641026, recall_at_1 = 0.120352564103, recall_at_10 = 1.0, loss = 4.09325.
INFO:tensorflow:Results after 400 steps (0.119 sec/batch): recall_at_2 = 0.2, recall_at_5 = 0.4990625, recall_at_1 = 0.12140625, recall_at_10 = 1.0, loss = 4.09894.
INFO:tensorflow:Results after 410 steps (0.121 sec/batch): recall_at_2 = 0.199085365854, recall_at_5 = 0.499695121951, recall_at_1 = 0.120426829268, recall_at_10 = 1.0, loss = 4.09629.
INFO:tensorflow:Results after 420 steps (0.122 sec/batch): recall_at_2 = 0.201041666667, recall_at_5 = 0.5, recall_at_1 = 0.121726190476, recall_at_10 = 1.0, loss = 4.10748.
INFO:tensorflow:Results after 430 steps (0.120 sec/batch): recall_at_2 = 0.200290697674, recall_at_5 = 0.49898255814, recall_at_1 = 0.121220930233, recall_at_10 = 1.0, loss = 4.10508.
INFO:tensorflow:Results after 440 steps (0.119 sec/batch): recall_at_2 = 0.200710227273, recall_at_5 = 0.498721590909, recall_at_1 = 0.120596590909, recall_at_10 = 1.0, loss = 4.10168.
INFO:tensorflow:Results after 450 steps (0.120 sec/batch): recall_at_2 = 0.201527777778, recall_at_5 = 0.498611111111, recall_at_1 = 0.121527777778, recall_at_10 = 1.0, loss = 4.10034.
INFO:tensorflow:Results after 460 steps (0.120 sec/batch): recall_at_2 = 0.202445652174, recall_at_5 = 0.498913043478, recall_at_1 = 0.12214673913, recall_at_10 = 1.0, loss = 4.09887.
INFO:tensorflow:Results after 470 steps (0.121 sec/batch): recall_at_2 = 0.201728723404, recall_at_5 = 0.498404255319, recall_at_1 = 0.121542553191, recall_at_10 = 1.0, loss = 4.10034.
INFO:tensorflow:Results after 480 steps (0.120 sec/batch): recall_at_2 = 0.200520833333, recall_at_5 = 0.496354166667, recall_at_1 = 0.120963541667, recall_at_10 = 1.0, loss = 4.10182.
INFO:tensorflow:Results after 490 steps (0.119 sec/batch): recall_at_2 = 0.201913265306, recall_at_5 = 0.497193877551, recall_at_1 = 0.121556122449, recall_at_10 = 1.0, loss = 4.09906.
INFO:tensorflow:Results after 500 steps (0.120 sec/batch): recall_at_2 = 0.202125, recall_at_5 = 0.497375, recall_at_1 = 0.12175, recall_at_10 = 1.0, loss = 4.0943.
INFO:tensorflow:Results after 510 steps (0.120 sec/batch): recall_at_2 = 0.202328431373, recall_at_5 = 0.498039215686, recall_at_1 = 0.12181372549, recall_at_10 = 1.0, loss = 4.09313.
INFO:tensorflow:Results after 520 steps (0.120 sec/batch): recall_at_2 = 0.202644230769, recall_at_5 = 0.498918269231, recall_at_1 = 0.122355769231, recall_at_10 = 1.0, loss = 4.09397.
INFO:tensorflow:Results after 530 steps (0.120 sec/batch): recall_at_2 = 0.203419811321, recall_at_5 = 0.5, recall_at_1 = 0.122641509434, recall_at_10 = 1.0, loss = 4.09302.
INFO:tensorflow:Results after 540 steps (0.120 sec/batch): recall_at_2 = 0.204513888889, recall_at_5 = 0.499537037037, recall_at_1 = 0.122685185185, recall_at_10 = 1.0, loss = 4.09574.
INFO:tensorflow:Results after 550 steps (0.120 sec/batch): recall_at_2 = 0.204886363636, recall_at_5 = 0.500227272727, recall_at_1 = 0.123181818182, recall_at_10 = 1.0, loss = 4.09305.
INFO:tensorflow:Results after 560 steps (0.120 sec/batch): recall_at_2 = 0.20390625, recall_at_5 = 0.501116071429, recall_at_1 = 0.1234375, recall_at_10 = 1.0, loss = 4.08918.
INFO:tensorflow:Results after 570 steps (0.121 sec/batch): recall_at_2 = 0.205153508772, recall_at_5 = 0.501535087719, recall_at_1 = 0.124013157895, recall_at_10 = 1.0, loss = 4.08862.
INFO:tensorflow:Results after 580 steps (0.120 sec/batch): recall_at_2 = 0.204849137931, recall_at_5 = 0.500862068966, recall_at_1 = 0.123814655172, recall_at_10 = 1.0, loss = 4.08325.
INFO:tensorflow:Results after 590 steps (0.120 sec/batch): recall_at_2 = 0.204661016949, recall_at_5 = 0.5, recall_at_1 = 0.123411016949, recall_at_10 = 1.0, loss = 4.08492.
INFO:tensorflow:Results after 600 steps (0.120 sec/batch): recall_at_2 = 0.203958333333, recall_at_5 = 0.5, recall_at_1 = 0.123333333333, recall_at_10 = 1.0, loss = 4.08494.
INFO:tensorflow:Results after 610 steps (0.120 sec/batch): recall_at_2 = 0.204098360656, recall_at_5 = 0.500307377049, recall_at_1 = 0.122848360656, recall_at_10 = 1.0, loss = 4.07892.
INFO:tensorflow:Results after 620 steps (0.120 sec/batch): recall_at_2 = 0.203427419355, recall_at_5 = 0.5, recall_at_1 = 0.122177419355, recall_at_10 = 1.0, loss = 4.08029.
INFO:tensorflow:Results after 630 steps (0.120 sec/batch): recall_at_2 = 0.203571428571, recall_at_5 = 0.500396825397, recall_at_1 = 0.122123015873, recall_at_10 = 1.0, loss = 4.07678.
INFO:tensorflow:Results after 640 steps (0.120 sec/batch): recall_at_2 = 0.20263671875, recall_at_5 = 0.49921875, recall_at_1 = 0.12138671875, recall_at_10 = 1.0, loss = 4.07654.
INFO:tensorflow:Results after 650 steps (0.121 sec/batch): recall_at_2 = 0.202980769231, recall_at_5 = 0.5, recall_at_1 = 0.121538461538, recall_at_10 = 1.0, loss = 4.07632.
INFO:tensorflow:Results after 660 steps (0.120 sec/batch): recall_at_2 = 0.203598484848, recall_at_5 = 0.501325757576, recall_at_1 = 0.122064393939, recall_at_10 = 1.0, loss = 4.06906.
INFO:tensorflow:Results after 670 steps (0.120 sec/batch): recall_at_2 = 0.202985074627, recall_at_5 = 0.50177238806, recall_at_1 = 0.121735074627, recall_at_10 = 1.0, loss = 4.06498.
INFO:tensorflow:Results after 680 steps (0.120 sec/batch): recall_at_2 = 0.203400735294, recall_at_5 = 0.502481617647, recall_at_1 = 0.122058823529, recall_at_10 = 1.0, loss = 4.06492.
INFO:tensorflow:Results after 690 steps (0.120 sec/batch): recall_at_2 = 0.203623188406, recall_at_5 = 0.502355072464, recall_at_1 = 0.122192028986, recall_at_10 = 1.0, loss = 4.06477.
INFO:tensorflow:Results after 700 steps (0.120 sec/batch): recall_at_2 = 0.203839285714, recall_at_5 = 0.501785714286, recall_at_1 = 0.121964285714, recall_at_10 = 1.0, loss = 4.06561.
INFO:tensorflow:Results after 710 steps (0.120 sec/batch): recall_at_2 = 0.203873239437, recall_at_5 = 0.502024647887, recall_at_1 = 0.121654929577, recall_at_10 = 1.0, loss = 4.06292.
INFO:tensorflow:Results after 720 steps (0.120 sec/batch): recall_at_2 = 0.204340277778, recall_at_5 = 0.502256944444, recall_at_1 = 0.121875, recall_at_10 = 1.0, loss = 4.05911.
INFO:tensorflow:Results after 730 steps (0.121 sec/batch): recall_at_2 = 0.203852739726, recall_at_5 = 0.50102739726, recall_at_1 = 0.121489726027, recall_at_10 = 1.0, loss = 4.0559.
INFO:tensorflow:Results after 740 steps (0.119 sec/batch): recall_at_2 = 0.203631756757, recall_at_5 = 0.501351351351, recall_at_1 = 0.121199324324, recall_at_10 = 1.0, loss = 4.05687.
INFO:tensorflow:Results after 750 steps (0.121 sec/batch): recall_at_2 = 0.204083333333, recall_at_5 = 0.50125, recall_at_1 = 0.12125, recall_at_10 = 1.0, loss = 4.05769.
INFO:tensorflow:Results after 760 steps (0.120 sec/batch): recall_at_2 = 0.204605263158, recall_at_5 = 0.501644736842, recall_at_1 = 0.121710526316, recall_at_10 = 1.0, loss = 4.05516.
INFO:tensorflow:Results after 770 steps (0.120 sec/batch): recall_at_2 = 0.204383116883, recall_at_5 = 0.501217532468, recall_at_1 = 0.121672077922, recall_at_10 = 1.0, loss = 4.05528.
INFO:tensorflow:Results after 780 steps (0.120 sec/batch): recall_at_2 = 0.204407051282, recall_at_5 = 0.500801282051, recall_at_1 = 0.121955128205, recall_at_10 = 1.0, loss = 4.0588.
INFO:tensorflow:Results after 790 steps (0.119 sec/batch): recall_at_2 = 0.204509493671, recall_at_5 = 0.501819620253, recall_at_1 = 0.121993670886, recall_at_10 = 1.0, loss = 4.05593.
INFO:tensorflow:Results after 800 steps (0.120 sec/batch): recall_at_2 = 0.203984375, recall_at_5 = 0.501640625, recall_at_1 = 0.12140625, recall_at_10 = 1.0, loss = 4.06064.
INFO:tensorflow:Results after 810 steps (0.119 sec/batch): recall_at_2 = 0.204320987654, recall_at_5 = 0.502700617284, recall_at_1 = 0.12137345679, recall_at_10 = 1.0, loss = 4.05953.
INFO:tensorflow:Results after 820 steps (0.120 sec/batch): recall_at_2 = 0.204268292683, recall_at_5 = 0.502820121951, recall_at_1 = 0.121417682927, recall_at_10 = 1.0, loss = 4.05478.
INFO:tensorflow:Results after 830 steps (0.121 sec/batch): recall_at_2 = 0.204668674699, recall_at_5 = 0.502259036145, recall_at_1 = 0.121762048193, recall_at_10 = 1.0, loss = 4.05327.
INFO:tensorflow:Results after 840 steps (0.120 sec/batch): recall_at_2 = 0.204464285714, recall_at_5 = 0.502529761905, recall_at_1 = 0.121651785714, recall_at_10 = 1.0, loss = 4.05148.
INFO:tensorflow:Results after 850 steps (0.119 sec/batch): recall_at_2 = 0.204558823529, recall_at_5 = 0.5025, recall_at_1 = 0.121617647059, recall_at_10 = 1.0, loss = 4.05511.
INFO:tensorflow:Results after 860 steps (0.121 sec/batch): recall_at_2 = 0.20414244186, recall_at_5 = 0.502470930233, recall_at_1 = 0.121511627907, recall_at_10 = 1.0, loss = 4.06215.
INFO:tensorflow:Results after 870 steps (0.120 sec/batch): recall_at_2 = 0.204022988506, recall_at_5 = 0.501795977011, recall_at_1 = 0.121120689655, recall_at_10 = 1.0, loss = 4.06376.
INFO:tensorflow:Results after 880 steps (0.119 sec/batch): recall_at_2 = 0.204048295455, recall_at_5 = 0.501988636364, recall_at_1 = 0.12109375, recall_at_10 = 1.0, loss = 4.06375.
INFO:tensorflow:Results after 890 steps (0.121 sec/batch): recall_at_2 = 0.204143258427, recall_at_5 = 0.502738764045, recall_at_1 = 0.12106741573, recall_at_10 = 1.0, loss = 4.06116.
INFO:tensorflow:Results after 900 steps (0.120 sec/batch): recall_at_2 = 0.204722222222, recall_at_5 = 0.503125, recall_at_1 = 0.121458333333, recall_at_10 = 1.0, loss = 4.06297.
INFO:tensorflow:Results after 910 steps (0.121 sec/batch): recall_at_2 = 0.204532967033, recall_at_5 = 0.502884615385, recall_at_1 = 0.120879120879, recall_at_10 = 1.0, loss = 4.06331.
INFO:tensorflow:Results after 920 steps (0.120 sec/batch): recall_at_2 = 0.204619565217, recall_at_5 = 0.502785326087, recall_at_1 = 0.120855978261, recall_at_10 = 1.0, loss = 4.05974.
INFO:tensorflow:Results after 930 steps (0.120 sec/batch): recall_at_2 = 0.204637096774, recall_at_5 = 0.502688172043, recall_at_1 = 0.120564516129, recall_at_10 = 1.0, loss = 4.06012.
INFO:tensorflow:Results after 940 steps (0.119 sec/batch): recall_at_2 = 0.204654255319, recall_at_5 = 0.502460106383, recall_at_1 = 0.120478723404, recall_at_10 = 1.0, loss = 4.05967.
INFO:tensorflow:Results after 950 steps (0.120 sec/batch): recall_at_2 = 0.204802631579, recall_at_5 = 0.502302631579, recall_at_1 = 0.120460526316, recall_at_10 = 1.0, loss = 4.06118.
INFO:tensorflow:Results after 960 steps (0.120 sec/batch): recall_at_2 = 0.205338541667, recall_at_5 = 0.502799479167, recall_at_1 = 0.120703125, recall_at_10 = 1.0, loss = 4.05963.
INFO:tensorflow:Results after 970 steps (0.120 sec/batch): recall_at_2 = 0.205541237113, recall_at_5 = 0.502641752577, recall_at_1 = 0.120940721649, recall_at_10 = 1.0, loss = 4.0598.
INFO:tensorflow:Results after 980 steps (0.119 sec/batch): recall_at_2 = 0.205357142857, recall_at_5 = 0.502104591837, recall_at_1 = 0.121045918367, recall_at_10 = 1.0, loss = 4.05721.
INFO:tensorflow:Results after 990 steps (0.120 sec/batch): recall_at_2 = 0.205366161616, recall_at_5 = 0.502651515152, recall_at_1 = 0.12077020202, recall_at_10 = 1.0, loss = 4.06209.
INFO:tensorflow:Results after 1000 steps (0.120 sec/batch): recall_at_2 = 0.2048125, recall_at_5 = 0.5025625, recall_at_1 = 0.1203125, recall_at_10 = 1.0, loss = 4.05877.
INFO:tensorflow:Results after 1010 steps (0.120 sec/batch): recall_at_2 = 0.204579207921, recall_at_5 = 0.502722772277, recall_at_1 = 0.120235148515, recall_at_10 = 1.0, loss = 4.0585.
INFO:tensorflow:Results after 1020 steps (0.120 sec/batch): recall_at_2 = 0.204534313725, recall_at_5 = 0.502328431373, recall_at_1 = 0.120404411765, recall_at_10 = 1.0, loss = 4.05737.
INFO:tensorflow:Results after 1030 steps (0.119 sec/batch): recall_at_2 = 0.204126213592, recall_at_5 = 0.502063106796, recall_at_1 = 0.120084951456, recall_at_10 = 1.0, loss = 4.05443.
INFO:tensorflow:Results after 1040 steps (0.119 sec/batch): recall_at_2 = 0.204326923077, recall_at_5 = 0.50234375, recall_at_1 = 0.120252403846, recall_at_10 = 1.0, loss = 4.05527.
INFO:tensorflow:Results after 1050 steps (0.119 sec/batch): recall_at_2 = 0.204345238095, recall_at_5 = 0.501845238095, recall_at_1 = 0.12005952381, recall_at_10 = 1.0, loss = 4.05548.
INFO:tensorflow:Results after 1060 steps (0.121 sec/batch): recall_at_2 = 0.204304245283, recall_at_5 = 0.502240566038, recall_at_1 = 0.119811320755, recall_at_10 = 1.0, loss = 4.05943.
INFO:tensorflow:Results after 1070 steps (0.120 sec/batch): recall_at_2 = 0.204147196262, recall_at_5 = 0.501985981308, recall_at_1 = 0.119509345794, recall_at_10 = 1.0, loss = 4.05738.
INFO:tensorflow:Results after 1080 steps (0.120 sec/batch): recall_at_2 = 0.204050925926, recall_at_5 = 0.501736111111, recall_at_1 = 0.119328703704, recall_at_10 = 1.0, loss = 4.05586.
INFO:tensorflow:Results after 1090 steps (0.120 sec/batch): recall_at_2 = 0.204013761468, recall_at_5 = 0.501834862385, recall_at_1 = 0.119438073394, recall_at_10 = 1.0, loss = 4.05553.
INFO:tensorflow:Results after 1100 steps (0.120 sec/batch): recall_at_2 = 0.203977272727, recall_at_5 = 0.501931818182, recall_at_1 = 0.119772727273, recall_at_10 = 1.0, loss = 4.05839.
INFO:tensorflow:Results after 1110 steps (0.119 sec/batch): recall_at_2 = 0.203941441441, recall_at_5 = 0.50213963964, recall_at_1 = 0.119650900901, recall_at_10 = 1.0, loss = 4.05665.
INFO:tensorflow:Results after 1120 steps (0.120 sec/batch): recall_at_2 = 0.203850446429, recall_at_5 = 0.502566964286, recall_at_1 = 0.119698660714, recall_at_10 = 1.0, loss = 4.05864.
INFO:tensorflow:Results after 1130 steps (0.121 sec/batch): recall_at_2 = 0.203871681416, recall_at_5 = 0.502986725664, recall_at_1 = 0.119579646018, recall_at_10 = 1.0, loss = 4.05617.
INFO:tensorflow:Results after 1140 steps (0.119 sec/batch): recall_at_2 = 0.204221491228, recall_at_5 = 0.503015350877, recall_at_1 = 0.119736842105, recall_at_10 = 1.0, loss = 4.05588.
INFO:tensorflow:Results after 1150 steps (0.120 sec/batch): recall_at_2 = 0.203858695652, recall_at_5 = 0.503315217391, recall_at_1 = 0.119510869565, recall_at_10 = 1.0, loss = 4.05729.
INFO:tensorflow:Results after 1160 steps (0.121 sec/batch): recall_at_2 = 0.204525862069, recall_at_5 = 0.503771551724, recall_at_1 = 0.119665948276, recall_at_10 = 1.0, loss = 4.05297.
INFO:tensorflow:Results after 1170 steps (0.119 sec/batch): recall_at_2 = 0.204433760684, recall_at_5 = 0.503792735043, recall_at_1 = 0.119551282051, recall_at_10 = 1.0, loss = 4.05068.
INFO:tensorflow:Results after 1180 steps (0.119 sec/batch): recall_at_2 = 0.204343220339, recall_at_5 = 0.504025423729, recall_at_1 = 0.11938559322, recall_at_10 = 1.0, loss = 4.04996.
INFO:tensorflow:Results after 1190 steps (0.121 sec/batch): recall_at_2 = 0.204359243697, recall_at_5 = 0.504149159664, recall_at_1 = 0.119222689076, recall_at_10 = 1.0, loss = 4.04873.
INFO:tensorflow:Results after 1200 steps (0.120 sec/batch): recall_at_2 = 0.204270833333, recall_at_5 = 0.504166666667, recall_at_1 = 0.119114583333, recall_at_10 = 1.0, loss = 4.04727.
INFO:tensorflow:Results after 1210 steps (0.119 sec/batch): recall_at_2 = 0.20444214876, recall_at_5 = 0.504132231405, recall_at_1 = 0.119318181818, recall_at_10 = 1.0, loss = 4.04796.
INFO:tensorflow:Results after 1220 steps (0.119 sec/batch): recall_at_2 = 0.204405737705, recall_at_5 = 0.503944672131, recall_at_1 = 0.119364754098, recall_at_10 = 1.0, loss = 4.04897.
W tensorflow/core/framework/op_kernel.cc:968] Invalid argument: Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
Traceback (most recent call last):
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 972, in _do_call
return fn(*args)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 954, in _run_fn
status, run_metadata)
File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
next(self.gen)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/framework/errors.py", line 463, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors.InvalidArgumentError: Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
[[Node: recall_at_10/ToInt64/_97 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_274_recall_at_10/ToInt64", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "udc_train.py", line 70, in <module>
tf.app.run()
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "udc_train.py", line 67, in main
estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 333, in fit
max_steps=max_steps)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 708, in _train_model
max_steps=max_steps)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 285, in _monitored_train
None)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 368, in run
run_metadata=run_metadata)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 521, in run
run_metadata=run_metadata)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 488, in run
return self._sess.run(*args, **kwargs)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 625, in run
hook in outputs else None))
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 1215, in after_run
induce_stop = m.step_end(self._last_step, result)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 411, in step_end
return self.every_n_step_end(step, output)
File "udc_train.py", line 64, in every_n_step_end
steps=None)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 399, in evaluate
name=name)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 771, in _evaluate_model
max_steps=steps)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 738, in evaluate
session.run(update_op, feed_dict=feed_dict)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 915, in _run
feed_dict_string, options, run_metadata)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 965, in _do_run
target_list, options, run_metadata)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 985, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
[[Node: recall_at_10/ToInt64/_97 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_274_recall_at_10/ToInt64", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Caused by op 'prediction/logistic_loss/mul', defined at:
File "udc_train.py", line 70, in <module>
tf.app.run()
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "udc_train.py", line 67, in main
estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 333, in fit
max_steps=max_steps)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 708, in _train_model
max_steps=max_steps)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 285, in _monitored_train
None)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 368, in run
run_metadata=run_metadata)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 521, in run
run_metadata=run_metadata)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 488, in run
return self._sess.run(*args, **kwargs)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitored_session.py", line 625, in run
hook in outputs else None))
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 1215, in after_run
induce_stop = m.step_end(self._last_step, result)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 411, in step_end
return self.every_n_step_end(step, output)
File "udc_train.py", line 64, in every_n_step_end
steps=None)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 399, in evaluate
name=name)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 760, in _evaluate_model
eval_dict = self._get_eval_ops(features, targets, metrics)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 991, in _get_eval_ops
predictions, loss, _ = self._call_model_fn(features, targets, ModeKeys.EVAL)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 946, in _call_model_fn
return self._model_fn(features, targets, mode=mode)
File "/home/ubuntu/bin/chatbot-retrieval/udc_model.py", line 83, in model_fn
tf.concat(0, all_targets))
File "/home/ubuntu/bin/chatbot-retrieval/models/dual_encoder.py", line 81, in dual_encoder_model
losses = tf.nn.sigmoid_cross_entropy_with_logits(logits, tf.to_float(targets))
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/ops/nn.py", line 448, in sigmoid_cross_entropy_with_logits
return math_ops.add(relu_logits - logits * targets,
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/ops/math_ops.py", line 751, in binary_op_wrapper
return func(x, y, name=name)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/ops/math_ops.py", line 910, in _mul_dispatch
return gen_math_ops.mul(x, y, name=name)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1519, in mul
result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/ubuntu/.virtualenvs/chatbot-retrieval/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 1298, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
[[Node: recall_at_10/ToInt64/_97 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_274_recall_at_10/ToInt64", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Are there any command-line arguments that I need to specify in order to correct this error?
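A hedged sketch of the arithmetic behind the shape mismatch, assuming (this is a diagnosis, not repo-confirmed) that each evaluation example expands into 10 (context, utterance) pairs — the ground truth plus 9 distractors — so a full eval batch of 16 examples yields 160 rows while a short final batch of 8 examples yields only 80:

```python
# Assumption: eval examples are expanded into candidate pairs before scoring.
NUM_CANDIDATES = 10      # 1 true response + 9 distractors per eval example
EVAL_BATCH_SIZE = 16     # examples per evaluation batch

full_batch_rows = EVAL_BATCH_SIZE * NUM_CANDIDATES   # rows in a full batch
partial_batch_rows = 8 * NUM_CANDIDATES              # rows in a short last batch

# An op built for the fixed full-batch shape then sees [80,1] vs. [160,1].
# If this diagnosis fits, a workaround is to make the validation-set size an
# exact multiple of the eval batch size, or to lower the eval batch size.
```
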

TypeError: object() takes no parameters

Hi,
While I am trying to execute udc_train.py, I am facing this issue. Can anyone suggest how to solve it?
Thanks :)

INFO:tensorflow:Using config: {'_save_summary_steps': 100, '_task_id': 0, '_save_checkpoints_secs': 600, '_evaluation_master': '', '_tf_config': gpu_options {
per_process_gpu_memory_fraction: 1
}
, '_keep_checkpoint_every_n_hours': 10000, '_master': '', '_is_chief': True, '_environment': 'local', '_num_ps_replicas': 0, '_keep_checkpoint_max': 5, '_model_dir': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fe3298dd7b8>, '_task_type': None, '_tf_random_seed': None, '_num_worker_replicas': 0, '_save_checkpoints_steps': None}
Traceback (most recent call last):
File "udc_train.py", line 66, in <module>
tf.app.run()
File "/home/akhil_gollapudi5/ex/envname/lib/python3.4/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "udc_train.py", line 60, in main
metrics=eval_metrics)
TypeError: object() takes no parameters

'moving_average_decay' keyword argument

I got the following error when running your code in TensorFlow:

TypeError: optimize_loss() got an unexpected keyword argument 'moving_average_decay'

The code producing the error is:
def create_train_op(loss, hparams):
  train_op = tf.contrib.layers.optimize_loss(
      loss=loss,
      global_step=tf.contrib.framework.get_global_step(),
      learning_rate=hparams.learning_rate,
      clip_gradients=10.0,
      moving_average_decay=None,  #### error here
      optimizer=hparams.optimizer)
  return train_op

I searched for the doc of "tf.contrib.layers.optimize_loss" but could not find the keyword. Can you please fix this?
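Newer `tf.contrib.layers.optimize_loss` builds no longer accept `moving_average_decay`, so the simplest fix is to delete that argument. For code that must run across versions, here is a hedged, stand-alone sketch (the `optimize_loss` stub below is a stand-in, not the real API) of dropping keywords a callable no longer supports:

```python
import inspect

def call_with_supported_kwargs(func, **kwargs):
    """Call func, silently dropping keyword args its signature lacks."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**kwargs)  # func takes **kwargs: pass everything through
    return func(**{k: v for k, v in kwargs.items() if k in params})

# Stand-in for the newer optimize_loss signature (no moving_average_decay).
def optimize_loss(loss, global_step=None, learning_rate=None,
                  clip_gradients=None, optimizer=None):
    return {"loss": loss, "lr": learning_rate}

result = call_with_supported_kwargs(
    optimize_loss,
    loss=1.0, learning_rate=0.001, clip_gradients=10.0,
    moving_average_decay=None,  # dropped: not in the new signature
    optimizer="Adam")
```
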

Error in tokenizing data.

In Jupyter notebook, I wrote the following code:
`import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
matplotlib.style.use('ggplot')

# Load Data

train_df = pd.read_csv("C:\Users\Sachin\Downloads\udc\data\train.csv")
train_df.Label = train_df.Label.astype('category')

test_df = pd.read_csv("C:\Users\Sachin\Downloads\udc\data\test.csv")
validation_df = pd.read_csv("C:\Users\Sachin\Downloads\udc\data\valid.csv")
train_df.describe()`

Error:
ParserError Traceback (most recent call last)
in <module>()
4 matplotlib.style.use('ggplot')
5 # Load Data
----> 6 train_df = pd.read_csv("C:\Users\Sachin\Downloads\udc\data\train.csv")
7 train_df.Label = train_df.Label.astype('category')
ParserError: Error tokenizing data. C error: EOF inside string starting at line 623956

I am using python3.
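Two separate things may be going on above. "EOF inside string" usually means an unterminated quoted field somewhere in the CSV, and the Windows paths should be raw strings so sequences like `\U` and `\t` are not treated as escapes. A hedged stdlib-only sketch of the quoting failure mode (no pandas needed; the sample row is invented):

```python
import csv
import io

# Hypothetical malformed row: an opening quote with no closing quote makes
# the parser read to end-of-file looking for the end of the field, which is
# exactly the "EOF inside string" condition.
malformed = 'Context,Utterance,Label\n"hello there,hi,1\n'

rows = list(csv.reader(io.StringIO(malformed)))
# The unterminated quote swallows the rest of the input into a single field.

# Windows paths: use a raw string (or forward slashes) to avoid escape issues.
path = r"C:\Users\Sachin\Downloads\udc\data\train.csv"  # path from the report
```

If the data itself contains stray quotes, passing `quoting=csv.QUOTE_NONE` to the reader (or to `pd.read_csv`) is one hedged workaround to try.
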

udc_predict.py is throwing an error

INPUT_CONTEXT = "sir iam manohar please respond eos "
POTENTIAL_RESPONSES = ["please give me one or two minutes to check this for you eos ","hi eos"," please give eos ","i have checked and see that you have contacted earlier for the same issue eos ","the replacement policy is applicable from the date of product delivered and this has crossed the sellerreplacementperiod eos hope you understand eos is there anything else that i can help you with today eos ","due to unforeseen reason your return request has been cancelled eos not to worry i have created a new return request for you eos is there anything else that i can help you with today eos ","i request you to confirm when you get that reward points eos ","sorry to keep you waiting its taking a little while to get this information eos thank you for your kind patience eos ","thank you for sharing the address eos please wait for a minute or two while i check the details eos "]
When I run udc_predict.py it gives me following error:

Context: sir iam manohar please respond eos
Traceback (most recent call last):
File "udc_predict.py", line 58, in <module>
print("{}: {:g}".format(r, prob[0,0]))
TypeError: 'generator' object has no attribute '__getitem__'

Please let me know if any other information is needed
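In newer `tf.contrib.learn` releases, `Estimator.predict` returns a generator rather than an array, so `prob[0,0]` fails. A hedged sketch of the fix with a stand-in generator (`fake_predict` is hypothetical, not repo code):

```python
def fake_predict():           # stand-in for estimator.predict(...)
    yield [[0.42]]            # one batch of probabilities per candidate

prob = next(fake_predict())   # materialize the first prediction ...
score = prob[0][0]            # ... then index into it as before
```
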

Cannot evaluate the model using udc_test.py

Got the following errors. Do I have to change anything in udc_test.py?

Traceback (most recent call last):
File "udc_test.py", line 39, in <module>
estimator.evaluate(input_fn=input_fn_test, steps=None, metrics=eval_metrics)
File "/Library/Python/2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 356, in evaluate
name=name)
File "/Library/Python/2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 641, in _evaluate_model
max_steps=steps)
File "/Library/Python/2.7/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 773, in evaluate
_write_summary_results(output_dir, eval_results, current_global_step)
File "/Library/Python/2.7/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 613, in _write_summary_results
_eval_results_to_str(eval_results))
File "/Library/Python/2.7/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 607, in _eval_results_to_str
return ', '.join('%s = %s' % (k, v) for k, v in eval_results.items())
AttributeError: 'NoneType' object has no attribute 'items'
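The AttributeError comes from `_eval_results_to_str` iterating over a `None` result dict. As a hedged sketch (not the library's actual code path), the defensive version of that helper would look like:

```python
def eval_results_to_str(eval_results):
    # Tolerate evaluate() handing back None instead of a metrics dict.
    eval_results = eval_results or {}
    return ', '.join('%s = %s' % (k, v) for k, v in sorted(eval_results.items()))
```
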

(udc_test.py) Incompatible shapes: [80,1] vs. [160,1] error even after changing the batch size

Getting the below error, please help.

`INFO:tensorflow:Results after 1180 steps (0.934 sec/batch): loss = 0.493843, recall_at_1 = 0.399258474576, recall_at_10 = 1.0, recall_at_2 = 0.592213983051, recall_at_5 = 0.868485169492.
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1022, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
status, run_metadata)
File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
next(self.gen)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](prediction/Squeeze, prediction/ToFloat)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "udc_test.py", line 39, in <module>
estimator.evaluate(input_fn=input_fn_test, steps=None, metrics=eval_metrics)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 247, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 436, in evaluate
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 816, in _evaluate_model
max_steps=steps)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 754, in evaluate
session.run(update_op, feed_dict=feed_dict)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 767, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 965, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1015, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1035, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](prediction/Squeeze, prediction/ToFloat)]]

Caused by op 'prediction/logistic_loss/mul', defined at:
File "udc_test.py", line 39, in <module>
estimator.evaluate(input_fn=input_fn_test, steps=None, metrics=eval_metrics)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 247, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 436, in evaluate
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 800, in _evaluate_model
eval_ops = self._get_eval_ops(features, labels, metrics)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1106, in _get_eval_ops
features, labels, model_fn_lib.ModeKeys.EVAL)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1047, in _call_model_fn
model_fn_results = self._model_fn(features, labels, mode=mode)
File "/home/rohitg/Documents/chatbot-retrieval-master/udc_model.py", line 83, in model_fn
tf.concat(0, all_targets))
File "/home/rohitg/Documents/chatbot-retrieval-master/models/dual_encoder.py", line 81, in dual_encoder_model
losses = tf.nn.sigmoid_cross_entropy_with_logits(logits, tf.to_float(targets))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_impl.py", line 162, in sigmoid_cross_entropy_with_logits
return math_ops.add(relu_logits - logits * targets,
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 816, in binary_op_wrapper
return func(x, y, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 1037, in _mul_dispatch
return gen_math_ops.mul(x, y, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 1613, in mul
result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2371, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1258, in __init__
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](prediction/Squeeze, prediction/ToFloat)]]
`

Is something wrong with the cell?

Hi Denny, should we use
"cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE, input_size=EMBEDDING_SIZE)" instead of "cell = tf.nn.rnn_cell.BasicLSTMCell(LSTM_CELL_SIZE)"?

AttributeError: 'module' object has no attribute 'Estimator'

envy@ub1404:/media/envy/data1t/os_prj/github/chatbot-retrieval$ python udc_train.py
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
File "udc_train.py", line 70, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "udc_train.py", line 37, in main
estimator = tf.contrib.learn.Estimator(
AttributeError: 'module' object has no attribute 'Estimator'
envy@ub1404:/media/envy/data1t/os_prj/github/chatbot-retrieval$
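`tf.contrib.learn.Estimator` only exists in newer TensorFlow releases, so this usually means the installed TensorFlow predates the API that udc_train.py uses; upgrading TensorFlow is the usual fix. A hedged sketch of a fail-fast guard, demonstrated with stand-in modules rather than a real TensorFlow import:

```python
import types

def require_estimator(learn_module):
    """Raise a clear error when tf.contrib.learn predates the Estimator class."""
    if not hasattr(learn_module, "Estimator"):
        raise ImportError(
            "tf.contrib.learn has no Estimator class; "
            "please upgrade TensorFlow before running udc_train.py.")
    return learn_module.Estimator

# Stand-ins for old and new tf.contrib.learn builds (hypothetical):
old_learn = types.SimpleNamespace()                  # pre-Estimator build
new_learn = types.SimpleNamespace(Estimator=object)  # build that has it
```
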

Get warning - Out of range after 1110 steps

Description

OS: Mac OSX
Tensorflow Version: 0.9.0, CPU-Only

After running for two hours, the program seems to have stopped, producing the results below.
So, I think it is because of the warning here. What does it mean?


INFO:tensorflow:Results after 1080 steps (3.808 sec/batch): loss = 4.96496, recall_at_5 = 0.494155092593, recall_at_10 = 1.0, recall_at_2 = 0.225347222222, recall_at_1 = 0.166377314815.
INFO:tensorflow:Results after 1090 steps (3.915 sec/batch): loss = 7.98244, recall_at_5 = 0.49375, recall_at_10 = 1.0, recall_at_2 = 0.225229357798, recall_at_1 = 0.166055045872.
INFO:tensorflow:Results after 1100 steps (4.357 sec/batch): loss = 5.99536, recall_at_5 = 0.493295454545, recall_at_10 = 1.0, recall_at_2 = 0.225056818182, recall_at_1 = 0.165852272727.
INFO:tensorflow:Results after 1110 steps (4.098 sec/batch): loss = 6.19901, recall_at_5 = 0.493130630631, recall_at_10 = 1.0, recall_at_2 = 0.224831081081, recall_at_1 = 0.165709459459.
W tensorflow/core/framework/op_kernel.cc:936] Out of range: RandomShuffleQueue '_5_read_batch_features_eval/random_shuffle_queue' is closed and has insufficient elements (requested 16, current size 8)
         [[Node: read_batch_features_eval = QueueDequeueMany[_class=["loc:@read_batch_features_eval/random_shuffle_queue"], component_types=[DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_eval/random_shuffle_queue, read_batch_features_eval/n)]]
W tensorflow/core/framework/op_kernel.cc:936] Out of range: RandomShuffleQueue '_5_read_batch_features_eval/random_shuffle_queue' is closed and has insufficient elements (requested 16, current size 8)
         [[Node: read_batch_features_eval = QueueDequeueMany[_class=["loc:@read_batch_features_eval/random_shuffle_queue"], component_types=[DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_eval/random_shuffle_queue, read_batch_features_eval/n)]]
W tensorflow/core/framework/op_kernel.cc:936] Out of range: RandomShuffleQueue '_5_read_batch_features_eval/random_shuffle_queue' is closed and has insufficient elements (requested 16, current size 8)
         [[Node: read_batch_features_eval = QueueDequeueMany[_class=["loc:@read_batch_features_eval/random_shuffle_queue"], component_types=[DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_eval/random_shuffle_queue, read_batch_features_eval/n)]]
W tensorflow/core/framework/op_kernel.cc:936] Out of range: RandomShuffleQueue '_5_read_batch_features_eval/random_shuffle_queue' is closed and has insufficient elements (requested 16, current size 8)
         [[Node: read_batch_features_eval = QueueDequeueMany[_class=["loc:@read_batch_features_eval/random_shuffle_queue"], component_types=[DT_STRING, DT_STRING], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](read_batch_features_eval/random_shuffle_queue, read_batch_features_eval/n)]]
INFO:tensorflow:Input queue is exhausted.
INFO:tensorflow:Saving evaluation summary for 0 step: loss = 6.19901, recall_at_5 = 0.493130630631, recall_at_10 = 1.0, recall_at_2 = 0.224831081081, recall_at_1 = 0.165709459459
INFO:tensorflow:Step 1: loss = 4.48393
INFO:tensorflow:training step 100, loss = 1.43115 (4.198 sec/batch).
INFO:tensorflow:Step 101: loss = 1.58307
INFO:tensorflow:training step 200, loss = 0.88204 (5.462 sec/batch).
INFO:tensorflow:Step 201: loss = 0.771809
INFO:tensorflow:training step 300, loss = 0.82967 (4.408 sec/batch).
INFO:tensorflow:Step 301: loss = 0.808524
INFO:tensorflow:training step 400, loss = 0.69053 (4.296 sec/batch).
INFO:tensorflow:Step 401: loss = 0.725083
INFO:tensorflow:training step 500, loss = 0.70153 (4.188 sec/batch).
INFO:tensorflow:Step 501: loss = 0.687135
INFO:tensorflow:training step 600, loss = 0.70823 (4.896 sec/batch).
INFO:tensorflow:Step 601: loss = 0.662524
INFO:tensorflow:training step 700, loss = 0.69382 (4.257 sec/batch).
INFO:tensorflow:Step 701: loss = 0.702078
INFO:tensorflow:training step 800, loss = 0.69694 (4.268 sec/batch).
INFO:tensorflow:Step 801: loss = 0.685443

Loading word embeddings in the validation phase

The word embeddings are initialized at the beginning and tuned during training. When evaluating the model, I see the following message in the output:

INFO:tensorflow:Saving evaluation summary for 17400 step: loss = 1.10763,...
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.

I'm interested to know whether the learned embeddings are stored in the saved checkpoints.
If so, why does the above message appear in the evaluation phase?
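In short: the tuned embeddings are saved and restored with every checkpoint; the log line only reflects how the *initializer* is built, which happens every time the model function runs, in training and in evaluation alike. A minimal stand-in to illustrate (hypothetical names, no TensorFlow required):

```python
# Illustrative sketch (not the repo's actual code): the random initializer is
# always constructed -- that is when "No glove/vocab path specificed, starting
# with random embeddings" is logged -- but if a checkpoint exists, the
# Estimator restores the trained values over it, so the tuned embeddings are
# what is actually used at evaluation time.
def init_embeddings(checkpoint_value=None, random_init=lambda: [0.0, 0.0]):
    value = random_init()            # initializer built here (log line fires)
    if checkpoint_value is not None:
        value = checkpoint_value     # checkpoint restore overwrites it
    return value

print(init_embeddings(checkpoint_value=[0.3, -0.1]))  # [0.3, -0.1]
```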

ValueError: Dimension must be 4 but is 3 for 'rnn/transpose' (op: 'Transpose') with input shapes: [20,?,160,100], [3].

chatbot-retrieval-master/udc_train.py
` INFO:tensorflow:Using config: {'_evaluation_master': '', '_num_ps_replicas': 0, '_tf_config': gpu_options {
per_process_gpu_memory_fraction: 1.0
}
, '_save_summary_steps': 100, '_master': '', '_keep_checkpoint_max': 5, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fc3e5215588>, '_save_checkpoints_steps': None, '_task_id': 0, '_keep_checkpoint_every_n_hours': 10000, '_task_type': None, '_tf_random_seed': None, '_num_worker_replicas': 0, '_is_chief': True, '_environment': 'local', '_model_dir': None, '_save_checkpoints_secs': 600}
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py:267: BaseMonitor.init (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
INFO:tensorflow:Create CheckpointSaverHook.
2017-06-16 12:26:23.467688: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-16 12:26:23.467713: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-16 12:26:23.467720: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-16 12:26:23.467725: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-16 12:26:23.467729: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
INFO:tensorflow:Saving checkpoints for 1 into /home/yogesh/l7Project/chatbot-retrieval-master/runs/1497596182/model.ckpt.
INFO:tensorflow:step = 1, loss = 0.715735
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/common_shapes.py", line 671, in _call_cpp_shape_fn_impl
input_tensors_as_shapes, status)
File "/usr/lib/python3.5/contextlib.py", line 66, in exit
next(self.gen)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension must be 4 but is 3 for 'rnn/transpose' (op: 'Transpose') with input shapes: [20,?,160,100], [3].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/yogesh/l7Project/chatbot-retrieval-master/udc_train.py", line 64, in
tf.app.run()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/yogesh/l7Project/chatbot-retrieval-master/udc_train.py", line 61, in main
estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 281, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 430, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 978, in _train_model
_, loss = mon_sess.run([model_fn_ops.train_op, model_fn_ops.loss])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 484, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 820, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 776, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/monitored_session.py", line 938, in run
run_metadata=run_metadata))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 1155, in after_run
induce_stop = m.step_end(self._last_step, result)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 356, in step_end
return self.every_n_step_end(step, output)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/monitors.py", line 662, in every_n_step_end
name=self.name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 281, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 518, in evaluate
log_progress=log_progress)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 804, in _evaluate_model
eval_dict = self._get_eval_ops(features, labels, metrics).eval_metric_ops
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1160, in _get_eval_ops
features, labels, model_fn_lib.ModeKeys.EVAL)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1103, in call_model_fn
model_fn_results = self.model_fn(features, labels, **kwargs)
File "/home/yogesh/l7Project/chatbot-retrieval-master/udc_model.py", line 85, in model_fn
tf.concat([all_targets], 0))
File "/home/yogesh/l7Project/chatbot-retrieval-master/models/dual_encoder.py", line 57, in dual_encoder_model
dtype=tf.float32)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py", line 497, in dynamic_rnn
for input
in flat_input)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py", line 497, in
for input
in flat_input)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py", line 1270, in transpose
ret = gen_array_ops.transpose(a, perm, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3721, in transpose
result = _op_def_lib.apply_op("Transpose", x=x, perm=perm, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2338, in create_op
set_shapes_for_outputs(ret)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1719, in set_shapes_for_outputs
shapes = shape_func(op)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1669, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
debug_python_shape_fn, require_shape_fn)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
raise ValueError(err.message)
ValueError: Dimension must be 4 but is 3 for 'rnn/transpose' (op: 'Transpose') with input shapes: [20,?,160,100], [3]. `

Process finished with exit code 1

change in udc_model.py as below :

probs, loss = model_impl(
    hparams,
    mode,
    tf.concat([all_contexts], 0),
    tf.concat([all_context_lens], 0),
    tf.concat([all_utterances], 0),
    tf.concat([all_utterance_lens], 0),
    tf.concat([all_targets], 0))

and dual_encoder.py contains this :

tf.concat([context_embedded, utterance_embedded], 0),
sequence_length=tf.concat([context_len, utterance_len], 0),
dtype=tf.float32)

encoding_context, encoding_utterance = tf.split(rnn_states.h, 2, 0)

How to predict all responses at one time?

In udc_predict.py at line 57, the responses are predicted one at a time; this creates a new GPU device on every call, which takes some time (1 sec or more).

How can I predict all responses as one array in a single call?
The TensorFlow documentation for the input argument says:

Args:

**x**: Matrix of shape [n_samples, n_features...]. Can be iterator that returns arrays of features. The training input samples for fitting the model. If set, input_fn must be None.
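One way around the per-call overhead is to score every candidate in a single predict call: pair the context with each candidate response, stack the pairs into one batch, and pass that batch to the estimator once. A sketch of the batching step (pure Python; `build_batch` is a hypothetical helper name, and the actual scoring stays a single `estimator.predict(...)` call over the batch):

```python
# Hypothetical helper: pair one context with every candidate response so the
# dual encoder can score all of them in a single estimator.predict(...) call,
# instead of rebuilding the graph once per candidate.
def build_batch(context, candidates):
    return [(context, candidate) for candidate in candidates]

batch = build_batch("how do I check my ubuntu version?",
                    ["run lsb_release -a", "reboot the machine", "use apt-get"])
print(len(batch))  # 3
```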

Issue with dict.iteritems

OK, so this one sits between your code and a potential TensorFlow bug. I'm just posting my quick fix for other people to see.

So if you get the "AttributeError: 'dict' object has no attribute 'iteritems'" error when running udc_train.py, the simple fix is to change all occurrences of 'iteritems' in graph_io.py to 'items' (using sed, for example).

I know it is not a good long-term solution, but it will at least let you run the script temporarily.
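Rather than patching graph_io.py by hand, the same effect can be had with a small Python 2/3 compatibility shim (`iteritems_compat` is a hypothetical helper name):

```python
# dict.iteritems() existed only in Python 2; dict.items() works on both
# (a list on Python 2, a view on Python 3), which is why replacing
# 'iteritems' with 'items' fixes the AttributeError.
def iteritems_compat(d):
    try:
        return d.iteritems()        # Python 2
    except AttributeError:
        return iter(d.items())      # Python 3

print(sorted(iteritems_compat({"a": 1, "b": 2})))  # [('a', 1), ('b', 2)]
```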

No module named 'tensorflow.contrib.learn.python.learn.metric_spec

I am using TensorFlow 0.10 and Python 3.5 and get the following error while running udc_train.py:

Traceback (most recent call last):
File "udc_train.py", line 7, in <module>
import udc_metrics
File "/home/sudeep/git_hub/chatbot-retrieval/udc_metrics.py", line 3, in <module>
from tensorflow.contrib.learn.python.learn.metric_spec import MetricSpec
ImportError: No module named 'tensorflow.contrib.learn.python.learn.metric_spec'
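`metric_spec` does not exist in TensorFlow 0.10's `tf.contrib.learn`; the straightforward fix is to upgrade TensorFlow to a release that ships it. If the script must degrade gracefully on older installs, a guarded import at least reports the real cause instead of a bare traceback (a sketch, not the repo's code):

```python
# Guarded import: on TF builds that predate metric_spec (e.g. 0.10) this
# raises ImportError, so catch it and point at the actual fix.
try:
    from tensorflow.contrib.learn.python.learn.metric_spec import MetricSpec
except ImportError as err:
    MetricSpec = None
    print("MetricSpec unavailable -- upgrade TensorFlow: %s" % err)

print(MetricSpec is None or callable(MetricSpec))  # True either way
```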

why

Traceback (most recent call last):
File "udc_train.py", line 4, in
import tensorflow as tf
File "/usr/local/lib/python2.7/dist-packages/tensorflow/init.py", line 23, in
from tensorflow.python import *
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/init.py", line 45, in
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in
_pywrap_tensorflow = swig_import_helper()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
ImportError: libcudart.so.7.5: cannot open shared object file: No such file or directory
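The loader cannot find libcudart.so.7.5, which usually means the CUDA 7.5 runtime either is not installed or is not on the dynamic linker path. A typical fix is below, assuming CUDA is installed under /usr/local/cuda-7.5 (adjust the path for your install); alternatively, install the CPU-only TensorFlow wheel instead:

```shell
# Point the dynamic linker at the CUDA 7.5 runtime libraries before running
# udc_train.py; without this, importing tensorflow fails with
# "ImportError: libcudart.so.7.5: cannot open shared object file".
export CUDA_HOME="/usr/local/cuda-7.5"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"
```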

ValueError: Only call `sigmoid_cross_entropy_with_logits` with named arguments (labels=..., logits=..., ...)

Please help me solve the ValueError; the error logs follow:

abhilash@abhilash-Inspiron-N5010:~/Documents/chat bot/chatbot-retrieval-master$ python udc_train.py
INFO:tensorflow:Using config: {'_model_dir': None, '_save_checkpoints_secs': 600, '_num_ps_replicas': 0, '_keep_checkpoint_max': 5, '_tf_random_seed': None, '_task_type': None, '_environment': 'local', '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f6719eb9090>, '_tf_config': gpu_options {
per_process_gpu_memory_fraction: 1.0
}
, '_num_worker_replicas': 0, '_task_id': 0, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_evaluation_master': '', '_keep_checkpoint_every_n_hours': 10000, '_master': ''}
WARNING:tensorflow:From /home/abhilash/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/monitors.py:267: init (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
Traceback (most recent call last):
File "udc_train.py", line 64, in
tf.app.run()
File "/home/abhilash/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "udc_train.py", line 61, in main
estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
File "/home/abhilash/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 281, in new_func
return func(*args, **kwargs)
File "/home/abhilash/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 430, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/home/abhilash/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 927, in _train_model
model_fn_ops = self._get_train_ops(features, labels)
File "/home/abhilash/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1132, in _get_train_ops
return self._call_model_fn(features, labels, model_fn_lib.ModeKeys.TRAIN)
File "/home/abhilash/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1103, in _call_model_fn
model_fn_results = self._model_fn(features, labels, **kwargs)
File "/home/abhilash/Documents/chat bot/chatbot-retrieval-master/udc_model.py", line 39, in model_fn
targets)
File "/home/abhilash/Documents/chat bot/chatbot-retrieval-master/models/dual_encoder.py", line 81, in dual_encoder_model
losses = tf.nn.sigmoid_cross_entropy_with_logits(logits, tf.to_float(targets))
File "/home/abhilash/.local/lib/python2.7/site-packages/tensorflow/python/ops/nn_impl.py", line 147, in sigmoid_cross_entropy_with_logits
_sentinel, labels, logits)
File "/home/abhilash/.local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 1562, in _ensure_xent_args
"named arguments (labels=..., logits=..., ...)" % name)
ValueError: Only call sigmoid_cross_entropy_with_logits with named arguments (labels=..., logits=..., ...)
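The fix is exactly what the error asks for: call the op with keyword arguments, i.e. `tf.nn.sigmoid_cross_entropy_with_logits(labels=tf.to_float(targets), logits=logits)` in models/dual_encoder.py. For reference, the op computes the numerically stable form `max(x, 0) - x*z + log(1 + exp(-|x|))` element-wise; a pure-Python stand-in (no TensorFlow needed) to sanity-check:

```python
import math

# Pure-Python stand-in for tf.nn.sigmoid_cross_entropy_with_logits, using the
# same numerically stable formula: max(x, 0) - x*z + log(1 + exp(-|x|)),
# where x is the logit and z the target label.
def sigmoid_cross_entropy_with_logits(labels, logits):
    return [max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
            for z, x in zip(labels, logits)]

print(sigmoid_cross_entropy_with_logits(labels=[1.0], logits=[0.0]))  # [log(2) ~ 0.6931]
```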

Get an error - segmentation fault python udc_train.py

Description

Trying to run the program on Mac OS X El Capitan with Python 2.7 and tensorflow-0.9.0; TF is built from source (commit id 71f6bb336e5e11d6da2cedac6ba1c992ad9992bd). Also, CUDA and cuDNN are configured to support the GPU.

tf is built with

bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

But I get the following error when executing python udc_train.py.

[screenshot of the error in the original issue]

The data is downloaded and extracted as below.
[screenshot in the original issue]

And the packages in README.md are installed.

Python 2.7.6 hit "unsupported pickle protocol: 3"

This might be a tensorflow issue.
xxx$ python udc_predict.py --model_dir=./runs/1468290997/
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
File "udc_predict.py", line 27, in
FLAGS.vocab_processor_file)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/preprocessing/text.py", line 228, in restore
return pickle.loads(f.read())
ValueError: unsupported pickle protocol: 3

Looks like the default pickle protocol for TensorFlow is 3, which is not supported by Python 2.x.
From the Python CLI, I see the highest protocol supported is 2.
xx$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import pickle
>>> print pickle.HIGHEST_PROTOCOL
2
>>> import cPickle
>>> print cPickle.HIGHEST_PROTOCOL
2
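The vocab processor was pickled on Python 3 with protocol 3, which Python 2 cannot read. Either run both training and prediction under the same Python major version, or re-save the file pinning protocol 2, the highest protocol Python 2.x understands:

```python
import pickle

# Protocol 2 is the highest pickle protocol Python 2.x can load; Python 3
# defaults to protocol 3+, which is what triggers
# "ValueError: unsupported pickle protocol: 3" on the Python 2 side.
data = {"vocabulary": ["__PAD__", "__UNK__", "hello"]}
blob = pickle.dumps(data, protocol=2)
restored = pickle.loads(blob)
print(restored["vocabulary"][2])  # hello
```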

Not working anymore on TensorFlow 1.0

The first error is that tf.nn has no attribute rnn_cell, at models/dual_encoder.py line 45.
I fixed this error by changing tf.nn.rnn_cell to tf.contrib.rnn.core_rnn_cell.

Then the second error is TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

Some parameter orders and types no longer match.
Can anyone fix this?

I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcudnn.so.5. LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:
I tensorflow/stream_executor/cuda/cuda_dnn.cc:3517] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
INFO:tensorflow:Using config: {'_tf_random_seed': None, '_task_id': 0, '_save_summary_steps': 100, '_keep_checkpoint_max': 5, '_save_checkpoints_secs': 600, '_master': '', '_environment': 'local', '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f0aa666a358>, '_evaluation_master': '', '_task_type': None, '_num_ps_replicas': 0, '_keep_checkpoint_every_n_hours': 10000, '_tf_config': gpu_options {
per_process_gpu_memory_fraction: 1
}
, '_save_checkpoints_steps': None, '_is_chief': True}
WARNING:tensorflow:From /home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/monitors.py:267: BaseMonitor.init (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
INFO:tensorflow:No glove/vocab path specificed, starting with random embeddings.
Traceback (most recent call last):
File "udc_train.py", line 64, in
tf.app.run()
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "udc_train.py", line 61, in main
estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
return func(*args, **kwargs)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 426, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 934, in _train_model
model_fn_ops = self._call_legacy_get_train_ops(features, labels)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1003, in _call_legacy_get_train_ops
train_ops = self._get_train_ops(features, labels)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1162, in _get_train_ops
return self._call_model_fn(features, labels, model_fn_lib.ModeKeys.TRAIN)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1133, in _call_model_fn
model_fn_results = self._model_fn(features, labels, **kwargs)
File "/home/ucl/chatbot-retrieval/udc_model.py", line 39, in model_fn
targets)
File "/home/ucl/chatbot-retrieval/models/dual_encoder.py", line 54, in dual_encoder_model
tf.concat(0, [context_embedded, utterance_embedded]),
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1029, in concat
dtype=dtypes.int32).get_shape(
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 637, in convert_to_tensor
as_ref=False)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 702, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 110, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 99, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/home/ucl/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).name))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.
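This TypeError comes from an API change: TensorFlow 1.0 swapped `tf.concat`'s argument order from `tf.concat(axis, values)` to `tf.concat(values, axis)`, so the 0.x-style call in dual_encoder.py passes the tensor list where an int32 axis is expected. A toy stand-in for the new argument order (pure Python; `concat_axis0` is a hypothetical name):

```python
# TF 0.x:  tf.concat(0, [context_embedded, utterance_embedded])
# TF 1.x:  tf.concat([context_embedded, utterance_embedded], 0)
# Passing the list where the axis int is expected produces:
#   TypeError: Expected int32, got list containing Tensors of type '_Message'
def concat_axis0(tensors):
    """Toy axis-0 concatenation over nested lists, mirroring tf.concat(values, 0)."""
    out = []
    for t in tensors:
        out.extend(t)
    return out

print(concat_axis0([[[1, 2]], [[3, 4]]]))  # [[1, 2], [3, 4]]
```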

udc_train.py -- error with MetricSpec at k=10

Could not create metric ops for MetricSpec(metric_fn=streaming_sparse_recall_at_k, prediction_key=None, label_key=None, weight_key=None), input must have last dimension >= k = 10 but is 1 for 'recall_at_10/TopKV2' (op: 'TopKV2') with input shapes: [?,1], [] and with computed input tensors: input[1]

What am I doing wrong? I ran it for 2 hours and ended up with errors. Please advise.
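This error usually means the metric received predictions of shape [?, 1] where streaming_sparse_recall_at_k with k=10 needs a last dimension of at least 10, i.e. the evaluation batch was not regrouped into 1 true response + 9 distractors per example (the shape the UDC validation data provides). The regrouping step, sketched without TensorFlow (`group_probs` is a hypothetical helper; the repo does the equivalent on tensors during evaluation):

```python
# Hypothetical helper: regroup a flat list of per-candidate scores into rows
# of [num_candidates] so recall@k can rank the true response against its
# 9 distractors (the last dimension must be >= k = 10).
def group_probs(flat_probs, num_candidates=10):
    assert len(flat_probs) % num_candidates == 0, "eval batch not a multiple of 10"
    return [flat_probs[i:i + num_candidates]
            for i in range(0, len(flat_probs), num_candidates)]

rows = group_probs([0.05 * i for i in range(20)])
print(len(rows), len(rows[0]))  # 2 10
```

A common trigger is pointing evaluation at training-format records (one candidate per example) instead of the validation TFRecords.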

Monitors are deprecated. Please use tf.train.SessionRunHook.

When I run udc_train.py on TensorFlow 1.1.0, it raises an error like this:

WARNING:tensorflow:From C:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\monitors.py:267: BaseMonitor.init (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
Traceback (most recent call last):
File "E:/project/chatbot/udc_train.py", line 67, in
tf.app.run()
File "C:\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "E:/project/chatbot/udc_train.py", line 64, in main
estimator.fit(input_fn=input_fn_train, steps=None, monitors=[eval_monitor])
File "C:\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 281, in new_func
return func(*args, **kwargs)
File "C:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 430, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "C:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 927, in _train_model
model_fn_ops = self._get_train_ops(features, labels)
File "C:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 1132, in _get_train_ops
return self._call_model_fn(features, labels, model_fn_lib.ModeKeys.TRAIN)
File "C:\Anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 1103, in _call_model_fn
model_fn_results = self._model_fn(features, labels, **kwargs)
File "E:\project\chatbot\udc_model.py", line 39, in model_fn
targets
File "E:\project\chatbot\models\dual_encoder.py", line 47, in dual_encoder_model
tf.concat(0, [context_embedded, utterance_embedded]),
File "C:\Anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 1029, in concat
dtype=dtypes.int32).get_shape(
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 639, in convert_to_tensor
as_ref=False)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 704, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\constant_op.py", line 113, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\constant_op.py", line 102, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 370, in make_tensor_proto
_AssertCompatible(values, dtype)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).name))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

Could you give me some advice? Thank you.
