
dialogue's Introduction

house.baidu.com

dialogue's People

Contributors

bringtree, guotong1988, imguozhen, szho42, wangxiao1021, xixiaoyao, xyzhou-puck


dialogue's Issues

Douban model save condition

  • ubuntu corpus
    After each training epoch, utils/evaluation.py is run once on the validation set, and model performance is judged by the cumulative p1@10 and p2@10 scores.
  • douban corpus
    Is utils/evaluation.py also the criterion for saving the model during training? I ask because the test stage switches the script to utils/douban_evaluation.py. (See the sketch after this list.)
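
For reference, a minimal sketch of the save-on-best logic being discussed, assuming both evaluation modules expose an evaluate(score_file) that returns a tuple of metrics (an assumption about the repo's API, not a confirmed one):

import utils.evaluation as eva            # Ubuntu: p@k-style metrics
# import utils.douban_evaluation as eva   # Douban: MAP, MRR, P@1, R10@k

def maybe_save(sess, saver, score_file, save_path, best_so_far):
    # Evaluate the dev score file and keep the checkpoint only if the summed
    # metric improves; on Ubuntu this would be the cumulative p1@10 + p2@10.
    result = eva.evaluate(score_file)
    score = sum(result[:2])
    if score > best_so_far:
        saver.save(sess, save_path + "DAM.ckpt")
        return score
    return best_so_far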

ValueError: (InvalidArgument) Broadcast dimension mismatch.

Running "bash run.sh atis_intent train" produces the following error:

ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [-1, 12, 128, 128] and the shape of Y = [-1, 12, -1]. Received [128] in X is not equal to [12] in Y at i:2.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:169)
[operator < elementwise_add > error]

Evaluation question

Hello, when I train on the Douban data.pkl dataset, the run hangs at save_step. My preliminary analysis is that the validation set is too large and there are too many for-loops: with data_small.pkl the validation set has about 31 batches and evaluation takes roughly two minutes, whereas data.pkl has about 1562 batches, so the run simply gets stuck there. Did you do anything to optimize this part in your experiments? (A workaround sketch follows below.)
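
A rough workaround sketch (not the authors' method): at each save_step, evaluate on a fixed random subset of dev batches instead of all ~1562, which keeps the periodic evaluation close to the data_small.pkl cost; evaluate_one_batch is a hypothetical stand-in for the repo's per-batch scoring call.

import random

def evaluate_subset(num_dev_batches, evaluate_one_batch, sample_size=200, seed=0):
    # Fix the seed so the same subset is scored at every save_step and the
    # resulting scores stay comparable across checkpoints.
    random.seed(seed)
    subset = random.sample(range(num_dev_batches), sample_size)
    return [evaluate_one_batch(i) for i in subset]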

There is something wrong

Line 10 of "test_and_evaluate.py" should be

import utils.evaluation as eva #for ubuntu
#import utils.douban_evaluation as eva #for douban

not just "import utils.evaluation as eva".
I am not sure whether the same line in "train_and_evaluate.py" should be modified as well; it seems to work without modification.

Output response

Hello,
For the Ubuntu and Douban datasets, what do the final output dialogue results roughly look like?

I don't quite understand the test results.

Hi, I changed the call in the main file to test.test(conf, model). The parameters are set as follows:
conf = {
"data_path": "./data/douban/data.pkl",
"save_path": "./output/douban/DAM_test/",
"word_emb_init": "./data/douban/word_embedding.pkl",
#"init_model": None, # None for train
"init_model": "./output/douban/DAM/DAM.ckpt", #should be set for test

"rand_seed": None, 

"drop_dense": None,
"drop_attention": None,

"is_mask": True,
"is_layer_norm": True,
"is_positional": False,  

"stack_num": 5,  
"attention_type": "dot",

"learning_rate": 1e-3,
"vocab_size": 172130, #434512
"emb_size": 200,
"batch_size": 32, #200 for test

"max_turn_num": 9,  
"max_turn_len": 50, 

"max_to_keep": 1,
"num_scan_data": 2,
"_EOS_": 1, #28270 for douban data
"final_n_class": 1,

}

The program's output in the shell:
sucess init ./output/douban/DAM/DAM.ckpt
starting test
2019-01-16 15:50:51
finish test
2019-01-16 15:51:39
finish evaluation
2019-01-16 15:51:39
I see two files in my save_path: result.test and score.test.
The contents of result.test are:
0.542857142857
0.133834586466
0.24962406015
0.565413533835
What does this result mean? The last three numbers seem to be R10@1, R10@2 and R10@5 from the paper; is that right?
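
For reference, a minimal sketch of how R10@k can be computed from a score file with 10 candidates per context; the tab-separated "score<TAB>label" line layout is an assumption about score.test, not a confirmed format.

def recall_at_k(score_file, k, candidates=10):
    # Each consecutive block of `candidates` lines is one context; count the
    # contexts whose positive candidate is ranked within the top k by score.
    with open(score_file) as f:
        lines = [line.strip().split("\t") for line in f]
    hits, total = 0, 0
    for i in range(0, len(lines), candidates):
        block = [(float(s), int(float(l))) for s, l in lines[i:i + candidates]]
        ranked = sorted(block, key=lambda x: x[0], reverse=True)
        hits += int(any(label == 1 for _, label in ranked[:k]))
        total += 1
    return hits / float(total)

# e.g. recall_at_k("score.test", 1), recall_at_k("score.test", 2), recall_at_k("score.test", 5)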

Error: UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node cnn_aggregation/Conv3D (defined at /DAM/utils/layers.py:257) = Conv3D[T=DT_FLOAT, data_format="NDHWC", dilations=[1, 1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](stack_18, cnn_aggregation/filter_0/read)]]
[[{{node loss/Mean/_273}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_46002_loss/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Has anyone encountered a similar problem? How can I upgrade the cuDNN version without sudo privileges?

DAM model in serving

Hi, authors,

In the production stage, given a fixed set of canned responses, the confidence level (i.e. the logits) has to be computed against each canned response. Do you have a smarter way of getting the top n canned responses for each context?

Cheers
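
For what it's worth, a minimal sketch of the brute-force baseline, assuming a score_batch callable that wraps the model's logits op (a hypothetical helper, not part of the repo): score the whole canned pool in batches and take the argsort top n.

import numpy as np

def top_n_responses(context, canned_responses, score_batch, n=5, batch_size=128):
    # Score the fixed response pool against one context, batch by batch,
    # then return the indices and scores of the n highest-scoring responses.
    scores = []
    for i in range(0, len(canned_responses), batch_size):
        batch = canned_responses[i:i + batch_size]
        scores.extend(score_batch([context] * len(batch), batch))
    scores = np.asarray(scores)
    top = np.argsort(-scores)[:n]
    return [(int(j), float(scores[j])) for j in top]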

Difference between test.txt and score

Hello Mr. Zhou,
One thing I found is that test.txt and score have different lengths for Douban.
data/douban/test.txt has 10000 lines, but when I download the models and unzip them, output/douban/DAM/score has 6656 lines.
Did I miss some step?

Question about the "Attentive Module"

Hello, regarding the Attentive Module in your paper: is it really exactly the same as the Transformer model proposed in "Attention is All You Need"?
If so, where in your code are the relevant Transformer parameters set?
Thanks for clarifying.

layers.py looks inconsistent

Hello, while reading your code I noticed that in utils/layers.py, line 285 and the lines after it read "if add_relu: conv_0 = tf.nn.elu(conv_0)". That is, the flag you define is add_relu, yet what is actually applied is ELU. As far as I know these two functions are different; would you consider renaming the add_relu variable or changing the elu call? (See the sketch below.)
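
A small sketch of what a consistent version could look like (one possible fix, not a confirmed change to the repo):

import tensorflow as tf

def maybe_activate(conv, add_relu=True):
    # Make the op match the flag name; alternatively keep tf.nn.elu and rename
    # the flag to add_elu if ELU was the intended non-linearity.
    return tf.nn.relu(conv) if add_relu else conv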

Pickle Error

I'm using the data files hosted on dropbox. I'm getting the following error:

Traceback (most recent call last):
  File "main.py", line 54, in <module>
    train.train(conf, model)
  File "/home/shikib/multi_level/sota/Dialogue/DAM/bin/train_and_evaluate.py", line 24, in train
    train_data, val_data, test_data = pickle.load(open(conf["data_path"], 'rb'))
cPickle.UnpicklingError: invalid load key, '4'.

Any help would be appreciated.
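
A hedged sanity check, not a fix: "invalid load key" often means the file is not a real pickle at all (e.g. an HTML error page or a partial Dropbox download), so inspecting the first bytes is a quick way to rule that out; the path below is just whatever conf["data_path"] points to.

with open("./data/ubuntu/data.pkl", "rb") as f:   # assumed path, substitute your data_path
    head = f.read(16)
print(repr(head))   # a pickle written by cPickle usually starts with '\x80' (protocol >= 2) or '(' (protocol 0)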

How did you generate the input data files like data.pkl, word2id and word_embedding.pkl ?

Firstly thanks for the great ACL paper and open source code!

I have a question about the data preprocessing part. How did you generate the input data files such as data.pkl, word2id, vocab.txt and word_embedding.pkl? Let's take UDC as an example: the raw data only contains train.txt/valid.txt/test.txt. I checked your code and there are no scripts for generating files like data.pkl and word_embedding.pkl. Could you also upload these data preprocessing scripts?

Preprocessing scripts

Could you please share the scripts you used to preprocess the data and to train the word embeddings?
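
Not the authors' pipeline, but a rough sketch of the kind of preprocessing that could produce word2id and word_embedding.pkl; the file layout (label \t turn_1 \t ... \t response) and the gensim settings are assumptions.

import pickle
from gensim.models import Word2Vec

sentences = []
for line in open("data/ubuntu/train.txt"):
    fields = line.rstrip("\n").split("\t")            # assumed layout: label, turns..., response
    for turn in fields[1:]:
        sentences.append(turn.split())

w2v = Word2Vec(sentences, size=200, min_count=1)      # gensim 3.x keyword; newer versions use vector_size

word2id = {w: i + 1 for i, w in enumerate(w2v.wv.index2word)}          # id 0 assumed reserved for padding
embedding = [[0.0] * 200] + [w2v.wv[w].tolist() for w in w2v.wv.index2word]

pickle.dump(word2id, open("data/ubuntu/word2id", "wb"))
pickle.dump(embedding, open("data/ubuntu/word_embedding.pkl", "wb"))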

Some questions about the model and dataset

Hello, after reading the source code I have a few questions; I would appreciate your help.
1. In the FFN layer of the Attentive Module, the weights are initialized with orthogonal_initializer(). Why was this initializer chosen?
2. The FFN hidden layer has the same dimensionality as the input. Have you tested other sizes?
3. Was the word_embedding.pkl file in the Ubuntu dataset trained by yourselves? If so, is the corresponding code available?
Thanks in advance!

the douban testset

Hi,
I have some questions about the Douban test data.

  1. The pos/neg label ratio is not 1:9?
  2. Some contexts have all ten negative responses and no positive one; in such cases the denominator is 0, so how does douban_evaluation.py compute the P@1-over-10 result? (See the sketch after this list.)
  3. The test-set length in the Douban data.pkl is 6670, which does not equal the 10000 lines in test.txt. How is this handled? Is there anything special about these 6670 contexts?

thanks
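
Regarding question 2, a hedged sketch of how P@1 can be computed while skipping contexts that have no positive candidate, which avoids the zero denominator (and could also explain why the packed test set has fewer contexts than test.txt); the block format is an assumption about the score file.

def precision_at_1(blocks):
    # blocks: one list of (score, label) pairs per context, 10 candidates each.
    # Drop contexts without any positive label, then count how often the
    # top-scoring candidate is a positive one.
    kept = [b for b in blocks if any(label > 0 for _, label in b)]
    hits = sum(1 for b in kept if max(b, key=lambda x: x[0])[1] > 0)
    return hits / float(len(kept)) if kept else 0.0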

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [434513,200] rhs shape= [172131,200]

I'm trying to reproduce the test result on the Douban corpus dataset.

  • tensorflow 1.5
  • python 2.7
  • ubuntu 14.04

1. Here is the directory structure

output/douban/

|-- DAM
|   |-- DAM.ckpt.data-00000-of-00001
|   |-- DAM.ckpt.index
|   |-- DAM.ckpt.meta
|   |-- checkpoint
|   |-- result
|   `-- score
|-- DAM_cross
| . . . . . .    
| . . . . . .    

data/douban/

data/douban/
|-- data.pkl
|-- data_small.pkl
|-- dev.txt
|-- test.txt
|-- train.txt
|-- word2id
`-- word_embedding.pkl

2. Here is the error

(DAM) zhibo@cvda-ultra:~/zhibo/DAM$ python main.py
loading word emb init
starting loading data
2018-09-27 16:39:54
finish loading data
finish building test batches
2018-09-27 16:41:20
configurations: {'vocab_size': 434512, 'num_scan_data': 2, 'data_path': './data/douban/data.pkl', 'max_turn_num': 9, 'emb_size': 200, 'is_mask': True, 'drop_attention': None, 'word_emb_init': './data/douban/word_embedding.pkl', 'save_path': './output/douban/temp/', 'is_positional': False, 'is_layer_norm': True, '_EOS_': 1, 'learning_rate': 0.001, 'drop_dense': None, 'rand_seed': None, 'final_n_class': 1, 'batch_size': 200, 'attention_type': 'dot', 'max_turn_len': 50, 'max_to_keep': 1, 'init_model': './output/douban/DAM/DAM.ckpt', 'stack_num': 5}
WARNING:tensorflow:From /home1/zhibo/codebase/DAM/utils/operations.py:157: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
sim shape: (200, 9, 50, 50, 12)
conv_0 shape: (200, 9, 50, 50, 32)
pooling_0 shape: (200, 3, 17, 17, 32)
conv_1 shape: (200, 3, 17, 17, 16)
pooling_1 shape: (200, 1, 6, 6, 16)
build graph sucess
2018-09-27 16:44:29
2018-09-27 16:44:29.380616: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-09-27 16:44:29.608946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:05:00.0
totalMemory: 10.91GiB freeMemory: 1.72GiB
2018-09-27 16:44:29.797182: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:06:00.0
totalMemory: 10.92GiB freeMemory: 9.18GiB
2018-09-27 16:44:30.025387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 2 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:09:00.0
totalMemory: 10.92GiB freeMemory: 10.55GiB
2018-09-27 16:44:30.263003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 3 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:0a:00.0
totalMemory: 10.92GiB freeMemory: 8.12GiB
2018-09-27 16:44:30.263730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device peer to peer matrix
2018-09-27 16:44:30.263833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] DMA: 0 1 2 3
2018-09-27 16:44:30.263845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 0:   Y Y Y Y
2018-09-27 16:44:30.263853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 1:   Y Y Y Y
2018-09-27 16:44:30.263863: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 2:   Y Y Y Y
2018-09-27 16:44:30.263870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 3:   Y Y Y Y
2018-09-27 16:44:30.263882: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "main.py", line 62, in <module>
    test.test(conf, model)
  File "/home1/zhibo/codebase/DAM/bin/test_and_evaluate.py", line 41, in test
    _model.saver.restore(sess, conf["init_model"])
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1686, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [434513,200] rhs shape= [172131,200]
         [[Node: loss/save/Assign_293 = Assign[T=DT_FLOAT, _class=["loc:@word_embedding"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](loss/word_embedding/Adam, loss/save/RestoreV2_293/_571)]]

Caused by op u'loss/save/Assign_293', defined at:
  File "main.py", line 62, in <module>
    test.test(conf, model)
  File "/home1/zhibo/codebase/DAM/bin/test_and_evaluate.py", line 35, in test
    _graph = _model.build_graph()
  File "/home1/zhibo/codebase/DAM/models/net.py", line 188, in build_graph
    self.saver = tf.train.Saver(max_to_keep = self._conf["max_to_keep"])
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1239, in __init__
    self.build()
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1248, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
    build_save=build_save, build_restore=build_restore)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal
    restore_sequentially, reshape)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 440, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 160, in restore
    self.op.get_shape().is_fully_defined())
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
    validate_shape=validate_shape)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 59, in assign
    use_locking=use_locking, name=name)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [434513,200] rhs shape= [172131,200]
         [[Node: loss/save/Assign_293 = Assign[T=DT_FLOAT, _class=["loc:@word_embedding"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](loss/word_embedding/Adam, loss/save/RestoreV2_293/_571)]]

3. Here is the main.py used for evaluation

In main.py I made some modifications for testing on the Douban corpus. I simply changed data_path, save_path, init_model, batch_size and _EOS_ as the comments suggest.

conf = {
    # "data_path": "./data/ubuntu/data.pkl",
    "data_path": "./data/douban/data.pkl",
    # "save_path": "./output/ubuntu/temp/",
    "save_path": "./output/douban/temp/",
    "word_emb_init": "./data/douban/word_embedding.pkl",
    "init_model": "./output/douban/DAM/DAM.ckpt", #should be set for test

    "rand_seed": None,

    "drop_dense": None,
    "drop_attention": None,

    "is_mask": True,
    "is_layer_norm": True,
    "is_positional": False,

    "stack_num": 5,
    "attention_type": "dot",

    "learning_rate": 1e-3,
    "vocab_size": 434512,
    "emb_size": 200,
    # "batch_size": 256, #200 for test
    "batch_size": 200, #200 for test

    "max_turn_num": 9,
    "max_turn_len": 50,

    "max_to_keep": 1,
    "num_scan_data": 2,
    # "_EOS_": 28270, #1 for douban data
    "_EOS_": 1, #1 for douban data
    "final_n_class": 1,
}

My question is:

Is it because I'm using TensorFlow 1.5, or is something wrong with the config? It seems something goes wrong when loading the model. (I'm new to TensorFlow.)
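
A hedged guess rather than a confirmed fix: the restore compares a graph variable of shape [434513, 200] (vocab_size + 1 rows) with a checkpoint tensor of [172131, 200], which suggests the pretrained Douban checkpoint was trained with vocab_size = 172130 (the value used in another issue's Douban test config above). Rebuilding the graph with that value should at least let the restore go through; TensorFlow 1.5 itself is unlikely to be the cause.

conf = dict(conf, vocab_size=172130)   # instead of 434512, to match the checkpoint's embedding table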

Problem training on the Douban dataset

I ran the source code without any modification, using Python 2 and TensorFlow 1.2.1. At the first model save, the evaluated MAP and MRR were already above 0.9, and they kept rising afterwards, approaching 0.99. Something is clearly wrong; what could be causing this?

what does the "corpus_file text_file topic_file index_file 1 1" mean?

In the model "knowledge-driven dialogue/generative_pytorch_version/"
In the README:
Step 1: Preprocess
Preprocess all the data used in the model training and testing stages with the following command:

python ./tools/convert_conversation_corpus_to_model_text.py corpus_file text_file topic_file index_file 1 1

What do the arguments "corpus_file text_file topic_file index_file 1 1" mean? Where are these files?

No such file or directory: './data/word_embedding.pkl'

I got this error message when trying to reproduce the test result using the pretrained model; the pretrained word embedding file appears to be missing.

loading word emb init
Traceback (most recent call last):
  File "main.py", line 57, in <module>
    model = net.Net(conf)
  File "./models/net.py", line 22, in __init__
    self._word_embedding_init = pickle.load(open(self._conf['word_emb_init'], 'rb'))
IOError: [Errno 2] No such file or directory: './data/word_embedding.pkl'

My question is: where can I get ./data/word_embedding.pkl?
Thank you

new question about ubuntu experiment

Hello, when I run on the Ubuntu dataset with the default parameters, I cannot reproduce the results in your paper; I only reach 0.90624. Are there any other parameter changes or tricks involved?

OOM problem in training

Hi,
I want to retrain the model on the Douban dataset, but I get an OOM (out-of-memory) error.
My GPU is a Tesla M40 (11448 MiB) and free system memory is 141 GB.
Is there anything I can do other than changing batch_size? (See the sketch below.)
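
A hedged sketch of options besides batch_size. The similarity cube printed in the training logs has shape (batch, max_turn_num, max_turn_len, max_turn_len, 2 * (stack_num + 1)), e.g. (200, 9, 50, 50, 12) for stack_num = 5, so lowering max_turn_len, max_turn_num or stack_num also cuts memory; allow_growth only changes how GPU memory is allocated, it does not reduce the peak.

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # allocate GPU memory on demand instead of grabbing it all up front
sess = tf.Session(config=config)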

Number of attention layers exploration

[attached figures: attention_performance_multipe_layers; r128 3-10-performance_dam]

In our projects, we tested the accuracy with different numbers of attention layers. We tested the model on two settings:
(1) retrieving the true response from 128 candidates (random responses, similar to the validation data in the Ubuntu dataset, which has 9 false responses);
(2) retrieving the true response from a large canned list (roughly 300 responses).

As shown in the result, the number of attention layers in our case does not significantly affect the accuracy.

Also, in your experience, is an accuracy of about 50% for top 3 (out of 128) a reasonable number?

Also, for production, we tested the throughput. On a high-end GPU, it is about 100 ms for one request with a batch size of 128; however, it is extremely slow on CPU systems.

Have you tried deploying the DAM model on CPUs?

Cheers

Why use a 3D CNN rather than a 2D CNN in your paper?

Hello, why does your paper use 3D convolution instead of 2D convolution?
What advantage does 3D convolution bring here compared with 2D? Is there a formula or some mathematical explanation for it?
Also, was the filter size of [3, 3, 3] found by experiment? If so, what other sizes did you try?
Thanks!
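
For intuition, a minimal sketch with the TF 1.x API: the stacked similarity maps form a 5-D cube [batch, turns, turn_len, turn_len, channels] (e.g. (200, 9, 50, 50, 12) in the training logs), so a 3-D convolution slides a [3, 3, 3] kernel over (turn, word, word) jointly and can pick up cross-turn patterns, whereas a 2-D convolution would treat each turn's 50x50 map independently.

import tensorflow as tf

sim = tf.placeholder(tf.float32, [None, 9, 50, 50, 12])   # [batch, turns, turn_len, turn_len, channels]
conv = tf.layers.conv3d(sim, filters=32, kernel_size=(3, 3, 3),
                        padding="same", activation=tf.nn.elu)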

data.zip cannot be downloaded

Hi,
The dataset has not been downloadable for a while; could you share it again? Thanks a lot!

vocabulary size of douban data

Hi,

  1. What min count did you use when training the Douban word embeddings? Since the Douban data has a vocabulary of more than 300k words, did you test the impact of different vocabulary sizes on the model?
  2. Does it make a significant difference whether or not the word embeddings are trainable?

Thanks.

Why use a single head and no positional embedding in the Attentive Module?

May I ask why the Attentive Module uses only a single head and no positional encoding? Also, the FFN used for the connections sets its hidden layer to the same dimensionality as the word embeddings.

Another question about the cross attention:
Ui = AttentiveModule(U, R, R), Rl = AttentiveModule(R, U, U)
What is the rough idea behind this formulation? (See the sketch below.)

I accidentally deleted my previous question, which is a bit embarrassing.
Thanks in advance for your answer.
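
For reference, a minimal numpy sketch of the single-head scaled dot-product attention at the core of AttentiveModule(Q, K, V); layer normalization and the FFN sub-layer are omitted here.

import numpy as np

def attentive_module(Q, K, V):
    # Scaled dot-product attention: each query position is rewritten as a
    # weighted sum of V, with weights given by softmax(Q K^T / sqrt(d)).
    logits = np.dot(Q, K.T) / np.sqrt(Q.shape[-1])
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.dot(weights, V)

# Cross attention as written above: U_cross = attentive_module(U, R, R) re-represents the
# utterance with response segments, and R_cross = attentive_module(R, U, U) does the reverse,
# so both sides carry matching structure before the similarity cube is built.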

Preprocessed pkl files

Hello,
Could you release the code that converts the datasets into the pkl files, or provide some reference code? I wrote my own preprocessing based on the model's placeholders and the paper, but the results differ somewhat. Thanks a lot!

Best

Test result from the pre-trained model on Douban data is far lower than the published result

Hi,
This is the result in output/douban/DAM/result (same as in the published paper):
0.550061141847
0.600620002387
0.427067669173
0.25380833035
0.410119345984
0.756952500298

Below is the result I got using the pre-trained model output/douban/DAM/DAM.ckpt:
0.54090909090909
0.1348484848486
0.25
0.566666666667

I wonder why there is such a large difference between the two results, as they should be similar.

Thanks.

Some questions about the results on Douban

Hi,
On the Douban dataset, we cannot get results as high as yours. We trained the models and selected the best result, but it is still lower than the published numbers on the validation set. How did you train the model to get your result?
Thanks

The matching scores are all equal during training and the model does not converge.

I am trying to train the model on the Douban data named "data_small.pkl".
My question: the matching scores are all equal during training and the model does not converge.

The data is downloaded from the link in “ReadMe.txt”.

  • tensorflow 1.8
  • python 3.5

1. Here is the directory structure.

data/douban/
|-- data.pkl
|-- data_small.pkl
|-- dev.txt
|-- test.txt
|-- train.txt
|-- word2id
`-- word_embedding.pkl

2. Here is the main.py config for training.

conf = {
    "data_path": "./data/douban/data_small.pkl",
    "save_path": "./output/douban/DAM/",
    "word_emb_init": "./data/douban/word_embedding.pkl",
    "init_model": None, #"./output/douban/DAM_cross/DAM.ckpt", #should be set for test
    
    "rand_seed": None, 

    "drop_dense": None,
    "drop_attention": None,

    "is_mask": True,
    "is_layer_norm": True,
    "is_positional": False,  

    "stack_num": 5,  
    "attention_type": "dot",

    "learning_rate": 1e-3,
    "vocab_size": 172130, #434512
    "emb_size": 200,
    "batch_size": 256, #200 for test

    "max_turn_num": 9,  
    "max_turn_len": 50, 

    "max_to_keep": 1,
    "num_scan_data": 2,
    "_EOS_": 1, #28270
    "final_n_class": 1,
}

How to apply the model?

How can the trained model be used in a question-answering system, i.e. how do I build a human-machine dialogue system from it, so that when I input a sentence the model outputs a reply? Could you share the code for the application module?
