
dialogue's Introduction

house.baidu.com

dialogue's People

Contributors

bringtree, guotong1988, imguozhen, szho42, wangxiao1021, xixiaoyao, xyzhou-puck


dialogue's Issues

Douban model save condition

  • ubuntu corpus
    After each training epoch, utils/evaluation.py is run once on the validation set, and model performance is judged by the cumulative p1@10 and p2@10 scores.
  • douban corpus
    Is utils/evaluation.py also the criterion for saving the model during training? I ask because the test stage switches the script to utils/douban_evaluation.py. (See the sketch after this list.)
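
For reference, a minimal sketch of the save-on-best logic being discussed, assuming both evaluation modules expose an evaluate(score_file) that returns a tuple of metrics (an assumption about the repo's API, not a confirmed one):

import utils.evaluation as eva            # Ubuntu: p@k-style metrics
# import utils.douban_evaluation as eva   # Douban: MAP, MRR, P@1, R10@k

def maybe_save(sess, saver, score_file, save_path, best_so_far):
    # Evaluate the dev score file and keep the checkpoint only if the summed
    # metric improves; on Ubuntu this would be the cumulative p1@10 + p2@10.
    result = eva.evaluate(score_file)
    score = sum(result[:2])
    if score > best_so_far:
        saver.save(sess, save_path + "DAM.ckpt")
        return score
    return best_so_far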

ValueError: (InvalidArgument) Broadcast dimension mismatch.

Running "bash run.sh atis_intent train" produces the following error:

ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [-1, 12, 128, 128] and the shape of Y = [-1, 12, -1]. Received [128] in X is not equal to [12] in Y at i:2.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:169)
[operator < elementwise_add > error]

Evaluation question

Hello, when I train on the Douban data.pkl dataset, the run hangs at save_step. My preliminary analysis is that the validation set is too large and there are too many for-loops: with data_small.pkl the validation set has about 31 batches and evaluation takes roughly two minutes, whereas data.pkl has about 1562 batches, so the run simply gets stuck there. Did you do anything to optimize this part in your experiments? (A workaround sketch follows below.)
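
A rough workaround sketch (not the authors' method): at each save_step, evaluate on a fixed random subset of dev batches instead of all ~1562, which keeps the periodic evaluation close to the data_small.pkl cost; evaluate_one_batch is a hypothetical stand-in for the repo's per-batch scoring call.

import random

def evaluate_subset(num_dev_batches, evaluate_one_batch, sample_size=200, seed=0):
    # Fix the seed so the same subset is scored at every save_step and the
    # resulting scores stay comparable across checkpoints.
    random.seed(seed)
    subset = random.sample(range(num_dev_batches), sample_size)
    return [evaluate_one_batch(i) for i in subset]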

There is something wrong

Line 10 of "test_and_evaluate.py" should be

import utils.evaluation as eva #for ubuntu
#import utils.douban_evaluation as eva #for douban

not just "import utils.evaluation as eva".
I am not sure whether the same line in "train_and_evaluate.py" should be modified as well; it seems to work without modification.

Output response

Hello,
For the Ubuntu and Douban datasets, what do the final output dialogue results roughly look like?

I don't quite understand the test results.

Hi, I changed the call in the main file to test.test(conf, model). The parameters are set as follows:
conf = {
"data_path": "./data/douban/data.pkl",
"save_path": "./output/douban/DAM_test/",
"word_emb_init": "./data/douban/word_embedding.pkl",
#"init_model": None, # None for train
"init_model": "./output/douban/DAM/DAM.ckpt", #should be set for test

"rand_seed": None, 

"drop_dense": None,
"drop_attention": None,

"is_mask": True,
"is_layer_norm": True,
"is_positional": False,  

"stack_num": 5,  
"attention_type": "dot",

"learning_rate": 1e-3,
"vocab_size": 172130, #434512
"emb_size": 200,
"batch_size": 32, #200 for test

"max_turn_num": 9,  
"max_turn_len": 50, 

"max_to_keep": 1,
"num_scan_data": 2,
"_EOS_": 1, #28270 for douban data
"final_n_class": 1,

}

The program's output in the shell:
sucess init ./output/douban/DAM/DAM.ckpt
starting test
2019-01-16 15:50:51
finish test
2019-01-16 15:51:39
finish evaluation
2019-01-16 15:51:39
I see two files in my save_path: result.test and score.test.
The contents of result.test are:
0.542857142857
0.133834586466
0.24962406015
0.565413533835
What does this result mean? The last three numbers seem to be R10@1, R10@2 and R10@5 from the paper; is that right?
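
For reference, a minimal sketch of how R10@k can be computed from a score file with 10 candidates per context; the tab-separated "score<TAB>label" line layout is an assumption about score.test, not a confirmed format.

def recall_at_k(score_file, k, candidates=10):
    # Each consecutive block of `candidates` lines is one context; count the
    # contexts whose positive candidate is ranked within the top k by score.
    with open(score_file) as f:
        lines = [line.strip().split("\t") for line in f]
    hits, total = 0, 0
    for i in range(0, len(lines), candidates):
        block = [(float(s), int(float(l))) for s, l in lines[i:i + candidates]]
        ranked = sorted(block, key=lambda x: x[0], reverse=True)
        hits += int(any(label == 1 for _, label in ranked[:k]))
        total += 1
    return hits / float(total)

# e.g. recall_at_k("score.test", 1), recall_at_k("score.test", 2), recall_at_k("score.test", 5)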

Error: UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node cnn_aggregation/Conv3D (defined at /DAM/utils/layers.py:257) = Conv3D[T=DT_FLOAT, data_format="NDHWC", dilations=[1, 1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1, 1], _device="/job:localhost/replica:0/task:0/device:GPU:0"](stack_18, cnn_aggregation/filter_0/read)]]
[[{{node loss/Mean/_273}} = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_46002_loss/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Has anyone encountered a similar problem? How can I upgrade the cuDNN version without sudo privileges?

DAM model in serving

Hi, authors,

In the production stage, given a fixed set of canned responses, the confidence level (i.e. the logits) has to be computed against each canned response. Do you have a smarter way of getting the top n canned responses for each context?

Cheers
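
For what it's worth, a minimal sketch of the brute-force baseline, assuming a score_batch callable that wraps the model's logits op (a hypothetical helper, not part of the repo): score the whole canned pool in batches and take the argsort top n.

import numpy as np

def top_n_responses(context, canned_responses, score_batch, n=5, batch_size=128):
    # Score the fixed response pool against one context, batch by batch,
    # then return the indices and scores of the n highest-scoring responses.
    scores = []
    for i in range(0, len(canned_responses), batch_size):
        batch = canned_responses[i:i + batch_size]
        scores.extend(score_batch([context] * len(batch), batch))
    scores = np.asarray(scores)
    top = np.argsort(-scores)[:n]
    return [(int(j), float(scores[j])) for j in top]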

Difference between test.txt and score

Hello Mr. Zhou,
One thing I found is that test.txt and score have different lengths for Douban.
data/douban/test.txt has 10000 lines, but when I download the models and unzip them, output/douban/DAM/score has 6656 lines.
Did I miss some step?

Question about the "Attentive Module"

Hello, regarding the Attentive Module in your paper: is it really exactly the same as the Transformer model proposed in "Attention is All You Need"?
If so, where in your code are the relevant Transformer parameters set?
Thanks for clarifying.

layers.py looks inconsistent

Hello, while reading your code I noticed that in utils/layers.py, line 285 and the lines after it read "if add_relu: conv_0 = tf.nn.elu(conv_0)". That is, the flag you define is add_relu, yet what is actually applied is ELU. As far as I know these two functions are different; would you consider renaming the add_relu variable or changing the elu call? (See the sketch below.)
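
A small sketch of what a consistent version could look like (one possible fix, not a confirmed change to the repo):

import tensorflow as tf

def maybe_activate(conv, add_relu=True):
    # Make the op match the flag name; alternatively keep tf.nn.elu and rename
    # the flag to add_elu if ELU was the intended non-linearity.
    return tf.nn.relu(conv) if add_relu else conv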

Pickle Error

I'm using the data files hosted on dropbox. I'm getting the following error:

Traceback (most recent call last):
  File "main.py", line 54, in <module>
    train.train(conf, model)
  File "/home/shikib/multi_level/sota/Dialogue/DAM/bin/train_and_evaluate.py", line 24, in train
    train_data, val_data, test_data = pickle.load(open(conf["data_path"], 'rb'))
cPickle.UnpicklingError: invalid load key, '4'.

Any help would be appreciated.
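
A hedged sanity check, not a fix: "invalid load key" often means the file is not a real pickle at all (e.g. an HTML error page or a partial Dropbox download), so inspecting the first bytes is a quick way to rule that out; the path below is just whatever conf["data_path"] points to.

with open("./data/ubuntu/data.pkl", "rb") as f:   # assumed path, substitute your data_path
    head = f.read(16)
print(repr(head))   # a pickle written by cPickle usually starts with '\x80' (protocol >= 2) or '(' (protocol 0)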

How did you generate the input data files like data.pkl, word2id and word_embedding.pkl ?

Firstly thanks for the great ACL paper and open source code!

I have a question about the data preprocessing part. How did you generate the input data files such as data.pkl, word2id, vocab.txt and word_embedding.pkl? Let's take UDC as an example: the raw data only contains train.txt/valid.txt/test.txt. I checked your code and there are no scripts for generating files like data.pkl and word_embedding.pkl. Could you also upload these data preprocessing scripts?

Preprocessing scripts

Could you please share the scripts you used to preprocess the data and to train the word embeddings?
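
Not the authors' pipeline, but a rough sketch of the kind of preprocessing that could produce word2id and word_embedding.pkl; the file layout (label \t turn_1 \t ... \t response) and the gensim settings are assumptions.

import pickle
from gensim.models import Word2Vec

sentences = []
for line in open("data/ubuntu/train.txt"):
    fields = line.rstrip("\n").split("\t")            # assumed layout: label, turns..., response
    for turn in fields[1:]:
        sentences.append(turn.split())

w2v = Word2Vec(sentences, size=200, min_count=1)      # gensim 3.x keyword; newer versions use vector_size

word2id = {w: i + 1 for i, w in enumerate(w2v.wv.index2word)}          # id 0 assumed reserved for padding
embedding = [[0.0] * 200] + [w2v.wv[w].tolist() for w in w2v.wv.index2word]

pickle.dump(word2id, open("data/ubuntu/word2id", "wb"))
pickle.dump(embedding, open("data/ubuntu/word_embedding.pkl", "wb"))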

Some questions about the model and dataset

Hello, after reading the source code I have a few questions; I would appreciate your help.
1. In the FFN layer of the Attentive Module, the weights are initialized with orthogonal_initializer(). Why was this initializer chosen?
2. The FFN hidden layer has the same dimensionality as the input. Have you tested other sizes?
3. Was the word_embedding.pkl file in the Ubuntu dataset trained by yourselves? If so, is the corresponding code available?
Thanks in advance!

the douban testset

Hi,
I have some questions about the Douban test data.

  1. The pos/neg label ratio is not 1:9?
  2. Some contexts have all ten negative responses and no positive one; in such cases the denominator is 0, so how does douban_evaluation.py compute the P@1-over-10 result? (See the sketch after this list.)
  3. The test-set length in the Douban data.pkl is 6670, which does not equal the 10000 lines in test.txt. How is this handled? Is there anything special about these 6670 contexts?

thanks
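
Regarding question 2, a hedged sketch of how P@1 can be computed while skipping contexts that have no positive candidate, which avoids the zero denominator (and could also explain why the packed test set has fewer contexts than test.txt); the block format is an assumption about the score file.

def precision_at_1(blocks):
    # blocks: one list of (score, label) pairs per context, 10 candidates each.
    # Drop contexts without any positive label, then count how often the
    # top-scoring candidate is a positive one.
    kept = [b for b in blocks if any(label > 0 for _, label in b)]
    hits = sum(1 for b in kept if max(b, key=lambda x: x[0])[1] > 0)
    return hits / float(len(kept)) if kept else 0.0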

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [434513,200] rhs shape= [172131,200]

I'm trying to reproduce the test result on the Douban corpus dataset.

  • tensorflow 1.5
  • python 2.7
  • ubuntu 14.04

1. Here is the directory structure

output/douban/

|-- DAM
|   |-- DAM.ckpt.data-00000-of-00001
|   |-- DAM.ckpt.index
|   |-- DAM.ckpt.meta
|   |-- checkpoint
|   |-- result
|   `-- score
|-- DAM_cross
| . . . . . .    
| . . . . . .    

data/douban/

data/douban/
|-- data.pkl
|-- data_small.pkl
|-- dev.txt
|-- test.txt
|-- train.txt
|-- word2id
`-- word_embedding.pkl

2. Here is the error

(DAM) zhibo@cvda-ultra:~/zhibo/DAM$ python main.py
loading word emb init
starting loading data
2018-09-27 16:39:54
finish loading data
finish building test batches
2018-09-27 16:41:20
configurations: {'vocab_size': 434512, 'num_scan_data': 2, 'data_path': './data/douban/data.pkl', 'max_turn_num': 9, 'emb_size': 200, 'is_mask': True, 'drop_attention': None, 'word_emb_init': './data/douban/word_embedding.pkl', 'save_path': './output/douban/temp/', 'is_positional': False, 'is_layer_norm': True, '_EOS_': 1, 'learning_rate': 0.001, 'drop_dense': None, 'rand_seed': None, 'final_n_class': 1, 'batch_size': 200, 'attention_type': 'dot', 'max_turn_len': 50, 'max_to_keep': 1, 'init_model': './output/douban/DAM/DAM.ckpt', 'stack_num': 5}
WARNING:tensorflow:From /home1/zhibo/codebase/DAM/utils/operations.py:157: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
sim shape: (200, 9, 50, 50, 12)
conv_0 shape: (200, 9, 50, 50, 32)
pooling_0 shape: (200, 3, 17, 17, 32)
conv_1 shape: (200, 3, 17, 17, 16)
pooling_1 shape: (200, 1, 6, 6, 16)
build graph sucess
2018-09-27 16:44:29
2018-09-27 16:44:29.380616: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-09-27 16:44:29.608946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:05:00.0
totalMemory: 10.91GiB freeMemory: 1.72GiB
2018-09-27 16:44:29.797182: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:06:00.0
totalMemory: 10.92GiB freeMemory: 9.18GiB
2018-09-27 16:44:30.025387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 2 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:09:00.0
totalMemory: 10.92GiB freeMemory: 10.55GiB
2018-09-27 16:44:30.263003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 3 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:0a:00.0
totalMemory: 10.92GiB freeMemory: 8.12GiB
2018-09-27 16:44:30.263730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device peer to peer matrix
2018-09-27 16:44:30.263833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] DMA: 0 1 2 3
2018-09-27 16:44:30.263845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 0:   Y Y Y Y
2018-09-27 16:44:30.263853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 1:   Y Y Y Y
2018-09-27 16:44:30.263863: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 2:   Y Y Y Y
2018-09-27 16:44:30.263870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 3:   Y Y Y Y
2018-09-27 16:44:30.263882: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1)
2018-09-27 16:44:30.263907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:3) -> (device: 3, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "main.py", line 62, in <module>
    test.test(conf, model)
  File "/home1/zhibo/codebase/DAM/bin/test_and_evaluate.py", line 41, in test
    _model.saver.restore(sess, conf["init_model"])
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1686, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [434513,200] rhs shape= [172131,200]
         [[Node: loss/save/Assign_293 = Assign[T=DT_FLOAT, _class=["loc:@word_embedding"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](loss/word_embedding/Adam, loss/save/RestoreV2_293/_571)]]

Caused by op u'loss/save/Assign_293', defined at:
  File "main.py", line 62, in <module>
    test.test(conf, model)
  File "/home1/zhibo/codebase/DAM/bin/test_and_evaluate.py", line 35, in test
    _graph = _model.build_graph()
  File "/home1/zhibo/codebase/DAM/models/net.py", line 188, in build_graph
    self.saver = tf.train.Saver(max_to_keep = self._conf["max_to_keep"])
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1239, in __init__
    self.build()
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1248, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1284, in _build
    build_save=build_save, build_restore=build_restore)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 765, in _build_internal
    restore_sequentially, reshape)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 440, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 160, in restore
    self.op.get_shape().is_fully_defined())
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
    validate_shape=validate_shape)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 59, in assign
    use_locking=use_locking, name=name)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/home/zhibo/anaconda3/envs/DAM/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [434513,200] rhs shape= [172131,200]
         [[Node: loss/save/Assign_293 = Assign[T=DT_FLOAT, _class=["loc:@word_embedding"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](loss/word_embedding/Adam, loss/save/RestoreV2_293/_571)]]

3. Here is the main.py used for evaluation

In main.py I made some modifications for testing on the Douban corpus. I simply changed data_path, save_path, init_model, batch_size and _EOS_ as the comments suggest.

conf = {
    # "data_path": "./data/ubuntu/data.pkl",
    "data_path": "./data/douban/data.pkl",
    # "save_path": "./output/ubuntu/temp/",
    "save_path": "./output/douban/temp/",
    "word_emb_init": "./data/douban/word_embedding.pkl",
    "init_model": "./output/douban/DAM/DAM.ckpt", #should be set for test

    "rand_seed": None,

    "drop_dense": None,
    "drop_attention": None,

    "is_mask": True,
    "is_layer_norm": True,
    "is_positional": False,

    "stack_num": 5,
    "attention_type": "dot",

    "learning_rate": 1e-3,
    "vocab_size": 434512,
    "emb_size": 200,
    # "batch_size": 256, #200 for test
    "batch_size": 200, #200 for test

    "max_turn_num": 9,
    "max_turn_len": 50,

    "max_to_keep": 1,
    "num_scan_data": 2,
    # "_EOS_": 28270, #1 for douban data
    "_EOS_": 1, #1 for douban data
    "final_n_class": 1,
}

My question is:

Is it because I'm using TensorFlow 1.5, or is something wrong with the config? It seems something goes wrong when loading the model. (I'm new to TensorFlow.)
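
A hedged guess rather than a confirmed fix: the restore compares a graph variable of shape [434513, 200] (vocab_size + 1 rows) with a checkpoint tensor of [172131, 200], which suggests the pretrained Douban checkpoint was trained with vocab_size = 172130 (the value used in another issue's Douban test config above). Rebuilding the graph with that value should at least let the restore go through; TensorFlow 1.5 itself is unlikely to be the cause.

conf = dict(conf, vocab_size=172130)   # instead of 434512, to match the checkpoint's embedding table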

Problem training on the Douban dataset

I ran the source code without any modification, using Python 2 and TensorFlow 1.2.1. At the first model save, the evaluated MAP and MRR were already above 0.9, and they kept rising afterwards, approaching 0.99. Something is clearly wrong; what could be causing this?

what does the "corpus_file text_file topic_file index_file 1 1" mean?

In the model "knowledge-driven dialogue/generative_pytorch_version/"
In the README:
Step 1: Preprocess
Preprocess all the data used in the model training and testing stages with the following command:

python ./tools/convert_conversation_corpus_to_model_text.py corpus_file text_file topic_file index_file 1 1

What do the arguments "corpus_file text_file topic_file index_file 1 1" mean? Where are these files?

No such file or directory: './data/word_embedding.pkl'

I got this error message when trying to reproduce the test result using the pretrained model; the pretrained word embedding file appears to be missing.

loading word emb init
Traceback (most recent call last):
  File "main.py", line 57, in <module>
    model = net.Net(conf)
  File "./models/net.py", line 22, in __init__
    self._word_embedding_init = pickle.load(open(self._conf['word_emb_init'], 'rb'))
IOError: [Errno 2] No such file or directory: './data/word_embedding.pkl'

My question is: where can I get ./data/word_embedding.pkl?
Thank you

new question about ubuntu experiment

Hello, when I run on the Ubuntu dataset with the default parameters, I cannot reproduce the results in your paper; I only reach 0.90624. Are there any other parameter changes or tricks involved?

OOM problem in training

Hi,
I want to retrain the model on the Douban dataset, but I get an OOM (out-of-memory) error.
My GPU is a Tesla M40 (11448 MiB) and free system memory is 141 GB.
Is there anything I can do other than changing batch_size? (See the sketch below.)
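
A hedged sketch of options besides batch_size. The similarity cube printed in the training logs has shape (batch, max_turn_num, max_turn_len, max_turn_len, 2 * (stack_num + 1)), e.g. (200, 9, 50, 50, 12) for stack_num = 5, so lowering max_turn_len, max_turn_num or stack_num also cuts memory; allow_growth only changes how GPU memory is allocated, it does not reduce the peak.

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # allocate GPU memory on demand instead of grabbing it all up front
sess = tf.Session(config=config)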

Number of attention layers exploration

[attached figures: attention_performance_multipe_layers; r128 3-10-performance_dam]

In our projects, we tested the accuracy with different numbers of attention layers. We tested the model on two settings:
(1) retrieving the true response from 128 candidates (random responses, similar to the validation data in the Ubuntu dataset, which has 9 false responses);
(2) retrieving the true response from a large canned list (roughly 300 responses).

As shown in the result, the number of attention layers in our case does not significantly affect the accuracy.

Also, in your experience, is an accuracy of about 50% for top 3 (out of 128) a reasonable number?

Also, for production, we tested the throughput. On a high-end GPU, it is about 100 ms for one request with a batch size of 128; however, it is extremely slow on CPU systems.

Have you tried deploying the DAM model on CPUs?

Cheers

Why use a 3D CNN rather than a 2D CNN in your paper?

Hello, why does your paper use 3D convolution instead of 2D convolution?
What advantage does 3D convolution bring here compared with 2D? Is there a formula or some mathematical explanation for it?
Also, was the filter size of [3, 3, 3] found by experiment? If so, what other sizes did you try?
Thanks!
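
For intuition, a minimal sketch with the TF 1.x API: the stacked similarity maps form a 5-D cube [batch, turns, turn_len, turn_len, channels] (e.g. (200, 9, 50, 50, 12) in the training logs), so a 3-D convolution slides a [3, 3, 3] kernel over (turn, word, word) jointly and can pick up cross-turn patterns, whereas a 2-D convolution would treat each turn's 50x50 map independently.

import tensorflow as tf

sim = tf.placeholder(tf.float32, [None, 9, 50, 50, 12])   # [batch, turns, turn_len, turn_len, channels]
conv = tf.layers.conv3d(sim, filters=32, kernel_size=(3, 3, 3),
                        padding="same", activation=tf.nn.elu)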

data.zip cannot be downloaded

Hi,
The dataset has not been downloadable for a while; could you share it again? Thanks a lot!

vocabulary size of douban data

Hi,

  1. What min count did you use when training the Douban word embeddings? Since the Douban data has a vocabulary of more than 300k words, did you test the impact of different vocabulary sizes on the model?
  2. Does it make a significant difference whether or not the word embeddings are trainable?

Thanks.

Why use a single head and no positional embedding in the Attentive Module?

May I ask why the Attentive Module uses only a single head and no positional encoding? Also, the FFN used for the connections sets its hidden layer to the same dimensionality as the word embeddings.

Another question about the cross attention:
Ui = AttentiveModule(U, R, R), Rl = AttentiveModule(R, U, U)
What is the rough idea behind this formulation? (See the sketch below.)

I accidentally deleted my previous question, which is a bit embarrassing.
Thanks in advance for your answer.
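
For reference, a minimal numpy sketch of the single-head scaled dot-product attention at the core of AttentiveModule(Q, K, V); layer normalization and the FFN sub-layer are omitted here.

import numpy as np

def attentive_module(Q, K, V):
    # Scaled dot-product attention: each query position is rewritten as a
    # weighted sum of V, with weights given by softmax(Q K^T / sqrt(d)).
    logits = np.dot(Q, K.T) / np.sqrt(Q.shape[-1])
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.dot(weights, V)

# Cross attention as written above: U_cross = attentive_module(U, R, R) re-represents the
# utterance with response segments, and R_cross = attentive_module(R, U, U) does the reverse,
# so both sides carry matching structure before the similarity cube is built.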

Preprocessed pkl files

Hello,
Could you release the code that converts the datasets into the pkl files, or provide some reference code? I wrote my own preprocessing based on the model's placeholders and the paper, but the results differ somewhat. Thanks a lot!

Best

Test result from the pre-trained model on Douban data is far lower than the published result

Hi,
This is the result in output/douban/DAM/result (same as in the published paper):
0.550061141847
0.600620002387
0.427067669173
0.25380833035
0.410119345984
0.756952500298

Below is the result I got using the pre-trained model output/douban/DAM/DAM.ckpt:
0.54090909090909
0.1348484848486
0.25
0.566666666667

I wonder why there is such a large difference between the two results, as they should be similar.

Thanks.

Some questions about the results on Douban

Hi,
On the Douban dataset, we cannot get results as high as yours. We trained the models and selected the best result, but it is still lower than the published numbers on the validation set. How did you train the model to get your result?
Thanks

The matching scores are all equal during training and the model does not converge.

I am trying to train the model on the Douban data named "data_small.pkl".
My question: the matching scores are all equal during training and the model does not converge.

The data is downloaded from the link in “ReadMe.txt”.

  • tensorflow 1.8
  • python 3.5

1. Here is the directory structure.

data/douban/
|-- data.pkl
|-- data_small.pkl
|-- dev.txt
|-- test.txt
|-- train.txt
|-- word2id
`-- word_embedding.pkl

2. Here is the main.py config for training.

conf = {
    "data_path": "./data/douban/data_small.pkl",
    "save_path": "./output/douban/DAM/",
    "word_emb_init": "./data/douban/word_embedding.pkl",
    "init_model": None, #"./output/douban/DAM_cross/DAM.ckpt", #should be set for test
    
    "rand_seed": None, 

    "drop_dense": None,
    "drop_attention": None,

    "is_mask": True,
    "is_layer_norm": True,
    "is_positional": False,  

    "stack_num": 5,  
    "attention_type": "dot",

    "learning_rate": 1e-3,
    "vocab_size": 172130, #434512
    "emb_size": 200,
    "batch_size": 256, #200 for test

    "max_turn_num": 9,  
    "max_turn_len": 50, 

    "max_to_keep": 1,
    "num_scan_data": 2,
    "_EOS_": 1, #28270
    "final_n_class": 1,
}

How to apply the model?

How can the trained model be used in a question-answering system, i.e. how do I build a human-machine dialogue system from it, so that when I input a sentence the model outputs a reply? Could you share the code for the application module?
