sogou / sogoumrctoolkit

This toolkit was designed for the fast and efficient development of modern machine comprehension models, including both published models and original prototypes.

License: Apache License 2.0

Python 100.00%

sogoumrctoolkit's Issues

improvement to 'load' and 'save' method

The 'load' and 'save' methods in BaseModel handle only the model; related data such as the vocabulary is not processed. To load a model from file, the vocabulary therefore has to be rebuilt or loaded separately, which is not ideal. It would be better to save and load everything the model depends on.

The vocabulary class has its own 'load' and 'save' methods, but all data is saved as JSON, which is larger and easy to corrupt. I don't see the benefit of using the JSON format here.

Is there any code related to answer_verification?

I found a field for answer_verification in bert_coqa.py, but I cannot find any code for it.
Thanks in advance for any reply.

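The "No module named 'sogou_mrc'" problem

Hello, when I run the files in examples I get the error "No module named 'sogou_mrc'", even though the sogou_mrc directory contains an __init__.py file. I don't know what causes this problem; could you please explain? Thank you!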
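
A common cause of this kind of error is that the example script is started from a directory where Python cannot see the repository root, so the `sogou_mrc` package is not on `sys.path`. The following is a minimal sketch of one workaround, assuming the script sits two folders below the repository root; the helper name `add_repo_root` and the default depth are illustrative and not part of the toolkit.

```python
import os
import sys


def add_repo_root(script_path, levels_up=2):
    """Put the directory `levels_up` above the script's folder on sys.path."""
    root = os.path.abspath(script_path)
    for _ in range(levels_up + 1):  # the first step drops the file name itself
        root = os.path.dirname(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Calling `add_repo_root(__file__)` at the top of an example script, before `import sogou_mrc`, would make the package importable. Setting the `PYTHONPATH` environment variable to the repository root, or installing the package into the environment if the repository provides an install script, should have the same effect.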

Instance key should be "query_id" when reader is CMRC

https://github.com/sogou/SMRCToolkit/blob/472fbe228297e77578efdf127600a9a0ff8ad01e/sogou_mrc/model/bidaf.py#L235

NotFoundError: ./vocab.txt

When I run run_bert_coqa.py, I get an error:
tensorflow.python.framework.errors_impl.NotFoundError: ./vocab.txt; No such file or directory
Where does this file come from?

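Question about applicability to different types of reading comprehension tasks

Does this design only consider extractive tasks? Is it also applicable to multiple-choice or cloze-style reading comprehension tasks?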

What is the paper behind Answer Verification?

Thanks for your code!
I would like to read the paper behind the Answer Verification in your code.
Is it "Read + Verify: Machine Reading Comprehension with Unanswerable Questions"?

Shuffle buffer filled

command: python run_bert_squadv2.py
GPU: P40
CPU memory: 126G

log:
2019-07-28 01:28:56.654470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-07-28 01:28:56.866822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21139 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:05:00.0, compute capability: 6.1)
2019-07-28 01:29:01.610923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-07-28 01:29:01.611087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-28 01:29:01.611125: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-07-28 01:29:01.611143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-07-28 01:29:01.792284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21139 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:05:00.0, compute capability: 6.1)
2019-07-28 01:29:04.056203: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-07-28 01:29:04.056312: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-28 01:29:04.056346: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-07-28 01:29:04.056365: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-07-28 01:29:04.057344: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21139 MB memory) -> physical GPU (device: 0, name: Tesla P40, pci bus id: 0000:05:00.0, compute capability: 6.1)
2019-07-28 01:29:49.138936: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 7971 of 130497
2019-07-28 01:29:59.139110: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 15820 of 130497
2019-07-28 01:30:09.139348: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 23963 of 130497
2019-07-28 01:30:19.138907: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 31963 of 130497
2019-07-28 01:30:29.139424: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 39830 of 130497
2019-07-28 01:30:39.139672: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 48245 of 130497
2019-07-28 01:30:49.138921: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 56364 of 130497
2019-07-28 01:30:59.139422: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 64376 of 130497
2019-07-28 01:31:09.139234: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 72092 of 130497
2019-07-28 01:31:19.139490: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 79574 of 130497
2019-07-28 01:31:29.139453: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 87148 of 130497
2019-07-28 01:31:39.139663: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 94795 of 130497
2019-07-28 01:31:49.139455: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 102515 of 130497
2019-07-28 01:31:59.140305: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 110001 of 130497
2019-07-28 01:32:09.138877: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 117752 of 130497
2019-07-28 01:32:19.139864: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 125678 of 130497
2019-07-28 01:32:25.347410: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:136] Shuffle buffer filled.

Can I handle Chinese using other models?

In the examples given, you handle Chinese only with the BiDAF model, by choosing CMRCReader as the reader. I guess I can handle Chinese with other models by using CMRCReader too, but I am not sure about it. Could you confirm?

Can you share the BertCoQA pretrained model?

Is it possible to share the BertCoQA pretrained model?

Chinese ELMo support

If I want to implement ELMo + BiDAF with Chinese support, do I need to wrap the tf.hub interface myself, or is there a better solution?

ran out of memory

TF: tensorflow-gpu==1.12
GPU: Tesla P4 8G
I tried to run run_bidafplus_squad.py and it reported GPU memory allocation problems. I don't know whether this affects the results.

2019-04-07 05:11:40.657538: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-04-07 05:11:41.446788: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-04-07 05:11:41.447151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: 
name: Tesla P4 major: 6 minor: 1 memoryClockRate(GHz): 1.1135
pciBusID: 0000:00:06.0
totalMemory: 7.43GiB freeMemory: 7.31GiB
2019-04-07 05:11:41.447178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-04-07 05:11:41.882084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-07 05:11:41.882132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2019-04-07 05:11:41.882141: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2019-04-07 05:11:41.882363: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7051 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:00:06.0, compute capability: 6.1)
2019-04-07 05:11:42,321 - root - INFO - Reading file at train-v1.1.json
2019-04-07 05:11:42,322 - root - INFO - Processing the dataset.
87599it [07:43, 189.13it/s]
2019-04-07 05:19:25,497 - root - INFO - Reading file at dev-v1.1.json
2019-04-07 05:19:25,497 - root - INFO - Processing the dataset.
10570it [00:53, 196.53it/s]
2019-04-07 05:20:19,349 - root - INFO - Building vocabulary.
100%|███████████████████████████████████| 98169/98169 [00:30<00:00, 3218.07it/s]
2019-04-07 05:21:05.747563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-04-07 05:21:05.747695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-07 05:21:05.747711: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2019-04-07 05:21:05.747718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2019-04-07 05:21:05.747925: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7051 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:00:06.0, compute capability: 6.1)
2019-04-07 05:21:06.489069: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-04-07 05:21:06.489145: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-07 05:21:06.489156: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2019-04-07 05:21:06.489162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2019-04-07 05:21:06.489389: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7051 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:00:06.0, compute capability: 6.1)
2019-04-07 05:21:07.117979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-04-07 05:21:07.118055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-07 05:21:07.118066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2019-04-07 05:21:07.118072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2019-04-07 05:21:07.118278: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7051 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:00:06.0, compute capability: 6.1)
2019-04-07 05:21:13,046 - root - INFO - Epoch 1/15
2019-04-07 05:21:13,351 - root - INFO - Eposide 1/2
2019-04-07 05:21:23.422390: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 10494 of 87599
2019-04-07 05:21:33.422566: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 21931 of 87599
2019-04-07 05:21:43.422157: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 32210 of 87599
2019-04-07 05:21:53.422415: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 42018 of 87599
2019-04-07 05:22:03.422089: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 52336 of 87599
2019-04-07 05:22:13.422587: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 62125 of 87599
2019-04-07 05:22:23.422099: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 72157 of 87599
2019-04-07 05:22:33.421957: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:98] Filling up shuffle buffer (this may take a while): 82242 of 87599
2019-04-07 05:22:38.605655: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:136] Shuffle buffer filled.
2019-04-07 05:22:57.952087: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 2.88G (3091968768 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-04-07 05:23:27.134938: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.96GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:24:09.911666: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.28GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:28:01.375542: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.23GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:28:01.673176: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.94GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:28:33.173192: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.92GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:28:33.490319: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.93GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:28:33.502105: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.52GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:28:44,872 - root - INFO - - Train metrics: loss: 5.875
2019-04-07 05:28:46.141381: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.27GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:28:46.477394: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.64GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:28:47.501813: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2019-04-07 05:29:05,078 - root - INFO - - Eval metrics: loss: 3.759
2019-04-07 05:29:21,705 - root - INFO - - Eval metrics: exact_match: 51.325 ; f1: 63.040
2019-04-07 05:29:21,705 - root - INFO - - epoch 1 eposide 1: Found new best score: 63.039909
2019-04-07 05:29:21,705 - root - INFO - Eposide 2/2
2019-04-07 05:34:47,135 - root - INFO - - Train metrics: loss: 4.882
2019-04-07 05:35:02,895 - root - INFO - - Eval metrics: loss: 3.376
2019-04-07 05:35:19,210 - root - INFO - - Eval metrics: exact_match: 57.313 ; f1: 68.490
2019-04-07 05:35:19,210 - root - INFO - - epoch 1 eposide 2: Found new best score: 68.490210
2019-04-07 05:35:19,210 - root - INFO - Epoch 2/15
2019-04-07 05:35:19,213 - root - INFO - Eposide 1/2

Can't find model 'en'

Traceback (most recent call last):
File "E:/HDL/SMRCToolkit/run_bidafplus.py", line 21, in <module>
reader = CoQAReader(history=-1)
File "E:\HDL\SMRCToolkit\sogou_mrc\dataset\coqa.py", line 16, in __init__
self.tokenizer = SpacyTokenizer()
File "E:\HDL\SMRCToolkit\sogou_mrc\utils\tokenizer.py", line 9, in __init__
self.nlp = spacy.load('en', disable=['parser','tagger','entity'])
File "D:\Program Files (x86)\Anaconda\lib\site-packages\spacy\__init__.py", line 27, in load
return util.load_model(name, **overrides)
File "D:\Program Files (x86)\Anaconda\lib\site-packages\spacy\util.py", line 136, in load_model
raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
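
Error E050 means that no model is installed under the name `en`. In spaCy 2.x the shortcut is normally created by running `python -m spacy download en`; spaCy 3 removed shortcut names, so there the full package name such as `en_core_web_sm` has to be used. The following is a sketch of a loader that tries several names in turn. The helper `load_first_available` is hypothetical, and the loader is passed in as an argument so that the snippet does not itself depend on spaCy.

```python
def load_first_available(load, names=("en", "en_core_web_sm"), **kwargs):
    """Try each model name in turn; `load` is expected to be spacy.load."""
    errors = []
    for name in names:
        try:
            return load(name, **kwargs)
        except OSError as exc:  # spaCy raises OSError (E050) for a missing model
            errors.append("{}: {}".format(name, exc))
    raise OSError("No spaCy model found; tried " + "; ".join(errors))
```

In tokenizer.py this could replace the direct call, for instance `load_first_available(spacy.load, disable=['parser', 'tagger', 'entity'])`, assuming the same keyword arguments remain valid for the installed spaCy version.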

TypeError: sparse_to_dense() missing 2 required positional arguments

I got the following error when running bidaf_squadv2.py:
File "G:\SMRCToolkit-master\sogou_mrc\data\batch_generator.py", line 121, in extract_char
out = tf.sparse.to_dense(out, default_value=default_value)
AttributeError: module 'tensorflow' has no attribute 'sparse'

I changed tf.sparse.to_dense to tf.sparse_to_dense in the file, but it still fails:
File "G:\SMRCToolkit-master\sogou_mrc\data\batch_generator.py", line 145, in transform_new_instance
context_char = extract_char(context_tokens)
File "G:\SMRCToolkit-master\sogou_mrc\data\batch_generator.py", line 121, in extract_char
out = tf.sparse_to_dense(out, default_value=default_value)
TypeError: sparse_to_dense() missing 2 required positional arguments: 'output_shape' and 'sparse_values'
How can I solve this problem?

Need help regarding the installation

Hey @litao-buptsse, @yukyang, @wujindou

I am getting the following error while running any examples:

ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/errors

It seems that either my CUDA version or my cuDNN version is incompatible.
My current versions are as follows:
CUDA: 9.0
cudnn: 7.5.1

Please help

Several minor errors in Trainer

This framework is handy and I like it. Here are several problems I found:

Trainer._evaluate requires 'model_path', but BaseModel doesn't provide that parameter;
Trainer._inference calls Trainer.inference (with 3 parameters), which doesn't exist.

Also, the default logging level disables all training output, which is inconvenient: when training finishes, there is no information about the training at all.

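Some questions about application

Hello, what exactly does SMRCToolkit do? After training on the CMRC dataset, we need to use the model in our own AI chatbot dialogue. Does that mean that once training on the cmrc2018_train dataset is finished, the saved model can be used for other Chinese reading comprehension tasks? If I input my own passage and question, can I also get an answer? If so, where should I input my own passage and question, and how do I get the answer? (A reply in Chinese would be ideal.)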

FailedPreconditionError: Error Loading Models

Hi,

Thank you very much for your code. I have been able to replicate your results on many datasets using the model.train_and_evaluate() method. However, when I tried to save and load a model, I ran into an error. Initially I tried to save and evaluate using the BertCoQA model, but I also get errors when running the code from the model_save_load.md tutorial.

Below is the error thrown (here is a pastebin with the full error if that would be helpful).

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value eval_metrics/mean/count [[node eval_metrics/mean/AssignAdd_1 (defined at /juicier/scr126/scr/hnf035/fresh/SMRCToolkit/sogou_mrc/model/bert_coqa.py:199) = AssignAdd[T=DT_FLOAT, use_locking=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](eval_metrics/mean/count, eval_metrics/mean/ToFloat, ^add_8)]]

Thank you very much for the help!

no 'session' and TypeError

Two errors come up when I run run_bert_coqa.py.
Traceback (most recent call last):
File "/data2/wangfuyu/NQ/ycl/SMRCToolkit-master/sogou_mrc/model/base_model.py", line 21, in __del__
self.session.close()
AttributeError: 'BertCoQA' object has no attribute 'session'

File "/data2/wangfuyu/NQ/ycl/SMRCToolkit-master/examples/run_coqa/run_bert_coqa.py", line 37, in <module>
model = BertCoQA(bert_dir=bert_dir,answer_verificatioin=True)
TypeError: __init__() got an unexpected keyword argument 'answer_verificatioin'

My Python version is 3.6.8.
Could anyone give me some advice?
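
The two messages are probably one failure seen twice: when a constructor raises (here the TypeError about the keyword argument) before `self.session` has been assigned, Python still runs `__del__` on the half-built object, and `__del__` then fails with the AttributeError. The sketch below shows a defensive `__del__` on toy classes; `FakeSession` and `ToyModel` are stand-ins written for illustration and are not the toolkit's classes.

```python
class FakeSession(object):
    """Stand-in for a TensorFlow session; only records that close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ToyModel(object):
    """Illustrative model class; not the toolkit's BaseModel."""

    def __init__(self, **kwargs):
        if kwargs:
            # Mimics a constructor that fails before the session exists.
            raise TypeError("unexpected keyword argument(s): %s" % sorted(kwargs))
        self.session = FakeSession()

    def __del__(self):
        # getattr avoids an AttributeError when __init__ never set the attribute.
        session = getattr(self, "session", None)
        if session is not None:
            session.close()
```

This only silences the secondary error. The TypeError itself indicates that the installed `BertCoQA.__init__` does not accept a keyword spelled this way, so the parameter name in the constructor signature in bert_coqa.py is the thing to compare against the call in the example script.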

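CMRC2018 dataset support

I see that the examples include a test of BiDAF on the CMRC2018 dataset. cmrc_bidaf.py uses word vectors from embedding_folder; could you provide a download link for these word vectors? Thanks!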

Missing import sys and other issues

flake8 testing of https://github.com/sogou/SMRCToolkit on Python 3.7.1

$ flake8 . --count --select=E9,F63,F72,F82 --show-source --statistics

./sogou_mrc/dataset/squadv2.py:88:55: F632 use ==/!= to compare str, bytes, and int literals
            "answer_start": answer_token_starts[0] if len(answer_token_starts) > 0 is not None else None,
                                                      ^
./sogou_mrc/dataset/squadv2.py:89:51: F632 use ==/!= to compare str, bytes, and int literals
            "answer_end": answer_token_ends[0] if len(answer_token_ends) > 0 is not None else None,
                                                  ^
./sogou_mrc/dataset/coqa.py:363:21: F821 undefined name 'sys'
                    sys.stderr.write("Turn id should match index {}: {}\n".format(i + 1, qa))
                    ^
./sogou_mrc/dataset/coqa.py:368:25: F821 undefined name 'sys'
                        sys.stderr.write("Question turn id does match answer: {} {}\n".format(qa, answer))
                        ^
./sogou_mrc/dataset/coqa.py:372:21: F821 undefined name 'sys'
                    sys.stderr.write("Gold file has duplicate stories: {}".format(source))
                    ^
./sogou_mrc/libraries/tokenization.py:39:27: F821 undefined name 'unicode'
    elif isinstance(text, unicode):
                          ^
./sogou_mrc/libraries/tokenization.py:62:27: F821 undefined name 'unicode'
    elif isinstance(text, unicode):
                          ^
./sogou_mrc/libraries/modeling.py:364:10: F821 undefined name 'output'
  return output
         ^
2     F632 use ==/!= to compare str, bytes, and int literals
6     F821 undefined name 'sys'
8
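
The first three kinds of finding in the report above have straightforward fixes, sketched here in isolation. The helper names `first_or_none` and `warn` are illustrative; the real changes would go into squadv2.py, coqa.py and tokenization.py. The undefined `output` in modeling.py cannot be resolved from the report alone and is left out.

```python
import sys


def first_or_none(positions):
    """Return the first element, or None for an empty list."""
    # `len(x) > 0 is not None` is a chained comparison, i.e.
    # `len(x) > 0 and 0 is not None`; the plain length test says the same thing.
    return positions[0] if len(positions) > 0 else None


# Python 3 has no `unicode` type; aliasing it keeps isinstance checks working
# on both major versions.
if sys.version_info[0] >= 3:
    unicode = str


def warn(message):
    # `sys` must be imported at module level before sys.stderr is used.
    sys.stderr.write(message + "\n")
```

Because the chained comparison happens to evaluate the same way as the plain length test, the F632 findings are a readability problem rather than a behaviour change, while the F821 findings would raise NameError at run time if those lines were reached.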

After training the model, how do I input a paragraph and a question and get the answer?

ImportError: No module named 'stanfordnlp'

The issue I encountered is a missing module.
I believe the owner should add the stanfordnlp installation procedure to the tutorial.

No OpKernel was registered to support Op 'CudnnRNN'

/home/purabi/anaconda3/envs/smrc/bin/python /home/purabi/SMRCToolkit-master/examples/run_bidaf/main.py
WARNING: Logging before flag parsing goes to stderr.
W0415 17:05:55.514122 139645281789760 __init__.py:56] Some hub symbols are not available because TensorFlow version is less than 1.14
87599it [30:24, 48.01it/s]
10570it [03:26, 51.23it/s]
100%|██████████| 98169/98169 [01:22<00:00, 1194.49it/s]

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

2019-04-15 17:41:39.343160: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-04-15 17:41:39.370044: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2712000000 Hz
2019-04-15 17:41:39.370318: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55756a467530 executing computations on platform Host. Devices:
2019-04-15 17:41:39.370343: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): ,
Traceback (most recent call last):
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1317, in _run_fn
self._extend_graph()
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1352, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'CudnnRNN' used by {{node cu_dnnlstm/CudnnRNN}}with these attrs: [is_training=true, seed2=0, dropout=0, seed=0, T=DT_FLOAT, input_mode="linear_input", direction="unidirectional", rnn_mode="lstm"]
Registered devices: [CPU, XLA_CPU]
Registered kernels:

 [[{{node cu_dnnlstm/CudnnRNN}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/purabi/SMRCToolkit-master/examples/run_bidaf/main.py", line 31, in <module>
model.train_and_evaluate(train_batch_generator, eval_batch_generator, evaluator, epochs=15, eposides=2)
File "/home/purabi/SMRCToolkit-master/sogou_mrc/model/base_model.py", line 47, in train_and_evaluate
self.session.run(tf.global_variables_initializer())
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'CudnnRNN' used by node cu_dnnlstm/CudnnRNN (defined at /home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py:41) with these attrs: [is_training=true, seed2=0, dropout=0, seed=0, T=DT_FLOAT, input_mode="linear_input", direction="unidirectional", rnn_mode="lstm"]
Registered devices: [CPU, XLA_CPU]
Registered kernels:

 [[node cu_dnnlstm/CudnnRNN (defined at /home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py:41) ]]

Caused by op 'cu_dnnlstm/CudnnRNN', defined at:
File "/home/purabi/SMRCToolkit-master/examples/run_bidaf/main.py", line 29, in <module>
model = BiDAF(vocab, pretrained_word_embedding=word_embedding)
File "/home/purabi/SMRCToolkit-master/sogou_mrc/model/bidaf.py", line 34, in __init__
self._build_graph()
File "/home/purabi/SMRCToolkit-master/sogou_mrc/model/bidaf.py", line 93, in _build_graph
context_repr, _ = phrase_lstm(dropout(context_repr, self.training), self.context_len)
File "/home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py", line 41, in call
fw = self.fw_layer(seq)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/keras/layers/recurrent.py", line 701, in __call__
return super(RNN, self).__call__(inputs, **kwargs)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 554, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/keras/layers/cudnn_recurrent.py", line 111, in call
output, states = self._process_batch(inputs, initial_state)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/keras/layers/cudnn_recurrent.py", line 501, in _process_batch
is_training=True)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/ops/gen_cudnn_rnn_ops.py", line 142, in cudnn_rnn
seed2=seed2, is_training=is_training, name=name)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/home/purabi/anaconda3/envs/smrc/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'CudnnRNN' used by node cu_dnnlstm/CudnnRNN (defined at /home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py:41) with these attrs: [is_training=true, seed2=0, dropout=0, seed=0, T=DT_FLOAT, input_mode="linear_input", direction="unidirectional", rnn_mode="lstm"]
Registered devices: [CPU, XLA_CPU]
Registered kernels:

 [[node cu_dnnlstm/CudnnRNN (defined at /home/purabi/SMRCToolkit-master/sogou_mrc/nn/recurrent.py:41) ]]

Process finished with exit code 1

run run_bert_coqa.py OOM

I have used 3 GPUs to run this program, but it still fails with an OOM error.
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[12,12,512,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [[node bert/encoder/layer_6/attention/self/Softmax (defined at /data2/wangfuyu/NQ/ycl/SMRCToolkit-master/sogou_mrc/libraries/modeling.py:728) = Softmax[T=DT_FLOAT, _class=["loc:@bert/encoder/layer_6/attention/self/cond/Switch_1"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_6/attention/self/add)]] Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[{{node truediv/_771}} = _Recv[client_terminated=false,recv_device="/job:localhost/replica:0/task:0/device:CPU:0",send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1,tensor_name="edge_5803_truediv", tensor_type=DT_FLOAT,_device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
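
To put the shape in the error into numbers, here is a small worked calculation. It assumes the dimensions of [12,12,512,512] read as batch size, attention heads, sequence length and sequence length, which is plausible for BERT-base with 12 heads but is not stated in the report.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor (float32 by default)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


attention_probs = tensor_bytes([12, 12, 512, 512])
print(attention_probs / 2.0 ** 20)  # size in MiB → 144.0
```

Under that assumption a single attention tensor takes 144 MiB, and during training one such tensor per encoder layer is kept for the backward pass, alongside all other activations. Halving the batch size halves this figure and halving the sequence length quarters it. The error is also reported on GPU:0 only, so additional GPUs would help only if the training code actually splits each batch across devices.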

Input text and Question and get Answer

I have successfully passed the data through the model, though I got an output of tensors from the inference in trainer.py. How should I decode these tensors into a string that contains the answer?
Thanks.
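
For extractive models the usual decoding step is to pick the token span with the highest combined start and end score and join its tokens. The sketch below shows that generic technique. It assumes the inference output can be reduced to one list of start scores and one list of end scores per passage, which the report does not confirm, and the function names `best_span` and `span_to_text` are hypothetical.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return (start, end) maximising start_scores[s] + end_scores[e], s <= e."""
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_scores):
        last = min(len(end_scores), s + max_answer_len)
        for e in range(s, last):
            score = s_score + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


def span_to_text(tokens, span):
    """Join the tokens of an inclusive (start, end) span with spaces."""
    start, end = span
    return " ".join(tokens[start:end + 1])
```

Joining tokens with spaces is a simplification; mapping the span back to character offsets in the original passage gives cleaner answers, especially for Chinese text.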

Why does this code keep raising syntax errors?

essor.py", line 38
    f"\tProcessor: {self.processor_type}\n"
    ^
SyntaxError: invalid syntax
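
A SyntaxError that points at an `f"..."` literal usually means the interpreter is older than Python 3.6, the release that introduced f-strings. A short sketch of the version check and of an equivalent expression that older interpreters accept; `processor_type` is a placeholder value, since the surrounding class is not shown in the report.

```python
import sys

processor_type = "example"  # placeholder value for illustration

# Equivalent to f"\tProcessor: {processor_type}\n", but valid before Python 3.6.
line = "\tProcessor: {}\n".format(processor_type)

# f-strings were added in Python 3.6 (PEP 498).
supports_fstrings = sys.version_info >= (3, 6)
```

Running the code under Python 3.6 or later is likely the less invasive fix, since rewriting every f-string by hand is error-prone.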

4*V100 16G for bert SQuAD got OOM Problem

Hi, I tried to run run_bert_squad with the BERT-base uncased model (uncased_L-12_H-768_A-12), but it gives an OOM error. Is that normal?

About system configuration and number of epochs

Hi,
For the CoQA model, what was the system configuration? Also, for how many epochs did you train the model?

global_step always equals zero

question about coqa data processing

I think the 'skip' answer type is actually answerable (but no span exists).
Why is the skip type skipped at training time? Is there any performance gain?
I'm afraid that skipping these turns ruins the context of the conversation history.
By the way, thanks for the good code.

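Question about preprocessing the CoQA dataset

While preprocessing the CoQA dataset, I noticed that history_question_tokens is the concatenation of all previous questions and answers:
question answer.
This concatenation seems correct for the training set, but applying it to the eval (validation) set as well does not seem appropriate.
It effectively introduces data that should be unknown: in the validation set the answers are supposed to be unknown and should not be concatenated with the questions.
I hope you can clear up my confusion. Thank you.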

CoQA in Google Colab - OOM (BertWrapper?)

Hello, I'm currently trying to get the CoQA example running in Google Colab. Unfortunately I get an OOM at "train_data = bert_data_helper.convert(train_data,data='coqa')". The Colab machines only have 12.7 GB of RAM. When I run the toolkit on my local machine, I can see that this process takes up to 14 GB of RAM.
My question is: is it possible to reduce the memory usage of the BERT data helper (BERT wrapper)? (And if so, could you tell me where exactly?)

Thank you in advance

cudnn error

When running bidaf_squadv2.py, I get the following error:
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node conv1d_1/conv1d/Conv2D (defined at G:\SMRCToolkit-master\sogou_mrc\nn\layers.py:115) = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](gradients/conv1d_1/conv1d/Conv2D_grad/Conv2DBackpropFilter-0-TransposeNHWCToNCHW-LayoutOptimizer, conv1d_1/conv1d/ExpandDims_1)]]
[[{{node add_18/_227}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2587_add_18", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'conv1d_1/conv1d/Conv2D', defined at:
File "F:\Users\ylwang\Anaconda3\envs\SMRCToolkit-master\lib\runpy.py", line 183, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
I am using tensorflow-gpu==1.12 with CUDA 9.0 and cuDNN 7.0. I suspected that the cuDNN version was too old, so I reinstalled cuDNN 7.5.0, but the problem persists. How can I solve this? Thank you!