allenai / bi-att-flow

Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granularity and uses a bi-directional attention flow mechanism to achieve a query-aware context representation without early summarization.

Home Page: http://allenai.github.io/bi-att-flow

License: Apache License 2.0

Python 75.22% Shell 0.64% HTML 1.95% Jupyter Notebook 22.19%

squad nlp tensorflow question-answering bidaf

bi-att-flow's Introduction

Bi-directional Attention Flow for Machine Comprehension

This is the original implementation of Bi-directional Attention Flow for Machine Comprehension.
The CodaLab worksheet for the SQuAD Leaderboard submission is available here.
For a TensorFlow v1.2-compatible version, see the dev branch.
Please contact Minjoon Seo (@seominjoon) for questions and suggestions.

0. Requirements

General

Python (verified on 3.5.2. Issues have been reported with Python 2!)
unzip, wget (for running download.sh only)

Python Packages

tensorflow (deep learning library, only works on r0.11)
nltk (NLP tools, verified on 3.2.1)
tqdm (progress bar, verified on 4.7.4)
jinja2 (for visualization; not needed if you only train and test)

1. Pre-processing

First, prepare the data. Download the SQuAD data, GloVe vectors and nltk corpus (~850 MB; this will download files to $HOME/data):

chmod +x download.sh; ./download.sh

Second, preprocess the Stanford QA dataset (along with the GloVe vectors) and save the results in $PWD/data/squad (~5 minutes):

python -m squad.prepro

2. Training

The model has ~2.5M parameters. The model was trained on an NVIDIA Titan X (Pascal architecture, 2016). The model requires at least 12GB of GPU RAM. If your GPU has less than 12GB of RAM, you can either decrease the batch size (performance might degrade) or use multiple GPUs (see below). Training converges at ~18k steps at ~4s per step (i.e. ~20 hours).
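As a quick sanity check of the training-time estimate, the arithmetic can be written out as below. The two inputs are the approximate figures quoted above, so treat the result as a rough guide for your own hardware rather than a guarantee.

```python
# Back-of-the-envelope training time from the approximate figures above.
steps = 18000           # ~18k steps to convergence
seconds_per_step = 4    # ~4s per step on the GPU described above

total_hours = steps * seconds_per_step / 3600
print("Estimated training time: {:.0f} hours".format(total_hours))
```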

Before training, it is recommended to first try the following code to verify everything is okay and memory is sufficient:

python -m basic.cli --mode train --noload --debug

Then to fully train, run:

python -m basic.cli --mode train --noload

You can speed up the training process with optimization flags:

python -m basic.cli --mode train --noload --len_opt --cluster

You can still omit them, but training will be much slower.

Note that during training, the EM and F1 scores from the occasional evaluation are not the same as the scores from the official SQuAD evaluation script. The printed scores are not official (our scoring scheme is a bit harsher). To obtain the official numbers, use the official evaluator (copied in the squad folder, squad/evaluate-v1.1.py). For more information, see 3. Test.

3. Test

To test, run:

python -m basic.cli

Similarly to training, you can give the optimization flags to speed up test (5 minutes on dev data):

python -m basic.cli --len_opt --cluster

This command loads the most recently saved model during training and begins testing on the test data. After the process ends, it prints F1 and EM scores, and also outputs a json file ($PWD/out/basic/00/answer/test-####.json, where #### is the step # that the model was saved). Note that the printed scores are not official (our scoring scheme is a bit harsher). To obtain the official number, use the official evaluator (copied in squad folder) and the output json file:

python squad/evaluate-v1.1.py $HOME/data/squad/dev-v1.1.json out/basic/00/answer/test-####.json
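For orientation, here is a simplified sketch of how SQuAD-style EM and F1 are commonly computed for a single prediction against a single reference answer. It is an illustration written for this README, not a copy of squad/evaluate-v1.1.py, and the helper names are chosen here; always use the official script for reportable numbers.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, ground_truth):
    """True if the normalized strings are identical."""
    return normalize_answer(prediction) == normalize_answer(ground_truth)


def f1(prediction, ground_truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(ground_truth).split()
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "eiffel tower!"))
print(f1("the tall Eiffel Tower", "Eiffel Tower"))
```

A dataset-level score would average these per-question values, typically taking the maximum over all reference answers for each question.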

3.1 Loading from pre-trained weights

Instead of training the model yourself, you can choose to use pre-trained weights that were used for SQuAD Leaderboard submission. Refer to this worksheet in CodaLab to reproduce the results. If you are unfamiliar with CodaLab, follow these simple steps (given that you met all prereqs above):

1. Download save.zip from the worksheet and unzip it in the current directory.
2. Copy glove.6B.100d.txt from your glove data folder ($HOME/data/glove/) to the current directory.
3. To reproduce the single model:

basic/run_single.sh $HOME/data/squad/dev-v1.1.json single.json

This writes the answers to single.json in the current directory. You can then use the official evaluator to obtain EM and F1 scores. If you want to run on GPU (~5 mins), change the value of the batch_size flag in the shell file to a higher number (60 for 12GB of GPU RAM).

4. Similarly, to reproduce the ensemble method:

basic/run_ensemble.sh $HOME/data/squad/dev-v1.1.json ensemble.json

If you want to run on GPU, you should either run the script sequentially by removing '&' in the for loop, or specify a different GPU for each run of the for loop.

Results

Dev Data

Note that these scores are from the official evaluator (copied in the squad folder, squad/evaluate-v1.1.py). For more information, see 3. Test. The scores printed during training could be lower than the scores from the official evaluator.

	EM (%)	F1 (%)
single	67.7	77.3
ensemble	72.6	80.7

Test Data

	EM (%)	F1 (%)
single	68.0	77.3
ensemble	73.3	81.1

Refer to our paper for more details. See SQuAD Leaderboard to compare with other models.

Multi-GPU Training & Testing

Our model supports multi-GPU training. We follow the parallelization paradigm described in the TensorFlow Tutorial. In short, if you want to use a batch size of 60 (default) but only have 3 GPUs with 4GB of RAM each, you initialize each GPU with a batch size of 20 and combine the gradients on the CPU. This can be done by running:

python -m basic.cli --mode train --noload --num_gpus 3 --batch_size 20

Similarly, you can speed up your testing by:

python -m basic.cli --num_gpus 3 --batch_size 20
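To see why splitting a batch of 60 into three shards of 20 is equivalent to one large batch, here is a toy sketch in plain Python. It uses a made-up one-parameter regression model rather than BiDAF, and only illustrates the general idea that averaging equal-sized per-shard gradients reproduces the full-batch gradient.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean squared error for the toy model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


batch_size, num_gpus = 60, 3
shard_size = batch_size // num_gpus  # 20 examples per GPU

# Synthetic data for illustration only.
xs = [0.1 * i for i in range(batch_size)]
ys = [3.0 * x + 1.0 for x in xs]
w = 0.5

# Gradient computed on the whole batch at once.
full_grad = grad_mse(w, xs, ys)

# Gradient computed per shard (one shard per GPU), then averaged.
shard_grads = [
    grad_mse(w,
             xs[k * shard_size:(k + 1) * shard_size],
             ys[k * shard_size:(k + 1) * shard_size])
    for k in range(num_gpus)
]
avg_grad = sum(shard_grads) / num_gpus

print(full_grad, avg_grad)
```

The equivalence relies on all shards having the same size and on the loss being a mean over examples; with uneven shards the per-shard gradients would need to be weighted.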

Demo

For now, please refer to the demo branch of this repository.


bi-att-flow's Issues

Any guidance on how to generate the attention matrix visualization (pg 8 in paper)?

Hi there,

in the paper (https://arxiv.org/pdf/1611.01603.pdf) - page 8 in pdf, you show an attention-matrix visualization illustrating what context words the attention mechanism focuses on per word in the question. Any guidance on how to generate this visualization, or at a minimum the top-scoring context words per question word given a certain model is used?

Thank you
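As a starting point for the second half of this question, the sketch below picks the top-scoring context words per question word from an attention matrix. It assumes you have already extracted a matrix whose entry [t][j] is the weight between context word t and question word j; the function name and the toy numbers are invented for illustration and do not come from the repository.

```python
def top_context_words(attention, context_tokens, query_tokens, k=2):
    """Return, for each query word, the k context words with the highest weight.

    attention[t][j] is the weight between context word t and query word j.
    """
    results = []
    for j, q_word in enumerate(query_tokens):
        scored = [(attention[t][j], c_word)
                  for t, c_word in enumerate(context_tokens)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        results.append((q_word, [c_word for _, c_word in scored[:k]]))
    return results


# Toy inputs (hypothetical values, not model output).
context = ["Paris", "is", "the", "capital", "of", "France"]
query = ["Where", "is", "Paris"]
attention = [
    [0.10, 0.00, 0.90],  # Paris
    [0.00, 0.80, 0.00],  # is
    [0.00, 0.10, 0.00],  # the
    [0.50, 0.00, 0.05],  # capital
    [0.10, 0.05, 0.00],  # of
    [0.30, 0.00, 0.02],  # France
]

for q_word, words in top_context_words(attention, context, query):
    print(q_word, "->", words)
```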

Limits on size of paragraph/query?

Hello, when using the demo (with the pretrained model) to run against new paragraphs/queries, 1) what limits apply on the length of paragraph and/or query?

I do see in basic/demo_cli.py that it seems the following parameters are set:
flags.DEFINE_integer("word_count_th", 30, "word count th [100]")
flags.DEFINE_integer("char_count_th", 150, "char count th [500]")
flags.DEFINE_integer("sent_size_th", 1000, "sent size th [64]")
flags.DEFINE_integer("num_sents_th", 1000, "num sents th [8]")
flags.DEFINE_integer("ques_size_th", 100, "ques size th [32]")
flags.DEFINE_integer("word_size_th", 48, "word size th [16]")
flags.DEFINE_integer("para_size_th", 1000, "para size th [256]")

Meanwhile, these values are set differently in basic/cli.py.

Is it just the values in basic/demo_cli.py that we care about when running the demo (ie: basic/cli.py flags won't have an effect even if they'd been used to build the model)?
What does each of these numbers entail exactly as it's not obvious in all cases?
And can one change those values in basic/demo_cli.py to any desired number and have them take effect as soon as the demo server is up?
Is there a practical way to let any length of paragraph/query be accepted?

Note there is overlap between the concerns raised here and what's brought up on issue 13 and so I've just raised some of the above questions there as well. Happy to have questions answered within either issue.

Thank you

Problem: sum of class instances when downgrading to python 2.7

Hi all, I'm trying to port the code to Python 2.7. I found that the customized evaluation class overloads the addition operator as def __add__(self, other).

When summing evaluation instances over different batches, the code simply uses sum:
def get_evaluation_from_batches(self, sess, batches):
    e = sum(self.get_evaluation(sess, batch) for batch in batches)
The built-in sum picks up the overloaded addition in Python 3.5, whereas it does not work in Python 2.7.

Has anyone run into the same problem and found a solution? Any suggestion would be appreciated. Thanks!
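For reference, the built-in sum() starts from the integer 0, so the first operation it performs is 0 + obj. A class therefore usually needs __radd__ (or an explicit start value) for sum() to work over its instances. The sketch below uses a toy stand-in class with made-up fields, not the repository's actual evaluation class, to show one way this is commonly handled.

```python
class Evaluation(object):
    """Toy stand-in for an evaluation result (hypothetical fields)."""

    def __init__(self, num_examples, num_correct):
        self.num_examples = num_examples
        self.num_correct = num_correct

    def __add__(self, other):
        # sum() seeds the accumulation with the integer 0.
        if isinstance(other, int) and other == 0:
            return self
        return Evaluation(self.num_examples + other.num_examples,
                          self.num_correct + other.num_correct)

    def __radd__(self, other):
        # Called for `0 + evaluation`, since int does not know how to add us.
        return self.__add__(other)


batches = [Evaluation(20, 12), Evaluation(20, 15), Evaluation(10, 9)]
total = sum(batches)
print(total.num_examples, total.num_correct)
```

An alternative that avoids the 0 seed entirely is functools.reduce(operator.add, batches).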

Clearer description on EM and F1 result would be helpful for reproduction

Hello.
Thank you for sharing a well-written source code repository. I was confused when trying to reproduce your results because the EM and F1 scores on the dev set during training are lower than the scores you report. After a few days of muddling through, I realized you mention this in the README:

Note that the printed scores are not official (our scoring scheme is a bit harsher). To obtain the official number, use the official evaluator (copied in squad folder) and the output json file.

It is easy to miss the above comment. More clarification in the README would be really helpful for others as well. I made a pull request with an updated README.

Is there any reason that you used a harsher scoring scheme instead of SQuAD's evaluation method? Does it help to increase model performance?

What does softsel mean?

softsel appears many times..
Thank you..

ValueError when running training with Multi-GPU

Dear Team,

I am running training with 2 NVIDIA K80 GPUs. I tried the dev branch with TF 1.2.0 and Python 3.6.2 using the following line:

python3 -m basic.cli --mode train --noload --num_gpus 2 --batch_size 30

However the program quits with the errors attached. We are a bit confused on how to track what's causing the error, and we are wondering if we could get some help?

Here starts the log of the error:

Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/dawn-benchmark/tensorflow-qa-orig/bi-att-flow/basic/cli.py", line 112, in <module>
tf.app.run()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/dawn-benchmark/tensorflow-qa-orig/bi-att-flow/basic/cli.py", line 109, in main
m(config)
File "/home/dawn-benchmark/tensorflow-qa-orig/bi-att-flow/basic/main.py", line 24, in main
_train(config)
File "/home/dawn-benchmark/tensorflow-qa-orig/bi-att-flow/basic/main.py", line 83, in _train
models = get_multi_gpu_models(config)
File "/home/dawn-benchmark/tensorflow-qa-orig/bi-att-flow/basic/model.py", line 21, in get_multi_gpu_models
model = Model(config, scope, rep=gpu_idx == 0)
File "/home/dawn-benchmark/tensorflow-qa-orig/bi-att-flow/basic/model.py", line 68, in __init__
self._build_ema()
File "/home/dawn-benchmark/tensorflow-qa-orig/bi-att-flow/basic/model.py", line 298, in _build_ema
ema_op = ema.apply(tensors)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/moving_averages.py", line 375, in apply
colocate_with_primary=(var.op.type in ["Variable", "VariableV2"]))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/slot_creator.py", line 174, in create_zeros_slot
colocate_with_primary=colocate_with_primary)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/slot_creator.py", line 149, in create_slot_with_initializer
dtype)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/slot_creator.py", line 66, in _create_slot_var
validate_shape=validate_shape)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 1065, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 962, in get_variable
use_resource=use_resource, custom_getter=custom_getter)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 367, in get_variable
validate_shape=validate_shape, use_resource=use_resource)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 352, in _true_getter
use_resource=use_resource)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/variable_scope.py", line 682, in _get_single_variable
"VarScope?" % name)
ValueError: Variable model_1/loss/ExponentialMovingAverage/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

where is the "Character Embedding Layer" in code

hi
I am a new coder and very interested in BiDAF, but when I read ./squad/prepro.py I could not find the "Character Embedding Layer" code, only the "Word Embedding Layer". Could you tell me which file ($name.py) includes the "Character Embedding Layer"?

Make use of part of speech(pos)

Hi, @seominjoon

Have you tried using part-of-speech (POS) tags?
Because the same word may have different POS tags, this might improve the performance of bi-att-flow.

If I want to make use of POS in addition to word embedding and character embedding, could you give me some advice? Thank you so much.

AttributeError: 'int' object has no attribute 'summaries'

I am new to this. When I run the skim script as described in README.md, I get the following error:
File "basic/main.py", line 119, in _train
graph_handler.add_summaries(e_train.summaries, global_step)
AttributeError: 'int' object has no attribute 'summaries'

my tensorflow version is 1.1.0

F1 and EM scores are about 3 points lower than the official result

I followed the exact instructions in the 'readme.md' file and started training my model with the following command:
python -m basic.cli --mode train --noload --len_opt --cluster --batch_size 50
After 18K steps I used the following command to test the model
python squad/evaluate-v1.1.py $HOME/data/squad/dev-v1.1.json out/basic/00/answer/test-####.json
and then I got f1=74.982, exact_match=64.90.
The scores for a single model in the original paper are EM=68.0 and F1=77.3, so mine are roughly 2 to 3 points lower (3.1 on EM, 2.3 on F1).
Because the code is provided by the official group, the hyperparameters are exactly the same except for batch_size, which should not affect the model's performance critically. The only reason I can think of is a different initial value.
Has anyone run the same experiment? Or can anyone offer other ideas?
Thanks a lot!!!!

please, update to tf 1.0

The difference between basic and basic_cnn?

https://github.com/allenai/bi-att-flow/tree/master/basic
and
https://github.com/allenai/bi-att-flow/tree/master/basic_cnn
Thank you.

AttributeError: max_num_sents

Hello,

on running basic/run_single.sh ~/data/squad/dev-v1.1.json single.json I got the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/gfrison/ws/bi-att-flow-demo/basic/cli.py", line 109, in <module>
    tf.app.run()
  File "/home/gfrison/.p3/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/home/gfrison/ws/bi-att-flow-demo/basic/cli.py", line 106, in main
    m(config)
  File "/home/gfrison/ws/bi-att-flow-demo/basic/main.py", line 27, in main
    _forward(config)
  File "/home/gfrison/ws/bi-att-flow-demo/basic/main.py", line 192, in _forward
    models = get_multi_gpu_models(config)
  File "/home/gfrison/ws/bi-att-flow-demo/basic/model.py", line 19, in get_multi_gpu_models
    model = Model(config, scope, rep=gpu_idx == 0)
  File "/home/gfrison/ws/bi-att-flow-demo/basic/model.py", line 34, in __init__
    config.batch_size, config.max_num_sents, config.max_sent_size, \
  File "/home/gfrison/.p3/lib/python3.5/site-packages/tensorflow/python/platform/flags.py", line 44, in __getattr__
    raise AttributeError(name)
AttributeError: max_num_sents

I got it in the demo branch; the master branch works correctly.
What can I do to make it work?

Thank you

# 2. Training - ImportError: cannot import name '_linear'

OS 16.04
Python 3 , python 2 also has an err
branch master

✘-1 ~/github/bi-att-flow/allenai/bi-att-flow [master|…32]
23:40 $ python3 -m basic.cli --mode train --noload --debug
Traceback (most recent call last):
File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"main", mod_spec)
File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/tux/github/bi-att-flow/allenai/bi-att-flow/basic/cli.py", line 5, in
from basic.main import main as m
File "/home/tux/github/bi-att-flow/allenai/bi-att-flow/basic/main.py", line 14, in
from basic.model import get_multi_gpu_models
File "/home/tux/github/bi-att-flow/allenai/bi-att-flow/basic/model.py", line 10, in
from my.tensorflow.nn import softsel, get_logits, highway_network, multi_conv1d
File "/home/tux/github/bi-att-flow/allenai/bi-att-flow/my/tensorflow/nn.py", line 1, in
from tensorflow.python.ops.rnn_cell import _linear
ImportError: cannot import name '_linear'
✘-1 ~/github/bi-att-flow/allenai/bi-att-fl

how to handle large context by Machine Comprehension(bi-att-flow) &Resource Exhausted error

I have a test file named mytest1.json. Its context is a large text with many words.

When I run the following:
basic/run_single.sh $HOME/data/squad/mytest1.json single.json

some errors occur. Do you have any idea how to solve this? Thanks so much.

`File "/home/weijiang/bi-att-flow/inference/main.py", line 29, in main
eval_data = _forward(config, data, shared)
File "/home/weijiang/bi-att-flow/inference/main.py", line 88, in _forward
models = get_multi_gpu_models(config)
File "/home/weijiang/bi-att-flow/inference/model.py", line 19, in get_multi_gpu_models
model = Model(config, scope, rep=gpu_idx == 0)
File "/home/weijiang/bi-att-flow/inference/model.py", line 58, in init
self._build_forward()
File "/home/weijiang/bi-att-flow/inference/model.py", line 164, in _build_forward
p0 = attention_layer(config, self.is_train, h, u, h_mask=self.x_mask, u_mask=self.q_mask, scope="p0", tensor_dict=self.tensor_dict)
File "/home/weijiang/bi-att-flow/inference/model.py", line 421, in attention_layer
u_a, h_a = bi_attention(config, is_train, h, u, h_mask=h_mask, u_mask=u_mask, tensor_dict=tensor_dict)
File "/home/weijiang/bi-att-flow/inference/model.py", line 398, in bi_attention
is_train=is_train, func=config.logit_func, scope='u_logits') # [N, M, JX, JQ]
File "/home/weijiang/bi-att-flow/my/tensorflow/nn.py", line 127, in get_logits
new_arg = args[0] * args[1]
File "/home/weijiang/anaconda2/envs/tensorflow-0.11-py3.5/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 751, in binary_op_wrapper
return func(x, y, name=name)
File "/home/weijiang/anaconda2/envs/tensorflow-0.11-py3.5/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 910, in _mul_dispatch
return gen_math_ops.mul(x, y, name=name)
File "/home/weijiang/anaconda2/envs/tensorflow-0.11-py3.5/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1519, in mul
result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
File "/home/weijiang/anaconda2/envs/tensorflow-0.11-py3.5/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/home/weijiang/anaconda2/envs/tensorflow-0.11-py3.5/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2380, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/weijiang/anaconda2/envs/tensorflow-0.11-py3.5/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1298, in init
self._traceback = _extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1,1,34680,7,200]
[[Node: model_0/main/p0/bi_attention/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](model_0/main/p0/bi_attention/Tile, model_0/main/p0/bi_attention/Tile_1)]]
[[Node: model_0/main/g2/BW/BW/Assert/AssertGuard/Assert/Switch/_333 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_4307_model_0/main/g2/BW/BW/Assert/AssertGuard/Assert/Switch", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/cpu:0"]]`
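To put the failing allocation in perspective, the size of the tensor named in the error can be estimated from its shape. The log reports DT_FLOAT, i.e. 4 bytes per element; everything else in this sketch is plain arithmetic. Note that this is only the one tensor that happened to fail: the graph likely holds other intermediates of similar size, and the third dimension (34680) grows with the length of the context.

```python
from functools import reduce
from operator import mul

shape = (1, 1, 34680, 7, 200)   # shape reported in the OOM message
bytes_per_element = 4           # DT_FLOAT is a 32-bit float

num_elements = reduce(mul, shape, 1)
size_bytes = num_elements * bytes_per_element
size_mib = size_bytes / float(2 ** 20)

print("{} elements, about {:.0f} MiB".format(num_elements, size_mib))
```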

basic_cnn too much gpu memory?

Hi,
when training on the SQuAD dataset (basic.cli), a Titan X's 12 GB of GPU memory is enough: config.max_sent_size = min(config.max_num_sents, config.num_sents_th), and config.max_sent_size is 1 because of sent_tokenize = lambda para: [para].

When training on the CNN dataset (basic_cnn.cli), config.max_sent_size = 11 and config.max_num_sents = 200, which results in a GPU memory error.

Any ideas on how to fix it (maybe reduce config.max_num_sents in the parser), or handle it as in SQuAD? Thanks!

3.1 Loading from pre-trained weights "download zip"

point already covered by the download.sh?

Examples are not fully loaded

I run
python -m basic.cli --mode train --noload --debug and get

Loaded 85277/87599 examples from train
Loaded 10258/10570 examples from dev
...

It seems the examples are not fully loaded. What may be the issue?

Is there any document for this code?

HI, It's a little hard for me to understand the details of the code.
Is there any document for the code?
thanks!

Error when running this tensorflow-1.1.0 version

$ python -m basic.cli --mode train --noload --debug
Traceback (most recent call last):
File "anaconda2/envs/tensorflow/lib/python3.5/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "anaconda2/envs/tensorflow/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/bi-att-flow/basic/cli.py", line 5, in
from basic.main import main as m
File "/bi-att-flow/basic/main.py", line 14, in
from basic.model import get_multi_gpu_models
File "/bi-att-flow/basic/model.py", line 6, in
from tensorflow.python.ops.rnn_cell import BasicLSTMCell
ImportError: No module named 'tensorflow.python.ops.rnn_cell'

I have installed the tensorflow-gpu 1.1.0 version and python 3.5.3 and all the required libraries.

It seems there is a mismatch between the code version and TensorFlow. How can I solve this problem?

Well, it's hard to find a tensorflow-gpu 0.11 version available now so I have to run the 1.1.0 version.

Thanks a lot!

ForwardEvaluator Bug

In the class "ForwardEvaluator" in the evaluator.py, there is a function:
class ForwardEvaluator(Evaluator):

    def _get2(context, xi, span):
        if len(xi) <= span[0][0]:
            return ""
        if len(xi[span[0][0]]) <= span[1][1]: 
            return ""
        return get_phrase(context, xi, span)

I think
if len(xi[span[0][0]]) <= span[1][1]:
should be
if len(xi[span[0][0]]) <span[1][1]:

In some cases, len(xi[span[0][0]]) is equal to span[1][1], e.g., the answer is from somewhere to the end of context.
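The boundary case described here can be illustrated with a small stand-alone check. The sketch assumes, as the report implies, that span is ((start_sent, start_word), (stop_sent, stop_word)) and that stop_word is an exclusive end index; the function name is made up for this example and is not part of evaluator.py.

```python
def span_in_bounds(xi, span):
    """Check a span against tokenized sentences, treating stop_word as exclusive.

    xi is a list of sentences, each a list of tokens.
    """
    (start_sent, _), (_, stop_word) = span
    if len(xi) <= start_sent:
        return False
    # An exclusive end index may equal the sentence length (answer runs to the end).
    return stop_word <= len(xi[start_sent])


xi = [["the", "quick", "fox"]]

print(span_in_bounds(xi, ((0, 1), (0, 3))))  # span covering "quick fox"
print(span_in_bounds(xi, ((0, 1), (0, 4))))  # end index past the sentence
```

Under this assumption, a check written as len(sentence) <= stop_word would reject the first span even though it ends exactly at the last token.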

Do I need to consider number of sentences in a paragraph?

I'm really interested in your model, so I'm trying to implement the model on pytorch.
But there's a problem: while reconstructing your model, I found that a dimension for the number of sentences exists, whereas as far as I can tell your paper never mentions this dimension. I'm aiming to implement the model for the SQuAD dataset and have been thinking carefully about why this dimension is needed. So I want to ask: can this dimension be useful, and why did you add it?

idx2vec_dict is empty

https://github.com/allenai/bi-att-flow/blob/dev/basic/main.py#L75

[minor] spelling mistake in Links section

Stanford Question Asnwering Dataset and Lederboard should be Stanford Question Answering Dataset and Leaderboard

Error from graph_handler

Thanks for the great work and for sharing the codes.

I am trying to reproduce your single model results. Following the steps you have described, I tried:

python -m basic.cli --mode train --noload --len_opt --cluster

Everything worked fine till 4% of training, then I got the following error from graph_handler:

File "basic/main.py", line 117, in _train
graph_handler.add_summaries(e_train.summaries, global_step)
AttributeError: 'int' object has no attribute 'summaries'

It looks like it is from TF, not sure, any ideas how I can resolve this?

Thanks!
Hamid

[question] Confidence Values

I've been using BiDAF for some time now, and I love the system! The only one question I've had is that I'm looking for a way to display a "confidence" or "score" for the answers it gives. Is that possible?
I've been looking into the source code for quite a while, but I'm unable to find out where that code may lie.
Thanks!
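One common approach, sketched below, is to use the probability of the selected span as a score. The paper chooses the answer span (k, l) with k <= l that maximizes the product of the start probability and the end probability; treating that product as a "confidence" is an assumption made here, and it is a raw model probability rather than a calibrated one. The function name and toy distributions are invented for illustration.

```python
def best_span(p_start, p_end):
    """Return ((start, end), score) maximizing p_start[start] * p_end[end], start <= end."""
    best_score = 0.0
    best = (0, 0)
    best_start = 0  # index of the largest start probability seen so far
    for end in range(len(p_end)):
        if p_start[end] > p_start[best_start]:
            best_start = end
        score = p_start[best_start] * p_end[end]
        if score > best_score:
            best_score = score
            best = (best_start, end)
    return best, best_score


# Toy start/end distributions over four context tokens (hypothetical values).
p_start = [0.1, 0.6, 0.2, 0.1]
p_end = [0.05, 0.15, 0.7, 0.1]

span, score = best_span(p_start, p_end)
print(span, score)
```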

process_tokens() in utils.py

Hi,
I have a question about process_tokens(temp_tokens) in utils.py.

After process_tokens() is invoked,
the punctuation characters in [-−—–/~"\'“’”‘°] split temp_tokens again.

This may result in some items of "xi = [process_tokens(tokens) for tokens in xi]" having length 0.
In other words, some items in xi may be "" (the empty string).
I think this is not necessary.

In my experiments, if I remove process_tokens(), the performance decreases.
If I keep process_tokens() and remove the empty strings in xi, the performance seems to be almost the same.

Thanks.
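The effect described here can be reproduced with a small stand-alone sketch. It splits tokens on the punctuation class quoted above using re.split with a capturing group, which emits an empty string whenever a token starts or ends with one of those characters. The function name and the drop_empty switch are invented for this example; it is not the code from utils.py.

```python
import re

# Punctuation class quoted above; the hyphen comes first so it is literal.
SPLIT_CHARS = "-\u2212\u2014\u2013/~\"'\u201C\u2019\u201D\u2018\u00B0"
SPLIT_RE = re.compile("([{}])".format(SPLIT_CHARS))


def split_tokens(tokens, drop_empty=True):
    """Split each token on the punctuation class, keeping the delimiters."""
    out = []
    for token in tokens:
        pieces = SPLIT_RE.split(token)
        if drop_empty:
            # A leading or trailing delimiter yields "" pieces; filter them out.
            pieces = [piece for piece in pieces if piece]
        out.extend(pieces)
    return out


print(split_tokens(["-5"], drop_empty=False))
print(split_tokens(["state-of-the-art", "-5"]))
```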

Error using pre trained weights

@seominjoon I tried following the instructions to save the pretrained weights but when I run the model using basic/run_single.sh $HOME/data/squad/dev-v1.1.json single.json I get the following error:
The inter_single folder is missing from the repo.

Would you be able to help me fix this issue?

KeyError: 'contextss'

I trained a model using the dev branch (tensorflow v1.12) and tried running that model in the demo branch, but got the above error in the run-demo.py file.

What does aug mean?

aug appears many times..
Thank you..

Compatibility with tf-serving

I am working on making this compatible with tf-serving. How do I set the config variables? tf-serving only takes tensors, right? But how do you update scalars?

Extracting attention weights for a query

Where can I extract the attention weights for a query? I would like to visualise the attention for a query e.g. similar to this example:

dev loss overfits after 7k steps (SQuAD)

setting:
The dev branch code
tensorflow v1.1.0
Python 3.6.1 |Anaconda 4.4.0 (64-bit)
1 TITAN X (Pascal)
Driver Version: 375.26

Running:
Run the default scripts in the tutorial to duplicate the result of SQuAD.

Results:

python -m basic.cli --batch_size 30
test step 20000: accuracy=0.6044, f1=0.7255, loss=3.7624 

python squad/evaluate-v1.1.py $HOME/data/squad/dev-v1.1.json out/basic/00/answer/test-020000.json
{"exact_match": 63.434247871333966, "f1": 73.9342647706341}

Seems it overfits after 7K steps.
The score at step 7K:

python -m basic.cli --batch_size 30
test step 7000: accuracy=0.6129, f1=0.7289, loss=3.0944

python squad/evaluate-v1.1.py $HOME/data/squad/dev-v1.1.json out/basic/00/answer/test-007000.json
{"exact_match": 64.00189214758751, "f1": 74.19603365373315}

Issue #35 is worth mentioning here. Also, the default learning rate is 0.001 instead of the 0.5 given in the paper. I suppose 0.5 is for the CNN dataset.
I also noticed this comment in the tutorial: "Dev Data (old) NOTE: These numbers are from v0.2.1." Why would this matter?

Any insight will be appreciated.

how to predict answers for custom question and context by reusing loaded model.

Hi:
After I reload the MRC model,
how can I implement a predict-like function to get answers for different input questions and contexts?

Meanwhile, I want to reuse the loaded MRC model and session. I do not want to reload the model and initialize the session many times.
Thanks

prepro.py default not splitting sentences?

Hi, I followed the preprocessing example you provide, but in the output the sentences are not separated unless I add the "--split" flag. I wonder if you trained the official model this way as well?

ImportError: /usr/local/lib/python3.4/dist-packages/tensorflow/python/_pywrap_tensorflow.so: invalid ELF header

https://hastebin.com/elejasoqef.sql -- colored output

OS 14.04
bi-att-flow branch: Master
script that produced the error: (master branch) https://github.com/alphaaurigae/bi-att-flow_bash

BEGIN TRAINING SECTION 2.TRAINING

2. Training

Before training, it is recommended to first try the following code to verify everything is okay and memory is sufficient:
python3 -m basic.cli --mode train --noload --debug

run python3 -m basic.cli --mode train --noload --debug OR n SKIP??? [Y/n] Y
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 18, in swig_import_helper
return importlib.import_module(mname)
File "/usr/lib/python3.4/importlib/init.py", line 109, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 2254, in _gcd_import
File "", line 2237, in _find_and_load
File "", line 2226, in _find_and_load_unlocked
File "", line 1191, in _load_unlocked
File "", line 1161, in _load_backward_compatible
File "", line 539, in _check_name_wrapper
File "", line 1715, in load_module
File "", line 321, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.4/dist-packages/tensorflow/python/_pywrap_tensorflow.so: invalid ELF header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.4/runpy.py", line 170, in _run_module_as_main
"main", mod_spec)
File "/usr/lib/python3.4/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/hus/github/bi-att-flow/allenai/bi-att-flow/basic/cli.py", line 3, in
import tensorflow as tf
File "/usr/local/lib/python3.4/dist-packages/tensorflow/init.py", line 23, in
from tensorflow.python import *
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/init.py", line 49, in
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 21, in
_pywrap_tensorflow = swig_import_helper()
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 20, in swig_import_helper
return importlib.import_module('_pywrap_tensorflow')
File "/usr/lib/python3.4/importlib/init.py", line 109, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: No module named '_pywrap_tensorflow'

Then to fully train, run:

python3 -m basic.cli --mode train --noload

run python3 -m basic.cli --mode train --noload OR n SKIP??? [Y/n] ^C

error in multi-GPU training

I am able to train the model on cpu. But when I am trying to train it on multi gpu with following command:
python -m basic.cli --mode train --noload --num_gpus 3 --batch_size 20
it is throwing following error:
ValueError: Variable model_1/model_1/loss/ExponentialMovingAverage/biased does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

I tried searching for it on the internet, but being new to TensorFlow, I could not figure out where to make the fix. Any pointers would be helpful.

Assert error in squad/prepro.py

How should I deal with this error? Thanks!

If I comment out that line, it ends up just failing on another similar assert statement...

basic_cnn (vs basic, tree)

Hi Minjoon,

Nice work on the paper and code!

I have a few questions. Can you explain briefly what the differences between basic, basic_cnn, & tree are? For the results described in your README under single model, were they obtained with basic & default settings?

Lastly, in basic_cnn/read_data, the field sorted doesn't seem to exist, causing an error when accessing shared['sorted']. Do you have any thoughts on how to fix it?

Thanks!
-Thang

Looking for tester - install and run routine bash script

I wrote this one https://github.com/alphaaurigae/bi-att-flow_bash
I'm on a netbook, so I'm having a hard time testing it until a test server is set up. If anybody here could have a look at it and do a test run, that would be awesome.

Variable model_1/loss/ExponentialMovingAverage/ does not exist

Hi, I'm running the dev branch code on TensorFlow 1.2.

And I got this error:
Variable model_1/loss/ExponentialMovingAverage/ does not exist or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope.

From the stack trace, it was from basic/model.py, in _build_ema, ema_op=ema.apply(tensors).

I tried to add "with tf.variable_scope(tf.get_variable_scope(), reuse=False):" before ema.apply, but that still doesn't work.

Any ideas on how I can fix this?

Thanks!
I'm using CUDA8.0 and Cudnn5.1. Tensorflow v1.2, python 3.5.

How fast is the inference time?

Use the pre-trained model to make new predictions

I'm using the pre-trained model to get predictions for new questions on the SQuAD dataset.

For a paragraph of an article, I add questions that aren't necessarily answerable from the paragraph. Nonetheless, the pre-processing step asks me to provide answer_start, which I cannot do since I don't know whether the answer to the question is in the paragraph.

Is there a way to get predictions without inserting the real answer_start value (rather than an error that makes the model produce a default output)?

I look forward to hearing from you. Thank you :)
Cristina
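
One possible workaround, sketched here under assumptions rather than taken from the repository: SQuAD v1.1 files store each question with an answers list of text and answer_start pairs, so a placeholder answer copied from the paragraph itself keeps the file well-formed. The function name make_squad_style, the "custom-" ids, and the choice of the first token as the dummy answer are all hypothetical, and whether squad.prepro accepts such a file has not been verified.

```python
import json


def make_squad_style(title, context, questions):
    """Wrap one paragraph and new questions in SQuAD v1.1-style JSON.

    Assumption: the context is non-empty. The dummy answer is only a
    placeholder so that every question has a valid text/answer_start pair.
    """
    # Use the first whitespace-delimited token of the paragraph as the dummy answer.
    dummy_text = context.split()[0]
    dummy_start = context.index(dummy_text)

    qas = []
    for i, question in enumerate(questions):
        qas.append({
            "id": "custom-{:04d}".format(i),  # hypothetical id scheme
            "question": question,
            "answers": [{"text": dummy_text, "answer_start": dummy_start}],
        })

    return {
        "version": "1.1",
        "data": [{"title": title, "paragraphs": [{"context": context, "qas": qas}]}],
    }


if __name__ == "__main__":
    doc = make_squad_style(
        "Example",
        "The income tax withholding rate remains at 4.25%.",
        ["What is the withholding rate?", "Who wrote the law?"],
    )
    print(json.dumps(doc, indent=2))
```

With placeholder answers, any EM or F1 score reported for these questions is meaningless; only the predicted spans are of interest. Note also that SQuAD v1.1 contains only answerable questions, so a model trained on it will return some span from the paragraph even when the question has no answer there.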

How to run the demo?

From the 'demo' branch I tried running the demo, but I get the following error.

Initially it was looking for a file out/basic/00/basic-18000 which does not exist in the repo. I changed the path to out/basic/00/basic-20000 which does exist but then get a different error:

NotFoundError (see above for traceback): Tensor name "model_0/prepro/u1/FW/BasicLSTMCell/Linear/Matrix/ExponentialMovingAverage" not found in checkpoint files out/basic/00/save/basic-20000

Understanding the meaning of TimeStep in Attention Layer

Hi Guys,
I've been going through your paper on Bi-Directional Attention Flow for Machine Comprehension and I'm amazed, to say the least. I have been trying to implement the architecture you propose, but I have hit a roadblock in understanding a particular sentence that is repeated within the paper.
"The attention vector at each time step, along with the embeddings from previous layers, are allowed to flow through to the subsequent modeling layer."

I can read this line in two ways:

1. By "each time step" you mean that the shared similarity matrix is updated as and when a new word output from the LSTM comes into the attention layer, so that S at time step t differs from S at time step t+1. If so, how do I visualize the output at the end of time step t?
2. Or is it just another way of saying return_sequences=True, with the rest taken care of by the Keras layer?

I'm trying to implement the architecture in Python using Keras layers. Any help is highly appreciated.
Thank you for your valuable time.
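
For what it is worth, here is how I read the attention layer in the paper, as a minimal pure-Python sketch rather than the repository's implementation. The similarity matrix S is computed once from the full context and query encodings; "time step t" then simply indexes row t of S, and the layer emits one output vector per context position. The helper names are hypothetical, and the plain dot product stands in for the paper's trainable similarity function.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


def attention_flow(H, U):
    """H: T context vectors, U: J query vectors, all of dimension d.

    Returns G: T vectors of dimension 4d, one per context time step.
    """
    # Similarity matrix S (T x J), built once from the complete sequences.
    # Simplification: dot product instead of the paper's trainable function.
    S = [[dot(h, u) for u in U] for h in H]

    # Context-to-query: row t of S gives the attention over query words
    # for context time step t.
    U_tilde = [weighted_sum(softmax(row), U) for row in S]

    # Query-to-context: a single attention over context words, taken from
    # the row-wise maxima of S, shared by every time step.
    b = softmax([max(row) for row in S])
    h_tilde = weighted_sum(b, H)

    # G[t] = [h; u~; h * u~; h * h~] with element-wise products.
    G = []
    for h, u_t in zip(H, U_tilde):
        G.append(
            h
            + u_t
            + [x * y for x, y in zip(h, u_t)]
            + [x * y for x, y in zip(h, h_tilde)]
        )
    return G


if __name__ == "__main__":
    H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # T = 3, d = 2
    U = [[1.0, 0.0], [0.0, 2.0]]              # J = 2
    for t, g in enumerate(attention_flow(H, U)):
        print(t, g)
```

In Keras terms this is closer to the second reading: the output keeps the time axis of length T, as a return_sequences=True layer would, and nothing in S changes as t advances.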

Why does the model still work when the paragraph length changes from 400 to 740?

During training, we limit the paragraph length to 400.
But during testing, we use 740 as the paragraph length.
Isn't it surprising that the model still works?
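
A likely part of the explanation, offered as an assumption rather than an authoritative answer: recurrent layers share the same weights at every position, so no parameter depends on the paragraph length. The toy scalar RNN below (hypothetical names and arbitrary weights, not the model's LSTM) runs unchanged on inputs of length 400 and 740.

```python
import math


def run_rnn(inputs, w_in, w_rec, bias):
    """Toy scalar tanh RNN: the same three parameters are used at every position."""
    state = 0.0
    outputs = []
    for x in inputs:
        state = math.tanh(w_in * x + w_rec * state + bias)
        outputs.append(state)
    return outputs


if __name__ == "__main__":
    params = {"w_in": 0.5, "w_rec": 0.8, "bias": 0.1}  # arbitrary illustrative values
    short_run = run_rnn([0.1] * 400, **params)
    long_run = run_rnn([0.1] * 740, **params)
    print(len(short_run), len(long_run))
```

This only shows that nothing breaks structurally when the length changes. It does not by itself guarantee that accuracy holds up on paragraphs longer than those seen in training.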

How to generate the save file in the save folder?

Like the one in submission folder 37/ at https://worksheets.codalab.org/worksheets/0x37a9b8c44f6845c28866267ef941c89d/

Is it the .data file generated by TensorFlow training? Thanks.

CNN+SQuAD

Is it possible to use both the CNN and SQuAD datasets to train BiDAF? If so, how?

nltk tokenize doesn't work?

Dear Team,
The code below doesn't work: the context is not split into sentences.
if args.tokenizer == "PTB":
    import nltk
    sent_tokenize = nltk.sent_tokenize
    def word_tokenize(tokens):
        return [token.replace("''", '"').replace("``", '"') for token in nltk.word_tokenize(tokens)]
When I check shared_dev.json, I get this:
"x": [
[
[
[
"The",
"income",
"tax",
"withholding",
"rate",
"remains",
"at",
"4.25",
"%",
"for",
"tax",
"year",
"2015",
".",
"However",
",",
"the",
"personal",
"exemption",
"amount",
"for",
"tax",
"year",
"2015",
"will",
"change",
"to",
"$",
"4,000",
".",
"You",
"may",
"continue",
"to",
"use",
"2014",
"Michigan",
"Income",
"Tax",
"Withholding",
"Tables",
"."
],
But if I change the code as follows, it works.

import nltk.tokenize as nltk
def prepro_each(args, data_type, start_ratio=0.0, stop_ratio=1.0, out_name="default", in_path=None):
    if args.tokenizer == "PTB":
        # sent_tokenize = nltk.sent_tokenize
        def word_tokenize(tokens):
            return [token.replace("''", '"').replace("``", '"') for token in nltk.word_tokenize(tokens)]
    ......
    xi = list(map(word_tokenize, nltk.sent_tokenize(context)))

After changing the code and running again, I got slightly lower EM and F1 scores, which puzzles me. Could you please help me solve the problem?

No such file or directory: 'inter_single/eval.json'

Hello,

I ran the pre-trained weights with:

basic/run_single.sh ~/data/squad/dev-v1.1.json single.json

but I got:

File "/usr/lib/python3.5/gzip.py", line 163, in __init__
fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'inter_single/eval.json'

What did I do wrong?

Thank you
