
drqa's Introduction

DrQA

A PyTorch implementation of the ACL 2017 paper Reading Wikipedia to Answer Open-Domain Questions (DrQA).

Reading comprehension is the task of producing an answer given a question and one or more pieces of evidence (usually natural-language paragraphs). Compared to question answering over knowledge bases, reading comprehension models are more flexible and have shown great potential for zero-shot learning.

SQuAD is a reading comprehension benchmark in which there is only a single piece of evidence and the answer is guaranteed to be part of it. Since the publication of the SQuAD dataset, research on reading comprehension has progressed rapidly and many strong models have emerged. DrQA is conceptually simpler than most of them yet still yields strong performance as a single model.

The motivation for this project is to offer a clean implementation of DrQA for the machine reading comprehension task, so one can quickly make modifications and try out new ideas. Click here to see the comparison with what is described in the original paper and with the two "official" projects, ParlAI and DrQA.

Requirements

Quick Start

Setup

  • download the project via git clone https://github.com/hitvoice/DrQA.git; cd DrQA
  • make sure Python 3, pip, wget and unzip are installed.
  • install the PyTorch build matching your OS, Python and CUDA versions.
  • install the remaining requirements via pip install -r requirements.txt
  • download the SQuAD data, GloVe word vectors and spaCy English language model using bash download.sh.

Train

# prepare the data
python prepro.py
# train for 40 epochs with batch size 32
python train.py -e 40 -bs 32

Warning: Running prepro.py takes about 9 GB of memory when using 8 threads. If there is not enough memory on your machine, try reducing the number of threads used by the script, for example: python prepro.py --threads 2

Predict

python interact.py

Example interactions:

Evidence: Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24-10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.
Question: What day was the game played on?
Answer: February 7, 2016
Time: 0.0245s

Evidence: Super Bowl 50 was an American football game to determine the champion of the National Football League (NFL) for the 2015 season. The American Football Conference (AFC) champion Denver Broncos defeated the National Football Conference (NFC) champion Carolina Panthers 24-10 to earn their third Super Bowl title. The game was played on February 7, 2016, at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.
Question: What is the AFC short for?
Answer: The American Football Conference
Time: 0.0214s

Evidence: Beanie style with simple design. So cool to wear and make you different. It wears as peak cap and a fashion cap. It is good to match your clothes well during party and holiday, also makes you charming and fashion, leisure and fashion in public and streets. It suits all adults, for men or women. Matches well with your winter outfits so you stay warm all winter long.
Question: Is it for women?
Answer: It suits all adults, for men or women
Time: 0.0238s

The last example is a randomly picked product description from Amazon (not in SQuAD).

Results

EM & F1

                       EM      F1
in the original paper  69.5    78.8
in this project        69.64   78.76
official (spaCy)       69.71   78.94
official (CoreNLP)     69.76   79.09

Compared with the official implementation:
[comparison chart]

Detailed Comparisons

Compared to what's described in the original paper:

  • The grammatical features are generated by spaCy instead of Stanford CoreNLP. It's much faster and produces similar scores.

Compared to the code in facebookresearch/DrQA:

  • This project is much more lightweight and focuses solely on training and evaluating on the SQuAD dataset; it lacks the document retriever, the interactive inference API, and some other features.
  • The implementation in facebookresearch/DrQA can train on multiple GPUs, while (currently, for simplicity) this implementation supports single-GPU training only.

Compared to the code in facebookresearch/ParlAI:

  • The DrQA model is no longer wrapped in a chatbot framework, which makes the code more readable, easier to modify, and faster to train. Text preprocessing is performed only once, whereas in a dialog framework raw text is transmitted on every turn and the same text has to be preprocessed again and again.
  • This is a full implementation of the original paper, while the model in ParlAI is a partial implementation, missing all grammatical features (lemma, POS tags and named entity tags).
  • Some minor bug fixes. Some of them have been merged into ParlAI.

About

Maintainer: Runqi Yang.

Credits: thank Jun Yang for code review and advice.

Most of the PyTorch model code is borrowed from facebookresearch/ParlAI under a BSD-3 license.

drqa's People

Contributors

brettkoonce, hitvoice


drqa's Issues

AssertionError: Torch not compiled with CUDA enabled

$ python3 train.py -e 40 -bs 32

02/15/2020 05:17:11 [Program starts. Loading data...]
02/15/2020 05:22:48 {'log_per_updates': 3, 'data_file': 'SQuAD/data.msgpack', 'model_dir': '/Users/balagopalbhallamudi/Desktop/DrQA/models', 'save_last_only': False, 'save_dawn_logs': False, 'seed': 1013, 'cuda': False, 'epochs': 40, 'batch_size': 32, 'resume': '', 'resume_options': False, 'reduce_lr': 0.0, 'optimizer': 'adamax', 'grad_clipping': 10, 'weight_decay': 0, 'learning_rate': 0.1, 'momentum': 0, 'tune_partial': 1000, 'fix_embeddings': False, 'rnn_padding': False, 'question_merge': 'self_attn', 'doc_layers': 3, 'question_layers': 3, 'hidden_size': 128, 'num_features': 4, 'pos': True, 'ner': True, 'use_qemb': True, 'concat_rnn_layers': True, 'dropout_emb': 0.4, 'dropout_rnn': 0.4, 'dropout_rnn_output': True, 'max_len': 15, 'rnn_type': 'lstm', 'pretrained_words': True, 'vocab_size': 91590, 'embedding_dim': 300, 'pos_size': 50, 'ner_size': 19}
02/15/2020 05:22:48 [Data loaded.]
02/15/2020 05:22:48 Epoch 1
02/15/2020 07:07:48 > epoch [ 1] updates[ 2707] train loss[4.38260] remaining[0:00:00]

02/15/2020 07:09:46 dev EM: 53.140964995269634 F1: 64.78947947738538
Traceback (most recent call last):
  File "train.py", line 377, in <module>
    main()
  File "train.py", line 87, in main
    model.save(model_file, epoch, [em, f1, best_val_score])
  File "/Users/balagopalbhallamudi/Desktop/DrQA/drqa/model.py", line 147, in save
    'torch_cuda_state': torch.cuda.get_rng_state()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/cuda/random.py", line 20, in get_rng_state
    _lazy_init()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/cuda/__init__.py", line 196, in _lazy_init
    _check_driver()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/cuda/__init__.py", line 94, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
(base) Balagopals-MacBook-Pro:DrQA balagopalbhallamudi$ python3 interact.py
Traceback (most recent call last):
  File "interact.py", line 31, in <module>
    checkpoint = torch.load(args.model_file, map_location=lambda storage, loc: storage)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/serialization.py", line 525, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/serialization.py", line 212, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/serialization.py", line 193, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'models/best_model.pt'
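
For what it's worth, a hedged sketch of a possible patch to the save path quoted above: guard the CUDA RNG state with an availability check (this is a sketch, not the repo's actual fix):

    # Sketch: only record the CUDA RNG state when CUDA is actually available,
    # so saving checkpoints also works on CPU-only machines.
    state = {'torch_state': torch.get_rng_state()}
    if torch.cuda.is_available():
        state['torch_cuda_state'] = torch.cuda.get_rng_state()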

Getting low F1 and EM scores

I have been working independently to implement this paper and have referred to this repo on many occasions. I started training the model, but I am not getting satisfactory results.
I am not updating the GloVe embeddings during training and am not using the POS and NER features; I have, however, included f_align_feature.
Can you tell me some reasons why this might be the case?

New dataset

How do I add a new dataset to the prediction engine for training and testing purposes? And how do I add a new dataset for prediction purposes?

Does it require GPU acceleration

Hi, does it require GPU acceleration, i.e. the GPU version of PyTorch? Can we make it run on CPUs? How many cores and how much RAM are required to run it?

Question

Hi, I'm new to machine learning and I've got a question regarding model predictions. What kind of data do I have to provide? Or, a more general question: how do I actually make predictions using this particular model?
Thanks

train stop

Hello, I'm a new researcher in machine reading comprehension.
When I run "python train.py -e 40 -bs 32", the process stops at "Data loaded".
Could you give me a solution for this?

prepro.py: `to_id` function assigns ids using tokens in BOTH train and dev?

Here's a code snippet of prepro.py:

full = train + dev
vocab, counter = build_vocab([row[5] for row in full], [row[1] for row in full])
w2id = {w: i for i, w in enumerate(vocab)}

def to_id(row, unk_id=1):
    context_tokens = row[1]
    context_features = row[2]
    context_tags = row[3]
    context_ents = row[4]
    question_tokens = row[5]
    question_ids = [w2id[w] if w in w2id else unk_id for w in question_tokens]
    context_ids = [w2id[w] if w in w2id else unk_id for w in context_tokens]
    ...

train = list(map(to_id, train))
dev = list(map(to_id, dev))

If my interpretation is right, this means that when processing the dev set, the FULL vocab set (constructed from train+dev) is used to determine if words in dev set are UNK. Shouldn't it be using vocab constructed from the train set only?
Let me know if my interpretation is right :)
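
For reference, a minimal sketch of the train-only alternative the question suggests, reusing the names from the snippet above (a sketch of the proposed change, not a claim about what the repo should do):

    # Build the vocab from the training split only; dev-only words then map to UNK.
    vocab, counter = build_vocab([row[5] for row in train], [row[1] for row in train])
    w2id = {w: i for i, w in enumerate(vocab)}

    train = list(map(to_id, train))
    dev = list(map(to_id, dev))  # any dev token absent from w2id becomes unk_id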

msgpack.exceptions.UnpackValueError: Unpack failed: error = 0

I followed all of the instructions and then got this error. Does anyone know how to go about troubleshooting it?

msgpack.exceptions.UnpackValueError: Unpack failed: error = 0

Complete Error Message:

Traceback (most recent call last):
  File "/home/samrat/Documents/StudyBuddy/Algorithms/DrQA/interact.py", line 36, in <module>
    meta = msgpack.load(f, encoding='utf8')
  File "/home/samrat/Documents/StudyBuddy/venv/lib/python3.6/site-packages/msgpack/__init__.py", line 58, in unpack
    return unpackb(data, **kwargs)
  File "msgpack/_unpacker.pyx", line 211, in msgpack._unpacker.unpackb
msgpack.exceptions.UnpackValueError: Unpack failed: error = 0

Just FYI, I trained the model in Google Colab and downloaded it from there. Could this have caused any problems? I ran interact.py in Colab and it worked fine so I am really unsure what the problem is.

Thank You in Advance!!
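
One possible cause, offered as an assumption rather than a confirmed diagnosis: a truncated or corrupted download of the .msgpack file, or a msgpack version mismatch between the environment that wrote the file (Colab) and the one reading it — msgpack >= 1.0 removed the encoding argument used by interact.py. A hedged sketch of version-tolerant loading:

    import msgpack

    with open('SQuAD/meta.msgpack', 'rb') as f:
        try:
            meta = msgpack.load(f, encoding='utf8')  # msgpack < 1.0, as in the repo
        except TypeError:
            f.seek(0)
            meta = msgpack.load(f, raw=False)        # msgpack >= 1.0 equivalent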

Adding Evidence as Database (like wikipedia )

Say your model is trained and exported for prediction, and you want to add all the evidence as a database in (id, text) format so that multiple users can run queries against it for answers. How do I add such datasets? Would we need another Python script, like a document retriever / reader.py?

no model file

I ran the command python interact.py and got this error:

(pt) swapnilbhadade@hitvoice:~/pt/DrQA-1$ python interact.py
Traceback (most recent call last):
  File "interact.py", line 22, in <module>
    checkpoint = torch.load(args.model_file)
  File "/home/swapnilbhadade/pt/lib/python3.5/site-packages/torch/serialization.py", line 301, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'models/best_model.pt'

How to get the 78.6 F1 score?

Hi,

Thanks for creating this repo!

When I ran the code with default options and 30 epochs, I got 78.0~78.1 F1 score. Did I miss something? Do I need more training epochs?

Thanks,
Tao

training stopped at epoch 1

Can you tell me how long the training process takes to complete?

I am using a Google Colab notebook, and it has been stuck at epoch 1 for the last 20 minutes.

Update function doesn't work correctly.

I had already trained the model for 8 epochs when the training stopped because my PC crashed. I saved the model folder and transferred it to another PC, where I ran prepro.py to produce meta.msgpack and data.msgpack again. Then I restarted the training with the command:
"python3 train.py -e 40 -bs 32 -rs checkpoint_epoch_8.pt -ro"

The strange thing is that I got this:
(02/08/2018 05:49:43 [Data loaded.]
02/08/2018 05:49:43 [loading previous model...]
02/08/2018 06:03:50 [dev EM: 44.11542100283822 F1: 56.78141180138346])
and at the evaluation state after the 8th epoch(at first pc) I had got
(02/05/2018 03:45:31 dev EM: 66.29139072847683 F1: 76.02391342380288)
How is it possible for the dev-set accuracy to drop?

The training loss has also increased:
02/08/2018 06:03:50 Epoch 9
02/08/2018 06:07:01 epoch [ 9] updates[ 21641] train loss[3.85778] remaining[2 days, 4:23:09]
02/08/2018 06:09:00 epoch [ 9] updates[ 21644] train loss[4.76032] remaining[1 day, 11:30:17]
02/08/2018 06:12:22 epoch [ 9] updates[ 21647] train loss[4.78624] remaining[1 day, 17:57:18]
02/08/2018 06:14:58 epoch [ 9] updates[ 21650] train loss[4.76184] remaining[1 day, 17:00:43]
02/08/2018 06:19:14 epoch [ 9] updates[ 21653] train loss[4.79866] remaining[1 day, 22:13:37]
02/08/2018 06:22:23 epoch [ 9] updates[ 21656] train loss[4.88899] remaining[1 day, 22:20:00]
02/08/2018 06:24:32 epoch [ 9] updates[ 21659] train loss[4.91365] remaining[1 day, 20:01:18]
02/08/2018 06:26:16 epoch [ 9] updates[ 21662] train loss[4.88356] remaining[1 day, 17:30:11]
02/08/2018 06:27:48 epoch [ 9] updates[ 21665] train loss[4.81403] remaining[1 day, 15:14:03]
02/08/2018 06:29:32 epoch [ 9] updates[ 21668] train loss[4.76201] remaining[1 day, 13:45:44]
02/08/2018 06:31:33 epoch [ 9] updates[ 21671] train loss[4.74199] remaining[1 day, 12:58:03]
02/08/2018 06:33:21 epoch [ 9] updates[ 21674] train loss[4.73257] remaining[1 day, 12:00:50]
02/08/2018 06:35:34 epoch [ 9] updates[ 21677] train loss[4.72988] remaining[1 day, 11:43:07]
02/08/2018 06:37:44 epoch [ 9] updates[ 21680] train loss[4.69789] remaining[1 day, 11:24:20]
02/08/2018 06:39:13 epoch [ 9] updates[ 21683] train loss[4.70580] remaining[1 day, 10:26:02]
02/08/2018 06:41:08 epoch [ 9] updates[ 21686] train loss[4.71170] remaining[1 day, 10:00:15]
02/08/2018 06:42:29 epoch [ 9] updates[ 21689] train loss[4.70412] remaining[1 day, 9:05:42]
02/08/2018 06:44:02 epoch [ 9] updates[ 21692] train loss[4.70031] remaining[1 day, 8:28:25]
02/08/2018 06:45:49 epoch [ 9] updates[ 21695] train loss[4.67916] remaining[1 day, 8:06:20]
02/08/2018 06:47:25 epoch [ 9] updates[ 21698] train loss[4.65448] remaining[1 day, 7:37:25]
02/08/2018 06:49:12 epoch [ 9] updates[ 21701] train loss[4.64674] remaining[1 day, 7:19:29]
02/08/2018 06:50:58 epoch [ 9] updates[ 21704] train loss[4.61606] remaining[1 day, 7:02:11]

whereas at the 8th epoch (on the first PC) I had:
2/05/2018 03:23:24 epoch [ 8] updates[ 21624] train loss[3.56107] remaining[0:13:05]
02/05/2018 03:25:22 epoch [ 8] updates[ 21627] train loss[3.56098] remaining[0:10:37]
02/05/2018 03:27:28 epoch [ 8] updates[ 21630] train loss[3.56091] remaining[0:08:10]
02/05/2018 03:29:17 epoch [ 8] updates[ 21633] train loss[3.56084] remaining[0:05:43]
02/05/2018 03:30:57 epoch [ 8] updates[ 21636] train loss[3.56072] remaining[0:03:16]
02/05/2018 03:32:57 epoch [ 8] updates[ 21639] train loss[3.56061] remaining[0:00:49]

However, it seems that after some iterations the training loss went down again, even though some fluctuations still exist:
02/09/2018 10:32:30 epoch [ 9] updates[ 23123] train loss[4.03933] remaining[13:33:00]
02/09/2018 10:34:10 epoch [ 9] updates[ 23126] train loss[4.03983] remaining[13:30:44]
02/09/2018 10:36:24 epoch [ 9] updates[ 23129] train loss[4.03977] remaining[13:28:56]
02/09/2018 10:37:39 epoch [ 9] updates[ 23132] train loss[4.04052] remaining[13:26:20]
02/09/2018 10:39:15 epoch [ 9] updates[ 23135] train loss[4.04012] remaining[13:24:01]
02/09/2018 10:40:40 epoch [ 9] updates[ 23138] train loss[4.03946] remaining[13:21:34]
02/09/2018 10:42:24 epoch [ 9] updates[ 23141] train loss[4.04051] remaining[13:19:22]
02/09/2018 10:44:23 epoch [ 9] updates[ 23144] train loss[4.04052] remaining[13:17:22]
02/09/2018 10:46:02 epoch [ 9] updates[ 23147] train loss[4.04009] remaining[13:15:06]
02/09/2018 10:47:39 epoch [ 9] updates[ 23150] train loss[4.04014] remaining[13:12:49]
02/09/2018 10:49:17 epoch [ 9] updates[ 23153] train loss[4.03977] remaining[13:10:32]
02/09/2018 10:51:50 epoch [ 9] updates[ 23156] train loss[4.03920] remaining[13:08:59]
02/09/2018 10:53:59 epoch [ 9] updates[ 23159] train loss[4.03874] remaining[13:07:07]
02/09/2018 10:55:47 epoch [ 9] updates[ 23162] train loss[4.03874] remaining[13:04:59]

What could explain this behavior of the model?

Regarding train.py

For how many epochs can I train the model?
Are 40 epochs sufficient?

Is there a way to know the score of the prediction to analyse whether it is right or wrong?

@hitvoice Consider below evidence and questions

{
  "evidence": "I am on vacation from July 31st and coming back next month",
  "question": {
    "1": "when he is going on vacation?",
    "2": "when he is returning back from vacation?"
  }
}

The answers will be:
"when he is going on vacation?": "July 31st",
"when he is returning back from vacation?": "next month",

This works as expected. But consider the case where I have not provided the return details and the evidence is just

"evidence":"I am on vacation from July 31st"

And I get the following answers:
"when he is going on vacation?": "July 31st",
"when he is returning back from vacation?": "July 31st",

And we know that the return date is not July 31st. Is there a way to get the score of the prediction and, based on some threshold, mark it as invalid or blank?
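
A possible approach, sketched from the span-decoding code already used in this repo (the outer-product decode quoted in a later issue); the function returns the joint span probability, and the threshold value is purely illustrative:

    import numpy as np
    import torch

    def best_span_with_score(score_s, score_e, max_len=15):
        """score_s, score_e: 1-D probability tensors over context positions."""
        scores = torch.ger(score_s, score_e)   # scores[s, e] = P(start=s) * P(end=e)
        scores.triu_().tril_(max_len - 1)      # enforce s <= e < s + max_len
        scores = scores.numpy()
        s_idx, e_idx = np.unravel_index(np.argmax(scores), scores.shape)
        return s_idx, e_idx, float(scores[s_idx, e_idx])

    # Usage sketch: treat spans whose joint probability falls below a tuned
    # threshold as unanswerable (the 0.1 here is purely illustrative).
    s, e, conf = best_span_with_score(torch.tensor([0.1, 0.8, 0.1]),
                                      torch.tensor([0.1, 0.7, 0.2]))
    answer_is_reliable = conf >= 0.1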

Can't run "bash download.sh"

When I run "bash download.sh", this happens:

Error: ${REQUIRED[i]} is not installed.

I installed everything.

Only decode on a test set

Can the trained model be used to just decode on a new test set (same JSON format as dev), instead of training and decoding at the same time? Thanks

Trying to understand the index_answer funtion

[screenshot of the index_answer function]
The last condition in this function returns (None, None). Does this condition actually arise, or is it just there to avoid a crash?
I am trying to implement the same paper, and when I try to get the final labels for my context-question pairs, many answers result in a ValueError. Is this some flaw in the dataset?
Thank you.
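
For context, a minimal sketch of what an index_answer-style function does (illustrative names, not the repo's exact code). The (None, None) branch fires whenever the labeled character span does not align exactly with token boundaries, which can happen when the tokenizer splits the text differently from the labeled offsets:

    def index_answer(token_spans, answer_start, answer_end):
        """token_spans: (char_start, char_end) per context token;
        answer_start / answer_end: labeled character offsets of the answer."""
        starts, ends = zip(*token_spans)
        try:
            # succeeds only when the answer aligns exactly with token boundaries
            return starts.index(answer_start), ends.index(answer_end)
        except ValueError:
            # tokenization disagrees with the labeled span; no exact label exists
            return None, None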

Finetune against a custom dataset

Hi Runqi Yang
Thanks for such a wonderful repo.
A quick question: how do I finetune the model on a new dataset, loading a checkpoint from a SQuAD-trained model?

Question about POS and NER in the model

Does the model map each POS tag and NER tag category to a one-hot encoding? If not, why not? It doesn't make sense to me how you can supply the category ID directly to the embedding.
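
For what it's worth, a small self-contained sketch of why feeding integer category IDs into an embedding layer is equivalent to a one-hot encoding followed by a learned linear map (sizes here are illustrative):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    pos_emb = nn.Embedding(50, 12)           # 50 POS tags -> 12-dim vectors
    ids = torch.tensor([3, 7])               # POS tag IDs for two tokens
    dense = pos_emb(ids)                     # shape (2, 12)

    # The same result via an explicit one-hot encoding:
    one_hot = F.one_hot(ids, num_classes=50).float()
    assert torch.allclose(dense, one_hot @ pos_emb.weight)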

Test example

How do I use an example (question) to test this project?

Too many UNKs

After preprocessing with squad_preprocess.py, the context and question loaded with the load_squad function contain too many UNKs, and the final vocab size is only 40000+, as shown below.
[screenshot of the vocab statistics]
But after converting the ID-form context and question stored in data.msgpack back into strings using the vocab, I found far too many UNKs. How should I handle this?
[screenshot of the decoded text]

Different function of evaluating metrics

I am facing some challenges in trying to reproduce the results. The evaluation function used in this repo is as follows: (To calculate the start and end indexes)

        max_len = self.opt['max_len'] or score_s.size(1)
        for i in range(score_s.size(0)):
            scores = torch.ger(score_s[i], score_e[i])
            scores.triu_().tril_(max_len - 1)
            scores = scores.numpy()
            s_idx, e_idx = np.unravel_index(np.argmax(scores), scores.shape)
           

I am using the following to calculate the start and end indexes from the predictions.

           preds = model(context, question, context_mask, question_mask)
           p1, p2 = preds
           y1, y2 = label[:,0], label[:,1]
           loss = F.nll_loss(p1, y1) + F.nll_loss(p2, y2)
           valid_loss += loss.item()
           yp1 = torch.argmax(p1, dim=1)
           yp2 = torch.argmax(p2, dim=1)
           yps = torch.stack([yp1, yp2], dim=1)

           y_min, _ = torch.min(yps,1) # corresponds to s_idx 
           y_max, _ = torch.max(yps,1) # corresponds to e_idx

I tried both methods and I am getting different results. Is something wrong with the latter approach?
Thank you
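
A small self-contained example of why the two decoders can disagree (toy numbers, not from the repo): independent argmaxes over the start and end distributions can produce a reversed or over-long span, while the outer-product decode maximizes the joint score under the constraint s <= e < s + max_len.

    import torch

    score_s = torch.tensor([0.1, 0.6, 0.3])    # toy start probabilities
    score_e = torch.tensor([0.5, 0.4, 0.1])    # toy end probabilities

    # Independent decoding picks s=1, e=0: a reversed, invalid span.
    s_ind, e_ind = score_s.argmax().item(), score_e.argmax().item()

    # Joint constrained decoding, as in the repo's snippet above (max_len=3 here).
    scores = torch.ger(score_s, score_e)       # scores[s, e] = P(start=s) * P(end=e)
    scores.triu_().tril_(2)                    # keep spans with s <= e <= s + 2
    s_joint, e_joint = divmod(scores.argmax().item(), scores.size(1))

    print((s_ind, e_ind), (s_joint, e_joint))  # (1, 0) vs (1, 1)

Taking torch.min/torch.max of the two argmaxes forces s <= e, but it silently changes which span is chosen and ignores max_len, so differing results are expected.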

save best model

        # save
        if not args.save_last_only or epoch == epoch_0 + args.epoches - 1:
            model_file = os.path.join(model_dir, 'checkpoint_epoch_{}.pt'.format(epoch))
            model.save(model_file, epoch)
            if f1 > best_val_score:
                best_val_score = f1
                copyfile(
                    os.path.join(model_dir, model_file),
                    os.path.join(model_dir, 'best_model.pt'))
                log.info('[new best model saved.]')

In the save section of train.py, copyfile is given the src and dst file names, but model_file is already the result of os.path.join, so there is no need to os.path.join it again inside copyfile, right?
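
The reading seems right. Note that if model_file is an absolute path, os.path.join(model_dir, model_file) simply returns model_file, so the code happens to work; the extra join is redundant rather than harmful. A cleaner sketch of the copy:

    # model_file is already model_dir/checkpoint_epoch_{epoch}.pt, so copy directly:
    copyfile(model_file, os.path.join(model_dir, 'best_model.pt'))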

Gradient flow of the failing model

I am trying to reproduce this paper and have referred to this repository. My training is not giving satisfactory results; the metrics are subpar. While investigating, I plotted the layer gradients using the function from this thread and found the following.
[gradient-flow plot]
From the figure above, it seems as if the middle layers are not learning anything.
What should I try doing in order to fix this?
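
For readers without access to the linked thread, a minimal sketch of that kind of gradient-flow plot (adapted, not the exact forum code):

    import matplotlib.pyplot as plt

    def plot_grad_flow(named_parameters):
        """Plot the average gradient magnitude per layer after loss.backward()."""
        names, avg_grads = [], []
        for name, p in named_parameters:
            if p.requires_grad and p.grad is not None and 'bias' not in name:
                names.append(name)
                avg_grads.append(p.grad.abs().mean().item())
        plt.plot(avg_grads, alpha=0.6, color='b')
        plt.xticks(range(len(names)), names, rotation='vertical')
        plt.xlabel('Layers')
        plt.ylabel('Average gradient magnitude')
        plt.title('Gradient flow')
        plt.tight_layout()
        plt.show()

    # usage, after loss.backward():
    #     plot_grad_flow(model.named_parameters())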

Using DrQA on an Chinese dataset

Is it expected that this code can be applied to a Chinese language dataset with only minor changes?

I understand that I will need to provide the following:

  • Chinese train/dev data files in the SQuAD format
  • GloVe word vectors trained on the Chinese language
  • spaCy Chinese language models
  • Changes in prepro.py to take care of things such as tokenization, add encoding="utf8" to file read/write statements, etc.

I would very much appreciate any insights if there are known reasons why this is not supposed to work.

get_answer_index() takes 4 positional arguments but 5 were given

Running prepro.py gives this error:

Traceback (most recent call last):
  File "prepro.py", line 165, in <module>
    train.answer_start, train.answer_end)])
  File "prepro.py", line 163, in <listcomp>
    zip(*[get_answer_index(a, b, c, d, e) for a, b, c, d, e in
TypeError: get_answer_index() takes 4 positional arguments but 5 were given

planning to implement Attend It Again paper.

Hello there!

I was planning to implement the attend-again mechanism from the paper Attend It Again on top of DrQA.

Basically, what Attend It Again does is as follows.

This model has two LSTM layers. In the bottom layer of LSTM, we use the traditional attention mechanisms and generate the hidden state of LSTM unit from previous hidden state and current input. Next step, we integrate the hidden state of previous LSTM unit in top layer, current input feature and the current output from the bottom layer of LSTM unit.

My plan was to take the doc_hiddens and the x1_emb and feed them, along with question_hiddens, into an attention module similar to qemb_match, then feed the result into an LSTM network similar to doc_rnn. Finally, I would feed that output into start_attn and end_attn to get the start_scores and end_scores.

Can you please tell me whether this is likely to improve the F1 measure?
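
A rough sketch of one way the proposed stacking could look, assuming DrQA-style tensors of shape (batch, seq_len, dim) and interfaces like those of qemb_match / doc_rnn in this repo; all names and sizes below are illustrative, not the repo's actual code:

    import torch
    import torch.nn as nn

    class AttendAgain(nn.Module):
        """Second attention over question hiddens, queried by first-pass doc
        hiddens, followed by another BiLSTM, as in the 'attend it again' idea."""
        def __init__(self, emb_dim, doc_dim, q_dim, hidden_size):
            super().__init__()
            self.proj = nn.Linear(doc_dim, q_dim)
            self.rnn = nn.LSTM(emb_dim + doc_dim + q_dim, hidden_size,
                               batch_first=True, bidirectional=True)

        def forward(self, x1_emb, doc_hiddens, question_hiddens):
            # scores[b, i, j] = <proj(doc_i), question_j>
            scores = self.proj(doc_hiddens).bmm(question_hiddens.transpose(1, 2))
            alpha = torch.softmax(scores, dim=-1)
            q_attended = alpha.bmm(question_hiddens)   # question summary per doc token
            out, _ = self.rnn(torch.cat([x1_emb, doc_hiddens, q_attended], dim=-1))
            return out  # feed into start_attn / end_attn for start/end scores

Whether this improves F1 is an empirical question; since it adds parameters and training time, a fair comparison would hold the total budget fixed against the existing 3-layer doc_rnn.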
