hsinyuan-huang / flowqa

Implementation of conversational QA model: FlowQA (with slight improvement)

Python 99.51% Shell 0.49%

nlp pytorch question-answering

flowqa's Issues

docker image that contains allennlp and pytorch

I see that FlowQA already has a score on the CoQA leaderboard.
CoQA requires running experiments on CodaLab.
Do you have a Docker image that can run your code?
If so, please share the link.

Hyperparameters of best model

What are the hyperparameter settings of your best model?
I am having a hard time reproducing your experimental results.
I think I may have done something wrong.

What does "dialog_ctx * 3" mean in general_utils?

Hi, I don't understand the "* 3". What does the 3 stand for?

    context_feature = torch.Tensor(batch_size, question_num, context_len,
                                   feature_len + (self.dialog_ctx * 3)).fill_(0).to(self.device)

and

    if prv_ans_choice == 3:  # There is an answer
        for k in range(prv_ans_st, prv_ans_end + 1):
            if k >= context_len:
                break
            context_feature[i, j, k, feature_len + prv_ctx * 3 + 1] = 1
    else:
        context_feature[i, j, :select_len, feature_len + prv_ctx * 3 + 2] = 1
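
Reading the two snippets together, the `* 3` appears to reserve three indicator slots per previous dialog turn in the last dimension of `context_feature`. The sketch below reproduces that layout with plain Python lists for a single question. The slot meanings are inferred from the quoted code only (slot 0 is never written there, so its meaning is left open), and the function name and sizes are made up for illustration.

```python
def previous_answer_features(context_len, feature_len, dialog_ctx, prev_turns):
    """Build per-token features for one question (illustrative layout only).

    prev_turns[p] describes the p-th previous turn as
    (prv_ans_choice, prv_ans_st, prv_ans_end, select_len), mirroring the
    names in the quoted snippet. Each previous turn owns 3 consecutive
    slots that start at feature_len + p * 3.
    """
    width = feature_len + dialog_ctx * 3
    feats = [[0] * width for _ in range(context_len)]
    for prv_ctx, (choice, st, end, select_len) in enumerate(prev_turns[:dialog_ctx]):
        base = feature_len + prv_ctx * 3
        if choice == 3:  # "There is an answer": mark the tokens of its span
            for k in range(st, end + 1):
                if k >= context_len:
                    break
                feats[k][base + 1] = 1
        else:  # any other choice: mark the first select_len tokens
            for k in range(min(select_len, context_len)):
                feats[k][base + 2] = 1
    return feats


feats = previous_answer_features(
    context_len=5, feature_len=2, dialog_ctx=2,
    prev_turns=[(3, 1, 2, 5), (0, 0, 0, 5)],
)
for row in feats:
    print(row)
```

With `dialog_ctx=2` each token row has `feature_len + 6` entries: slots 2 to 4 belong to the first previous turn and slots 5 to 7 to the second.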

User Input

How can we use this on our own input?

Where is hidden representation?

Can you tell me which part of the code corresponds to the hidden representation described in the paper?

Are there some problems with using max overlap of gold answer?

I computed the final F1 score of all the extracted answers (the spans with maximum F1 against the gold answers) and got 89.
Do you think using them as supervision may hurt performance? I also don't know how to use span_text and input_text. Thanks!

a question in predict for CoQA

Hello, I have a question. During training, each question is provided with the previous N gold answers, and the same is done on the development set. At test time, however, the gold answers are not available. How should this be handled? Should the model's predicted answers to the previous questions be used instead?
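
One common way to handle this (an assumption here, not something the repository confirms) is to replace the gold history with the model's own earlier predictions at test time. A minimal sketch, where `predict_fn` is a hypothetical stand-in for the model and `history_len` plays the role of N:

```python
def answer_dialog(context, questions, predict_fn, history_len=2):
    """Answer questions in order, feeding back the model's own predictions.

    predict_fn(context, question, history) is a hypothetical callable that
    returns an answer string; history is a list of (question, answer)
    pairs from the most recent turns.
    """
    history = []
    answers = []
    for question in questions:
        recent = history[-history_len:] if history_len > 0 else []
        answer = predict_fn(context, question, recent)
        answers.append(answer)
        history.append((question, answer))  # predicted answer, not gold
    return answers


# Toy stand-in model: reports how many previous turns it was given.
def toy_predict(context, question, history):
    return f"{question} (saw {len(history)} previous turns)"


print(answer_dialog("some passage", ["q1", "q2", "q3", "q4"], toy_predict))
```

The trade-off is that an early wrong prediction is carried into later turns, which is not the case when gold answers are supplied.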

EOFError: Ran out of input

When I run python train_QuAC.py, I get the following error:
Traceback (most recent call last):
File "train_QuAC.py", line 324, in <module>
main()
File "train_QuAC.py", line 175, in main
model = QAModel(opt, train_embedding)
File "/home/zoulongkun/project/FlowQA-master/QA_model/model_QuAC.py", line 30, in __init__
self.network = FlowQA(opt, embedding)
File "/home/zoulongkun/project/FlowQA-master/QA_model/detail_model.py", line 47, in __init__
self.CoVe = layers.MTLSTM(opt, embedding)
File "/home/zoulongkun/project/FlowQA-master/QA_model/layers.py", line 161, in __init__
state_dict = torch.load(opt['MTLSTM_path'])
File "/home/zoulongkun/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 358, in load
return _load(f, map_location, pickle_module)
File "/home/zoulongkun/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 532, in _load
magic_number = pickle_module.load(f)
EOFError: Ran out of input

Your help will be highly appreciated!
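
`EOFError: Ran out of input` is the error `pickle` raises when the file it reads is empty, so one likely cause (a guess from the traceback, not a confirmed diagnosis) is that the file at `opt['MTLSTM_path']` is zero bytes, for example after an interrupted download. The sketch below reproduces the message with plain `pickle` and shows a small pre-flight check; the helper name is hypothetical.

```python
import os
import pickle
import tempfile


def check_checkpoint_file(path):
    """Return a short diagnosis string for a checkpoint path."""
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    return "ok"


# Reproduce the error message with an empty file and plain pickle.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    empty_path = tmp.name

print(check_checkpoint_file(empty_path))  # the file exists but has 0 bytes
try:
    with open(empty_path, "rb") as f:
        pickle.load(f)
except EOFError as err:
    print("EOFError:", err)
finally:
    os.remove(empty_path)
```

If the check reports "empty", downloading the file again is the first thing to try.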

RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'

I use --resume. After I evaluate the model and then train, I get this error:

Traceback (most recent call last):
File "train_CoQA.py", line 229, in <module>
main()
File "train_CoQA.py", line 122, in main
model.update(batch)
File "/home/susht3/workspace/flow/QA_model/model_CoQA.py", line 128, in update
self.optimizer.step()
File "/home/susht3/local/anaconda3/envs/susht/lib/python3.6/site-packages/torch/optim/adamax.py", line 75, in step
exp_avg.mul_(beta1).add_(1 - beta1, grad)
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'
[INFO/MainProcess] process shutting down
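
This kind of error typically appears when an optimizer's restored state (here Adamax's `exp_avg`) lives on the CPU while the gradients live on the GPU. A commonly used workaround is to move every tensor in `optimizer.state` to the model's device after the checkpoint is loaded. The sketch below is duck-typed so that it runs without PyTorch: the helper name, `FakeTensor`, and `FakeOptimizer` are illustrative assumptions, and with a real optimizer the values would be `torch.Tensor` objects.

```python
def move_optimizer_state(optimizer, device):
    """Move every tensor-like entry in optimizer.state to `device`.

    Works with any object exposing a `state` mapping of
    {param: {name: value}}, which is the layout torch.optim optimizers use.
    Values without a `.to` method (for example an integer step count)
    are left unchanged.
    """
    for param_state in optimizer.state.values():
        for name, value in param_state.items():
            if hasattr(value, "to"):
                param_state[name] = value.to(device)
    return optimizer


# Stand-ins so the sketch runs without PyTorch installed.
class FakeTensor:
    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        return FakeTensor(device)


class FakeOptimizer:
    def __init__(self):
        self.state = {"w": {"step": 3, "exp_avg": FakeTensor(), "exp_inf": FakeTensor()}}


opt = move_optimizer_state(FakeOptimizer(), "cuda")
print({k: getattr(v, "device", v) for k, v in opt.state["w"].items()})
```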

the values of qa['yesno'] and qa['followup']

    answer_choice = 0 if answer == 'CANNOTANSWER' else\
                    1 if qa['yesno'] == 'y' else\
                    2 if qa['yesno'] == 'n' else\
                    3 # Not a yes/no question
    if answer_choice != 0:
        """
        0: Do not ask a follow up question!
        1: Definitely ask a follow up question!
        2: Not too important, but you can ask a follow up.
        """
        answer_choice += 10 * (0 if qa['followup'] == "n" else\
                               1 if qa['followup'] == "y" else\
                               2)

Hi, this code comes from the function proc_train(ith, article) in preprocess_QuAC.py. However, I found that in train.json qa['yesno'] never takes the value 'n', while qa['followup'] can take the value 'm'.
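
The quoted expression only tests for 'y' and 'n'. Every other value falls through to the last branch, so a value such as 'm' for qa['followup'] is still handled and maps to 2. A self-contained restatement of that logic (the function name is made up for illustration):

```python
def encode_answer_choice(answer, yesno, followup):
    """Pack the yes/no class and the follow-up class into one integer.

    Ones digit: 0 = CANNOTANSWER, 1 = yes, 2 = no, 3 = not a yes/no question.
    Tens digit (only when answerable): 0 = 'n', 1 = 'y', 2 = anything else.
    """
    if answer == 'CANNOTANSWER':
        return 0
    choice = 1 if yesno == 'y' else 2 if yesno == 'n' else 3
    choice += 10 * (0 if followup == 'n' else 1 if followup == 'y' else 2)
    return choice


print(encode_answer_choice('in 1990', 'x', 'm'))       # 23
print(encode_answer_choice('yes, twice', 'y', 'n'))    # 1
print(encode_answer_choice('CANNOTANSWER', 'x', 'n'))  # 0
```

The two parts can be recovered with `choice % 10` and `choice // 10`.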

Coreference and ellipsis in the conversation

Hello, I would like to know how you handle coreference and ellipsis in the conversation.

User

RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

I changed the code to evaluate every n batches, and got this error:

Traceback (most recent call last):
File "train_CoQA.py", line 234, in <module>
main()
File "train_CoQA.py", line 122, in main
model.update(batch)
File "/home/susht3/workspace/flow/QA_model/model_CoQA.py", line 87, in update
score_s, score_e, score_c = self.network(*inputs)
File "/home/susht3/local/anaconda3/envs/susht/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/susht3/workspace/flow/QA_model/detail_model.py", line 163, in forward
representation_with_bos_eos, mask_with_bos_eos
File "/home/susht3/local/allennlp/allennlp/nn/util.py", line 1222, in remove_sentence_boundaries
sequence_lengths = mask.sum(dim=1).detach().cpu().numpy()
RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

AttributeError: 'FlowQA' object has no attribute 'eval_embed'

When I evaluate the model, I get this error:

Traceback (most recent call last):
File "train_coqa.py", line 126, in <module>
if __name__ == '__main__':
File "train_coqa.py", line 103, in main
if epoch % args.eval_per_epoch == 0:
File "train_coqa.py", line 53, in eval_epoch
for batch in (batches):
File "/home/workspace/flowQA/QA_model/model_CoQA.py", line 141, in predict
# Run forward
File "/home/workspace/flowQA/QA_model/model_CoQA.py", line 209, in update_eval_embed

File "/home/local/anaconda3/envs/susht/lib/python3.6/site-packages/torch/nn/modules/module.py", line 518, in __getattr__
type(self).__name__, name))
AttributeError: 'FlowQA' object has no attribute 'eval_embed'

the version number of the package in the requirement document

Hello, I have a question. What are the version numbers of the packages listed in the requirements file?

Did you use different predictors for CoQA and QuAC?

Hi,

Thank you for sharing the code! It really helps us better understand the FlowQA model.
It seems that in your implementation, both model_CoQA and model_QuAC use the same FlowQA network defined in detail_model.py. I was wondering if you actually used different answer predictors for CoQA and QuAC to respect the difference between the two datasets. Also I noticed that when computing loss and evaluating results, you did treat the output of the network differently for CoQA and QuAC in your code. Your help will be highly appreciated!

Slump in F1 accuracy during prediction

Hi,

During training, validation is run on the dev set after every epoch and the corresponding F1 is computed. The best F1, 64%, came at epoch 15 (of 30 total), and I saved the model from that epoch. I then used this model in predict_QuAC to predict on the dev set, and the F1 score dropped to 61%. I don't understand why there is such a large drop, given that the same dev set is used for both validation and prediction.

Please let me know why there is this slump in accuracy.

Thanks

[Flow operation] Equivalent code using permutation

I noticed that you used several transposes rather than a single permute in the flow operation. Is that for better performance?

I think the following code is equivalent to the original code:

        def flow_operation(cur_h, flow):
            n, t, c = x1_full.size()
            # flow_in = cur_h.transpose(0, 1).view(c, n, t, -1)
            # flow_in = flow_in.transpose(0, 2).contiguous().view(t, n * c, -1).transpose(0, 1)

            flow_in = cur_h.view(n, t, c, -1).permute(0, 2, 1, 3).contiguous().view(n * c, t, -1)

            # [bsz * context_length, max_qa_pair, hidden_state]
            flow_out = flow(flow_in)
            # [bsz * context_length, max_qa_pair, flow_hidden_state_dim (hidden_state/2)]
            if self.opt['no_dialog_flow']:
                flow_out = flow_out * 0

            # flow_out = flow_out.transpose(0, 1).view(t, n, c, -1).transpose(0, 2).contiguous()
            # flow_out = flow_out.view(c, n * t, -1).transpose(0, 1)
            # [bsz * max_qa_pair, context_length, flow_hidden_state_dim]

            flow_out = flow_out.view(n, c, t, -1).permute(0, 2, 1, 3).contiguous().view(n * t, c, -1)

            return flow_out
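
Both the commented-out transposes and the single permute are meant to implement the same index mapping: rows indexed by (dialog, turn) with the context position as the column become rows indexed by (dialog, context position) with the turn as the column. The sketch below spells that mapping out with plain Python lists instead of tensors, as a reference for checking either tensor version. The hidden dimension is collapsed to one value per cell, and the function names are illustrative.

```python
def to_flow_layout(cur_h, n, t, c):
    """[n * t][c] -> [n * c][t]: group the t turns of each context token."""
    return [[cur_h[i * t + k][j] for k in range(t)]
            for i in range(n) for j in range(c)]


def from_flow_layout(flow_out, n, t, c):
    """[n * c][t] -> [n * t][c]: inverse of to_flow_layout."""
    return [[flow_out[i * c + j][k] for j in range(c)]
            for i in range(n) for k in range(t)]


n, t, c = 2, 3, 4  # dialogs, QA turns per dialog, context length
# Each cell stores its own (dialog, turn, position) so the mapping is visible.
cur_h = [[(i, k, j) for j in range(c)] for i in range(n) for k in range(t)]

flow_in = to_flow_layout(cur_h, n, t, c)
assert len(flow_in) == n * c and len(flow_in[0]) == t
# Row for dialog 1, context position 2 holds that token across all turns.
assert flow_in[1 * c + 2] == [(1, 0, 2), (1, 1, 2), (1, 2, 2)]
# Mapping back restores the original layout.
assert from_flow_layout(flow_in, n, t, c) == cur_h
```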

Request for the code corresponding to the SCONE experiments

Hi,

Thank you for releasing the code! Could you please also share the code corresponding to the SCONE experiments?

Where is the code for abstractive answer?

@momohuang Thank you!!

Can you use the previous rationale for prediction?

Hi, I see that you use the previous rationale to get the answer id, and use the previous answer id for prediction.
I also used the previous rationale to extract answers, but when I submitted to the leaderboard, the author told me that I cannot use the previous rationale.

RuntimeError: CUDA error: out of memory

When I run python train_QuAC.py, I get the following error:

After Input LSTM, the vector_sizes [doc, query] are [ 250 250 ] * 2
Self deep-attention 250 rays in 750-dim space
Before answer span finding, hidden size are 250 250
12/20/2018 04:43:43 [dev] Total number of params: 11852394
12/20/2018 16:43:43 - INFO - main - [dev] Total number of params: 11852394
12/20/2018 04:43:45 Epoch 1
12/20/2018 16:43:45 - WARNING - main - Epoch 1
12/20/2018 04:43:46 updates[ 1] train loss[15.87973] remaining[1:26:47]
12/20/2018 16:43:46 - INFO - main - updates[ 1] train loss[15.87973] remaining[1:26:47]
12/20/2018 04:44:00 updates[ 21] train loss[10.87507] remaining[0:45:12]
12/20/2018 16:44:00 - INFO - main - updates[ 21] train loss[10.87507] remaining[0:45:12]
12/20/2018 04:44:14 updates[ 41] train loss[10.16175] remaining[0:44:17]
12/20/2018 16:44:14 - INFO - main - updates[ 41] train loss[10.16175] remaining[0:44:17]
12/20/2018 04:44:29 updates[ 61] train loss[9.96472] remaining[0:45:30]
12/20/2018 16:44:29 - INFO - main - updates[ 61] train loss[9.96472] remaining[0:45:30]
12/20/2018 04:44:48 updates[ 81] train loss[9.56536] remaining[0:49:21]
12/20/2018 16:44:48 - INFO - main - updates[ 81] train loss[9.56536] remaining[0:49:21]
12/20/2018 04:45:03 updates[ 101] train loss[9.38102] remaining[0:48:40]
12/20/2018 16:45:03 - INFO - main - updates[ 101] train loss[9.38102] remaining[0:48:40]
12/20/2018 04:45:17 updates[ 121] train loss[9.11970] remaining[0:47:07]
12/20/2018 16:45:17 - INFO - main - updates[ 121] train loss[9.11970] remaining[0:47:07]
12/20/2018 04:45:35 updates[ 141] train loss[8.99858] remaining[0:48:30]
12/20/2018 16:45:35 - INFO - main - updates[ 141] train loss[8.99858] remaining[0:48:30]
12/20/2018 04:45:49 updates[ 161] train loss[8.76992] remaining[0:47:21]
12/20/2018 16:45:49 - INFO - main - updates[ 161] train loss[8.76992] remaining[0:47:21]
12/20/2018 04:46:03 updates[ 181] train loss[8.64918] remaining[0:46:42]
12/20/2018 16:46:03 - INFO - main - updates[ 181] train loss[8.64918] remaining[0:46:42]
12/20/2018 04:46:17 updates[ 201] train loss[8.58265] remaining[0:46:12]
12/20/2018 16:46:17 - INFO - main - updates[ 201] train loss[8.58265] remaining[0:46:12]
12/20/2018 04:46:30 updates[ 221] train loss[8.47283] remaining[0:45:20]
12/20/2018 16:46:30 - INFO - main - updates[ 221] train loss[8.47283] remaining[0:45:20]
12/20/2018 04:46:43 updates[ 241] train loss[8.38734] remaining[0:44:24]
12/20/2018 16:46:43 - INFO - main - updates[ 241] train loss[8.38734] remaining[0:44:24]
12/20/2018 04:46:57 updates[ 261] train loss[8.36940] remaining[0:44:00]
12/20/2018 16:46:57 - INFO - main - updates[ 261] train loss[8.36940] remaining[0:44:00]
Traceback (most recent call last):
File "train_QuAC.py", line 324, in <module>
main()
File "train_QuAC.py", line 209, in main
model.update(batch)
File "/home/zys/文档/FlowQA-master/QA_model/model_QuAC.py", line 83, in update
score_s, score_e, score_no_answ = self.network(*inputs)
File "/home/zys/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/zys/文档/FlowQA-master/QA_model/detail_model.py", line 306, in forward
highlvl_self_attn_hiddens = self.highlvl_self_att(x1_att, x1_att, x1_mask, x3=doc_hiddens, drop_diagonal=True)
File "/home/zys/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/zys/文档/FlowQA-master/QA_model/layers.py", line 285, in forward
alpha = F.softmax(scores, dim=2)
File "/home/zys/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 889, in softmax
return input.softmax(dim)
RuntimeError: CUDA error: out of memory
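
The traceback ends in the softmax of the self-attention layer, whose score matrix grows quadratically with the context length. A rough estimate of that single tensor in float32 is sketched below, assuming the scores have shape [batch * questions, context, context]. The batch size, number of questions, and context lengths are made-up example values, not settings taken from this run.

```python
def attention_scores_bytes(batch_size, num_questions, context_len, bytes_per_value=4):
    """Size of one [batch * questions, context, context] float32 score tensor."""
    rows = batch_size * num_questions
    return rows * context_len * context_len * bytes_per_value


for context_len in (400, 800, 1600):
    size = attention_scores_bytes(batch_size=3, num_questions=12, context_len=context_len)
    print(f"context_len={context_len}: {size / 2**20:.1f} MiB")
```

Doubling the context length quadruples this tensor, and the softmax output plus the values kept for the backward pass add further copies, so a batch containing one unusually long article can run out of memory after many successful updates. Lowering the batch size or truncating very long contexts are the usual ways to reduce it.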

Effect of batch size during testing time?

Hi, in my runs I found that the batch size can affect the model's results at test time. Could this be because the same input sentence gets a different amount of padding (mask [0,0,1,1] with a batch size of 1 versus [0,0,1,1,1,1,1] when batched with other examples), which leads to slightly different outputs from the final layer of the model?
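
Whether padding leaks into the output depends on whether every attention or pooling step masks the padded positions. As an illustration of the mechanism only (not a diagnosis of this code base), the sketch below compares a softmax that ignores the mask with one that applies it, following the convention above that 1 marks a padded position.

```python
import math


def softmax(scores, mask=None):
    """Softmax over a list; positions where mask == 1 get zero weight."""
    if mask is None:
        mask = [0] * len(scores)
    exps = [0.0 if m else math.exp(s) for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


real = [2.0, 1.0]   # scores of the two real tokens
pad_score = 0.0     # whatever score the model produces for padding

short = real + [pad_score] * 2   # padded to length 4
longer = real + [pad_score] * 5  # padded to length 7

unmasked_short = softmax(short)[:2]
unmasked_longer = softmax(longer)[:2]
masked_short = softmax(short, [0, 0, 1, 1])[:2]
masked_longer = softmax(longer, [0, 0, 1, 1, 1, 1, 1])[:2]

print(unmasked_short, unmasked_longer)  # differ: padding takes some weight
print(masked_short, masked_longer)      # identical: padding is ignored
```

Even with correct masking, small floating-point differences between batch sizes are possible on a GPU, so tiny score changes do not by themselves indicate a masking bug.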