
HGN: Hierarchical Graph Network for Multi-hop Question Answering

This is the official repository of HGN (EMNLP 2020).

Requirements

We provide a Docker image for easier reproduction. Please use the Dockerfile or pull the image directly:

docker pull studyfang/hgn:latest

To run Docker without sudo permission, please refer to the documentation Manage Docker as a non-root user. Then you can start Docker, e.g.:

docker run --gpus all -it -v /datadrive_c/yuwfan:/ssd studyfang/hgn:latest bash

Quick Start

NOTE: Please make sure you have set up the environment correctly.

  1. Download the raw data and our preprocessed data:
bash scripts/download_data.sh
  2. Inference

We provide fine-tuned roberta-large and albert-xxlarge-v2 models for direct inference. Please run:

python predict.py --config_file configs/predict.roberta.json

You should get the following results on the dev set with the RoBERTa-large model and the ALBERT model, respectively:

em = 0.6895340985820392
f1 = 0.8220366071156804
sp_em = 0.6310600945307225
sp_f1 = 0.8859230865915771
joint_em = 0.4649561107359892
joint_f1 = 0.7436079971145017

and

em = 0.7018230925050641
f1 = 0.8344362891739213
sp_em = 0.6317353139770425
sp_f1 = 0.8919316739978699
joint_em = 0.4700877785280216
joint_f1 = 0.7573679775376975

Please refer to the Preprocess and Training sections if you want to reproduce the other steps.

Preprocess

Please set DATA_ROOT in preprocess.sh for your setup; otherwise, data will be downloaded to the current directory.

To download and preprocess the data, please run:

bash run.sh download,preprocess

After downloading the data, you can optionally run the preprocessing step only:

bash run.sh preprocess

Training

Please set your home data folder HOME_DATA_FOLDER in envs.py.
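
For reference, a minimal sketch of what this setting might look like (the variable name comes from the repo; the default path and the environment-variable fallback are illustrative assumptions, not the repo's actual contents):

# envs.py (sketch): point HOME_DATA_FOLDER at your data root.
import os

HOME_DATA_FOLDER = os.environ.get("HOME_DATA_FOLDER", "/ssd/hgn_data")  # path is illustrative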

Then run:

python train.py --config_file configs/train.roberta.json

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Acknowledgment

Our code makes heavy use of Hugging Face's PyTorch implementation and DFGN. We thank them for open-sourcing their projects.

Citation

If you find this code useful, please star our repo or consider citing:

@article{fang2019hierarchical,
  title={Hierarchical graph network for multi-hop question answering},
  author={Fang, Yuwei and Sun, Siqi and Gan, Zhe and Pillai, Rohit and Wang, Shuohang and Liu, Jingjing},
  journal={arXiv preprint arXiv:1911.03631},
  year={2019}
}

License

MIT


hgn's Issues

Docker image not up-to-date?

Hey, I pulled the Docker image, and it seems the file tree inside does not match the getting-started doc. Here is what I got from the DrQA folder:
[screenshot of the directory listing omitted]

Could you provide some guidance on this? Thanks!

full_data Address

Hi,
What is the full_data file in "scripts/5_dump_features.py"? You make full_data a required parameter, but you don't document it anywhere in 5_dump_features.py.

How can I provide it?

Documentation for Multi-GPU training

Greetings! I'm working on a project that would greatly benefit from the ability to train this model in a variety of settings. I have a good deal of experience with the DFGN code this repository is based on; however, multi-GPU training seems to have been modified heavily. I was wondering if some guidance could be provided in this area. In particular, I am running out of GPU memory when moving the BERT encoder and graph model to an RTX 2080 (~11 GB VRAM).

I've tried setting local_rank = -1, which seems to be what is required for multi-GPU; I also set data_parallel to true and n_gpu to 3. I've been unable to verify whether this works, as it is unclear to me how to specify which GPUs the application should use (the server I use has 4 and almost always has at least 1 in use; from what I can tell, the code will try to grab devices 0, 1, and 2, and I'm not seeing a way to change this).

Another concern: from what I can tell from the code, the models are only put on separate GPUs after the initial loading. In the original DFGN code, the models were loaded directly onto separate GPUs if multiple GPUs were specified.

HGN:

encoder.to(args.device)  # Note that args.device will be cuda in this case and not a specific gpu (I think)
model.to(args.device)

DFGN:

encoder_gpus = [int(i) for i in args.encoder_gpu.split(',')]
model_gpu = 'cuda:{}'.format(args.model_gpu)

encoder = BertModel.from_pretrained(args.bert_model)
encoder.cuda(encoder_gpus[0])
encoder = torch.nn.DataParallel(encoder, device_ids=encoder_gpus)
encoder.eval()

# Set Model
model = GraphFusionNet(config=args)
model.cuda(model_gpu)
model.train()
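
One general way to pin which physical GPUs a DataParallel job sees (a standard PyTorch/CUDA pattern, not something this repo documents) is CUDA_VISIBLE_DEVICES; the selected devices are remapped to indices 0..n-1 inside the process. A minimal sketch:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1,3"  # must be set before CUDA is initialized

import torch

# Physical GPUs 1 and 3 now appear as cuda:0 and cuda:1.
model = torch.nn.Linear(16, 4).cuda()
model = torch.nn.DataParallel(model, device_ids=[0, 1])
out = model(torch.randn(8, 16).cuda())  # batch is split across both GPUs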

Hyper-parameters for loss

Hi,
Thank you for your paper and your code.

I see you use:

  • ans_lambda = 1
  • type_lambda = 1
  • para_lambda = 1
  • sent_lambda = 5
  • ent_lambda = 1

Would you mind sharing your motivation for choosing these numbers? Or did you obtain them via experiments?

Thank you for your support.

IndexError: list index out of range

Hi. I ran your code with hotpot_dev_fullwiki_v1.json as the dev file. In the dump_features phase I get this error:

/scripts/5_dump_features.py", line 225, in read_hotpot_examples
    for _l in sel_paras[0]:
IndexError: list index out of range

I debugged the code and found that in the multihop phase, no paragraphs are selected for some questions.
Part of the generated multihop file looks like this:

"5ae60426554299546bf83019": [
["Celebrity Home Entertainment"],
["BraveStarr"],
["Tottoi", "COPS (animated TV series)"]
],
"5a8cfee555429941ae14df5c": [],
"5a71166d5542994082a3e576": [
["Battle of Manila (1574)"],
["Battle of Klushino"],
["Battle of Yao (Japan)", "Battle of Alhandic"]
],

for "5a8cfee555429941ae14df5c" no paragraph is selected. so we get this exception.

best parameters

Hi, thank you for the code.
Could you provide the best hyperparameters for the model?
Looking forward to your reply.

code unavailable

Hi,

Apologies in advance for my English.

I'm studying NLP, GNNs, etc.

I read your paper 'HGN' and was about to run your code, but I couldn't find any code for HGN.

So, would you please let me know where it is, or provide your HGN code?

Thank you.

Problem to use custom dataset

Hello,

I'm trying to test your model on my own custom dataset. To simplify the test, I transformed my dataset into the HotpotQA format. While doing the preprocessing step, I found that I need to build a DB first, and I found a script for this under scripts/, 0_build_db.py.

What is the expected input content for this script? I found two variables, 'text_with_links' and '_text_ner_str', in it; how did you obtain them?

About Performance

Hi,
I wonder how much gain the trick of "using entity label prediction as normalization" brings in your experiments.
There isn't a detailed ablation of this part in your paper.

Hello, can you help me?

I have trained the data, but I keep getting this problem when I run train.py:

Traceback (most recent call last):
  File "train.py", line 164, in <module>
    start, end, q_type, paras, sents, ents, _, _ = model(batch, return_yp=True)
  File "/home/luoshihang/miniconda3/envs/MRC/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/luoshihang/py_proj/HGN-master/models/HGN.py", line 74, in forward
    input_state, _ = self.ctx_attention(input_state, graph_state, graph_mask.squeeze(-1))
  File "/home/luoshihang/miniconda3/envs/MRC/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/luoshihang/py_proj/HGN-master/models/layers.py", line 343, in forward
    gate_th = torch.tanh(self.input_linear_2(output))
  File "/home/luoshihang/miniconda3/envs/MRC/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/luoshihang/miniconda3/envs/MRC/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/luoshihang/miniconda3/envs/MRC/lib/python3.7/site-packages/apex/amp/wrap.py", line 21, in wrapper
    args[i] = utils.cached_cast(cast_fn, args[i], handle.cache)
  File "/home/luoshihang/miniconda3/envs/MRC/lib/python3.7/site-packages/apex/amp/utils.py", line 97, in cached_cast
    if cached_x.grad_fn.next_functions[1][0].variable is not x:
IndexError: tuple index out of range
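
This traceback ends inside apex's amp cast cache, which is known to break against newer PyTorch releases. One option (a general sketch, not a change this repo ships) is to replace apex.amp with PyTorch's native torch.cuda.amp, which performs the same mixed-precision casting without the fragile grad_fn introspection:

import torch

model = torch.nn.Linear(16, 4).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()

for _ in range(3):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():  # replaces apex.amp's automatic casting
        loss = model(torch.randn(8, 16).cuda()).sum()
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()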

can't find 'utils.py' file

I'm very interested in your paper. In the file 'HGN/scripts/3_paragraph_ranking.py', 'utils.feature_selection' cannot be found. Could you upload the missing file? Thanks!

Fine-tuning the LM encoder

Hi, I wonder why you don't fine-tune the LM as in most previous work. Have you tried fine-tuning RoBERTa? Thank you.

Pretrained model?

Hi,

Is the pretrained model available for download? I cannot find the link.

Thank you

Question on the MLP for supporting sentences

I can see that you are using an MLP to produce a single score for the supporting sentences. The MLP maps from 2*hidden_size to 1. The code then concatenates a '0' so the last dimension becomes 2.

https://github.com/yuwfan/HGN/blob/master/models/layers.py#L284-L297

        gat_logit = self.sent_mlp(graph_state[:, :1+max_para_num+max_sent_num, :]) # N x max_sent x 1
        para_logit = gat_logit[:, 1:1+max_para_num, :].contiguous()
        sent_logit = gat_logit[:, 1+max_para_num:, :].contiguous()

        .....
        sent_logits_aux = Variable(sent_logit.data.new(sent_logit.size(0), sent_logit.size(1), 1).zero_())
        sent_prediction = torch.cat([sent_logits_aux, sent_logit], dim=-1).contiguous()

I was wondering why it isn't from 2*hidden_size to 2. Isn't that more natural for binary classification?
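
For intuition, here is a small self-contained check (toy shapes; names loosely follow the snippet above) that concatenating a fixed zero logit to the single score s makes softmax([0, s]) equal sigmoid(s) for the positive class, so the 1-dimensional MLP plus the zero column behaves like a binary classifier with one class's logit pinned to zero:

import torch

hidden_size, num_sents = 8, 5
sent_state = torch.randn(2, num_sents, 2 * hidden_size)  # toy graph states

sent_mlp = torch.nn.Linear(2 * hidden_size, 1)  # 2*hidden_size -> 1 score
sent_logit = sent_mlp(sent_state)  # N x num_sents x 1

# Pad a constant 0 logit for the negative class, as the snippet above does.
zeros = torch.zeros_like(sent_logit)
two_way = torch.cat([zeros, sent_logit], dim=-1)  # N x num_sents x 2

# softmax over [0, s] gives sigmoid(s) for the positive class:
p_pos = torch.softmax(two_way, dim=-1)[..., 1]
assert torch.allclose(p_pos, torch.sigmoid(sent_logit.squeeze(-1)))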
