zjunlp / docunet
[IJCAI 2021] Document-level Relation Extraction as Semantic Segmentation
License: MIT License
Hello author, we are very interested in your work. Without changing the code, we cannot reproduce your results with roberta-large, and we hope to get your help.
The best result we have reproduced so far is an F1 of 56% on the development set.
Thanks.
Hello, may I ask where the code can be downloaded?
I also have a few questions:
1. In document-level relation extraction, documents contain pronouns such as they, it, he, and this. Do these pronouns need to be replaced, or should the coreference relations also be fed into the document-level relation extraction process?
2. For document-level relation extraction, you define an N × N matrix Y. Do the rows of Y range over all entities in the training set, or is a separate matrix Y built for each document?
3. The experimental setup says "We set the matrix size N = 42". Is this the matrix Y from question 2?
What does the 42 refer to, and how was it obtained?
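A minimal sketch of one plausible reading of questions 2 and 3 (an assumption, not the authors' confirmed answer): one Y is built per document, and N = 42 is a fixed dataset-wide upper bound on the entity count (cf. --max_height in the run scripts), so documents with fewer entities are padded. All names below are illustrative.

import torch

N = 42  # dataset-wide maximum number of entities per document (assumption)

def build_label_matrix(num_entities, triples):
    """One Y per document. triples: (head_idx, tail_idx, relation_id) tuples."""
    assert num_entities <= N
    Y = torch.zeros(N, N, dtype=torch.long)  # 0 = no relation; padded rows/cols stay 0
    for h, t, r in triples:
        Y[h, t] = r
    return Y

# toy usage: a 3-entity document with one relation (id 5) from entity 0 to entity 2
Y = build_label_matrix(3, [(0, 2, 5)])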
The bash script used for both the CDR and GDA runs:
#! /bin/bash
# Training script for CDR (the GDA run is analogous).
export CUDA_VISIBLE_DEVICES=0
if true; then
type=context-based
bs=4          # train/test batch size
bl=3e-5       # encoder learning rate (passed as --bert_lr)
uls=(4e-4)    # learning rate(s) for the non-encoder parameters (passed as --learning_rate)
accum=1       # gradient accumulation steps
for ul in "${uls[@]}"
do
python -u ./train_bio.py --data_dir ./dataset/cdr \
--max_height 35 \
--channel_type $type \
--bert_lr $bl \
--transformer_type bert \
--model_name_or_path allenai/scibert_scivocab_cased \
--train_file train.data \
--dev_file dev.data \
--test_file test.data \
--train_batch_size $bs \
--test_batch_size $bs \
--gradient_accumulation_steps $accum \
--num_labels 1 \
--learning_rate $ul \
--max_grad_norm 1.0 \
--warmup_ratio 0.06 \
--num_train_epochs 30 \
--seed 111 \
--num_class 2 \
--save_path ./checkpoint/cdr/train_scibert-lr${bl}_accum${accum}_unet-lr${ul}_bs${bs}.pt \
--log_dir ./logs/cdr/train_scibert-lr${bl}_accum${accum}_unet-lr${ul}_bs${bs}.log
done
fi
Hello!
Thank you for the awesome work; I enjoyed reading about your approach.
Is it possible to share the trained weights for DocRED?
To be honest, it will save me a ton of time haha. I'm writing a paper and would like to use the trained DocRED model on a few examples. Replicating this work is proving difficult when I don't have access to enough GPU memory.
I tried to run the DocuNet model in Colab. While running it, I got the error below.
I ran the following commands with the DocRED dataset:
!bash scripts/run_docred.sh --transformer-type roberta
!bash scripts/run_docred.sh --transformer-type bert
Both commands throw the same error:
Traceback (most recent call last):
  File "./train_balanceloss.py", line 12, in <module>
    from model_balanceloss import DocREModel
  File "/content/DocuNet/model_balanceloss.py", line 8, in <module>
    from element_wise import ElementWiseMatrixAttention
  File "/content/DocuNet/element_wise.py", line 8, in <module>
    class ElementWiseMatrixAttention(MatrixAttention):
  File "/content/DocuNet/element_wise.py", line 22, in ElementWiseMatrixAttention
    def forward(self, tensor_1: torch.Tensor, tensor_2: torch.Tensor) -> torch.Tensor:
  File "/usr/local/lib/python3.7/site-packages/overrides/overrides.py", line 88, in overrides
    return _overrides(method, check_signature, check_at_runtime)
  File "/usr/local/lib/python3.7/site-packages/overrides/overrides.py", line 114, in _overrides
    _validate_method(method, super_class, check_signature)
  File "/usr/local/lib/python3.7/site-packages/overrides/overrides.py", line 135, in _validate_method
    ensure_signature_is_compatible(super_method, method, is_static)
  File "/usr/local/lib/python3.7/site-packages/overrides/signature.py", line 95, in ensure_signature_is_compatible
    super_sig, sub_sig, super_type_hints, sub_type_hints, is_static, method_name
  File "/usr/local/lib/python3.7/site-packages/overrides/signature.py", line 136, in ensure_all_kwargs_defined_in_sub
    raise TypeError(f"{method_name}: {name} is not present.")
TypeError: ElementWiseMatrixAttention.forward: matrix_1 is not present.
I couldn't work out the cause of the error. How can I solve it? Can you help me out?
Thank you!
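The traceback itself names the mismatch: the overrides package validates that an overriding method keeps the parent's parameter names, and the parent MatrixAttention.forward evidently uses matrix_1/matrix_2 while element_wise.py declares tensor_1/tensor_2. A minimal sketch of the rename fix follows; the base-class stub and the method body here are placeholders, not the repository's actual code. Alternatively, pinning an older overrides release that does not enforce signature checks may also work.

import torch
from overrides import overrides

class MatrixAttention(torch.nn.Module):
    # stand-in for the repo's actual base class, whose forward evidently
    # takes matrix_1 / matrix_2 (that is what the checker complains about)
    def forward(self, matrix_1: torch.Tensor, matrix_2: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError

class ElementWiseMatrixAttention(MatrixAttention):
    @overrides
    def forward(self, matrix_1: torch.Tensor, matrix_2: torch.Tensor) -> torch.Tensor:
        # placeholder body: keep the repository's original element-wise logic;
        # only the parameter names change (tensor_1/tensor_2 -> matrix_1/matrix_2)
        return matrix_1 * matrix_2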
When I try to run the script "run_docred.sh", I get the error below and am unable to solve it: there is no module named overrides.
I checked the other Python files and found no function or class named overrides. The error:
File "./train_balanceloss.py", line 12, in <module>
from model_balanceloss import DocREModel
File "/content/DocuNet/model_balanceloss.py", line 8, in <module>
from element_wise import ElementWiseMatrixAttention
File "/content/DocuNet/element_wise.py", line 2, in <module>
from overrides import overrides
ModuleNotFoundError: No module named 'overrides' ```
How can I solve this issue?
Thank you
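For anyone hitting the same error: overrides is a third-party PyPI package imported by element_wise.py, not a module of this repository, so installing it resolves the import:

pip install overrides

(If a newer release then triggers the signature error shown in the issue above, see the rename sketch there.)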
Are there any code files that have not been uploaded?
Hi,
I notice that your paper says "... N is the largest number of entities, counted from all the dataset samples". However, it seems that in your DocRED experiment the size N is fixed to args.max_height = 42. So what does N stand for?
Hello, why do I get an F1 of around 85 when running on the CDR dataset, while the paper reports 76.3?
Here is the log:
Let's use 2 GPUs!
Total steps: 763
Warmup steps: 45
Traceback (most recent call last):
  File "./train_balanceloss.py", line 325, in <module>
    main()
  File "./train_balanceloss.py", line 314, in main
    train(args, model, train_features, dev_features, test_features)
  File "./train_balanceloss.py", line 137, in train
    finetune(train_features, optimizer, args.num_train_epochs, num_steps, model)
  File "./train_balanceloss.py", line 62, in finetune
    outputs = model(**inputs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/opt/conda/lib/python3.8/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ssd3/chunxu/docunet_predict/model_balanceloss.py", line 163, in forward
    hs, ts, entity_embs, entity_as = self.get_hrt(sequence_output, attention, entity_pos, hts)
  File "/home/ssd3/chunxu/docunet_predict/model_balanceloss.py", line 67, in get_hrt
    e_emb.append(sequence_output[i, start + offset])
IndexError: index 2 is out of bounds for dimension 0 with size 2
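One guess at the cause (an assumption, not a confirmed diagnosis): the IndexError is raised inside a DataParallel replica, so the two-GPU split may hand a replica entity positions (entity_pos/hts) that do not line up with its slice of the batch. A quick way to test that hypothesis is to pin the run to a single GPU, as the provided scripts already do:

export CUDA_VISIBLE_DEVICES=0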
Hello, thank you for your outstanding work. Could you tell me the value of the parameter D′? It is not given in the paper.
I'd like to ask which version transformers==3.0.4 actually is. I tried both 4.17 and 3.4.0, and both still throw errors from inside various libraries.
Could you please provide the hyperparameters for training on multiple GPUs? I still cannot reproduce your reported results using roberta, similar to
@Veronicium in #5 (comment)
thx
What does FFD stand for in the diagram of the U-shaped Segmentation Module? In the code it appears to be just a convolution layer.
Hello!
Thank you for the awesome repository.
Is it possible to share an updated version of the trained weights for DocRED, or the result.json file? The trained weights shared in issue #9 don't predict anything on the official evaluation.
I am doing a research thesis, and not having to train the model would save me a lot of time. I'm performing this task with a model I developed, but my model can only predict positive labels (non-NA), and the result.json file generated by your model would help me filter the NA examples. My idea would be for my model to predict the pairs of labels your model predicted. As I don't have access to enough GPU memory, I am not able to train your model from scratch.
Thank you very much.
Why was BERT able to converge to 61 while RoBERTa could only converge to about 47, using the original code and running on a 3090?
Thanks for your work.
Recently, I used the default hyperparameters in the script to train a roberta-large model but got a very low result:
'dev_F1': 47.51133476159634, 'dev_F1_ign': 45.94575619655935, 'dev_re_p': 61.53846153846154, 'dev_re_r': 38.6918769780086
Could you please take a look at my training log and suggest any possible reasons for that?
train_roberta-lr3e-5_accum2_unet-lr4e-4_type_context-based.log
Error message:
scripts/run_docred.sh: line 3: $'\r': command not found
scripts/run_docred.sh: line 14: syntax error near unexpected token `$'do\r''
'cripts/run_docred.sh: line 14: ` do
Was this run_docred.sh script edited on Windows, causing it to fail under Linux?
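For anyone hitting the same error: the $'\r' messages indicate Windows (CRLF) line endings in the script. A minimal sketch of a fix (dos2unix scripts/run_docred.sh does the same, if available):

# Strip Windows carriage returns from the script so bash can parse it.
path = "scripts/run_docred.sh"
with open(path, "rb") as f:
    data = f.read().replace(b"\r\n", b"\n")
with open(path, "wb") as f:
    f.write(data)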
I don't understand why the same e_att is appended to entity_atts so many times:
for _ in range(self.min_height - entity_num - 1):
    entity_atts.append(e_att)
Thanks for your help!
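One plausible explanation (an assumption based on the padding pattern, not an authors' answer): each document must contribute a fixed number of entity slots (min_height, i.e. the matrix size N) so the per-entity attention maps can be stacked into one fixed-shape tensor, and repeating e_att is simply padding (the repo's extra -1 presumably accounts for one slot handled elsewhere). A toy sketch with made-up shapes:

import torch

min_height = 42                       # fixed number of entity slots (N)
heads, seq_len = 12, 512              # toy shape of one entity's attention map
entity_atts = [torch.rand(heads, seq_len) for _ in range(5)]  # 5 real entities
e_att = entity_atts[-1]
for _ in range(min_height - len(entity_atts)):  # mirrors the quoted loop
    entity_atts.append(e_att)         # pad by repeating an attention map
stacked = torch.stack(entity_atts)    # (min_height, heads, seq_len), batchable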
Hi! I'm quite interested in your paper, but I have trouble reproducing your results. When I run the run_docred.sh file, the bert-base model works fine (reaching 61.54/59.56 F1/Ign F1, around the mean minus 1.5 sd), but my RoBERTa model can only get 63.30/61.40 F1/Ign F1.
I notice that some hyper-parameters listed in the script and in the supplementary material are inconsistent. For example, the docred script uses bs=accum=2, but the supplementary says you use bs=5, accum=1. The supplementary says the weight decay is set to 5e-4, but I didn't see it in the code. Is this the reason why I couldn't reproduce the results?
Plus, could you upload your utils_sample file? It seems the file is missing. I plugged in the one from ATLOP and that works, but I'm not sure whether you wrote it the same way.
Thanks!
Hello, how can I obtain the DocRED dataset?
| epoch 29 | time: 51.71s | dev_result:{'dev_F1': 0.06700784829423147, 'dev_F1_ign': 0.05671531404503352, 'dev_re_p': 0.033853348984865014, 'dev_re_r': 3.245962833725554, 'dev_average_loss': 5.296513883590698}
How can I solve this problem?
I ran the scripts and only got lower F1 scores (dev_result:{'dev_F1': 61.39554434636402, 'dev_F1_ign': 59.42344205967282, 'dev_re_p': 63.68710211912444, 'dev_re_r': 59.2631664367443, 'dev_average_loss': 0.3790786044299603}).
I'd like to ask: when the text is too long (over 512 tokens), how is $A_i^S$ (the entity-to-text attention weight) computed?
Thanks for your answer! :)
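Not an official answer, but the common workaround in ATLOP-style document RE models is to encode overlapping 512-token windows and average the hidden states (and, analogously, the entity-to-token attention weights behind $A_i^S$) over the overlap. A runnable sketch; the checkpoint name and window/stride values are assumptions:

import torch
from transformers import AutoModel, AutoTokenizer

def encode_long(model, input_ids, max_len=512, stride=256):
    """input_ids: (1, n) LongTensor; returns (1, n, hidden) stitched token states."""
    n = input_ids.size(1)
    # window start offsets: regular strides plus one final window touching the end
    starts = sorted({s for s in range(0, n, stride) if s + max_len < n} | {max(n - max_len, 0)})
    hidden, counts = None, torch.zeros(n, 1)
    with torch.no_grad():
        for s in starts:
            window = input_ids[:, s:s + max_len]
            out = model(input_ids=window, attention_mask=torch.ones_like(window)).last_hidden_state
            if hidden is None:
                hidden = torch.zeros(1, n, out.size(-1))
            hidden[:, s:s + window.size(1)] += out
            counts[s:s + window.size(1)] += 1
    return hidden / counts  # average token states over the overlapping windows

# usage sketch (checkpoint name is an assumption; any HF encoder works):
# tok = AutoTokenizer.from_pretrained("roberta-base")
# ids = tok(long_text, return_tensors="pt")["input_ids"]
# states = encode_long(AutoModel.from_pretrained("roberta-base"), ids)

Per-window attention maps (output_attentions=True) can be merged the same way; whether DocuNet itself does exactly this, I have not verified.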
Hello, may I ask where I should download the dataset, and what preprocessing it needs?
I ran the command: bash scripts/run_cdr.sh
FileNotFoundError: [Errno 2] No such file or directory: './dataset/cdr/train_filter.data'