hy-struggle / prgc

PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction

Python 99.37% Shell 0.63%

prgc's Issues

Some issues w.r.t the dataset.

Why is there a duplicate triple?
For example, the second item in the validation set of the NYT dataset:
{
"text": "In his authoritative and tough-minded new book , '' The Assassins ' Gate : America in Iraq , '' the New Yorker writer George Packer reminds us that the decision of the Bush administration to go to war against Iraq and its increasingly embattled handling of the occupation were both predicated upon large , abstract ideas about the role of America in the post-cold war world -- most notably , a belief in pre-emptive and unilateral action , the viability of exporting democracy abroad , the urge to streamline the military and the dream of remaking the Middle East .",
"triple_list": [
[
"Middle East",
"/location/location/contains",
"Iraq"
],
[
"Middle East",
"/location/location/contains",
"Iraq"
]
]
},
There are two identical triples, "Middle East /location/location/contains Iraq".
In NYT-star, the same item looks like this:
{
"text": "In his authoritative and tough-minded new book , '' The Assassins ' Gate : America in Iraq , '' the New Yorker writer George Packer reminds us that the decision of the Bush administration to go to war against Iraq and its increasingly embattled handling of the occupation were both predicated upon large , abstract ideas about the role of America in the post-cold war world -- most notably , a belief in pre-emptive and unilateral action , the viability of exporting democracy abroad , the urge to streamline the military and the dream of remaking the Middle East .",
"triple_list": [
[
"East",
"/location/location/contains",
"Iraq"
]
]
},
which, in contrast, contains no duplicate triples.
Is this duplication intended? Does it affect the final experimental results?
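
To check how much duplication like this changes the triple counts, one option is to compare raw and deduplicated counts per sample. The following is a minimal sketch, not code from the repository; the helper names `dedupe_triples` and `count_triples` are hypothetical, and the sample text is shortened.

```python
def dedupe_triples(triple_list):
    """Return triples in their original order with exact duplicates removed."""
    seen = set()
    unique = []
    for triple in triple_list:
        key = tuple(triple)  # lists are unhashable, so use a tuple as the set key
        if key not in seen:
            seen.add(key)
            unique.append(list(triple))
    return unique


def count_triples(samples):
    """Return (raw_count, deduplicated_count) summed over all samples."""
    raw = sum(len(s["triple_list"]) for s in samples)
    unique = sum(len(dedupe_triples(s["triple_list"])) for s in samples)
    return raw, unique


# Toy sample mirroring the duplicated NYT item above (text shortened).
sample = {
    "text": "... the dream of remaking the Middle East .",
    "triple_list": [
        ["Middle East", "/location/location/contains", "Iraq"],
        ["Middle East", "/location/location/contains", "Iraq"],
    ],
}

raw, unique = count_triples([sample])
print(raw, unique)  # 2 1
```

Running `count_triples` over the full validation or test file would show whether the gap between the two counts is large enough to matter for the reported metrics.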

After one epoch of training, evaluate.py reports precision and F1 of 0 and outputs no predictions. What is going on?

help

When I use multi_gpu to run the train script, it shows 'zip argument #1 must support iteration', even though I have already converted the return value of the evaluate function to a tensor. Can you tell me how to fix this issue?

Data type error: 'NoneType' * int

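Is the number of triples in the test-set statistics table of the arXiv paper counted without deduplication?

I ask because the code uses a defaultdict, but when I load the dataset with a dict, the number of triples in NYT appears to be 8120.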

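Experimental results fall short of those reported in the paper

Hello, I ran the experiments with the parameters given in your appendix. Selecting the model that performed best on the validation set, I get an F1 of only 91.6 on the NYT-star dataset, one percentage point below the 92.6 reported in the paper. In addition, on the WebNLG dataset the F1 is only 88.2.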

bert_config.json

I'm a beginner and want to run the code. Why can't I find this file, bert_config.json?

rel

If I only want to get the rel precision, I find that its F1 is low.

Is it necessary to run 100 epochs?

I have run 20 epochs, and the F1 score does not seem to improve any further. Also, I cannot reproduce the NYT result reported in the paper. Do we have to run 100 epochs?

After training one epoch, got an error

Epoch 1/100
100%|####################################################################################################| 89/89 [00:27<00:00, 3.25it/s, loss=0.288, loss_mat=0.000, loss_rel=0.000, loss_seq=0.288]
0%| | 0/2 [00:00<?, ?Batch/s]
Traceback (most recent call last):
File "train.py", line 224, in
train_and_evaluate(model, params, ex_params, args.restore_file)
File "train.py", line 153, in train_and_evaluate
val_metrics = evaluate(model, val_loader, params, ex_params, mark='Val')
File "F:\code\PRGC\evaluate.py", line 92, in evaluate
ex_params=ex_params)
ValueError: not enough values to unpack (expected 4, got 3)

I'm confused by this error, because I confirmed that the evaluate function returns 3 values. The expected 4 makes no sense to me.
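
For reference, this ValueError describes the assignment on the line named in the traceback, not the return value of the enclosing function. Here the traceback points at a call inside evaluate.py, so the mismatch is presumably between what that call returns and how many names its result is unpacked into. A minimal sketch of the mechanism, using hypothetical stand-in functions rather than the repository's code:

```python
def forward_three():
    # Hypothetical stand-in for a call that returns three values.
    return "a", "b", "c"


def call_site():
    # Unpacking three values into four names raises the ValueError.
    w, x, y, z = forward_three()
    return w, x, y, z


try:
    call_site()
    message = ""
except ValueError as err:
    message = str(err)

print(message)
```

If this is the cause, the number of names on the left-hand side of the call in evaluate.py and the number of values the model returns in that code path need to agree.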

The experimental results did not meet expectations. Why?

With batch size = 30 and all other parameters the same as yours, the F1 score is only 90.5 on the NYT-star test set. It does not reach the 92.6 reported in the paper.

Additional hidden layers in each of the three stages?

According to the model description in the paper on arXiv (equations (1), (2) and (3)), each stage of PRGC consists of only one linear layer followed by a sigmoid. When I inspected the code, it seemed that each stage has an additional non-linear hidden layer (the MultiNonLinearClassifier class). This greatly increases the model size. Were the published results achieved using the smaller or the bigger model?
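
To give a rough sense of the size difference being asked about, the sketch below counts the parameters of a single linear head versus a head with one extra hidden layer. All sizes are assumptions chosen for illustration (a BERT-base hidden size of 768, a hidden layer half that width, 24 relation labels); the actual layout of MultiNonLinearClassifier may differ.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


def single_layer_head(hidden, num_outputs):
    # Paper-style head: one linear layer (the sigmoid adds no parameters).
    return linear_params(hidden, num_outputs)


def two_layer_head(hidden, inner, num_outputs):
    # Assumed structure: linear -> non-linearity -> linear.
    return linear_params(hidden, inner) + linear_params(inner, num_outputs)


hidden = 768          # assumption: BERT-base hidden size
inner = hidden // 2   # assumption: hidden layer width, not taken from the repo
num_relations = 24    # assumption: number of relation labels

small = single_layer_head(hidden, num_relations)
large = two_layer_head(hidden, inner, num_relations)
print(small, large)  # 18456 304536
```

Under these assumptions the extra hidden layer makes one head roughly 16 times larger, although both figures are small next to the encoder itself.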

Triple extraction

Beginner question: how can I view the triples extracted from the training and test sets?

f=0,p=0,r=0

When I run the model on my own dataset, the output is all zeros: f=0, p=0 and r=0. But when I use the PRGC datasets, the output is normal. How can I solve this problem? Thanks.

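Missing argument in the model's evaluate script

Hello, sorry to bother you. After training your model on the NYT dataset, I wanted to evaluate it with evaluate.py, but the --mode argument passed in evaluate.sh is not defined in evaluate.py, which raises an error: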
evaluate.py: error: unrecognized arguments: --mode=test
and also
Traceback (most recent call last):
File "/root/PRGC-main/evaluate.py", line 142, in
'corres_threshold': args.mat_threshold,
AttributeError: 'Namespace' object has no attribute 'mat_threshold'
Could you suggest how to resolve this? Many thanks.
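
Both errors come from a mismatch between the flags a script registers with argparse and the flags or attributes used elsewhere. The sketch below reproduces the two situations in isolation; the flag `--corres_threshold` is a hypothetical choice for illustration, and the real evaluate.py may define different arguments.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag for illustration; the real evaluate.py may differ.
    parser.add_argument("--corres_threshold", type=float, default=0.5)
    return parser


parser = build_parser()

# 1) A flag passed on the command line but never registered is "unrecognized".
#    parse_known_args collects such flags instead of exiting.
known, unknown = parser.parse_known_args(["--mode=test"])
print(unknown)

# 2) Only attributes created by add_argument exist on the Namespace, so reading
#    any other name (such as mat_threshold here) raises AttributeError.
has_mat = hasattr(known, "mat_threshold")
has_corres = hasattr(known, "corres_threshold")
print(has_mat, has_corres)
```

In practice this means either removing the unregistered flag from the shell script or registering it in the Python script, and making sure every `args.<name>` read matches a registered argument name.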

x_i

What does x_i in the code comments refer to?

Chinese field

Does the model support Chinese triplet extraction? Thanks for your reply.

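About the model

Hello, when the model predicts relations and several potential relations are predicted at once, are these potential relations fed into the model one by one before entity prediction is performed?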

Program hangs after loading val

A problem I ran into when running train.py while trying to reproduce the paper.

train.py

Hi, in train.py, when calculating num_train_optimization_steps, why is len(train_loader) not divided by batch_size?
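
As background, in PyTorch the length of a DataLoader is already the number of batches per epoch, not the number of samples, which is the likely reason no further division appears. The sketch below mirrors that arithmetic in plain Python; the function names, the step formula, and the example numbers are assumptions for illustration, not the repository's code.

```python
import math


def num_batches(num_samples, batch_size, drop_last=False):
    """Batches yielded per epoch, which is what len(loader) reports."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def total_optimization_steps(batches_per_epoch, grad_accum_steps, epochs):
    # One optimizer step every grad_accum_steps batches.
    return batches_per_epoch // grad_accum_steps * epochs


# Hypothetical numbers: 5000 samples, batch size 64, 100 epochs.
batches = num_batches(5000, 64)
steps = total_optimization_steps(batches, grad_accum_steps=1, epochs=100)
print(batches, steps)  # 79 7900
```

Dividing `len(train_loader)` by the batch size again would therefore undercount the number of optimizer steps.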

grad can be implicitly created only for scalar outputs

This bug appeared when I tried to run the model on several GPUs.
How can I fix it?
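
A common cause of this message with `torch.nn.DataParallel` is that the per-GPU scalar losses are gathered into a vector, and `backward()` without arguments only accepts a scalar. The sketch below reproduces the error on CPU with a stand-in tensor and shows the usual workaround of reducing the loss first. It assumes the model returns the loss from `forward`; whether that matches this repository is not confirmed here.

```python
import torch

# Stand-in for per-GPU losses gathered by DataParallel into a vector.
gathered_loss = torch.tensor([0.5, 0.7], requires_grad=True)

try:
    gathered_loss.backward()  # non-scalar output: raises RuntimeError
    error_message = ""
except RuntimeError as err:
    error_message = str(err)

# Reducing to a scalar first makes backward() valid.
scalar_loss = gathered_loss.mean()
scalar_loss.backward()

print(error_message)
print(gathered_loss.grad)
```

In a training loop this usually amounts to calling `loss = loss.mean()` before `loss.backward()` when more than one GPU is in use.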

Meaning of parameters

What do the two parameters --ensure_corres and --ensure_rel mean? Looking forward to your reply.

GPU selection

How should I change the code so that it trains the model on two GPUs?

Is there no file that exposes an interface?

zip error in gather_map when gathering data from different GPUs during evaluate

Epoch 1/20
0%| | 0/1266 [00:00<?, ?it/s]Epoch=20
/home/root1/anaconda3/envs/prgc/lib/python3.7/site-packages/torch/nn/parallel/_functions.py:61: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
0%| | 0/1266 [01:55<?, ?it/s, loss=2.482, loss_mat=0.665, loss_rel=0.692, loss_seq=1.125]
0%| | 0/209 [01:11<?, ?Batch/s]
0%| | 0/1266 [03:06<?, ?it/s, loss=2.482, loss_mat=0.665, loss_rel=0.692, loss_seq=1.125]
Traceback (most recent call last):
File "/data00/home/hjj/code/RelationExtract/Joint_Extraction/PRGCLocal/evaluate.py", line 92, in evaluate
ex_params=ex_params)
File "/home/root1/anaconda3/envs/prgc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/root1/anaconda3/envs/prgc/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 156, in forward
return self.gather(outputs, self.output_device)
File "/home/root1/anaconda3/envs/prgc/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in gather
return gather(outputs, output_device, dim=self.dim)
File "/home/root1/anaconda3/envs/prgc/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
res = gather_map(outputs)
File "/home/root1/anaconda3/envs/prgc/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
File "/home/root1/anaconda3/envs/prgc/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
File "/home/root1/anaconda3/envs/prgc/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
return type(out)(map(gather_map, zip(*outputs)))
TypeError: zip argument #1 must support iteration

Process finished with exit code 1

A Problem of Complexity Analysis

How do you calculate the number of parameters, floating point operations (FLOPs) and inference time? Could you share the code? Thank you very much.
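
For the inference-time part, one common approach (not necessarily the authors') is to run a few warm-up calls and then average wall-clock time over repeated passes. The helper below is a generic sketch with hypothetical names; `fn` stands in for a model's prediction call. When timing on a GPU, a synchronization call such as `torch.cuda.synchronize()` is needed before reading the clock, which this plain-Python sketch omits.

```python
import time


def average_latency(fn, inputs, warmup=2, repeats=5):
    """Average seconds per call of fn over the given inputs."""
    # Warm-up calls are excluded from timing (caches, lazy initialization).
    for x in inputs[:warmup]:
        fn(x)
    start = time.perf_counter()
    for _ in range(repeats):
        for x in inputs:
            fn(x)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(inputs))


def toy_model(x):
    # Hypothetical stand-in for a real inference call.
    return sum(i * i for i in range(x))


latency = average_latency(toy_model, [1000, 2000, 3000])
print(f"{latency * 1e6:.1f} microseconds per call")
```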