
spico197 / docee

225 stars · 6 watching · 36 forks · 35.96 MB

🕹️ A toolkit for document-level event extraction, containing some SOTA model implementations.

Home Page: https://doc-ee.readthedocs.io/

License: MIT License

Python 94.35% Makefile 0.05% HTML 0.83% Shell 4.77%
event-extraction information-extraction natural-language-understanding pytorch

docee's Introduction

Hi there 👋

🎀 Thanks for visiting. I'm Tong Zhu, and I'm interested in Information Extraction. You may want to check out my homepage for more details ~


docee's People

Contributors

shiina18, spico197, tosemml


docee's Issues

Which top conference did you submit to?

**Idea sharing**
While sharing what you want to do, make sure to protect your ideas.

**Problems**
If you have any questions about event extraction, make sure you have read the latest papers or searched on the Internet.

**Others**
Other things you may want to share or discuss.


Question about teacher_prob during training

Looking at the code, a 0.1 teacher probability remains even at the end of training, so the metrics should really only rise when use_gold_span is True and barely grow the rest of the time. In that case, do the dev-set results still have reference value?
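For reference, the scheduled-sampling behaviour being asked about is controlled by min_teacher_prob, schedule_epoch_start, and schedule_epoch_length (all visible in the setting dump further down this page). A minimal sketch of how such a linear decay is typically computed; the exact formula in dee_task.py may differ:

def get_teacher_prob(epoch: int,
                     schedule_epoch_start: int = 10,
                     schedule_epoch_length: int = 10,
                     min_teacher_prob: float = 0.1) -> float:
    """Probability of feeding gold spans (teacher forcing) at a given epoch.

    Before schedule_epoch_start the model always sees gold spans; the
    probability then decays linearly over schedule_epoch_length epochs but
    is floored at min_teacher_prob -- hence the residual 0.1 chance of
    gold-span training mentioned above.
    """
    if epoch < schedule_epoch_start:
        return 1.0
    progress = (epoch - schedule_epoch_start) / schedule_epoch_length
    return max(min_teacher_prob, 1.0 - progress)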

Pseudo triggers

Hello, while reading the code I noticed that in zheng2019_trigger_graph.py the pseudo triggers chosen are the ones with the highest importance on the test set. Is this reasonable?

How to understand the GCN output part of GIT

Hello, while reading these models I have a few questions about GIT. How many Transformer layers does it use for NER?
When building the heterogeneous graph, what does the GCN output? And in the subsequent event detection, what is used as the document vector for event classification?
I couldn't fully work it out from the paper and code, and would appreciate your explanation.
Thanks 🙏

Discussion of DuEE-fin results

**Idea sharing**
While sharing what you want to do, make sure to protect your ideas.

**Problems**
Hi, following your repo I finished training the 'PTPCG_P1-DuEE_fin-woTgg-wOtherType' task (i.e. |R|=1, Tgg=×). I'm confused by some of the results in the Results folder, so I'd like to ask.
total_results: [ { "ModelType": "TriggerAwarePrunedCompleteGraph", "Total": { "precision": "68.8", "recall": "62.4", "f1": "65.5" } } ]
Is the f1 in total_results computed as in the figure below (Figure 1)?
[screenshot: Figure 1]

Also:
"m2m": { "classification": { "precision": "94.696", "recall": "93.718", "f1": "94.204" }, "entity": { "precision": "80.362", "recall": "85.863", "f1": "83.022" }, "combination": { "precision": "22.823", "recall": "24.050", "f1": "23.421" }, "rawCombination": { "precision": "22.448", "recall": "21.756", "f1": "22.097" }, "overall": { "precision": "68.838", "recall": "62.408", "f1": "65.465" }, "instance": { "precision": "21.141", "recall": "22.977", "f1": "22.021" } }
Is the combination score here the argument-extraction score? If so, does an F1 around 22 mean the model is a weak reference for the argument-extraction task?
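As a quick sanity check (plain arithmetic, not taken from the repo), the overall F1 above is simply the harmonic mean of the listed precision and recall:

p, r = 68.838, 62.408
f1 = 2 * p * r / (p + r)
print(f"{f1:.3f}")  # 65.465, matching the "overall" entry above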

**Others**
Other things you may want to share or discuss.

Which parts of this repo's code are used by the PTPCG algorithm?

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

I'd like to study just the parts related to this algorithm without the rest of the codebase. Is there a historical branch containing only that code?

Environment

Environment Values
System Windows/Linux
GPU Device
CUDA Version
Python Version
PyTorch Version
dee (the Toolkit) Version

Full Log

Log:

DDP issue - IndexError: Caught IndexError in replica 0 on device 0

Hello, when training on a single machine with multiple GPUs, the following error occurs:

Traceback (most recent call last):
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 587, in get_loss_on_batch
teacher_prob=teacher_prob,
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/home/qianbenchen/DocEE-main/dee/models/trigger_aware.py", line 172, in forward
ent_fix_mode=self.config.ent_fix_mode,
File "/data/home/qianbenchen/DocEE-main/dee/modules/doc_info.py", line 305, in get_doc_arg_rel_info_list
) = get_span_mention_info(span_dranges_list, doc_token_type_mat)
File "/data/home/qianbenchen/DocEE-main/dee/modules/doc_info.py", line 16, in get_span_mention_info
mention_type_list.append(doc_token_type_list[sent_idx][char_s])
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "run_dee_task.py", line 274, in
dee_task.train(save_cpt_flag=in_argv.save_cpt_flag)
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 656, in train
base_epoch_idx=resume_base_epoch,
File "/data/home/qianbenchen/DocEE-main/dee/tasks/base_task.py", line 693, in base_train
total_loss = get_loss_func(self, batch, **kwargs_dict1)
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 598, in get_loss_on_batch
raise Exception("Cannot get the loss")

Has this problem been resolved? Thank you!
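One plausible cause, offered here as an assumption rather than the maintainer's diagnosis: the traceback goes through torch/nn/parallel/data_parallel.py, i.e. torch.nn.DataParallel, which chunks tensor inputs along the batch dimension but hands non-tensor arguments (such as nested Python lists) to every replica at full batch length. List-based features like doc_token_type_list can therefore become misaligned with the tensor slice a replica receives, producing exactly this kind of IndexError. A self-contained probe of that scatter behaviour:

import torch
import torch.nn as nn

class Probe(nn.Module):
    def forward(self, x, token_lists=None):
        # Each replica receives only its chunk of the tensor batch,
        # while the plain list keeps the full batch length.
        print(f"device={x.device} tensor_batch={x.shape[0]} list_batch={len(token_lists)}")
        return x.sum().unsqueeze(0)

if torch.cuda.device_count() >= 2:
    model = nn.DataParallel(Probe().cuda())
    x = torch.randn(4, 8).cuda()
    model(x, token_lists=[[0], [1], [2], [3]])  # the list is not chunked like x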

uncleaned redundancies

  • LSTMMTL2EDAGModel
  • EventTableForIndependentTypeCombination
  • DEEMultiStepTriggeringFeatureConverter
  • DEEMultiStepTriggeringFeature

About the input to predict_one()

Hello, thank you for your guidance; sorry to bother you again. Today, after finishing the initialization work, I tested with the predict_one() function. I used two pieces of plain text: the first was random letter gibberish, the second was a financial-domain text. The first produced output, but the second simply hung. What could be the reason? Did I overlook some parameter? (I used Google's BERT model: https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip)
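For later readers, a sketch of the kind of call being discussed. Everything below is hypothetical glue, not the toolkit's documented API: the constructor keywords and the exact predict_one signature should be checked against dee/tasks/dee_task.py.

import json
from dee.tasks import DEETask, DEETaskSetting

# All keyword names below are assumptions for illustration only.
with open("task_setting.json", encoding="utf8") as fin:
    setting = DEETaskSetting(**json.load(fin))
task = DEETask(setting, load_train=False, load_dev=False, load_test=False)
print(task.predict_one("一段金融领域的公告文本"))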

About usage

Hello, I'm a student on Windows. I read your paper with great interest, as well as your earlier answers about the API, but I still don't know how to use the model in my own project after downloading it. Could you kindly advise when convenient?

How can PTPCG be used for news event extraction

Hi, I will use PTPCG for news event extraction. The data has been processed into PTPCG's input format, and I have used trigger.py to get the importance scores for pseudo-trigger selection. Do I need to modify anything else, such as utils.py in the dee folder? What other details should I pay attention to?
Thanks for reading! I'm looking forward to your reply!

EE with triggers

Do you have any materials or repos for document-level EE with triggers? Thanks.

Readme first before opening a new issue when an error occurs

For toolkit usage errors, you must strictly follow the Toolkit usage issue template to open a new issue.

Otherwise, your issue may be closed directly without further explanation.

The template can be found when you open a new issue.


Questions about triggers

Here I am again with a few more questions:

  1. Following the paper's pipeline for single-trigger training (not pseudo triggers), triggers are first recognized via NER and then used as nodes in graph construction. During subgraph decomposition, is the trigger node treated as part of the maximal clique?
  2. In the code, how can I tell which mentions are pseudo triggers (or triggers)? Do I need to look up the corresponding indices in span_context_list?

Experiments on English datasets such as WikiEvents

I'm planning to experiment on an English dataset. Has the author obtained results on WikiEvents? The pretrained model names in scripts all appear to be Chinese ~~~

About event types

**Problems**
Sorry to bother you again, and thank you for the label-statistics method you shared earlier; I've finished counting the labels in dee/event_types/zheng2019_trigger_graph.py. One question remains: the duee-fin 事件类型及对应角色.pdf you provided lists 13 event types in total, whereas my count found only five (see screenshot). Have you implemented a version covering all the event types, or is that left for readers to extend themselves?
[screenshot]

Training on a new dataset

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

For training on my own new data, how should I get started with the data processing? Are there concrete step-by-step instructions?

Environment

Environment Values
System Windows/Linux
GPU Device
CUDA Version
Python Version
PyTorch Version
dee (the Toolkit) Version

Taking the model as an API to extract events from documents

Hello, I work on other NLP tasks but am very interested in extracting events from documents, and I came across your work.

After reading the README I found very detailed reproduction instructions, but I'd like to ask: are any trained models publicly released, together with an inference API? That would make it convenient to use the toolkit directly as a preprocessing step to obtain the events in documents on one's own data, without retraining or reading the code.

Many thanks for your advice.

Hello, some usage questions

I followed the method you described to solve the dee import problem, but the error below still appears, and I'm not sure where the issue is.
[screenshot: 2022-03-29 230348]

I'd also like to ask how dump-task is used. I'm an NLP beginner with many open questions, and I'd appreciate your help.

Whether empty samples can be added for training PTPCG

**Idea sharing**
While sharing what you want to do, make sure to protect your ideas.

**Problems**
If you have any questions about event extraction, make sure you have read the latest papers or searched on the Internet.

**Others**
Other things you may want to share or discuss.

Hello Spico, I'm reproducing your paper and using your online demo. I find that a new event gets misclassified. Could the model learn to classify the event list as null if I add some empty samples, like this:
{"text": "I love China", "event_list": []}
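If the raw training file is line-delimited JSON with text and event_list fields (an assumption about the format; the PTPCG-converted files may differ), appending such negative samples could look like:

import json

# Hypothetical event-free samples; field names assume a DuEE-fin-style schema.
negatives = [{"text": "I love China", "event_list": []}]

with open("duee_fin_train.json", "a", encoding="utf8") as fout:
    for sample in negatives:
        fout.write(json.dumps(sample, ensure_ascii=False) + "\n")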

How to generate "dueefin_PTPCG_P1R1_wTgg.json", where is it, and what is its format?

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

When I train the code for dueefin, I can't find the dueefin_PTPCG_P1R1_wTgg.json file. My question is what the title says. Thanks for reading! Looking forward to your reply!

Some questions

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

My problem is ...
Hello, I only started using GitHub properly in the last few days. I'm not sure whether this is what you meant by strictly following the template; if anything is wrong, please bear with me.
I have a few more questions for you (I feel a bit embarrassed to keep troubling you~):
1. Is the bert-base-chinese in inference.py the one in Screenshot 1, and which of its files are required (vocab.txt obviously is)?
2. By "run it through once" in Screenshot 2, do you mean executing the program once? I ran inference.py and got the result in Screenshot 3, and I don't know what caused it, hence this simple question~
3. If it's about running, could you roughly walk through the execution flow? I found the README a bit confusing; I'd like to use your model to run some predictions.

[screenshots: 2022-03-30 132719, 2022-03-30 114442, 2022-03-30 134430]

You can reproduce the problem by ...

I have tried ..., but it goes to ...

I have checked the source codes, and the problem may come from ...

Environment

Environment Values
System Windows/Linux
GPU Device
CUDA Version
Python Version
PyTorch Version
dee (the Toolkit) Version

Full Log

Log:

About "Before running any bash script, please ensure bert_model has been correctly set"

Hello, after fixing a few problems as you suggested, I'm glad the project now runs. But a few small issues remain that I can't resolve, so I have to ask; I'll state them below.

  1. Is the BERT model referred to by the README's "Before running any bash script, please ensure bert_model has been correctly set" the Chinese model officially open-sourced by Google (https://github.com/google-research/bert)?
  2. The tokenization in my output is wrong (see Figure 1): every role is a single character or punctuation mark, so I suspect BERT was never loaded, since I did not modify "bert_model": "bert-base-chinese" in your dumped task_setting.json. Is this suspicion reasonable?
    [screenshots: Figure 1 plus two output captures]

Some questions

Hello. I've just entered the field of event extraction and am probably still at the zero-foundation stage.
There are some files whose concrete function I can't figure out:
[screenshot]
Could you kindly give me some pointers?

Model training questions

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

Hello, I'd like to retrain the PTPCG model. Running run_ptpcg.sh showed that my machine's specs are too low, so I plan to apply for a cloud platform for acceleration. I've read dee_task.py; if I now run run_dee_task.py via shell, will I get the model I want under the Exps folder? (For some reason, save_cpt_flag=False in dee_task.train(save_cpt_flag=in_argv.save_cpt_flag); does that mean the model is not saved?)

About mention type and argument role

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

Hello. Figure 1 of your paper involves two concepts, mention type and argument role, and the Entity Representation section also uses the mention type information. I understand what they mean and how they differ, but does the duee dataset actually contain mention type annotations? When I checked, there only seems to be argument role.

Environment

Environment Values
System Windows/Linux
GPU Device
CUDA Version
Python Version
PyTorch Version
dee (the Toolkit) Version

Full Log

Log:

About event_table.py

Hello, may I ask which folder the event_table.py mentioned in the README lives in? I can't seem to find it.

doc_lang=self.setting.doc_lang raises an error

When running run_ptpcg_dueefin_withtgg_withptgg.sh, it errors out saying self.setting has no doc_lang value, so I added it manually (I'm not sure this is right; I set self.setting.doc_lang = 'zh').
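A less intrusive fix than patching the attribute at runtime may be to add the entry to the dumped settings file, since the setting dump further down this page does contain a "doc_lang": "zh" key. A sketch, assuming the dump is plain JSON (the path below is an example):

import json

path = "Exps/<task_dir>/task_setting.json"  # example path, adjust to your run
with open(path, encoding="utf8") as fin:
    setting = json.load(fin)
setting["doc_lang"] = "zh"  # value matches the setting dump shown further down
with open(path, "w", encoding="utf8") as fout:
    json.dump(setting, fout, ensure_ascii=False, indent=2)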

About the unk option of filter_event_type, and seeking some advice

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

Hello. Question 1: in the latest inference.py I noticed an unk option appearing in filter_event_type. When is it used? I remember only o2o, o2m, m2m and overall were supported before.
Question 2: do you know of, or can you recommend, any event detection models for open-domain Chinese news? Given the expressive variety and data distribution of news, your model cannot be used for prediction directly; it still predicts many wrong events (texts containing no event, or events semantically similar to but not among the defined types, get classified as defined events), so an event detection stage needs to be added in front.
I currently use trigger words to trigger events (which guarantees texts with no event at all are filtered out, but may produce many false recalls and drop defined events whose triggers aren't in the lexicon), plus thresholding on your model's extraction results and the document vector (to filter low-confidence results and keep the extracted ones precise), but the problem remains unsolved.

Environment

Environment Values
System Windows/Linux
GPU Device
CUDA Version
Python Version
PyTorch Version
dee (the Toolkit) Version

Full Log

Log:

Would it be possible to use my own entities?

I have my own entities extracted by another model. How easy would it be to apply DEE to these external entities using this work? I see that the entities and events share an LSTM at the beginning of processing.

A question about PTPCG's flexibility at prediction time

Hi, after reading the PTPCG model I have a question about prediction. Once the adjacency matrix is predicted, the Combinations are determined, and argument-role prediction is then run for every predicted event type against every Combination. Doesn't this carry a hidden assumption that every predicted event type has the same set of Combinations, i.e. every event_type has the same number of event objects? Is my understanding wrong?
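For context: the setting dump further down this page includes max_clique_decode, which matches the reading above that combinations come from maximal cliques of the predicted adjacency matrix and are shared across all predicted event types before role decoding. An illustrative sketch of that decoding step (not the toolkit's exact code):

import networkx as nx
import numpy as np

def decode_combinations(adj: np.ndarray, threshold: float = 0.5):
    """Candidate argument combinations as maximal cliques of the
    thresholded entity-entity adjacency matrix (illustration only)."""
    g = nx.Graph()
    g.add_nodes_from(range(adj.shape[0]))
    for i in range(adj.shape[0]):
        for j in range(i + 1, adj.shape[0]):
            if adj[i, j] >= threshold:
                g.add_edge(i, j)
    return [sorted(c) for c in nx.find_cliques(g)]

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [0, 0, 0, 0]], dtype=float)
# e.g. [[0, 1, 2], [3]] (order may vary); every predicted event type is then
# paired with this same pool of combinations, which is the questioner's point.
print(decode_combinations(adj))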

Reproduction of Doc2EDAG

**Idea sharing**
While sharing what you want to do, make sure to protect your ideas.

**Problems**
If you have any questions about event extraction, make sure you have read the latest papers or searched on the Internet.

**Others**
Other things you may want to share or discuss.
Hello, Spico! I'm very glad to talk with you about event extraction. Is the order of event types (o2o, o2m, m2m) in the training data important for model performance? I find that the reproduction of Doc2EDAG in your paper is (P=86.2, R=70.8, F=79.0, overall scores), but my reproduction is only (P=79.7, R=73.2, F=76.3, overall scores). I just git cloned the code from the GitHub repo in the Doc2EDAG paper and ran it without modifying the data preprocessing.

Compute requirements for training

Thanks for sharing. Are the GPU-count requirements and limits (e.g. "Tip: At least 4 * NVIDIA V100 GPU (32GB) cards are required to run GIT models.") purely a matter of training speed, or do they also affect the trained model's final performance?

Reducing the batch size or increasing gradient accumulation should let one train with fewer resources (at the cost of a longer run); has the author tried this, and does it change the final model's quality?
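On the second point, the toolkit already exposes a gradient_accumulation_steps setting (see the dump below), and the standard PyTorch pattern keeps the effective batch size while cutting peak memory. A generic sketch (assuming the model returns its loss directly):

import torch

def train_epoch(model, loader, optimizer, accum_steps=8):
    """Accumulate gradients over accum_steps micro-batches: the effective
    batch is accum_steps * micro-batch size at the memory cost of one."""
    model.train()
    optimizer.zero_grad()
    for step, batch in enumerate(loader):
        loss = model(batch) / accum_steps  # scale so gradients average, not sum
        loss.backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()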

Some questions

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

My problem is ...

You can reproduce the problem by ...

[screenshots: 2022-04-10 155050, 2022-04-10 155024]
Hello, when I run the program, some entity attributes cannot be recognized (for example, the company name is recognized but the court-ruling time is not, and this happens on many texts). What might the problem be? Above I've attached some outputs showing what I suspect; looking forward to your answer~

I have tried ..., but it goes to ...

I have checked the source codes, and the problem may come from ...

Environment

Environment Values
System Windows/Linux
GPU Device
CUDA Version
Python Version
PyTorch Version
dee (the Toolkit) Version

Full Log

Log:

About label statistics

**Idea sharing**
Two events may still be related after pruning; since two event cliques are not mutually exclusive, could the relation that previously existed between the two events be mined?

**Problems**
Is there a template of all the role and eventType labels (i.e. the name and fields labels in the figure)? I'd like to see how many classes there are and whether they can be extended.

**Others**

[screenshot]

About triggers

**Idea sharing**
While sharing what you want to do, make sure to protect your ideas.

**Problems**
If you have any questions about event extraction, make sure you have read the latest papers or searched on the Internet.
Hello, I'm glad there are open-source contributors like you; I've hit some problems in practice and would like to ask:
1. While constructing my own dataset I noticed an interesting phenomenon. With triggers annotated, I built the dataset with build_data.py using the two template definitions below, with add_triggers set to False in both cases. The results differ greatly: the second template's F1 is much higher. Do you know why?
2. Also, with so many TRIGGERS modes defined, which one is finally used? Is it chosen via the num_triggers parameter we set?
3. In the final evaluation some fields are null; are these automatically filtered out during evaluation?
Looking forward to your reply.

class MarryEvent(BaseEvent):
    NAME = "Marry"
    FIELDS = [
        "Trigger",
        'Marry_loc', 'Marry_wife', 'Marry_time', 'Marry_husband'
    ]

    TRIGGERS = {
        1: ['Marry_time'],  # importance: 0.9686967372778184
        2: ['Marry_husband', 'Marry_time'],  # importance: 0.9842342342342343
        3: ['Marry_husband', 'Marry_loc', 'Marry_time'],  # importance: 0.9887387387387387
        4: ['Marry_husband', 'Marry_loc', 'Marry_time', 'Trigger'],  # importance: 0.9887387387387387
        5: ['Marry_husband', 'Marry_loc', 'Marry_time', 'Marry_wife', 'Trigger'],  # importance: 0.9887387387387387
    }
    TRIGGERS['all'] = ['Marry_time', 'Marry_loc', 'Marry_husband', 'Marry_wife', 'Trigger']

    def __init__(self, recguid=None):
        super().__init__(self.FIELDS, event_name=self.NAME, recguid=recguid)
        self.set_key_fields(self.TRIGGERS)


class MarryEvent(BaseEvent):
    NAME = "Marry"
    FIELDS = ["Trigger", 'loc', 'wife', 'time', 'husband']

    TRIGGERS = {
        1: ["Trigger"],
        2: ["Trigger", 'loc'],
        3: ["Trigger", 'loc', 'wife'],
        4: ["Trigger", 'loc', 'wife', 'time'],
        5: ["Trigger", 'loc', 'wife', 'time', 'husband'],
    }
    TRIGGERS["all"] = ["Trigger", 'loc', 'wife', 'time', 'husband']

    def __init__(self, recguid=None):
        super().__init__(self.FIELDS, event_name=self.NAME, recguid=recguid)
        self.set_key_fields(self.TRIGGERS)
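On question 2, a reasonable guess (inferred from the num_triggers / eval_num_triggers names in the setting dump below, not from confirmed code) is that the configured trigger count simply indexes into TRIGGERS:

# Illustration only: how num_triggers plausibly selects the key-field set.
num_triggers = 2
key_fields = MarryEvent.TRIGGERS.get(num_triggers, MarryEvent.TRIGGERS["all"])
print(key_fields)  # ["Trigger", "loc"] for the second template above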

**Others**
Other things you may want to share or discuss.

Some usage questions

Thank you for your earlier guidance; that problem is now solved. I've read through your answers to other users, which are quite detailed, but as someone just getting started I still have a usage question. The first screenshot is an issue you answered; I don't quite understand the sentence "initialize a dee_task from the config file in the task dump, then load the trained model weights" (i.e. the actual usage workflow). Looking forward to your reply.
[screenshots: 2022-03-30 114442, 2022-03-30 114829]
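The quoted advice presumably reduces to the standard PyTorch checkpoint pattern; a hedged sketch (the checkpoint key layout is an assumption -- the resume helpers in dee/tasks/base_task.py may already do this automatically):

import torch

def load_trained_weights(model: torch.nn.Module, cpt_path: str) -> torch.nn.Module:
    # The "model_state" key is a guess; inspect the saved checkpoint file
    # to find where the state dict actually lives.
    cpt = torch.load(cpt_path, map_location="cpu")
    state = cpt.get("model_state", cpt) if isinstance(cpt, dict) else cpt
    model.load_state_dict(state)
    model.eval()
    return model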

where to download bert-base-chinese

Agreement

  • Fill the space in brackets with x to check the agreement items.
  • Before submitting this issue, I've fully checked the instructions in README.md.
  • Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.
  • This issue is about the toolkit itself, not Python, pip or other programming basics.
  • I understand that if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Hi, I am trying to train PTPCG. I downloaded bert-base-chinese from the Hugging Face website, but errors occurred, so I would like to get the BERT model you used, to reduce the chance of errors. Thanks for reading.

Q1: What is the difference between LSTMMTL and LSTMMTL2CompleteGraph? Q2: After adjusting the corresponding README parameters and setting the batch size to 16, running LSTMMTL still reports out of memory. I have one 2080 Ti (11 GB) and 70 MB of my own data; could you share a parameter set that runs on the 9 GB card mentioned in the paper? Thanks.

Using backend: pytorch
2022-02-23 15:51:43.263 | Level 20 | dee.tasks.base_task:logging:196 - ====================Check Setting Validity====================
2022-02-23 15:51:43.264 | Level 20 | dee.tasks.base_task:logging:196 - Setting: {
"data_dir": "./Data",
"model_dir": "./Exps/jiao/Model",
"output_dir": "./Exps/jiao/Output",
"bert_model": "bert",
"train_file_name": "typed_train.json",
"dev_file_name": "typed_dev.json",
"test_file_name": "typed_test.json",
"max_seq_len": 128,
"train_batch_size": 16,
"eval_batch_size": 2,
"learning_rate": 0.0001,
"num_train_epochs": 10,
"warmup_proportion": 0.1,
"no_cuda": false,
"local_rank": -1,
"seed": 99,
"gradient_accumulation_steps": 8,
"optimize_on_cpu": false,
"fp16": false,
"loss_scale": 128,
"cpt_file_name": "Doc2EDAG",
"summary_dir_name": "./Exps/jiao/Summary/Summary",
"event_type_template": "jiao",
"max_sent_len": 128,
"max_sent_num": 64,
"use_lr_scheduler": false,
"lr_scheduler_step": 20,
"use_bert": false,
"use_biaffine_ner": false,
"use_masked_crf": false,
"only_master_logging": true,
"resume_latest_cpt": true,
"remove_last_cpt": false,
"save_best_cpt": false,
"model_type": "Doc2EDAG",
"rearrange_sent": false,
"use_crf_layer": true,
"min_teacher_prob": 0.1,
"schedule_epoch_start": 10,
"schedule_epoch_length": 10,
"loss_lambda": 0.05,
"loss_gamma": 1.0,
"add_greedy_dec": true,
"use_token_role": true,
"seq_reduce_type": "MaxPooling",
"hidden_size": 768,
"dropout": 0.1,
"ff_size": 1024,
"num_tf_layers": 4,
"use_path_mem": true,
"use_scheduled_sampling": true,
"use_doc_enc": true,
"neg_field_loss_scaling": 3.0,
"gcn_layer": 3,
"ner_num_tf_layers": 4,
"num_lstm_layers": 1,
"use_span_lstm": false,
"span_lstm_num_layer": 1,
"use_span_att": false,
"span_att_heads": 4,
"dot_att_head": 4,
"comb_samp_min_num_span": 2,
"comb_samp_num_samp": 100,
"comb_samp_max_samp_times": 1000,
"use_span_lstm_projection": false,
"biaffine_hidden_size": 256,
"triaffine_hidden_size": 150,
"vi_max_iter": 3,
"biaffine_hard_threshold": 0.5,
"event_cls_loss_weight": 1.0,
"smooth_attn_loss_weight": 1.0,
"combination_loss_weight": 1.0,
"comb_cls_loss_weight": 1.0,
"comb_sim_loss_weight": 1.0,
"span_cls_loss_weight": 1.0,
"use_comb_cls_pred": false,
"role_loss_weight": 1.0,
"event_relevant_combination": false,
"run_mode": "full",
"drop_irr_ents": false,
"at_least_one_comb": true,
"include_complementary_ents": true,
"filtered_data_types": "o2o",
"ent_context_window": 20,
"biaffine_grad_clip": false,
"global_grad_clip": false,
"ent_fix_mode": "n",
"span_mention_sum": false,
"add_adj_mat_weight_bias": false,
"optimizer": "adam",
"num_triggers": 1,
"eval_num_triggers": 1,
"with_left_trigger": true,
"with_all_one_trigger_comb": false,
"directed_trigger_graph": false,
"adj_sim_head": 1,
"adj_sim_agg": "mean",
"adj_sim_split_head": false,
"num_triggering_steps": 1,
"use_shared_dropout_proj": false,
"use_layer_norm_b4_biaffine": false,
"remove_mention_type_layer_norm": false,
"use_token_drop": false,
"guessing_decode": false,
"max_clique_decode": true,
"try_to_make_up": false,
"self_loop": false,
"incremental_min_conn": -1,
"use_span_self_att": false,
"use_smooth_span_self_att": false,
"ment_feature_type": "plus",
"ment_type_hidden_size": 32,
"num_mention_lstm_layer": 1,
"gat_alpha": 0.2,
"gat_num_heads": 4,
"gat_num_layers": 2,
"role_by_encoding": false,
"use_mention_lstm": false,
"mlp_before_adj_measure": false,
"use_field_cls_mlp": false,
"build_dense_connected_doc_graph": false,
"stop_gradient": false,
"doc_lang": "zh"
}
2022-02-23 15:51:43.264 | Level 20 | dee.tasks.base_task:logging:196 - ====================Init Device====================
2022-02-23 15:51:43.296 | Level 20 | dee.tasks.base_task:logging:196 - device cuda n_gpu 2 distributed training False
2022-02-23 15:51:43.296 | Level 20 | dee.tasks.base_task:logging:196 - ====================Reset Random Seed to 99====================
2022-02-23 15:51:43.297 | Level 20 | dee.tasks.base_task:logging:196 - Init Summary Writer
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
2022-02-23 15:51:44.384 | Level 20 | dee.tasks.base_task:logging:196 - Writing summary into ./Exps/jiao/Summary/Summary-Feb23_15-51-43
2022-02-23 15:51:44.384 | Level 20 | dee.tasks.base_task:logging:196 - Initializing DEETask
file bert/config.json not found
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BertTokenizer'.
The class this function is called from is 'BertTokenizerForDocEE'.
[('Build', ['CompanyName', 'Product', 'Address', 'StartTime', 'Country'], {1: ['CompanyName'], 2: ['CompanyName', 'StartTime'], 3: ['CompanyName', 'Product', 'StartTime'], 4: ['Address', 'CompanyName', 'Product', 'StartTime'], 5: ['Address', 'CompanyName', 'Country', 'Product', 'StartTime'], 'all': ['CompanyName', 'Product', 'Address', 'StartTime', 'Country']}, 5), ('Violated', ['CompanyName', 'Law', 'StartTime', 'Address', 'Character'], {1: ['CompanyName'], 2: ['CompanyName', 'StartTime'], 3: ['Character', 'CompanyName', 'StartTime'], 4: ['Address', 'Character', 'CompanyName', 'StartTime'], 5: ['Address', 'Character', 'CompanyName', 'Law', 'StartTime'], 'all': ['CompanyName', 'Law', 'StartTime', 'Address', 'Character']}, 5)]
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.token_embedding.weight torch.Size([21128, 768]) 16226304
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.pos_embedding.weight torch.Size([128, 768]) 98304
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.trans_mat torch.Size([17, 17]) 289
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.hidden2tag.weight torch.Size([17, 768]) 13056
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.hidden2tag.bias torch.Size([17]) 17
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_query torch.Size([1, 768]) 768
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_cls.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_cls.bias torch.Size([2]) 2
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.0.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.0.bias torch.Size([2]) 2
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.1.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.1.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.2.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.2.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.3.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.3.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.4.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.4.bias torch.Size([2]) 2
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.0 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.1 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.2 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.3 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.4 torch.Size([1, 768]) 768
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_query torch.Size([1, 768]) 768
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_cls.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_cls.bias torch.Size([2]) 2
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.0.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.0.bias torch.Size([2]) 2
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.1.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.1.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.2.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.2.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.3.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.3.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.4.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.4.bias torch.Size([2]) 2
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.0 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.1 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.2 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.3 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.4 torch.Size([1, 768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.embedding.weight torch.Size([64, 768]) 49152
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.embedding.weight torch.Size([15, 768]) 11520
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: doc_context_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:__init__:377 - Trainable: field_context_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:__init__:389 - #Total Trainable Parameters: 63716682
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:__init__:390 - #Total Fixed Parameters: 0
2022-02-23 15:51:44.693 | Level 20 | dee.tasks.base_task:logging:196 - ====================Decorate Model====================
Traceback (most recent call last):
  File "/home/jiaojiaxin/DocEE/run_dee_task.py", line 208, in <module>
    parallel_decorate=in_argv.parallel_decorate,
  File "/home/jiaojiaxin/DocEE/dee/tasks/dee_task.py", line 392, in __init__
    self._decorate_model(parallel_decorate=parallel_decorate)
  File "/home/jiaojiaxin/DocEE/dee/tasks/base_task.py", line 474, in _decorate_model
    self.model.to(self.device)
  File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 612, in to
    return self._apply(convert)
  File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 381, in _apply
    param_applied = fn(param)
  File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 610, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
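
The traceback shows the failure happens inside _decorate_model, while model.to(self.device) is copying the 63,716,682 trainable parameters (about 0.25 GB in fp32) onto the GPU, i.e. before a single batch is processed, so the card is most likely already occupied by another process; checking nvidia-smi and pointing CUDA_VISIBLE_DEVICES at an idle card is usually the quickest fix. A minimal defensive sketch is shown below; the helper is hypothetical and not part of the toolkit, which calls model.to(device) directly:

import torch

def safe_to_device(model, device="cuda:0"):
    # Hypothetical helper, not part of DocEE. Older PyTorch versions raise a
    # plain RuntimeError whose message contains "out of memory" on CUDA OOM.
    try:
        return model.to(device)
    except RuntimeError as err:
        if "out of memory" not in str(err):
            raise  # a different CUDA error, re-raise unchanged
        torch.cuda.empty_cache()  # release cached blocks held by this process
        print(device, "is out of memory while loading the model; falling back to CPU")
        return model.to("cpu")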

Efficiency of PTPCG distributed training

** Idea sharing **
While sharing what you want to do, make sure to protect your ideas.

** Problems **
Referring to the commands used in other runs, I executed the following:

TASK_NAME='PTPCG_R1_reproduction'
CUDA='0,1,2,3'
NUM_GPU=4
MODEL_NAME='TriggerAwarePrunedCompleteGraph'


CUDA_VISIBLE_DEVICES=${CUDA} ./scripts/train_multi.sh ${NUM_GPU} --task_name ${TASK_NAME} \
    --use_bert=False \
    --bert_model='/data/xxl/roberta-base-chinese/' \
    --model_type=${MODEL_NAME} \
    --cpt_file_name=${MODEL_NAME} \
    --resume_latest_cpt=False \
    --save_cpt_flag=False \
    --save_best_cpt=True \
    --remove_last_cpt=True \
    --optimizer='adam' \
    --learning_rate=0.0005 \
    --dropout=0.1 \
    --gradient_accumulation_steps=8 \
    --train_batch_size=64 \
    --eval_batch_size=16 \
    --max_clique_decode=True \
    --num_triggers=1 \
    --eval_num_triggers=1 \
    --with_left_trigger=True \
    --directed_trigger_graph=True \
    --use_scheduled_sampling=True \
    --schedule_epoch_start=10 \
    --schedule_epoch_length=10 \
    --num_train_epochs=100 \
    --run_mode='full' \
    --skip_train=False \
    --filtered_data_types='o2o,o2m,m2m' \
    --re_eval_flag=False \
    --add_greedy_dec=False \
    --num_lstm_layers=2 \
    --hidden_size=768 \
    --biaffine_hidden_size=512 \
    --biaffine_hard_threshold=0.5 \
    --at_least_one_comb=True \
    --include_complementary_ents=True \
    --event_type_template='zheng2019_trigger_graph' \
    --use_span_lstm=True \
    --span_lstm_num_layer=2 \
    --role_by_encoding=True \
    --use_token_role=True \
    --ment_feature_type='concat' \
    --ment_type_hidden_size=32 \
    --parallel_decorate

As far as I can see, all of the assigned GPUs are indeed being used,

but the overall training speed has not improved (still around 20 minutes), and it is actually a bit slower than on a single GPU. I don't understand this part very well; am I missing something?
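
For context on why more GPUs do not automatically shorten an epoch: train_multi.sh presumably starts one process per GPU and wraps the model in PyTorch's DistributedDataParallel, roughly as in the sketch below. This is an illustrative reconstruction, not the toolkit's actual code; it assumes a launcher such as torchrun (or torch.distributed.launch with --use_env) that sets the LOCAL_RANK environment variable, and wrap_for_ddp is a hypothetical name.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

def wrap_for_ddp(model, dataset, per_rank_batch_size):
    # One process per GPU; the launcher sets LOCAL_RANK, MASTER_ADDR, etc.
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)
    model = DDP(model.cuda(local_rank), device_ids=[local_rank])
    # DistributedSampler gives each rank a disjoint shard, so the per-epoch
    # workload of each rank shrinks as the world size grows.
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=per_rank_batch_size, sampler=sampler)
    return model, loader

Under such a setup every step adds a gradient all-reduce across the 4 cards, and fixed per-epoch costs such as decoding and evaluation are not sharded, so with a small dataset or an unchanged per-rank batch size the communication overhead can cancel out the speedup from data sharding. That would be consistent with the roughly equal 20-minute timings reported here.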
