
GPU usage of the model: is it possible to increase the max length of the text to 256 or 512? (issue on entity-recognition-and-relation-extraction, 8 comments, closed)

LiuHao-THU commented on June 26, 2024
What is the GPU usage of the model, and is it possible to increase the max length of the text to 256 or 512?

Comments (8)

X-jun-0130 commented on June 26, 2024

Yes, of course.

LiuHao-THU commented on June 26, 2024

> Yes, of course.

Thanks for your reply!

I tried to run the model with max_len = 256 on a 2080 Ti; the batch size can be set to 6 at most. But I don't quite understand the tf.round in the following code.

```python
def extra_sujects(self, ner_label):
    ner = ner_label[0]
    ner = tf.round(ner)
    ner = [tf.argmax(ner[k]) for k in range(ner.shape[0])]
    new_ner = list(np.array(ner))
    ner = list(np.array(ner))[1:-1]
    ner.append(0)  # guard against the last position not being 0
    text_list = [key for key in self.text]
    subject = []
    for i, k in enumerate(text_list):
        if int(ner[i]) == 0 or int(ner[i]) == 2:
            continue
        elif int(ner[i]) == 1:
            ner_back = [int(j) for j in ner[i + 1:]]
            if 1 in ner_back and 0 in ner_back:
                indics_1 = ner_back.index(1) + i
                indics_0 = ner_back.index(0) + i
                subject.append((''.join(text_list[i: min(indics_0, indics_1) + 1]), i + 1))
            elif 1 not in ner_back:
                indics = ner_back.index(0) + i
                subject.append((''.join(text_list[i:indics + 1]), i + 1))
    return subject, new_ner
```
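
For intuition, here is a minimal standalone sketch of the same span-decoding idea on a toy tag sequence. The function name `decode_subjects` and the 1 = begin / 2 = inside / 0 = outside convention are assumptions read off the loop above, not taken verbatim from the repo:

```python
def decode_subjects(tags, text):
    """Toy decoder: 1 = begin, 2 = inside, 0 = outside (assumed convention)."""
    tags = list(tags) + [0]            # sentinel so the last span is closed
    subjects = []
    for i, t in enumerate(tags[:-1]):
        if t != 1:                     # only a 1 can start a subject
            continue
        j = i + 1
        while tags[j] == 2:            # extend while still inside the entity
            j += 1
        subjects.append((text[i:j], i))
    return subjects

print(decode_subjects([1, 2, 2, 2, 0, 1, 2], "清华大学在北京"))
# [('清华大学', 0), ('北京', 5)]
```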

LiuHao-THU commented on June 26, 2024

The F1 increases only slowly after 30 epochs of training the joint model, using 30,000 training samples and 100 dev samples. Would it be better to change the learning rates of BERT and the relation extraction network?

```
Epoch: 1
Test set: F 0.000000: P 1.000000: R 0.000000:
Epoch: 2
Test set: F 0.017467: P 0.181818: R 0.017467:
Epoch: 3
Test set: F 0.281525: P 0.390244: R 0.281525:
Epoch: 4
Test set: F 0.529002: P 0.535211: R 0.529002:
saving_model
Epoch: 5
Test set: F 0.524731: P 0.493927: R 0.524731:
Epoch: 6
Test set: F 0.554745: P 0.590674: R 0.554745:
saving_model
Epoch: 7
Test set: F 0.668103: P 0.630081: R 0.668103:
saving_model
Epoch: 8
Test set: F 0.611973: P 0.592275: R 0.611973:
Epoch: 9
Test set: F 0.628821: P 0.600000: R 0.628821:
Epoch: 10
Test set: F 0.617169: P 0.624413: R 0.617169:
Epoch: 11
Test set: F 0.600462: P 0.604651: R 0.600462:
Epoch: 12
Test set: F 0.606593: P 0.582278: R 0.606593:
Epoch: 13
Test set: F 0.638695: P 0.649289: R 0.638695:
Epoch: 14
Test set: F 0.631090: P 0.638498: R 0.631090:
Epoch: 15
Test set: F 0.632258: P 0.595142: R 0.632258:
Epoch: 16
Test set: F 0.671024: P 0.639004: R 0.671024:
saving_model
Epoch: 17
Test set: F 0.635347: P 0.620087: R 0.635347:
Epoch: 18
Test set: F 0.657596: P 0.650224: R 0.657596:
Epoch: 19
Test set: F 0.647826: P 0.615702: R 0.647826:
Epoch: 20
Test set: F 0.653422: P 0.629787: R 0.653422:
Epoch: 21
Test set: F 0.665188: P 0.643777: R 0.665188:
Epoch: 22
Test set: F 0.654867: P 0.632479: R 0.654867:
Epoch: 23
Test set: F 0.644231: P 0.676768: R 0.644231:
Epoch: 24
Test set: F 0.632794: P 0.637209: R 0.632794:
Epoch: 25
Test set: F 0.641026: P 0.600000: R 0.641026:
Epoch: 26
Test set: F 0.684932: P 0.681818: R 0.684932:
saving_model
Epoch: 27
Test set: F 0.642032: P 0.646512: R 0.642032:
Epoch: 28
Test set: F 0.690265: P 0.666667: R 0.690265:
saving_model
Epoch: 29
Test set: F 0.708972: P 0.677824: R 0.708972:
saving_model
Epoch: 30
Test set: F 0.660422: P 0.674641: R 0.660422:
Epoch: 31
Test set: F 0.693157: P 0.668085: R 0.693157:
Epoch: 32
Test set: F 0.601695: P 0.559055: R 0.601695:
```

X-jun-0130 commented on June 26, 2024
1. "but I don't quite understand the tf.round in the following code."
   tf.round(0.6) = 1; tf.round(0.4) = 0.
   In the network, the softmax output is the probability of each label, not an integer, so the NER output values lie in (0, 1); they have to be converted to integers before subjects can be extracted.

2. "if it's better to change the learning rate of bert and relation extraction network?"
   I have tried different learning rates, but the results did not improve much. The network improves slowly because the output of the second-layer network is a relatively large and very sparse matrix, which makes training difficult. Without BERT, using only a conventional network such as an LSTM, training is even slower.

The authors of the original paper used more epochs when training the network, and of course more data as well.
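
To make the distinction concrete, a small sketch of the two operations being compared (assuming TensorFlow 2.x eager execution; note that tf.round rounds halves to even, so values near 0.5 can behave surprisingly):

```python
import tensorflow as tf

probs = tf.constant([0.1, 0.7, 0.2])    # one softmax row over three labels

print(tf.round(probs).numpy())   # [0. 1. 0.] -> a one-hot-like vector
print(tf.argmax(probs).numpy())  # 1          -> the integer label id
```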

LiuHao-THU commented on June 26, 2024
> 1. "but I don't quite understand the tf.round in the following code."
>    tf.round(0.6) = 1; tf.round(0.4) = 0.
>    In the network, the softmax output is the probability of each label, not an integer, so the NER output values lie in (0, 1); they have to be converted to integers before subjects can be extracted.
>
> 2. "if it's better to change the learning rate of bert and relation extraction network?"
>    I have tried different learning rates, but the results did not improve much. The network improves slowly because the output of the second-layer network is a relatively large and very sparse matrix, which makes training difficult. Without BERT, using only a conventional network such as an LSTM, training is even slower.
>
> The authors of the original paper used more epochs when training the network, and of course more data as well.

Hello, thanks for the reply.

1. You can extract the labels simply by taking argmax of the softmax output. With your approach, wouldn't an output like [0.4, 0.41, 0, ...] create an ambiguous either-or case?
2. When you tried different learning rates, did you tune them globally, or only in the second relation-extraction model?
3. I am now using 64,000 samples and a label-embedding network for extraction. At epoch 18 the maximum is around 75%, and it looks like it would keep rising with further training. Is the following description of yours perhaps a typo?

> The model was trained on 3,000 training samples and validated on the first 100 samples of the test set; the maximum F1 was 81.8%. The model is heavy, so it was not trained on much data; the score should still be improvable.
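
Regarding question 2, one common way to give BERT and the relation-extraction head different learning rates is to split the trainable variables between two optimizers. This is a hedged sketch, not what this repo does; `bert_vars`, `head_vars`, and `loss_fn` are hypothetical placeholders:

```python
import tensorflow as tf

# Hypothetical: bert_vars / head_vars are the trainable variables of the
# BERT encoder and the relation-extraction head; loss_fn computes the loss.
bert_opt = tf.keras.optimizers.Adam(learning_rate=2e-5)  # small LR for BERT
head_opt = tf.keras.optimizers.Adam(learning_rate=1e-3)  # larger LR for the head

def train_step(loss_fn, bert_vars, head_vars):
    with tf.GradientTape() as tape:
        loss = loss_fn()
    grads = tape.gradient(loss, bert_vars + head_vars)
    bert_opt.apply_gradients(zip(grads[:len(bert_vars)], bert_vars))
    head_opt.apply_gradients(zip(grads[len(bert_vars):], head_vars))
    return loss
```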

X-jun-0130 commented on June 26, 2024

1. argmax([0.2, 0.3, 0.4, 0.5, 0.3]) = 3, whereas tf.round([0.2, 0.3, 0.4, 0.5, 0.3]) = [0, 0, 0, 1, 0]. The latter form is what I need: the NER sequence.
2. I ran a few learning-rate tests on both the joint extraction model and the standalone relation-extraction model, but did not investigate it in much detail.
3. It is not 3,000 but 30,000; that was a typo. Thanks for pointing it out.

LiuHao-THU commented on June 26, 2024

```python
class Ner_model(tf.keras.Model):
    def __init__(self, bert_model):
        super(Ner_model, self).__init__()
        self.bert = bert_model
        # self.dense_fuc = tf.keras.layers.Dense(100, use_bias=False)  # fully connected layer
        self.dense = tf.keras.layers.Dense(label_class)

    def call(self, inputs, mask, segment):
        output_encode, _ = self.bert([inputs, mask, segment])
        # x = self.dense_fuc(output_encode)
        x = self.dense(output_encode)
        x = tf.nn.softmax(x)  # <- the line in question
        return x, output_encode
```

Hello, thanks for the reply.

I still think there is a problem with your explanation.

Here, the shape after tf.nn.softmax(x) should be batch * max_len * num_classes_of_entity.

For your rounding approach to make sense, the preceding activation should be a sigmoid instead.

If it is a softmax, the best choice is to apply the softmax (and then argmax) directly over num_classes_of_entity.
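
A small sketch of that point (the shapes below are toy values assumed for illustration): with a softmax over the class axis, decoding reduces to an argmax over the last dimension, with no rounding step needed:

```python
import tensorflow as tf

batch, max_len, num_classes = 2, 5, 3            # assumed toy shapes
logits = tf.random.normal((batch, max_len, num_classes))

probs = tf.nn.softmax(logits, axis=-1)           # batch * max_len * num_classes
labels = tf.argmax(probs, axis=-1)               # batch * max_len integer tags
print(labels.shape)                              # (2, 5)
```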

X-jun-0130 commented on June 26, 2024

```python
ner = ner_label[0]
ner = tf.round(ner)  # you are right, this step is redundant and can be removed
ner = [tf.argmax(ner[k]) for k in range(ner.shape[0])]
ner = list(np.array(ner))[1:-1]
```
