
t5-pegasus's People

Contributors

zhuiyitechnology


t5-pegasus's Issues

DataLossError: Checksum does not match

Error: tensorflow.python.framework.errors_impl.DataLossError: Checksum does not match: stored 592068290 vs. calculated on the restored bytes 517592881
The weights were downloaded from the cloud-drive link; loading Google's mT5 works fine.
GPU: V100 32G

I looked into this but could not find the cause. The error is raised inside the load_variable function; I am not sure whether the t5-pegasus checkpoint itself is the problem.
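A checksum mismatch on restore usually means the downloaded checkpoint file is corrupted or incomplete rather than a problem with the model code. A first step is to re-download and compare the file's hash against the one published for the release. The helper below is a generic sketch (the function name is made up, and the hash to compare against must come from the model's distribution page):

```python
import hashlib

def md5sum(path, chunk_size=1 << 20):
    """MD5 of a file, read in chunks, for comparing a download against a published hash."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()
```

If the hash of the local `model.ckpt.data-*` file differs between two downloads, the transfer (not the checkpoint) is at fault.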

Fine-tuning on small samples

Fine-tuning on a few thousand samples on a single GPU is very slow: there is heavy CPU computation while GPU memory usage stays low. Why is that?

How can I use multiple GPUs for training in finetune?

Hello, I copied the relevant code from train.py into finetune.py, but training fails with the error below. What do I need to change to train correctly?
Thanks

InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: TypeError: generator yielded an element that could not be converted to the expected type. The expected type was float32, but the yielded element was [array([[ 101, 2349, 25480, ..., 9172, 16054, 102],
[ 101, 2335, 5088, ..., 4934, 31621, 102],
[ 101, 2349, 25480, ..., 18312, 5661, 102],
[ 101, 2349, 25480, ..., 33732, 11511, 102]]), array([[ 101, 22191, 27209, 41412, 31201, 8506, 42696, 31201, 5661,

Inference script

Hello, after fine-tuning on my own small dataset I obtained best_model.weights. Could you provide a sample inference script? Thanks!

Changing the input to a multi-document format

Following the paper "Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering"
(https://arxiv.org/pdf/2007.01282.pdf),
I would like to modify your code so that the input is multiple docs: each doc is passed through the encoder separately, the representations are concatenated, and the result is fed to the decoder. Code below:

input_ids = Input(shape=(max_padding_len, max_c_len), name='INPUT_contents_ids', dtype=tf.int32)
input_ans_ids = Input(shape=(max_a_len,), name='INPUT_ans_ids', dtype=tf.int32)
input_ids_reshape = K.reshape(input_ids,(-1, max_c_len))  # (bs*max_padding_len, max_seq_len)

t5 = build_transformer_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path,
    model='t5.1.1',
    return_keras_model=False,
    name='T5',
)
encoder = t5.encoder
decoder = t5.decoder
resp = encoder(input_ids_reshape)  # (bs*max_padding_len, max_c_len, 512)
resp_concat = K.reshape(resp, (-1, max_padding_len * max_c_len, 512))   # (bs, max_padding_len*max_c_len, 512)
out = decoder([resp_concat, input_ans_ids])

output = CrossEntropy(1)([input_ans_ids, out])
model = Model(inputs=[input_ids, input_ans_ids], outputs=output)

This raises:
AttributeError: 'NoneType' object has no attribute '_inbound_nodes'
Could it be because decoder is a Model attribute?
How can I fix this so that new tensors can be passed into the decoder?
Thanks
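The '_inbound_nodes' error typically comes from calling K.reshape directly on a symbolic tensor: in Keras, backend ops are not layers, so they must be wrapped, e.g. `Lambda(lambda x: K.reshape(x, (-1, max_c_len)))(input_ids)`, for the graph to trace them. Independently of that fix, the intended shape bookkeeping can be sanity-checked without Keras; a minimal numpy sketch with made-up dimensions:

```python
import numpy as np

# made-up dimensions for illustration only
bs, max_padding_len, max_c_len, hidden = 2, 3, 8, 512

input_ids = np.zeros((bs, max_padding_len, max_c_len), dtype=np.int32)
# flatten the docs into the batch axis before the shared encoder
flat_ids = input_ids.reshape(-1, max_c_len)         # (bs*max_padding_len, max_c_len)
enc_out = np.zeros(flat_ids.shape + (hidden,))      # per-doc encoder output
# concatenate the per-doc representations along the sequence axis
resp_concat = enc_out.reshape(bs, max_padding_len * max_c_len, hidden)
```

If these shapes line up, the remaining work is only wrapping each reshape in a Lambda layer so Keras can build the model graph.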

KeyError: 'mt5.1.1'

Running the finetune script raises the following error; how can it be fixed?

(screenshot of the KeyError traceback)

Both the base and small versions raise this error.

Error loading the chinese_t5_pegasus_base pretrained model when selecting the T5 model for an NER task with bert4keras

Hi Su, first of all thanks for sharing!

I have recently been using bert4keras for a named entity recognition task, with this script:
https://github.com/bojone/bert4keras/blob/master/examples/task_sequence_labeling_ner_crf.py

python3.6.9
tensorflow-gpu 1.14.0
keras 2.3.1
bert4keras 0.10.7

I want to try using the T5 model with this script, but loading the pretrained model chinese_t5_pegasus_base raises the following error:

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Traceback (most recent call last):
File "task_sequence_labeling_ner_crf_conlleval.py", line 155, in
model = build_transformer_model(config_path, checkpoint_path, model=is_albert)
File "/big_disk/ner_2/bert4keras/models.py", line 2451, in build_transformer_model
transformer.load_weights_from_checkpoint(checkpoint_path)
File "/big_disk/ner_2/bert4keras/models.py", line 305, in load_weights_from_checkpoint
raise e
File "/big_disk/ner_2/bert4keras/models.py", line 299, in load_weights_from_checkpoint
values.append(self.load_variable(checkpoint, v))
File "/big_disk/ner_2/bert4keras/models.py", line 1763, in load_variable
variable = super(T5_Base, self).load_variable(checkpoint, name)
File "/big_disk/ner_2/bert4keras/models.py", line 270, in load_variable
return tf.train.load_variable(checkpoint, name)
File "/big_disk/venv36/lib/python3.6/site-packages/tensorflow/python/training/checkpoint_utils.py", line 84, in load_variable
return reader.get_tensor(name)
File "/big_disk/venv36/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 678, in get_tensor
return CheckpointReader_GetTensor(self, compat.as_bytes(tensor_str))
tensorflow.python.framework.errors_impl.NotFoundError: Key encoder/final_layer_norm/scale not found in checkpoint

Can this script be used with the T5 model for entity recognition? How should the error when loading the pretrained chinese_t5_pegasus_base model be resolved?

Thanks again!

How were the BERT results obtained?

(screenshot of the results table)
How were the BERT results in the experiment shown above obtained? Can BERT be used directly as a generative model, or was LCSTS treated as an extractive task?

model.load_weights('./best_model.weights') raises 'str' object has no attribute 'decode'

Basic info

Operating system: Windows
Python version: 3.6.0
TensorFlow version: 1.14.0
Keras version: 2.3.1
bert4keras version: 0.10.0
Plain keras or tf.keras:
Pretrained model: T5_pegasus

Core code

Running finetune.py works and produces the weights best_model.weights (fine-tuned on the CSL dataset; the ROUGE score is 5-6% lower than reported in the blog post). For inference I wrote a prediction script based on the fine-tuning script; the model-loading code is:

t5 = build_transformer_model(
    config_path=config_path,
    checkpoint_path=None,
    model='t5.1.1',
    return_keras_model=False,
    name='T5',
)

encoder = t5.encoder
decoder = t5.decoder
model = t5.model
model.summary()
model.load_weights(save_mode_path, by_name=False,
                   skip_mismatch=False, reshape=False)

Output

Traceback (most recent call last):
File "F:/pycharm_projrct/t5-pegasus-main/preditc_01.py", line 130, in <module>
skip_mismatch=True, reshape=False)
File "E:\Anaconda3\envs\tens1\lib\site-packages\keras\engine\saving.py", line 492, in load_wrapper
return load_function(*args, **kwargs)
File "E:\Anaconda3\envs\tens1\lib\site-packages\keras\engine\network.py", line 1227, in load_weights
reshape=reshape)
File "E:\Anaconda3\envs\tens1\lib\site-packages\keras\engine\saving.py", line 1262, in load_weights_from_hdf5_group_by_name
original_keras_version = f.attrs['keras_version'].decode('utf8')
AttributeError: 'str' object has no attribute 'decode'

What I tried

  • Tried changing the load_weights arguments to by_name=True, skip_mismatch=True; it did not seem to help.
  • Tried saving the model as a checkpoint instead:
    # model.save_weights('./best_model.weights')  # save the model
    t5.save_weights_as_checkpoint('./best_model.weights')  # save the model
    The model then loads normally, but the predictions come out as empty strings.
    Could you take a look, Su?
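The `.decode` failure above is a known incompatibility between Keras 2.3.1 and h5py >= 3.0: newer h5py already returns HDF5 string attributes as str, and Keras then calls .decode() on a str. Pinning h5py below 3 (`pip install "h5py<3.0"`) usually resolves it. The version-agnostic handling amounts to the sketch below (this is illustrative, not Keras's actual code):

```python
def decode_attr(value):
    """Return a str whether the HDF5 attribute arrived as bytes (h5py < 3)
    or already as str (h5py >= 3)."""
    return value.decode("utf-8") if isinstance(value, bytes) else value
```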

Tokenizer question

If you rebuilt the vocabulary, doesn't that mean the token embeddings of the T5 pretrained model can no longer be used, since the tokens in the vocabulary are new and their positions have changed? I have never been clear on this.

Missing self.last_token function

Hello, finetune.py reports that self.last_token is missing:

class AutoTitle(AutoRegressiveDecoder):
    """seq2seq decoder
    """
    @AutoRegressiveDecoder.wraps(default_rtype='probas')
    def predict(self, inputs, output_ids, states):
        c_encoded = inputs[0]
        return self.last_token(decoder).predict([c_encoded, output_ids])
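last_token was added to AutoRegressiveDecoder in later bert4keras releases, so upgrading bert4keras should make this script run. Functionally, it wraps a model so that only the output at the final sequence position is returned; the idea reduces to the slice below (a pure-Python sketch of the behavior, not the bert4keras implementation):

```python
def last_token_output(step_outputs):
    """From per-step decoder outputs shaped (batch, seq_len, vocab),
    keep only the distribution at the final position of each sequence."""
    return [sequence[-1] for sequence in step_outputs]
```

During autoregressive decoding only that final-position distribution is needed to pick the next token, which is why the helper exists.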

train

The pretraining script train.py saves a .weights file, but finetune uses a .ckpt file. How is one converted into the other?

The tokenizer cannot tell when to stop

The sentencepiece tokenizer that T5 originally uses is indeed not a great fit for Chinese, but BertTokenizer treats the inter-sentence separator "[SEP]" as the end-of-sequence marker as well (sequence end should be something like "[EOS]"). So for generation tasks you either designate "[SEP]" as the stop token outright, or you have to generate up to the maximum length; adding a dedicated stop token manually would require a large amount of corpus support. This is clearly unfavorable for generation. Is there a good solution?
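The first option described above, treating the tokenizer's [SEP] id as the end-of-sequence token during decoding, is the common workaround. A minimal greedy loop under that convention (step_fn and the token ids here are hypothetical stand-ins for the real model and vocabulary):

```python
def greedy_decode(step_fn, start_id, sep_id, max_len):
    """Generate token ids one at a time, stopping when the model emits
    sep_id (used as EOS) or when max_len is reached."""
    ids = [start_id]
    for _ in range(max_len):
        next_id = step_fn(ids)  # model picks the most likely next token
        if next_id == sep_id:
            break
        ids.append(next_id)
    return ids[1:]  # drop the start token
```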

How can a tensor in compute_loss (e.g. y_true) be converted to an array?

I tried both tensor.eval() and tensor.numpy(); neither works.
The first roughly complains that a session is missing; after creating a new session, it complains that the tensor is not in the original session.
As for the second, tensors of type tensorflow.python.framework.ops.EagerTensor can be converted, but the tensors in the code are of type tensorflow.python.framework.ops.Tensor, and adding tf.enable_eager_execution() does not help either.
Any help appreciated.

bert4keras 0.11+ requires a different loading call

config_path = '/root/bert/chinese_t5_pegasus_base/config.json'
checkpoint_path = '/root/bert/chinese_t5_pegasus_base/model.ckpt'

build_transformer_model(
    config_path=config_path,
    checkpoint_path=checkpoint_path,
    model='t5.1.1',
    return_keras_model=False,
    name='T5',
)
The loading call needs to change to model='mt5.1.1'.

Why do generated results often loop, repeating the last few clauses?

I used finetune only, not train.
I annotated 10k training examples myself.

Input: First we click File in the menu bar, at the top right; then in the window that pops up there is an Options entry. It may be at the very bottom for some, near the top for others. Click Options, then we choose Proofing (make sure it is Proofing). Under Proofing, we find the spot "When correcting spelling and grammar in Word"; below it are "Check spelling as you type" and "Mark grammar errors as you type". We untick those two checkboxes. Once both ticks are removed, only this pop-up remains below; now I have ticked it again and it is hidden. Now we click OK.

Output: Click File in the menu bar at the top right; in the pop-up window, click Options, choose Proofing, choose Proofing; check spelling as you type, mark grammar errors as you type; untick the two boxes; the ticked pop-up is hidden; click OK, the Proofing tick is there; click OK, the Proofing tick is gone; click OK, the Proofing tick is gone; click OK, the Proofing tick is there; click OK, the Proofing tick is there; click OK, the Proofing tick is there; click OK, the Proofing tick is gone
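Looping repetition like this is a common failure mode of autoregressive decoding, especially on noisy speech transcripts. Besides cleaning the training data, one standard mitigation (not part of the original finetune.py) is to forbid the decoder from re-emitting an n-gram it has already produced; a sketch of that check:

```python
def repeats_ngram(ids, candidate, n=3):
    """True if appending `candidate` to `ids` would reproduce an n-gram
    that already occurs in `ids` (the no-repeat-n-gram heuristic)."""
    if len(ids) < n:
        return False
    new_ngram = tuple(ids[-(n - 1):]) + (candidate,)
    seen = {tuple(ids[i:i + n]) for i in range(len(ids) - n + 1)}
    return new_ngram in seen
```

During beam or greedy search, candidate tokens for which repeats_ngram returns True are masked out before the next token is chosen.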

Running train.py fails at model.fit(dataset, steps_per_epoch=1000, ...) with AttributeError: 'DatasetV1Adapter' object has no attribute 'ndim'

Hello, my environment is the same as yours. When running train.py, it fails at the final model.fit(dataset, steps_per_epoch=1000, ...) call with AttributeError: 'DatasetV1Adapter' object has no attribute 'ndim'. Here dataset is a 'DatasetV1Adapter'. What is this 'ndim'? Is it supposed to be an attribute of DatasetV1Adapter, or is something else causing it?
