Comments (5)
I just converted the relative position encoding into a layer; it shouldn't affect anything.
from nezha_chinese_pytorch.
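A minimal sketch of what "converting it into a layer" can look like: since the sinusoidal relative-position table is deterministic (no learned parameters), it can be registered as a non-persistent buffer, so it never appears in a checkpoint's state dict and loading never expects it. The class and argument names here are illustrative, not the project's actual API.

```python
import torch
import torch.nn as nn

class RelativePositionsEncoding(nn.Module):
    """Sinusoidal relative-position table, precomputed once as a buffer."""

    def __init__(self, max_len: int, depth: int, max_relative_position: int = 64):
        super().__init__()
        # Clipped relative distances between all position pairs: (max_len, max_len)
        range_vec = torch.arange(max_len)
        distance = range_vec[None, :] - range_vec[:, None]
        distance = distance.clamp(-max_relative_position, max_relative_position)
        distance = distance + max_relative_position  # shift into [0, 2*max_rel]

        # Sinusoidal embedding for each possible clipped distance
        vocab = 2 * max_relative_position + 1
        position = torch.arange(vocab, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, depth, 2, dtype=torch.float)
            * (-torch.log(torch.tensor(10000.0)) / depth)
        )
        table = torch.zeros(vocab, depth)
        table[:, 0::2] = torch.sin(position * div_term)
        table[:, 1::2] = torch.cos(position * div_term)

        # persistent=False -> excluded from state_dict(), so checkpoints
        # neither save nor expect this tensor.
        self.register_buffer("positions_encoding", table[distance], persistent=False)

    def forward(self, seq_len: int) -> torch.Tensor:
        return self.positions_encoding[:seq_len, :seq_len, :]
```

With `persistent=False`, `state_dict()` contains no entry for the table, which is exactly why a checkpoint load has nothing to complain about.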
I used this project's code to load the nezha-large-wwm pretrained weights it provides, and I got the same message saying the relative position encoding layer's pretrained weights could not be used. Could you explain why this happens?
Some weights of the model checkpoint at ./pretrained_model/nezha-large-www/ were not used when initializing NeZhaForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.decoder.bias', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'bert.encoder.layer.0.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.1.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.2.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.3.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.4.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.5.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.6.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.7.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.8.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.9.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.10.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.11.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.12.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.13.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.14.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.15.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.16.attention.self.relative_positions_encoding.positions_encoding', 
'bert.encoder.layer.17.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.18.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.19.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.20.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.21.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.22.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.23.attention.self.relative_positions_encoding.positions_encoding']
- This IS expected if you are initializing NeZhaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing NeZhaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
from nezha_chinese_pytorch.
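For anyone hitting the same message: the listed keys fall into two groups, the pretraining heads under `cls.` and the recomputable relative-position buffers, and neither is needed for fine-tuning. A sketch of filtering them out before loading (`filter_checkpoint` is a hypothetical helper; the key names are taken from the log above):

```python
def filter_checkpoint(state_dict: dict) -> dict:
    """Return a copy without pretraining-head and position-buffer entries."""
    return {
        k: v
        for k, v in state_dict.items()
        if not k.startswith("cls.")
        and "relative_positions_encoding" not in k
    }

# Usage with key names from the warning log above:
checkpoint = {
    "cls.predictions.bias": 0,
    "bert.encoder.layer.0.attention.self.relative_positions_encoding.positions_encoding": 1,
    "bert.embeddings.word_embeddings.weight": 2,
}
kept = filter_checkpoint(checkpoint)
# Only the real model weight survives; the rest is safely discarded.
```

The same effect is what `load_state_dict(..., strict=False)` achieves implicitly, which is why the warning is informational rather than an error.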
The code has been updated; it was mainly an issue with the relative positions.
from nezha_chinese_pytorch.
The code has been updated; it was mainly an issue with the relative positions.
However, this warning still seems to appear, and I'm not sure why:
Some weights of the model checkpoint at /home/root1/DY/nezha-base-www were not used when initializing NeZhaForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing NeZhaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing NeZhaForMaskedLM from the checkpoint
from nezha_chinese_pytorch.
@bestpredicts You are using MLM, so where would the seq_next weights come from?
from nezha_chinese_pytorch.
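The reply above can be demonstrated with a toy pair of modules (the names are illustrative, not NeZha's real classes): a masked-LM model simply defines no parameter for the NSP head, so the checkpoint's `seq_relationship` entries have no destination and get reported as unexpected, which is harmless.

```python
import torch.nn as nn

class ToyPretraining(nn.Module):
    """Stand-in for a pretraining model with both MLM and NSP heads."""
    def __init__(self):
        super().__init__()
        self.predictions = nn.Linear(4, 4)       # MLM head
        self.seq_relationship = nn.Linear(4, 2)  # NSP head

class ToyMaskedLM(nn.Module):
    """Stand-in for a masked-LM model: MLM head only, no NSP head."""
    def __init__(self):
        super().__init__()
        self.predictions = nn.Linear(4, 4)

ckpt = ToyPretraining().state_dict()
model = ToyMaskedLM()
result = model.load_state_dict(ckpt, strict=False)
# result.unexpected_keys lists the NSP weights, mirroring the warning;
# nothing the MLM model actually needs is missing.
```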
Related Issues (13)
- Could you provide a script for converting to ONNX?
- The two calls to self.relative_positions_encoding[:to_seq_length, :to_seq_length, :].to(hidden_states.device) hurt performance badly
- transformers model hub support
- Is there an English version of NeZha?
- Could you upload the model weights to Hugging Face?
- Problem converting checkpoint to pytorch
- What does www mean?
- RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED
- Long text
- Question about the size of the fine-tuned model
- Could you provide code or a script for pretraining from scratch?
- Can I directly load NeZha weights with the BERT model from transformers?
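On the related performance complaint about the repeated `.to(hidden_states.device)` calls: one common fix (a sketch assuming the table is precomputed; names are illustrative) is to register the table as a module buffer, so that a single `model.to(device)` moves it along with the weights and no per-forward device transfer is needed.

```python
import torch
import torch.nn as nn

class CachedRelPos(nn.Module):
    """Keeps the relative-position table on the model's device via a buffer."""

    def __init__(self, max_len: int, depth: int):
        super().__init__()
        # Stand-in for the sinusoidal table; a registered buffer follows
        # model.to(device) automatically, unlike a plain tensor attribute.
        table = torch.randn(max_len, max_len, depth)
        self.register_buffer("positions_encoding", table, persistent=False)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        seq_len = hidden_states.size(1)
        # Already on the right device: no per-call .to() transfer needed.
        return self.positions_encoding[:seq_len, :seq_len, :]
```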