Comments (5)

lonePatient avatar lonePatient commented on June 3, 2024

For the relative position encoding, I just converted it into a layer; it doesn't affect anything.

from nezha_chinese_pytorch.

suolyer avatar suolyer commented on June 3, 2024

I used this project's code to load the nezha-large-wwm pretrained weights it provides, and I get the same warning saying the pretrained weights of the relative position encoding layer cannot be used. Could you explain why this happens?

Some weights of the model checkpoint at ./pretrained_model/nezha-large-www/ were not used when initializing NeZhaForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.decoder.bias', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'bert.encoder.layer.0.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.1.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.2.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.3.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.4.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.5.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.6.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.7.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.8.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.9.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.10.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.11.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.12.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.13.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.14.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.15.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.16.attention.self.relative_positions_encoding.positions_encoding', 
'bert.encoder.layer.17.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.18.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.19.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.20.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.21.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.22.attention.self.relative_positions_encoding.positions_encoding', 'bert.encoder.layer.23.attention.self.relative_positions_encoding.positions_encoding']
- This IS expected if you are initializing NeZhaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing NeZhaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
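The `relative_positions_encoding.positions_encoding` entries in the warning are deterministic sinusoidal tables, so they don't actually need to come from the checkpoint. One way to make the warning go away for these keys is to rebuild the table at init time and register it as a non-persistent buffer, so it appears neither in the model's `state_dict` nor among the expected checkpoint keys. This is only a sketch, not the project's actual code; the class and argument names are hypothetical, modeled on the key names in the warning above:

```python
import math

import torch
import torch.nn as nn


class RelativePositionsEncoding(nn.Module):
    """Sinusoidal relative-position table in the style of NEZHA.

    The table is fully determined by (max_len, depth, max_relative_position),
    so it is rebuilt in __init__ and registered with persistent=False:
    it is then excluded from state_dict() and never expected in a checkpoint.
    """

    def __init__(self, max_len: int, depth: int, max_relative_position: int = 64):
        super().__init__()
        vocab_size = 2 * max_relative_position + 1
        # Relative distances j - i, clipped to [-k, k] and shifted to [0, 2k].
        range_vec = torch.arange(max_len)
        distance = range_vec[None, :] - range_vec[:, None]
        distance = distance.clamp(-max_relative_position, max_relative_position)
        final_mat = distance + max_relative_position
        # Standard sinusoidal embedding table of shape (vocab_size, depth).
        table = torch.zeros(vocab_size, depth)
        position = torch.arange(vocab_size, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, depth, 2, dtype=torch.float)
            * -(math.log(10000.0) / depth)
        )
        table[:, 0::2] = torch.sin(position * div_term)
        table[:, 1::2] = torch.cos(position * div_term)
        # persistent=False keeps the tensor out of state_dict entirely.
        self.register_buffer(
            "positions_encoding", table[final_mat], persistent=False
        )

    def forward(self, length: int) -> torch.Tensor:
        # (length, length, depth) slice of the precomputed table.
        return self.positions_encoding[:length, :length, :]
```

Since the buffer is recomputed on every instantiation, loading an older checkpoint that does contain these keys would still warn that they were "not used", but that leftover is harmless by the same logic as the `cls.*` head weights above.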

lonePatient avatar lonePatient commented on June 3, 2024

The code has been updated; the issue was mainly with the relative positions.

bestpredicts avatar bestpredicts commented on June 3, 2024

The code has been updated; the issue was mainly with the relative positions.

However, this warning still seems to appear, and I'm not sure why:

Some weights of the model checkpoint at /home/root1/DY/nezha-base-www were not used when initializing NeZhaForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing NeZhaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing NeZhaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

lonePatient avatar lonePatient commented on June 3, 2024

@bestpredicts You're using MLM, so where would the seq_next weights come from? An MLM-only model has no next-sentence-prediction head for the `cls.seq_relationship.*` checkpoint weights to load into.
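What happens here reduces to a set comparison: `from_pretrained` matches checkpoint key names against the target model's own `state_dict` keys, and anything left over is reported as "not used when initializing". A minimal sketch with hypothetical key names:

```python
# Minimal sketch (hypothetical key names): from_pretrained can only copy
# checkpoint entries whose names exist in the target model's state_dict;
# everything else is reported as "not used when initializing".
checkpoint_keys = {
    "bert.embeddings.word_embeddings.weight",
    "cls.predictions.decoder.weight",
    "cls.seq_relationship.weight",  # NSP head: absent from an MLM-only model
    "cls.seq_relationship.bias",
}
mlm_model_keys = {
    "bert.embeddings.word_embeddings.weight",
    "cls.predictions.decoder.weight",
}
unused = sorted(checkpoint_keys - mlm_model_keys)
print(unused)
```

So for NeZhaForMaskedLM the two `cls.seq_relationship.*` entries are expected leftovers and the warning is harmless.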
