nezha_chinese_pytorch's Introduction

NeZha_Chinese_PyTorch

A PyTorch implementation of NEZHA, compatible with transformers.

Paper: NEZHA: Neural Contextualized Representation for Chinese Language Understanding

Dependencies

To run the example script, install the following modules:

  1. transformers>=4.1.1
  2. TorchBlocks

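Model weights download

Official TensorFlow weights: huawei-noah

Weights already converted to PyTorch are available at the following links:

Note: if you load PyTorch weights downloaded from the Baidu Netdisk links below, make sure your torch version is >=1.6.0.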
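
The version requirements stated above (transformers>=4.1.1 and, for the converted weights, torch>=1.6.0) can be checked before running anything. The following is a minimal sketch using only the standard library. The helper names `parse_version`, `meets_minimum`, and `check_requirements` are illustrative and not part of this repository. The torch requirement is presumably related to PyTorch 1.6 changing the default `torch.save` file format, but the page does not state the reason.

```python
from importlib import metadata


def parse_version(text):
    """Turn a version string such as '1.6.0+cu101' into (1, 6, 0).

    Local build tags after '+' and non-numeric suffixes are ignored.
    """
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, required):
    """Compare versions numerically, so '1.10.0' counts as newer than '1.6.0'."""
    return parse_version(installed) >= parse_version(required)


# Minimum versions stated in this README.
REQUIREMENTS = {"transformers": "4.1.1", "torch": "1.6.0"}


def check_requirements(requirements=REQUIREMENTS):
    """Return {package: problem description} for every unmet requirement."""
    problems = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, minimum):
            problems[name] = f"found {installed}, need >={minimum}"
    return problems


if __name__ == "__main__":
    for name, problem in check_requirements().items():
        print(f"{name}: {problem}")
```

An empty result from `check_requirements()` means both packages are installed at a sufficient version.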

Running

Run the command:

sh scripts/run_task_text_classification_chnsenti.sh

Long text

Long text can be handled by setting the config.max_position_embeddings parameter, which defaults to 512. For example:

config.max_position_embeddings=args.train_max_seq_length
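
The assignment above can be put in context with a small sketch. `StandInConfig` is a hypothetical stand-in for the real model config and carries only the one field discussed here, and `build_config` is an illustrative helper. Only `train_max_seq_length` and `max_position_embeddings` come from this README. The sketch assumes the value has to be set on the config before the model is built from it.

```python
import argparse
from dataclasses import dataclass


@dataclass
class StandInConfig:
    """Hypothetical stand-in for the model config (one field only)."""
    max_position_embeddings: int = 512  # default stated in this README


def build_config(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_max_seq_length", type=int, default=512)
    args = parser.parse_args(argv)

    config = StandInConfig()
    # Assumption: set this before constructing the model from the config.
    config.max_position_embeddings = args.train_max_seq_length
    return config


if __name__ == "__main__":
    config = build_config(["--train_max_seq_length", "1024"])
    print(config.max_position_embeddings)  # 1024
```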

Results

NEZHA(base-wwm)    chnsenti
tensorflow         94.75
pytorch            94.92

nezha_chinese_pytorch's People

Contributors

lonepatient

nezha_chinese_pytorch's Issues

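Question about the size of the fine-tuned model

Hello, I ran Hugging Face's run_mlm.py script to fine-tune, and the saved pytorch_model.bin for the NEZHA base model is over 1 GB. What causes this?
Comparing against Huawei's official modeling_nezha, I found that the difference is in the attention layer: yours has an extra relative_positions_encoding. Is that the reason?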
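
A rough back-of-the-envelope estimate suggests a stored position buffer could account for the extra size. This sketch assumes each layer saves a float32 buffer of shape (max_position_embeddings, max_position_embeddings, head_dim), and that the base model has 12 layers, hidden size 768, and 12 attention heads. Neither the buffer shape nor these model dimensions are confirmed on this page, and the function name is illustrative.

```python
def relative_position_buffer_bytes(max_positions, head_dim, num_layers,
                                   bytes_per_value=4):
    """Bytes used if every layer stores a (L, L, head_dim) float32 buffer."""
    per_layer = max_positions * max_positions * head_dim * bytes_per_value
    return per_layer * num_layers


# Assumed base-model shape: 12 layers, hidden size 768, 12 heads -> 64 per head.
total = relative_position_buffer_bytes(
    max_positions=512, head_dim=768 // 12, num_layers=12
)
print(total / 2**20, "MiB")  # 768.0 MiB
```

Under these assumptions the buffers alone add 768 MiB on top of the usual model parameters, which would be consistent with a file larger than 1 GB.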

What does www mean?

Hello, and thank you very much for this work. May I ask what "www" stands for in model names such as nezha-base-www? Is it meant to be Whole Word Masking?

transformers model hub support

I am wondering whether it would be possible to add the NEZHA Chinese model to the Hugging Face transformers repo, both code and parameters. NEZHA is widely used in the Chinese NLP community, and this would help a lot.

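Converting the checkpoint to PyTorch

I have a few questions:
1. My company network cannot download from a certain cloud drive. How can I convert the checkpoint provided by Huawei into a PyTorch bin file?
2. Once converted, can it be loaded directly with the transformers package?

Long text

After setting max_position_embedding > 512, I still cannot feed inputs longer than 512. How should I use this setting to handle long text?

Warning when loading this project's nezha-wwm weights with Huawei's original torch code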

  Weights from pretrained model not used in BertForPreTraining: 
['bert.encoder.layer.0.attention.self.relative_positions_encoding.positions_encoding', 
'bert.encoder.layer.1.attention.self.relative_positions_encoding.positions_encoding', 
'bert.encoder.layer.2.attention.self.relative_positions_encoding.positions_encoding',  
'bert.encoder.layer.3.attention.self.relative_positions_encoding.positions_encoding', 
'bert.encoder.layer.4.attention.self.relative_positions_encoding.positions_encoding', 
'bert.encoder.layer.5.attention.self.relative_positions_encoding.positions_encoding',   
........
 'cls.predictions.decoder.bias']

I would appreciate an explanation of why this happens.
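
One way to inspect a warning like this is to compare the key names in the checkpoint with the names the target model defines, since loaders generally report checkpoint keys with no matching name in the model as unused. The sketch below uses plain lists of key names as stand-ins for real state dicts. `split_unused_keys` is an illustrative helper, and the sample `model_keys` entry is hypothetical. It does not establish the cause of the warning.

```python
POSITION_SUFFIX = "relative_positions_encoding.positions_encoding"


def split_unused_keys(checkpoint_keys, model_keys):
    """Separate unused checkpoint keys into position buffers and the rest."""
    model_keys = set(model_keys)
    unused = [k for k in checkpoint_keys if k not in model_keys]
    position_buffers = [k for k in unused if k.endswith(POSITION_SUFFIX)]
    other = [k for k in unused if not k.endswith(POSITION_SUFFIX)]
    return position_buffers, other


if __name__ == "__main__":
    # Key names taken from the warning above, plus one hypothetical shared key.
    checkpoint_keys = [
        "bert.encoder.layer.0.attention.self." + POSITION_SUFFIX,
        "bert.encoder.layer.1.attention.self." + POSITION_SUFFIX,
        "cls.predictions.decoder.bias",
        "bert.embeddings.word_embeddings.weight",  # hypothetical shared key
    ]
    model_keys = ["bert.embeddings.word_embeddings.weight"]

    buffers, other = split_unused_keys(checkpoint_keys, model_keys)
    print(len(buffers), "position buffers;", "other unused:", other)
```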
