Comments (18)

TIANRENK commented on May 1, 2024

But when I print model.embeddings.token_type_embeddings, it shows Embedding(16, 768).

from transformers.

thomwolf commented on May 1, 2024

Which model are you loading?

TIANRENK commented on May 1, 2024

> Which model are you loading?

The pre-trained model chinese_L-12_H-768_A-12.

TIANRENK commented on May 1, 2024

My code:
bert_config = BertConfig.from_json_file('bert_config.json')
model = BertModel(bert_config)
model.load_state_dict(torch.load('pytorch_model.bin'))

The error:
RuntimeError: Error(s) in loading state_dict for BertModel:
size mismatch for embeddings.token_type_embeddings.weight: copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.
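
One way to narrow down a mismatch like this is to list every parameter whose shape differs between the checkpoint and the freshly built model before calling load_state_dict. The sketch below is only an illustration: `find_shape_mismatches` is a hypothetical helper, and the two dictionaries are hand-written stand-ins (the token type shapes come from the error above, the position embedding entry is an illustrative matching parameter). With PyTorch, such dictionaries could be built from the loaded checkpoint and from `model.state_dict()` with `{name: tuple(t.shape) for name, t in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters
    that exist on both sides but have different shapes."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Stand-in shapes (assumption: written by hand for this sketch).
checkpoint_shapes = {
    "embeddings.token_type_embeddings.weight": (16, 768),
    "embeddings.position_embeddings.weight": (512, 768),
}
model_shapes = {
    "embeddings.token_type_embeddings.weight": (2, 768),
    "embeddings.position_embeddings.weight": (512, 768),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt, current) in mismatches.items():
    print(f"{name}: checkpoint {ckpt} vs current model {current}")
```

Only the token type embedding is reported here, which matches the single size mismatch in the error message.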

thomwolf commented on May 1, 2024

I'm testing the Chinese model.
Are you using the config.json of chinese_L-12_H-768_A-12?
Can you send the contents of your config.json?

TIANRENK commented on May 1, 2024

> I'm testing the Chinese model.
> Are you using the config.json of chinese_L-12_H-768_A-12?
> Can you send the contents of your config.json?

In the config.json of chinese_L-12_H-768_A-12, type_vocab_size is 2. But even after I set config.type_vocab_size = 16, I still get the error.

TIANRENK commented on May 1, 2024

> I'm testing the Chinese model.
> Are you using the config.json of chinese_L-12_H-768_A-12?
> Can you send the contents of your config.json?

{
"attention_probs_dropout_prob": 0.1,
"directionality": "bidi",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pooler_fc_size": 768,
"pooler_num_attention_heads": 12,
"pooler_num_fc_layers": 3,
"pooler_size_per_head": 128,
"pooler_type": "first_token_transform",
"type_vocab_size": 2,
"vocab_size": 21128
}

I changed my code:
bert_config = BertConfig.from_json_file('bert_config.json')
bert_config.type_vocab_size = 16
model = BertModel(bert_config)
model.load_state_dict(torch.load('pytorch_model.bin'))

It still raises an error.
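
For reference, the embedding table shapes that a config implies can be worked out directly from the JSON. The following is a rough sketch, not the library's code: `expected_embedding_shapes` is a hypothetical helper, and it assumes each embedding table has shape (number of entries, hidden_size). Only the fields needed for that calculation are copied from the config above.

```python
import json

# Subset of the config shown above; the other fields are not needed here.
CONFIG_JSON = """
{
  "hidden_size": 768,
  "max_position_embeddings": 512,
  "type_vocab_size": 2,
  "vocab_size": 21128
}
"""


def expected_embedding_shapes(config):
    """Map embedding parameter names to the (rows, hidden_size) shape
    implied by the config dictionary."""
    hidden = config["hidden_size"]
    return {
        "embeddings.word_embeddings.weight": (config["vocab_size"], hidden),
        "embeddings.position_embeddings.weight": (config["max_position_embeddings"], hidden),
        "embeddings.token_type_embeddings.weight": (config["type_vocab_size"], hidden),
    }


config = json.loads(CONFIG_JSON)
shapes = expected_embedding_shapes(config)
for name, shape in shapes.items():
    print(name, shape)
```

With type_vocab_size set to 2, the token type table comes out as (2, 768), so a (16, 768) table suggests that the model or checkpoint was built with a different type_vocab_size than the one in this file.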

TIANRENK commented on May 1, 2024

> I see you have "type_vocab_size": 2 in your config file, how is that?

Yes, but I changed it in my code.

TIANRENK commented on May 1, 2024

> Is your pytorch_model.bin a properly converted model of the Chinese one (and not of an English one)?

I think it's good.

thomwolf commented on May 1, 2024

OK, I have the models. I think type_vocab_size should also be 2 for Chinese. I am wondering why it is 16 in your pytorch_model.bin.

TIANRENK commented on May 1, 2024

I have no idea. Did I convert my model incorrectly?

thomwolf commented on May 1, 2024

I am testing that right now. I haven't played with the multi-lingual models yet.

TIANRENK commented on May 1, 2024

> I am testing that right now. I haven't played with the multi-lingual models yet.

I'm also using it for the first time. I am looking forward to your test results.

TIANRENK commented on May 1, 2024

> I am testing that right now. I haven't played with the multi-lingual models yet.

I got this error when I was converting the model:

Traceback (most recent call last):
  File "convert_tf_checkpoint_to_pytorch.py", line 95, in <module>
    convert()
  File "convert_tf_checkpoint_to_pytorch.py", line 85, in convert
    assert pointer.shape == array.shape
AssertionError: (torch.Size([16, 768]), (2, 768))

thomwolf commented on May 1, 2024

Are you supplying a config file with "type_vocab_size": 2 to the conversion script?

TIANRENK commented on May 1, 2024

> Are you supplying a config file with "type_vocab_size": 2 to the conversion script?

I used the bert_config.json of chinese_L-12_H-768_A-12 when I was converting.

thomwolf commented on May 1, 2024

OK, I think I found the issue: your BertConfig is not built from the configuration file for some reason and thus uses the default value of type_vocab_size in BertConfig, which is 16.

This error happens on my system when I use config = BertConfig('bert_config.json') instead of config = BertConfig.from_json_file('bert_config.json').

I will make sure these two ways of initializing the configuration (from parameters or from a JSON file) cannot be mixed up.
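
The mix-up described above can be reproduced with a toy class. This is a minimal sketch, not the actual BertConfig implementation: `SketchConfig` and its `strict` flag are hypothetical, and the default type_vocab_size of 16 is taken from the comment above. Passing a file path to the constructor lands in the first positional parameter and leaves every other field at its default, while the classmethod actually reads the file. The `isinstance` guard is one possible way to make the mistake fail loudly; it is an assumption of this sketch, not a description of the fix made in the library.

```python
import json
import os
import tempfile


class SketchConfig:
    """Toy stand-in for a model config; not the real BertConfig."""

    def __init__(self, vocab_size, hidden_size=768, type_vocab_size=16, strict=False):
        # Optional guard: reject a path passed where a number is expected.
        if strict and isinstance(vocab_size, str):
            raise TypeError(
                "vocab_size is a string; use SketchConfig.from_json_file(path) to load a file"
            )
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.type_vocab_size = type_vocab_size

    @classmethod
    def from_json_file(cls, path):
        """Build a config from the values stored in a JSON file."""
        with open(path, encoding="utf-8") as f:
            values = json.load(f)
        return cls(
            vocab_size=values["vocab_size"],
            hidden_size=values["hidden_size"],
            type_vocab_size=values["type_vocab_size"],
        )


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bert_config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"vocab_size": 21128, "hidden_size": 768, "type_vocab_size": 2}, f)

    wrong = SketchConfig(path)                 # the path is silently stored as vocab_size
    right = SketchConfig.from_json_file(path)  # values come from the file

print(wrong.type_vocab_size)  # 16: the default, the file was never read
print(right.type_vocab_size)  # 2: the value from the file
```

With `strict=True`, `SketchConfig('bert_config.json', strict=True)` raises a TypeError instead of quietly building a config with default values.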

imxiaomin commented on May 1, 2024

RuntimeError: Error(s) in loading state_dict for BertModel: size mismatch for embeddings.token_type_embeddings.weight:
copying a param of torch.Size([16, 768]) from checkpoint, where the shape is torch.Size([2, 768]) in current model.

I have the same problem as you. Did you solve it?
