Comments (26)

yanqiangmiffy avatar yanqiangmiffy commented on July 16, 2024 1

Environment in which NER ran successfully:

Package           Version
----------------- -------------------
certifi           2020.6.20
chardet           4.0.0
click             7.1.2
configparser      5.0.1
dataclasses       0.8
docker-pycreds    0.4.0
filelock          3.0.12
gitdb             4.0.5
GitPython         3.1.12
idna              2.10
joblib            1.0.0
numpy             1.19.5
pandas            1.1.5
Pillow            8.1.0
pip               20.2.4
promise           2.3
protobuf          3.14.0
psutil            5.8.0
pyltp             0.2.1
python-dateutil   2.8.1
pytorch-crf       0.7.2
pytz              2020.5
PyYAML            5.3.1
regex             2020.11.13
requests          2.25.1
sacremoses        0.0.43
scikit-learn      0.24.0
scipy             1.5.4
sentencepiece     0.1.94
sentry-sdk        0.19.5
setuptools        50.3.0.post20201006
shortuuid         1.0.1
six               1.15.0
smmap             3.0.4
subprocess32      3.5.4
threadpoolctl     2.1.0
tokenizers        0.7.0
torch             1.7.1
torchvision       0.8.2
tqdm              4.55.1
transformers      2.10.0
typing-extensions 3.7.4.3
urllib3           1.26.2
wandb             0.10.12
watchdog          1.0.2
wheel             0.35.1

from guwenbert.

Ethan-yt avatar Ethan-yt commented on July 16, 2024 1

After disabling CUDA, here is the error message:

  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/ethan/Projects/transformers/examples/multiple-choice/run_swag.py", line 405, in <module>
    main()
  File "/Users/ethan/Projects/transformers/examples/multiple-choice/run_swag.py", line 367, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/Users/ethan/Projects/transformers/src/transformers/trainer.py", line 996, in train
    tr_loss += self.training_step(model, inputs)
  File "/Users/ethan/Projects/transformers/src/transformers/trainer.py", line 1399, in training_step
    loss = self.compute_loss(model, inputs)
  File "/Users/ethan/Projects/transformers/src/transformers/trainer.py", line 1429, in compute_loss
    outputs = model(**inputs)
  File "/Users/ethan/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/ethan/Projects/transformers/src/transformers/models/roberta/modeling_roberta.py", line 1249, in forward
    return_dict=return_dict,
  File "/Users/ethan/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/ethan/Projects/transformers/src/transformers/models/roberta/modeling_roberta.py", line 805, in forward
    past_key_values_length=past_key_values_length,
  File "/Users/ethan/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/ethan/Projects/transformers/src/transformers/models/roberta/modeling_roberta.py", line 117, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)
  File "/Users/ethan/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/ethan/miniconda3/lib/python3.7/site-packages/torch/nn/modules/sparse.py", line 126, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/Users/ethan/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1852, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

Focus on this line:

  File "/Users/ethan/Projects/transformers/src/transformers/models/roberta/modeling_roberta.py", line 117, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)

The error comes from token_type_embeddings. Debugging shows that the input token_type_ids contain two values, 0 and 1.

Because RoBERTa dropped BERT's next sentence prediction task, it only supports a token_type_id of 0.
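The check described above can be sketched as follows. This is a minimal illustration, assuming the batch is a plain dict of nested Python lists and that the model's token-type embedding table has `type_vocab_size` rows (1 in the standard RoBERTa configuration; confirm the value in your own model's config). The function name `out_of_range_token_types` is hypothetical.

```python
from typing import Dict, List, Set


def out_of_range_token_types(
    batch: Dict[str, List[List[int]]], type_vocab_size: int = 1
) -> Set[int]:
    """Return the token_type_ids values that have no row in the embedding table."""
    seen = {tid for row in batch["token_type_ids"] for tid in row}
    # Valid indices are 0 .. type_vocab_size - 1. Anything else makes the
    # embedding lookup fail with "IndexError: index out of range in self".
    return {tid for tid in seen if tid < 0 or tid >= type_vocab_size}


# A BERT-style sentence-pair encoding marks the second segment with 1.
bert_style_batch = {
    "input_ids": [[101, 7, 8, 102, 9, 10, 102]],
    "token_type_ids": [[0, 0, 0, 0, 1, 1, 1]],
}
print(out_of_range_token_types(bert_style_batch, type_vocab_size=1))  # {1}
```

With `type_vocab_size=2` (the usual BERT setting) the same batch yields an empty set, which matches the observation in this thread that BERT-based checkpoints run without the error.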

Ethan-yt avatar Ethan-yt commented on July 16, 2024 1

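Solution: modify the data-loading code so that every token_type_id is set to 0.

Why other models work: their "RoBERTa" is not a real one; under the hood it is still BERT.

For more on token_type_id, see these references: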
https://huggingface.co/transformers/model_doc/roberta.html#transformers.RobertaTokenizer.create_token_type_ids_from_sequences
https://huggingface.co/transformers/glossary.html#token-type-ids
huggingface/transformers#1114
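A minimal sketch of that fix, assuming the data-loading code produces a dict of nested Python lists before tensors are built. The helper name `zero_token_type_ids` and the sample values are hypothetical and not taken from any script in this thread.

```python
from typing import Dict, List


def zero_token_type_ids(
    features: Dict[str, List[List[int]]]
) -> Dict[str, List[List[int]]]:
    """Return a copy of the features with every token_type_id set to 0."""
    fixed = dict(features)  # shallow copy; the caller's dict is left untouched
    fixed["token_type_ids"] = [[0] * len(ids) for ids in features["input_ids"]]
    return fixed


features = {
    "input_ids": [[1, 2, 3], [4, 5]],
    "token_type_ids": [[0, 1, 1], [0, 1]],
}
fixed = zero_token_type_ids(features)
print(fixed["token_type_ids"])  # [[0, 0, 0], [0, 0]]
```

If the batch already holds tensors, `torch.zeros_like(token_type_ids)` produces the same all-zero ids; either way, the zeroed ids must be the ones that are actually passed to the model.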

jackhuntcn avatar jackhuntcn commented on July 16, 2024 1

Thanks to the author. Are all of you above competing in the Haihua (海华) technical-track competition? :)

Lirsakura avatar Lirsakura commented on July 16, 2024 1

After disabling CUDA, here is the error message:

  [same traceback as in Ethan-yt's comment above]
IndexError: index out of range in self

Focus on this line:

  File "/Users/ethan/Projects/transformers/src/transformers/models/roberta/modeling_roberta.py", line 117, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)

The error comes from token_type_embeddings. Debugging shows that the input token_type_ids contain two values, 0 and 1.
Because RoBERTa dropped BERT's next sentence prediction task, it only supports a token_type_id of 0.

Thanks, I understand now.

Did you manage to solve the problem? Using token_type_ids = torch.zeros_like(token_type_ids) directly does not seem to work for me.

yanqiangmiffy avatar yanqiangmiffy commented on July 16, 2024

Same error here. It is probably a mismatch between the transformers version and the torch version.

Ethan-yt avatar Ethan-yt commented on July 16, 2024

Could you post your code?

yanqiangmiffy avatar yanqiangmiffy commented on July 16, 2024

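Could you post your code?

https://github.com/huggingface/transformers/tree/master/examples/multiple-choice

I ran into this problem before, though on an NER task. I only solved it by changing the transformers and torch versions, switching between different versions by trial and error, which was painful and tedious.
z814081807/DeepNER#1 (comment)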

Ethan-yt avatar Ethan-yt commented on July 16, 2024

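The error log shows an index out of bounds, so check the vocabulary and the training data. For example, are the sentences too long? If the data is fine, check the network architecture: AutoModel outputs a vector, so a classification task still needs an FFN layer on top.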
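The data checks suggested here could look roughly like this. It is a sketch under assumptions: the batch is a dict of nested Python lists, and `vocab_size` and `max_positions` are hypothetical parameters that you would read from your own model's configuration and tokenizer.

```python
from typing import Dict, List


def check_batch(
    batch: Dict[str, List[List[int]]], vocab_size: int, max_positions: int
) -> List[str]:
    """Collect readable descriptions of inputs that would overflow an embedding table."""
    problems = []
    for i, ids in enumerate(batch["input_ids"]):
        # Sequences longer than the model accepts overflow the position embeddings.
        if len(ids) > max_positions:
            problems.append(f"example {i}: length {len(ids)} exceeds {max_positions}")
        # Token ids outside the vocabulary overflow the word embeddings.
        bad = [t for t in ids if t < 0 or t >= vocab_size]
        if bad:
            problems.append(f"example {i}: token ids {bad} outside [0, {vocab_size})")
    return problems


toy_batch = {"input_ids": [[1, 2, 3], [4, 12, 5, 6, 7]]}
for line in check_batch(toy_batch, vocab_size=10, max_positions=4):
    print(line)
```

An empty result means the token ids and sequence lengths are within range, so the cause lies elsewhere, for example in token_type_ids.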

Daemon-ser avatar Daemon-ser commented on July 16, 2024

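The key point is that the code changes nothing but the model name. With chinese-bert-wwm or any other Chinese model it runs, but with ethanyt/guwenbert-base or large it fails. It may be the version issue that @yanqiangmiffy mentioned...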

Ethan-yt avatar Ethan-yt commented on July 16, 2024

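The key point is that the code changes nothing but the model name. With chinese-bert-wwm or any other Chinese model it runs, but with ethanyt/guwenbert-base or large it fails. It may be the version issue that @yanqiangmiffy mentioned...

CUDA errors can usually be debugged on the CPU. Running the same code on CPU only (remove model.cuda()) makes it easier to see where the index goes out of bounds.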
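As a rough illustration of this CPU debugging approach, the sketch below runs one forward pass with the model and every input tensor on the CPU. `TinyTypeEmbedding` is a hypothetical stand-in for a real model and `forward_on_cpu` is a hypothetical helper; only standard PyTorch calls are used.

```python
from typing import Dict

import torch
from torch import nn


class TinyTypeEmbedding(nn.Module):
    """Stand-in model (hypothetical): a token-type table with a single row."""

    def __init__(self) -> None:
        super().__init__()
        self.token_type_embeddings = nn.Embedding(1, 4)

    def forward(self, token_type_ids: torch.Tensor) -> torch.Tensor:
        return self.token_type_embeddings(token_type_ids)


def forward_on_cpu(model: nn.Module, batch: Dict[str, torch.Tensor]) -> torch.Tensor:
    """Run one forward pass with the model and all inputs on the CPU."""
    model = model.to("cpu")
    cpu_batch = {name: tensor.to("cpu") for name, tensor in batch.items()}
    with torch.no_grad():
        return model(**cpu_batch)


try:
    forward_on_cpu(TinyTypeEmbedding(), {"token_type_ids": torch.tensor([[0, 1]])})
except IndexError as err:
    # On the CPU the failure is an ordinary Python exception whose traceback
    # points at the embedding layer that received the bad index.
    print(f"IndexError: {err}")
```

The design choice is simply to move everything off the GPU for a single batch, so the traceback names the failing layer instead of reporting a device-side assert.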

Daemon-ser avatar Daemon-ser commented on July 16, 2024

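The key point is that the code changes nothing but the model name. With chinese-bert-wwm or any other Chinese model it runs, but with ethanyt/guwenbert-base or large it fails. It may be the version issue that @yanqiangmiffy mentioned...

CUDA errors can usually be debugged on the CPU. Running the same code on CPU only (remove model.cuda()) makes it easier to see where the index goes out of bounds.

OK.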

yanqiangmiffy avatar yanqiangmiffy commented on July 16, 2024

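The error log shows an index out of bounds, so check the vocabulary and the training data. For example, are the sentences too long? If the data is fine, check the network architecture: AutoModel outputs a vector, so a classification task still needs an FFN layer on top.

Right. At first I also assumed it was an array index out of bounds or GPU OOM, but in both tasks (NER and MRC) the error still occurred after I lowered bs and max_len. In the NER task every model failed; in MRC the current model fails. I eventually fixed it by adjusting the transformers and torch versions.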

Daemon-ser avatar Daemon-ser commented on July 16, 2024

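The error log shows an index out of bounds, so check the vocabulary and the training data. For example, are the sentences too long? If the data is fine, check the network architecture: AutoModel outputs a vector, so a classification task still needs an FFN layer on top.

Right. At first I also assumed it was an array index out of bounds or GPU OOM, but in both tasks (NER and MRC) the error still occurred after I lowered bs and max_len. In the NER task every model failed; in MRC the current model fails. I eventually fixed it by adjusting the transformers and torch versions.

I'm currently on torch 1.7 + transformers 3.4. I'll try downgrading transformers.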

yanqiangmiffy avatar yanqiangmiffy commented on July 16, 2024

May I ask which transformers version the author is currently using? I Googled this bug a lot earlier. The API (for example the tokenizer arguments) may have changed. Possible causes:

1. GPU OOM
2. huggingface OOM
3. max_seq_length (RuntimeError: cuda runtime error (59) : device-side assert triggered #97, huggingface/transformers#97)
4. API: "Assertion 'srcIndex < srcSelectDimSize' failed." when pre-training your own model with huggingface's transformers, and how to fix it

Daemon-ser avatar Daemon-ser commented on July 16, 2024

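May I ask which transformers version the author is currently using? I Googled this bug a lot earlier. The API (for example the tokenizer arguments) may have changed.

Indeed. Judging by the error it is an index out of bounds, quite possibly a tokenizer issue that makes the indices differ after encoding. A transformers update may have changed the tokenizer, so I'll try downgrading transformers.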

Daemon-ser avatar Daemon-ser commented on July 16, 2024

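May I ask which transformers version the author is currently using? I Googled this bug a lot earlier. The API (for example the tokenizer arguments) may have changed. Possible causes:

1. GPU OOM
2. huggingface OOM
3. max_seq_length (RuntimeError: cuda runtime error (59) : device-side assert triggered #97, huggingface/transformers#97)
4. API: "Assertion 'srcIndex < srcSelectDimSize' failed." when pre-training your own model with huggingface's transformers, and how to fix it

Thanks. After downgrading to 2.4 I get the same error. I'll take a closer look at these fixes.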

Daemon-ser avatar Daemon-ser commented on July 16, 2024

After disabling CUDA, here is the error message:

  [same traceback as in Ethan-yt's comment above]
IndexError: index out of range in self

Focus on this line:

  File "/Users/ethan/Projects/transformers/src/transformers/models/roberta/modeling_roberta.py", line 117, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)

The error comes from token_type_embeddings. Debugging shows that the input token_type_ids contain two values, 0 and 1.

Because RoBERTa dropped BERT's next sentence prediction task, it only supports a token_type_id of 0.

Thanks, I understand now.

Daemon-ser avatar Daemon-ser commented on July 16, 2024

Solution: modify the data-loading code so that every token_type_id is set to 0.

Why other models work: their "RoBERTa" is not a real one; under the hood it is still BERT.

For more on token_type_id, see these references:
https://huggingface.co/transformers/model_doc/roberta.html#transformers.RobertaTokenizer.create_token_type_ids_from_sequences
https://huggingface.co/transformers/glossary.html#token-type-ids
huggingface/transformers#1114

Were the RoBERTa models on huggingface all trained with NSP, or was NSP kept when training RoBERTa for the sake of API compatibility? Thank you!

Daemon-ser avatar Daemon-ser commented on July 16, 2024

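Thanks to the author. Are all of you above competing in the Haihua (海华) technical-track competition? :)

Yes. The data contains quite a lot of classical Chinese, so I want to see whether this BERT can improve the results.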

Ethan-yt avatar Ethan-yt commented on July 16, 2024

RoBERTa itself dropped the NSP task but still kept this embedding. Since the ids are all 0, it has no effect on the overall embedding.

Daemon-ser avatar Daemon-ser commented on July 16, 2024

RoBERTa itself dropped the NSP task but still kept this embedding. Since the ids are all 0, it has no effect on the overall embedding.

Got it, thanks.

Ethan-yt avatar Ethan-yt commented on July 16, 2024

I'm not taking part in this competition. If you do see an improvement, please share the comparison results. Looking forward to your good news :)

yanqiangmiffy avatar yanqiangmiffy commented on July 16, 2024

Solution: modify the data-loading code so that every token_type_id is set to 0.

Why other models work: their "RoBERTa" is not a real one; under the hood it is still BERT.

For more on token_type_id, see these references:
https://huggingface.co/transformers/model_doc/roberta.html#transformers.RobertaTokenizer.create_token_type_ids_from_sequences
https://huggingface.co/transformers/glossary.html#token-type-ids
huggingface/transformers#1114

Thanks to the author for the explanation.

Lirsakura avatar Lirsakura commented on July 16, 2024

Setting everything to 0 still doesn't solve the problem. Which versions should I be using?

Daemon-ser avatar Daemon-ser commented on July 16, 2024

After disabling CUDA, here is the error message:

  [same traceback as in Ethan-yt's comment above]
IndexError: index out of range in self

Focus on this line:

  File "/Users/ethan/Projects/transformers/src/transformers/models/roberta/modeling_roberta.py", line 117, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)

The error comes from token_type_embeddings. Debugging shows that the input token_type_ids contain two values, 0 and 1.
Because RoBERTa dropped BERT's next sentence prediction task, it only supports a token_type_id of 0.

Thanks, I understand now.

Did you manage to solve the problem? Using token_type_ids = torch.zeros_like(token_type_ids) directly does not seem to work for me.

I set them all to 0 as he described, and it works.
