jasoncao11 / nlp-notebook
Implementations of common NLP tasks, including new-word discovery plus PyTorch-based word vectors, Chinese text classification, named entity recognition, abstractive text summarization, sentence-similarity judgment, triple extraction, pretrained models, and more.
License: MIT License
Hello, could you tell me the environment (dependency versions) used for the 4-1.Seq2seq project?
File "/workspace/nlp-notebook/4-3.Transformer/model.py", line 256, in forward
enc_src = self.encoder(src, src_mask)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/workspace/nlp-notebook/4-3.Transformer/model.py", line 35, in forward
src = self.dropout((self.tok_embedding(src) * self.scale) + self.pos_embedding(pos))
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 156, in forward
return F.embedding(
File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 1916, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
The Transformer generation model throws the error above. I have ruled out package-version problems. The Seq2Seq + attention model also has a bug, the same tuple-vs-tensor issue described in an earlier issue.
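This `IndexError` from `F.embedding` almost always means some token id in `src` is greater than or equal to the embedding table's `num_embeddings`, e.g. because the vocabulary was rebuilt on different data than the one the model was sized for. A minimal sketch of a check that surfaces the offending id instead of the opaque error (`check_ids` and the sizes below are illustrative, not from the repo):

```python
import torch
import torch.nn as nn

def check_ids(ids: torch.Tensor, emb: nn.Embedding) -> None:
    """Raise a readable error instead of the opaque IndexError."""
    max_id = int(ids.max())
    if max_id >= emb.num_embeddings:
        raise ValueError(
            f"token id {max_id} out of range for embedding table of size "
            f"{emb.num_embeddings}; rebuild the vocab or enlarge the table"
        )

emb = nn.Embedding(5000, 256)            # table sized for a 5000-word vocab
good = torch.randint(0, 5000, (2, 10))   # ids that fit
check_ids(good, emb)                     # passes silently
bad = torch.tensor([[4999, 6021]])       # 6021 came from a larger vocab
try:
    check_ids(bad, emb)
except ValueError as e:
    print(e)
```

Running `src.max()` against `INPUT_DIM`/`OUTPUT_DIM` right before the forward pass is usually enough to tell whether the data pipeline or the model configuration is at fault.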
I am currently a student and can only run experiments on my own laptop, whose GPU is a GTX 2060 with 6 GB, which is clearly not enough. I would be very grateful if anyone who knows the required setup could share it.
(pytorch18) z@z:~/code/nlp-notebook-master/3-2.Bert-CRF$ python demo_train.py
Some weights of the model checkpoint at ./bert-base-chinese were not used when initializing BertForNER: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
This IS expected if you are initializing BertForNER from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
This IS NOT expected if you are initializing BertForNER from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForNER were not initialized from the model checkpoint at ./bert-base-chinese and are newly initialized: ['transitions', 'hidden2label.weight', 'hidden2label.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[Train Epoch 0]: 0%| | 0/1584 [00:00<?, ?it/s]
Traceback (most recent call last):
File "demo_train.py", line 66, in <module>
run()
File "demo_train.py", line 53, in run
loss = model.neg_log_likelihood(input_ids, attention_mask, label_ids, real_lens)
File "/home/z/code/nlp-notebook-master/3-2.Bert-CRF/model.py", line 137, in neg_log_likelihood
feats = self.get_features(input_ids, attention_mask)
File "/home/z/code/nlp-notebook-master/3-2.Bert-CRF/model.py", line 53, in get_features
sequence_output, pooled_output = x.last_hidden_state, x.pooler_output
AttributeError: 'tuple' object has no attribute 'last_hidden_state'
The output is shown above. I also tried changing it to model.from_pretrained(model_path, output_hidden_states=True), but that did not help.
Where is the problem? My environment configuration is the same as yours.
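The `'tuple' object has no attribute 'last_hidden_state'` error is typically a `transformers` version mismatch: before v4, a model's forward pass returned a plain tuple by default, while this code expects the `ModelOutput` object with named fields (the v4 default, also reachable via `return_dict=True`). A hedged sketch of an unpacking helper that tolerates both styles (`unpack_bert_output` is a name introduced here for illustration; `SimpleNamespace` stands in for the real `ModelOutput`):

```python
from types import SimpleNamespace

def unpack_bert_output(x):
    """Return (sequence_output, pooled_output) from a BERT forward pass,
    whether it produced the old-style tuple (transformers < 4.0 default)
    or the newer ModelOutput object with named attributes."""
    if isinstance(x, tuple):
        return x[0], x[1]
    return x.last_hidden_state, x.pooler_output

# Stand-ins for the two output styles:
new_style = SimpleNamespace(last_hidden_state="seq", pooler_output="pool")
old_style = ("seq", "pool")
print(unpack_bert_output(new_style))  # ('seq', 'pool')
print(unpack_bert_output(old_style))  # ('seq', 'pool')
```

Alternatively, pinning the `transformers` version the repo was written against, or passing `return_dict=True` to the model call (supported from transformers 3.x onward), avoids the branch entirely.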
In the Seq2Seq model, after changing trg to a tensor type, the following error appears:
Traceback (most recent call last):
File "E:\nlp-notebook-master\4-1.Seq2seq\train_eval.py", line 54, in <module>
trg, src = trg.to(device), src.to(device)
AttributeError: 'NoneType' object has no attribute 'to'
What is going on here?
Hello, when running your p-tuning code I hit RuntimeError: _th_ceil_out not supported on CUDAType for Long. I suspect it is a version problem with the mlm_pytorch package. Could you tell me which versions and environment you used? Thanks!
Traceback (most recent call last):
File "G:/download/yg/nlp-notebook-master/5.PaperwithCode/3.P-tuning/train.py", line 30, in <module>
loss = model(batch_data[0], batch_data[1])
File "C:\Users\Adam-CVTeam\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "G:\download\yg\nlp-notebook-master\5.PaperwithCode\3.P-tuning\model.py", line 77, in forward
inputs_embeds = self.embed_input(queries) #[batch size, spell_length + x, hidden_size]
File "G:\download\yg\nlp-notebook-master\5.PaperwithCode\3.P-tuning\model.py", line 45, in embed_input
replace_embeds = self.prompt_encoder() #[spell_length, hidden_size]
File "C:\Users\Adam-CVTeam\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "G:\download\yg\nlp-notebook-master\5.PaperwithCode\3.P-tuning\prompt_encoder.py", line 53, in forward
output_embeds = self.mlm_head(input_embeds)[0].squeeze() # [9(sum(template)), hidden_size]
File "C:\Users\Adam-CVTeam\Anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\Adam-CVTeam\Anaconda\envs\pytorch\lib\site-packages\mlm_pytorch\mlm_pytorch.py", line 67, in forward
mask = get_mask_subset_with_prob(~no_mask, self.mask_prob)
File "C:\Users\Adam-CVTeam\Anaconda\envs\pytorch\lib\site-packages\mlm_pytorch\mlm_pytorch.py", line 23, in get_mask_subset_with_prob
mask_excess = (mask.cumsum(dim=-1) > (num_tokens * prob).ceil())
RuntimeError: _th_ceil_out not supported on CUDAType for Long
Process finished with exit code 1
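The traceback points at `(num_tokens * prob).ceil()` inside mlm_pytorch: `num_tokens` is an integer (Long) tensor, and `torch.ceil` is only defined for floating-point types, so older PyTorch builds raise this error on CUDA. Besides pinning compatible torch/mlm_pytorch versions, the usual workaround is to cast to float before `ceil`. A minimal sketch of the failing expression and the cast (the count `9` is illustrative):

```python
import torch

# Mirrors the expression in mlm_pytorch's get_mask_subset_with_prob:
# num_tokens is a Long tensor, and calling .ceil() on an integer result
# fails on older PyTorch builds ("_th_ceil_out not supported ... for Long").
num_tokens = torch.tensor([9], dtype=torch.long)
prob = 0.15

max_masked = (num_tokens.float() * prob).ceil().long()  # cast before ceil
print(max_masked)  # tensor([2])
```

Patching the installed mlm_pytorch source the same way (float-cast before `.ceil()`) is a quick local fix if upgrading is not an option.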
In 4-1.Seq2seq:
for trg, src in pbar:
trg, src = trg.to(device), src.to(device)
Here trg and src are of type tuple.
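If each batch comes off the DataLoader as a tuple of Python lists rather than stacked tensors, `.to(device)` fails exactly like this. One common fix, assuming the dataset yields `(trg, src)` pairs of token-id lists, is a `collate_fn` that pads each side into a `LongTensor` (`PAD_IDX` and the sample data below are illustrative, not the repo's actual values):

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

PAD_IDX = 0  # hypothetical padding id; use the project's real PAD index

def collate_fn(batch):
    """Turn a list of (trg, src) token-id-list pairs into two padded
    LongTensors, so the training loop can call .to(device) on them."""
    trg_seqs = [torch.tensor(t, dtype=torch.long) for t, _ in batch]
    src_seqs = [torch.tensor(s, dtype=torch.long) for _, s in batch]
    trg = pad_sequence(trg_seqs, batch_first=True, padding_value=PAD_IDX)
    src = pad_sequence(src_seqs, batch_first=True, padding_value=PAD_IDX)
    return trg, src

data = [([1, 2, 3], [4, 5]), ([6], [7, 8, 9])]
loader = DataLoader(data, batch_size=2, collate_fn=collate_fn)
for trg, src in loader:
    print(trg.shape, src.shape)  # torch.Size([2, 3]) torch.Size([2, 3])
```

Passing `collate_fn=collate_fn` when constructing the DataLoader makes `trg` and `src` real tensors, which also resolves the `'NoneType' object has no attribute 'to'` symptom if a custom collate was previously returning nothing.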
Hi author, I noticed today that the Lattice-LSTM code has disappeared from this repository. Could you upload it again?
The extraction code for the download link is incorrect.