Running python generate.py --model_type chatglm --size 7 starts up fine; the lora_weights path for chatglm is hard-coded in the script.
After entering an instruction, evaluation fails with:
Traceback (most recent call last):
File "/root/llm/Alpaca-CoT-main/generate.py", line 258, in
response = evaluate(instruction)
File "/root/llm/Alpaca-CoT-main/generate.py", line 212, in evaluate
output = tokenizer.decode(s)
File "/root/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b/fdb7a601d8f8279806124542e11549bdd76f62f6/tokenization_chatglm.py", line 276, in decode
if self.pad_token_id in token_ids: # remove pad
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
Printing s right before the failing output = tokenizer.decode(s) call shows it does hold a value:
Response:
The dtype of attention mask (torch.int64) is not bool
tensor([ 32313, 20107, 20125, 26054, 20109, 23384, 20104, 21833, 20007,
31121, 20104, 20532, 20109, 32475, 49321, 20100, 21029, 20007,
20004, 145875, 57010, 20012, 20004, 20150, 88230, 29668, 90663,
83831, 85119, 99903, 20004, 145875, 31034, 20012, 150001, 150004,
20483, 22739, 20142, 20372, 88230, 29668, 90663, 20103, 20142,
21224, 20006, 20120, 20134, 20236, 20103, 21008, 20208, 22095,
20012, 20004, 20004, 20009, 20007, 150009, 22999, 20142, 20372,
88230, 29668, 20102, 90085, 84121, 90663, 83823, 20004, 20010,
20007, 150009, 86246, 20058, 85119, 84052, 20062, 90959, 84140,
20006, 83984, 20058, 99903, 85119, 145907, 20004, 20013, 20007,
150009, 86977, 84121, 85119, 84086, 20006, 84111, 85964, 83824,
83995, 84015, 83824, 86299, 84015, 83835, 83823, 20004, 20016,
20007, 150009, 86246, 20058, 99903, 20062, 90997, 20006, 85749,
137200, 119854, 83966, 88230, 83823, 20004, 20004, 24400, 20120,
20127, 99903, 84192, 20006, 20142, 20372, 88230, 29668, 90663,
20134, 20113, 21554, 20103, 20142, 21224, 20102, 20120, 20134,
20113, 20477, 20103, 21506, 20142, 21224, 20207, 20142, 20372,
88230, 29668, 20007, 150005], device='cuda:0')
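A likely cause, given the traceback: s is a torch.Tensor (note device='cuda:0' above), while ChatGLM's custom tokenization_chatglm.py runs self.pad_token_id in token_ids, and that membership test can raise "Boolean value of Tensor with more than one value is ambiguous" when token_ids is a Tensor. A minimal sketch of a workaround (the variable names below mirror generate.py, but the patch itself is an assumption, not the repo's official fix) is to convert the tensor to a plain Python list before decoding, i.e. call tokenizer.decode(s.tolist()):

```python
import torch

# Hypothetical stand-in for the ids returned by model.generate();
# in generate.py, `s` is a CUDA tensor like the one printed above.
s = torch.tensor([32313, 20107, 20125, 150005])

pad_token_id = 3  # placeholder value; the real id comes from the tokenizer

# Workaround: hand the tokenizer a plain Python list, so the
# `pad_token_id in token_ids` check inside tokenization_chatglm.py
# runs on a list instead of a Tensor.
token_ids = s.tolist()

# On a list this is an ordinary bool, never an ambiguous Tensor:
print(pad_token_id in token_ids)
```

With this change the decode call becomes tokenizer.decode(s.tolist()); an equivalent alternative is to move the tensor off the GPU and convert it yourself before passing it in.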