Comments (13)
Here's the situation: I'm using the merged model as the base model for fine-tuning, and it fails with the error below.
It concerns MAX_STEPS being set to None.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:22<00:00, 11.04s/it]
Downloading and preparing dataset json/default to /root/.cache/huggingface/datasets/json/default-5488fd0b86b9abc9/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51...
Downloading data files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 473.18it/s]
Extracting data files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 42.30it/s]
Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/default-5488fd0b86b9abc9/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51. Subsequent calls will reuse this data.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 45.51it/s]
trainable params: 4194304 || all params: 6889689088 || trainable%: 0.060877986603275876
Traceback (most recent call last):
File "/mnt/e/Chinese-Vicuna/finetune.py", line 228, in <module>
trainer = transformers.Trainer(
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9.egg/transformers/trainer.py", line 543, in __init__
if args.max_steps > 0:
TypeError: '>' not supported between instances of 'NoneType' and 'int'
(Chinese-alpaca-lora) root@DESKTOP-6KDJTBC:/mnt/e/Chinese-Vicuna#
from chinese-vicuna.
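@ZenXir max_steps gets set further down in the code. I fixed this in my local branch yesterday but forgot to push it; it's pushed now, so you can pull the update.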
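For anyone hitting the same TypeError before updating: the traceback shows `transformers.Trainer` evaluating `args.max_steps > 0` while the value is `None`. Below is a minimal sketch of one way to guard against that, assuming the script keeps a module-level `MAX_STEPS` that may be `None`. `resolve_max_steps` is a hypothetical helper name, not the fix that was pushed to the repository. It relies on `TrainingArguments` using `max_steps=-1` as its default, which means "no step limit, train by epochs".

```python
from typing import Optional


def resolve_max_steps(max_steps: Optional[int]) -> int:
    """Map an unset step limit to -1, the value Trainer treats as 'no limit'."""
    if max_steps is None:
        return -1
    return int(max_steps)


# Hypothetical usage inside the training script:
#   transformers.TrainingArguments(..., max_steps=resolve_max_steps(MAX_STEPS))
MAX_STEPS = None
print(resolve_max_steps(MAX_STEPS))
```

With this in place the comparison inside `Trainer.__init__` always sees an integer, and the number of epochs decides when training stops.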
from chinese-vicuna.
OK, thanks!
from chinese-vicuna.
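I'm now training with finetune.py on the merged model.
I've tried many times and it fails every time.
The merge is done in two steps:
1. First, following https://github.com/ymcui/Chinese-LLaMA-Alpaca, merge their model with the extended embeddings into a .pth model.
2. Then convert the .pth model from step 1 to Hugging Face format with the transformers conversion script: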
python src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir /mnt/e/Chinese-LLaMA-Alpaca/model --model_size 7B --output_dir /mnt/e/Chinese-LLaMA-Alpaca/model/7B_hf
The finetune command is:
python finetune.py --data_path sample/merge.json --output_path lora-Vicuna_Embedded/7B/ --model_path /mnt/e/Chinese-LLaMA-Alpaca/model/7B_hf
And this is the error:
CUDA SETUP: Loading binary /root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/mnt/e/Chinese-LLaMA-Alpaca/model/7B_hf
Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:21<00:00, 10.80s/it]
Found cached dataset json (/root/.cache/huggingface/datasets/json/default-5488fd0b86b9abc9/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51)
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7.06it/s]
trainable params: 4194304 || all params: 6889689088 || trainable%: 0.060877986603275876
If there's a warning about missing keys above, please disregard :)
/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9.egg/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
0%| | 0/16260 [00:00<?, ?it/s]Traceback (most recent call last):
File "/mnt/e/Chinese-Vicuna/finetune.py", line 271, in <module>
trainer.train(resume_from_checkpoint=args.resume_from_checkpoint)
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9.egg/transformers/trainer.py", line 1636, in train
return inner_training_loop(
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9.egg/transformers/trainer.py", line 1903, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9.egg/transformers/trainer.py", line 2649, in training_step
loss = self.compute_loss(model, inputs)
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9.egg/transformers/trainer.py", line 2681, in compute_loss
outputs = model(**inputs)
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/torch-2.0.0-py3.9-linux-x86_64.egg/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/peft-0.3.0.dev0-py3.9.egg/peft/peft_model.py", line 529, in forward
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/torch-2.0.0-py3.9-linux-x86_64.egg/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/accelerate-0.17.1-py3.9.egg/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/anaconda3/envs/Chinese-alpaca-lora/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9.egg/transformers/models/llama/modeling_llama.py", line 786, in forward
loss = loss_fct(shift_logits.view(-1, self.config.vocab_size), shift_labels.view(-1))
RuntimeError: shape '[-1, 32000]' is invalid for input of size 50953080
from chinese-vicuna.
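@ZenXir I haven't run their model yet, so you'll have to look into it yourself first. In your case the conversion simply didn't carry over correctly.
RuntimeError: shape '[-1, 32000]' is invalid for input of size 50953080
The LLaMA vocabulary has about 32000 tokens, while that repository's vocabulary is, I think, 49954 (I don't know whether it has been updated since). If my guess is right, you need to add model.resize_token_embeddings(len(tokenizer))
to update the embedding dimension inside the model. Give it a try.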
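The guess about the vocabulary size can be checked from the numbers in the error message alone. A `view(-1, width)` only works when the element count is divisible by `width`, so the sketch below tests both candidate widths. `rows_for_width` is a hypothetical helper written for this illustration, and 49954 is the vocabulary size recalled above, not a value read from the model files.

```python
from typing import Optional


def rows_for_width(numel: int, width: int) -> Optional[int]:
    """Return the row count of a (-1, width) view, or None if numel is not divisible."""
    rows, remainder = divmod(numel, width)
    return rows if remainder == 0 else None


NUMEL = 50953080  # element count reported in the RuntimeError

print(rows_for_width(NUMEL, 32000))  # None: the logits are not 32000 wide
print(rows_for_width(NUMEL, 49954))  # 1020: divides exactly
```

So the logits are 49954 wide while `config.vocab_size` still says 32000. The 1020 rows would be consistent with 4 sequences of 255 shifted positions each, though that reading of the batch shape is an assumption.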
from chinese-vicuna.
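Calling resize_token_embeddings like this before the prepare-for-training step makes training work.
I'll let the machine run for two days and see how the trained model turns out.
vocab_size = len(tokenizer.get_vocab())
print("Tokenizer vocabulary size:", vocab_size)
model.resize_token_embeddings(vocab_size)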
from chinese-vicuna.
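@Facico By the way,
to fine-tune the model with the merged embeddings, the command I'm using is:
python finetune.py --data_path sample/merge.json --output_path lora-Vicuna_Embedded/7B/ --model_path /mnt/e/Chinese-LLaMA-Alpaca/model/7B_hf
All other parameters are left at their defaults. My machine has a single RTX 4090 with 24 GB.
Are there any parameters you'd suggest adjusting for training quality and speed?
Things like batch_size, test_size, epoch, and so on.
Quality matters most to me, so that I can make a more direct comparison afterwards.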
from chinese-vicuna.
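Sorry, there are so many messages that I sometimes miss a few. For a direct comparison, just keep the batch size and the number of epochs the same. If you want it to run faster, increase the micro batch size.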
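To make the relationship concrete: finetune.py derives the gradient accumulation steps as BATCH_SIZE // MICRO_BATCH_SIZE, so raising the micro batch size leaves the effective batch size unchanged and only reduces the number of accumulation steps per optimizer update. The sketch below uses the defaults quoted in this thread (128 and 4). The value 8 is a hypothetical alternative, and the divisibility check is an addition that the script itself does not perform.

```python
def accumulation_steps(batch_size: int, micro_batch_size: int) -> int:
    """Forward/backward passes accumulated before each optimizer step."""
    if batch_size % micro_batch_size != 0:
        # Added check: plain floor division would silently shrink the effective batch.
        raise ValueError("batch_size must be a multiple of micro_batch_size")
    return batch_size // micro_batch_size


BATCH_SIZE = 128  # kept fixed so runs stay comparable

for micro in (4, 8):
    steps = accumulation_steps(BATCH_SIZE, micro)
    # The effective batch is micro * steps, which stays at 128 in both cases.
    print(micro, steps, micro * steps)
```

Whether a larger micro batch actually fits in 24 GB depends on the model and sequence length, which this thread does not establish.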
from chinese-vicuna.
Two GPUs, RTX 3090:
if not args.wandb:
    os.environ["WANDB_MODE"] = "disable"
# optimized for RTX 4090. for larger GPUs, increase some of these?
MICRO_BATCH_SIZE = 4  # this could actually be 5 but i like powers of 2
BATCH_SIZE = 128
MAX_STEPS = None
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 3  # we don't always need 3 tbh
LEARNING_RATE = 3e-4  # the Karpathy constant
CUTOFF_LEN = 256  # 256 accounts for about 96% of the data
LORA_R = 8
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
VAL_SET_SIZE = args.test_size  # 2000
TARGET_MODULES = [
    "q_proj",
    "v_proj",
]
from chinese-vicuna.
/root/anaconda3/lib/python3.9/site-packages/transformers/optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
warnings.warn(
0%| | 0/32481 [00:00<?, ?it/s]╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
./Chinese-Vicuna/finetune.py:271 in │
│ │
│ 268 │
│ 269 print("\n If there's a warning about missing keys above, please disregard :)") │
│ 270 │
│ ❱ 271 trainer.train(resume_from_checkpoint=args.resume_from_checkpoint) │
│ 272 │
│ 273 model.save_pretrained(OUTPUT_DIR) │
│ 274 │
│ │
│ /root/anaconda3/lib/python3.9/site-packages/transformers/trainer.py:1662 in train │
│ │
│ 1659 │ │ inner_training_loop = find_executable_batch_size( │
│ 1660 │ │ │ self._inner_training_loop, self._train_batch_size, args.auto_find_batch_size │
│ 1661 │ │ ) │
│ ❱ 1662 │ │ return inner_training_loop( │
│ 1663 │ │ │ args=args, │
│ 1664 │ │ │ resume_from_checkpoint=resume_from_checkpoint, │
│ 1665 │ │ │ trial=trial, │
│ │
│ /root/anaconda3/lib/python3.9/site-packages/transformers/trainer.py:1929 in _inner_training_loop │
│ │
│ 1926 │ │ │ │ │ with model.no_sync(): │
│ 1927 │ │ │ │ │ │ tr_loss_step = self.training_step(model, inputs) │
│ 1928 │ │ │ │ else: │
│ ❱ 1929 │ │ │ │ │ tr_loss_step = self.training_step(model, inputs) │
│ 1930 │ │ │ │ │
│ 1931 │ │ │ │ if ( │
│ 1932 │ │ │ │ │ args.logging_nan_inf_filter │
│ │
│ /root/anaconda3/lib/python3.9/site-packages/transformers/trainer.py:2699 in training_step │
│ │
│ 2696 │ │ │ return loss_mb.reduce_mean().detach().to(self.args.device) │
│ 2697 │ │ │
│ 2698 │ │ with self.compute_loss_context_manager(): │
│ ❱ 2699 │ │ │ loss = self.compute_loss(model, inputs) │
│ 2700 │ │ │
│ 2701 │ │ if self.args.n_gpu > 1: │
│ 2702 │ │ │ loss = loss.mean() # mean() to average on multi-gpu parallel training │
│ │
│ /root/anaconda3/lib/python3.9/site-packages/transformers/trainer.py:2731 in compute_loss │
│ │
│ 2728 │ │ │ labels = inputs.pop("labels") │
│ 2729 │ │ else: │
│ 2730 │ │ │ labels = None │
│ ❱ 2731 │ │ outputs = model(**inputs) │
│ 2732 │ │ # Save past state if it exists │
│ 2733 │ │ # TODO: this needs to be fixed and made cleaner later. │
│ 2734 │ │ if self.args.past_index >= 0: │
│ │
│ /root/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py:1102 in _call_impl │
│ │
│ 1099 │ │ # this function, and just call forward. │
│ 1100 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1101 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1102 │ │ │ return forward_call(*input, **kwargs) │
│ 1103 │ │ # Do not call functions when jit is used │
│ 1104 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1105 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ in forward:663 │
│ │
│ /root/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py:1102 in _call_impl │
│ │
│ 1099 │ │ # this function, and just call forward. │
│ 1100 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1101 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1102 │ │ │ return forward_call(*input, **kwargs) │
│ 1103 │ │ # Do not call functions when jit is used │
│ 1104 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1105 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /root/anaconda3/lib/python3.9/site-packages/accelerate/hooks.py:165 in new_forward │
│ │
│ 162 │ │ │ with torch.no_grad(): │
│ 163 │ │ │ │ output = old_forward(*args, **kwargs) │
│ 164 │ │ else: │
│ ❱ 165 │ │ │ output = old_forward(*args, **kwargs) │
│ 166 │ │ return module._hf_hook.post_forward(module, output) │
│ 167 │ │
│ 168 │ module.forward = new_forward │
│ │
│ /root/anaconda3/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py:709 in │
│ forward │
│ │
│ 706 │ │ │ shift_labels = labels[..., 1:].contiguous() │
│ 707 │ │ │ # Flatten the tokens │
│ 708 │ │ │ loss_fct = CrossEntropyLoss() │
│ ❱ 709 │ │ │ shift_logits = shift_logits.view(-1, self.config.vocab_size) │
│ 710 │ │ │ shift_labels = shift_labels.view(-1) │
│ 711 │ │ │ # Enable model parallelism │
│ 712 │ │ │ shift_labels = shift_labels.to(shift_logits.device) │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: shape '[-1, 32001]' is invalid for input of size 32640000
0%| | 0/32481 [00:04<?, ?it/s]
from chinese-vicuna.
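Calling resize_token_embeddings like this before the prepare-for-training step makes training work. I'll let the machine run for two days and see how the trained model turns out.
vocab_size = len(tokenizer.get_vocab())
print("Tokenizer vocabulary size:", vocab_size)
model.resize_token_embeddings(vocab_size)
Which file and which step do these three lines go in? I'd like to run the same training, but I'm too new to this and couldn't figure it out.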
from chinese-vicuna.
@godzeo Just put them right after the model and the tokenizer have been loaded.
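As a rough sketch of that placement: the helper below would be called once in finetune.py, after both objects are loaded and before the prepare-for-training step mentioned earlier in the thread. `sync_embeddings_to_tokenizer` is a hypothetical name. It only uses `len(tokenizer)`, `model.get_input_embeddings()` and `model.resize_token_embeddings()`, which exist on Hugging Face tokenizers and `PreTrainedModel`. It has not been tested against this repository.

```python
def sync_embeddings_to_tokenizer(model, tokenizer) -> bool:
    """Resize the model's token embeddings to len(tokenizer) if they differ.

    Returns True when a resize was performed, False when sizes already matched.
    """
    target = len(tokenizer)
    current = model.get_input_embeddings().weight.shape[0]
    if current == target:
        return False
    model.resize_token_embeddings(target)
    return True


# Hypothetical placement in the training script:
#   model = LlamaForCausalLM.from_pretrained(...)
#   tokenizer = LlamaTokenizer.from_pretrained(...)
#   sync_embeddings_to_tokenizer(model, tokenizer)
#   ... prepare-for-training and LoRA setup follow here ...
```

Skipping the resize when the sizes already match keeps the call harmless for the original LLaMA weights.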
from chinese-vicuna.
OK, thanks!
Hey, how should this max_step be set?
from chinese-vicuna.