
llama3-finetuning's Introduction

Llama3-Finetuning

Full-parameter fine-tuning, LoRA fine-tuning, and QLoRA fine-tuning of Llama3. Fine-tuning of Qwen1.5 models is also supported. To switch to another model, the part that mainly needs to change is the data preprocessing.

Install dependencies

pip install -r requirements.txt

Data preparation

The data is placed under the data directory, in the following format:

[
  {
    "conversations": [
      {
        "from": "user",
        "value": "你是那个名字叫ChatGPT的模型吗?"
      },
      {
        "from": "assistant",
        "value": "我的名字是西西嘛呦,并且是通过家里蹲公司的大数据平台进行训练的。"
      }
    ]
  }
  ...
]

Multi-turn conversations are prepared in the same format, just with more turns.
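For illustration, a multi-turn sample simply repeats user/assistant turns inside the same conversations list (the values below are placeholders, not data shipped with the repo):

[
  {
    "conversations": [
      { "from": "user", "value": "first user turn" },
      { "from": "assistant", "value": "first assistant reply" },
      { "from": "user", "value": "second user turn" },
      { "from": "assistant", "value": "second assistant reply" }
    ]
  }
]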

Model preparation

Go into the model_hub directory and run python download_modelscope.py to download the Llama3-8B-Instruct model.
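download_modelscope.py is not reproduced here; a minimal sketch of what such a script typically looks like with the modelscope package (the cache_dir value is an assumption chosen to match the paths used by the training scripts):

# Sketch of a ModelScope download script (assumed, not necessarily the repo's exact code).
from modelscope import snapshot_download

# Run from model_hub/: downloads the model into the current directory,
# yielding model_hub/LLM-Research/Meta-Llama-3-8B-Instruct/.
model_dir = snapshot_download(
    "LLM-Research/Meta-Llama-3-8B-Instruct",
    cache_dir="./",
)
print("model downloaded to:", model_dir)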

Fine-tuning

Go into the script directory.

Full-parameter fine-tuning

Limited by hardware, full-parameter fine-tuning was not run here; try it if you have the resources.

LoRA fine-tuning

The number of GPUs given by nproc_per_node must match the number of devices listed in CUDA_VISIBLE_DEVICES.

NCCL_P2P_DISABLE=1 \
NCCL_IB_DISABLE=1 \
CUDA_VISIBLE_DEVICES=0,1,2,4,5,6,7 \
torchrun \
--nproc_per_node 7 \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6601 \
../finetune_llama3.py \
--model_name_or_path "../model_hub/LLM-Research/Meta-Llama-3-8B-Instruct/" \
--data_path "../data/Belle_sampled_qwen.json" \
--bf16 True \
--output_dir "../output/llama3_8B_lora" \
--num_train_epochs 100 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 8 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 5 \
--save_total_limit 1 \
--learning_rate 1e-5 \
--weight_decay 0.1 \
--adam_beta2 0.95 \
--warmup_ratio 0.01 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--report_to "none" \
--model_max_length 4096 \
--gradient_checkpointing True \
--lazy_preprocess True \
--deepspeed "../config/ds_config_zero3_72B.json" \
--use_lora
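Inside finetune_llama3.py, --use_lora corresponds to wrapping the base model with a PEFT LoRA adapter before training. A minimal sketch of that pattern with the peft library follows; the rank, alpha, and target_modules values are illustrative assumptions, not necessarily the repo's settings:

# Sketch of the LoRA wrapping that --use_lora typically enables (values are illustrative).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "../model_hub/LLM-Research/Meta-Llama-3-8B-Instruct/",
    torch_dtype="auto",
)

lora_config = LoraConfig(
    r=64,                   # adapter rank (assumed)
    lora_alpha=16,          # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed projection layers
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter matrices are trainable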

QLoRA fine-tuning

The number of GPUs given by nproc_per_node must match the devices listed in CUDA_VISIBLE_DEVICES. With QLoRA, training can be completed on a single RTX 4090.

NCCL_P2P_DISABLE=1 \
NCCL_IB_DISABLE=1 \
CUDA_VISIBLE_DEVICES=0,1,2,4,5,6,7 \
torchrun \
--nproc_per_node 7 \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6601 \
../finetune_llama3.py \
--model_name_or_path "../model_hub/LLM-Research/Meta-Llama-3-8B-Instruct/" \
--data_path "../data/Belle_sampled_qwen.json" \
--bf16 True \
--output_dir "../output/llama3_8B_qlora" \
--num_train_epochs 100 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 16 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 5 \
--save_total_limit 1 \
--learning_rate 1e-5 \
--weight_decay 0.1 \
--adam_beta2 0.95 \
--warmup_ratio 0.01 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--report_to "none" \
--model_max_length 4096 \
--gradient_checkpointing True \
--lazy_preprocess True \
--deepspeed "../config/ds_config_zero2.json" \
--use_lora \
--load_in_4bit \
--q_lora
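The --load_in_4bit and --q_lora switches correspond to loading the base model with 4-bit quantization and attaching the LoRA adapter on top, which is what keeps the memory footprint within a single 4090. A minimal sketch of that pattern; the quantization settings are assumptions:

# Sketch of 4-bit QLoRA loading (illustrative settings, not necessarily the repo's).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # assumed quantization type
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "../model_hub/LLM-Research/Meta-Llama-3-8B-Instruct/",
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)  # re-enable gradients for the adapter path

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,     # assumed values
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)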

Inference

  • llama3_single_predict.py: inference with the base model before any fine-tuning.
  • llama3_lora_predict.py: merges the LoRA fine-tuned weights into the base model and runs inference (a sketch of this merge-then-infer pattern follows this list).
  • llama3_qlora_4bit_predict.py: merges the QLoRA fine-tuned weights into the base model and runs inference.
  • llama3_peft_qlora_predict.py: loads the QLoRA adapter with peft and runs inference.
  • llama3_peft_lora_predict.py: loads the LoRA adapter with peft and runs inference.
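For reference, a minimal sketch of the merge-then-infer pattern used by the LoRA prediction scripts, assuming an adapter checkpoint saved under ../output/llama3_8B_lora (paths, prompt, and generation settings are illustrative assumptions):

# Sketch of merging a LoRA adapter into the base model and generating (illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "../model_hub/LLM-Research/Meta-Llama-3-8B-Instruct/"
adapter_path = "../output/llama3_8B_lora"   # assumed adapter checkpoint location

tokenizer = AutoTokenizer.from_pretrained(base_path)
model = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_path)
model = model.merge_and_unload()            # fold the LoRA weights back into the base model

messages = [{"role": "user", "content": "你是谁?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))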

Inference after training:

Question: In an era when the internet is everywhere, social media has become an indispensable part of modern life. From WeChat and Weibo to Facebook, Twitter, and Instagram, social media not only meets people's need to communicate and interact, but also makes it easier to share moments of their lives, get news, find like-minded people, and promote their own business. At the same time, however, social media has brought a series of problems, such as misuse of personal information and privacy leaks. Therefore, how to use social media correctly has become a topic of wide concern. Summarize the key words of the text above.

Result before fine-tuning: Arrrr, shiver me timbers! Ye be wantin' me to summarize the key words from that there text, eh? Alright then, matey! Here be the main points:

1. Social media (the scourge o' the seven seas... er, I mean, the modern world)
2. Communication (the lifeblood o' the pirate's existence... er, I mean, the need for people to interact)
3. Sharing (the treasure o' the digital age... er, I mean, the ability to share one's life moments)
4. Information (the booty o' the internet... er, I mean, the news and knowledge shared online)
5. Privacy (the anchor o' security... er, I mean, the need to protect one's personal info)
6. Correct usage (the map to avoidin' the rocky shores o' social media misuse... er, I mean, the importance of using social media responsibly)

So hoist the colors, me hearties, and remember to use social media like a proper pirate: with caution, respect, and a keen eye for treasure!

Result after fine-tuning: 社交媒体、交流、分享、隐私、滥用、信息 (social media, communication, sharing, privacy, misuse, information)

References

The code is mainly based on:

The model can be downloaded from ModelScope:


llama3-finetuning's Issues

RuntimeError: Error building extension 'fused_adam'

I keep getting the fused_adam error. My machine has two 2080 Ti GPUs, and my package versions match requirements.txt.

FAILED: multi_tensor_adam.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -I/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/includes -I/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/adam -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/TH -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/THC -isystem /home/patrickstar/miniconda3/envs/pytorch/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -lineinfo --use_fast_math -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 -std=c++17 -c /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/adam/multi_tensor_adam.cu -o multi_tensor_adam.cuda.o
gcc: fatal error: cannot execute 'cc1plus': execvp: No such file or directory
compilation terminated.
nvcc fatal : Failed to preprocess host compiler properties.
[2/3] c++ -MMD -MF fused_adam_frontend.o.d -DTORCH_EXTENSION_NAME=fused_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -I/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/includes -I/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/adam -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/TH -isystem /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/include/THC -isystem /home/patrickstar/miniconda3/envs/pytorch/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++17 -g -Wno-reorder -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -c /home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/csrc/adam/fused_adam_frontend.cpp -o fused_adam_frontend.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
subprocess.run(
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/patrickstar/llm_learn/Llama3-Finetuning/script/../finetune_llama3.py", line 448, in
train()
File "/home/patrickstar/llm_learn/Llama3-Finetuning/script/../finetune_llama3.py", line 441, in train
trainer.train()
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/trainer.py", line 1780, in train
return inner_training_loop(
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/transformers/trainer.py", line 1936, in _inner_training_loop
model, self.optimizer, self.lr_scheduler = self.accelerator.prepare(
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/accelerator.py", line 1255, in prepare
result = self._prepare_deepspeed(*args)
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/accelerator.py", line 1640, in _prepare_deepspeed
engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/init.py", line 181, in initialize
engine = DeepSpeedEngine(args=args,
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 307, in init
self._configure_optimizer(optimizer, model_parameters)
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1232, in _configure_optimizer
basic_optimizer = self._configure_basic_optimizer(model_parameters)
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1309, in _configure_basic_optimizer
optimizer = FusedAdam(
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/adam/fused_adam.py", line 94, in init
fused_adam_cuda = FusedAdamBuilder().load()
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder.py", line 480, in load
return self.jit_load(verbose)
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder.py", line 524, in jit_load
op_module = load(name=self.name,
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1308, in load
return _jit_compile(
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/patrickstar/miniconda3/envs/pytorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'fused_adam'

assert len(input_id) == len(target) AssertionError

Hi, I ran into this problem while reproducing the QLoRA fine-tuning. The error message is:
Traceback (most recent call last):
File "../finetune_llama3.py", line 452, in
train()
File "../finetune_llama3.py", line 445, in train
trainer.train()
File "/home/nlpir/miniconda3/envs/cjy_llama/lib/python3.8/site-packages/transformers/trainer.py", line 1624, in train
return inner_training_loop(
File "/home/nlpir/miniconda3/envs/cjy_llama/lib/python3.8/site-packages/transformers/trainer.py", line 1928, in _inner_training_loop
for step, inputs in enumerate(epoch_iterator):
File "/home/nlpir/miniconda3/envs/cjy_llama/lib/python3.8/site-packages/accelerate/data_loader.py", line 452, in iter
current_batch = next(dataloader_iter)
File "/home/nlpir/miniconda3/envs/cjy_llama/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 633, in next
data = self._next_data()
File "/home/nlpir/miniconda3/envs/cjy_llama/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 677, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/home/nlpir/miniconda3/envs/cjy_llama/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/nlpir/miniconda3/envs/cjy_llama/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "../finetune_llama3.py", line 255, in getitem
ret = preprocess([self.raw_data[i]["conversations"]], self.tokenizer, self.max_len)
File "../finetune_llama3.py", line 192, in preprocess
assert len(input_id) == len(target)
AssertionError
In practice, input_id is always one element longer than target.
I'm running on six 1080 Ti GPUs; the shell script is as follows:
NCCL_P2P_DISABLE=1 \
NCCL_IB_DISABLE=1 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 \
torchrun \
--nproc_per_node 6 \
--nnodes 1 \
--node_rank 0 \
--master_addr localhost \
--master_port 6601 \
../finetune_llama3.py \
--model_name_or_path "../model_hub/LLM-Research/Meta-Llama-3-8B-Instruct/" \
--data_path "../data/Belle_sampled_qwen.json" \
--fp16 True \
--output_dir "../output/llama3_8B_qlora" \
--num_train_epochs 100 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--gradient_accumulation_steps 16 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 5 \
--save_total_limit 1 \
--learning_rate 1e-5 \
--weight_decay 0.1 \
--adam_beta2 0.95 \
--warmup_ratio 0.01 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--report_to "none" \
--model_max_length 4096 \
--gradient_checkpointing True \
--lazy_preprocess True \
--deepspeed "../config/ds_config_zero2.json" \
--use_lora \
--load_in_4bit \
--q_lora
I only changed CUDA_VISIBLE_DEVICES and nproc_per_node, and switched bf16 to fp16.
