wwxu21 / cut

Source code of "Reasons to Reject? Aligning Language Models with Judgments"

License: Apache License 2.0

Shell 2.03% Python 97.97%

cut's Introduction

Reasons to Reject? Aligning Language Models with Judgments

This repository contains the code and resources for our paper,

Reasons to Reject? Aligning Language Models with Judgments.

Weiwen Xu, Deng Cai, Zhisong Zhang, Wai Lam, Shuming Shi

1. Introduction

As humans, we consistently engage in interactions with our peers and receive feedback in the form of natural language. This language feedback allows us to reflect on our actions, maintain appropriate behavior, and rectify our errors. The question arises naturally: can we use language feedback to align large language models (LLMs)?

In contrast to previous research that aligns LLMs with reward or preference data, we present the first systematic exploration of alignment through the lens of language feedback (i.e., judgment). We commence with an in-depth investigation of potential methods that can be adapted for aligning LLMs with judgments, revealing that these methods are unable to fully capitalize on the judgments. To facilitate more effective utilization of judgments, we propose a novel framework, Contrastive Unlikelihood Training (CUT), that allows for fine-grained inappropriate content detection and correction based on judgments.
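
As a rough illustration of the idea, and not the repository's implementation, the sketch below assumes that a token is flagged as inappropriate when its probability conditioned on the judgment exceeds `threshold` times its probability without the judgment. Flagged tokens receive an unlikelihood term `-log(1 - p)` scaled by `weight_unlike`; all other tokens receive the usual `-log p`. The function name and the toy probabilities are hypothetical. The parameter names mirror the `threshold` and `weight_unlike` options used in the training scripts in Section 3.

```python
import math

def cut_token_losses(p_plain, p_judged, threshold=1.1, weight_unlike=1.0):
    """Per-token losses for one response that received a negative judgment.

    p_plain[t]  : probability of token t given the instruction only
    p_judged[t] : probability of token t given the instruction and the judgment
    (Assumed formulation for illustration; check the paper and code for details.)
    """
    losses = []
    for p, pj in zip(p_plain, p_judged):
        if pj > threshold * p:
            # Flagged as inappropriate: push the token's probability down.
            losses.append(-weight_unlike * math.log(1.0 - p))
        else:
            # Not flagged: ordinary maximum-likelihood term.
            losses.append(-math.log(p))
    return losses

# Toy probabilities for a three-token response (hypothetical values).
p_plain = [0.50, 0.20, 0.40]
p_judged = [0.50, 0.60, 0.42]
losses = cut_token_losses(p_plain, p_judged)
flagged = [pj > 1.1 * p for p, pj in zip(p_plain, p_judged)]
print(flagged)
print([round(x, 4) for x in losses])
```

In this toy case only the second token is flagged, because the judgment triples its probability; the other two tokens are trained with the standard likelihood term.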

Our offline alignment results show that, with merely 1317 off-the-shelf judgment data, CUT (LLaMA2-13b) can beat the 175B DaVinci003 and surpass the best baseline by 52.34 points on AlpacaEval. The online alignment results demonstrate that CUT can align LLMs (LLaMA2-chat-13b) in an iterative fashion using model-specific judgment data, with a steady performance improvement from 81.09 to 91.36 points on AlpacaEval. Our analysis further suggests that judgments exhibit greater potential than rewards for LLM alignment and warrant future research.

2. Dataset

2.1. Offline Alignment

To reproduce the offline experiments, please use the datasets from Summarization Train, Summarization Test, and Shepherd. Please use the script scripts/convert2alpaca.py to convert the data into the Alpaca Format.
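
For orientation, the Stanford Alpaca format stores each example as a JSON object with `instruction`, `input`, and `output` fields. The sketch below builds one such record. The `to_alpaca` helper and the sample text are hypothetical, and any additional fields that scripts/convert2alpaca.py may write (for example, for judgments) are not shown because they are not documented here.

```python
import json

def to_alpaca(instruction, response, context=""):
    # Standard Alpaca keys; `input` holds optional context and may be empty.
    return {"instruction": instruction, "input": context, "output": response}

# Hypothetical example record, serialized as a one-element JSON list.
record = to_alpaca("Summarize the post in one sentence.", "A short summary.")
serialized = json.dumps([record], indent=2)
print(serialized)
```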

2.2. Online Alignment

To reproduce the online experiments, we provide the training instances for 5 online iterations in data/iter.

2.3. Judgment vs. Rewards

We sample 1000 * 4 instruction-response-judgment triplets from UltraFeedback and re-annotate them with only negative judgments. The new judgment data can be found in data/UltraFeedback.

3. Fine-tuning

3.1. Prepare the environment

pip install -r requirments.txt

3.2. Train LLMs with CUT

3.2.1. Online Alignment (the first online iteration as an example)

threshold=1.1
weight_unlike=1
name=cut-1plus-13b
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 --master_port=1233 finetune_unlikelihood.py \
    --base_model saved_models/llama2-13b-chat-hf \
    --data-path data/iter/train-alpaca-sample-iter1.json \
    --output_dir ./saved_models/lora/${name} \
    --batch_size 8 \
    --micro_batch_size 1 \
    --num_epochs 1 \
    --learning_rate 0.0004 \
    --cutoff_len 2048 \
    --val_set_size 0 \
    --lora_r 16 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target_modules '[gate_proj, down_proj, up_proj]' \
    --train_on_inputs False \
    --add_eos_token False \
    --group_by_length False \
    --prompt_template_name alpaca \
    --lr_scheduler 'cosine' \
    --warmup_steps 100 \
    --weight_unlike ${weight_unlike} \
    --threshold ${threshold} \
    --downsample 0.25

CUDA_VISIBLE_DEVICES=0 python merge.py \
    --base_model_name_or_path saved_models/llama2-13b-chat-hf \
    --peft_model_path ./saved_models/lora/${name} \
    --output_dir ./saved_models/${name}
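
The batch-size flags in the training command interact with the number of GPUs. As a sketch, assuming finetune_unlikelihood.py follows the alpaca-lora convention (gradient accumulation steps are `batch_size // micro_batch_size`, divided again by the number of processes under distributed training), the arithmetic for the 8-GPU command works out as follows. The convention and the `accumulation_steps` helper are assumptions and should be checked against the training script.

```python
def accumulation_steps(batch_size, micro_batch_size, world_size=1):
    # Assumed convention (alpaca-lora style); verify against the training script.
    steps = batch_size // micro_batch_size
    if world_size > 1:
        # The floor at 1 is added here for safety and is not part of the assumed convention.
        steps = max(1, steps // world_size)
    return steps

steps = accumulation_steps(batch_size=8, micro_batch_size=1, world_size=8)
effective = 1 * 8 * steps  # micro_batch_size * number of GPUs * accumulation steps
print(steps, effective)  # prints "1 8"
```

Under the same assumption, running with `world_size=4` would give 2 accumulation steps and keep the effective batch size at 8.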

3.2.2. Offline alignment (Shepherd as an example)

First, get the Shepherd dataset according to Sec. 2.1. Then use the following script:

threshold=1.2
weight_unlike=0.5
name=cut-1plus-13b
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 --master_port=1233 finetune_unlikelihood.py \
    --base_model saved_models/llama2-13b-chat-hf \
    --data-path data/Shepherd/train-alpaca.json \
    --output_dir ./saved_models/lora/${name} \
    --batch_size 8 \
    --micro_batch_size 1 \
    --num_epochs 1 \
    --learning_rate 0.0004 \
    --cutoff_len 2048 \
    --val_set_size 0 \
    --lora_r 16 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target_modules '[gate_proj, down_proj, up_proj]' \
    --train_on_inputs False \
    --add_eos_token False \
    --group_by_length False \
    --prompt_template_name alpaca \
    --lr_scheduler 'cosine' \
    --warmup_steps 100 \
    --weight_unlike ${weight_unlike} \
    --threshold ${threshold} \
    --downsample 0.25

CUDA_VISIBLE_DEVICES=0 python merge.py \
    --base_model_name_or_path saved_models/llama2-13b-chat-hf \
    --peft_model_path ./saved_models/lora/${name} \
    --output_dir ./saved_models/${name}

4. Inference

4.1. Checkpoint Release

We release our CUT model, which has undergone four online iterations and achieves 91.36 points on AlpacaEval.

4.2. Inference Template

We follow the inference template from Stanford Alpaca:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
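
A minimal sketch of filling this template in Python. The `build_prompt` helper is a hypothetical name, not part of the repository, and the exact whitespace after `### Response:` is an assumption.

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction):
    # Substitute the user instruction into the Alpaca-style template.
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())

prompt = build_prompt("List three prime numbers.")
print(prompt)
```

The model's generation is expected to continue from the end of this string, so the prompt should be passed to the model unchanged.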

4.3. CLI

FastChat provides a simple setup for those interested in trying our aligned model. After downloading the CUT model from Hugging Face, clone the FastChat repository:

git clone https://github.com/lm-sys/FastChat.git
cd FastChat

Install the required packages:

pip install --upgrade pip  # enable PEP 660 support
pip install -e .

Finally, run the following:

python -m fastchat.serve.cli --model-path xww033/cut-13b --conv-template alpaca

5. Testing

5.1. Generation-based Evaluation

We evaluate the model on AlpacaEval. Please first install the evaluation tool:

pip install alpaca-eval

The following script prompts the LLM to generate responses to the 805 AlpacaEval instructions:

python scripts/generate.py --base_model_name_or_path <model checkpoint>

The generated responses are saved in <model checkpoint>/alpaca_eval.json, which is then submitted for GPT-4 evaluation:

alpaca_eval --model_outputs <model checkpoint>/alpaca_eval.json
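
Before submitting, it can help to sanity-check the generated file. The sketch below assumes the file is a JSON list with one object per instruction, each carrying `instruction` and `output` keys (the layout AlpacaEval expects for model outputs). The `check_outputs` helper is hypothetical, and the demo writes a tiny stand-in file rather than reading a real checkpoint directory.

```python
import json
import os
import tempfile

def check_outputs(path, expected=805):
    # Load the generations and report the count and any malformed entries.
    with open(path, encoding="utf-8") as f:
        rows = json.load(f)
    bad = [i for i, r in enumerate(rows)
           if not isinstance(r, dict) or not r.get("instruction") or "output" not in r]
    return {"count": len(rows), "complete": len(rows) == expected, "bad_rows": bad}

# Demo on a stand-in file with two rows, one of them missing its output.
demo = [{"instruction": "Say hi.", "output": "Hi!"}, {"instruction": "Say bye."}]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "alpaca_eval.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(demo, f)
    report = check_outputs(demo_path)
print(report)
```

For a real run, `check_outputs` would be pointed at `<model checkpoint>/alpaca_eval.json`, where a complete file should report 805 rows and no bad rows.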

5.2. Ranking-based Evaluation

We evaluate the model's performance on ARC, HellaSwag, MMLU and TruthfulQA, utilizing the LLM Evaluation Harness.

BibTeX

@article{xu2023reasons,
  title={Reasons to Reject? Aligning Language Models with Judgments},
  author={Xu, Weiwen and Cai, Deng and Zhang, Zhisong and Lam, Wai and Shi, Shuming},
  journal={arXiv preprint arXiv:2312.14591},
  year={2023}
}

cut's Issues

Is there a script for offline alignment and full fine-tuning? And what about catastrophic forgetting?

Interesting work!
Is there a script for offline alignment and full fine-tuning?
I also wonder whether online alignment could cause catastrophic forgetting.
Thanks a lot!

Why does the loss become zero?

During my first round of online alignment, the loss was:
{'loss': 1.6736, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.02}
{'loss': 0.0, 'learning_rate': 8.000000000000001e-06, 'epoch': 0.03}
{'loss': 0.0, 'learning_rate': 1.2e-05, 'epoch': 0.05}
{'loss': 0.0, 'learning_rate': 1.6000000000000003e-05, 'epoch': 0.07}
{'loss': 0.0, 'learning_rate': 2e-05, 'epoch': 0.08}
{'loss': 0.0, 'learning_rate': 2.4e-05, 'epoch': 0.1}
{'loss': 0.0, 'learning_rate': 2.8000000000000003e-05, 'epoch': 0.11}
{'loss': 0.0, 'learning_rate': 3.2000000000000005e-05, 'epoch': 0.13}
{'loss': 0.0, 'learning_rate': 3.6e-05, 'epoch': 0.15}
{'loss': 0.0, 'learning_rate': 4e-05, 'epoch': 0.16}
{'loss': 0.0, 'learning_rate': 4.4000000000000006e-05, 'epoch': 0.18}
{'loss': 0.0, 'learning_rate': 4.8e-05, 'epoch': 0.2}
{'loss': 0.0, 'learning_rate': 5.2000000000000004e-05, 'epoch': 0.21}
{'loss': 0.0, 'learning_rate': 5.6000000000000006e-05, 'epoch': 0.23}
{'loss': 0.0, 'learning_rate': 6e-05, 'epoch': 0.24}
{'loss': 0.0, 'learning_rate': 6.400000000000001e-05, 'epoch': 0.26}
{'loss': 0.0, 'learning_rate': 6.800000000000001e-05, 'epoch': 0.28}
{'loss': 0.0, 'learning_rate': 7.2e-05, 'epoch': 0.29}
{'loss': 0.0, 'learning_rate': 7.6e-05, 'epoch': 0.31}
{'loss': 0.0, 'learning_rate': 8e-05, 'epoch': 0.33}
{'loss': 0.0, 'learning_rate': 8.4e-05, 'epoch': 0.34}
{'loss': 0.0, 'learning_rate': 8.800000000000001e-05, 'epoch': 0.36}
{'loss': 0.0, 'learning_rate': 9.200000000000001e-05, 'epoch': 0.37}
{'loss': 0.0, 'learning_rate': 9.6e-05, 'epoch': 0.39}
{'loss': 0.0, 'learning_rate': 0.0001, 'epoch': 0.41}
{'loss': 0.0, 'learning_rate': 0.00010400000000000001, 'epoch': 0.42}
{'loss': 0.0, 'learning_rate': 0.00010800000000000001, 'epoch': 0.44}
{'loss': 0.0, 'learning_rate': 0.00011200000000000001, 'epoch': 0.46}
{'loss': 0.0, 'learning_rate': 0.000116, 'epoch': 0.47}
{'loss': 0.0, 'learning_rate': 0.00012, 'epoch': 0.49}
{'loss': 0.0, 'learning_rate': 0.000124, 'epoch': 0.5}
{'loss': 0.0, 'learning_rate': 0.00012800000000000002, 'epoch': 0.52}
{'loss': 0.0, 'learning_rate': 0.000132, 'epoch': 0.54}
{'loss': 0.0, 'learning_rate': 0.00013600000000000003, 'epoch': 0.55}
{'loss': 0.0, 'learning_rate': 0.00014, 'epoch': 0.57}
{'loss': 0.0, 'learning_rate': 0.000144, 'epoch': 0.59}
{'loss': 0.0, 'learning_rate': 0.000148, 'epoch': 0.6}
{'loss': 0.0, 'learning_rate': 0.000152, 'epoch': 0.62}
{'loss': 0.0, 'learning_rate': 0.00015600000000000002, 'epoch': 0.63}
{'loss': 0.0, 'learning_rate': 0.00016, 'epoch': 0.65}
{'loss': 0.0, 'learning_rate': 0.000164, 'epoch': 0.67}
{'loss': 0.0, 'learning_rate': 0.000168, 'epoch': 0.68}
{'loss': 0.0, 'learning_rate': 0.000172, 'epoch': 0.7}
{'loss': 0.0, 'learning_rate': 0.00017600000000000002, 'epoch': 0.72}
{'loss': 0.0, 'learning_rate': 0.00018, 'epoch': 0.73}
{'loss': 0.0, 'learning_rate': 0.00018400000000000003, 'epoch': 0.75}
{'loss': 0.0, 'learning_rate': 0.000188, 'epoch': 0.76}
{'loss': 0.0, 'learning_rate': 0.000192, 'epoch': 0.78}
{'loss': 0.0, 'learning_rate': 0.000196, 'epoch': 0.8}
{'loss': 0.0, 'learning_rate': 0.0002, 'epoch': 0.81}
{'loss': 0.0, 'learning_rate': 0.00020400000000000003, 'epoch': 0.83}
{'loss': 0.0, 'learning_rate': 0.00020800000000000001, 'epoch': 0.85}
{'loss': 0.0, 'learning_rate': 0.00021200000000000003, 'epoch': 0.86}
{'loss': 0.0, 'learning_rate': 0.00021600000000000002, 'epoch': 0.88}
{'loss': 0.0, 'learning_rate': 0.00022000000000000003, 'epoch': 0.89}
{'loss': 0.0, 'learning_rate': 0.00022400000000000002, 'epoch': 0.91}
{'loss': 0.0, 'learning_rate': 0.00022799999999999999, 'epoch': 0.93}
{'loss': 0.0, 'learning_rate': 0.000232, 'epoch': 0.94}
{'loss': 0.0, 'learning_rate': 0.000236, 'epoch': 0.96}
{'loss': 0.0, 'learning_rate': 0.00024, 'epoch': 0.98}
{'loss': 0.0, 'learning_rate': 0.000244, 'epoch': 0.99}
I used the script from the README. Why does the loss become 0?
Looking forward to your help.

Training anomaly

Training hangs after the first sample:
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
{'loss': 0.6894, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.0}
0%|▏ | 1/1151 [00:04<1:29:15, 4.66s/it]/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
[E ProcessGroupNCCL.cpp:828] [Rank 2] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1802339 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 4] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1802220 milliseconds before timing out.
ai-node-i-ycwg2tzshsqc6il45qda:22094:23204 [4] NCCL INFO [Service thread] Connection closed by localRank 4
[E ProcessGroupNCCL.cpp:828] [Rank 5] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1802760 milliseconds before timing out.
ai-node-i-ycwg2tzshsqc6il45qda:22092:23106 [0] NCCL INFO comm 0x3a2abaa0 rank 2 nranks 8 cudaDev 2 busId 69000 - Abort COMPLETE
ai-node-i-ycwg2tzshsqc6il45qda:22094:23109 [0] NCCL INFO comm 0xbd14350 rank 4 nranks 8 cudaDev 4 busId b3000 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:455] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[E ProcessGroupNCCL.cpp:460] To avoid data inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'

Traceback (most recent call last):
File "finetune_unlikelihood.py", line 558, in <module>
fire.Fire(train)
File "/usr/local/lib/python3.8/dist-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.8/dist-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.8/dist-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "finetune_unlikelihood.py", line 547, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 1553, in train
return inner_training_loop(
File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 1835, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 2690, in training_step
self.accelerator.backward(loss)
File "/usr/local/lib/python3.8/dist-packages/accelerate/accelerator.py", line 1985, in backward
loss.backward(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/function.py", line 274, in apply
return user_fn(self, *args)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/checkpoint.py", line 157, in backward
torch.autograd.backward(outputs_with_grad, args_with_grad)
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: NCCL communicator was aborted on rank 7. Original reason for failure was: [Rank 7] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=36, OpType=BROADCAST, Timeout(ms)=1800000) ran for 1803931 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 6] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1804929 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1804265 milliseconds before timing out.
ai-node-i-ycwg2tzshsqc6il45qda:22096:23124 [0] NCCL INFO comm 0x50f0b560 rank 6 nranks 8 cudaDev 6 busId d5000 - Abort COMPLETE

Which checkpoint is DPO training initialized from?

Is it LLaMA2 or LLaMA2-chat?
Thanks.