Reasons to Reject? Aligning Language Models with Judgments

This repository contains the code and resources for our paper, Reasons to Reject? Aligning Language Models with Judgments.

Weiwen Xu, Deng Cai, Zhisong Zhang, Wai Lam, Shuming Shi

1. Introduction

As humans, we consistently engage in interactions with our peers and receive feedback in the form of natural language. This language feedback allows us to reflect on our actions, maintain appropriate behavior, and rectify our errors. The question arises naturally: can we use language feedback to align large language models (LLMs)?

In contrast to previous research that aligns LLMs with reward or preference data, we present the first systematic exploration of alignment through the lens of language feedback (i.e., judgment). We commence with an in-depth investigation of potential methods that can be adapted for aligning LLMs with judgments, revealing that these methods are unable to fully capitalize on the judgments. To facilitate more effective utilization of judgments, we propose a novel framework, Contrastive Unlikelihood Training (CUT), that allows for fine-grained inappropriate content detection and correction based on judgments.
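As a rough illustration of what token-level detection and correction could look like, here is a minimal sketch. It is not the implementation in finetune_unlikelihood.py. It assumes that a token is flagged when its probability conditioned on the judgment exceeds `threshold` times its probability without the judgment, and that flagged tokens receive a generic unlikelihood term, -log(1 - p), scaled by `weight_unlike`. The function names and the flagging rule are assumptions made for illustration; the exact formulation is given in the paper.

```python
import math
from typing import List


def flag_tokens(p_plain: List[float], p_judged: List[float],
                threshold: float = 1.1) -> List[bool]:
    """Flag tokens that become notably more likely once the judgment is given.

    Assumption: the rule is p_judged > threshold * p_plain.
    """
    return [pj > threshold * p for p, pj in zip(p_plain, p_judged)]


def sketch_loss(p_plain: List[float], p_judged: List[float],
                threshold: float = 1.1, weight_unlike: float = 1.0) -> float:
    """Mean per-token loss: likelihood on ordinary tokens, unlikelihood on flagged ones."""
    if not p_plain or len(p_plain) != len(p_judged):
        raise ValueError("need two non-empty lists of equal length")
    eps = 1e-12  # guards against log(0)
    total = 0.0
    flags = flag_tokens(p_plain, p_judged, threshold)
    for p, flagged in zip(p_plain, flags):
        if flagged:
            # Unlikelihood term: push the probability of this token down.
            total -= weight_unlike * math.log(max(1.0 - p, eps))
        else:
            # Standard negative log-likelihood term.
            total -= math.log(max(p, eps))
    return total / len(p_plain)
```

In this sketch, raising `threshold` flags fewer tokens, and `weight_unlike` scales how strongly flagged tokens are penalized, which is one way to read the two hyperparameters set in the training scripts.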

Our offline alignment results show that, with only 1,317 off-the-shelf judgment examples, CUT (LLaMA2-13b) beats the 175B DaVinci003 and surpasses the best baseline by 52.34 points on AlpacaEval. The online alignment results demonstrate that CUT can align LLMs (LLaMA2-chat-13b) iteratively using model-specific judgment data, with performance improving steadily from 81.09 to 91.36 points on AlpacaEval. Our analysis further suggests that judgments exhibit greater potential than rewards for LLM alignment and warrant future research.

2. Dataset

2.1. Offline Alignment

To reproduce the offline experiments, please use the datasets from Summarization Train, Summarization Test, and Shepherd. Please use the script scripts/convert2alpaca.py to convert the data into the Alpaca Format.
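For orientation, the standard Alpaca format is a JSON list of records with `instruction`, `input`, and `output` fields. The sketch below shows what a conversion could look like. It is not scripts/convert2alpaca.py: the source keys (`question`, `answer`, `feedback`) and the extra `judgment` field are hypothetical, since the README does not describe how the script stores judgments.

```python
import json
from typing import Dict, List


def to_alpaca(record: Dict[str, str]) -> Dict[str, str]:
    """Map one raw record to an Alpaca-style record.

    The source keys and the "judgment" field are hypothetical.
    """
    return {
        "instruction": record["question"],
        "input": "",  # Alpaca uses an empty string when there is no extra input
        "output": record["answer"],
        "judgment": record.get("feedback", ""),
    }


def convert_all(records: List[Dict[str, str]]) -> str:
    """Convert a list of raw records and serialize them as JSON text."""
    return json.dumps([to_alpaca(r) for r in records], ensure_ascii=False, indent=2)
```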

2.2. Online Alignment

To reproduce the online experiments, we provide the training instances for 5 online iterations in data/iter.

2.3. Judgment vs. Rewards

We sample 1000 * 4 instruction-response-judgment triplets from UltraFeedback and re-annotate them with only negative judgments. The new judgment data can be found in data/UltraFeedback.

3. Fine-tuning

3.1. Prepare the environment
pip install -r requirments.txt
3.2. Train LLMs with CUT
3.2.1. Online Alignment (the first online iteration as an example)
threshold=1.1
weight_unlike=1
name=cut-1plus-13b
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 --master_port=1233 finetune_unlikelihood.py \
    --base_model saved_models/llama2-13b-chat-hf \
    --data-path data/iter/train-alpaca-sample-iter1.json \
    --output_dir ./saved_models/lora/${name} \
    --batch_size 8 \
    --micro_batch_size 1 \
    --num_epochs 1 \
    --learning_rate 0.0004 \
    --cutoff_len 2048 \
    --val_set_size 0 \
    --lora_r 16 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target_modules '[gate_proj, down_proj, up_proj]' \
    --train_on_inputs False \
    --add_eos_token False \
    --group_by_length False \
    --prompt_template_name alpaca \
    --lr_scheduler 'cosine' \
    --warmup_steps 100 \
    --weight_unlike ${weight_unlike} \
    --threshold ${threshold} \
    --downsample 0.25

CUDA_VISIBLE_DEVICES=0 python merge.py \
    --base_model_name_or_path saved_models/llama2-13b-chat-hf \
    --peft_model_path ./saved_models/lora/${name} \
    --output_dir ./saved_models/${name}
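The script above sets `--learning_rate 0.0004`, `--warmup_steps 100`, and `--lr_scheduler 'cosine'`. The sketch below shows the learning rate such a schedule would produce, assuming it follows the usual linear-warmup-then-cosine-decay formula (as in the cosine schedule with warmup from the transformers library). The helper name `lr_at` and the `total_steps` value in the tests are hypothetical. Under this assumption the rate rises by 4e-06 per step during warmup, which matches the learning rates logged in the first issue further down this page.

```python
import math


def lr_at(step: int, base_lr: float = 0.0004,
          warmup_steps: int = 100, total_steps: int = 1000) -> float:
    """Learning rate after `step` optimizer steps (linear warmup, cosine decay)."""
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

With one epoch shorter than the 100 warmup steps, a run would end before the rate ever reaches 0.0004.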
3.2.2. Offline alignment (Shepherd as an example)

First, get the Shepherd dataset according to Sec. 2.1. Then use the following script:

threshold=1.2
weight_unlike=0.5
name=cut-1plus-13b
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 --master_port=1233 finetune_unlikelihood.py \
    --base_model saved_models/llama2-13b-chat-hf \
    --data-path data/Shepherd/train-alpaca.json \
    --output_dir ./saved_models/lora/${name} \
    --batch_size 8 \
    --micro_batch_size 1 \
    --num_epochs 1 \
    --learning_rate 0.0004 \
    --cutoff_len 2048 \
    --val_set_size 0 \
    --lora_r 16 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target_modules '[gate_proj, down_proj, up_proj]' \
    --train_on_inputs False \
    --add_eos_token False \
    --group_by_length False \
    --prompt_template_name alpaca \
    --lr_scheduler 'cosine' \
    --warmup_steps 100 \
    --weight_unlike ${weight_unlike} \
    --threshold ${threshold} \
    --downsample 0.25

CUDA_VISIBLE_DEVICES=0 python merge.py \
    --base_model_name_or_path saved_models/llama2-13b-chat-hf \
    --peft_model_path ./saved_models/lora/${name} \
    --output_dir ./saved_models/${name}
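The flags `--batch_size` and `--micro_batch_size` look like those of the alpaca-lora fine-tuning script, where the number of gradient accumulation steps is `batch_size // micro_batch_size`, divided again by the number of processes under distributed training. Whether finetune_unlikelihood.py follows the same convention is an assumption; the sketch below only illustrates that arithmetic. The function name and the clamp to at least 1 are additions for illustration.

```python
def grad_accum_steps(batch_size: int, micro_batch_size: int, world_size: int = 1) -> int:
    """Gradient accumulation steps under the alpaca-lora convention (assumed here)."""
    steps = batch_size // micro_batch_size
    if world_size > 1:
        # Each process handles its own micro-batches, so fewer accumulation steps are needed.
        steps = steps // world_size
    return max(steps, 1)  # clamp added for illustration


# Settings from the scripts above: batch_size=8, micro_batch_size=1, 8 GPUs.
print(grad_accum_steps(8, 1, 8))
```

Under this reading, the settings above give one micro-batch per GPU per optimizer step and no accumulation.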

4. Inference

4.1. Checkpoint Release

We present our CUT model, which has undergone four online iterations and successfully achieved a score of 91.36 points on AlpacaEval.

4.2. Inference Template

We follow the inference template used by Stanford Alpaca:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
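
A minimal sketch of filling this template in Python is shown below. The helper name `build_prompt` is hypothetical, and the prompt ends exactly where the template above ends; whether the model expects a trailing newline after `### Response:` is not stated here.

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:"
)


def build_prompt(instruction: str) -> str:
    """Insert a single instruction into the Alpaca-style template."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


print(build_prompt("List three primary colors."))
```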
4.3. CLI

FastChat provides a simple setup for those interested in trying our aligned model. After downloading the CUT model from Hugging Face, clone the FastChat repository:

git clone https://github.com/lm-sys/FastChat.git
cd FastChat

Install the required packages:

pip install --upgrade pip  # enable PEP 660 support
pip install -e .

Finally, run the following:

python -m fastchat.serve.cli --model-path xww033/cut-13b --conv-template alpaca

5. Testing

5.1. Generation-based Evaluation

We evaluate the model on AlpacaEval. Please first install the evaluation tool:

pip install alpaca-eval

The following script generates the model's responses to the 805 AlpacaEval instructions:

python scripts/generate.py --base_model_name_or_path <model checkpoint>

The generated responses are saved in <model checkpoint>/alpaca_eval.json, which is then submitted for GPT-4 evaluation:

alpaca_eval --model_outputs <model checkpoint>/alpaca_eval.json
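
For reference, AlpacaEval reads model outputs as a JSON list of objects with `instruction` and `output` fields, plus a `generator` field naming the model. The sketch below builds such a list. It is not scripts/generate.py, and the function names and the generator label are hypothetical.

```python
import json
from typing import Dict, List


def to_alpaca_eval_records(instructions: List[str], responses: List[str],
                           generator: str = "cut-13b") -> List[Dict[str, str]]:
    """Pair each instruction with its response in the layout AlpacaEval reads."""
    if len(instructions) != len(responses):
        raise ValueError("instructions and responses must have the same length")
    return [
        {"instruction": ins, "output": out, "generator": generator}
        for ins, out in zip(instructions, responses)
    ]


def dump_records(records: List[Dict[str, str]]) -> str:
    """Serialize the records as JSON text, ready to be written to alpaca_eval.json."""
    return json.dumps(records, ensure_ascii=False, indent=2)
```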
5.2. Ranking-based Evaluation

We evaluate the model's performance on ARC, HellaSwag, MMLU and TruthfulQA, utilizing the LLM Evaluation Harness.

BibTeX

@article{xu2023reasons,
  title={Reasons to Reject? Aligning Language Models with Judgments},
  author={Xu, Weiwen and Cai, Deng and Zhang, Zhisong and Lam, Wai and Shi, Shuming},
  journal={arXiv preprint arXiv:2312.14591},
  year={2023}
}

cut's Issues

Why does the loss become zero?

During my first round of online alignment, the loss was logged as:
{'loss': 1.6736, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.02}
{'loss': 0.0, 'learning_rate': 8.000000000000001e-06, 'epoch': 0.03}
{'loss': 0.0, 'learning_rate': 1.2e-05, 'epoch': 0.05}
{'loss': 0.0, 'learning_rate': 1.6000000000000003e-05, 'epoch': 0.07}
{'loss': 0.0, 'learning_rate': 2e-05, 'epoch': 0.08}
{'loss': 0.0, 'learning_rate': 2.4e-05, 'epoch': 0.1}
{'loss': 0.0, 'learning_rate': 2.8000000000000003e-05, 'epoch': 0.11}
{'loss': 0.0, 'learning_rate': 3.2000000000000005e-05, 'epoch': 0.13}
{'loss': 0.0, 'learning_rate': 3.6e-05, 'epoch': 0.15}
{'loss': 0.0, 'learning_rate': 4e-05, 'epoch': 0.16}
{'loss': 0.0, 'learning_rate': 4.4000000000000006e-05, 'epoch': 0.18}
{'loss': 0.0, 'learning_rate': 4.8e-05, 'epoch': 0.2}
{'loss': 0.0, 'learning_rate': 5.2000000000000004e-05, 'epoch': 0.21}
{'loss': 0.0, 'learning_rate': 5.6000000000000006e-05, 'epoch': 0.23}
{'loss': 0.0, 'learning_rate': 6e-05, 'epoch': 0.24}
{'loss': 0.0, 'learning_rate': 6.400000000000001e-05, 'epoch': 0.26}
{'loss': 0.0, 'learning_rate': 6.800000000000001e-05, 'epoch': 0.28}
{'loss': 0.0, 'learning_rate': 7.2e-05, 'epoch': 0.29}
{'loss': 0.0, 'learning_rate': 7.6e-05, 'epoch': 0.31}
{'loss': 0.0, 'learning_rate': 8e-05, 'epoch': 0.33}
{'loss': 0.0, 'learning_rate': 8.4e-05, 'epoch': 0.34}
{'loss': 0.0, 'learning_rate': 8.800000000000001e-05, 'epoch': 0.36}
{'loss': 0.0, 'learning_rate': 9.200000000000001e-05, 'epoch': 0.37}
{'loss': 0.0, 'learning_rate': 9.6e-05, 'epoch': 0.39}
{'loss': 0.0, 'learning_rate': 0.0001, 'epoch': 0.41}
{'loss': 0.0, 'learning_rate': 0.00010400000000000001, 'epoch': 0.42}
{'loss': 0.0, 'learning_rate': 0.00010800000000000001, 'epoch': 0.44}
{'loss': 0.0, 'learning_rate': 0.00011200000000000001, 'epoch': 0.46}
{'loss': 0.0, 'learning_rate': 0.000116, 'epoch': 0.47}
{'loss': 0.0, 'learning_rate': 0.00012, 'epoch': 0.49}
{'loss': 0.0, 'learning_rate': 0.000124, 'epoch': 0.5}
{'loss': 0.0, 'learning_rate': 0.00012800000000000002, 'epoch': 0.52}
{'loss': 0.0, 'learning_rate': 0.000132, 'epoch': 0.54}
{'loss': 0.0, 'learning_rate': 0.00013600000000000003, 'epoch': 0.55}
{'loss': 0.0, 'learning_rate': 0.00014, 'epoch': 0.57}
{'loss': 0.0, 'learning_rate': 0.000144, 'epoch': 0.59}
{'loss': 0.0, 'learning_rate': 0.000148, 'epoch': 0.6}
{'loss': 0.0, 'learning_rate': 0.000152, 'epoch': 0.62}
{'loss': 0.0, 'learning_rate': 0.00015600000000000002, 'epoch': 0.63}
{'loss': 0.0, 'learning_rate': 0.00016, 'epoch': 0.65}
{'loss': 0.0, 'learning_rate': 0.000164, 'epoch': 0.67}
{'loss': 0.0, 'learning_rate': 0.000168, 'epoch': 0.68}
{'loss': 0.0, 'learning_rate': 0.000172, 'epoch': 0.7}
{'loss': 0.0, 'learning_rate': 0.00017600000000000002, 'epoch': 0.72}
{'loss': 0.0, 'learning_rate': 0.00018, 'epoch': 0.73}
{'loss': 0.0, 'learning_rate': 0.00018400000000000003, 'epoch': 0.75}
{'loss': 0.0, 'learning_rate': 0.000188, 'epoch': 0.76}
{'loss': 0.0, 'learning_rate': 0.000192, 'epoch': 0.78}
{'loss': 0.0, 'learning_rate': 0.000196, 'epoch': 0.8}
{'loss': 0.0, 'learning_rate': 0.0002, 'epoch': 0.81}
{'loss': 0.0, 'learning_rate': 0.00020400000000000003, 'epoch': 0.83}
{'loss': 0.0, 'learning_rate': 0.00020800000000000001, 'epoch': 0.85}
{'loss': 0.0, 'learning_rate': 0.00021200000000000003, 'epoch': 0.86}
{'loss': 0.0, 'learning_rate': 0.00021600000000000002, 'epoch': 0.88}
{'loss': 0.0, 'learning_rate': 0.00022000000000000003, 'epoch': 0.89}
{'loss': 0.0, 'learning_rate': 0.00022400000000000002, 'epoch': 0.91}
{'loss': 0.0, 'learning_rate': 0.00022799999999999999, 'epoch': 0.93}
{'loss': 0.0, 'learning_rate': 0.000232, 'epoch': 0.94}
{'loss': 0.0, 'learning_rate': 0.000236, 'epoch': 0.96}
{'loss': 0.0, 'learning_rate': 0.00024, 'epoch': 0.98}
{'loss': 0.0, 'learning_rate': 0.000244, 'epoch': 0.99}
I used the script from the README. Why does the loss become 0?
Looking forward to your help.

Training anomaly

Training hangs after the first sample:
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
{'loss': 0.6894, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.0}
0%|▏ | 1/1151 [00:04<1:29:15, 4.66s/it]/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
[E ProcessGroupNCCL.cpp:828] [Rank 2] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1802339 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 4] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1802220 milliseconds before timing out.
ai-node-i-ycwg2tzshsqc6il45qda:22094:23204 [4] NCCL INFO [Service thread] Connection closed by localRank 4
[E ProcessGroupNCCL.cpp:828] [Rank 5] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1802760 milliseconds before timing out.
ai-node-i-ycwg2tzshsqc6il45qda:22092:23106 [0] NCCL INFO comm 0x3a2abaa0 rank 2 nranks 8 cudaDev 2 busId 69000 - Abort COMPLETE
ai-node-i-ycwg2tzshsqc6il45qda:22094:23109 [0] NCCL INFO comm 0xbd14350 rank 4 nranks 8 cudaDev 4 busId b3000 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:455] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[E ProcessGroupNCCL.cpp:460] To avoid data inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'

Traceback (most recent call last):
File "finetune_unlikelihood.py", line 558, in <module>
fire.Fire(train)
File "/usr/local/lib/python3.8/dist-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.8/dist-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.8/dist-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "finetune_unlikelihood.py", line 547, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 1553, in train
return inner_training_loop(
File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 1835, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 2690, in training_step
self.accelerator.backward(loss)
File "/usr/local/lib/python3.8/dist-packages/accelerate/accelerator.py", line 1985, in backward
loss.backward(**kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/function.py", line 274, in apply
return user_fn(self, *args)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/checkpoint.py", line 157, in backward
torch.autograd.backward(outputs_with_grad, args_with_grad)
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 200, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: NCCL communicator was aborted on rank 7. Original reason for failure was: [Rank 7] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=36, OpType=BROADCAST, Timeout(ms)=1800000) ran for 1803931 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 6] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1804929 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=37, OpType=ALLREDUCE, Timeout(ms)=1800000) ran for 1804265 milliseconds before timing out.
ai-node-i-ycwg2tzshsqc6il45qda:22096:23124 [0] NCCL INFO comm 0x50f0b560 rank 6 nranks 8 cudaDev 6 busId d5000 - Abort COMPLETE
