batch-icl's Introduction

Batch-ICL

This is the code for the paper 'Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning'.

Environment

pip install -r requirements.txt

Install transformers from source. Then copy the contents of the 'transformers' folder we provide into the 'transformers' library folder in your environment, replacing 'src/transformers/models/llama/modeling_llama.py', 'src/transformers/models/llama/__init__.py', 'src/transformers/__init__.py', and 'src/transformers/modeling_outputs.py'.
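For reference, a minimal sketch of the copy step, assuming the provided folder mirrors the library's src/transformers layout and that the paths below match your checkout (both paths are assumptions; adjust to your setup):

    import shutil
    from pathlib import Path

    patched = Path("transformers/src/transformers")               # folder shipped with this repo (assumed layout)
    target = Path("/path/to/your/transformers/src/transformers")  # your transformers source tree

    for rel in [
        "models/llama/modeling_llama.py",
        "models/llama/__init__.py",
        "__init__.py",
        "modeling_outputs.py",
    ]:
        shutil.copy(patched / rel, target / rel)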

Run

Run Batch-ICL:

bash Batch-ICL.sh

Run Batch-ICL without a pre-selected injection layer k (enumerate every layer to search for the optimal k):

bash best.sh

Run Batch-ICL for generation tasks:

bash generation.sh

Run Multiple “Epochs”:

bash multi_epoch.sh

Modify the model size, number of shots, and other parameters in the corresponding .sh file.
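For orientation, a minimal sketch of the procedure these scripts implement, using a stock Hugging Face checkpoint and a forward hook in place of the repo's patched modeling code; the checkpoint name, the layer index k, and the toy prompts are all placeholders, not values from this repo:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder checkpoint
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    k = 15  # injection layer; the scripts search/validate this

    # N separate 1-shot prompts, each one demonstration plus the query, and the bare query.
    one_shot_prompts = ["Q: 2+2? A: 4\nQ: 3+5? A:", "Q: 1+1? A: 2\nQ: 3+5? A:"]
    zero_shot_prompt = "Q: 3+5? A:"

    # 1) Collect the layer-k hidden state at the last position of each 1-shot pass.
    #    hidden_states[0] is the embedding output, so hidden_states[k] is the
    #    output of decoder layer k-1.
    states = []
    with torch.no_grad():
        for p in one_shot_prompts:
            out = model(**tok(p, return_tensors="pt"), output_hidden_states=True)
            states.append(out.hidden_states[k][:, -1, :])
    agg = torch.stack(states).mean(dim=0)  # aggregate the N 1-shot states

    # 2) Re-run the zero-shot query, substituting the aggregate at the same layer.
    def inject(module, inputs, output):
        h = output[0]
        h[:, -1, :] = agg  # replace the last-position state with the aggregate
        return (h,) + output[1:]

    handle = model.model.layers[k - 1].register_forward_hook(inject)
    with torch.no_grad():
        logits = model(**tok(zero_shot_prompt, return_tensors="pt")).logits
    handle.remove()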

batch-icl's Issues

Ablations

Hi, thanks for sharing your very interesting work!

Did you try using more than one layer, i.e. getting the hidden states from all layers and having a per-layer aggregate that you replaced in all layers of the network? Why did you choose to only work with 1 layer at a time?

Why do you substitute the zero-shot outputs with the aggregated 1-shot hidden states? Why not add instead of substitute?
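In code terms, the two variants being contrasted would differ only in the injection step; a toy illustration with placeholder shapes:

    import torch

    h = torch.randn(1, 8, 4096)  # zero-shot hidden states at layer k (batch, seq, dim); toy shapes
    agg = torch.randn(1, 4096)   # aggregate of the N 1-shot last-position states

    substituted = h.clone()
    substituted[:, -1, :] = agg              # what the paper/code does: overwrite
    added = h.clone()
    added[:, -1, :] = added[:, -1, :] + agg  # the proposed additive ablation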

Mismatch between paper and code

Hi, thanks for sharing your very interesting work!

In Section 3.2 of the paper you mention that "At a selected layer k, we collect the fi(xq) at the last position, i.e., the result of Eq. 5.", but in the code (see e.g. here) you use the last 2 tokens.

According to the paper, shouldn't it be just the last token?

Thanks in advance.
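For concreteness, the difference in code terms, with a toy tensor standing in for outputs.hidden_states[k] from a forward pass with output_hidden_states=True:

    import torch

    hidden_k = torch.randn(1, 16, 4096)  # (batch, seq_len, dim); placeholder shapes

    f_last = hidden_k[:, -1, :]       # the last token only, as Eq. 5 describes
    f_last_two = hidden_k[:, -2:, :]  # the last 2 tokens, as the linked code appears to use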

About machine translation task.

Hi, thank you very much for your contributions; I am really interested in this work. While reproducing the machine translation task, I encountered an issue. According to the paper, the best insertion layer for machine translation is found on the validation set, but I did not find any code using the validation set in the BatchICL_generation.py script, nor a function like get_layer; it seems the best layer is selected directly on the test set. Could you please provide the validation-set code, or explain this?
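For reference, a minimal sketch of what such a validation-based search could look like; evaluate is a hypothetical scoring function (e.g. BLEU on the validation split), not a function from this repo:

    def pick_injection_layer(model, val_set, evaluate, num_layers):
        # Score every candidate injection layer on the validation set only,
        # then reuse the winner on the test set.
        scores = {k: evaluate(model, k, val_set) for k in range(1, num_layers + 1)}
        return max(scores, key=scores.get)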

Unexpected keyword argument in __init__() of BaseModelOutputWithPast

I got the following error when running bash Batch-ICL.sh after following the instructions and substituting the transformers source code. It seems the class BaseModelOutputWithPast in transformers/modeling_outputs.py also needs to be changed, since the stock class does not accept 'hidden_before_mlp' as an initialization parameter.

Traceback (most recent call last):
  File "Batch-ICL.py", line 426, in <module>
    main(args.param_size, args.model_type, args.task)
  File "Batch-ICL.py", line 354, in main
    lay = get_layer(tokenizer,model,dev_df,val_df,tot_l,task)
  File "Batch-ICL.py", line 320, in get_layer
    hiddens,_ = batch_get_hidden(model,tokenizer,[record['prompt'] for record in records_1shots],k,val_df.shape[0],pos=-2)
  File "Batch-ICL.py", line 186, in batch_get_hidden
    outp = model(**encode_inputs, output_hidden_states = True, return_dict = True,out_attn=True,idea=0)#["hidden_states"]
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1678, in forward
    outputs = self.model(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1580, in forward
    return BaseModelOutputWithPast(
TypeError: __init__() got an unexpected keyword argument 'hidden_before_mlp'
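As a workaround, one could add the missing field to the dataclass in transformers/modeling_outputs.py; a sketch of the edited class, assuming the stock field layout of recent transformers versions (only the last field is new; inside modeling_outputs.py itself, ModelOutput comes from a relative import):

    from dataclasses import dataclass
    from typing import Optional, Tuple

    import torch

    from transformers.utils import ModelOutput

    @dataclass
    class BaseModelOutputWithPast(ModelOutput):
        last_hidden_state: torch.FloatTensor = None
        past_key_values: Optional[Tuple[Tuple[torch.FloatTensor]]] = None
        hidden_states: Optional[Tuple[torch.FloatTensor]] = None
        attentions: Optional[Tuple[torch.FloatTensor]] = None
        hidden_before_mlp: Optional[Tuple[torch.FloatTensor]] = None  # added for the patched llama forward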
