batch-icl's Introduction

Batch-ICL

This is the code for the paper 'Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning'.

Environment

pip install -r requirements.txt

Install transformers from source. Then copy the contents of the 'transformers' folder we provide into the 'transformers' library folder in your environment, replacing 'src/transformers/models/llama/modeling_llama.py', 'src/transformers/models/llama/__init__.py', 'src/transformers/__init__.py', and 'src/transformers/modeling_outputs.py'.
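For reference, a minimal sketch of the copy step, assuming the provided folder mirrors the library's src/transformers layout and that the paths below match your checkout (both paths are assumptions; adjust to your setup):

    import shutil
    from pathlib import Path

    patched = Path("transformers/src/transformers")               # folder shipped with this repo (assumed layout)
    target = Path("/path/to/your/transformers/src/transformers")  # your transformers source tree

    for rel in [
        "models/llama/modeling_llama.py",
        "models/llama/__init__.py",
        "__init__.py",
        "modeling_outputs.py",
    ]:
        shutil.copy(patched / rel, target / rel)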

Run

Run Batch-ICL:

bash Batch-ICL.sh

Run Batch-ICL without a pre-selected injection layer k (enumerate every layer to search for the optimal k):

bash best.sh

Run Batch-ICL for generation tasks:

bash generation.sh

Run Multiple “Epochs”:

bash multi_epoch.sh

Modify the model size, number of shots, and other parameters in the corresponding .sh file.
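For orientation, a minimal sketch of the procedure these scripts implement, using a stock Hugging Face checkpoint and a forward hook in place of the repo's patched modeling code; the checkpoint name, the layer index k, and the toy prompts are all placeholders, not values from this repo:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder checkpoint
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    k = 15  # injection layer; the scripts search/validate this

    # N separate 1-shot prompts, each one demonstration plus the query, and the bare query.
    one_shot_prompts = ["Q: 2+2? A: 4\nQ: 3+5? A:", "Q: 1+1? A: 2\nQ: 3+5? A:"]
    zero_shot_prompt = "Q: 3+5? A:"

    # 1) Collect the layer-k hidden state at the last position of each 1-shot pass.
    #    hidden_states[0] is the embedding output, so hidden_states[k] is the
    #    output of decoder layer k-1.
    states = []
    with torch.no_grad():
        for p in one_shot_prompts:
            out = model(**tok(p, return_tensors="pt"), output_hidden_states=True)
            states.append(out.hidden_states[k][:, -1, :])
    agg = torch.stack(states).mean(dim=0)  # aggregate the N 1-shot states

    # 2) Re-run the zero-shot query, substituting the aggregate at the same layer.
    def inject(module, inputs, output):
        h = output[0]
        h[:, -1, :] = agg  # replace the last-position state with the aggregate
        return (h,) + output[1:]

    handle = model.model.layers[k - 1].register_forward_hook(inject)
    with torch.no_grad():
        logits = model(**tok(zero_shot_prompt, return_tensors="pt")).logits
    handle.remove()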

batch-icl's Issues

Ablations

Hi, thanks for sharing your very interesting work!

Did you try using more than one layer, i.e. getting the hidden states from all layers and having a per-layer aggregate that you replaced in all layers of the network? Why did you choose to only work with 1 layer at a time?

Why do you substitute the zero-shot outputs with the aggregated 1-shot hidden states? Why not add instead of substitute?
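In code terms, the two variants being contrasted would differ only in the injection step; a toy illustration with placeholder shapes:

    import torch

    h = torch.randn(1, 8, 4096)  # zero-shot hidden states at layer k (batch, seq, dim); toy shapes
    agg = torch.randn(1, 4096)   # aggregate of the N 1-shot last-position states

    substituted = h.clone()
    substituted[:, -1, :] = agg              # what the paper/code does: overwrite
    added = h.clone()
    added[:, -1, :] = added[:, -1, :] + agg  # the proposed additive ablation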

Mismatch between paper and code

Hi, thanks for sharing your very interesting work!

In Section 3.2 of the paper you mention that "At a selected layer k, we collect the fi(xq) at the last position, i.e., the result of Eq. 5.", but in the code (see e.g. here) you use the last 2 tokens.

According to the paper, shouldn't it be just the last token?

Thanks in advance.
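For concreteness, the difference in code terms, with a toy tensor standing in for outputs.hidden_states[k] from a forward pass with output_hidden_states=True:

    import torch

    hidden_k = torch.randn(1, 16, 4096)  # (batch, seq_len, dim); placeholder shapes

    f_last = hidden_k[:, -1, :]       # the last token only, as Eq. 5 describes
    f_last_two = hidden_k[:, -2:, :]  # the last 2 tokens, as the linked code appears to use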

About machine translation task.

Hi, thank you very much for your contributions; I am really interested in this work. While reproducing the machine translation task, I encountered an issue. According to the paper, the best insertion layer for machine translation is found on the validation set, but I did not find any code using the validation set in the BatchICL_generation.py script, nor a function like get_layer; it seems the best layer is selected directly on the test set. Could you please provide the validation-set code, or explain this?
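For reference, a minimal sketch of what such a validation-based search could look like; evaluate is a hypothetical scoring function (e.g. BLEU on the validation split), not a function from this repo:

    def pick_injection_layer(model, val_set, evaluate, num_layers):
        # Score every candidate injection layer on the validation set only,
        # then reuse the winner on the test set.
        scores = {k: evaluate(model, k, val_set) for k in range(1, num_layers + 1)}
        return max(scores, key=scores.get)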

Unexpected keyword argument in __init__() of BaseModelOutputWithPast

I got the following error when running bash Batch-ICL.sh after following the instructions and substituting the transformers source code. It seems the class BaseModelOutputWithPast in transformers/modeling_outputs.py also needs to be changed, since the stock class does not accept 'hidden_before_mlp' as an initialization parameter.

Traceback (most recent call last):
  File "Batch-ICL.py", line 426, in <module>
    main(args.param_size, args.model_type, args.task)
  File "Batch-ICL.py", line 354, in main
    lay = get_layer(tokenizer,model,dev_df,val_df,tot_l,task)
  File "Batch-ICL.py", line 320, in get_layer
    hiddens,_ = batch_get_hidden(model,tokenizer,[record['prompt'] for record in records_1shots],k,val_df.shape[0],pos=-2)
  File "Batch-ICL.py", line 186, in batch_get_hidden
    outp = model(**encode_inputs, output_hidden_states = True, return_dict = True,out_attn=True,idea=0)#["hidden_states"]
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1678, in forward
    outputs = self.model(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1580, in forward
    return BaseModelOutputWithPast(
TypeError: __init__() got an unexpected keyword argument 'hidden_before_mlp'
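As a workaround, one could add the missing field to the dataclass in transformers/modeling_outputs.py; a sketch of the edited class, assuming the stock field layout of recent transformers versions (only the last field is new; inside modeling_outputs.py itself, ModelOutput comes from a relative import):

    from dataclasses import dataclass
    from typing import Optional, Tuple

    import torch

    from transformers.utils import ModelOutput

    @dataclass
    class BaseModelOutputWithPast(ModelOutput):
        last_hidden_state: torch.FloatTensor = None
        past_key_values: Optional[Tuple[Tuple[torch.FloatTensor]]] = None
        hidden_states: Optional[Tuple[torch.FloatTensor]] = None
        attentions: Optional[Tuple[torch.FloatTensor]] = None
        hidden_before_mlp: Optional[Tuple[torch.FloatTensor]] = None  # added for the patched llama forward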
