Create and activate a conda environment with:
conda env create -f environment.yml
conda activate outliersvsfreq
To replicate the fine-tuning of the MultiBERTs checkpoints, move to the multiberts folder and run the download script. This takes some disk space (~15 GB or more):
bash donwload_and_convert_to_pt_multiberts_seed_1.sh
Then go back to the main directory and run:
bash run_multiberts.sh
To replicate the results on ViT, run:
bash run_vit.sh
To replicate the remaining results in the paper, first run:
bash run_glue.sh bert-base-uncased
NOTE: This runs all the fine-tuning for bert-base-uncased and all the experiments zeroing out outliers; it takes a long time to run.
Then, to run all the experiments in the paper except the bert-medium pretraining, use:
bash run_experiments.sh
Inspect the file to see which experiments are run.
All the plots in the paper should be replicable using the paper_plots.ipynb notebook.
Finally, for the bert-medium pretraining, the following should run the SPLIT case.
NOTE: adjust the checkpointing using the Hugging Face settings, as this is too variable to be fixed for all machines.
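For example, checkpointing frequency can be controlled with the standard Hugging Face Trainer arguments; the values below are illustrative, not the ones used in the paper:

```shell
# Illustrative Hugging Face Trainer checkpointing flags (append to the
# command below); tune the values to your disk space and machine.
--save_strategy steps \
--save_steps 10000 \
--save_total_limit 2
```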
cd pre_training_bert_medium
python -m outliersvsfreq.experiments.pretrain_bert_medium \
--output_dir mlm_run \
--preprocessing_num_workers 8 \
--do_train true \
--per_device_train_batch_size 128 \
--per_device_eval_batch_size 128 \
    --warmup_steps 30000 \
--weight_decay 0.01 \
--num_train_epochs 4 \
--do_eval true \
--validation_split_percentage 1 \
--learning_rate 1.e-4 \
--max_seq_length 256 \
--do_split_in_sentences true \
--is_test false \
--randomize_tokens false \
    --few_special_tokens false
Changing the last two flags selects the different data preparations.
For RANDOMIZE_TOKENS, they should be set to
--randomize_tokens true
--few_special_tokens false
and for ONE_SEP to
--randomize_tokens false
--few_special_tokens true
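The three data preparations above can also be scripted in one go. This is a sketch assuming the same script and flags as the command above (the other arguments are omitted for brevity):

```shell
# Map each data-preparation variant (SPLIT, RANDOMIZE_TOKENS, ONE_SEP)
# to its two flags and build the corresponding launch commands.
# The commands are only printed here; run them as needed.
cmds=()
for variant in SPLIT RANDOMIZE_TOKENS ONE_SEP; do
  case "$variant" in
    SPLIT)            rand=false; few=false ;;
    RANDOMIZE_TOKENS) rand=true;  few=false ;;
    ONE_SEP)          rand=false; few=true  ;;
  esac
  cmds+=("python -m outliersvsfreq.experiments.pretrain_bert_medium --output_dir mlm_run_${variant} --randomize_tokens ${rand} --few_special_tokens ${few}")
done
printf '%s\n' "${cmds[@]}"
```

Executing each entry instead of printing it launches the three pretraining runs sequentially.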
NOTE: You may need to download a model for spaCy to work:
python -m spacy download "en_core_web_sm"