
EBERT

This repository serves as the official code release of the paper EBERT: Efficient BERT Inference with Dynamic Structured Pruning (published at Findings of ACL 2021).

EBERT is a dynamic structured pruning algorithm for efficient BERT inference. Unlike previous methods that prune the model weights statically, so that the same pruned network is used for every input, EBERT dynamically determines and prunes the unimportant heads in the multi-head self-attention layers and the unimportant structured computations in the feed-forward networks for each input sample at run-time.
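For intuition, below is a minimal, self-contained sketch of per-sample structured pruning in a BERT-style FFN block: a small predictor scores the intermediate channels from the pooled input, and only the top-scoring channels are kept. The DynamicFFN class, the mean pooling, and the top-k thresholding are illustrative assumptions for this sketch, not the exact mechanism used in EBERT.

import torch
import torch.nn as nn

class DynamicFFN(nn.Module):
    """Illustrative per-sample structured pruning of a BERT-style FFN.
    A sketch of the general idea, not the EBERT implementation."""

    def __init__(self, hidden=768, inter=3072, keep_ratio=0.5):
        super().__init__()
        self.fc1 = nn.Linear(hidden, inter)
        self.fc2 = nn.Linear(inter, hidden)
        self.predictor = nn.Linear(hidden, inter)  # scores each intermediate channel
        self.keep = int(inter * keep_ratio)

    def forward(self, x):                              # x: (batch, seq_len, hidden)
        pooled = x.mean(dim=1)                         # per-sample summary of the input
        scores = self.predictor(pooled)                # (batch, inter)
        idx = scores.topk(self.keep, dim=-1).indices   # keep top-k channels per sample
        mask = torch.zeros_like(scores).scatter_(1, idx, 1.0)
        h = torch.relu(self.fc1(x)) * mask.unsqueeze(1)  # zero out pruned channels
        return self.fc2(h)

ffn = DynamicFFN()
out = ffn(torch.randn(2, 128, 768))                    # -> (2, 128, 768)

The same idea applies to the attention layers, where the mask is predicted per head rather than per channel.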

Prerequisites

The code has the following dependencies:

  • python >= 3.8.5
  • pytorch >= 1.4.0
  • transformers == 3.3.1
    Note: transformers v3.3.1 has a bug when the evaluation strategy is epoch, so you need to make the following change in the transformers library:
--- a/src/transformers/training_args.py
+++ b/src/transformers/training_args.py
@@ -323,7 +323,7 @@ class TrainingArguments:
     def __post_init__(self):
         if self.disable_tqdm is None:
             self.disable_tqdm = logger.getEffectiveLevel() > logging.WARN
-        if self.evaluate_during_training is not None:
+        if self.evaluate_during_training:
             self.evaluation_strategy = (
                 EvaluationStrategy.STEPS if self.evaluate_during_training else EvaluationStrategy.NO
             )

Usage

We provide script files for training and validation in the scripts folder; you can run them from the repo root, e.g. bash scripts/eval_glue.sh. In each script, there are several arguments to modify before running:

  • --data_dir: path to the dataset (GLUE or SQuAD)
  • MODEL_PATH or --model_name_or_path: path to the trained model folder
  • TASK_NAME: task name in GLUE (SST-2, MNLI, ...)
  • RUN_NAME: name of the current experiment, which determines the save path and the wandb log name
  • other hyper-parameters, e.g., head_mask_mode

You can download the original pretrained BERT and RoBERTa models from HuggingFace.
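For example, the checkpoints can be fetched with the transformers library; the local save directory below is just an example name for the folder that MODEL_PATH or --model_name_or_path would point to.

from transformers import BertModel, BertTokenizer

# Download the pretrained BERT-base checkpoint and save it locally.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
tokenizer.save_pretrained("./bert-base-uncased")
model.save_pretrained("./bert-base-uncased")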

Citation

If you find this library useful for your work, please cite:

@inproceedings{liu-etal-2021-ebert,
    title = "{EBERT}: Efficient {BERT} Inference with Dynamic Structured Pruning",
    author = "Liu, Zejian  and
              Li, Fanrong  and
              Li, Gang  and
              Cheng, Jian",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.425",
    doi = "10.18653/v1/2021.findings-acl.425",
    pages = "4814--4823",
}

Issues

A question about inference

Hi, thank you for your impressive work :)
As mentioned in your paper, "For MHA, heads with mask ’0’ will not be executed. For FFN, as matrix-matrix multiplication can be transformed to multiple matrix vector multiplications, we only need to complete part of computations where vector’s mask is not zero."
However, it seems that in modeling_ebert.py you simply multiply the mask with the hidden states or attention probs, so the computation is not actually reduced and the inference FLOPs are only computed theoretically. Is that right?
But if you actually prune the channels and heads, the feature dimension (e.g. 768) of the hidden states would shrink, causing a mismatch with the linear layers (e.g. in the FFN 768 -> 3072 -> 768 the weight matrix is (3072, 768), so if the intermediate dim < 3072 the multiplication is invalid). How did you deal with this mismatch?
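A small standalone sketch (not code from this repository) of the FLOP-saving argument in the quoted sentence: index-selecting the kept rows of the first FFN weight and the kept columns of the second is equivalent to masking the full intermediate activation, and the output dimension stays 768, so no shape mismatch arises downstream. The mask m and the shapes here are toy values for illustration.

import torch

hidden, inter = 768, 3072
x = torch.randn(1, 128, hidden)            # (batch, seq_len, hidden)
W1 = torch.randn(inter, hidden)            # first FFN weight, 768 -> 3072
W2 = torch.randn(hidden, inter)            # second FFN weight, 3072 -> 768
m = torch.rand(inter) > 0.5                # toy per-sample binary mask over channels

idx = m.nonzero(as_tuple=True)[0]          # indices of kept intermediate channels
h = torch.relu(x @ W1[idx].T)              # (1, 128, kept): only kept rows of W1 are used
y = h @ W2[:, idx].T                       # (1, 128, 768): only kept columns of W2 are used

# Equivalent to masking the full intermediate activation, but with fewer FLOPs;
# the output dimension remains 768, so the following layers are unaffected.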

code

Hello, when will you release all the code?
