This project forked from deep-spin/explainable-qe-shared-task

IST-Unbabel 2021 Submission for the Quality Estimation Shared Task

Home Page: https://aclanthology.org/2021.eval4nlp-1.14

License: MIT License

explainable-qe-shared-task's Introduction

Minimalist explainable XLM-R QE system

This repo contains the code for the IST-Unbabel 2021 Submission for the Explainable Quality Estimation Shared Task.

Data from the shared task:

Preprocess the entire MLQE-PE dataset into the shared task format and download/preprocess the test set:

mkdir data
git clone https://github.com/sheffieldnlp/mlqe-pe
bash preprocess_mlqepe.sh
python3 preprocess_mlqepe.py --input-dir mlqe-pe/data/ --output-dir data/
bash download_and_preprocess_test_data.sh
rm -rf mlqe-pe

Installation:

pip install -r requirements.txt
pip install -e .

Training:

Pass a config file via -f:

python3 cli.py train -f configs/xlmr-adapters-shared-task-mlqepe-all-all.yaml

See more config files in the configs/ folder. PyTorch models can be found in the model/ folder.

Evaluating:

python3 scripts/evaluate_sentence_level.py train --testset data/ro-en/dev --checkpoint path/to/model.ckpt

For word-level models:

python3 scripts/evaluate_word_level.py train --testset data/ro-en/dev --checkpoint path/to/model.ckpt

Extracting explanations:

For baseline explainers (gradient, leave-one-out, etc.), use explain.py. For example:

python3 scripts/explain.py \
  --testset data/ro-en/dev \
  --checkpoint path/to/model.ckpt \
  --explainer ig \
  --save experiments/explanations/roen_ig/ \
  --batch-size 1

For extracting attention, use explain_attn.py. For example:

python3 scripts/explain_attn.py \
  --testset data/ro-en/dev \
  --checkpoint path/to/model.ckpt \
  --save experiments/explanations/roen_attn \
  --batch-size 1

Several folders are created, each named with the path passed via --save as a prefix, one per aggregation level:

  • the entire model (layers averaged via scalar mix)
  • each layer (heads averaged)
  • each head (the "rows" of the attention map averaged)
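The three aggregation levels can be sketched in plain Python as follows. This is only an illustration, not the repository's code: the nested-list layout (layers, heads, rows, columns), the function names, and the softmax-normalized `mix_logits` are assumptions made for the example, and "averaging the rows" is read here as producing one score per column (key token).

```python
from math import exp
from statistics import fmean

def head_scores(attn_map):
    """Average the rows of one attention map (rows = queries, columns = keys)."""
    return [fmean(column) for column in zip(*attn_map)]

def layer_scores(heads):
    """Average the per-head token scores of one layer."""
    per_head = [head_scores(attn_map) for attn_map in heads]
    return [fmean(token) for token in zip(*per_head)]

def model_scores(layers, mix_logits):
    """Combine layers with softmax-normalized weights (a scalar-mix-style average).

    mix_logits is a hypothetical parameter: one unnormalized weight per layer.
    """
    weights = [exp(x) for x in mix_logits]
    total = sum(weights)
    per_layer = [layer_scores(heads) for heads in layers]
    return [
        sum(w / total * score for w, score in zip(weights, token))
        for token in zip(*per_layer)
    ]

# Toy input: 2 layers x 2 heads x (3 x 3) attention maps whose rows sum to 1.
uniform = [[1 / 3] * 3 for _ in range(3)]
peaked = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.9, 0.05, 0.05]]
layers = [[uniform, peaked], [peaked, peaked]]

per_head = head_scores(peaked)                # scores for a single head
per_layer = layer_scores(layers[0])           # heads averaged within layer 0
per_model = model_scores(layers, [0.0, 0.0])  # equal weights: plain layer average
```

Because every row of an attention map sums to 1, the per-head scores also sum to 1, which makes scores from different heads and layers directly comparable.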

If you want explanations in terms of attention * norm(values), also pass these flags:

  --norm-attention
  --norm-strategy weighted_norm
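One plausible reading of attention * norm(values) is to scale each attention weight by the L2 norm of the value vector it attends to. The sketch below shows that reading only; what `--norm-strategy weighted_norm` actually computes is not documented here, and the function names and toy inputs are hypothetical.

```python
from math import sqrt

def l2_norm(vector):
    """Euclidean norm of one value vector."""
    return sqrt(sum(x * x for x in vector))

def norm_weighted_attention(attn_map, values):
    """Scale attn_map[i][j] by the L2 norm of value vector j (assumed definition)."""
    norms = [l2_norm(v) for v in values]
    return [[a * n for a, n in zip(row, norms)] for row in attn_map]

attn_map = [[0.5, 0.5], [0.9, 0.1]]
values = [[3.0, 4.0], [0.0, 1.0]]  # L2 norms: 5.0 and 1.0
weighted = norm_weighted_attention(attn_map, values)
```

Under this reading, a token that receives high attention but has a small value vector contributes less than its attention weight alone would suggest.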

We also provide scripts for extracting explanations with other methods, e.g., DiffMask and Attention Flow/Rollout.

Evaluating explanations

Use the script evaluate_explanations.py. For example:

python3 scripts/evaluate_explanations.py \
  --gold_sentence_scores_fname data/et-en/dev.da \
  --gold_explanations_fname_mt data/et-en/dev.tgt-tags \
  --gold_explanations_fname_src data/et-en/dev.src-tags \
  --model_sentence_scores_fname experiments/explanations/eten_attn_head_18_3/sentence_scores.txt \
  --model_explanations_fname_mt experiments/explanations/eten_attn_head_18_3/mt_scores.txt \
  --model_explanations_fname_src experiments/explanations/eten_attn_head_18_3/source_scores.txt \
  --model_fp_mask_mt experiments/explanations/eten_attn_head_18_3/mt_fp_mask.txt \
  --model_fp_mask_src experiments/explanations/eten_attn_head_18_3/source_fp_mask.txt \
  --reduction sum \
  --transform none

The --reduction flag controls how word-piece scores are aggregated: none, first, sum, mean, or max. The --transform flag can be:

  • pre: apply sigmoid(abs(.)) element-wise for each score BEFORE aggregating word pieces
  • pos: apply sigmoid(abs(.)) element-wise for each score AFTER aggregating word pieces
  • none: do not apply any transformation
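The interaction between the two flags can be sketched as below. This is a minimal illustration under stated assumptions, not the script's implementation: the `aggregate` function and the `word_ids` input (the index of the word each piece belongs to) are hypothetical.

```python
from math import exp
from statistics import fmean

def sigmoid_abs(x):
    """sigmoid(abs(x)), the element-wise transform described above."""
    return 1.0 / (1.0 + exp(-abs(x)))

REDUCTIONS = {
    "first": lambda xs: xs[0],
    "sum": sum,
    "mean": fmean,
    "max": max,
}

def aggregate(piece_scores, word_ids, reduction="sum", transform="none"):
    """Merge word-piece scores into word scores.

    piece_scores: one score per word piece.
    word_ids: index of the word each piece belongs to (hypothetical format).
    """
    if transform == "pre":  # transform BEFORE aggregating word pieces
        piece_scores = [sigmoid_abs(s) for s in piece_scores]
    if reduction == "none":  # keep one score per word piece
        word_scores = list(piece_scores)
    else:
        groups = {}
        for score, wid in zip(piece_scores, word_ids):
            groups.setdefault(wid, []).append(score)
        word_scores = [REDUCTIONS[reduction](groups[wid]) for wid in sorted(groups)]
    if transform == "pos":  # transform AFTER aggregating word pieces
        word_scores = [sigmoid_abs(s) for s in word_scores]
    return word_scores

scores = [0.2, -0.5, 0.1, 0.4]  # four word pieces
word_ids = [0, 0, 1, 2]         # the first two pieces form one word
summed = aggregate(scores, word_ids, reduction="sum")
pre = aggregate(scores, word_ids, reduction="sum", transform="pre")
pos = aggregate(scores, word_ids, reduction="sum", transform="pos")
```

With `sum`, the order matters: `pre` adds two transformed pieces for the first word, so its score can exceed 1, while `pos` transforms the already summed value and stays below 1.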

This transformation can be useful for explainers that may return negative values. The sigmoid itself does not affect the results: it is monotonic and the metrics are rank-based, so the ordering of the scores is unchanged. Scores between 0 and 1 are still convenient if you want to apply a threshold, for example to compute accuracy.
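The following sketch illustrates why a monotonic function such as the sigmoid leaves a rank-based metric unchanged, while taking the absolute value can change it. AUC is used here only as a stand-in for a rank-based metric; the toy scores and labels are made up for the example.

```python
from math import exp

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 0, 0, 1, 0]

# Sigmoid is monotonic, so the ordering and the AUC are preserved.
raw = [2.3, 0.4, 1.1, 0.9, 0.1]
squashed = [sigmoid(s) for s in raw]
same = auc(raw, labels) == auc(squashed, labels)

# abs() is not monotonic: a large negative score moves to the top of the ranking.
signed = [-2.3, 0.4, 1.1, 0.9, 0.1]
changed = auc(signed, labels) != auc([abs(s) for s in signed], labels)
```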

Prepare Submission:

Aggregate subword units:

python3 scripts/aggregate_explanations.py \
  --model_explanations_dname experiments/explanations/roen_ig/ \
  --reduction sum \
  --transform none

Then create a metadata.txt file and zip all files. The following script does both:

python3 scripts/prepare_submission.py \
  --explainer experiments/explanations/roen_ig/ \
  --save submission.zip \
  --team "Team Name" \
  --track "constrained" \
  --desc "Simple description of the model + explainer."

A file called submission.zip will be created in the working directory with the explanations of ig for ro-en.
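For readers who want to see what "create a metadata.txt file and zip all files" amounts to, here is a rough sketch using only the standard library. It is not the code of prepare_submission.py: the `build_submission` function is hypothetical, and the three-line metadata layout is a placeholder rather than the official shared-task format.

```python
import tempfile
import zipfile
from pathlib import Path

def build_submission(explanations_dir, zip_path, team, track, description):
    """Write metadata.txt into explanations_dir and zip every file in it.

    The three-line metadata layout is a placeholder, not the official format.
    """
    explanations_dir = Path(explanations_dir)
    metadata = explanations_dir / "metadata.txt"
    metadata.write_text(f"{team}\n{track}\n{description}\n", encoding="utf-8")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(explanations_dir.iterdir()):
            if path.is_file():
                # Store files flat, without the directory prefix.
                archive.write(path, arcname=path.name)
    return Path(zip_path)

# Demo on a throwaway directory with one dummy explanation file.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "roen_ig"
    work.mkdir()
    (work / "mt_scores.txt").write_text("0.1 0.9\n", encoding="utf-8")
    out = build_submission(work, Path(tmp) / "submission.zip",
                           "Team Name", "constrained", "Model + explainer.")
    with zipfile.ZipFile(out) as archive:
        names = sorted(archive.namelist())
```

The archive is written outside the explanations directory so that it does not end up including itself.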

Bibtex entry

@inproceedings{treviso-etal-2021-ist,
    title = "{IST}-Unbabel 2021 Submission for the Explainable Quality Estimation Shared Task",
    author = "Treviso, Marcos  and
      Guerreiro, Nuno M.  and
      Rei, Ricardo  and
      Martins, Andr{\'e} F. T.",
    booktitle = "Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.eval4nlp-1.14",
    pages = "133--145",
}

License

MIT.
