Giter Site home page Giter Site logo

rationalest's Introduction

Self-training with Few-shot Rationalization

Self-training with Few-shot Rationalization is a multi-task learning (MTL) based framework based on self-training language models with few labels for task and rationales.

Our paper was published at EMNLP 2021. The link to the paper can be found here: Self-training with Few-shot Rationalization: Teacher Explanations Aid Student in Few-shot NLU

Overview

screenshot

Using the Code

Installing the dependencies

To run the code, please install the packages from requirements.txt using the command: pip install -r requirements.txt

Downloading the datasets

Datasets used in this paper are from ERASER benchmark. The datasets can be downloaded from this link

Training our model

To run the code:

python run_st_rationale.py --task <dataset_dir> --model_dir <output_dir>  --seq_len 512 --sample_scheme uniform --sup_labels 200 --valid_split 0.2 --pt_teacher TFBertModel --pt_teacher_checkpoint bert-base-uncased --N_base 3 --sup_batch_size 8 --sup_epochs 70 --unsup_epochs 20 --model_type joint_neg_rwt_l_r_fine_tune_teacher 

Classification tasks: Set --do_pairwise for evidence and boolq dataset

Model type: Option model_type: joint_neg (MTL framework without reweighting), joint_neg_rwt_l_r (MTL framework with reweighting), joint_neg_rwt_l_r_fine_tune_teacher (MTL framework with reweighting and fine-tuning teacher)

Training and validation: sup_labels denote the total number of available labeled examples for each class, where valid_split uses a fraction of those labels as validation set for early stopping. Set sup_labels to -1 to use all training labels. Set valid_split to -1 to use the available test data as the validation set.

Fine-tuning batch size: Set sup_batch_size to a small number for few-shot fine-tuning of the teacher model. In case of many training labels, set sup_batch_size to a higher value for faster training.

Task name: Option task Provide the inout dataset directory for training.

HuggingFace Transformers: To use different pre-trained language models from HuggingFace, set pt_teacher and pt_teacher_checkpoint to corresponding model versions available from here. A default set of pre-trained language models is available at huggingface_utils.py.

Citation

If you find our paper useful, please cite the following:

@InProceedings{bhat2021self-training,
author = {Bhat, Meghana Moorthy and Sordoni, Alessandro and Mukherjee, Subhabrata (Subho)},
title = {Self-training with Few-shot Rationalization: Teacher Explanations Aid Student in Few-shot NLU},
booktitle = {EMNLP 2021},
year = {2021},
month = {November},
abstract = {While pre-trained language models have obtained state-of-the-art performance for several natural language understanding tasks, they are quite opaque in terms of their decision-making process. While some recent works focus on rationalizing neural predictions by highlighting salient concepts in text as justifications or rationales, they rely on thousands of labeled training examples for both task labels as well as annotated rationales for every instance. Such extensive large-scale annotations are infeasible to obtain for many tasks. To this end, we develop a multi-task teacher-student framework based on self-training pre-trained language models with limited task-specific labels and rationales and judicious sample selection to learn from informative pseudo-labeled examples. We study several characteristics of what constitutes a good rationale and demonstrate that the neural model performance can be significantly improved by making it aware of its rationalized predictions particularly in low-resource settings. Extensive experiments in several benchmark datasets demonstrate the effectiveness of our approach.},
url = {https://www.microsoft.com/en-us/research/publication/self-training-with-few-shot-rationalization/},
}

Contact

For questions, please contact Meghana Bhat

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

rationalest's People

Contributors

meghu2791 avatar microsoftopensource avatar subhomj avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.