Giter Site home page Giter Site logo

sigadavid96 / dart Goto Github PK

View Code? Open in Web Editor NEW

This project forked from zjunlp/dart

0.0 1.0 0.0 1.39 MB

Code for the ICLR2022 paper "Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners"

License: MIT License

Shell 9.77% Python 90.23%

dart's Introduction

Improvement of DART [Assignment 4 11711] by dsiga, rdharani, skandi

Codes are forked and inspired from DART GitHub

Re - Implementation for ICLR2022 paper Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners.

The following Environment is required to run the codes

  • [email protected], would recommend making a conda environement
  • Use pip install -r requirements.txt to install dependencies.

Data source

  • 16-shot GLUE dataset from LM-BFF.
  • Generated data consists of 5 random splits (13/21/42/87/100) for a task, each has 16 samples.

Data source creation in our environment, please use the following codes in order, inside the DART Directory post cloning our reporsitory

  mkdir data
  cd data
  wget https://nlp.cs.princeton.edu/projects/lm-bff/datasets.tar
  tar xvf datasets.tar
  cd ..
  python tools/generate_k_shot_data.py

How to run (remains same as the author's codes, hence below)

  • To run across each 5 splits in a task, use run.py:
    • In the arguments, encoder="inner" is the method proposed in the paper where verbalizers are other trainable tokens; encoder="manual" means verbalizers are selected fixed tokens; encoder="lstm" refers to the P-Tuning method.
$ python run.py -h
usage: run.py [-h] [--encoder {manual,lstm,inner,inner2}] [--task TASK]
              [--num_splits NUM_SPLITS] [--repeat REPEAT] [--load_manual]
              [--extra_mask_rate EXTRA_MASK_RATE]
              [--output_dir_suffix OUTPUT_DIR_SUFFIX]

optional arguments:
  -h, --help            show this help message and exit
  --encoder {manual,lstm,inner,inner2}
  --task TASK
  --num_splits NUM_SPLITS
  --repeat REPEAT
  --load_manual
  --extra_mask_rate EXTRA_MASK_RATE
  --output_dir_suffix OUTPUT_DIR_SUFFIX, -o OUTPUT_DIR_SUFFIX
  • To train and evaluate on a single split with details recorded, use inference.py.
    • Before running, [task_name, label_list, prompt_type] should be configured in the code.
    • prompt_type="none" refers to fixed verbalizer training, while "inner" refers to the method proposed in the paper. ("inner2" is deprecated 2-stage training)
  • To find optimal hyper-parameters for each task-split and reproduce our result, please use sweep.py:
    • Please refer to documentation for WandB for more details.
    • ❗NOTE: we follow LM-BFF to use the corresponding automatic search results with different data split seeds.

For example, To run the results for SST-2 on all splits of 16 shot data, you will have to run the below : (in DART Directory)

python3  run.py --task SST-2

dart's People

Contributors

sigadavid96 avatar zxlzr avatar riroaki avatar dependabot[bot] avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.