This project is forked from zengyan-97/multit-c-dialog: a multi-task learning approach for conditioned response generation (NAACL 2021).

MultiT-C-Dialog

This repository is the official PyTorch implementation of A Simple and Efficient Multi-Task Learning Approach for Conditioned Dialogue Generation (NAACL 2021).

In our experiments, we fine-tuned BERT for conditioned dialogue generation.
Researchers have recently released several large pre-trained dialogue models built on Reddit/Twitter data.
These models all use an auto-regressive training objective, which is straightforward to apply to conditioned language/dialogue generation.
However, the conditioned language encoding task in our approach uses bi-directional attention, so a masked language modeling objective is needed instead.
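The difference between the two objectives can be pictured with attention masks (a minimal NumPy sketch for illustration, not the repository's code):

```python
import numpy as np

def causal_mask(n):
    # Auto-regressive objective: token i may only attend to positions <= i,
    # so the mask is lower-triangular.
    return np.tril(np.ones((n, n), dtype=bool))

def bidirectional_mask(n):
    # Masked-LM objective: every token attends to every position.
    return np.ones((n, n), dtype=bool)
```

With bi-directional attention the model sees the full sequence at once, which is why masked language modeling (predicting hidden tokens) replaces next-token prediction as the training signal.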

Requirements

python3.7
torch==1.1.0

Then, run this:

pip install .

Note that whenever you modify the code in ./biunilm/ or ./pytorch_pretrained_bert/,
you need to re-run this command:

pip install .

Or, directly update the corresponding code in:

xxx/anaconda3/envs/xxx/lib/python3.7/site-packages/biunilm
xxx/anaconda3/envs/xxx/lib/python3.7/site-packages/pytorch_pretrained_bert
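If you are unsure where site-packages lives in your environment, you can ask Python directly instead of guessing the path:

```python
import sysconfig

# Print the directory where `pip install .` places packages
# such as biunilm and pytorch_pretrained_bert.
print(sysconfig.get_paths()["purelib"])
```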

Download Data

Download the Persona Reddit and Topic-related Dialogue datasets. We leave the data cleaning / filtering process to users. Process the data into a labeled dialogue corpus:

dial.train
dial.valid
dial.test
# each file consists of lines in the form of:
# dialogue-context \t condition-label \t response
# for multi-turn dialogue, concatenate the turns in the context using [SEP]

and labeled text corpus:

text.train
text.valid
text.test
# each file consists of lines in the form of: 
# condition-label \t text
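A quick sanity check for both file layouts might look like the following sketch (the helper name and sample lines are ours, but the field order follows the formats above):

```python
def check_corpus_line(line, is_dialogue):
    """Validate one tab-separated line of a dial.* or text.* file."""
    fields = line.rstrip("\n").split("\t")
    # dial.*: context \t condition-label \t response; text.*: condition-label \t text
    expected = 3 if is_dialogue else 2
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    return fields

# dialogue line: multi-turn context joined by [SEP], then label, then response
ctx, label, resp = check_corpus_line(
    "hi there [SEP] how are you ?\thappy\ti am fine , thanks !", is_dialogue=True)

# text line: label, then text
label2, text = check_corpus_line(
    "sports\tthe game went into overtime", is_dialogue=False)
```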

Preprocessing

Please tokenize the dataset in advance:

python ./pre_tokenize.py $rdir $wdir
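pre_tokenize.py itself is not reproduced here; as a rough illustration of what per-line tokenization involves, a simple lowercase-and-split-punctuation pass (the repository presumably uses BERT's tokenizer instead):

```python
import re

def simple_tokenize(text):
    # Lowercase and split punctuation into separate tokens, roughly
    # mimicking BERT basic tokenization (without WordPiece).
    text = text.lower()
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_tokenize("How are you? I'm fine.")
```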

Then, calculate TF-IDF scores in advance:

python ./get_tfidf.py $datadir $rfilen

# $rfilen can be the combination of text.train and dial.train (after tokenization)
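get_tfidf.py is not reproduced here; a minimal, self-contained illustration of the TF-IDF computation it presumably performs, treating each corpus line as a document:

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """docs: list of token lists. Returns one {token: tf-idf} dict per doc."""
    n = len(docs)
    df = Counter()                     # document frequency of each token
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        # term frequency * inverse document frequency
        scores.append({t: (c / total) * math.log(n / df[t])
                       for t, c in tf.items()})
    return scores

docs = [["the", "game", "was", "fun"], ["the", "weather", "was", "bad"]]
s = tfidf_scores(docs)
```

Tokens appearing in every document (like "the" above) get a score of zero, while corpus-specific tokens score higher.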

Model Training

Further pre-train on a dialogue corpus (optional):

sh -x ./pretrain.sh
# use <nan> as the condition label when preprocessing the dataset

Use our approach to fine-tune on a labeled dialogue corpus and a labeled text corpus:

sh -x ./train.sh

where DATA_DIR should contain the two corpora. Some options are:

--n_text: number of labeled text samples to use
--n_dial: number of labeled dialogue samples to use
--FGfree: eliminate the fine-tuning/generation discrepancy
--model_recover_path: path to a pre-trained model to load
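Putting those options together, an invocation might look like the following. This is a hypothetical example: the data directory, sample counts, and checkpoint path are placeholders, and whether the flags are passed on the command line or edited inside train.sh depends on the script, so check train.sh itself.

```shell
# Hypothetical invocation; adjust paths and counts to your setup.
DATA_DIR=./data \
sh -x ./train.sh \
    --n_dial 100000 \
    --n_text 100000 \
    --FGfree \
    --model_recover_path ./pretrained/model.bin
```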

Or, apply sequential fine-tuning:

sh -x ./run_2step_pre.sh
sh -x ./run_2step_ft.sh

Tips: if the labeled text corpus is limited, use our approach to avoid catastrophic forgetting (training on a small text corpus will largely erase the pre-training results).
If the labeled text corpus is sufficient, use sequential fine-tuning instead: the final training stage then directly optimizes dialogue generation, which gives better results.

Model Evaluation

Calculate perplexity on the dialogue data:

sh -x ./run_ppl.sh

This command will automatically load the latest checkpoint in ${OUTPUT_DIR}.
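Perplexity here is the standard exponentiated average negative log-likelihood over response tokens; a minimal sketch of the computation (not the repository's script):

```python
import math

def perplexity(token_log_probs):
    """token_log_probs: natural-log probabilities of the reference tokens."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model assigning probability 0.25 to every token has perplexity 4.
ppl = perplexity([math.log(0.25)] * 10)
```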

Generate responses:

sh -x ./run_eval.sh

We also provide an evaluation script:

python eval.py $rdir $model
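The exact metrics computed by eval.py are not listed here; as one illustration, distinct-n (the fraction of unique n-grams across generated responses) is a common diversity metric for dialogue generation:

```python
def distinct_n(token_lists, n):
    """Fraction of unique n-grams across all generated responses."""
    ngrams = []
    for tokens in token_lists:
        ngrams.extend(tuple(tokens[i:i + n])
                      for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

responses = [["i", "am", "fine"], ["i", "am", "happy"]]
d1 = distinct_n(responses, 1)   # unigram diversity
d2 = distinct_n(responses, 2)   # bigram diversity
```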

Acknowledgments

Our code is based on UniLM. Thanks!

Citation

@misc{zeng2021simple,
      title={A Simple and Efficient Multi-Task Learning Approach for Conditioned Dialogue Generation}, 
      author={Yan Zeng and Jian-Yun Nie},
      year={2021},
      eprint={2010.11140},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

If you activate the --FGfree option, please also cite:

@misc{zeng2020opendomain,
      title={Open-Domain Dialogue Generation Based on Pre-trained Language Models}, 
      author={Yan Zeng and Jian-Yun Nie},
      year={2020},
      eprint={2010.12780},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contact

For help using this code, please submit a GitHub issue. For serious problems, please contact Yan Zeng ([email protected]).

