DAMS

PyTorch implementation of the EMNLP 2021 paper: Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source Pretraining.

Requirements

  • Python 3.7.10

  • pytorch 1.7.0+cu11.0

  • py-rouge 1.1

  • transformers 4.0.0

  • multiprocess 0.70.11.1

  • tensorboardX 2.1

  • torchtext 0.4.0

  • nltk 3.6.2
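
The pinned versions above can be compared against the active environment before running anything. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the distribution names (for example `torch` for pytorch) are assumptions about how each package is published. Local build tags such as `+cu110` are ignored when comparing.

```python
"""Compare installed package versions against the Requirements list."""
try:
    from importlib import metadata  # standard library in Python 3.8+
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    import importlib_metadata as metadata

# Versions copied from the Requirements list. The distribution names are
# assumptions about how each package is published.
REQUIRED = {
    "torch": "1.7.0",
    "py-rouge": "1.1",
    "transformers": "4.0.0",
    "multiprocess": "0.70.11.1",
    "tensorboardX": "2.1",
    "torchtext": "0.4.0",
    "nltk": "3.6.2",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(required, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in required.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu110" when comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

The `lookup` argument exists only so the comparison can be exercised without the packages installed; in normal use it can be left at its default.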

Environment

  • RTX 3090 GPU

  • CUDA 11.1

Data

All the datasets used in our work are available at Google Drive or Baidu Pan (extraction code: wwsd), including the multi-source pretraining data and the dialogue summary data.

Usage

  • Download the BERT checkpoints here and place them in the directory bert, laid out like this:

     --- bert
       |
       |--- bert_base_uncased
          |
          |--- config.json
          |
          |--- pytorch_model.bin
          |
          |--- vocab.txt
    
  • Download the JSON files from the data links above and place them in the directory json_data, laid out like this:

     --- json_data
       |
       |--- samsum
       |
       |--- adsc
       |
       ...
    
  • Pre-process dialogue summary datasets (e.g., the SAMSum training data).

    PYTHONPATH=. python ./src/preprocess.py -type train -raw_path json_data/samsum -save_path torch_data/samsum -log_file logs/json_to_data_samsum.log -truncated -n_cpus 4
    
  • Pre-process multi-source pretraining datasets and mix them up.

    PYTHONPATH=. python ./src/preprocess.py -raw_path json_data -save_path torch_data/all -log_file logs/json_to_data.log -truncated -n_cpus 40 -mix_up
    
  • Pretrain DAMS on the multi-source datasets.

    PYTHONPATH=. python ./src/main.py -mode train -data_path torch_data/all/data -model_path models/pretrain -log_file logs/pretrain.log -sep_optim -pretrain -visible_gpus 0,1 -pretrain_steps 250000 -port 10000
    
  • Fine-tune DAMS on the SAMSum training set.

    PYTHONPATH=. python ./src/main.py -mode train -data_path torch_data/samsum/samsum -model_path models/samsum -log_file logs/samsum.train.log -visible_gpus 0 -warmup_steps 1000 -lr 0.001 -train_from models/pretrain/model_step_250000.pt -train_from_ignore_optim -train_steps 50000
    
  • Validate DAMS on the SAMSum validation set.

    PYTHONPATH=. python ./src/main.py -mode validate -data_path torch_data/samsum/samsum -log_file logs/samsum.val.log -val_all -alpha 0.95 -model_path models/samsum -result_path results/samsum/samsum -visible_gpus 0 -min_length 15 -beam_size 3 -test_batch_ex_size 50
    
  • Test DAMS.

    Zero-shot test on the SAMSum test set using the pretrained model.

    PYTHONPATH=. python ./src/main.py -mode test -data_path torch_data/samsum/samsum -log_file logs/samsum.test.log -alpha 0.95 -test_from models/pretrain/model_step_250000.pt -result_path results/samsum/samsum -visible_gpus 0 -min_length 15 -beam_size 3 -test_batch_ex_size 50
    

    Regular test on the SAMSum test set using the best validated model.

    PYTHONPATH=. python ./src/main.py -mode test -data_path torch_data/samsum/samsum -log_file logs/samsum.test.log -alpha 0.95 -test_from models/samsum/model_step_xxx.pt -result_path results/samsum/samsum -visible_gpus 0 -min_length 15 -beam_size 3 -test_batch_ex_size 50
    

    Transfer to the ADSC test set.

    PYTHONPATH=. python ./src/main.py -mode test -data_path torch_data/adsc/adsc -log_file logs/adsc.test.log -alpha 0.95 -test_from models/samsum/model_step_xxx.pt -result_path results/adsc/adsc -visible_gpus 0 -min_length 100 -beam_size 3 -test_batch_ex_size 50
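
Two of the steps above depend on files being in the right place: the BERT checkpoint directory, and the fine-tuned checkpoints from which a `model_step_xxx.pt` is chosen. The following is a minimal sketch for checking both, not part of the repository. The helper names `missing_bert_files` and `list_checkpoints` are hypothetical, and the `model_step_<N>.pt` naming is inferred from `model_step_250000.pt` in the commands above.

```python
import re
from pathlib import Path

# File names taken from the directory listing in the Usage section.
BERT_FILES = ("config.json", "pytorch_model.bin", "vocab.txt")
# Checkpoint naming inferred from "model_step_250000.pt" in the commands.
STEP_PATTERN = re.compile(r"model_step_(\d+)\.pt$")


def missing_bert_files(bert_dir="bert/bert_base_uncased"):
    """Return the expected BERT files that are absent from bert_dir."""
    root = Path(bert_dir)
    return [name for name in BERT_FILES if not (root / name).is_file()]


def list_checkpoints(model_dir):
    """Return (step, path) pairs for saved checkpoints, sorted by step."""
    found = []
    for path in Path(model_dir).glob("model_step_*.pt"):
        match = STEP_PATTERN.search(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found)


if __name__ == "__main__":
    print("Missing BERT files:", missing_bert_files())
    for step, path in list_checkpoints("models/samsum"):
        print(step, path)
```

Listing checkpoints this way only shows which steps were saved; which one counts as the best validated model still has to be read from the validation log.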
    

Citation

@inproceedings{
	zou-etal-2021-low,
	title = "Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source Pretraining",
	author = "Zou, Yicheng  and Zhu, Bolin  and Hu, Xingwu  and Gui, Tao  and Zhang, Qi",
	booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
	month = nov,
	year = "2021",
	address = "Online and Punta Cana, Dominican Republic",
	publisher = "Association for Computational Linguistics",
	url = "https://aclanthology.org/2021.emnlp-main.7",
	pages = "80--91"
}

dams's People

Contributors

rowitzou

dams's Issues

Hyperparameters under the low-resource data setting

Thanks for your remarkable work and detailed repo. I am currently working on dialogue summarization in low-resource settings. Could you help me reproduce the results in Figure 2 of your paper by sharing the hyperparameters used in the fine-tuning stage (such as the learning rate, training steps, and warmup steps)?

Inference outputs are short and highly repetitive: how can this be fixed?

Below are the inference results on the test set from the model I trained. The generated summaries are short and contain a lot of repeated content:
1 stanislaw is not going to the ball . she already did . he will be back alreads . taylor will be there in the fireworks last last night . she
2 stanislaw is not going to the ball . he will be home arrives . she already will be back alreads . taylor is be back arrive soon . he
3 stanislaw is not going to the ball . she already did . he will be back alreads . taylor will be there in 15 minute lesson . taylor
4 stanislaw is not going to the ball . she already dinner . he isabababed for the fireworks last last night . taylor will be back alreads to the pool in the ball is broked alreadd . she
5 stanislaw is not going to the ball . she already did . he is beavers in the fireworks last last night . taylor is be back almost stanid .

The data format has already been converted to match the files provided on your cloud drive. What could be causing this, and how can it be resolved? Thank you.
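
One way to put a number on the repetition described in this issue is to measure how many n-grams in a generated summary duplicate an earlier one. The following is a minimal diagnostic sketch, not part of the repository and not a fix. The function name `repeated_ngram_ratio` and the default of trigrams are assumptions chosen for illustration.

```python
from collections import Counter


def repeated_ngram_ratio(text, n=3):
    """Fraction of n-grams in text that duplicate an earlier n-gram."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Each occurrence beyond the first counts as one repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)
```

Running it over each line of a results file and comparing the scores across checkpoints may help show whether the repetition shrinks with further training or persists.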
