
cmaml's Introduction

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

This is the implementation of our ACL 2020 paper:

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks.

Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang

https://arxiv.org/abs/1910.14326

Please cite our paper when you use this code in your work.

Dependency

❱❱❱ pip install -r requirements.txt

Put the pre-trained GloVe embedding file glove.6B.300d.txt in /vectors/.

Put the trained NLI model pytorch_model.bin in /data/nli_model/.
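
Before training, it can help to confirm that both files are in place. The following is a minimal sketch, assuming the two directories above are relative to the repository root (the README does not say so explicitly); the helper name missing_files is hypothetical and not part of this repository.

```python
from pathlib import Path

# Assumption: both paths are relative to the repository root.
REQUIRED_FILES = [
    Path("vectors") / "glove.6B.300d.txt",
    Path("data") / "nli_model" / "pytorch_model.bin",
]


def missing_files(root):
    """Return the required files that do not exist under root."""
    root = Path(root)
    return [str(p) for p in REQUIRED_FILES if not (root / p).is_file()]


if __name__ == "__main__":
    # Run from the repository root; prints nothing when both files exist.
    for path in missing_files("."):
        print("missing:", path)
```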

Experiment

The code is for the experiment of our model CMAML-Seq2SPG on Persona-chat. The scripts for training and evaluation are "train.sh" and "test.sh".

After training, set "--save_model" to the saved model with the lowest perplexity (PPL) on the validation set, then run the evaluation script.
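
Picking the model to pass to "--save_model" amounts to taking the minimum over the validation PPL values logged during training. A minimal sketch follows, assuming you have collected those values yourself into a dictionary; the function name and the checkpoint names and numbers are hypothetical placeholders, not output of this code base.

```python
def best_checkpoint(val_ppl):
    """Return the checkpoint name with the lowest validation PPL."""
    if not val_ppl:
        raise ValueError("no validation results given")
    return min(val_ppl, key=val_ppl.get)


if __name__ == "__main__":
    # Hypothetical names and values, for illustration only.
    results = {"ckpt_a": 52.0, "ckpt_b": 47.5, "ckpt_c": 49.1}
    print(best_checkpoint(results))
```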

Acknowledgement

We use the framework of PAML and the Seq2seq implementation in https://github.com/MaximumEntropy/Seq2Seq-PyTorch

cmaml's People

Contributors

zequnl


cmaml's Issues

Question about some variables in the code

In seq2spg, the function 'train_one_batch' updates both the loss and 'ppl' during the MAML process. Usually ppl refers to the perplexity of the model, but in your code I found that ppl = math.exp(min(loss.item(), 100)). Could you tell me what that is for? thx:)
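
One possible reading of the quoted expression, offered as an assumption and not as the authors' answer: if the loss is the mean per-token cross-entropy in nats, then exp(loss) is the perplexity, and the min(..., 100) only caps the exponent so that math.exp does not overflow when the loss is very large early in training. A minimal sketch of that reading, with a hypothetical function name:

```python
import math


def capped_perplexity(mean_loss, cap=100.0):
    """exp of the mean per-token loss, with the exponent capped at `cap`.

    Assumption: mean_loss is a mean cross-entropy in nats.
    """
    return math.exp(min(mean_loss, cap))


if __name__ == "__main__":
    print(capped_perplexity(0.0))     # zero loss gives perplexity 1.0
    print(capped_perplexity(1000.0))  # capped at exp(100), no overflow
```

Without the cap, math.exp(1000.0) raises OverflowError, so the cap keeps logging from crashing; it does not change the value whenever the loss is below 100.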

About the hyperparameter setting.

Hi @zequnl,
I followed the default hyper-parameter setting during training. I tried many times, but the best validation loss I could get was around 96, at the 36th meta-iteration. The note in train.sh says that the loss should be around 45 at the 55th meta-iteration. So I wonder whether there is something wrong with the hyper-parameter setting.
