
vnmt's Introduction

VNMT

The current implementation of VNMT supports only single-layer NMT; deeper layers are not supported.

Source code for our paper on variational neural machine translation.

If you use this code, please cite our paper:

@InProceedings{zhang-EtAl:2016:EMNLP20162,
  author    = {Zhang, Biao  and  Xiong, Deyi  and  Su, Jinsong  and  Duan, Hong  and  Zhang, Min},
  title     = {Variational Neural Machine Translation},
  booktitle = {Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing},
  month     = {November},
  year      = {2016},
  address   = {Austin, Texas},
  publisher = {Association for Computational Linguistics},
  pages     = {521--530},
  url       = {https://aclweb.org/anthology/D16-1050}
}

Basic Requirement

Our source code is based on GroundHog. Please install it before using our code.

How to Run?

To train a good VNMT model, you need to follow two steps.

Step 1. Pretraining

Pretrain a base NMT model using GroundHog.

Step 2. Retraining

Go to the work directory and copy the pretrained model into it; the pretrained model is used to initialize the parameters of VNMT.

Simply run the script below (of course, before that you need to reconfigure the chinese.py file for your own dataset; a sketch of such a configuration follows further down :)):

run.sh

That's it!
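As for the chinese.py reconfiguration mentioned above, the details depend on GroundHog's state-dictionary conventions. The sketch below is purely illustrative; every key name and path in it is an assumption, so map them onto whatever options the actual chinese.py defines:

```python
# Hypothetical sketch of a GroundHog-style state dictionary for your own data.
# None of these key names or paths are taken from this repository; adjust them
# to the options actually defined in chinese.py.
state = {}

# Parallel training corpora (source and target sides).
state['source'] = ['work/data/train/source.zh.h5']    # assumed path
state['target'] = ['work/data/train/target.en.h5']    # assumed path

# Vocabulary files (word -> index and index -> word).
state['word_indx'] = 'work/data/train/vocab.zh.pkl'   # assumed path
state['indx_word'] = 'work/data/train/ivocab.zh.pkl'  # assumed path

# Point the retraining step at the pretrained GroundHog model you copied
# into the work directory, so VNMT's parameters are initialized from it.
state['prefix'] = 'pretrained_model_'                 # assumed prefix
```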

Note that our test and development sets are from the NIST dataset, which follows the SGM format. Please see work/data/dev for an example.
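
For readers unfamiliar with the SGM format: it wraps each sentence in a <seg id="..."> element inside <doc> blocks. The following is a minimal sketch, not part of this repository, of how one might extract the plain-text segments from such a file:

```python
import re

# Matches NIST-style segments such as: <seg id="1"> a sentence </seg>
SEG_RE = re.compile(r'<seg[^>]*>(.*?)</seg>', re.IGNORECASE | re.DOTALL)

def read_sgm(path):
    """Return the plain-text sentences of an .sgm file, in document order."""
    with open(path, encoding='utf-8') as f:
        return [m.group(1).strip() for m in SEG_RE.finditer(f.read())]

# Example (hypothetical path):
# sentences = read_sgm('work/data/dev/nist02.src.sgm')
```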

For any comments or questions, please email Biao Zhang.

vnmt's People

Contributors

bzhanggo

vnmt's Issues

How do you compute q(z | x, y) at test time?

Hello,

I am reading your paper Variational Neural Machine Translation.

The paper is great! I was wondering one thing: how do you compute q(z | x, y) at test time?
I cannot find an obvious approach, since at test time we do not have y, and so cannot compute its representation and then q(z | x, y).

Thanks!

Best,
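
(A note for readers of this issue: in conditional variational models of this kind, the approximate posterior q(z | x, y) is typically used only during training, where y is available. At decoding time, z is instead taken from the prior p(z | x), e.g. by sampling z ~ p(z | x) or by deterministically using the prior mean mu_prior(x); both depend on the source sentence alone, so q(z | x, y) never needs to be computed at test time. This is the standard conditional-VAE reading, not a statement about this repository's exact implementation.)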

Have you tried to train VNMT from scratch?

I found that training VNMT from scratch leads to low performance. I then monitored the training loss and figured out why: the algorithm converges too slowly and has not fully converged by the time I stop it.
Does this suggest a buggy implementation, or are these training dynamics normal?
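
(A note for readers: slow convergence when training a variational model from scratch is a commonly reported symptom. One widely used remedy, which this repository does not appear to implement, is KL-cost annealing: multiply the KL term by a weight that ramps from 0 to 1 early in training. A minimal sketch with hypothetical names:)

```python
def kl_weight(step, warmup_steps=20000):
    """Linear KL-annealing schedule: ramp the KL coefficient from 0 to 1.

    warmup_steps is a hypothetical hyperparameter, not taken from this repo.
    """
    return min(1.0, step / float(warmup_steps))

# Hypothetical use inside a training loop:
#     loss = reconstruction_loss + kl_weight(step) * kl_term
```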

Missing 0.5 in your KL Divergence?

Hi,

I noticed that for computing the KL divergence, the full formula should be:
0.5*(log sigma_prior - log sigma_post + (sigma_post^2 + (mu_post - mu_prior)^2) / (2*sigma_prior^2)) - 0.5

instead of

log sigma_prior - log sigma_post + (sigma_post^2 + (mu_post - mu_prior)^2) / (2*sigma_prior^2) - 0.5, as in your code at https://github.com/DeepLearnXMU/VNMT/blob/master/src/encdec.py:

```python
# log sigma_prior - log sigma_post + (sigma_post^2 + (mu_post - mu_prior)^2) / (2*sigma_prior^2) - 1/2
kl_der_q_p = (variation_log_sigma_layers[level] - variation_log_sigma_post_layers[level]) + \
    (UnaryOp('lambda x: x**2')(variation_sigma_post_layers[level]) +
     UnaryOp('lambda x: x**2')(variation_mu_post_layers[level] - variation_mu_layers[level])) / \
    (UnaryOp('lambda x: 2.*x**2')(variation_sigma_layers[level])) - 0.5
```

Am I correct, or am I missing something?

Many thanks!

Best
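
(For reference, independent of this repository: the closed-form KL divergence between two univariate Gaussians is KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ) = log sigma_p - log sigma_q + (sigma_q^2 + (mu_q - mu_p)^2) / (2 * sigma_p^2) - 0.5. The short NumPy check below, not part of the repo, compares this closed form against a Monte-Carlo estimate:)

```python
import numpy as np

def kl_gauss(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) )."""
    return (np.log(sigma_p) - np.log(sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)

def log_normal_pdf(z, mu, sigma):
    """Log-density of N(mu, sigma^2) at z."""
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

# Monte-Carlo estimate of E_q[log q(z) - log p(z)]; should match the closed form.
rng = np.random.default_rng(0)
mu_q, sigma_q, mu_p, sigma_p = 0.3, 0.8, -0.1, 1.2
z = rng.normal(mu_q, sigma_q, size=1_000_000)
print(np.mean(log_normal_pdf(z, mu_q, sigma_q) - log_normal_pdf(z, mu_p, sigma_p)))
print(kl_gauss(mu_q, sigma_q, mu_p, sigma_p))
```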

Problems with GroundHog

Hi,

When I used the bleeding-edge version of GroundHog, I ran into this problem. Have you ever encountered this issue?

My Theano version is 0.10.0beta3+129.ga44a3f8; I don't know whether the problem is due to this particular Theano version.
Alternatively, could you please share your pretrained and/or finetuned model?
Many thanks!

Is Equation 5 the *proper* ELBO of the likelihood P(Y|X)?

Hello,

This is my last question about your paper: http://www.aclweb.org/anthology/D16-1050. I tried to derive the ELBO of P(Y|X) myself, but I couldn't come up with the formula. Meanwhile, I found it quite straightforward to derive the ELBO of the joint P(Y, X), but not of P(Y|X).

So I was wondering whether it is possible to derive the ELBO of P(Y|X), or whether your formula (Equation 5) is instead an approximation of the ELBO?

Many thanks!
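
(A note for readers wondering the same thing: a proper ELBO of P(Y|X) does fall out of the standard Jensen argument, assuming a variational posterior q(z | X, Y) and a prior P(z | X) as in the paper. A minimal sketch of the derivation:)

```latex
\log P(Y \mid X)
  = \log \int P(Y, z \mid X)\, dz
  = \log \mathbb{E}_{q(z \mid X, Y)}\!\left[
      \frac{P(Y \mid X, z)\, P(z \mid X)}{q(z \mid X, Y)} \right]
  \ge \mathbb{E}_{q(z \mid X, Y)}\!\left[ \log P(Y \mid X, z) \right]
    - \mathrm{KL}\!\left( q(z \mid X, Y) \,\big\|\, P(z \mid X) \right)
```

The inequality is Jensen's, and the resulting bound has the reconstruction-minus-KL shape of Equation 5, so the conditional ELBO is derivable as a true lower bound rather than an approximation (assuming the factorization P(Y, z | X) = P(Y | X, z) P(z | X)).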
