Comments (6)
To calculate a BLEU score, you need to provide references (demonstrations), since BLEU is a metric that measures the "similarity" between the evaluated sample and the reference(s). In our case, we therefore use the human demonstrations (i.e., the real poems) as the references for calculating the BLEU score. Rather than putting effort into selecting particular poems as references, we use all the real poems in the test set, since the goal is to generate samples that look as if they came from the real underlying distribution.
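The multi-reference setup described above can be sketched in pure Python (a hypothetical toy example, not the authors' evaluation script): each generated sample is scored against all real test poems at once, with n-gram counts clipped by the maximum count over any reference.

```python
# Minimal multi-reference BLEU sketch (hypothetical data and helper names).
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, references, max_n=2):
    """Multi-reference BLEU with uniform weights up to max_n (BLEU-2 here)."""
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        # Clip each n-gram count by its maximum count over all references.
        max_ref = Counter()
        for ref in references:
            for g, c in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        log_prec += math.log(max(clipped, 1e-9) / total) / max_n
    # Brevity penalty against the closest reference length.
    ref_len = min((len(r) for r in references),
                  key=lambda l: abs(l - len(hypothesis)))
    bp = 1.0 if len(hypothesis) > ref_len else \
        math.exp(1 - ref_len / max(len(hypothesis), 1))
    return bp * math.exp(log_prec)

# Toy "test set" of real poems and one generated sample, all tokenized.
test_poems = [["the", "moon", "rises", "over", "the", "hill"],
              ["cold", "wind", "sweeps", "the", "empty", "field"]]
sample = ["the", "moon", "rises", "over", "the", "field"]

score = bleu(sample, test_poems)
print(round(score, 4))  # → 0.8944
```

Using the whole test set as the reference pool for every sample is what makes this an unconditional-generation metric: a sample gets credit for matching n-grams of *any* real poem, not one aligned target.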
from seqgan.
@LantaoYu Thanks for the quick response. But my biggest concern is how the test samples are generated. Specifically, if all the real poems in the test set serve as references, what is the input to the trained generator during the test procedure?
In this work, generation is unconditional: we start from an initial state, arbitrarily pick the first token, and then follow the learned policy to sample the rest of the sequence.
@LantaoYu Could you please explain in detail how you set the initial state and how you "arbitrarily" pick the first token? In my view, the first token is really important during testing and will strongly influence the evaluation results. Is the BLEU score reported in your paper an average? How many samples did you average over? Also, did you use word embeddings during training for poem/text generation? Thanks!
This may need some clarification. Actually, "arbitrarily" is not accurate. Note that when training a language model, the first input token is a predefined "start_token", and the label for the "start_token" is the first token of the real sequence. Thus, at test time, the first input token is also the "start_token". As for the initial hidden state, we can set it to zeros, for example, as in the synthetic-data experiment. So after training a language model p(a|s), the learned distribution of the first token is p(a_1 | a_0 = start_token, s_0), and we sample from this distribution. For the BLEU question: of course the result is an average over a large number of samples, say 100,000. For the word-embedding question: we don't use pre-trained word embeddings; the embeddings are trainable parameters.
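The unconditional sampling procedure described above can be sketched as follows (a hypothetical toy, not the repo's TensorFlow code; `policy_step` stands in for the trained generator's one-step LSTM update): feed the fixed start_token with a zero initial state, sample the first real token from p(a_1 | a_0 = start_token, s_0), then roll out the rest of the sequence from the learned policy.

```python
# Hypothetical sketch of unconditional roll-out from a learned policy.
import random

random.seed(0)

START_TOKEN = 0          # predefined start symbol, as in the synthetic-data exp.
VOCAB = [1, 2, 3, 4]     # toy vocabulary
SEQ_LEN = 5

def policy_step(state, token):
    """Stand-in for the generator's one-step update.

    Returns the next hidden state and a distribution over VOCAB.
    A real model would run an LSTM cell plus a softmax here.
    """
    new_state = (state + token) % 7                      # toy state update
    logits = [(v + new_state) % 5 + 1 for v in VOCAB]    # toy "logits"
    total = sum(logits)
    return new_state, [l / total for l in logits]

def sample_sequence():
    state, token = 0, START_TOKEN        # zero initial state + start_token
    seq = []
    for _ in range(SEQ_LEN):
        state, probs = policy_step(state, token)
        token = random.choices(VOCAB, weights=probs)[0]  # sample next token
        seq.append(token)
    return seq

print(sample_sequence())
```

Averaging BLEU over many such roll-outs (e.g., the 100,000 samples mentioned above) is what removes the dependence on any single sampled first token.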
@LantaoYu I just read your paper, and it is really exciting. I am also curious about your experimental setup; the ambiguous part is the train-validation-test split. The paper says you use "a collection of 11092 paragraphs from Obama's political speeches", but it is unclear how the dataset was split.
Did you use all of the Obama speeches as the training set and introduce an additional corpus as the test set? Or did you split the speech dataset into three parts, e.g., 8000 paragraphs for training, 1000 for validation, and the remainder for testing?
Thanks in advance :)
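The 8000/1000/remainder split the question proposes could be sketched like this (a hypothetical example, not the authors' actual protocol; the corpus here is a stand-in): shuffle the paragraphs with a fixed seed, then slice.

```python
# Hypothetical train/validation/test split of 11092 paragraphs.
import random

paragraphs = [f"paragraph_{i}" for i in range(11092)]  # stand-in corpus

rng = random.Random(42)   # fixed seed so the split is reproducible
rng.shuffle(paragraphs)

train = paragraphs[:8000]
valid = paragraphs[8000:9000]
test = paragraphs[9000:]

print(len(train), len(valid), len(test))  # → 8000 1000 2092
```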
Related Issues (20)
- About the sequences to process
- A question about the model
- the input data?
- Training on custom dataset
- A question about sampling
- What is the difference between SeqGAN and LM for text generation?
- Nothing. Ignore it.
- About dataset.
- About oracle model
- If the positive and negative samples of each training batch correspond, will it affect the training result?
- gradient descent implementation
- About generator in adversarial training
- How should I understand the RL loss function
- data format
- About the accuracy of discriminator during training?
- NLG
- dataset
- How can I use my own training data?
- How to resume training in Colab?
- Questions about reproducing the results of the code