
evaluation issues about seqgan (closed)

lantaoyu commented on July 30, 2024

from seqgan.

Comments (6)

LantaoYu commented on July 30, 2024

To calculate a BLEU score, you need to provide references (demonstrations), since BLEU is a metric that measures the "similarity" between the evaluated sample and the reference(s). In our case, we use the human demonstrations (i.e. the real poems) as the references for calculating the BLEU score. Rather than putting effort into selecting particular poems as references, we use all the real poems in the test set, since the goal is to generate samples that look as if they come from the real underlying distribution.
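The multi-reference scoring described here can be sketched in a few lines. This is a simplified illustration (clipped n-gram precision up to bigrams, geometric mean, brevity penalty), not the evaluation script used in the repo; `bleu` and `ngrams` are hypothetical helper names:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, references, max_n=2):
    """Simplified BLEU: clipped n-gram precision against a set of
    references, geometric mean over n, times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        # Clip each n-gram count by its maximum count in any single reference.
        max_ref = Counter()
        for ref in references:
            for g, c in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty against the closest reference length.
    ref_len = min((abs(len(r) - len(candidate)), len(r)) for r in references)[1]
    bp = 1.0 if len(candidate) >= ref_len else math.exp(1 - ref_len / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A generated sample scored against every test-set poem at once simply passes all of them in `references`, which is the "use the whole test set" setup described above.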


xiaopyyy commented on July 30, 2024

@LantaoYu Thanks for the quick response. But my biggest concern is how to generate the test samples. Specifically, if all the real poems in the test set are used as references, what is the input to the trained generator during the test procedure?


LantaoYu commented on July 30, 2024

In this work, generation is unconditional: we start from an initial state, pick the first token, and then follow the learned policy to sample the rest of the sequence.


xiaopyyy commented on July 30, 2024

@LantaoYu Could you please explain how you set the initial state and how you "arbitrarily" pick the first token? From my point of view, the first token is really important during testing and will strongly influence the evaluation results. Is the BLEU score reported in your paper an average? Over how many samples? Also, did you use word embeddings during training for poem/text generation? Thanks!


LantaoYu commented on July 30, 2024

Some clarification is needed here: "arbitrarily" is not accurate. Note that when training a language model, the first input token is a predefined "start_token", and the label for the "start_token" is the first token of the real sequence. Thus, at test time, the first input token is also the "start_token". As for the initial hidden state, we can set it to zeros, for example, as in the synthetic-data experiment. So after training a language model p(a|s), the learned distribution of the first token is p(a_1 | a_0 = start_token, s_0), and we sample from this distribution. As for BLEU, the reported result is of course an average over a large number of samples, say 100,000. As for word embeddings, we do not use pre-trained embeddings; they are trainable parameters of the model.
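The test-time procedure described above (feed the predefined start_token first, start from an all-zeros hidden state, then sample each token from the learned distribution) can be sketched as follows. Here `step` and `toy_step` are hypothetical stand-ins for one step of the trained generator, not the repo's actual API:

```python
import random

def sample_sequence(step, start_token, seq_len, hidden_size, seed=None):
    """Unconditional sampling: the first input is always start_token, the
    initial hidden state is zeros, and each next token is drawn from the
    learned distribution p(a_t | a_<t, s). `step` is a hypothetical
    callable (token, hidden) -> (probs, hidden) wrapping the generator."""
    rng = random.Random(seed)
    hidden = [0.0] * hidden_size   # zero initial state, as in the synthetic-data experiment
    token = start_token            # first input token is the predefined start_token
    sequence = []
    for _ in range(seq_len):
        probs, hidden = step(token, hidden)
        # Sample the next token from the categorical distribution over the vocabulary.
        token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        sequence.append(token)
    return sequence

# Toy deterministic "policy" for illustration: all probability mass on token 1.
def toy_step(token, hidden):
    return [0.0, 1.0, 0.0], hidden
```

Averaging a metric such as BLEU over many calls to `sample_sequence` (e.g. 100,000 samples) gives the kind of averaged score described above.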


sh0416 commented on July 30, 2024

@LantaoYu I just read your paper, and it is really exciting. I am also curious about your experimental setting; the ambiguous part is the train-validation-test split. The paper uses "a collection of 11092 paragraphs from Obama's political speeches", but it is unclear how the dataset is split.
Did you use all of the Obama speeches as the training set and introduce an additional corpus as the test set? Or did you split the Obama speech dataset into three parts, e.g. 8000 paragraphs for training, 1000 for validation, and the remainder for test?
Thanks in advance :)
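For illustration only, the 8000/1000/remainder scheme the question proposes might look like this. This is a hypothetical split, not the one actually used in the paper, and `split_dataset` is an invented helper name:

```python
import random

def split_dataset(paragraphs, n_train=8000, n_valid=1000, seed=0):
    """Hypothetical three-way split of the kind the question describes:
    shuffle, then take n_train for training, n_valid for validation,
    and keep the remainder as the test set."""
    items = list(paragraphs)
    random.Random(seed).shuffle(items)
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]
    return train, valid, test
```

With the 11092 paragraphs mentioned above, this scheme would leave 2092 paragraphs for the test set.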

