
Comments (4)

abisee commented on June 27, 2024

@Henry-E re: distinguishing between unique OOV words, also check out the discussion in the comments started by RobinChen here.

from pointer-generator.

abisee commented on June 27, 2024

Hi @Henry-E,

Have you read the paper? This is a question that would be easier to answer by looking at the paper than the code.

For the pointer-generator model, OOV words in the source text are represented by the word vector for the UNK token (because they are OOV, there is no word vector to use). In the paper and the code, we refer to e.g. "50k + number of unknown words in the article" as the extended vocabulary, because that's the set of words that can be produced by the decoder. This doesn't mean that all the words in the extended vocabulary have word vectors, though. Only the words in the original vocabulary have word vectors.
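To make the distinction concrete, here is a minimal sketch of how one article could be mapped to two parallel id sequences: one for embedding lookup (OOVs collapse to UNK) and one in the extended vocabulary (each distinct OOV gets its own temporary id). The helper name `article_to_ids`, the `[UNK]` string, and the toy vocabulary are assumptions for illustration, not necessarily what the repository does.

```python
UNK = "[UNK]"  # assumed spelling of the unknown-word token

def article_to_ids(words, vocab):
    """Return (encoder_ids, extended_ids, article_oovs) for one article.

    encoder_ids:  ids used for embedding lookup; every OOV maps to UNK.
    extended_ids: ids in the extended vocabulary; each distinct OOV gets
                  a temporary id len(vocab) + its index among the OOVs.
    article_oovs: the distinct OOV words, in order of first appearance.
    """
    unk_id = vocab[UNK]
    encoder_ids, extended_ids, article_oovs = [], [], []
    for w in words:
        if w in vocab:
            encoder_ids.append(vocab[w])
            extended_ids.append(vocab[w])
        else:
            if w not in article_oovs:
                article_oovs.append(w)
            encoder_ids.append(unk_id)
            extended_ids.append(len(vocab) + article_oovs.index(w))
    return encoder_ids, extended_ids, article_oovs

# Toy vocabulary of 4 words; "zorbled" and "flumph" are out of vocabulary.
vocab = {UNK: 0, "the": 1, "cat": 2, "sat": 3}
words = "the cat zorbled the flumph zorbled".split()
enc, ext, oovs = article_to_ids(words, vocab)
print(enc)   # [1, 2, 0, 1, 0, 0]
print(ext)   # [1, 2, 4, 1, 5, 4]
print(oovs)  # ['zorbled', 'flumph']
```

Only `enc` would be fed to the embedding layer, so no word vector is ever needed for ids 4 and 5; `ext` is only used to decide where copied probability mass lands.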

Hope this answers your question.


Henry-E commented on June 27, 2024

Yep, that sounds good. I was just trying to figure out how the probability distribution over the extended vocabulary is obtained in practice. For some reason I thought that, in order to attend properly, the attention mechanism needed to be able to distinguish between unique OOV words.
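For later readers, a rough sketch of how such a distribution can be assembled at one decoder step, following equation (9) of the paper: P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (sum of attention over source positions holding w). Attention is over source positions, so it only needs the extended-vocabulary id of each position to know where the copied mass goes. The function name and the toy numbers are assumptions for illustration, written in plain Python rather than TensorFlow.

```python
def final_distribution(p_gen, vocab_dist, attn_dist, extended_ids, num_oovs):
    """Mix generation and copy probabilities over the extended vocabulary.

    vocab_dist:   P_vocab over the fixed vocabulary (sums to 1).
    attn_dist:    attention over source positions (sums to 1).
    extended_ids: extended-vocabulary id of the word at each source position.
    num_oovs:     number of distinct OOV words in this article.
    """
    # Generation part: fixed-vocabulary words scaled by p_gen, OOV slots start at 0.
    final = [p_gen * p for p in vocab_dist] + [0.0] * num_oovs
    # Copy part: scatter-add each position's attention onto that word's id.
    for a, idx in zip(attn_dist, extended_ids):
        final[idx] += (1.0 - p_gen) * a
    return final

vocab_dist = [0.1, 0.4, 0.3, 0.2]            # fixed vocabulary of size 4
attn_dist = [0.1, 0.2, 0.3, 0.1, 0.1, 0.2]   # six source positions
extended_ids = [1, 2, 4, 1, 5, 4]            # ids 4 and 5 are article OOVs
dist = final_distribution(0.5, vocab_dist, attn_dist, extended_ids, num_oovs=2)
print([round(p, 2) for p in dist])  # [0.05, 0.3, 0.25, 0.1, 0.25, 0.05]
```

In this toy case the two OOV slots receive probability purely from attention, and repeated source words (positions 2 and 5 share id 4) pool their attention mass.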


Henry-E commented on June 27, 2024

Ah OK. The final bit I was also confused about, for anyone who comes across this later, was how the loss is calculated. I had thought that maybe the copy mechanism was only applied at test time, but it appears that the loss is calculated over the extended vocabulary. As the paper puts it:

The loss function is as described in equations (6) and (7), but with respect to our modified probability distribution P(w) given in equation (9).
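A small sketch of what that means in practice, assuming equations (6) and (7) are the per-step negative log-likelihood of the target word and its average over decoder steps. The function name and the numbers are made up for illustration; real code would also need to guard against a zero probability before taking the log.

```python
import math

def sequence_loss(final_dists, target_ids):
    """Average negative log-likelihood of the targets under the final
    (extended-vocabulary) distributions, one distribution per decoder step."""
    step_losses = [-math.log(dist[t]) for dist, t in zip(final_dists, target_ids)]
    return sum(step_losses) / len(step_losses)

# Two decoder steps over an extended vocabulary of size 6
# (fixed vocabulary ids 0..3, article OOV ids 4..5).
final_dists = [
    [0.05, 0.30, 0.25, 0.10, 0.25, 0.05],
    [0.10, 0.10, 0.10, 0.10, 0.50, 0.10],
]
target_ids = [4, 4]  # both targets are an article OOV, reachable only by copying
loss = sequence_loss(final_dists, target_ids)
print(round(loss, 4))  # 1.0397
```

Because the target ids index into the extended vocabulary, an OOV target contributes to the training loss through its copy probability, so the copy mechanism is trained rather than applied only at test time.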

