Comments (4)
@Henry-E re: distinguishing between unique OOV words, also check out the discussion in the comments started by RobinChen here.
from pointer-generator.
Hi @Henry-E,
Have you read the paper? This is a question that would be easier to answer by looking at the paper than the code.
For the pointer-generator model, OOV words in the source text are represented by the word vector for the UNK token (they're OOV, so there is no word vector of their own to use). In the paper and the code, we refer to e.g. "50k + number of unknown words in the article" as the extended vocabulary because that's the set of words that can be produced by the decoder. This doesn't mean that all the words in the extended vocabulary have word vectors, though. Only the words in the original vocabulary have word vectors.
Hope this answers your question.
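To make the distinction concrete, here is a minimal sketch of how an article could be turned into two parallel id sequences: one for the embedding lookup (every OOV collapses to UNK) and one over the extended vocabulary (each distinct in-article OOV gets a temporary id past the end of the fixed vocabulary). The function name `encode_article`, the `[UNK]` string, and the toy four-word vocabulary are assumptions made for illustration, not the repository's actual code.

```python
UNK = "[UNK]"  # assumed spelling of the unknown token, for illustration only

def encode_article(words, vocab):
    """Return (encoder input ids, extended-vocabulary ids, in-article OOV list)."""
    unk_id = vocab[UNK]
    enc_ids, ext_ids, oovs = [], [], []
    for w in words:
        if w in vocab:
            # In-vocabulary word: same id in both sequences.
            enc_ids.append(vocab[w])
            ext_ids.append(vocab[w])
        else:
            if w not in oovs:
                oovs.append(w)
            # The embedding lookup only ever sees UNK for an OOV word...
            enc_ids.append(unk_id)
            # ...but the copy distribution gets a temporary per-article id.
            ext_ids.append(len(vocab) + oovs.index(w))
    return enc_ids, ext_ids, oovs

# Toy vocabulary of size 4; "zorp" and "blick" are OOV.
vocab = {"[UNK]": 0, "the": 1, "cat": 2, "sat": 3}
enc_ids, ext_ids, oovs = encode_article(["the", "zorp", "sat", "blick", "zorp"], vocab)
print(enc_ids)  # [1, 0, 3, 0, 0]
print(ext_ids)  # [1, 4, 3, 5, 4]
print(oovs)     # ['zorp', 'blick']
```

Both occurrences of "zorp" share the temporary id 4, so the extended vocabulary for this article has size 4 + 2, while the embedding matrix still has only 4 rows.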
Yep, that sounds good. I was just trying to figure out how the probability distribution over the extended vocabulary is obtained in practice. For some reason I thought that, in order to attend properly, the attention mechanism needed to be able to distinguish between unique OOV words.
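As a rough illustration of how that distribution can be assembled, the sketch below follows the mixture in equation (9) of the paper: the generator's distribution over the fixed vocabulary is scaled by p_gen and padded with zeros for the article's OOV slots, then the attention weights, scaled by (1 - p_gen), are scatter-added onto the extended id of the word at each source position. Attention is over positions, so it never needs to tell OOV words apart by their embeddings; the ids only matter when the copy mass is accumulated. The function name `final_distribution` and all numbers are made up for the example.

```python
import numpy as np

def final_distribution(p_gen, vocab_dist, attn_dist, ext_ids, num_oovs):
    """Mix generation and copy probabilities over the extended vocabulary."""
    # Generation part: fixed-vocabulary probabilities, zero for OOV slots.
    extended = np.concatenate([p_gen * vocab_dist, np.zeros(num_oovs)])
    # Copy part: add each position's attention mass to that word's extended id.
    # np.add.at accumulates correctly when an id appears more than once.
    np.add.at(extended, ext_ids, (1.0 - p_gen) * attn_dist)
    return extended

vocab_dist = np.array([0.1, 0.4, 0.3, 0.2])      # softmax over a toy 4-word vocabulary
attn_dist = np.array([0.1, 0.5, 0.1, 0.2, 0.1])  # attention over 5 source positions
ext_ids = np.array([1, 4, 3, 5, 4])              # extended ids of the source words
final = final_distribution(0.6, vocab_dist, attn_dist, ext_ids, num_oovs=2)
print(final)        # approximately [0.06 0.28 0.18 0.16 0.24 0.08]
print(final.sum())  # approximately 1.0
```

Positions 1 and 4 hold the same OOV word (id 4), so their copy mass is summed into a single entry, which is why a plain fancy-index assignment would not be enough here.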
Ah OK. The final bit that I was also confused about, for anyone who comes across this later, was how the loss is calculated. I had thought that maybe the copy mechanism was only applied at test time, but it appears that the loss is calculated over the extended vocabulary.
The loss function is as described in equations (6) and (7), but with respect to our modified probability distribution P(w) given in equation (9).
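A minimal sketch of that loss, assuming the per-step distributions over the extended vocabulary have already been computed: the loss at each step is the negative log probability of the target word (equation (6)), averaged over the sequence (equation (7)), with targets indexed by their extended-vocabulary ids so that a copied OOV word can be rewarded during training. The function name `sequence_loss` and the numbers are illustrative only.

```python
import numpy as np

def sequence_loss(final_dists, target_ext_ids):
    """Average negative log-likelihood of the targets under the extended distributions."""
    final_dists = np.asarray(final_dists, dtype=float)
    steps = np.arange(len(target_ext_ids))
    # Pick out P(w*_t) for each decoder step t.
    target_probs = final_dists[steps, target_ext_ids]
    return float(np.mean(-np.log(target_probs)))

# Two decoder steps over an extended vocabulary of size 6 (4 fixed + 2 OOV slots).
final_dists = [
    [0.06, 0.28, 0.18, 0.16, 0.24, 0.08],
    [0.50, 0.10, 0.10, 0.10, 0.10, 0.10],
]
targets = [4, 0]  # step 0 targets an in-article OOV word (extended id 4)
loss = sequence_loss(final_dists, targets)
print(loss)
```

Because the first target has an id outside the fixed vocabulary, its probability can only come from the copy term, so the gradient of this loss reaches the attention weights and p_gen at training time as well.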