dsksd / deepnlp-models-pytorch
PyTorch implementations of various Deep NLP models in cs-224n (Stanford Univ)
License: MIT License
Step [54], "data = [[d.split(':')[1][:-1], d.split(':')[0]] for d in data]", seems to include the sub-category label in the input sequence.
For example, for the data line "DESC:def What is ethology ?", data
will be ["def What is ethology ?", "DESC"],
so the "def" sub-category ends up in the input sequence.
I suggest a fix:
# Remove the sub-category (first word) and the '?' at the end.
data = [[d.split(':')[1].split(' ', 1)[1][:-2], d.split(':')[0]] for d in data]
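A slightly more defensive variant of the same idea, as a sketch only: it splits on the first ':' so that a colon inside the question does not truncate it, and it drops the first whitespace-separated token as the sub-category. The helper name `parse_trec_line` is hypothetical, and unlike the one-liner above it keeps the trailing '?'. It assumes every line has a sub-category followed by at least one question token.

```python
def parse_trec_line(line):
    """Split a TREC-style line into (question_text, coarse_label)."""
    # Split only on the first ':' so colons inside the question survive.
    label, rest = line.strip().split(":", 1)
    # The first whitespace-separated token of `rest` is the sub-category.
    _subcategory, question = rest.split(None, 1)
    return question, label


print(parse_trec_line("DESC:def What is ethology ?\n"))
# prints ('What is ethology ?', 'DESC')
```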
I want to save the model from the Neural Machine Translation notebook (https://nbviewer.jupyter.org/github/DSKSD/DeepNLP-models-Pytorch/blob/master/notebooks/07.Neural-Machine-Translation-with-Attention.ipynb). Can you help me?
I was trying out question answering with the Dynamic Memory Network (DMN) on the bAbI dataset from this source: 10.Dynamic-Memory-Network-for-Question-Answering.ipynb
I modified it a bit so that I can save the model and later run prediction/inference separately (src: dmn_babi.ipynb), and I saved my model as 'dmn_qa'. Inference works correctly when I run everything from the IPython notebook.
Later I added 2 separate files which are
However, when I run the prediction part on its own, the output is not correct. Please help me find what I am doing wrong.
Best Regards
Hi,
In file 08.CNN-for-Text-Classification.ipynb, where do you pad the input? Is it in [110], line 7:
x_p.append(torch.cat([x[i], Variable(LongTensor([word2index['']] * (max_x - x[i].size(1)))).view(1, -1)], 1))?
Thanks!
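For reference, the quoted line appears to right-pad each sequence up to max_x by concatenating copies of a padding index. The same idea on plain Python lists, as a minimal sketch: `pad_batch` and the padding index 0 are hypothetical and not taken from the notebook.

```python
def pad_batch(sequences, pad_index):
    """Right-pad every sequence of token indices to the longest length."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_index] * (max_len - len(seq)) for seq in sequences]


PAD = 0  # hypothetical padding index, chosen only for illustration
batch = [[5, 8, 2], [7], [4, 9]]
print(pad_batch(batch, PAD))
# prints [[5, 8, 2], [7, 0, 0], [4, 9, 0]]
```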
Hi SungDong. Thanks for the great posts. I am reading the first two models on skip-gram. Why do you use two embeddings instead of one? After training, the second embedding, embedding_u, has the same weights in every row. Based on the formula for this model, I think there should be only one embedding for all word vectors. Am I missing some details?
Is the second matrix used for efficiency? I guess the second matrix could be replaced by a linear transformation of the transposed size. But the prediction target is a one-hot vector, so it would be a waste to compute a bunch of zeros. A matrix lookup is far more efficient.
class Skipgram(nn.Module):
    def __init__(self, vocab_size, projection_dim):
        super(Skipgram, self).__init__()
        self.embedding_v = nn.Embedding(vocab_size, projection_dim)
        self.embedding_u = nn.Embedding(vocab_size, projection_dim)
        self.embedding_v.weight.data.uniform_(-1, 1)  # init
        self.embedding_u.weight.data.uniform_(0, 0)  # init
        # self.out = nn.Linear(projection_dim, vocab_size)
In the 4th Jupyter notebook, Word Window Classification and Neural Networks, I found something wrong. Specifically, the class WindowClassifier uses self.softmax = nn.LogSoftmax(dim=1) in the output layer, so you don't need the CrossEntropyLoss() loss; you should use the torch.nn.NLLLoss loss instead.
I have learned a lot from this elegant project. Thanks a lot!
Based on the equation in the Skip-gram-Negative-Sampling algorithm below,
I think the negative example loss calculated by
negative_score = torch.sum(neg_embeds.bmm(center_embeds.transpose(1, 2)).squeeze(2), 1).view(negs.size(0), -1) # BxK -> Bx1
loss = self.logsigmoid(positive_score) + self.logsigmoid(negative_score)
should maybe be changed to
negative_score = neg_embeds.bmm(center_embeds.transpose(1, 2))
loss = self.logsigmoid(positive_score) + torch.sum(self.logsigmoid(negative_score), 1)
since, based on the equation, the negative_score first goes through a logsigmoid operation and is only then summed.
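The difference between the two orderings can be checked with plain numbers. The sketch below uses made-up scores for K = 3 negative samples and assumes they are already negated (that is, each value stands for -u_k · v_c); it is only an illustration of the arithmetic, not the notebook's code.

```python
import math


def logsigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


# Hypothetical, already-negated scores for K = 3 negative samples.
neg_scores = [-0.5, 1.2, -2.0]

# Ordering used in the current code: sum the scores, then apply logsigmoid.
sum_then_logsigmoid = logsigmoid(sum(neg_scores))

# Ordering in the proposed change: apply logsigmoid per sample, then sum.
logsigmoid_then_sum = sum(logsigmoid(s) for s in neg_scores)

print(sum_then_logsigmoid, logsigmoid_then_sum)
```

The two printed values differ, because logsigmoid is not linear, so the order of the sum and the logsigmoid changes the loss.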
Hey,
I am trying to reproduce this notebook, but the loss does not go down as advertised.
[0/5] mean_loss : 1.80
[1/5] mean_loss : 1.64
[2/5] mean_loss : 1.64
[3/5] mean_loss : 1.62
[4/5] mean_loss : 1.76
I am using PyTorch v0.4 on CUDA. My hypothesis is the newer version broke something.
>torch.__version__
'0.4.0a0+a3e9151'
Thanks!
Hi,
I have a little question about file 08.CNN-for-Text-Classification.ipynb, [96], line 4: pretrained.append(model[word2index[key]]).
word2index[key] looks up the key's index, and that index is then used to fetch the pretrained embedding from GoogleNews-vectors-negative300.bin. But the indices in this bin file should differ from the indices generated from the TREC dataset, i.e. model[key's index] may not be this key's (word's) embedding.
Thanks!
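To illustrate the concern, here is a minimal sketch that looks vectors up by word string and writes them into the row given by the dataset's own index. A plain dict stands in for the pretrained vectors; `build_embedding_rows`, the 2-dimensional values, and the zero vector for missing words are all assumptions made for illustration. It also assumes the vocabulary indices run from 0 to len(word2index) - 1.

```python
# Stand-in for pretrained vectors keyed by word (hypothetical 2-d values).
pretrained = {"what": [0.1, 0.2], "is": [0.3, 0.4]}

# Dataset vocabulary built independently; its indices are arbitrary.
word2index = {"is": 0, "what": 1, "ethology": 2}


def build_embedding_rows(word2index, pretrained, dim):
    """Return one vector per vocabulary index, looked up by word string."""
    rows = [[0.0] * dim for _ in word2index]
    for word, idx in word2index.items():
        if word in pretrained:  # words missing from the pretrained set keep zeros
            rows[idx] = list(pretrained[word])
    return rows


rows = build_embedding_rows(word2index, pretrained, dim=2)
print(rows)
# prints [[0.3, 0.4], [0.1, 0.2], [0.0, 0.0]]
```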
torch.nn.CrossEntropyLoss() combines LogSoftMax and NLLLoss in one single class.
Thus, the forward() function of some classification models should contain only "return out" instead of "return F.log_softmax(out)".
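The composition can be spelled out in pure Python. This is a sketch of the math only, not PyTorch's implementation; the function names and the example logits are made up for illustration.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(scores):
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class, given log-probabilities."""
    return -log_probs[target]


logits = [2.0, -1.0, 0.5]  # hypothetical raw scores for 3 classes
target = 0

# Cross-entropy computed directly from its definition on raw scores.
direct = -math.log(softmax(logits)[target])

# The same value as log-softmax followed by negative log-likelihood.
composed = nll_loss(log_softmax(logits), target)

print(direct, composed)
```

Both values agree up to floating-point error, which is the sense in which cross-entropy on raw scores equals NLL on log-softmax outputs.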
Hello,
Thank you very much for sharing such great material. It has been a great help while I study this subject.
The reason I am writing in Issues is
to ask about the overlap between log-softmax and cross-entropy in the 08.CNN example.
CNNClassifier is written to return the result of applying log_softmax to the model's output.
Later, the model's output is stored in the variable pred and passed as input to loss_function (cross-entropy),
but as far as I know, PyTorch's cross-entropy function takes as input the raw scores from before the softmax function.
So I am asking because I wonder whether the example code ends up applying softmax twice.
Thank you.
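One way to reason about this question: log-softmax outputs already exponentiate to a sum of 1, so applying log-softmax to them a second time subtracts log(1) = 0. Under that reasoning the extra call would be redundant rather than a second distortion of the scores, at least in exact arithmetic. A pure-Python check with made-up logits, as a sketch and not PyTorch's implementation:

```python
import math


def log_softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


logits = [2.0, -1.0, 0.5]  # hypothetical raw scores

once = log_softmax(logits)
twice = log_softmax(once)  # log-softmax applied to its own output

# Largest element-wise difference between one and two applications.
max_diff = max(abs(a - b) for a, b in zip(once, twice))
print(max_diff)
```

The difference is at the level of floating-point rounding, so returning raw scores mainly removes duplicated work rather than changing the loss value.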
`
download dependency parser dataset... (clone from https://github.com/rguthrie3/DeepDependencyParsingProblemSet
mkdir: created directory '../dataset/dparser'
--2018-03-02 15:08:08-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/train.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found.
--2018-03-02 15:08:09-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/vocab.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found.
--2018-03-02 15:08:09-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/dev.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found.
`
It seems that @rguthrie3 has deleted the repo... Could you please provide a new address for the dataset? Thanks!