
dsksd / deepnlp-models-pytorch

2.9K stars, 112 watchers, 663 forks, 1.28 MB

PyTorch implementations of various deep NLP models from cs-224n (Stanford University)

License: MIT License

Jupyter Notebook 98.50% Python 0.65% Shell 0.85%
pytorch nlp deep-nlp-models stanford-univ rnn neural-network deep-learning cs-224n natural-language-processing

deepnlp-models-pytorch's People

Contributors

dsksd, onetaken

deepnlp-models-pytorch's Issues

Input sequences contain part of the outputs in CNN-for-Text-Classification

Step [54], data = [[d.split(':')[1][:-1], d.split(':')[0]] for d in data], seems to include the sub-category label in the input sequence.
For example, for the data line "DESC:def What is ethology ?", data becomes ["def What is ethology ?", "DESC"], so the "def" sub-category is included in the input sequence.

I suggest a fix:

# Remove the sub-category (first word) and the '?' at the end.
data = [[d.split(':')[1].split(' ', 1)[1][:-2], d.split(':')[0]] for d in data]
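For anyone verifying the suggested fix, here is a minimal standalone check of the parsing on the sample line above (the rstrip variant is my own, hypothetical alternative):

line = "DESC:def What is ethology ?\n"
label, rest = line.split(':', 1)             # 'DESC', 'def What is ethology ?\n'
text = rest.split(' ', 1)[1][:-2]            # 'What is ethology ' (drops 'def' and the trailing '?\n', but keeps a space)
text = rest.split(' ', 1)[1].rstrip(' ?\n')  # 'What is ethology' (also trims the trailing space)
assert label == 'DESC'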

Saved model shows wrong predictions when run separately in the Question Answering task

I was checking out Question Answering using a Dynamic Memory Network (DMN) on the bAbI dataset from this source: 10.Dynamic-Memory-Network-for-Question-Answering.ipynb

I modified it a bit so that I can save the model and later run the prediction/inference separately (src: dmn_babi.ipynb), and I saved my model as 'dmn_qa'. Moreover, inference works correctly when I run everything from the IPython notebook.

Later I added 2 separate files:

  1. model.py - contains the DMN model, src: model.py
  2. prediction.py - contains the model loading and the inference part of the code, src: predict.py

However, when I run the prediction part now, the output is not correct. Please help me find where I am going wrong.

Best Regards
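One likely culprit (my assumption, since the scripts are not shown in full): prediction.py must rebuild the model with the same hyperparameters and vocabulary used in training, and put it in eval mode. A minimal save/load sketch, where DMN's constructor arguments and the word2index variable are hypothetical stand-ins for the notebook's:

import pickle
import torch

# In the notebook: persist both the weights and the vocabulary the model was trained with.
torch.save(model.state_dict(), 'dmn_qa.pt')
with open('vocab.pkl', 'wb') as f:
    pickle.dump(word2index, f)

# In prediction.py: rebuild with the SAME sizes, then restore weights and vocab.
with open('vocab.pkl', 'rb') as f:
    word2index = pickle.load(f)
model = DMN(len(word2index), hidden_size, len(word2index))  # hypothetical constructor args
model.load_state_dict(torch.load('dmn_qa.pt'))
model.eval()  # disables dropout; forgetting this often yields wrong predictions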

about padding sequence

Hi,
In file 08.CNN-for-Text-Classification.ipynb, where do you pad the input? Is it in [110], line 7:
x_p.append(torch.cat([x[i], Variable(LongTensor([word2index['<PAD>']] * (max_x - x[i].size(1)))).view(1, -1)], 1))?
Thanks!
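For readers puzzling over that one-liner, an equivalent expanded form (my sketch; it assumes, per the notebook, that x is a list of 1 x L LongTensor Variables, max_x is the batch's maximum length, and word2index contains a '<PAD>' token):

pad_idx = word2index['<PAD>']
x_p = []
for i in range(len(x)):
    pad_len = max_x - x[i].size(1)  # number of pad tokens this sequence needs
    if pad_len > 0:
        padding = Variable(LongTensor([pad_idx] * pad_len)).view(1, -1)
        x_p.append(torch.cat([x[i], padding], 1))  # right-pad to the batch max length
    else:
        x_p.append(x[i])

This suggests that line is indeed where the right-padding to the batch maximum happens.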

Why do Skip-gram models need 2 embedding layers?

Hi SungDong. Thanks for the great posts. I am reading the first two models on skip-gram. Why do you use two embeddings instead of one? Every row of the second embedding, embedding_u, has the same weights after I train it. Based on the formula for this model, I think it should have only one embedding for all word vectors. Am I missing some details?


Is the second matrix used for efficiency? I guess the second matrix could be replaced by a linear transformation of the transposed size. But the prediction target is a one-hot vector, so it would be a waste to compute bunches of zeros; a matrix lookup is far more efficient.

class Skipgram(nn.Module):

    def __init__(self, vocab_size, projection_dim):
        super(Skipgram, self).__init__()
        self.embedding_v = nn.Embedding(vocab_size, projection_dim)  # center-word embeddings
        self.embedding_u = nn.Embedding(vocab_size, projection_dim)  # context-word embeddings

        self.embedding_v.weight.data.uniform_(-1, 1)  # init
        self.embedding_u.weight.data.uniform_(0, 0)   # init to all zeros
        # self.out = nn.Linear(projection_dim, vocab_size)
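On the efficiency point: a lookup in embedding_u is equivalent to multiplying a one-hot vector by the (vocab_size x projection_dim) matrix U, which is why the lookup is preferred over a full linear layer. A minimal check (my own sketch):

import torch
import torch.nn as nn

vocab_size, dim = 10, 4
embedding_u = nn.Embedding(vocab_size, dim)  # the 'output' embedding matrix U

target = torch.tensor([3])
one_hot = torch.zeros(1, vocab_size)
one_hot[0, 3] = 1.0

# Looking up row 3 of U equals multiplying a one-hot vector by U.
assert torch.allclose(embedding_u(target), one_hot @ embedding_u.weight)

# A linear layer would instead compute scores for ALL vocab words (hidden @ U.T),
# wasting work when the loss only needs the target and sampled rows.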

error in Window-Classifier-for-NER model

In the 4th Jupyter file, named Word Window Classification and Neural Networks, I found something wrong. Specifically, in the class WindowClassifier you have used self.softmax = nn.LogSoftmax(dim=1) in the output layer, so you should not use the CrossEntropyLoss() loss; you must use the torch.nn.NLLLoss loss instead.
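For clarity, the two consistent pairings in PyTorch (a sketch of the general rule, not the notebook's exact code):

import torch
import torch.nn as nn

logits = torch.randn(8, 5)            # raw scores: 8 samples, 5 classes
targets = torch.randint(0, 5, (8,))

# Pairing 1: raw logits -> CrossEntropyLoss (applies log_softmax internally).
loss1 = nn.CrossEntropyLoss()(logits, targets)

# Pairing 2: LogSoftmax output -> NLLLoss.
log_probs = nn.LogSoftmax(dim=1)(logits)
loss2 = nn.NLLLoss()(log_probs, targets)

assert torch.allclose(loss1, loss2)   # the pairings are equivalent; mixing them is not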

about the negative example loss in the Skip-gram-Negative-Sampling algorithm

I have learned a lot from this elegant project. Thanks a lot!
Based on the negative-sampling objective (the attached image showed the standard equation),

log σ(u_o · v_c) + Σ_{k=1..K} E_{w_k ~ P_n(w)} [ log σ(−u_{w_k} · v_c) ],

I think the negative-example loss, currently calculated by

negative_score = torch.sum(neg_embeds.bmm(center_embeds.transpose(1, 2)).squeeze(2), 1).view(negs.size(0), -1) # BxK -> Bx1
loss = self.logsigmoid(positive_score) + self.logsigmoid(negative_score)

should perhaps be changed to

negative_score = neg_embeds.bmm(center_embeds.transpose(1, 2))  # B x K x 1
loss = self.logsigmoid(positive_score) + torch.sum(self.logsigmoid(negative_score), 1)

since, based on the equation, each negative score first goes through the logsigmoid operation and is then summed.
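The distinction matters because logsigmoid is not additive, i.e. Σ_i log σ(x_i) != log σ(Σ_i x_i); a two-line check (my sketch):

import torch
x = torch.tensor([1.0, 2.0])
print(torch.log(torch.sigmoid(x)).sum())   # sum of logsigmoids: ~ -0.4402
print(torch.log(torch.sigmoid(x.sum())))   # logsigmoid of the sum: ~ -0.0486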

Not able to reproduce results for CNN-for-Text-Classification

Hey,

I am trying to reproduce this notebook, but the loss does not go down as advertised.

[0/5] mean_loss : 1.80
[1/5] mean_loss : 1.64
[2/5] mean_loss : 1.64
[3/5] mean_loss : 1.62
[4/5] mean_loss : 1.76

I am using PyTorch v0.4 on CUDA. My hypothesis is that the newer version broke something.

>torch.__version__
'0.4.0a0+a3e9151'

Thanks!
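For context, PyTorch 0.4 merged Variable into Tensor and made losses 0-dimensional tensors, which broke many 0.3-era training loops; whether that explains this particular regression is my assumption. The classic symptom involves extracting the loss value:

import torch
import torch.nn.functional as F

loss = F.mse_loss(torch.randn(4), torch.randn(4))
# mean_loss = loss.data[0]   # pre-0.4 idiom; warns or misbehaves on 0.4+
mean_loss = loss.item()      # 0.4+ way to get the Python float from a 0-dim tensor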

about pretrained embeddings

Hi,
I have a small question about file 08.CNN-for-Text-Classification.ipynb, [96], line 4: pretrained.append(model[word2index[key]]).
word2index[key] gives the key's index in the vocabulary built from the TREC dataset, but model was loaded from GoogleNews-vectors-negative300.bin, whose word indexing is different, so model[word2index[key]] may not be that key's (word's) embedding.
Thanks!
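If that diagnosis is right, the fix would be to index the word2vec model by the word itself rather than by a TREC index; a sketch assuming gensim's KeyedVectors (which the notebook appears to use) and the notebook's word2index:

import numpy as np

pretrained = []
for key in word2index:                    # iterate the TREC vocabulary
    if key in model:                      # gensim supports membership tests by word
        pretrained.append(model[key])     # look up by word, not by TREC index
    else:
        pretrained.append(np.random.randn(300))  # random vector for OOV words
pretrained = np.vstack(pretrained)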

08. CNN-for-Text-Classification: LogSoftmax and Cross-Entropy

Hello,

Thank you very much for sharing these great materials. They have been a huge help while I study this topic.

I am opening this issue to ask about a possible duplication of log_softmax and cross-entropy in the 08.CNN example.

CNNClassifier returns the model's output after applying log_softmax.

Later, the model's output is captured in a variable called pred and passed as input to the loss function (Cross-Entropy), but as I understand it, PyTorch's Cross-Entropy function expects the raw scores from before the softmax as its input.

So I would like to ask whether the example code ends up applying softmax twice.

Thank you.
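A quick numerical check of this concern (my sketch): feeding log_softmax output into cross_entropy applies log_softmax a second time and changes the loss value:

import torch
import torch.nn.functional as F

logits = torch.randn(8, 6)                # raw scores: 8 samples, 6 classes (TREC has 6)
targets = torch.randint(0, 6, (8,))

loss_correct = F.cross_entropy(logits, targets)
loss_double = F.cross_entropy(F.log_softmax(logits, dim=1), targets)  # log_softmax applied twice
print(loss_correct, loss_double)          # the values differ, confirming the duplication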

some data is not available now

download dependency parser dataset... (clone from https://github.com/rguthrie3/DeepDependencyParsingProblemSet
mkdir: created directory '../dataset/dparser'
--2018-03-02 15:08:08-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/train.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found.

--2018-03-02 15:08:09-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/vocab.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found.

--2018-03-02 15:08:09-- https://raw.githubusercontent.com/rguthrie3/DeepDependencyParsingProblemSet/master/data/dev.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.12.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.12.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-03-02 15:08:09 ERROR 404: Not Found.

It seems like @rguthrie3 has deleted the repo... Could you please provide an updated address for us? Thanks!
