Comments (8)
I put a first version of the batched model at https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation-batched.ipynb via 31fdb61
The biggest changes are using pack_padded_sequence before the encoder RNN and pad_packed_sequence after it, plus the masked cross-entropy loss from @jihunchoi after decoding. For the decoder itself the changes are minor, because it only runs one time step at a time.
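For readers unsure how those two calls fit around the encoder RNN, here is a minimal sketch with made-up sizes (the batch of lengths must be sorted descending, since enforce_sorted defaults to True):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Hypothetical sizes: batch of 3 sequences padded to length 5
max_len, batch, emb, hidden = 5, 3, 8, 16
lengths = torch.tensor([5, 3, 2])            # sorted descending
embedded = torch.randn(max_len, batch, emb)  # (seq_len, batch, emb), batch_first=False

gru = nn.GRU(emb, hidden)
packed = pack_padded_sequence(embedded, lengths)  # RNN skips the padded steps
packed_outputs, encoder_hidden = gru(packed)
encoder_outputs, _ = pad_packed_sequence(packed_outputs)  # back to (seq_len, batch, hidden)
```

Positions past each sequence's length come back zero-padded, which is what makes the later masking of the loss straightforward.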
from practical-pytorch.
Hi guys, I implemented more features based on this tutorial (e.g. batched computation for attention) and added some notes.
Check out my repo here: https://github.com/howardyclo/pytorch-seq2seq-example/blob/master/seq2seq.ipynb
Hi,
Thank you for the great work! Would you please add batching to the tutorial as well?
Hello @spro, I have been working on extending this with batching. My code is here: https://github.com/vijendra-rana/Random/blob/master/translation_with_batch.py (I created some fake data for it). The problem is that I am getting an error from the loss saying
RuntimeError: Trying to backward through the graph second time, but the buffers have already been freed. Please specify retain_variables=True when calling backward for the first time.
I understand we cannot backward the loss twice, but I don't see anywhere that I am doing it twice.
Also, I have a question about masking: how would you mask the loss at the encoder? I am not sure how to implement it, with the encoder output being size (seq_len, batch, hidden_size) and the mask being (batch_size, seq_len).
Thanks in advance for the help :)
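On the mask-shape question, one way to line up those shapes (a sketch with made-up sizes, not from the thread's code) is to build the (batch, seq_len) mask from the lengths, then transpose it and add a trailing dimension so it broadcasts against the (seq_len, batch, hidden_size) encoder output:

```python
import torch

# Hypothetical sizes matching the shapes in the question
seq_len, batch, hidden = 5, 3, 16
encoder_outputs = torch.randn(seq_len, batch, hidden)  # (seq_len, batch, hidden_size)
lengths = torch.tensor([5, 3, 2])

# (batch, seq_len) mask: True where step t is inside that sequence
mask = torch.arange(seq_len).unsqueeze(0) < lengths.unsqueeze(1)

# Transpose to (seq_len, batch), unsqueeze to (seq_len, batch, 1), then broadcast
masked_outputs = encoder_outputs * mask.t().unsqueeze(-1)
```

The same trick of expanding a (batch, seq_len) mask by transpose-plus-unsqueeze works for zeroing per-step losses as well.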
Thanks, @spro for your effort in putting these together. Your tutorials are really nice.
I noticed some implementations of batched seq2seq with attention allow an embedding size that is different from the hidden size. Is there a reason to match the two sizes?
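As far as I know there is no hard requirement that they match: nn.GRU (and nn.LSTM) take separate input_size and hidden_size arguments, so the embedding dimension only has to match the RNN's input_size; tutorials often set them equal just to keep the code short. A quick sketch with made-up sizes:

```python
import torch
import torch.nn as nn

vocab, emb_size, hidden_size = 5000, 100, 256  # deliberately different sizes
embedding = nn.Embedding(vocab, emb_size)
gru = nn.GRU(emb_size, hidden_size)  # input_size != hidden_size is fine

tokens = torch.randint(0, vocab, (7, 4))   # (seq_len, batch)
outputs, hidden = gru(embedding(tokens))   # outputs: (seq_len, batch, hidden_size)
```

One place the sizes do interact is dot-style attention, where decoder state and encoder outputs must share a dimension; with "general" attention a learned projection bridges any mismatch.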
@spro Thanks for the nice code sample.
I had some trouble and am looking for some help: I tried to run it out of the box and hit an error in this block:
max_target_length = max(target_lengths)
decoder_input = Variable(torch.LongTensor([SOS_token] * small_batch_size))
decoder_hidden = encoder_hidden[:decoder_test.n_layers] # Use last (forward) hidden state from encoder
all_decoder_outputs = Variable(torch.zeros(max_target_length, small_batch_size, decoder_test.output_size))

if USE_CUDA:
    all_decoder_outputs = all_decoder_outputs.cuda()
    decoder_input = decoder_input.cuda()

# Run through decoder one time step at a time
for t in range(max_target_length):
    decoder_output, decoder_hidden, decoder_attn = decoder_test(
        decoder_input, decoder_hidden, encoder_outputs
    )
    all_decoder_outputs[t] = decoder_output  # Store this step's outputs
    decoder_input = target_batches[t]        # Next input is current target

# Test masked cross entropy loss
loss = masked_cross_entropy(
    all_decoder_outputs.transpose(0, 1).contiguous(),
    target_batches.transpose(0, 1).contiguous(),
    target_lengths
)
print('loss', loss.data[0])
The error reads as follows:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-28-babf231e41ef> in <module>()
13 for t in range(max_target_length):
14 decoder_output, decoder_hidden, decoder_attn = decoder_test(
---> 15 decoder_input, decoder_hidden, encoder_outputs
16 )
17 all_decoder_outputs[t] = decoder_output # Store this step's outputs
/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.pyc in __call__(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)
<ipython-input-24-43d7954b3ba4> in forward(self, input_seq, last_hidden, encoder_outputs)
35 # Calculate attention from current RNN state and all encoder outputs;
36 # apply to encoder outputs to get weighted average
---> 37 attn_weights = self.attn(rnn_output, encoder_outputs)
38 context = attn_weights.bmm(encoder_outputs.transpose(0, 1)) # B x S=1 x N
39
/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.pyc in __call__(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)
<ipython-input-22-61485b548d0f> in forward(self, hidden, encoder_outputs)
27 # Calculate energy for each encoder output
28 for i in range(max_len):
---> 29 attn_energies[b, i] = self.score(hidden[:, b], encoder_outputs[i, b].unsqueeze(0))
30
31 # Normalize energies to weights in range 0 to 1, resize to 1 x B x S
<ipython-input-22-61485b548d0f> in score(self, hidden, encoder_output)
40 elif self.method == 'general':
41 energy = self.attn(encoder_output)
---> 42 energy = hidden.dot(energy)
43 return energy
44
RuntimeError: Expected argument self to have 1 dimension, but has 2
@suwangcompling Use hidden = hidden.squeeze() and encoder_output = encoder_output.squeeze() so both tensors are 1-D before the dot product. You can try it!
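For context on why the squeeze helps (a sketch with hypothetical shapes): torch.dot only accepts 1-D tensors, and in that score method both hidden[:, b] and the projected encoder output still carry a leading dimension of 1, which is exactly what the "Expected argument self to have 1 dimension" error complains about:

```python
import torch

hidden_size = 16
hidden = torch.randn(1, hidden_size)  # e.g. hidden[:, b] keeps a leading dim of 1
energy = torch.randn(1, hidden_size)  # projected encoder output, also still 2-D

# hidden.dot(energy) would raise: torch.dot expects 1-D tensors
score = hidden.squeeze(0).dot(energy.squeeze(0))  # 0-dim scalar attention energy
```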
Related Issues (20)
- Issue on Windows
- Seq-seq not working for creating chatbot
- How to save and load train model and use it for evaluation
- Link for Series 2 - RNNs for time-series data
- Question about Luong Attention Implementation
- can't import torch
- The link for Teacher Forcing in "Translation with a Sequence to Sequence Network and Attention" is broken
- Error in BahdanauAttnDecoderRNN
- Issues in your tutorial on Classifying Names with a Character-Level RNN
- I can't calculate the score of attention in Seq2Seq Translation.
- Error in practical-pytorch/seq2seq-translation/seq2seq-translation-batched.ipynb
- Question from character level RNN classifier, why not use the hidden state across epochs?
- RuntimeError: 1D tensors expected, got 2D, 2D tensors
- May I know how to support a new sentence translation?
- seq2seq: Replace the embeddings with pre-trained word embeddings such as word2vec
- About seq2seq-translation-batched.py RuntimeError
- Wrong implementation of attention mechanism in pytorch tutorials
- FileNotFoundError: [Errno 2] No such file or directory: 'char-rnn-classification.pt'
- small format issue
- Implicit dimension choice for log_softmax has been deprecated while running 'python train.py'