
Comments (8)

spro commented on May 24, 2024

I put a first version of the batched model at https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation-batched.ipynb via 31fdb61

The biggest changes are using pack_padded_sequence before the encoder RNN and pad_packed_sequence after it, plus the masked cross entropy loss from @jihunchoi after decoding. For the decoder itself, the changes are minor, because it runs only one time step at a time.
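
For readers who want the shape of that change without opening the notebook, here is a minimal sketch of the pack/pad pattern (the class name and sizes are illustrative, not the notebook's own; the original code used the old Variable API):

import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

class BatchedEncoder(nn.Module):
    def __init__(self, vocab_size, hidden_size, n_layers=1):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, n_layers)

    def forward(self, input_seqs, input_lengths, hidden=None):
        # input_seqs: (max_len, batch) word indices, batch sorted by length descending
        embedded = self.embedding(input_seqs)
        # pack so the GRU skips the padded positions
        packed = pack_padded_sequence(embedded, input_lengths)
        outputs, hidden = self.gru(packed, hidden)
        # unpack back to a padded (max_len, batch, hidden) tensor
        outputs, _ = pad_packed_sequence(outputs)
        return outputs, hidden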


howardyclo commented on May 24, 2024

Hi guys, I implemented more features based on this tutorial (e.g. batched computation for attention) and added some notes.
Check out my repo here: https://github.com/howardyclo/pytorch-seq2seq-example/blob/master/seq2seq.ipynb
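
The core of batched attention is replacing the tutorial's per-batch, per-timestep Python loop with a single batched matrix multiply. A rough sketch of 'general'-style scoring with torch.bmm (function and variable names are mine, not taken from the linked notebook):

import torch
import torch.nn as nn
import torch.nn.functional as F

def batched_general_attention(attn_linear, rnn_output, encoder_outputs):
    # rnn_output:      (1, batch, hidden)   current decoder state
    # encoder_outputs: (max_len, batch, hidden)
    energy = attn_linear(encoder_outputs)        # (max_len, batch, hidden)
    energy = energy.permute(1, 2, 0)             # (batch, hidden, max_len)
    query = rnn_output.transpose(0, 1)           # (batch, 1, hidden)
    scores = torch.bmm(query, energy)            # (batch, 1, max_len)
    return F.softmax(scores, dim=2)              # weights over source positions

# usage: weights = batched_general_attention(nn.Linear(8, 8),
#            torch.randn(1, 3, 8), torch.randn(5, 3, 8))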


ehsanasgari commented on May 24, 2024

Hi,
Thank you for the great work! Would you please add batching to the tutorial as well?


vijendra-rana commented on May 24, 2024

Hello @spro, I have been working on extending this with batching. My code is here: https://github.com/vijendra-rana/Random/blob/master/translation_with_batch.py. I created some fake data for it, but I am getting an error from the loss that says:


RuntimeError: Trying to backward through the graph second time, but the buffers have already been freed. Please specify retain_variables=True when calling backward for the first time.

I understand we cannot backward through the loss twice, but I don't see anywhere that I am doing it twice.
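
Without seeing the full run it is hard to be sure, but one common cause of this error is reusing a tensor from a previous iteration's graph, for example a hidden state carried across batches. A tiny self-contained illustration of that failure mode and the detach() fix, in current PyTorch style (all sizes are arbitrary; the original code used Variable and retain_variables):

import torch
import torch.nn as nn

rnn = nn.GRU(4, 8)                        # input_size=4, hidden_size=8
optimizer = torch.optim.SGD(rnn.parameters(), lr=0.1)
hidden = torch.zeros(1, 2, 8)             # (n_layers, batch, hidden)

for step in range(3):
    x = torch.randn(5, 2, 4)              # (seq_len, batch, input)
    hidden = hidden.detach()              # without this line, the second
                                          # backward() walks into the already
                                          # freed graph of the previous step
    output, hidden = rnn(x, hidden)
    loss = output.pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
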
I also have a question about masking: how would you mask the loss at the encoder? I am not sure how to implement it, since the encoder output has size (seq_len, batch, hidden_size) while the mask has size (batch_size, seq_len).
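
On the shape question specifically, the usual trick is to transpose the (batch, seq_len) mask and add a trailing axis so it broadcasts over the hidden dimension. A minimal sketch with made-up sizes:

import torch

seq_len, batch, hidden = 5, 3, 8
encoder_outputs = torch.randn(seq_len, batch, hidden)
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 1, 1, 1],
                     [1, 1, 0, 0, 0]], dtype=torch.float)  # (batch, seq_len)

# transpose to (seq_len, batch), then unsqueeze so the mask
# broadcasts over the hidden dimension
masked = encoder_outputs * mask.t().unsqueeze(-1)          # (seq_len, batch, hidden)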

Thanks in advance for help :)


vijendra-rana commented on May 24, 2024

Thanks, @spro for your effort in putting these together. Your tutorials are really nice.


physicsman commented on May 24, 2024

I noticed some implementations of batched seq2seq with attention allow for an embedding size that is different from the hidden size. Is there a reason to match the two sizes?
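
For what it's worth, nothing in PyTorch requires the two to match: nn.GRU takes input_size and hidden_size as independent arguments, so matching them in the tutorial is a simplification rather than a constraint. A sketch with distinct (arbitrary) sizes:

import torch.nn as nn

embed_size, hidden_size, vocab_size = 128, 256, 10000
embedding = nn.Embedding(vocab_size, embed_size)
# the GRU's input width follows the embedding, not the hidden state
gru = nn.GRU(input_size=embed_size, hidden_size=hidden_size)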


suwangcompling commented on May 24, 2024

@spro Thanks for the nice code sample.
I had some trouble and am looking for help: I tried to run it out of the box and hit an error in this block:

max_target_length = max(target_lengths)


decoder_input = Variable(torch.LongTensor([SOS_token] * small_batch_size))
decoder_hidden = encoder_hidden[:decoder_test.n_layers] # Use last (forward) hidden state from encoder
all_decoder_outputs = Variable(torch.zeros(max_target_length, small_batch_size, decoder_test.output_size))

if USE_CUDA:
    all_decoder_outputs = all_decoder_outputs.cuda()
    decoder_input = decoder_input.cuda()

# Run through decoder one time step at a time
for t in range(max_target_length):
    decoder_output, decoder_hidden, decoder_attn = decoder_test(
        decoder_input, decoder_hidden, encoder_outputs
    )
    all_decoder_outputs[t] = decoder_output # Store this step's outputs
    decoder_input = target_batches[t] # Next input is current target

# Test masked cross entropy loss
loss = masked_cross_entropy(
    all_decoder_outputs.transpose(0, 1).contiguous(),
    target_batches.transpose(0, 1).contiguous(),
    target_lengths
)
print('loss', loss.data[0])

The error reads as follows:


---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-28-babf231e41ef> in <module>()
     13 for t in range(max_target_length):
     14     decoder_output, decoder_hidden, decoder_attn = decoder_test(
---> 15         decoder_input, decoder_hidden, encoder_outputs
     16     )
     17     all_decoder_outputs[t] = decoder_output # Store this step's outputs

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.pyc in __call__(self, *input, **kwargs)
    489             result = self._slow_forward(*input, **kwargs)
    490         else:
--> 491             result = self.forward(*input, **kwargs)
    492         for hook in self._forward_hooks.values():
    493             hook_result = hook(self, input, result)

<ipython-input-24-43d7954b3ba4> in forward(self, input_seq, last_hidden, encoder_outputs)
     35         # Calculate attention from current RNN state and all encoder outputs;
     36         # apply to encoder outputs to get weighted average
---> 37         attn_weights = self.attn(rnn_output, encoder_outputs)
     38         context = attn_weights.bmm(encoder_outputs.transpose(0, 1)) # B x S=1 x N
     39 

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.pyc in __call__(self, *input, **kwargs)
    489             result = self._slow_forward(*input, **kwargs)
    490         else:
--> 491             result = self.forward(*input, **kwargs)
    492         for hook in self._forward_hooks.values():
    493             hook_result = hook(self, input, result)

<ipython-input-22-61485b548d0f> in forward(self, hidden, encoder_outputs)
     27             # Calculate energy for each encoder output
     28             for i in range(max_len):
---> 29                 attn_energies[b, i] = self.score(hidden[:, b], encoder_outputs[i, b].unsqueeze(0))
     30 
     31         # Normalize energies to weights in range 0 to 1, resize to 1 x B x S

<ipython-input-22-61485b548d0f> in score(self, hidden, encoder_output)
     40         elif self.method == 'general':
     41             energy = self.attn(encoder_output)
---> 42             energy = hidden.dot(energy)
     43             return energy
     44 

RuntimeError: Expected argument self to have 1 dimension, but has 2


NLPScott commented on May 24, 2024

@suwangcompling Try hidden = hidden.squeeze() and encoder_output = encoder_output.squeeze().
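
Concretely, newer PyTorch versions require both arguments of dot() to be 1-D, while hidden[:, b] and the projected encoder output are both 1 x N here, hence the "Expected argument self to have 1 dimension" error. A self-contained illustration of the squeeze fix (sizes arbitrary):

import torch
import torch.nn as nn

hidden_size = 8
attn = nn.Linear(hidden_size, hidden_size)
hidden = torch.randn(1, hidden_size)           # shaped like hidden[:, b]
encoder_output = torch.randn(1, hidden_size)   # like encoder_outputs[i, b].unsqueeze(0)

energy = attn(encoder_output)                  # still 1 x hidden_size
# dot() needs 1-D tensors, so squeeze away the leading dimension first
score = hidden.squeeze(0).dot(energy.squeeze(0))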

