aa1607 commented on June 8, 2024

Hi, yes, I'm a little confused about a few things.

def train(inp, target):
    ...
    for c in range(chunk_len):
        output, hidden = decoder(inp[c], hidden)
        loss += criterion(output, target[c])

It looks like you're defining each training loop of the RNN to cycle individually through the characters in the sequence.

  1. Wouldn't that mean that one training loop covers only a small part of the whole dataset, and definitely not a whole epoch?

  2. I noticed that on this page you changed the forward method to also accept a non-unit batch dimension:
    https://github.com/spro/char-rnn.pytorch/blob/master/train.py . Is there any reason you went with batch_size = 1 in this tutorial?

  3. Also, I thought you didn't need to break up your sequence inputs to the RNN? E.g. if I take out the for loop and just feed in the whole input:

def train(input, target):
    output, h = charnn(input, hidden)

the model doesn't return an error. Would feeding in a whole sequence at a time, rather than one character at a time, not work just as well and be simpler?

If you cycle through each character individually as you've done, does that make the model any different from one that goes sequence by sequence?

I was thinking that in an attention model the for loop might help you 'collect up' the hidden state at each timestep, since only the last hidden state is returned by default, but you're not applying attention here. So I can't think of a reason...
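To make the question concrete, here is a minimal sketch of what I mean, using a toy scalar tanh RNN in plain Python rather than the tutorial's decoder. The names rnn_step and rnn_sequence and the weights are hypothetical, made up for illustration. As far as I understand, PyTorch's nn.RNN and nn.GRU behave analogously when given a whole sequence: they return an output for every timestep along with the final hidden state.

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.8, b=0.1):
    """One timestep of a toy scalar tanh RNN (weights are arbitrary)."""
    return math.tanh(w_x * x + w_h * h + b)

def rnn_sequence(xs, h0=0.0):
    """Consume a whole sequence; return every hidden state and the last one."""
    states = []
    h = h0
    for x in xs:
        h = rnn_step(x, h)
        states.append(h)
    return states, h

xs = [0.2, -1.0, 0.7, 0.0]  # stand-in for an encoded character sequence

# Sequence at a time: one call handles all timesteps.
all_states, last = rnn_sequence(xs)

# Character at a time: carry the hidden state by hand, as in the tutorial loop.
h = 0.0
loop_states = []
for x in xs:
    h = rnn_step(x, h)
    loop_states.append(h)

# Both routes run the same recurrence in the same order.
print(loop_states == all_states)
print(last == all_states[-1])
```

If that carries over to the real model, the per-character loop and the whole-sequence call should compute the same thing, and the whole-sequence call already hands back the state at every timestep.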

Thanks for any help, and I think your tutorials are fantastic.

from practical-pytorch.
