
Comments (5)

jcjohnson commented on August 24, 2024

This is not currently implemented, but it wouldn't be hard to do. During training the RNN learns a language model, which is able to assign a probability to an arbitrary sequence of tokens; by sampling from this language model we can generate new text, but if you have an existing piece of text that you want to score then you can just compute its probability under the language model.

All you need to do is run a forward pass of the trained model on your new piece of text; the training loss on that text is its average negative log-probability per token, so it is (up to that transformation) equivalent to the probability of the text. To implement this, you'd need to load the model and put it in evaluate mode as in sample.lua; you'll then need to construct a CrossEntropyCriterion and use it to compute the loss as in train.lua.
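For reference, a rough, untested sketch of what such a script might look like, assuming a checkpoint produced by train.lua and the LanguageModel interface used by sample.lua (encode_string plus the usual forward pass over a (1, T) tensor of token indices); the script name and flag names are made up for illustration:

-- score.lua (illustrative sketch only, not part of the repo)
require 'torch'
require 'nn'
require 'LanguageModel'

local cmd = torch.CmdLine()
cmd:option('-checkpoint', 'cv/checkpoint_final.t7')  -- hypothetical flag: trained checkpoint
cmd:option('-text', '')                              -- hypothetical flag: the text to score
local opt = cmd:parse(arg)

-- Load the trained model and switch to evaluate mode, as in sample.lua.
local checkpoint = torch.load(opt.checkpoint)
local model = checkpoint.model
model:evaluate()
model:resetStates()

-- Encode the text into token indices; inputs are tokens 1..T-1 and
-- targets are tokens 2..T (next-token prediction).
local encoded = model:encode_string(opt.text):view(1, -1)
local T = encoded:size(2)
local x = encoded[{{}, {1, T - 1}}]
local y = encoded[{{}, {2, T}}]

-- Average cross-entropy per token, the same quantity train.lua reports as loss.
local scores = model:forward(x)
local crit = nn.CrossEntropyCriterion():type(scores:type())
local loss = crit:forward(scores:view(T - 1, -1), y:view(T - 1))

print(string.format('mean negative log-likelihood per token: %f', loss))
print(string.format('total log-probability of the text: %f', -loss * (T - 1)))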

from torch-rnn.

ChrisCummins commented on August 24, 2024

Thanks for the prompt and complete answer! I will definitely have a go at implementing this. If successful, would you be open to a pull request with a score.lua script (or some more appropriate name)?

from torch-rnn.

aliabbasjp commented on August 24, 2024

@ChrisCummins It is already implemented here:
billzorn@69d91a3

@jcjohnson might want to accept a pull request

from torch-rnn.

ChrisCummins commented on August 24, 2024

Thanks for the pointer @aliabbasjp! Reading through the diff, it looks like this fork has diverged quite a bit from upstream. I'm assuming the usage would be something like this?

th train.lua -input_h5 <corpus-to-compare-likeness-against> -init_from <checkpoint-trained-on-dataset> -unk 1

Perhaps @billzorn could weigh in.

from torch-rnn.

ianni67 commented on August 24, 2024

@aliabbasjp sorry for being dumb, but I can't find, in the code you pointed to, the function or the options that do what @ChrisCummins asked for. Would you please give us some more information?
Thank you very much in advance.

[edit]: I tend to blame students when they don't do their homework before asking for help, so I went back and read each *.lua file in the directory more carefully. I found that eval.lua seems to do something quite similar to what Chris was asking for, so the solution should be to modify eval.lua so that it evaluates text from an input file instead of evaluating a split of the training dataset.
Am I right?

from torch-rnn.
