hunkim / word-rnn-tensorflow

License: MIT License

word-rnn-tensorflow's Introduction

word-rnn-tensorflow


Multi-layer Recurrent Neural Networks (LSTM, RNN) for word-level language models in Python using TensorFlow.

Mostly reuses code from https://github.com/sherjilozair/char-rnn-tensorflow, which was inspired by Andrej Karpathy's char-rnn.

Requirements

TensorFlow and NumPy (the issues below reference TensorFlow versions from 0.11 through 1.2).

Basic Usage

To train with default parameters on the tinyshakespeare corpus, run:

python train.py

To sample from a trained model, run:

python sample.py

To pick words using beam search, pass --pick 2. Beam search can be further customized with the --width parameter, which sets the number of beams to search with. For example:

python sample.py --pick 2 --width 4

Sample output

Word-RNN

LEONTES:
Why, my Irish time?
And argue in the lord; the man mad, must be deserved a spirit as drown the warlike Pray him, how seven in.

KING would be made that, methoughts I may married a Lord dishonour
Than thou that be mine kites and sinew for his honour
In reason prettily the sudden night upon all shalt bid him thus again. times than one from mine unaccustom'd sir.

LARTIUS:
O,'tis aediles, fight!
Farewell, it himself have saw.

SLY:
Now gods have their VINCENTIO:
Whipt fearing but first I know you you, hinder truths.

ANGELO:
This are entitle up my dearest state but deliver'd.

DUKE look dissolved: seemeth brands
That He being and
full of toad, they knew me to joy.

Char-RNN

ESCALUS:
What is our honours, such a Richard story
Which you mark with bloody been Thilld we'll adverses:
That thou, Aurtructs a greques' great
Jmander may to save it not shif theseen my news
Clisters it take us?
Say the dulterout apy showd. They hance!

AnBESS OF GUCESTER:
Now, glarding far it prick me with this queen.
And if thou met were with revil, sir?

KATHW:
I must not my naturation disery,
And six nor's mighty wind, I fairs, if?

Messenger:
My lank, nobles arms;

Beam search

Beam search differs from the other --pick options in that it does not greedily pick single words; rather, it expands the most promising nodes and keeps a running score for each beam.
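
For intuition, here is a minimal, self-contained sketch of the idea; this is not the repository's beam.py, and step_fn (a function returning the next-word probability vector for a partial sequence) is a stand-in:

    import numpy as np

    def beam_search(step_fn, start_token, width=4, length=10):
        """Keep the `width` highest-scoring partial sequences at each step."""
        beams = [([start_token], 0.0)]  # (sequence, cumulative log-probability)
        for _ in range(length):
            candidates = []
            for seq, score in beams:
                probs = step_fn(seq)  # probabilities over the whole vocabulary
                for tok in np.argsort(probs)[-width:]:  # top-width expansions
                    candidates.append((seq + [tok], score + np.log(probs[tok])))
            # keep only the `width` best-scoring beams overall
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
        return beams[0][0]  # the highest-scoring sequence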

Word-RNN (with beam search)

# python sample.py --prime "KING RICHARD III:" -n 100 --pick 2 --width 4

KING RICHARD III:
you, and and and and have been to be hanged, I am not to be touched?

Provost:
A Bohemian born, for tying his own train,
Forthwith by all that converses more with a crow-keeper;
I have drunk, Broach'd with the acorn cradled. Follow.

FERDINAND:
Who would not be conducted.

BISHOP OF ELY:
If you have been a-bed an acre of barren ground, hath holy;
I warrant, my lord restored of noon.

ISABELLA:
'Save my master and his shortness whisper me to the pedlar;
Money's a medler.
That I will pamper it to complain.

VOLUMNIA:
Indeed, I am

Word-RNN (without beam search)

# python sample.py --prime "KING RICHARD III:" -n 100

KING RICHARD III:
marry, so and unto the wind have yours;
And thou Juliet, sir?

JULIET:
Well, wherefore speak your disposition cousin;
May thee flatter.
My hand will answer him;
e not to your Mariana Below these those and take this life,
That stir not light of reason.
The time Lucentio keeps a root from you.
Cursed be his potency,
It was my neighbour till the birth and I drank stay.

MENENIUS:
Here's the matter,
I know take this sour place,
they know allegiance Had made you guilty.
You do her bear comfort him between him or our noble bosom he did Bolingbroke's

Projects

If you have a project that uses this word-rnn, please let us know; I'll list it here.

Contribution

Your comments (issues) and PRs are always welcome.

word-rnn-tensorflow's People

Contributors

bahmanh, hunkim, luser, manglakaran, normanheckscher, rishabh254, scubbo, supermansi, wichtounet


word-rnn-tensorflow's Issues

TensorBoard - Embedding Projector

Edit: never mind, I got the embedding projector to work and will submit a pull request to add it to the project; it's a really nice feature for exploring the data.

Output of sample.py

Hi,

I trained a model on the tinyshakespeare dataset. Then, when I run sample.py, it keeps giving me output like this:

Fame,
Fame,
Fame, full complete; How many it trust no darest to lour? Dive, A merriment fled on a month. Nurse: As he, brother, at the tale his exile, thou shalt thereto with them swarm on Richmond.

There are always three "Fame"s at the beginning. Even when I change num to 1, I get output like:

Fame,
Fame,
Fame, Lancaster,

And in the sample function of model.py, I see this:

    ret = prime
    # ... other code ...
    for n in range(num):
        ...
        pred = words[sample]
        ret += ' ' + pred
        word = pred
    return ret

So, I think the prime string will also be included in the output, right?

Can you help me in understanding the output? Thank you.

Word Embeddings

First of all, this is a very good repo; thanks to all who contributed. This is not actually an issue, just a question: did you use any word embeddings (like GloVe, doc2vec, or fastText) for the input words? If so, how can we change from one to another? If not, where should we add them, and how can we organize the dependent parameters such as vocab?
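
Not built in, as far as the code shows: model.py learns the "embedding" variable from scratch. A hedged sketch of how pretrained vectors could be wired in, assuming you have already built a vocab-aligned matrix (the glove_for_vocab.npy file here is hypothetical); note the embedding width must equal rnn_size, because the model feeds embeddings straight into the cell:

    import numpy as np
    import tensorflow as tf

    # pretrained: shape [vocab_size, rnn_size], row i aligned with word id i
    pretrained = np.load('glove_for_vocab.npy')  # hypothetical file

    with tf.variable_scope('rnnlm'):
        embedding = tf.get_variable(
            "embedding",
            initializer=tf.constant(pretrained, dtype=tf.float32),
            trainable=False)  # True to fine-tune the vectors during training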

rnn_decoder -> attention_decoder ?

Hello everyone, I'm new to TensorFlow.
The code works perfectly on my machine, but I wonder how to change
legacy_seq2seq.rnn_decoder() -> legacy_seq2seq.attention_decoder()?

I have tried a lot, but it still doesn't work, so if anyone can tell me how to define attention_states, I'd appreciate it.
Thanks!

Score a sentence ?

Hello,
Is there a way to compute a score for a given sentence, say a perplexity score?
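
Not out of the box, but since model.probs is the softmax over the next word, log-probabilities can be accumulated by hand. An untested sketch along the lines of model.sample(), assuming a restored session and that every word of the sentence is in the vocab:

    import numpy as np
    import tensorflow as tf

    def score_sentence(model, sess, vocab, sentence):
        """Return (mean negative log-likelihood, perplexity) of a sentence."""
        tokens = sentence.split()
        state = sess.run(model.cell.zero_state(1, tf.float32))
        total_logprob = 0.0
        for prev, nxt in zip(tokens[:-1], tokens[1:]):
            x = np.zeros((1, 1))
            x[0, 0] = vocab[prev]
            feed = {model.input_data: x, model.initial_state: state}
            probs, state = sess.run([model.probs, model.final_state], feed)
            total_logprob += np.log(probs[0][vocab[nxt]])
        nll = -total_logprob / (len(tokens) - 1)
        return nll, np.exp(nll)  # perplexity = exp(mean NLL)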

Vocab index ordering?

Why do we calculate word counts if we ultimately include all words and sort the vocab alphabetically? Mostly I'm wondering if you ordered it with the most common words first, and later reverted to alphabetical ordering for some reason. Would it be bad to have the most common words first? Or do we expect the model would learn better with a random ordering?

In utils.py:

        vocabulary_inv = [x[0] for x in word_counts.most_common()]
        vocabulary_inv = list(sorted(vocabulary_inv))

beam search pick - error

Hi:

Training and standard sampling work perfectly with TensorFlow 0.12.1 on 64-bit Ubuntu with a GPU.

The problem occurs when choosing the beam search pick for sampling. After 15 minutes of trying to generate the default 200 words, the following warning appears:

    /home/aaron/Desktop/word-rnn-tensorflow/beam.py:26: RuntimeWarning: divide by zero encountered in log
      cand_scores = np.array(live_scores)[:, None] - np.log(self.probs)

After 25 minutes there is still no terminal output, and nothing in the redirected text file.

Any ideas how I can fix this?

Thanks
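
The warning itself comes from taking the log of exact zeros in the probability vector, which turns beam scores into -inf. A hedged sketch of a patch for the quoted line in beam.py (clipping is my workaround, not the repository's):

    import numpy as np

    # clip zero probabilities before the log to avoid -inf candidate scores
    probs = np.clip(self.probs, 1e-12, 1.0)
    cand_scores = np.array(live_scores)[:, None] - np.log(probs)

The long runtime may be a separate issue; beam search costs roughly width times more model evaluations per word than standard sampling.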

Save directory checking?

Should we put the save directory under the data directory? Or, at the least, we should check whether the directory exists.
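
Checking for existence is cheap; a minimal sketch of what train.py could do before writing any checkpoints:

    import os

    # create the save directory on demand instead of failing later
    if not os.path.isdir(args.save_dir):
        os.makedirs(args.save_dir)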

Writing the output into a file instead of printing in console

Since the Windows cmd has a limit of around 5,000 words, it is problematic if you want to generate a sample larger than that. I modified the sample.py file in my fork so that it appends the strings to a file called "output.txt" instead. Feel free to look at it and implement it in your code. I had never used Python before this, so it may have issues that I am not seeing.
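
For reference, a sketch of that change in sample.py (argument names as in this repo; the output path is my choice):

    import io

    text = model.sample(sess, words, vocab, args.n, args.prime,
                        args.sample, args.pick, args.width)
    # append to a UTF-8 file instead of printing to the console
    with io.open('output.txt', 'a', encoding='utf-8') as f:
        f.write(text + u'\n')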

UnicodeEncodeError When Training With Different Encoding

I first trained the seq-to-seq model using a UTF-8 encoding (instead of the default ASCII). When sampling from this model, I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character '\u2018' in position 58: ordinal not in range(128)

How can I fix this?
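
The encode error usually comes from printing to a terminal whose encoding is ASCII, not from the model itself. Forcing UTF-8 for Python's standard output when invoking the script is one common workaround (an assumption about your setup, but cheap to try):

    PYTHONIOENCODING=utf-8 python sample.py

Writing the sample to a file opened with encoding='utf-8' (as in the issue above) sidesteps the console entirely.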

Changes in Model.py for Python 3.5.2

Thank you for this wonderful program. I use it with a few writings of the German philosopher Friedrich Wilhelm Nietzsche.
To use Model.py with Python 3.5.2, the following changes must be made (marked "#changed"):

import tensorflow as tf
# from tensorflow.models.rnn import tf.nn.rnn_cell
# from tensorflow.models.rnn import tf.nn.seq2seq

import numpy as np

class Model():
    def __init__(self, args, infer=False):
        self.args = args
        if infer:
            args.batch_size = 1
            args.seq_length = 1

        if args.model == 'rnn':
            cell_fn = tf.nn.rnn_cell.BasicRNNCell #changed
        elif args.model == 'gru':
            cell_fn = tf.nn.rnn_cell.GRUCell #changed
        elif args.model == 'lstm':
            cell_fn = tf.nn.rnn_cell.BasicLSTMCell #changed
        else:
            raise Exception("model type not supported: {}".format(args.model))

        cell = cell_fn(args.rnn_size)

        self.cell = cell = tf.nn.rnn_cell.MultiRNNCell([cell] * args.num_layers) #changed

        self.input_data = tf.placeholder(tf.int32, [args.batch_size, args.seq_length])
        self.targets = tf.placeholder(tf.int32, [args.batch_size, args.seq_length])
        self.initial_state = cell.zero_state(args.batch_size, tf.float32)

        with tf.variable_scope('rnnlm'):
            softmax_w = tf.get_variable("softmax_w", [args.rnn_size, args.vocab_size])
            softmax_b = tf.get_variable("softmax_b", [args.vocab_size])
            with tf.device("/cpu:0"):
                embedding = tf.get_variable("embedding", [args.vocab_size, args.rnn_size])
                inputs = tf.split(1, args.seq_length, tf.nn.embedding_lookup(embedding, self.input_data))
                inputs = [tf.squeeze(input_, [1]) for input_ in inputs]

        def loop(prev, _):
            prev = tf.matmul(prev, softmax_w) + softmax_b
            prev_symbol = tf.stop_gradient(tf.argmax(prev, 1))
            return tf.nn.embedding_lookup(embedding, prev_symbol)

        outputs, last_state = tf.nn.seq2seq.rnn_decoder(inputs, self.initial_state, cell, loop_function=loop if infer else None, scope='rnnlm') #changed
        output = tf.reshape(tf.concat(1, outputs), [-1, args.rnn_size])
        self.logits = tf.matmul(output, softmax_w) + softmax_b
        self.probs = tf.nn.softmax(self.logits)
        loss = tf.nn.seq2seq.sequence_loss_by_example([self.logits], #changed
                [tf.reshape(self.targets, [-1])],
                [tf.ones([args.batch_size * args.seq_length])],
                args.vocab_size)
        self.cost = tf.reduce_sum(loss) / args.batch_size / args.seq_length
        self.final_state = last_state
        self.lr = tf.Variable(0.0, trainable=False)
        tvars = tf.trainable_variables()
        grads, _ = tf.clip_by_global_norm(tf.gradients(self.cost, tvars),
                args.grad_clip)
        optimizer = tf.train.AdamOptimizer(self.lr)
        self.train_op = optimizer.apply_gradients(list(zip(grads, tvars))) #changed Python 3.5

    def sample(self, sess, words, vocab, num=200, prime='first all', sampling_type=1):
        state = self.cell.zero_state(1, tf.float32).eval()
        prime = list(vocab.keys())[2] #changed Python 3.5
        print(prime) #changed Python 3.5
        for word in [prime]:
            print (word)
            x = np.zeros((1, 1))
            x[0, 0] = vocab[word]
            feed = {self.input_data: x, self.initial_state:state}
            [state] = sess.run([self.final_state], feed)

        def weighted_pick(weights):
            t = np.cumsum(weights)
            s = np.sum(weights)
            return(int(np.searchsorted(t, np.random.rand(1)*s)))

        ret = prime
        word = prime
        for n in range(num):
            x = np.zeros((1, 1))
            x[0, 0] = vocab[word]
            feed = {self.input_data: x, self.initial_state:state}
            [probs, state] = sess.run([self.probs, self.final_state], feed)
            p = probs[0]

            if sampling_type == 0:
                sample = np.argmax(p)
            elif sampling_type == 2:
                if word == '\n':
                    sample = weighted_pick(p)
                else:
                    sample = np.argmax(p)
            else: # sampling_type == 1 default:
                sample = weighted_pick(p)

            pred = words[sample]
            ret += ' ' + pred
            word = pred
        return ret

seq2seq can not import name

Hey, thank you for this model. I'm a noob, trying to use this to push my writing forward. I'd like to use this model to see what sort of word couplings it could produce to incorporate into my novel. Thanks for the help.

I am getting this error, using Python 2.7 with TensorFlow 1.2 on macOS Sierra.

    Stevens-MacBook-Pro:char-rnn-tensorflow-master StevenOchs$ python train.py
    Traceback (most recent call last):
      File "train.py", line 10, in <module>
        from model import Model
      File "/Users/StevenOchs/Desktop/char-rnn-tensorflow-master/model.py", line 3, in <module>
        from tensorflow.contrib import legacy_seq2seq
    ImportError: cannot import name legacy_seq2seq

Any thoughts?
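
tf.contrib.legacy_seq2seq exists from TensorFlow 1.0 onward, so this error usually means the interpreter running train.py is bound to an older TensorFlow than you think. A quick check:

    import tensorflow as tf
    print(tf.__version__)  # legacy_seq2seq needs 1.0 or later
    from tensorflow.contrib import legacy_seq2seq  # should now import cleanly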

IndexError: list index out of range at pred = words[sample]

Cannot reproduce it anymore, but for the record:

hunkim:~/word-rnn-tensorflow$ python sample.py --prime="kmalloc"
kmalloc
Traceback (most recent call last):
  File "sample.py", line 42, in <module>
    main()
  File "sample.py", line 25, in main
    sample(args)
  File "sample.py", line 39, in sample
    print(model.sample(sess, words, vocab, args.n, args.prime, args.sample))
  File "/home/hunkim/word-rnn-tensorflow/model.py", line 97, in sample
    pred = words[sample]
IndexError: list index out of range
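
One plausible cause, assuming the prime word itself is in the vocab: np.searchsorted can return len(t) when floating-point rounding pushes the drawn value past the last cumulative sum, which then indexes one past the end of words. A defensive sketch of weighted_pick:

    import numpy as np

    def weighted_pick(weights):
        t = np.cumsum(weights)
        s = np.sum(weights)
        idx = int(np.searchsorted(t, np.random.rand(1) * s))
        return min(idx, len(weights) - 1)  # clamp the float-rounding edge case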

slowing down over time

With the new changes, training now slows down over time. At first it runs at normal speed, but after about 1,000 batches the per-batch time more than doubles, and eventually a batch takes almost 50 times as long, seeming to grow geometrically at some point.

My batch times went from 0.7 seconds each to 30 seconds each in under 10,000 batches.

I have read that this usually means the graph has to be finalized, or the like, but I'm still trying to figure out how that applies here, so maybe someone else will beat me to it.

If I stop the training and resume, the time will go back to around 0.7 seconds per batch.

Edit: I just removed the graph code and tested, and something is still doing it. I reverted to the original code and it doesn't appear to have the same problem, so it's definitely not the graph code alone; something else is causing it for sure.
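
A cheap way to confirm the usual culprit (ops being added to the graph inside the training loop) is to freeze the graph once it is built; any later op creation then raises immediately instead of silently bloating the graph:

    # after building the model and creating the session, before the loop:
    sess.graph.finalize()  # raises RuntimeError if the loop adds new ops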

Beam search?

Currently, we are using weighted_pick to select outputs:

    def weighted_pick(weights):
        t = np.cumsum(weights)
        s = np.sum(weights)
        return int(np.searchsorted(t, np.random.rand(1) * s))

Should we also add beam search as an option?

See the beam search discussion in TensorFlow issue tensorflow/tensorflow#654.

Isn't weighted_pick() giving out random words?

Hi, does weighted_pick() return random words?

With sampling_type=1, return(int(np.searchsorted(t, np.random.rand(1)*s))) looks like it makes a random choice out of the array t, and that's what I'm getting right now. Is it intentional? Or am I doing something wrong?

AttributeError: 'module' object has no attribute 'merge_all'

# python3 train.py

    reading text file
    Traceback (most recent call last):
      File "train.py", line 134, in <module>
        main()
      File "train.py", line 52, in main
        train(args)
      File "train.py", line 88, in train
        merged = tf.summary.merge_all()
    AttributeError: 'module' object has no attribute 'merge_all'

Running it, I get this error. I installed TensorFlow with:

    export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0-cp34-cp34m-linux_x86_64.whl
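
That wheel is TensorFlow 0.11, but the tf.summary module the code calls was introduced in 0.12, so upgrading TensorFlow is the clean fix. If you must stay on 0.11, a hedged compatibility shim:

    import tensorflow as tf

    # tf.summary.merge_all first appeared in TF 0.12
    try:
        merged = tf.summary.merge_all()
    except AttributeError:
        merged = tf.merge_all_summaries()  # pre-0.12 spelling

(The other tf.summary calls in train.py would need the same treatment.)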

Using Word vs char level models

How can I choose between word- and character-level models? I don't see a parameter for it in the config/train file. Thanks!

PS: Can the word level models support pretrained word embeddings?

Using Word RNN for word prediction

I am an undergraduate research assistant at Purdue University. I am attempting to use the word RNN you have hosted on GitHub to improve automated monkey testing (by supplying context-relevant input instead of the random input the monkey uses). In short, I would like to give the RNN a word or phrase (the context) and have its output (for the monkey to use as input) be the next predicted word. Right now, however, the word RNN implementation only does sampling; it cannot take input (a phrase or word) and output the next predicted word. Any suggestions on how to go about changing the implementation for my needs?

Example:
In a movie app, we record the following sequence:
⟨Movie,Search,Star Trek\n⟩.
And we train the RNN on similar datasets with these kind of sequences.

Later,
If I give the RNN "Movie, Search" as input, I am looking for the predicted next word, which should be relevant input for the app.
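
A sketch of the core change, following the structure of model.sample(): feed the context words one at a time to warm up the state, then take the argmax of the final distribution. predict_next_word is a hypothetical helper, untested, and assumes every context word is in the vocab:

    import numpy as np
    import tensorflow as tf

    def predict_next_word(model, sess, words, vocab, context):
        """Return the single most likely word after a context phrase."""
        state = sess.run(model.cell.zero_state(1, tf.float32))
        probs = None
        for w in context.split():
            x = np.zeros((1, 1))
            x[0, 0] = vocab[w]
            feed = {model.input_data: x, model.initial_state: state}
            probs, state = sess.run([model.probs, model.final_state], feed)
        return words[int(np.argmax(probs[0]))]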

Variable rnnlm/softmax_w already exists, disallowed. Did you mean to set reuse=True in VarScope?

Hello. Please help me with a little problem. I want to write a simple REST API with Flask:


from flask import Flask
import numpy as np
import tensorflow as tf

import argparse
import time
import os
from six.moves import cPickle

from utils import TextLoader
from model import Model

app = Flask(__name__)

def sample(n, pick, width, prime):
    save_dir = './models/tinyshakespeare'
    sample = 1
    with open(os.path.join(save_dir, 'config.pkl'), 'rb') as f:
        saved_args = cPickle.load(f)
    with open(os.path.join(save_dir, 'words_vocab.pkl'), 'rb') as f:
        words, vocab = cPickle.load(f)
    model = Model(saved_args, True)
    with tf.Session() as sess:
        tf.global_variables_initializer().run()
        saver = tf.train.Saver(tf.global_variables())
        ckpt = tf.train.get_checkpoint_state(save_dir)
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)
            return model.sample(sess, words, vocab, n, prime, sample, pick, width) 

@app.route("/<int:n>/<int:pick>/<int:width>/<prime>")
def hello(n, pick, width, prime):
    return sample(n, pick, width, prime)


if __name__ == "__main__":

    app.run(debug=True, use_reloader=True)

The first time in the browser it works:

http://127.0.0.1:5000/200/2/4/shakespeare
text:
&C: BIANCA: Believe me, boy; and me, sir, of my troth, I am not so. LADY ANNE: I will not be a man to shrink from him, Or else the duke is great; but to be back'd with the Tower. KING EDWARD IV: Why, here's a man of fourscore three, And make Clowder with the Tower. QUEEN ELIZABETH: Why, then I have not wrong'd. KING RICHARD III: Now, sir, thou liest; and I am loath to see it. Wherein, my lord. KING RICHARD III: Now, sir, thou art not fish; if thou couldst, thou hast not wrong'd. KING RICHARD III: Now, sir, I know you not? KING RICHARD III: Why, let me go: I am in estimation; it is not so far officious; for I am undone! I am too senseless--obstinate, my lord, And give me all. KING RICHARD III: Ay, if you faint, as chaste, by that. BAPTISTA: Where is the matter? DUKE OF YORK: I will not come to me that I am able to see the writing. DUKE OF AUMERLE: My lord, I am a man that mutinies in his own blood, Or in the midst of this bright-shining day, I will not spell. But come, I...

But the second time:
http://127.0.0.1:5000/200/2/4/shakespeare
I get an error:
ValueError: Variable rnnlm/softmax_w already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "/home/firestarter/word-rnn-tensorflow-master/model.py", line 54, in __init__
    softmax_w = tf.get_variable("softmax_w", [args.rnn_size, args.vocab_size])
  File "api.py", line 22, in sample
    model = Model(saved_args, True)
  File "api.py", line 33, in hello
    return sample(n, pick, width, prime)
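
Model.__init__ adds variables such as rnnlm/softmax_w to the process-wide default graph, so constructing a new Model on every request collides with the one built by the first request. A sketch of the usual fix: build the graph and restore the checkpoint once at startup (with save_dir, saved_args, words, and vocab loaded at module level exactly as in the snippet above), and only call sess.run per request:

    # build the graph and session once, at import time
    model = Model(saved_args, True)
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver(tf.global_variables())
    ckpt = tf.train.get_checkpoint_state(save_dir)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)

    @app.route("/<int:n>/<int:pick>/<int:width>/<prime>")
    def hello(n, pick, width, prime):
        # only runs the existing graph; nothing new is created per request
        return model.sample(sess, words, vocab, n, prime, 1, pick, width)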

BLEU Score

How can I compute a BLEU score on the model's predictions?
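
BLEU is not implemented in this repo, but NLTK's implementation is a common choice for scoring generated text against a reference (a sketch, assuming nltk is installed):

    from nltk.translate.bleu_score import sentence_bleu

    reference = "the king is dead".split()
    generated = "the king is fled".split()
    print(sentence_bleu([reference], generated))  # score in [0, 1]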

Why is args.vocab_size passed in sequence_loss_by_example?

I looked up the documentation for sequence_loss_by_example and it doesn't seem to take vocab_size as an argument. I'd really appreciate it if you could help me understand what this argument is doing. Thanks a lot!
loss = seq2seq.sequence_loss_by_example([self.logits], [tf.reshape(self.targets, [-1])], [tf.ones([args.batch_size * args.seq_length])], args.vocab_size)
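
If I remember the API history correctly, very early TensorFlow releases (the tf.models.rnn era of seq2seq) did take a vocabulary-size argument in that position, and this call looks like a leftover from then. Against tf.contrib.legacy_seq2seq the argument is gone, and the extra positional silently binds to average_across_timesteps instead, so the call would be just:

    loss = legacy_seq2seq.sequence_loss_by_example(
        [self.logits],
        [tf.reshape(self.targets, [-1])],
        [tf.ones([args.batch_size * args.seq_length])])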

How I can continue train.py after reboot?

Hello! My server was rebooted and I need to continue training, but I get an error:

    [root@adm word-rnn-tensorflow]# python3.6 train.py --data_dir="./data/adm_seotexts" --save_dir="./save/adm_seotexts" --init-from=./save/adm_seotexts

    usage: train.py [-h] [--data_dir DATA_DIR] [--input_encoding INPUT_ENCODING]
                    [--log_dir LOG_DIR] [--save_dir SAVE_DIR]
                    [--rnn_size RNN_SIZE] [--num_layers NUM_LAYERS]
                    [--model MODEL] [--batch_size BATCH_SIZE]
                    [--seq_length SEQ_LENGTH] [--num_epochs NUM_EPOCHS]
                    [--save_every SAVE_EVERY] [--grad_clip GRAD_CLIP]
                    [--learning_rate LEARNING_RATE] [--decay_rate DECAY_RATE]
                    [--gpu_mem GPU_MEM] [--init_from INIT_FROM]
    train.py: error: unrecognized arguments: --init-from=./save/adm_seotext

But I have the needed data in the save path:

    [root@adm word-rnn-tensorflow]# ls save/adm_seotexts/
    checkpoint  config.pkl  words_vocab.pkl
    model.ckpt-82000.data-00000-of-00001  model.ckpt-82000.index  model.ckpt-82000.meta
    model.ckpt-83000.data-00000-of-00001  model.ckpt-83000.index  model.ckpt-83000.meta
    model.ckpt-84000.data-00000-of-00001  model.ckpt-84000.index  model.ckpt-84000.meta
    model.ckpt-85000.data-00000-of-00001  model.ckpt-85000.index  model.ckpt-85000.meta
    model.ckpt-86000.data-00000-of-00001  model.ckpt-86000.index  model.ckpt-86000.meta
Other variants don't work either (for example, with only --init-from):

    [root@tubeadmin word-rnn-tensorflow]# python3.6 train.py --init-from="./save/tubeadmin_seotexts"

    usage: train.py [-h] [--data_dir DATA_DIR] [--input_encoding INPUT_ENCODING]
                    [--log_dir LOG_DIR] [--save_dir SAVE_DIR]
                    [--rnn_size RNN_SIZE] [--num_layers NUM_LAYERS]
                    [--model MODEL] [--batch_size BATCH_SIZE]
                    [--seq_length SEQ_LENGTH] [--num_epochs NUM_EPOCHS]
                    [--save_every SAVE_EVERY] [--grad_clip GRAD_CLIP]
                    [--learning_rate LEARNING_RATE] [--decay_rate DECAY_RATE]
                    [--gpu_mem GPU_MEM] [--init_from INIT_FROM]
    train.py: error: unrecognized arguments: --init-from=./save/tubeadmin_seotexts

How can I run train.py to continue training? Please help) Training ran for two weeks, and I don't want to start from the beginning :)
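
Note that the usage message above lists the flag as --init_from with an underscore, while the failing commands spell it --init-from with a hyphen. Matching the argparse spelling should let training resume:

    python3.6 train.py --data_dir=./data/adm_seotexts --save_dir=./save/adm_seotexts --init_from=./save/adm_seotexts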

Refactor with state_is_tuple=True

Need to refactor with state_is_tuple=True for TF 0.11?

    cell = cell_fn(args.rnn_size, state_is_tuple=True)
    self.cell = cell = rnn_cell.MultiRNNCell([cell] * args.num_layers, state_is_tuple=True)

    feed = {model.input_data: x, model.targets: y}
    for i, (c, h) in enumerate(model.initial_state):
        feed[c] = state[i].c
        feed[h] = state[i].h

sherjilozair/char-rnn-tensorflow@991704e

Weighted Sampling

Hey,

I am unable to understand the basis of the weighted sampling used to pick the next word.

    def weighted_pick(weights):
        t = np.cumsum(weights)
        s = np.sum(weights)
        return int(np.searchsorted(t, np.random.rand(1) * s))

I also tried np.argmax(p), but it only predicts the same words all the time.
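
For what it's worth, weighted_pick is inverse-CDF sampling: t is the cumulative distribution, and searchsorted finds where a uniform draw lands in it, so each word is chosen with probability proportional to its weight. It is equivalent (up to normalization) to the sketch below; np.argmax, by contrast, collapses to the single most likely word at every step, which is why it repeats itself:

    import numpy as np

    def weighted_pick_equivalent(p):
        p = np.asarray(p, dtype=np.float64)
        return np.random.choice(len(p), p=p / p.sum())  # index drawn ~ p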

Retrain with new input files?

Hi
Isn't it possible to retrain the model with new data, to improve the output?
When I train the model with my own txt file and then retrain with a new txt file using the init_from arg, the first issue is that the new data and the loaded model will obviously disagree on the word set and dictionary mappings; but even if those assertions are removed, I still get:

2018-03-27 10:41:11.721945: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Traceback (most recent call last):
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1323, in _do_call
    return fn(*args)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
    status, run_metadata)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [256,725] rhs shape= [256,726]
	 [[Node: save/Assign_26 = Assign[T=DT_FLOAT, _class=["loc:@rnnlm/softmax_w"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](rnnlm/softmax_w/Adam_1, save/RestoreV2_26)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "retrain.py", line 128, in <module>
    main()
  File "retrain.py", line 56, in main
    train(args)
  File "retrain.py", line 96, in train
    saver.restore(sess, ckpt.model_checkpoint_path)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1666, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [256,725] rhs shape= [256,726]
	 [[Node: save/Assign_26 = Assign[T=DT_FLOAT, _class=["loc:@rnnlm/softmax_w"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](rnnlm/softmax_w/Adam_1, save/RestoreV2_26)]]

Caused by op 'save/Assign_26', defined at:
  File "retrain.py", line 128, in <module>
    main()
  File "retrain.py", line 56, in main
    train(args)
  File "retrain.py", line 93, in train
    saver = tf.train.Saver(tf.global_variables())
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1218, in __init__
    self.build()
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1227, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1263, in _build
    build_save=build_save, build_restore=build_restore)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 751, in _build_internal
    restore_sequentially, reshape)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 439, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 160, in restore
    self.op.get_shape().is_fully_defined())
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
    validate_shape=validate_shape)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/gen_state_ops.py", line 57, in assign
    use_locking=use_locking, name=name)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/Users/ronanquinn/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [256,725] rhs shape= [256,726]
	 [[Node: save/Assign_26 = Assign[T=DT_FLOAT, _class=["loc:@rnnlm/softmax_w"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](rnnlm/softmax_w/Adam_1, save/RestoreV2_26)]]

Incompatible shapes

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [256,725] rhs shape= [256,726]

How is the init_from arg supposed to be used if not for retraining with new data?
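
The shapes in the error are [rnn_size, vocab_size], so 725 vs 726 means the new corpus produced one extra vocabulary entry; init_from can only restore into a graph whose vocabulary matches the checkpoint exactly (same size and same word-to-id mapping). A quick sanity check before retraining (sketch, path per your save dir):

    from six.moves import cPickle

    with open('save/words_vocab.pkl', 'rb') as f:
        old_words, old_vocab = cPickle.load(f)
    # the new data's vocab must have the same length and the same mapping
    print(len(old_vocab))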

example not working

 python train.py
Traceback (most recent call last):
  File "train.py", line 11, in <module>
    from model import Model
  File "/home/user/word-rnn-tensorflow/model.py", line 3, in <module>
    from tensorflow.contrib import legacy_seq2seq
ImportError: cannot import name legacy_seq2seq

I have TensorFlow installed and otherwise working.

Training Error

Getting the following when running on OS X, Python 2.7.10. Any ideas?

    Traceback (most recent call last):
      File "train.py", line 11, in <module>
        from model import Model
      File "/word-rnn-tensorflow-master/model.py", line 2, in <module>
        from tensorflow.models.rnn import rnn_cell
      File "/tensorflow/lib/python2.7/site-packages/tensorflow/models/rnn/rnn_cell.py", line 21, in <module>
        raise ImportError("This module is deprecated. Use tf.nn.rnn_cell instead.")
    ImportError: This module is deprecated. Use tf.nn.rnn_cell instead.

How can I encode a Korean text input file?

When I use Korean text as the input.txt, an error like the following occurs:

UnicodeDecodeError: 'cp949' codec can't decode byte 0xff in position 0: illegal multibyte sequence

How can I make the computer learn Korean text?

Please help me.
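
train.py already accepts an --input_encoding option (it appears in the usage message quoted in the --init-from issue above), so telling it the file's real encoding should bypass the cp949 default on Korean Windows; the data path here is a placeholder:

    python train.py --data_dir=./data/korean --input_encoding utf-8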

Generating words from trained RNN model: “Variable already exists, disallowed. Did you mean to set reuse=True in VarScope? ”

So I am trying to implement this RNN word-generator model in a Jupyter notebook. When I was trying to use the trained model to generate some words:

with open(os.path.join(cfgs['save_dir'], 'config.pkl'), 'rb') as f:
   saved_args = cPickle.load(f)

with open(os.path.join(cfgs['save_dir'], 'words_vocab.pkl'), 'rb') as f:
   words, vocab = cPickle.load(f)

with tf.Session() as sess:
   model = Model(saved_args, True)
   tf.global_variables_initializer().run()
   saver = tf.train.Saver(tf.global_variables())
   ckpt = tf.train.get_checkpoint_state(cfgs['save_dir'])
   if ckpt and ckpt.model_checkpoint_path:
       saver.restore(sess, ckpt.model_checkpoint_path)
       print(model.sample(sess, words, vocab, cfgs['n'], cfgs['prime'], cfgs['sample'], cfgs['pick'], cfgs['width']))

It works the first time, but if I run the code again there is an error:

ValueError: Variable rnnlm/softmax_w already exists, disallowed. Did you mean to set reuse=True in VarScope? 

Right now I have to restart the notebook and rerun the code to get a new sample. Any idea how to change the code to avoid this situation?
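
In a notebook, the default graph persists across cell executions, so the second Model() tries to recreate rnnlm/softmax_w in the same graph. Resetting the graph before rebuilding is the simplest workaround (sketch):

    import tensorflow as tf

    tf.reset_default_graph()  # drop the variables created by the previous run
    model = Model(saved_args, True)
    # then create the session, restore the checkpoint, and sample as above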

How can I add accuracy during training and validation?

I would like to see whether the model improves during training or is overfitting. To this end, I would like to add accuracy as a metric besides loss, and a validation set to track validation loss and accuracy. I am new to TensorFlow; is there any easy way to do this? Thanks!
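
There is no accuracy op in model.py, but next-word accuracy can be derived from the existing logits and targets. A hedged sketch of lines that could be added in Model.__init__ and fetched alongside the cost (self.accuracy is my name, not the repo's):

    # inside Model.__init__, after self.logits is defined:
    predictions = tf.cast(tf.argmax(self.logits, 1), tf.int32)
    targets_flat = tf.reshape(self.targets, [-1])
    self.accuracy = tf.reduce_mean(
        tf.cast(tf.equal(predictions, targets_flat), tf.float32))

For validation, hold out some batches and run self.cost and self.accuracy on them without ever fetching train_op.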

Support for Unicode

Does the script support Unicode? I've been trying to set it up with Malayalam but haven't been able to.

Cannot train: 'tuple' object has no attribute 'eval'

Hi,

I tried this implementation of word-rnn and I have an issue with the train script:

    Traceback (most recent call last):
      File "train.py", line 111, in <module>
        main()
      File "train.py", line 48, in main
        train(args)
      File "train.py", line 93, in train
        state = model.initial_state.eval()
    AttributeError: 'tuple' object has no attribute 'eval'

I'm using the latest version of TensorFlow and Python 3.4.

I've already had this issue with one other TensorFlow project, so I guess there must be something wrong with my installation.

Thanks
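
This happens when the cells are built with state_is_tuple (the default in newer TensorFlow), making initial_state a tuple of tensors rather than a single tensor with an .eval() method. Evaluating it through the session works either way (sketch):

    # instead of: state = model.initial_state.eval()
    state = sess.run(model.initial_state)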

About the prepared files

Hi, we can see that you prepare 3 files in the data directory:
"input.txt" represents the training data,
"vocab.pkl" represents the vocabulary list,
but I don't understand what "data.npy" stands for.
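
If I read utils.py correctly, data.npy is the preprocessing cache: the corpus converted to a 1-D NumPy array of vocabulary indices, saved so later runs can skip re-tokenizing input.txt. You can inspect it directly (sketch):

    import numpy as np

    data = np.load('data/tinyshakespeare/data.npy')
    print(data.shape, data[:10])  # word ids; decode them via vocab.pkl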

Can I stop train.py and resume it later?

Pretty much what the title says. I've noticed that training occasionally saves the model, so I was wondering whether it's safe to stop it and resume later.
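
It should be: checkpoints land in the save directory every --save_every batches, and --init_from restarts from the newest one (the directory must also contain the config.pkl and words_vocab.pkl that train.py writes). Assuming the default save directory:

    python train.py --init_from=save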

Output format.

From your sample output, the format looks very good. Did the model format this for you, or did you manually separate the paragraphs?

LEONTES:
Why, my Irish time?
And argue in the lord; the man mad, must be deserved a spirit as drown the warlike Pray him, how seven in.

KING would be made that, methoughts I may married a Lord dishonour
Than thou that be mine kites and sinew for his honour
In reason prettily the sudden night upon all shalt bid him thus again. times than one from mine unaccustom'd sir.

LARTIUS:
O,'tis aediles, fight!
Farewell, it himself have saw.

Because mine was like this (without any structure):

LEONTES:Why, my Irish time?And argue in the lord; the man mad, must be deserved a spirit as drown the warlike Pray him, how seven in.KING would be made that, methoughts I may married a Lord dishonourThan thou that be mine kites and sinew for his honour

Multiple GPUs

Do you have any idea what would be the best way to leverage multiple GPUs?

rnn_decoder initial_state

Thanks so much for your code, hunkim! It is very helpful!

Can I ask a quick question, please? Am I right to think that, within one batch, every time you feed model.initial_state: state to the model, it overrides self.initial_state = cell.zero_state(args.batch_size, tf.float32)? (This zero_state initialization is also in the model's __init__(), so I am not sure whether it can be overridden.)

Thanks!
