oswaldoludwig / seq2seq-chatbot-for-keras

This repository contains a new generative model of chatbot based on seq2seq modeling.

License: Apache License 2.0

Python 100.00%

chatbot conversational-agents deep-learning dialogue dialogue-generation gan generative-adversarial-network glove keras nlp seq2seq


seq2seq-chatbot-for-keras's Issues

Can't figure out the format for the input data

This is a great example! I'm trying to train on my own set of conversations based on your code, but I can't find the format for movie_data.txt, which is referenced in split_qa.py (line 8).

Can you put a sample of that data or describe what my input data needs to look like to work with your infrastructure?

Thanks!

Weights loading fails

Hi! Trying to launch your example, but facing an error with loading weights:

Traceback (most recent call last):
  File "conversation.py", line 160, in <module>
    model.load_weights(weights_file)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2367, 
in load_weights  ' elements.')
Exception: Layer #3 (named "Encode answer up to the current token" in the current model) 
was found to correspond to layer lstm_1 in the save file. 
However the new layer Encode answer up to the current token expects 12 weights, 
but the saved weights have 3 elements.

The file seq2seq_model.h5 doesn't exist

When I tried to program a chatbot for the first time, it gave me this error:

Traceback (most recent call last):
File "D:\Boody\Boody\VSCode\BeChat Chatbot\BeChat.py", line 5, in
model = tf.keras.models.load_model('seq2seq_model.h5')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Boody\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\saving\saving_api.py", line 212, in load_model
return legacy_sm_saving_lib.load_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Boody\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\Boody\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\saving\legacy\save.py", line 230, in load_model
raise IOError(
OSError: No file or directory found at seq2seq_model.h5

Here is a screenshot for the code

vocabulary_file missing file

Hi there,
Thanks for the code. Could you please point me to where the vocabulary_file is? I couldn't find it at the link.
Thanks a lot

TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64

I am using Python 3.
The error is in this line:
--> 155 out = Dense(dictionary_size/2, activation="relu", name='relu-activation')(merge_layer)
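
A likely cause, given that the reporter is on Python 3: the `/` operator always performs true division there, so `dictionary_size/2` is a float, and the layer size ends up as a float inside the weight shape. The sketch below only demonstrates the difference between `/` and `//`; the value 7000 is an illustrative assumption, and the corrected `Dense` call is shown as a comment because it needs Keras and the rest of the script.

```python
# Illustrative value only; the real dictionary_size is defined in the script.
dictionary_size = 7000

float_units = dictionary_size / 2    # Python 3: true division, yields a float
int_units = dictionary_size // 2     # floor division, yields an int

print(type(float_units).__name__, type(int_units).__name__)

# Hypothetical corrected layer call (requires Keras, so left as a comment):
# out = Dense(dictionary_size // 2, activation="relu", name='relu-activation')(merge_layer)
```

Wrapping the expression in `int(...)` would work as well; floor division simply keeps the intent visible at the call site.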

@oswaldoludwig I've been adapting your code for my conversation file and I can't figure out the meaning of this line.
I got a variety of errors, but if I set l = np.where(sent==0) the code runs. I don't know whether that is correct. The same code is also at line 171.

Can you explain what the EOS is doing?

Thanks!

Any suggestion for large datasets?

Hi, I'm trying to train the chatbot with a large dataset, and I'm running out of RAM. In addition, my dataset is in Spanish, so my word vectors are of size 300. I've lowered the vocabulary size to 4000, but it still doesn't fit (I have 64 GB of RAM).

Thanks

KeyError: 'something'

Can you please explain why this error is occurring?
Traceback (most recent call last):
File "conversation.py", line 215, in
Q = tokenize(query)
File "conversation.py", line 117, in tokenize
X = np.asarray([word_to_index[w] for w in tokenized_sentences])
File "conversation.py", line 117, in
X = np.asarray([word_to_index[w] for w in tokenized_sentences])
KeyError: 'something'
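
The traceback shows a plain dictionary lookup, `word_to_index[w]`, failing for a word that is not in the vocabulary. One common workaround is to map out-of-vocabulary words to a dedicated unknown-word entry. The sketch below assumes the vocabulary contains such an entry; the token name `"UNK"`, the helper `tokens_to_ids`, and the toy vocabulary are all hypothetical and not taken from the repository.

```python
def tokens_to_ids(tokens, word_to_index, unk_token="UNK"):
    """Map tokens to integer ids, sending out-of-vocabulary words to unk_token."""
    # Fails early with a KeyError if the vocabulary has no unknown-word entry.
    unk_id = word_to_index[unk_token]
    return [word_to_index.get(w, unk_id) for w in tokens]


# Toy vocabulary for illustration; the real one is loaded from the vocabulary file.
word_to_index = {"UNK": 0, "hello": 1, "world": 2}
ids = tokens_to_ids(["hello", "something", "world"], word_to_index)
print(ids)  # [1, 0, 2]
```

The returned list can be passed to `np.asarray` exactly as in the original line.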

Would you please add the attention or pointer mechanism based on your current model?

Thanks for your repository, which has given me a lot of inspiration. To the best of my knowledge, attention and pointer mechanisms are popular in sequence-to-sequence tasks such as chatbots. I have read about the attention mechanisms of Luong et al. 2015 and Bahdanau et al. 2015, and about the pointer networks used in some summarization tasks, but I find the formulas confusing. Would you please add some attention or pointer mechanism examples based on your current model?

limit = l[0][0] - IndexError: index 0 is out of bounds for axis 0 with size 0

What can be the root cause for this?

for m in range(Epochs):

    # Loop over training batches due to memory constraints:
    for n in range(0,round_exem,step):

        q2 = q[n:n+step]
        print(q2)
        s = q2.shape
        print(s)
        count = 0
        for i, sent in enumerate(a[n:n+step]):
            print("Sentence")
            print(sent)
            l = np.where(sent==3)  #  the position of the symbol EOS
            limit = l[0][0]
            count += limit + 1

  File "train_bot.py", line 188, in <module>
    limit = l[0][0]
IndexError: index 0 is out of bounds for axis 0 with size 0

I don't see any 3 for some reason

Sentence
[   1   31    5  640    8 2475    9    8  339    4    2    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0]
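
The printed sentence contains no 3 at all, so `np.where(sent==3)` returns an empty index array and `l[0][0]` raises the IndexError. The sentence does end with a 2 just before the zero padding, which may mean the EOS symbol has a different id in this vocabulary; that is an inference from the printout, not something the issue confirms. A minimal sketch of a guarded lookup, using a hypothetical helper `eos_limit`:

```python
def eos_limit(sent, eos_id):
    """Return the index of the first eos_id in sent, or None if it is absent."""
    ids = list(sent)  # accepts a Python list or a 1-D NumPy array
    try:
        return ids.index(eos_id)
    except ValueError:
        return None


# Prefix of the sentence printed in the issue.
sent = [1, 31, 5, 640, 8, 2475, 9, 8, 339, 4, 2, 0, 0, 0]
print(eos_limit(sent, 3))  # None: id 3 never occurs
print(eos_limit(sent, 2))  # 10
```

With the original NumPy code, the equivalent guard is to check `l[0].size` before indexing `l[0][0]`.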

Any documentation on how to train it on our own data for the new GAN based algorithm?

training custom data

Could you please explain how I can supply my own custom data with a different conversation flow?
How can each conversation be represented in a single text file?

Error while running conversation.py

Traceback (most recent call last):
File "D:\Seq2seq-Chatbot-for-Keras-master\conversation.py", line 151, in
out = Dense(dictionary_size/2, activation="relu", name='relu-activation')(merge_layer)
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\base_layer.py", line 431, in call
self.build(unpack_singleton(input_shapes))
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\layers\core.py", line 866, in build
constraint=self.kernel_constraint)
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\base_layer.py", line 249, in add_weight
weight = K.variable(initializer(shape),
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\initializers.py", line 218, in call
dtype=dtype, seed=self.seed)
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\tensorflow_backend.py", line 4139, in random_uniform
dtype=dtype, seed=seed)
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\random_ops.py", line 244, in random_uniform
shape, dtype, seed=seed1, seed2=seed2)
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\ops\gen_random_ops.py", line 473, in _random_uniform
name=name)
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 609, in _apply_op_helper
param_name=input_name)
File "C:\Users\sheen\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 60, in _SatisfiesTypeConstraint
", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64

Where did you propose a persona embedding?

Hello, the paper says "This work proposes a persona embedding that permits the incorporation of background facts for user profiles, person-specific language behavior, and interaction style." So I'm curious: have you actually proposed a persona embedding, and if so, where?

enhancement - investigate supplying conda environment.yml / requirements.txt with exported python 2.7.6 dependencies

see here for sample environment
https://github.com/johndpope/dev_env/tree/master/conda

this is a sample conda file / rather like requirements.txt / it also encapsulates pip dependencies
https://github.com/johndpope/dev_env/blob/master/conda/tensorflow-gpu.yml

with this file / environment on any install linux / windows etc

python 2 will be forced along with all (legacy) dependencies.

N.B. - to upgrade codebase to python3 is not too bad - just use
2to3 *.py -w

Error while running conversation.py

Hello.

I am attempting to chat with the pre-trained model. I have downloaded the files from the dropbox, then I go to the directory for this cloned git and then I run conversation.py

I continue to get this error:

File "/Users/harrislevine/Downloads/Seq2seq-Chatbot-for-Keras-master-2/conversation.py", line 96
if raw_word[-1] <> '!' and raw_word[-1] <> '?' and raw_word[-1] <> '.' and raw_word[-2:] <> '! ' and raw_word[-2:] <> '? ' and raw_word[-2:] <> '. ':
^
SyntaxError: invalid syntax
Harriss-MacBook-Pro:~ harrislevine$ cd /Users/harrislevine/Downloads/Seq2seq-Chatbot-for-Keras-master-2
Harriss-MacBook-Pro:Seq2seq-Chatbot-for-Keras-master-2 harrislevine$ python /Users/harrislevine/Downloads/Seq2seq-Chatbot-for-Keras-master-2/conversation.py
File "/Users/harrislevine/Downloads/Seq2seq-Chatbot-for-Keras-master-2/conversation.py", line 96
if raw_word[-1] <> '!' and raw_word[-1] <> '?' and raw_word[-1] <> '.' and raw_word[-2:] <> '! ' and raw_word[-2:] <> '? ' and raw_word[-2:] <> '. ':
^
SyntaxError: invalid syntax
Harriss-MacBook-Pro:Seq2seq-Chatbot-for-Keras-master-2 harrislevine$

If anyone can point out where I am going wrong it would be much appreciated...

SyntaxError in conversation.py

Which Python version does this project require? I get a syntax error in conversation.py, as follows.

File "conversation.py", line 95
if raw_word[-1] <> '!' and raw_word[-1] <> '?' and raw_word[-1] <> '.' and raw_word[-2:] <> '! ' and raw_word[-2:] <> '? ' and raw_word[-2:] <> '. ':
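
The `<>` inequality operator exists only in Python 2, so Python 3 rejects this line with a SyntaxError. The most direct fix is to replace every `<>` with `!=`. The sketch below shows an equivalent and shorter form of the same test using `str.endswith` with a tuple; the function name `lacks_final_punctuation` is hypothetical.

```python
def lacks_final_punctuation(raw_word):
    """True when raw_word does not end in '!', '?' or '.', optionally followed by a space."""
    endings = ('!', '?', '.', '! ', '? ', '. ')
    return not raw_word.endswith(endings)


print(lacks_final_punctuation("hello"))    # True
print(lacks_final_punctuation("hello."))   # False
print(lacks_final_punctuation("hello? "))  # False
```

One difference from the original condition: an empty string returns True here, whereas `raw_word[-1]` would raise an IndexError.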

Python 3.7.x Theano AssertionError when starting the model in conversation.py

Starting the model...
Traceback (most recent call last):
  File "c:\...\conversation.py", line 156, in <module>
    out = Dense(dictionary_size/2, activation="relu", name='relu activation')(merge_layer)
  File "c:\...\venv\lib\site-packages\keras\engine\base_layer.py", line 463, in __call__
    self.build(unpack_singleton(input_shapes))
  File "c:\...\venv\lib\site-packages\keras\layers\core.py", line 895, in build
    constraint=self.kernel_constraint)
  File "c:\...\venv\lib\site-packages\keras\engine\base_layer.py", line 279, in add_weight
    weight = K.variable(initializer(shape, dtype=dtype),
  File "c:\...\venv\lib\site-packages\keras\initializers.py", line 227, in __call__
    dtype=dtype, seed=self.seed)
  File "c:\...\venv\lib\site-packages\keras\backend\theano_backend.py", line 2706, in random_uniform
    return rng.uniform(shape, low=minval, high=maxval, dtype=dtype)
  File "c:\...\venv\lib\site-packages\theano\sandbox\rng_mrg.py", line 857, in uniform
    for i in size]), msg
AssertionError: size must be a tuple of int or a Theano variable

Is that talking about the dictionary size not being a Theano variable? I'm just trying to run conversation.py and I'm getting this error (among others I think I've fixed)

Any idea how to fix this?

File vocabulary_movie

Hello, how can I generate a new vocabulary_movie file for my own data?

Not a big deal, but Ubuntu does not like the spaces in the names

Not a big deal, but Ubuntu 16.04 Python 2.7 0 does not like the spaces in the names.
Just replace the spaces with a - and all works well.
Example:
name='the-context-text'
name='the-answer-text-up-to-the-current-token'
Do all the names that way.
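
Rather than editing each name by hand, the same substitution can be applied with a small helper. This is a sketch of the workaround described above, not code from the repository; `safe_layer_name` is a hypothetical name, and it only handles whitespace, not every character a backend might reject.

```python
import re


def safe_layer_name(name):
    """Replace each run of whitespace in a layer name with a single hyphen."""
    return re.sub(r"\s+", "-", name.strip())


print(safe_layer_name("the context text"))
# the-context-text
print(safe_layer_name("the answer text up to the current token"))
# the-answer-text-up-to-the-current-token
```

If pre-trained weights are loaded by layer name, the names in the model definition still have to match whatever the saved file expects.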

Question: Does the bot auto-learn from conversation or must it be retrained with each session ?

How much data

Hi, do you know how much data is needed to train, from scratch, a model that "talks" with some sense?

thanks

Getting AttributeError: int cannot find replace

@oswaldoludwig,

I'm trying to execute split_qa.py. When I run the file, it throws the above error.
I can't tell where raw_word is being converted into an int. I have tried many combinations, but none of them work.
Could you please help me out with this?

Please find the attached.

Which version of Tensorflow does this repo require?

Thank you!

Weights file updated

I had mistakenly uploaded an outdated network weights file to Dropbox (the chatbot doesn't work properly with that file). I have now uploaded the updated file. Please replace the old version with the new file, which can be found here: https://www.dropbox.com/sh/o0rze9dulwmon8b/AAA6g6QoKM8hBEHGst6W4JGDa?dl=0

ValueError: 'the context text' is not a valid scope name, running conversation_discriminator.py

Error when running conversation_discriminator.py
Using Tensorflow 1.4.1, keras 2.0.4

$ python conversation_discriminator.py
Using TensorFlow backend.
Starting the model...
Traceback (most recent call last):
File "conversation_discriminator.py", line 223, in <module>
input_context = Input(shape=(maxlen_input,), dtype='int32', name='the context text')
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/keras/engine/topology.py", line 1414, in Input
input_tensor=tensor)
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 88, in wrapper
return func(*args, **kwargs)
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/keras/engine/topology.py", line 1325, in __init__
name=self.name)
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 391, in placeholder
x = tf.placeholder(dtype, shape=shape, name=name)
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1599, in placeholder
return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3091, in _placeholder
"Placeholder", dtype=dtype, shape=shape, name=name)
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 394, in _apply_op_helper
with g.as_default(), ops.name_scope(name) as scope:
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 4932, in __enter__
return self._name_scope.__enter__()
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
File "/home/javier/repos/dialogue/Seq2seq-Chatbot-for-Keras/venv/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3519, in name_scope
raise ValueError("'%s' is not a valid scope name" % name)
ValueError: 'the context text' is not a valid scope name

creating vocabulary file for different context

@oswaldoludwig, can you share how you created your vocabulary file? I see it is a pickled object but my use case is quite different and I wondered how it was created.

Thanks!
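
The exact layout of the repository's vocabulary file is not described in this thread, so the following is only a sketch under stated assumptions: the vocabulary is taken to be a list of `(word, count)` pairs for the most frequent tokens, tokenization is a plain lowercase whitespace split, and the file name `my_vocabulary.pkl` and the helper `build_vocabulary` are hypothetical. Inspecting the pickled object shipped with the pre-trained model would show the real layout to imitate.

```python
import os
import pickle
import tempfile
from collections import Counter


def build_vocabulary(sentences, max_size):
    """Return the max_size most frequent tokens as (word, count) pairs."""
    counts = Counter(tok for line in sentences for tok in line.lower().split())
    return counts.most_common(max_size)


# Toy corpus; replace with the lines of your own conversation file.
sentences = ["hello there", "hello world"]
vocab = build_vocabulary(sentences, max_size=2)
word_to_index = {word: i for i, (word, _count) in enumerate(vocab)}

# Assumed output location; pickle needs a file opened in binary mode.
out_path = os.path.join(tempfile.mkdtemp(), "my_vocabulary.pkl")
with open(out_path, "wb") as f:
    pickle.dump(vocab, f)
```

Special symbols such as BOS, EOS, and an unknown-word token would also need entries if the training scripts expect them.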

Parameters of the model ?

Hi, I just had a question regarding the parameters of the model: did you set them empirically, or did you use an optimization technique to choose them?

Thanks !!

TypeError: 'module' object is not callable when executing train_bot.py

@oswaldoludwig
Thanks for this repository. It's really very useful and is helping me a lot!

When I run this pre-trained model on my dataset, I get the error below in train_bot.py.

Using Theano backend.
WARNING (theano.configdefaults): install mkl with conda install mkl-service: No module named 'mkl'
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Found 400000 word vectors.

Warning (from warnings module):
File "C:\Users\Anoop\Documents\Seq2seq-Chatbot-for-Keras-master\train_bot.py", line 101
LSTM_encoder = LSTM(sentence_embedding_size, init= 'lecun_uniform')
UserWarning: Update your LSTM call to the Keras 2 API: LSTM(300, kernel_initializer="lecun_uniform")

Warning (from warnings module):
File "C:\Users\Anoop\Documents\Seq2seq-Chatbot-for-Keras-master\train_bot.py", line 102
LSTM_decoder = LSTM(sentence_embedding_size, init= 'lecun_uniform')
UserWarning: Update your LSTM call to the Keras 2 API: LSTM(300, kernel_initializer="lecun_uniform")
Traceback (most recent call last):
File "C:\Users\Anoop\Documents\Seq2seq-Chatbot-for-Keras-master\train_bot.py", line 113, in
merge_layer = merge([context_embedding, answer_embedding], mode='concat', concat_axis=1)
TypeError: 'module' object is not callable

I have tried to investigate what is going wrong, but I could not find the cause.
Please help me out.

Thanks in Advance!!

"w" in get_train_data.py

Hi, in lines 88 & 91 of get_train_data.py, it should be "wb" not "w" when writing to Pickle as it will otherwise cause write error.
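
The reason is that `pickle.dump` writes bytes, and in Python 3 a file opened with `"w"` accepts only text. A minimal round-trip sketch, using a stand-in dictionary and a hypothetical file name rather than the script's real training arrays:

```python
import os
import pickle
import tempfile

data = {"hello": 1, "world": 2}  # stand-in for the real training data

path = os.path.join(tempfile.mkdtemp(), "example.pkl")  # hypothetical file name

with open(path, "wb") as f:  # binary mode: pickle writes bytes
    pickle.dump(data, f)

with open(path, "rb") as f:  # read back in binary mode as well
    restored = pickle.load(f)

print(restored == data)  # True
```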

How can a new vocabulary file be generated?

Can you please explain how you generated your vocabulary file?

NameError: name 'embedding_matrix' is not defined

In conversation.py:
Shared_Embedding = Embedding(output_dim=word_embedding_size, input_dim=dictionary_size, weights=[embedding_matrix], input_length=maxlen_input, name='Shared')
However, this line produces the error above. Should we re-import the embeddings?

Valid Scope name

When trying to run conversation.py this error pops up

File "/Users/Mohannad/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2908, in name_scope
raise ValueError("'%s' is not a valid scope name" % name)
ValueError: 'the context text' is not a valid scope name

Any idea what this is or how to solve it?
Thanks

TYPE Error

Hi! When trying to run conversation.py, this error pops up

TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64

So I checked the values of the shape parameter and found that all the values passed were int32, not float32. I then tried to modify the list of allowed values by appending float32 to it in op_def_library.py:

def _SatisfiesTypeConstraint(dtype, attr_def, param_name):
    if attr_def.HasField("allowed_values"):
        allowed_list = attr_def.allowed_values.list.type
        allowed_list.append(DT_FLOAT32)
        if dtype not in allowed_list:
            raise TypeError(
                "Value passed to parameter '%s' has DataType %s not in list of "
                "allowed values: %s" %
                (param_name, dtypes.as_dtype(dtype).name,
                 ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))

After the modification, when I try to run conversation.py,
this error pops up:

File "/Users/Mohannad/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 56, in _SatisfiesTypeConstraint
allowed_list.append(DT_FLOAT32)
NameError: name 'DT_FLOAT32' is not defined

So, any idea where a float32 shape value is being passed instead of int32, or how I can solve this problem?

Finally, thanks a lot!
