
CakeChat: Emotional Generative Dialog System

License: Apache License 2.0


cakechat's Introduction

Note: this project is unmaintained.

Transformer-based dialog models work better, and we recommend using them instead of the RNN-based CakeChat. See, for example, https://github.com/microsoft/DialoGPT


CakeChat: Emotional Generative Dialog System

CakeChat is a backend for chatbots that are able to express emotions in conversation.

CakeChat representation

CakeChat is built on Keras and Tensorflow.

The code is flexible and allows you to condition the model's responses on an arbitrary categorical variable. For example, you can train your own persona-based neural conversational model[1] or create an emotional chatting machine[2].

Main requirements

  • python 3.5.2
  • tensorflow 1.12.2
  • keras 2.2.4

Table of contents

  1. Network architecture and features
  2. Quick start
  3. Setup for training and testing
    1. Docker
      1. CPU-only setup
      2. GPU-enabled setup
    2. Manual setup
  4. Getting the pre-trained model
  5. Training data
  6. Training the model
    1. Fine-tuning the pre-trained model on your data
    2. Training the model from scratch
    3. Distributed training
    4. Validation metrics calculation
    5. Testing the trained model
  7. Running CakeChat server
    1. Local HTTP-server
      1. HTTP-server API description
    2. Gunicorn HTTP-server
    3. Telegram bot
  8. Repository overview
    1. Important tools
    2. Important configuration settings
  9. Example use cases
  10. References
  11. Credits & Support
  12. License

Network architecture and features

Network architecture

Model:

  • Hierarchical Recurrent Encoder-Decoder (HRED) architecture for handling deep dialog context[3].
  • Multilayer RNN with GRU cells. The first layer of the utterance-level encoder is always bidirectional. By default, the CuDNNGRU implementation is used for ~25% faster inference.
  • The thought vector is fed into the decoder at each decoding step.
  • The decoder can be conditioned on any categorical label, for example, an emotion label or a persona id (see the sketch below).
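A minimal sketch of the utterance-level encoder described above (an illustration, not CakeChat's actual code; the dimensions follow the pre-trained model described later, and CuDNNGRU requires a CUDA-enabled GPU at runtime):

from keras.layers import Input, Embedding, Bidirectional, CuDNNGRU
from keras.models import Model

tokens = Input(shape=(30,), dtype='int32')                    # up to 30 tokens per utterance
emb = Embedding(input_dim=50000, output_dim=128)(tokens)      # token embedding layer
h = Bidirectional(CuDNNGRU(768, return_sequences=True))(emb)  # bidirectional first layer
utterance_vector = CuDNNGRU(768)(h)                           # second GRU layer yields the utterance encoding
encoder = Model(tokens, utterance_vector)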

Word embedding layer:

  • May be initialized using w2v model trained on your corpus.
  • Embedding layer may be either fixed or fine-tuned along with other weights of the network.

Decoding:

  • 4 different response generation algorithms: "sampling", "beamsearch", "sampling-reranking" and "beamsearch-reranking". Reranking of the generated candidates is performed according to the log-likelihood or the MMI criterion[4]. See the configuration settings description for details.

Metrics:

  • Perplexity
  • n-gram distinct metrics adjusted to the sample size[4].
  • Lexical similarity between samples of the model and some fixed dataset. Lexical similarity is the cosine distance between the TF-IDF vector of the responses generated by the model and that of the dataset (see the sketch after this list).
  • Ranking metrics: mean average precision and mean recall@k[5].
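Below is a minimal sketch of the lexical similarity metric's core idea, assuming scikit-learn is available (an illustration only, not CakeChat's exact implementation):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

model_samples = ['i am fine thanks', 'see you tomorrow']        # responses generated by the model
fixed_dataset = ['i am good thanks', 'are you going tomorrow']  # some fixed dataset

vectorizer = TfidfVectorizer().fit(model_samples + fixed_dataset)
v_samples = vectorizer.transform([' '.join(model_samples)])     # one TF-IDF vector per corpus
v_dataset = vectorizer.transform([' '.join(fixed_dataset)])
print('lexical similarity:', cosine_similarity(v_samples, v_dataset)[0, 0])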

Quick start

If you are familiar with Docker, here is the easiest way to run a pre-trained CakeChat model as a server. You may need to run the following commands with sudo.

CPU version:

docker pull lukalabs/cakechat:latest && \
docker run --name cakechat-server -p 127.0.0.1:8080:8080 -it lukalabs/cakechat:latest bash -c "python bin/cakechat_server.py"

GPU version:

docker pull lukalabs/cakechat-gpu:latest && \
nvidia-docker run --name cakechat-gpu-server -p 127.0.0.1:8080:8080 -it lukalabs/cakechat-gpu:latest bash -c "CUDA_VISIBLE_DEVICES=0 python bin/cakechat_server.py"

That's it! Now test your CakeChat server by running the following command on your host machine:

python tools/test_api.py -f localhost -p 8080 -c "hi!" -c "hi, how are you?" -c "good!" -e "joy"

The response dict may look like this:

{'response': "I'm fine!"}

Setup for training and testing

Docker

Docker is the easiest way to set up the environment and install all the dependencies for training and testing.

CPU-only setup

Note: we strongly recommend using a GPU-enabled environment for training the CakeChat model. Inference can be run on both GPUs and CPUs.

  1. Install Docker.

  2. Pull a CPU-only docker image from dockerhub:

docker pull lukalabs/cakechat:latest

  3. Run a docker container in the CPU-only environment:

docker run --name <YOUR_CONTAINER_NAME> -it lukalabs/cakechat:latest

GPU-enabled setup

  1. Install nvidia-docker for the GPU support.

  2. Pull GPU-enabled docker image from dockerhub:

docker pull lukalabs/cakechat-gpu:latest

  3. Run a docker container in the GPU-enabled environment:

nvidia-docker run --name <YOUR_CONTAINER_NAME> -it lukalabs/cakechat-gpu:latest

That's it! Now you can train your model and chat with it. See the corresponding section below for further instructions.

Manual setup

If you don't want to deal with docker, you can install all the requirements manually:

pip install -r requirements.txt -r requirements-local.txt

NB:

We recommend installing the requirements inside a virtualenv to avoid interfering with your system packages, for example as shown below.
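For example, assuming Python 3's built-in venv module (the environment name is arbitrary):

python3 -m venv cakechat-env
source cakechat-env/bin/activate
pip install -r requirements.txt -r requirements-local.txt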

Getting the pre-trained model

You can download our pre-trained model weights by running python tools/fetch.py.

The params of the pre-trained model are the following:

  • context size 3 (<speaker_1_utterance>, <speaker_2_utterance>, <speaker_1_utterance>)
  • each encoded utterance contains up to 30 tokens
  • the decoded utterance contains up to 32 tokens
  • both encoder and decoder have 2 GRU layers with 768 hidden units each
  • first layer of the encoder is bidirectional

Training data

The model was trained on a preprocessed Twitter corpus with ~50 million dialogs (11Gb of text data). To clean up the corpus, we removed

  • URLs, retweets and citations;
  • mentions and hashtags that are not preceded by regular words or punctuation marks;
  • messages that contain more than 30 tokens.

We used our emotions classifier to label each utterance with one of the following 5 emotions: "neutral", "joy", "anger", "sadness", "fear", and used these labels during training. To label your own corpus with emotions, you can use, for example, the DeepMoji tool.

Unfortunately, due to Twitter's privacy policy, we are not allowed to provide our dataset. You can train a dialog model on any conversational text dataset available to you; a great overview of existing conversational datasets can be found here: https://breakend.github.io/DialogDatasets/

The training data should be a txt file where each line is a valid JSON object representing a list of dialog utterances (see the sketch below). Refer to our dummy train dataset to see the necessary file structure. Replace this dummy corpus with your data before training.
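For illustration, here is a minimal sketch that writes a two-dialog training file in the expected format, one JSON-encoded dialog (a list of utterances) per line; the "text"/"condition" field names follow the dummy dataset, and the condition values must match the emotions configured in cakechat/config.py:

import json

dialogs = [
    [{'text': 'Hi there!', 'condition': 'joy'},
     {'text': 'Hey. Long time no see!', 'condition': 'neutral'}],
    [{'text': 'How are you?', 'condition': 'neutral'},
     {'text': "I'm fine, thanks!", 'condition': 'joy'}],
]

# Assumes the repository's data directory layout already exists.
with open('data/corpora_processed/train_processed_dialogs.txt', 'w') as f:
    for dialog in dialogs:
        f.write(json.dumps(dialog) + '\n')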

Training the model

There are two options:

  1. training from scratch
  2. fine-tuning the provided trained model

The first approach is less restrictive: you can use any training data you want and set any config params of the model. However, you should be aware that you'll need enough training data (at least ~50Mb), one or more GPUs, and enough patience (days) before the model produces good responses.

The second approach is limited by the choice of config params of the pre-trained model – see cakechat/config.py for the complete list. If the default params are suitable for your task, fine-tuning should be a good option.

Fine-tuning the pre-trained model on your data

  1. Fetch the pre-trained model from Amazon S3 by running python tools/fetch.py.

  2. Put your training text corpus into data/corpora_processed/train_processed_dialogs.txt. Make sure that your dataset is large enough; otherwise your model risks overfitting the data and the results will be poor.

  3. Run python tools/train.py.

    1. The script will look for the pre-trained model weights in results/nn_models, the full path is inferred from the set of config params.
    2. If you want to initialize the model weights from a custom file, you can specify the path to the file via -i argument, for example, python tools/train.py -i results/nn_models/my_saved_weights/model.current.
    3. Don't forget to set the CUDA_VISIBLE_DEVICES=<GPU_ID> environment variable (with <GPU_ID> as in the output of the nvidia-smi command) if you want to use a GPU. For example, CUDA_VISIBLE_DEVICES=0 python tools/train.py will run the training process on GPU 0.
    4. Use the -s parameter to train the model on a subset of the first N samples of your training data, to speed up preprocessing for debugging. For example, run python tools/train.py -s 1000 to train on the first 1000 samples.

Weights of the trained model are saved to results/nn_models/.

Training the model from scratch

  1. Put your training text corpus to data/corpora_processed/train_processed_dialogs.txt.

  2. Set up training parameters in cakechat/config.py. See configuration settings description for more details.

  3. Consider running PYTHONHASHSEED=42 python tools/prepare_index_files.py to build the index files with tokens and conditions from the training corpus. Make sure to set the PYTHONHASHSEED environment variable, otherwise you may get different index files for different launches of the script. Warning: this script overwrites the original token index files data/tokens_index/t_idx_processed_dialogs.json and data/conditions_index/c_idx_processed_dialogs.json. Only run this script if your corpus is large enough to contain all the words that you want your model to understand; otherwise, consider fine-tuning the pre-trained model as described above. If you mess up the index files and want to restore the default versions, delete your copies and run python tools/fetch.py anew.

  4. Consider running python tools/train_w2v.py to build w2v embeddings from the training corpus. Warning: this script overwrites the original w2v weights stored in data/w2v_models. Only run this script if your corpus is large enough to contain all the words that you want your model to understand; otherwise, consider fine-tuning the pre-trained model as described above. If you mess up the w2v files and want to restore the default version, delete your copy and run python tools/fetch.py anew.

  5. Run python tools/train.py.

    1. Don't forget to set the CUDA_VISIBLE_DEVICES=<GPU_ID> environment variable (with <GPU_ID> as in the output of the nvidia-smi command) if you want to use a GPU. For example, CUDA_VISIBLE_DEVICES=0 python tools/train.py will run the training process on GPU 0.
    2. Use the -s parameter to train the model on a subset of the first N samples of your training data, to speed up preprocessing for debugging. For example, run python tools/train.py -s 1000 to train on the first 1000 samples.
  6. You can also set IS_DEV=1 to enable "development mode". It uses a reduced number of model parameters (decreased hidden layer dimensions, input and output sizes of token sequences, etc.) and performs verbose logging. Refer to the bottom lines of cakechat/config.py for the complete list of dev params; an example invocation follows.
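For example, the following combines the flags above for a quick development-mode run on the first 1000 samples:

IS_DEV=1 python tools/train.py -s 1000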

Weights of the trained model are saved to results/nn_models/.

Distributed training

The GPU-enabled docker container supports distributed training on multiple GPUs using horovod.

For example, run python tools/distributed_train.py -g 0 1 to start training on GPUs 0 and 1.
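For reference, here is a minimal sketch of the horovod training pattern in Keras (an illustration on a toy model, not CakeChat's actual tools/distributed_train.py), typically launched with horovodrun -np 2 python <script>.py:

import horovod.keras as hvd
import keras
import numpy as np
import tensorflow as tf

hvd.init()                                                       # one training process per GPU
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())   # pin each process to its GPU
keras.backend.set_session(tf.Session(config=config))

# Toy model standing in for the dialog model; the wiring is what matters.
model = keras.models.Sequential([keras.layers.Dense(8, input_shape=(4,)), keras.layers.Dense(1)])
optimizer = hvd.DistributedOptimizer(keras.optimizers.Adadelta(lr=1.0))  # averages gradients across workers
model.compile(loss='mse', optimizer=optimizer)

x, y = np.random.rand(64, 4), np.random.rand(64, 1)
model.fit(x, y, batch_size=16,
          callbacks=[hvd.callbacks.BroadcastGlobalVariablesCallback(0)],  # sync initial weights from rank 0
          verbose=1 if hvd.rank() == 0 else 0)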

Validation metrics calculation

During training, validation metrics are calculated on the context-free validation set (data/quality/context_free_validation_set.txt) and on the validation part of the corpus (data/corpora_processed/val_processed_dialogs.txt).

The metrics are stored to cakechat/results/tensorboard and can be visualized using TensorBoard. If you run a docker container from the provided CPU or GPU-enabled docker image, a TensorBoard server starts automatically and serves on http://localhost:6006. Open this link in your browser to see the training graphs.

If you installed the requirements manually, first start the TensorBoard server by running the following command from your cakechat root directory:

mkdir -p results/tensorboard && tensorboard --logdir=results/tensorboard 2>results/tensorboard/err.log &

After that proceed to http://localhost:6006.

Testing the trained model

You can evaluate your trained model on test data (a dummy example is provided; replace it with your own data) using the evaluation scripts in tools/.

Running CakeChat server

Local HTTP-server

Run a server that processes HTTP-requests with given input messages and returns response messages from the model:

python bin/cakechat_server.py

Specify CUDA_VISIBLE_DEVICES=<GPU_ID> environment variable to run the server on a certain GPU.

Don't forget to run python tools/fetch.py prior to starting the server if you want to use our pre-trained model.

To make sure everything works fine, test the model on the following conversation

– Hi, Eddie, what's up?
– Not much, what about you?
– Fine, thanks. Are you going to the movies tomorrow?

by running the command:

python tools/test_api.py -f 127.0.0.1 -p 8080 \
    -c "Hi, Eddie, what's up?" \
    -c "Not much, what about you?" \
    -c "Fine, thanks. Are you going to the movies tomorrow?"

You should get a meaningful answer, for example:

{'response': "Of course!"}

HTTP-server API description

/cakechat_api/v1/actions/get_response

JSON parameters are:

Parameter | Type | Description
--------- | ---- | -----------
context | list of strings | List of previous messages from the dialogue history (max. 3 are used)
emotion | string, one of enum | One of {'neutral', 'anger', 'joy', 'fear', 'sadness'}. An emotion to condition the response on. Optional param; if not specified, 'neutral' is used
Request:
POST /cakechat_api/v1/actions/get_response
data: {
 'context': ['Hello', 'Hi!', 'How are you?'],
 'emotion': 'joy'
}
Response:
200 OK
{
  'response': 'I\'m fine!'
}
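A minimal client-side sketch of the same request, assuming the requests package and a server running locally on port 8080:

import requests

r = requests.post(
    'http://127.0.0.1:8080/cakechat_api/v1/actions/get_response',
    json={'context': ['Hello', 'Hi!', 'How are you?'], 'emotion': 'joy'})
print(r.status_code, r.json())  # e.g. 200 {'response': "I'm fine!"}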

Gunicorn HTTP-server

We recommend using Gunicorn for serving the API of your model at production scale.

  1. Install gunicorn: pip install gunicorn

  2. Run a server that processes HTTP-queries with input messages and returns response messages of the model:

cd bin && gunicorn cakechat_server:app -w 1 -b 127.0.0.1:8080 --timeout 2000

Telegram bot

You can run your CakeChat model as a Telegram bot:

  1. Create a Telegram bot to get the bot's token.
  2. Run python tools/telegram_bot.py --token <YOUR_BOT_TOKEN> and chat with it on Telegram.
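For illustration, here is a minimal bot sketch that forwards each incoming message to a locally running CakeChat server. This is not the repository's tools/telegram_bot.py; it assumes the python-telegram-bot package (v12 API) and the requests package:

import requests
from telegram.ext import Updater, MessageHandler, Filters

def reply(update, context):
    # Send the user's message as a one-utterance context and relay the answer.
    resp = requests.post(
        'http://127.0.0.1:8080/cakechat_api/v1/actions/get_response',
        json={'context': [update.message.text], 'emotion': 'neutral'})
    update.message.reply_text(resp.json()['response'])

updater = Updater('<YOUR_BOT_TOKEN>', use_context=True)
updater.dispatcher.add_handler(MessageHandler(Filters.text, reply))
updater.start_polling()
updater.idle()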

Repository overview

  • cakechat/dialog_model/ – contains the computational graph, the training procedure and other model utilities
  • cakechat/dialog_model/inference/ – algorithms for response generation
  • cakechat/dialog_model/quality/ – code for metrics calculation and logging
  • cakechat/utils/ – utilities for text processing, w2v training, etc.
  • cakechat/api/ – functions to run the HTTP server: API configuration, error handling
  • tools/ – scripts for training, testing and evaluating your model

Important tools

Important configuration settings

All the configuration parameters for the network architecture, training, predicting and logging steps are defined in cakechat/config.py. Some inference parameters used by the HTTP server are defined in cakechat/api/config.py.

  • Network architecture and size

    • HIDDEN_LAYER_DIMENSION is the main parameter that defines the number of hidden units in the recurrent layers.
    • WORD_EMBEDDING_DIMENSION and CONDITION_EMBEDDING_DIMENSION define the number of hidden units that each token/condition is mapped to.
    • The number of units in the output layer of the decoder is defined by the number of tokens in the dictionary in the tokens_index directory.
  • Decoding algorithm:

    • PREDICTION_MODE_FOR_TESTS defines how the responses of the model are generated. The options are the following:
      • sampling – the response is sampled from the output distribution token-by-token. For every token, a temperature transform is applied prior to sampling (see the sketch after this section). You can control the temperature value by tuning the DEFAULT_TEMPERATURE parameter.
      • sampling-reranking – multiple candidate responses are generated using the sampling procedure described above. After that, the candidates are ranked according to their MMI-score[4]. You can tune this mode by picking the SAMPLES_NUM_FOR_RERANKING and MMI_REVERSE_MODEL_SCORE_WEIGHT parameters.
      • beamsearch – candidates are generated using the beam search algorithm. The candidates are ordered according to their log-likelihood score computed by the beam search procedure.
      • beamsearch-reranking – same as above, but the candidates are re-ordered after generation in the same way as in sampling-reranking mode.

    Note that there are other parameters that affect the response generation process. See REPETITION_PENALIZE_COEFFICIENT, NON_PENALIZABLE_TOKENS, MAX_PREDICTIONS_LENGTH.
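For intuition, here is a minimal sketch of the temperature transform used by the sampling mode (illustrative only; 0.5 stands in for a hypothetical DEFAULT_TEMPERATURE value):

import numpy as np

def sample_with_temperature(probs, temperature=0.5, rng=np.random):
    # Sharpen (T < 1) or flatten (T > 1) the distribution, renormalize, then sample.
    logits = np.log(np.asarray(probs) + 1e-12) / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(len(p), p=p)

print(sample_with_temperature([0.6, 0.3, 0.1]))  # low temperature favors the most likely token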

Example use cases

By providing additional condition labels within dataset entries, you can build, for example, a persona-based conversational model[1], an emotional chatting machine[2] or a topic-aware dialog model.

To make use of these extra conditions, please refer to the section Training the model. Just set the "condition" field in the training set to one of the following: a persona ID, an emotion or a topic label (see the example below), update the index files and start the training.
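For illustration, a dataset line conditioning on hypothetical persona IDs (field names as in the dummy dataset) might look like this:

[{"text": "Hi there!", "condition": "persona_1"}, {"text": "Hello! Long time no see.", "condition": "persona_2"}]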

References

Credits & Support

CakeChat is developed and maintained by the Replika team:

Nicolas Ivanov, Michael Khalman, Nikita Smetanin, Artem Rodichev and Denis Fedorenko.

Demo by Oleg Akbarov, Alexander Kuznetsov and Vladimir Chernosvitov.

All issues and feature requests can be tracked here – GitHub Issues.

License

© 2019 Luka, Inc. Licensed under the Apache License, Version 2.0. See LICENSE file for more details.


cakechat's Issues

AMD ROCm support

Since TensorFlow just introduced ROCm support in their ML framework, I think CakeChat should also follow the trend, since it would allow our AMD folks to run CakeChat without any dependency on NVIDIA and CUDA.

Problem running the chat using Docker on Windows 10

Hi there,

I have attempted to run CakeChat on my Windows 10 machine.

I've just followed the Quick Start guide and executed the following command:
docker run --name cakechat-dev -p 127.0.0.1:8080:8080 -it lukalabs/cakechat:latest bash -c "python bin/cakechat_server.py"

The image is downloaded just fine, but at the end Docker throws this error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "exec: "\\": executable file not found in $PATH": unknown.

What could be the issue?

Thank you in advance.

DialogFlow

If you were to do this app today, would you have used DialogFlow?

Adding more than 5 emotions as conditions.

Hey guys, I'm using cakechat to learn emotions as in the original model, but I'm adding more than 5 emotions as conditions in my training data. I also changed config.py to look for all my emotions and save them in the proper variable. However, when training on my data, I get a cmd response of 'Killed', and my models are not created. Any idea why?
Here are my changes to the code:

Example of data :
[{"text" : "Hi there!", "condition" : "Anticipation"}, {"text" : "Hey you. Long time no see!", "condition" : "Surprise"}, {"text" : "Sorry, Ive been busy", "condition" : "Expectation"}, {"text" : "No problem. Being busy is a part of life", "condition" : "Acceptance"}]

Changes in config.py :
EMOTIONS_TYPES = create_namedtuple_instance(
    'EMOTIONS_TYPES', neutral='Neutral', anger='Anger', joy='Joy', sadness='Sadness',
    sadnessLonely='SadnessLonely', anticipation='Anticipation', trust='Trust',
    acceptance='Acceptance', surprise='Surprise')

iOS Support

How would I be able to use this model in iOS. Is there an API to call to or can I convert the model to CoreML?

TypeError: unsupported format string passed to tuple.__format__

I'm trying to train the model using my own data. I used Dockerfile3.cpu as a step-by-step installation guide, so I'm running the latest master version with Python 3. Then I replaced

  • data/corpora_processed/train_processed_dialogs.txt
  • data/corpora_processed/val_processed_dialogs.txt
  • data/quality/context_free_validation_set.txt
  • data/quality/context_free_questions.txt
  • data/quality/context_free_test_set.txt

with my own data. All my messages in all chats have the neutral condition. After running python tools/prepare_index_files.py I got a c_idx_processed_dialogs.json file with the following content:

{"0": "neutral"}

Then I run python tools/train.py and it fails with the following error:

Log output:
[23.01.2019 17:40:37.133][INFO][11346][cakechat.utils.files_utils][91] Creating /home/unnamed/Projects/ml_chatbot/cakechat/data/tensorboard/steps
[23.01.2019 17:40:37.134][INFO][11346][cakechat.tools/train.py][102] THEANO_FLAGS: floatX=float32,device=cpu
[23.01.2019 17:40:37.141][INFO][11346][cakechat.tools/train.py][42] Getting train iterator for w2v...
[23.01.2019 17:40:37.142][INFO][11346][cakechat.tools/train.py][48] Getting text-filtered train iterator...
[23.01.2019 17:40:37.142][INFO][11346][cakechat.tools/train.py][51] Getting tokenized train iterator...
[23.01.2019 17:40:37.142][INFO][11346][cakechat.utils.w2v.model][64] Getting w2v model
[23.01.2019 17:40:37.182][INFO][11346][cakechat.utils.s3.bucket][19] Getting file w2v_models/train_processed_dialogs_window10_voc12477_vec128_sgTrue.bin from AWS S3 and saving it as /home/unnamed/Projects/ml_chatbot/cakechat/data/w2v_models/train_processed_dialogs_window10_voc12477_vec128_sgTrue.bin
[23.01.2019 17:40:38.899][WARNING][11346][cakechat.utils.s3.resolver.S3FileResolver][42] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found
[23.01.2019 17:40:38.899][INFO][11346][cakechat.utils.w2v.model][18] Word2Vec model will be trained now. It can take long, so relax and have fun.
[23.01.2019 17:40:38.899][INFO][11346][cakechat.utils.w2v.model][21] Parameters for training: window10_voc12477_vec128_sgTrue
[23.01.2019 17:40:39.382][INFO][11346][cakechat.utils.w2v.model][44] Saving model to /home/unnamed/Projects/ml_chatbot/cakechat/data/w2v_models/train_processed_dialogs_window10_voc12477_vec128_sgTrue.bin
[23.01.2019 17:40:39.495][INFO][11346][cakechat.utils.w2v.model][47] Model has been saved
[23.01.2019 17:40:39.495][INFO][11346][cakechat.utils.w2v.model][80] Successfully got w2v model

[23.01.2019 17:40:39.495][INFO][11346][cakechat.dialog_model.model_utils][202] Preparing embedding matrix based on w2v_model and index_to_token dict
[23.01.2019 17:40:39.497][WARNING][11346][cakechat.dialog_model.model_utils][192] Can't find token [_unk_] in w2v dict
[23.01.2019 17:40:40.214][INFO][11346][cakechat.dialog_model.model][466] Compiling predict function (log_prob=False)...
[23.01.2019 17:40:44.393][INFO][11346][cakechat.dialog_model.model][493] Compiling one-step predict function (log_prob=False)...
[23.01.2019 17:40:47.393][INFO][11346][cakechat.dialog_model.model][466] Compiling predict function (log_prob=True)...
[23.01.2019 17:40:51.105][INFO][11346][cakechat.dialog_model.model][493] Compiling one-step predict function (log_prob=True)...
[23.01.2019 17:40:54.295][INFO][11346][cakechat.dialog_model.model][542] Compiling sequence scoring function...
[23.01.2019 17:40:57.781][INFO][11346][cakechat.dialog_model.model][565] Compiling sequence scoring function (with thought vectors as arguments)...
Net shapes:
	input_y              	(None, None)
	emb_y                	(None, None, 128)
	thought_vector       	(None, 512)
	input_x              	(None, None, None)
	None                 	(None, None)
	emb_x                	(None, None, 128)
	mask_x               	(None, None)
	encoder_forward      	(None, None, 512)
	encoder_backward     	(None, None, 512)
	encoder_bidirectional_concat 	(None, None, 1024)
	encoder_1            	(None, 512)
	None                 	(None, None, 512)
	context_encoder      	(None, 512)
	None                 	(None, 512)
	repeat_layer         	(None, None, 512)
	input_condition_id   	(None,)
	embedding_condition_id 	(None, 128)
	embedding_condition_id_repeated 	(None, None, 128)
	decoder_concated_input 	(None, None, 768)
	mask_y               	(None, None)
	hid_states_decoder   	(None, 2, None)
	None                 	(None, None)
	decoder_1            	(None, None, 512)
	None                 	(None, None)
	decoder_2            	(None, None, 512)
	None                 	(None, 512)
	decoder_dropout_layer 	(None, 512)
	dense_output_probs   	(None, 12477)
[23.01.2019 17:41:03.248][INFO][11346][cakechat.utils.s3.bucket][19] Getting file nn_models/cakechat_v1.3_processed_dialogs_gru_hd512_cdim128_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc5.0_learnemb from AWS S3 and saving it as /home/unnamed/Projects/ml_chatbot/cakechat/data/nn_models/cakechat_v1.3_processed_dialogs_gru_hd512_cdim128_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc5.0_learnemb
[23.01.2019 17:41:32.900][INFO][11346][cakechat.utils.s3.bucket][21] Got file nn_models/cakechat_v1.3_processed_dialogs_gru_hd512_cdim128_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc5.0_learnemb from S3
[23.01.2019 17:41:32.904][INFO][11346][cakechat.dialog_model.model][626] 
Loading saved weights from file:
/home/unnamed/Projects/ml_chatbot/cakechat/data/nn_models/cakechat_v1.3_processed_dialogs_gru_hd512_cdim128_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc5.0_learnemb


Restored saved params:
	encoder_forward.W_in_to_updategate
	encoder_forward.W_hid_to_updategate
	encoder_forward.b_updategate
	encoder_forward.W_in_to_resetgate
	encoder_forward.W_hid_to_resetgate
	encoder_forward.b_resetgate
	encoder_forward.W_in_to_hidden_update
	encoder_forward.W_hid_to_hidden_update
	encoder_forward.b_hidden_update
	encoder_forward.hid_init
	encoder_backward.W_in_to_updategate
	encoder_backward.W_hid_to_updategate
	encoder_backward.b_updategate
	encoder_backward.W_in_to_resetgate
	encoder_backward.W_hid_to_resetgate
	encoder_backward.b_resetgate
	encoder_backward.W_in_to_hidden_update
	encoder_backward.W_hid_to_hidden_update
	encoder_backward.b_hidden_update
	encoder_backward.hid_init
	encoder_1.W_in_to_updategate
	encoder_1.W_hid_to_updategate
	encoder_1.b_updategate
	encoder_1.W_in_to_resetgate
	encoder_1.W_hid_to_resetgate
	encoder_1.b_resetgate
	encoder_1.W_in_to_hidden_update
	encoder_1.W_hid_to_hidden_update
	encoder_1.b_hidden_update
	encoder_1.hid_init
	context_encoder.W_in_to_updategate
	context_encoder.W_hid_to_updategate
	context_encoder.b_updategate
	context_encoder.W_in_to_resetgate
	context_encoder.W_hid_to_resetgate
	context_encoder.b_resetgate
	context_encoder.W_in_to_hidden_update
	context_encoder.W_hid_to_hidden_update
	context_encoder.b_hidden_update
	context_encoder.hid_init
	decoder_1.W_in_to_updategate
	decoder_1.W_hid_to_updategate
	decoder_1.b_updategate
	decoder_1.W_in_to_resetgate
	decoder_1.W_hid_to_resetgate
	decoder_1.b_resetgate
	decoder_1.W_in_to_hidden_update
	decoder_1.W_hid_to_hidden_update
	decoder_1.b_hidden_update
	decoder_2.W_in_to_updategate
	decoder_2.W_hid_to_updategate
	decoder_2.b_updategate
	decoder_2.W_in_to_resetgate
	decoder_2.W_hid_to_resetgate
	decoder_2.b_resetgate
	decoder_2.W_in_to_hidden_update
	decoder_2.W_hid_to_hidden_update
	decoder_2.b_hidden_update

Missing saved params:

Shapes-mismatched params (saved -> current):
Traceback (most recent call last):
  File "tools/train.py", line 107, in <module>
    train(init_path=args.init_weights, is_reverse_model=args.reverse)
  File "tools/train.py", line 79, in train
    resolver_factory=nn_model_resolver_factory, is_reverse_model=is_reverse_model)
  File "/home/unnamed/Projects/ml_chatbot/cakechat/cakechat/dialog_model/model.py", line 723, in get_nn_model
    model.load_weights()
  File "/home/unnamed/Projects/ml_chatbot/cakechat/cakechat/dialog_model/model.py", line 659, in load_weights
    laconic_logger.warning('\t{0:<40} {1:<12} -> {2:<12}'.format(var_name, saved_shape, default_shape))
TypeError: unsupported format string passed to tuple.__format__

What am I doing wrong?

Cakechat "can't find token"

Hello guys, I've recently started working with cakechat and I'm facing some issues. I've run prepare_index_files.py (with a French dataset of ~10,000 dialogs) with no issues.

Afterwards, when I ran python tools/train.py, it continually threw this kind of error:

[25.04.2018 15:18:33.495][INFO][15][cakechat.utils.s3.bucket][21] Got file w2v_models/train_processed_dialogs_window10_voc50000_vec128_sgTrue.bin from S3
[25.04.2018 15:18:33.510][INFO][15][cakechat.utils.w2v.model][51] Loading model from /root/cakechat/data/w2v_models/train_processed_dialogs_window10_voc50000_vec128_sgTrue.bin
[25.04.2018 15:18:33.794][INFO][15][cakechat.utils.w2v.model][53] Model "train_processed_dialogs_window10_voc50000_vec128_sgTrue.bin" has been loaded.
[25.04.2018 15:18:33.794][INFO][15][cakechat.utils.w2v.model][80] Successfully got w2v model
[25.04.2018 15:18:33.794][INFO][15][cakechat.dialog_model.model_utils][205] Preparing embedding matrix based on w2v_model and index_to_token dict

[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [ça] in w2v dict
[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [avec] in w2v dict
[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [même] in w2v dict
[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [ils] in w2v dict
[25.04.2018 15:18:33.806][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [être] in w2v dict
[25.04.2018 15:18:33.807][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [suis] in w2v dict
[25.04.2018 15:18:33.807][WARNING][15][cakechat.dialog_model.model_utils][195] Can't find token [quand] in w2v dict
...
  • I have about 38,521 warnings like that.
  • I've checked: all these tokens are in token_index/t_idx_processed_dialogs.json (it's weird because there are 50,000 words inside, and some of them are found)

Here is what my data/ folder looks like:
data/condition_index/c_idx_processed_dialogs.json

corpora_processed/train_processed_dialogs.txt

quality/context_free_questions.txt
quality/context_free_test_set.txt
quality/context_free_validation_set.txt

tensorboard/steps

token_index/t_idx_processed_dialogs.json

w2v_models/train_processed_dialogs_window10_voc50000_vec128_sgTrue.bin

Finally, when the step below is reached, the train.py process is killed:

...
[25.04.2018 15:20:26.123][INFO][15][cakechat.dialog_model.model][348] Computing train updates...
[25.04.2018 15:22:27.128][INFO][15][cakechat.dialog_model.model][351] Compiling train function...
Killed

(IS_DEV flag has been set to 0)

Thanks a lot for your help and for all your work!

Is continuous training supported?

I was just wondering if cakechat works off precompiled static models, or if it can "learn" from people talking to it. It's not really a big deal either way, as I could just periodically retrain it using an updated dataset.

Proper Dataset Formatting

I'm working on some tools to automate the creation of my own training datasets. I've noticed that there are discrepancies between the supplied dummy data and the datasets downloaded from AWS.

The dummy datasets include punctuation, which gets tokenized into separate tokens when generating indices, but the AWS dataset does not include punctuation in the token index. Is the right plan of action to strip punctuation from the dataset?

Also, there are placeholders like "_unk_" and "_pad_" that get added in as well. These are also not present in the AWS token index, but will be added to a generated index, presumably due to the defaults set in config.py.

How to prepare Corpus?

I have a corpus file, but I don't know how to easily transform it so that "Each line of the corpus file should be a JSON object containing a list of dialog messages sorted in chronological order."

Is there a tool that can take a downloaded corpus and translate it into the cakechat format?

How to start the model in continuous Q&A flow?

Once we have tested the model using

python tools/test_api.py -f 127.0.0.1 -p 8080 \
    -c "Hi, Eddie, what's up?" \
    -c "Not much, what about you?" \
    -c "Fine, thanks. Are you going to the movies tomorrow?"

how do we start up a session that works like an actual chatbot, with immediate questions and answers?

Adding "memory"

Is there a way to add a sort of memory to the AI? Remembering semantics, like information about the user, similar to how Replika.ai remembers details.

If there is no built feature for this how could I go about creating it myself?

Syntax Error:Invalid Syntax

Compiled on macOS 10.14.5

I executed "pip install -r requirements.txt -r requirements-local.txt",
then "open cakechat/cakechat/api/config.py" to remove an emoji character because it was complaining about non-ASCII character encoding,
then "python cakechat_server.py -b 127.0.0.1:8080" from the bin directory.

Should I update Python? I'm on 2.7.15.

Using TensorFlow backend.
[2019-06-01 11:08:31 -0700] [52444] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
worker.init_process()
File "/usr/local/lib/python2.7/site-packages/gunicorn/workers/base.py", line 129, in init_process
self.load_wsgi()
File "/usr/local/lib/python2.7/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
self.wsgi = self.app.wsgi()
File "/usr/local/lib/python2.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/usr/local/lib/python2.7/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
return self.load_wsgiapp()
File "/usr/local/lib/python2.7/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
return util.import_app(self.app_uri)
File "/usr/local/lib/python2.7/site-packages/gunicorn/util.py", line 350, in import_app
import(module)
File "/Users/rob/Desktop/cakechat_v2./bin/cakechat_server.py", line 11, in
from cakechat.api.v1.server import app
File "/Users/rob/Desktop/cakechat_v2./cakechat/api/v1/server.py", line 3, in
from cakechat.api.response import get_response
File "/Users/rob/Desktop/cakechat_v2./cakechat/api/response.py", line 6, in
from cakechat.dialog_model.factory import get_trained_model, get_reverse_model
File "/Users/rob/Desktop/cakechat_v2./cakechat/dialog_model/factory.py", line 8, in
from cakechat.dialog_model.inference_model import InferenceCakeChatModel
File "/Users/rob/Desktop/cakechat_v2./cakechat/dialog_model/inference_model.py", line 1, in
from cakechat.dialog_model.keras_model import KerasTFModelIsolator
File "/Users/rob/Desktop/cakechat_v2./cakechat/dialog_model/keras_model.py", line 114
class AbstractKerasModel(AbstractModel, metaclass=abc.ABCMeta):

SyntaxError: invalid syntax
[2019-06-01 11:08:31 -0700] [52444] [INFO] Worker exiting (pid: 52444)
[2019-06-01 11:08:31 -0700] [52441] [INFO] Shutting down: Master
[2019-06-01 11:08:31 -0700] [52441] [INFO] Reason: Worker failed to boot.

Recommended server specs?

I've got cakechat running now on a Debian box that I set up on Google Cloud Platform, and I'm using it in a test app that receives messages and sends back responses to a Facebook Messenger app. It's nice!

I'm wondering if you have any suggestions regarding server requirements: recommended RAM, disk size, number of CPUs, etc.?

Docker container does not start on docker run

Now, in the context of using your Docker container on container platforms like OpenShift or an orchestration platform like Kubernetes, this is a bit of a no-no.

I'm proposing to add another CMD layer that would be used by container platforms to tell the container to start instead of dropping to the shell, which would otherwise cause a back-off in OpenShift/Kubernetes.

This is doable for the CPU image, but the GPU image would need some special configuration, which makes it a bit unachievable at the moment.

APK?

I'd rather not install this through the Google play store, but I don't trust these various apk pages either. I guess technically the issue I'm reporting is the lack of an APK on the releases page.

Thanks for your patience.

tools/fetch.py fails. File can not be downloaded.

Hi there.

When I run fetch.py It fails with the following errors:

[13.06.2019 03:43:43.186][INFO][16608][cakechat.dialog_model.inference_model.InferenceCakeChatModel][130] Looking for the previously trained model
[13.06.2019 03:43:43.186][INFO][16608][cakechat.dialog_model.inference_model.InferenceCakeChatModel][131] Model params str: {"corpus_name": "processed_dialogs", "dense_dropout_ratio": 0.2, "epochs_num": 2, "hidden_layer_dim": 768, "input_context_size": 3, "input_seq_len": 30, "is_reverse_model": true, "optimizer": {"clipvalue": 5.0, "decay": 0.0, "epsilon": 1e-07, "lr": 6.0, "rho": 0.95}, "output_seq_len": 32, "token_embedding_dim": 128, "train_batch_size": 196, "training_callbacks": {"CakeChatEvaluatorCallback": {"eval_state_per_batches": 500}}, "training_data": "train_processed_dialogs", "validation_data": "context_free_validation_set,val_processed_dialogs", "voc_size": 101, "w2v_model": "train_processed_dialogs_window10_voc50000_vec128_sgTrue"}
[13.06.2019 03:43:43.260][INFO][16608][cakechat.utils.s3.bucket][19] Getting file nn_models/reverse_cakechat_v2.0_keras_tf_617dfa4a1691.tar.gz from AWS S3 and saving it as /mnt/amadeus/chatbot/results/nn_models/reverse_cakechat_v2.0_keras_tf_617dfa4a1691.tar.gz
[13.06.2019 03:43:44.084][WARNING][16608][cakechat.utils.s3.resolver.S3FileResolver][45] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found
[13.06.2019 03:43:44.085][ERROR][16608][cakechat.dialog_model.inference_model.InferenceCakeChatModel][136] Can't find previously trained model in /mnt/amadeus/chatbot/results/nn_models/reverse_cakechat_v2.0_keras_tf_617dfa4a1691

Thanks for all your work on this project!

Setup documentation

I've downloaded and trained the model, but when I launch the server and try to call it in my browser I get a This site can't be reached error. I've tried http://localhost:8080 and http://127.0.0.1:8080/.

I looked around and saw that I could use docker inspect <CONTAINER-NAME> which yielded an ip of 172.17.0.2, which could also not be reached.

Also, when I try the test conversation I get

requests.exceptions.ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8080): Max retries exceeded with url: /cakechat_api/v1/actions/get_response (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x103607eb8>: Failed to establish a new connection: [Errno 61] Connection refused',))

I found this resource helpful, although I still don't know how I can access the Docker port mapping.

I'm super psyched to get involved and help out with this project, even as a total bot beginner, so perhaps I can help build out a fuller "Absolute Beginner" doc to support such simple client-server debugging?

Can't adjust input size

Trying to increase the input size to anything more than 3 results in an error saying I need to download a new model.

AttributeError: 'module' object has no attribute '_get_ndarray_c_version'

Outputs this right here... Not quite sure how to fix...

Tried deleting the stuff in w2v_models to no avail - help!

[17.05.2019 14:44:06.637][INFO][11844][cakechat.utils.files_utils][87] Loading /root/cakechat/data/tensorboard/steps
[17.05.2019 14:44:06.637][INFO][11844][cakechat.tools/train.py][102] THEANO_FLAGS: floatX=float32,device=cpu
[17.05.2019 14:44:06.639][INFO][11844][cakechat.tools/train.py][42] Getting train iterator for w2v...
[17.05.2019 14:44:06.639][INFO][11844][cakechat.tools/train.py][48] Getting text-filtered train iterator...
[17.05.2019 14:44:06.639][INFO][11844][cakechat.tools/train.py][51] Getting tokenized train iterator...
[17.05.2019 14:44:06.640][INFO][11844][cakechat.utils.w2v.model][64] Getting w2v model
[17.05.2019 14:44:06.789][INFO][11844][cakechat.utils.s3.bucket][19] Getting file w2v_models/train_LUCI_window10_voc891_vec128_sgTrue.bin from AWS S3 and saving it as /root/cakechat/data/w2v_models/train_LUCI_window10_voc891_vec128_sgTrue.bin
[17.05.2019 14:44:07.206][WARNING][11844][cakechat.utils.s3.resolver.S3FileResolver][42] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found
[17.05.2019 14:44:07.206][INFO][11844][cakechat.utils.w2v.model][18] Word2Vec model will be trained now. It can take long, so relax and have fun.
[17.05.2019 14:44:07.206][INFO][11844][cakechat.utils.w2v.model][21] Parameters for training: window10_voc891_vec128_sgTrue
[17.05.2019 14:44:07.255][INFO][11844][cakechat.utils.w2v.model][44] Saving model to /root/cakechat/data/w2v_models/train_LUCI_window10_voc891_vec128_sgTrue.bin
[17.05.2019 14:44:07.263][INFO][11844][cakechat.utils.w2v.model][47] Model has been saved
[17.05.2019 14:44:07.263][INFO][11844][cakechat.utils.w2v.model][80] Successfully got w2v model

[17.05.2019 14:44:07.263][INFO][11844][cakechat.dialog_model.model_utils][202] Preparing embedding matrix based on w2v_model and index_to_token dict
[17.05.2019 14:44:07.263][WARNING][11844][cakechat.dialog_model.model_utils][192] Can't find token [_unk_] in w2v dict
Traceback (most recent call last):
  File "tools/train.py", line 107, in <module>
    train(init_path=args.init_weights, is_reverse_model=args.reverse)
  File "tools/train.py", line 79, in train
    resolver_factory=nn_model_resolver_factory, is_reverse_model=is_reverse_model)
  File "/root/cakechat/cakechat/dialog_model/model.py", line 714, in get_nn_model
    is_reverse_model=is_reverse_model)
  File "/root/cakechat/cakechat/dialog_model/model.py", line 88, in __init__
    self._compile_theano_functions_for_prediction()
  File "/root/cakechat/cakechat/dialog_model/model.py", line 144, in _compile_theano_functions_for_prediction
    self.predict_prob = self._get_predict_fn(logarithm_output_probs=False)
  File "/root/cakechat/cakechat/dialog_model/model.py", line 464, in _get_predict_fn
    output_probs = self._get_nn_output()
  File "/root/cakechat/cakechat/dialog_model/model.py", line 443, in _get_nn_output
    output_probs = get_output(self._net['dist'], deterministic=True)
  File "/usr/local/lib/python2.7/dist-packages/lasagne/layers/helper.py", line 197, in get_output
    all_outputs[layer] = layer.get_output_for(layer_inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/lasagne/layers/recurrent.py", line 1489, in get_output_for
    strict=True)[0]
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan.py", line 1048, in scan
    local_op = scan_op.Scan(inner_inputs, new_outs, info)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 216, in __init__
    [])
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1300, in cmodule_key_variables
    c_compiler)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1350, in cmodule_key_
    np.core.multiarray._get_ndarray_c_version())
AttributeError: 'module' object has no attribute '_get_ndarray_c_version'

Using the trained model

I have retrained the default model on different data, using the downloaded model weights as the initial weights (in reverse mode).
So now the question is how to use the model that I trained, and not the one that was downloaded through the provided script download_model.py.
Also, I wish to use the Colab environment and I am not sure how to use the GPU there: I successfully configured libgpuarray but don't know where to put the GPU id or how to find it in Colab. The most I could figure out was that the GPU device name is "device:GPU:0", following steps similar to those given in https://www.kdnuggets.com/2018/02/google-colab-free-gpu-tutorial-tensorflow-keras-pytorch.html/2.
I reached the utils/env.py file in cakechat, and there a 'GPU-ID' was discovered, but what value to supply is not clear.
It would be great if you could help me with that.

Thanks

Responses are not context-oriented

Hello, I came across your repository and it's a great project! Thank you for sharing!
I tried training a "chit-chat" model on it, and it generates sentences that look "correct" but are unfortunately quite "irrelevant" to the user's input.
Do you have any suggestions on how to improve the "relevance" of the responses to the user's input? (e.g., which decoding algorithm to choose, how to tune the parameters, or how to affect the sampling process?)
Thanks!

Newbie Training our responses

Hello, I would like to ask how we are supposed to train the cakechat model. We used this for a group project at school and we have very limited knowledge of programming. We asked one of our professors for help and were able to run it, but we have a big problem training it since he won't help us anymore. We have already read your steps on training our own model and also some of the issues, but we couldn't understand them, or maybe we were just overthinking the process.
Our questions are:

  1. Where are we going to edit our responses?
  2. How do we make cakechat respond with what we want it to respond?
  3. We've read that the corpus file is a JSON file. Can we edit it just in Notepad? And does this file contain all our response data, or is it one response only in 5 different emotions?

This is embarrassing, but we know you have posted the answers and we still can't understand them. We ask specific questions because this is so new to us. We hope you can help us :(

Grammatically-correct response

How do I get answers with grammatically correct words?
I have trained a new model on Russian data, and the responses generated by the neural network consist of a bunch of words in different forms (morphology, case, plural, etc.), not normalized like https://pymorphy2.readthedocs.io/en/latest/user/guide.html does. So it is hard to interpret whether a response is correct or not.

I cannot find any related implementation.
Also, is the word2vec model generated from train_processed_dialogs.txt? If so, I think that's bad; why don't you use a general w2v model like piskvorky/gensim-data#3?

Tensorflow GPU issue

CLIENT:

root@c6bf55d8c25f:~/cakechat# python tools/test_api.py -f localhost -p 8080 -c "hi!" -c "hi, how are you?" -c "good!" -e "joy"

Output:

Using TensorFlow backend.
{'message': 'Can\'t process request: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc\n\t [[{{node decoder_model/softmax_with_temperature/Softmax}} = Softmax[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_model/softmax_with_temperature/sub)]]\nHint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.\n\n\t [[{{node decoder_model/softmax_with_temperature/Softmax/_203}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_715_decoder_model/softmax_with_temperature/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]\nHint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.\n'}

SERVER output:

2019-08-16 12:03:08.905231: W tensorflow/core/common_runtime/bfc_allocator.cc:271] *************************************************************************************************xxx
2019-08-16 12:03:08.905317: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at softmax_op_gpu.cu.cc:158 : Resource exhausted: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[16.08.2019 12:03:08.906][ERROR][1][cakechat.api.v1.server][5] Can't process request: OOM when allocating tensor with shape[10,39,50000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node decoder_model/softmax_with_temperature/Softmax}} = Softmax[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_model/softmax_with_temperature/sub)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node decoder_model/softmax_with_temperature/Softmax/_203}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_715_decoder_model/softmax_with_temperature/Softmax", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

127.0.0.1 - - [16/Aug/2019 12:03:08] "POST /cakechat_api/v1/actions/get_response HTTP/1.1" 500 -

Amount of training data and training time

Hi, first I would like to say thank you for this amazing repository.

I have 2 questions related to training the model:

What is the amount of data that you used to train your model?

What are the specifications of the hardware that you used to train your model and how much time did it take to completely train?

Thank you very much, and I apologize for my bad English.

Unable to download or generate new model

I'm experiencing difficulties with downloading the ready model via tools/download_model.py. It would seem that the content from Amazon Web Services cannot be found; see the following from the stack trace:

[22.03.2018 20:40:47.562][WARNING][1108][cakechat.utils.s3.resolver.S3FileResolver][43] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found

It then attempts to create a model which seems to complete successfully.

[22.03.2018 20:40:47.565][INFO][1108][cakechat.dialog_model.model][619] Can't find previously calculated model, so will use a fresh one [22.03.2018 20:40:47.565][INFO][1108][cakechat.dialog_model.model][621] Model is built

However, no model is placed in the file path specified by the stack trace.
Here is the full result of executing python tools/download_model.py.

Any insight into how to resolve this issue would be much appreciated, thank you!

(base) path: >python tools/download_model.py

PATH\Anaconda2\lib\site-packages\gensim\utils.py:855: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
[22.03.2018 20:37:32.302][INFO][1108][cakechat.tools/download_model.py][21] Fetching and pre-compiling pre-trained model...
[22.03.2018 20:37:32.331][INFO][1108][cakechat.dialog_model.model][598] Initializing NN model with the following params:
[22.03.2018 20:37:32.332][INFO][1108][cakechat.dialog_model.model][600] NN input dimension: 256 (token vector size)
[22.03.2018 20:37:32.332][INFO][1108][cakechat.dialog_model.model][601] NN hidden dimension: 512
[22.03.2018 20:37:32.332][INFO][1108][cakechat.dialog_model.model][602] NN output dimension: 39 (dict size)
[22.03.2018 20:37:35.569][INFO][1108][cakechat.dialog_model.model][407] Compiling predict function (log_prob=False)...
[22.03.2018 20:38:18.410][INFO][1108][cakechat.dialog_model.model][434] Compiling one-step predict function (log_prob=False)...
[22.03.2018 20:38:43.671][INFO][1108][cakechat.dialog_model.model][407] Compiling predict function (log_prob=True)...
[22.03.2018 20:39:12.292][INFO][1108][cakechat.dialog_model.model][434] Compiling one-step predict function (log_prob=True)...
[22.03.2018 20:39:36.957][INFO][1108][cakechat.dialog_model.model][483] Compiling sequence scoring function...
[22.03.2018 20:40:05.703][INFO][1108][cakechat.dialog_model.model][506] Compiling sequence scoring function (with thought vectors as arguments)...
[22.03.2018 20:40:45.812][INFO][1108][cakechat.utils.s3.bucket][21] Getting file nn_models\processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc_5.0_learnemb_cdim128_window10_voc39_vec128_sgTrue from AWS S3 and saving it as PATH\cakechat-master\data/nn_models\processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc_5.0_learnemb_cdim128_window10_voc39_vec128_sgTrue
[22.03.2018 20:40:47.562][WARNING][1108][cakechat.utils.s3.resolver.S3FileResolver][43] File can not be downloaded from AWS S3 because: An error occurred (404) when calling the HeadObject operation: Not Found
[22.03.2018 20:40:47.565][INFO][1108][cakechat.dialog_model.model][619] Can't find previously calculated model, so will use a fresh one
[22.03.2018 20:40:47.565][INFO][1108][cakechat.dialog_model.model][621] Model is built

[22.03.2018 20:40:47.625][INFO][1108][cakechat.dialog_model.model][625] Model path is PATH\cakechat-master\data/nn_models\processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc_5.0_learnemb_cdim128_window10_voc39_vec128_sgTrue

Traceback (most recent call last):
  File "tools/download_model.py", line 22, in <module> get_trained_model(fetch_from_s3=True)
  File "PATH\cakechat-master\cakechat\dialog_model\factory.py", line 53, in get_trained_model
    raise Exception('Can\'t get the model. '
Exception: Can't get the model. Run tools/download_model.py first to get all required files or train it by yourself.

Propose Logo

Hi, I'm a graphic designer. I would like to know if you would be interested in me making a logo for your project. If you allow me, I'll make a logo for your project, and it's free.

Installing on Google Cloud Platform

I was able to successfully install cakechat on my laptop (Mac OSX), but I ran into error messages when I tried installing it on Google Cloud Platform. Could you help with this or update your README to include instructions for running it there?

Server Not Found

Hello!
Just getting started with cakechat, installing it without Docker on macOS.
It worked perfectly up until I ran "python bin/cakechat_server.py".
It loaded the server and said it was running, but when I visited the site it said:
"Not Found
The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again."

I honestly have no idea how to go about this -- would really appreciate your help :)

Can't wait to get started messing around and building awesome stuff with CakeChat! I'll be sure to keep you updated with my project, hopefully, if I can get cakechat running properly.


Setup Issues

I first tried the docker setup, but it errored out and I just assumed that it was my fault since I hadn't used Docker before. Then I moved to the manual setup and it threw the same error.

While installing, the bdist_wheel step for scipy errors, which gives me:

  • Failed building wheel for scipy
  • Failed cleaning build dir for scipy
  • Failed building wheel for scikit-learn

which then breaks the install with:
Command "/usr/bin/python -u -c "import setuptools, tokenize;file='/tmp/pip-build-Je8Y7m/scipy/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-VPk4Mt-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-Je8Y7m/scipy/

Quick start for GPU version:

I got an error after using the command:

docker pull lukalabs/cakechat-gpu:latest && \
nvidia-docker run --name cakechat-gpu-server -p 127.0.0.1:8080:8080 -it lukalabs/cakechat-gpu:latest bash -c "CUDA_VISIBLE_DEVICES=0 python bin/cakechat_server.py"

Output:

2019-08-18 14:09:11.496399: W tensorflow/core/common_runtime/bfc_allocator.cc:271] ***************************************************************************************xxxxxxxxxxxxx
2019-08-18 14:09:11.496449: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at random_op.cc:202 : Resource exhausted: OOM when allocating tensor with shape[50000,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[768,2304] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node decoder_scope/decoder_1/random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, dtype=DT_FLOAT, seed=87654321, seed2=5561963, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_scope/decoder_1/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bin/cakechat_server.py", line 11, in <module>
    from cakechat.api.v1.server import app
  File "/root/cakechat/cakechat/api/v1/server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "/root/cakechat/cakechat/api/response.py", line 14, in <module>
    _cakechat_model = get_trained_model(reverse_model=get_reverse_model(PREDICTION_MODE))
  File "/usr/local/lib/python3.5/dist-packages/cachetools/__init__.py", line 46, in wrapper
    v = func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/factory.py", line 76, in get_trained_model
    model.init_model()
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 30, in wrapper
    return func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 279, in init_model
    self.print_weights_summary()
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 30, in wrapper
    return func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 263, in print_weights_summary
    weights = self._model.get_weights()
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 492, in get_weights
    return K.batch_get_value(weights)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2420, in batch_get_value
    return get_session().run(ops)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 206, in get_session
    session.run(tf.variables_initializer(uninitialized_vars))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[768,2304] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node decoder_scope/decoder_1/random_uniform/RandomUniform (defined at /usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py:4139)  = RandomUniform[T=DT_INT32, dtype=DT_FLOAT, seed=87654321, seed2=5561963, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_scope/decoder_1/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


Caused by op 'decoder_scope/decoder_1/random_uniform/RandomUniform', defined at:
  File "bin/cakechat_server.py", line 11, in <module>
    from cakechat.api.v1.server import app
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 665, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/root/cakechat/cakechat/api/v1/server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 665, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/root/cakechat/cakechat/api/response.py", line 14, in <module>
    _cakechat_model = get_trained_model(reverse_model=get_reverse_model(PREDICTION_MODE))
  File "/usr/local/lib/python3.5/dist-packages/cachetools/__init__.py", line 46, in wrapper
    v = func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/factory.py", line 76, in get_trained_model
    model.init_model()
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 30, in wrapper
    return func(*args, **kwargs)
  File "/root/cakechat/cakechat/dialog_model/keras_model.py", line 277, in init_model
    self._model = self._build_model()
  File "/root/cakechat/cakechat/dialog_model/model.py", line 253, in _build_model
    decoder_training_model, decoder_model = self._decoder(y_tokens_emb_model, condition_emb_model)
  File "/root/cakechat/cakechat/dialog_model/model.py", line 412, in _decoder
    (outputs_seq_0, initial_state=dec_hs_1)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/recurrent.py", line 570, in __call__
    output = super(RNN, self).__call__(full_input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/base_layer.py", line 431, in __call__
    self.build(unpack_singleton(input_shapes))
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/cudnn_recurrent.py", line 237, in build
    constraint=self.kernel_constraint)
  File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/base_layer.py", line 249, in add_weight
    weight = K.variable(initializer(shape),
  File "/usr/local/lib/python3.5/dist-packages/keras/initializers.py", line 218, in __call__
    dtype=dtype, seed=self.seed)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 4139, in random_uniform
    dtype=dtype, seed=seed)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/random_ops.py", line 243, in random_uniform
    rnd = gen_random_ops.random_uniform(shape, dtype, seed=seed1, seed2=seed2)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_random_ops.py", line 733, in random_uniform
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[768,2304] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node decoder_scope/decoder_1/random_uniform/RandomUniform (defined at /usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py:4139)  = RandomUniform[T=DT_INT32, dtype=DT_FLOAT, seed=87654321, seed2=5561963, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder_scope/decoder_1/random_uniform/shape)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

The CPU version works fine, though.
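
A side note on the failure mode: the allocation fails while the weights are still being created, which suggests the GPU's free memory is too small for the model (or is already held by another process; nvidia-smi will show this). With the TensorFlow 1.x / Keras 2.2 stack this project pins, one common mitigation is letting the session grow GPU memory on demand instead of reserving it all up front. A minimal sketch, assuming it runs before the model is built (e.g., near the top of bin/cakechat_server.py):

import tensorflow as tf
from keras import backend as K

# Allocate GPU memory incrementally rather than grabbing the whole card at startup.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))

If the card genuinely cannot hold the 50000x128 embedding plus the decoder weights, this only delays the failure; a smaller model or the CPU image is then the fallback.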

tools/fetch.py: ImportError: DLL load failed: The specified module could not be found

I am running Python 3.6.7 on Windows 10 Pro, and when I try to run the command to download the pre-trained model I get this error related to scipy:

Traceback (most recent call last):
  File "tools/fetch.py", line 15, in <module>
    from cakechat.dialog_model.factory import get_trained_model
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\factory.py", line 8, in <module>
    from cakechat.dialog_model.inference_model import InferenceCakeChatModel
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\inference_model.py", line 1, in <module>
    from cakechat.dialog_model.keras_model import KerasTFModelIsolator
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\keras_model.py", line 11, in <module>
    from cakechat.dialog_model.abstract_model import AbstractModel
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\abstract_model.py", line 6, in <module>
    from cakechat.dialog_model.quality.metrics.utils import MetricsSerializer
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\quality\__init__.py", line 2, in <module>
    from cakechat.dialog_model.quality.metrics.lexical_simlarity import calculate_lexical_similarity, get_tfidf_vectorizer
  File "D:\Python\salvation\cakechat\cakechat\dialog_model\quality\metrics\lexical_simlarity.py", line 3, in <module>
    from sklearn.feature_extraction.text import TfidfVectorizer
  File "C:\Python36\lib\site-packages\sklearn\__init__.py", line 76, in <module>
    from .base import clone
  File "C:\Python36\lib\site-packages\sklearn\base.py", line 16, in <module>
    from .utils import _IS_32BIT
  File "C:\Python36\lib\site-packages\sklearn\utils\__init__.py", line 20, in <module>
    from .validation import (as_float_array,
  File "C:\Python36\lib\site-packages\sklearn\utils\validation.py", line 21, in <module>
    from .fixes import _object_dtype_isnan
  File "C:\Python36\lib\site-packages\sklearn\utils\fixes.py", line 18, in <module>
    from scipy.sparse.linalg import lsqr as sparse_lsqr  # noqa
  File "C:\Python36\lib\site-packages\scipy\sparse\linalg\__init__.py", line 113, in <module>
    from .isolve import *
  File "C:\Python36\lib\site-packages\scipy\sparse\linalg\isolve\__init__.py", line 6, in <module>
    from .iterative import *
  File "C:\Python36\lib\site-packages\scipy\sparse\linalg\isolve\iterative.py", line 10, in <module>
    from . import _iterative
ImportError: DLL load failed: The specified module could not be found.
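
A hedged note on this one: on Windows, "DLL load failed" while importing scipy.sparse.linalg._iterative usually means the installed scipy binaries do not match the interpreter or the numpy build next to them. Reinstalling the binary wheels as a matched set often resolves it; a sketch, assuming pip can reach PyPI:

pip uninstall -y numpy scipy scikit-learn
pip install numpy scipy scikit-learn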

Training process gets killed.

Hi!
When running train.py the process always gets killed for some reason.
Could you please help with pointing out if I am doing anything wrong?
I have attached my terminal output.
trainkilled.txt
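
A note on the usual suspect: a training process that dies with a bare "Killed" and no Python traceback is typically the Linux out-of-memory killer reclaiming RAM. A quick check, assuming a Linux host:

dmesg | grep -i -E 'killed process|out of memory'

If the kernel log confirms it, reducing memory pressure (a smaller corpus or vocabulary, or a smaller batch size in cakechat/config.py) is the usual remedy.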

Emotion Condition vs. Emotion detection

From what I can make out in the code (get_response in cakechat.api.response), you are using the input emotion category (which the user can set - {joy, anger, sadness, etc.}) to condition the response. So, do I understand correctly that you are not actually detecting any emotion from the user's text input, but rather hardwiring the response's emotion to the input emotion category, no matter what the user's emotion in the input text is?

Looks like you are multiplying the emotion condition (from the input) with the condition ids that you gathered from the tokenized user text. What do these condition ids actually relate to?

condition_ids = transform_conditions_to_ids([emotion] * condition_ids_num,
                                            _cakechat_model.condition_to_index,
                                            condition_ids_num)

Thanks in advance for the clarifications 👍
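
A sketch of the mechanics as they appear from the snippet above: [emotion] * condition_ids_num is plain Python list repetition, so the same user-chosen emotion label is repeated for every position, and nothing is inferred from the text. Below is a hypothetical re-implementation for illustration only; the project's real transform_conditions_to_ids may differ in details such as fallbacks for unknown labels.

def transform_conditions_to_ids(conditions, condition_to_index, num_ids):
    # Look up each condition label in the model's label-to-id mapping.
    return [condition_to_index[c] for c in conditions][:num_ids]

condition_to_index = {'neutral': 0, 'joy': 1, 'anger': 2, 'sadness': 3}
print(transform_conditions_to_ids(['joy'] * 3, condition_to_index, 3))  # [1, 1, 1]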

How to get only 1 sentence of answer?

When I input a question (a single sentence), I receive a response with multiple sentences from my custom model. How can I get an answer with only one sentence?
Thank you for helping!
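
One simple workaround, sketched here as post-processing rather than a change to the decoder itself: truncate the generated reply at its first sentence terminator. Assuming plain-text responses:

import re

def first_sentence(response):
    # Keep everything up to and including the first '.', '!' or '?'.
    match = re.match(r'(.+?[.!?])(\s|$)', response.strip())
    return match.group(1) if match else response.strip()

print(first_sentence('i am fine . how are you ?'))  # -> 'i am fine .'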

Quickstart

Hi there!

I've followed the Quickstart instructions, yet I'm hitting an error that I'm wondering if you have advice on:

simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

And on the server side
127.0.0.1 - - [18/Jan/2019 20:38:11] code 501, message Unsupported method ('POST')
127.0.0.1 - - [18/Jan/2019 20:38:11] "POST /cakechat_api/v1/actions/get_response HTTP/1.1" 501 -

I'm running Docker on Mac, which is working fine (help, ps, etc.). I'll keep exploring, but wanted to post this in case there's something obvious I can try.

Thanks!

Sio

Training own model

Hi,

I've loaded my own training and validation corpus, ran prepare_index_files.py, and trained it with no issue. Afterwards, when I ran python bin/cakechat_server.py, it continually threw this error:

Traceback (most recent call last):
  File "bin/cakechat_server.py", line 10, in <module>
    from cakechat.api.v1.server import app
  File "C:\...\cakechat\cakechat\api\v1\server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "C:\...\cakechat\cakechat\api\response.py", line 14, in <module>
    _cakechat_model = get_trained_model(fetch_from_s3=False)
  File "C:\...\cakechat\cakechat\dialog_model\factory.py", line 53, in get_trained_model
    raise Exception('Can\'t get the model. '
Exception: Can't get the model. Run tools/download_model.py first to get all required files or train it by yourself.

I messed around with get_nn_model() in dialog_model/model.py a bit and realized it was looking for a file named:
processed_dialogs_gru_hd512_drop0.2_encd2_decd2_il30_cs3_ansl32_lr1.0_gc_5.0_learnemb_cdim128_window10_voc11786_vec128_sgTrue

My file in data/nn_models was called:
processed_dialogs_gru_hd7_drop0.2_encd2_decd2_il7_cs3_ansl9_lr1.0_gc_5.0_learnemb_cdim128_window10_voc11786_vec15_sgTrue_pp_free2926.32_sensitive3066.80

I made a copy and renamed it. I then tried to both run the server and train it again, and as soon as it tried to load the model, in both instances I got:
ValueError: mismatch: parameter has shape (11786L, 128L) but value to set has shape (11786L, 15L)

Not really sure where to go from here; thanks in advance. Running Windows / Anaconda with py2.7. All dependencies installed and everything else is running fine thus far. I had it working with the pre-trained model. Tried running the server through both Git Bash and cmd, if that makes a difference. Trained through Bash.
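
A note on what appears to be going on here: the checkpoint filename encodes the hyperparameters the model was trained with (hd = hidden dimension, il = max input/context length, ansl = max answer length, voc = vocabulary size, vec = word-embedding size, and so on), and the loader seemingly builds the network from the current cakechat/config.py settings before filling in the saved weights, which is why it looked for the hd512 name. Renaming a checkpoint trained with hd7/vec15 to a name claiming hd512/vec128 therefore produces exactly this (11786, 128) vs (11786, 15) embedding mismatch; the config values must match the ones the checkpoint was actually trained with. A small illustrative helper, hypothetical and not part of the project, that pulls the numbers out of such a name:

import re

def parse_model_name(name):
    # Extract hyperparameter tokens like hd512 or vec128 from the filename.
    return dict(re.findall(r'(hd|encd|decd|il|cs|ansl|cdim|window|voc|vec)(\d+)', name))

print(parse_model_name('processed_dialogs_gru_hd7_drop0.2_encd2_decd2_il7_cs3_'
                       'ansl9_lr1.0_gc_5.0_learnemb_cdim128_window10_voc11786_vec15_sgTrue'))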

How to deploy model in production?

Hello there! Thanks for the valuable and well-rounded project.

I see that you're using a Flask app to serve model predictions through a simple REST API. Can you please do a guide on how to deploy a trained model in production so it can handle multiple requests at once? And maybe even scale (the number of machines) with the number of requests? Including setting up the VMs, environment, etc.?

The community is really lacking guides like that, so this would be very helpful.
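
A starting point, hedged rather than a full deployment guide: the Flask app is importable as cakechat.api.v1.server:app (visible in the tracebacks above), so the usual first step beyond the built-in dev server is a WSGI server such as Gunicorn; scaling out is then a load balancer in front of several such instances. Assuming Gunicorn is installed:

gunicorn cakechat.api.v1.server:app -b 127.0.0.1:8080 --workers 1 --timeout 600

One worker is a deliberate starting point: each worker process loads its own copy of the model, so the worker count multiplies RAM (and GPU) usage.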

Dataset Format

When preparing for training, I was looking through the sample dataset:

[{"text": "Hello", "condition": "neutral"}, {"text": "Oh, hi! :) How are you, my friend?", "condition": "joy"}, {"text": "Doing good", "condition": "neutral"}]

Which phrase is being said by the model and which is manually typed by the user?

To me, it looks to be of the form [{USER STATEMENT}, {MODEL RESPONSE}, {USER AGAIN}].

But this doesn’t make sense to me. I would think the data should be formatted more like [{MODEL STATEMENT}, {USER RESPONSE}, {MODEL AGAIN}]?

Could someone help me clarify which party is intended to be saying which statement in the example and why it is in that order? Thanks.
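
As far as one can tell from the format, neither reading is quite right: no utterance is permanently "the user's" or "the model's". A dialog line is just an ordered conversation, and training appears to slide a context window over it so that every utterance becomes the target response for the utterances before it. Roughly, as a sketch (the project's real batching code may differ):

dialog = ['Hello', 'Oh, hi! :) How are you, my friend?', 'Doing good']

context_size = 3  # matches the cs3 token in the model filenames above
pairs = [(dialog[max(0, i - context_size):i], dialog[i])
         for i in range(1, len(dialog))]
for context, response in pairs:
    print(context, '->', response)
# ['Hello'] -> Oh, hi! :) How are you, my friend?
# ['Hello', 'Oh, hi! :) How are you, my friend?'] -> Doing good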

Would I need to make a lot of changes to the algorithms to introduce two conditions in each dataset sentence ?

Hey guys, I am trying to put two conditions on each line so that the bot can reply on slightly more specific topics than just the single user condition behind them. Would this require a massive change to the files, or can I just feed more conditions into each dataset and change the condition values in config.py, etc.?
I changed EMOTIONS_TYPES = create_namedtuple_instance() in config.py, and MAX_CONDITIONS_NUM = $ in prepare_index_files. What else would I have to change to have more than one condition?
This is more of a technical discussion rather than an issue. Thanks!
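
One low-effort alternative worth sketching, since the decoder takes a single categorical label per utterance: cross the two condition sets into composite labels, so every utterance still carries exactly one condition, at the price of a larger condition vocabulary and sparser data per label. The second condition set below is hypothetical:

from itertools import product

emotions = ['neutral', 'joy', 'anger', 'sadness', 'fear']
topics = ['sports', 'movies']  # hypothetical second condition set

# One composite label per (emotion, topic) pair, e.g. 'joy|movies'.
composite = ['{}|{}'.format(e, t) for e, t in product(emotions, topics)]
print(len(composite), composite[:3])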

ValueError: numpy.ufunc has the wrong size when launching Docker container

When starting the Docker container built with
sudo docker build -t cakechat:latest -f dockerfiles/Dockerfile.cpu dockerfiles/

and then running "python tools/download_model.py" inside the container, I get the error ValueError: numpy.ufunc has the wrong size.

Complete error below:

Traceback (most recent call last):
  File "bin/cakechat_server.py", line 10, in <module>
    from cakechat.api.v1.server import app
  File "/root/cakechat/cakechat/api/v1/server.py", line 3, in <module>
    from cakechat.api.response import get_response
  File "/root/cakechat/cakechat/api/response.py", line 9, in <module>
    from cakechat.dialog_model.inference import get_nn_responses, warmup_predictor
  File "/root/cakechat/cakechat/dialog_model/inference/__init__.py", line 1, in <module>
    from cakechat.dialog_model.inference.utils import get_sequence_log_probs, get_sequence_score_by_thought_vector, \
  File "/root/cakechat/cakechat/dialog_model/inference/utils.py", line 5, in <module>
    from cakechat.dialog_model.model_utils import get_training_batch
  File "/root/cakechat/cakechat/dialog_model/model_utils.py", line 15, in <module>
    from cakechat.utils.w2v import get_w2v_model
  File "/root/cakechat/cakechat/utils/w2v/__init__.py", line 1, in <module>
    from cakechat.utils.w2v.model import get_w2v_model
  File "/root/cakechat/cakechat/utils/w2v/model.py", line 4, in <module>
    from gensim.models import Word2Vec
  File "/usr/local/lib/python2.7/dist-packages/gensim/__init__.py", line 6, in <module>
    from gensim import parsing, matutils, interfaces, corpora, models, similarities, summarization
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/__init__.py", line 7, in <module>
    from .coherencemodel import CoherenceModel
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/coherencemodel.py", line 30, in <module>
    from gensim.models.wrappers import LdaVowpalWabbit, LdaMallet
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/wrappers/__init__.py", line 8, in <module>
    from .fasttext import FastText
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/wrappers/fasttext.py", line 38, in <module>
    from gensim.models.word2vec import Word2Vec
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/word2vec.py", line 135, in <module>
    from gensim.models.word2vec_inner import train_batch_sg, train_batch_cbow
  File "__init__.pxd", line 861, in init gensim.models.word2vec_inner (./gensim/models/word2vec_inner.c:10917)
ValueError: numpy.ufunc has the wrong size, try recompiling. Expected 192, got 216
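
A hedged note on this error: "numpy.ufunc has the wrong size, try recompiling" means gensim's compiled extension was built against a different numpy ABI than the numpy now installed in the image. Reinstalling the pair so their binaries match usually clears it; a sketch, run inside the container:

pip install --upgrade --force-reinstall numpy gensim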

parameter mismatch error


When I was training the model, I got this parameter mismatch error. I use Windows and Anaconda with Python 2.7. The training corpus is the dummy corpus provided. I did not use Docker, since GPU Docker is not supported on Windows. Thanks a lot!
