shenweichen / deepmatch

A deep matching model library for recommendations & advertising. It's easy to train models and to export representation vectors which can be used for ANN search.

Home Page: https://deepmatch.readthedocs.io/en/latest/

License: Apache License 2.0

Python 100.00%

dssm youtubednn mind collaborative-filtering factorization-machines matching recommendation comirec

deepmatch's Issues

About the test-set labels in the MovieLens split

In preprocess.py, does gen_data_set assign only label 1 to the MovieLens test set, with no 0 labels?

mind example error

Describe the question
Hello, in the YoutubeDNN example I saw code that can also be used with the MIND model:

DeepMatch/examples/run_youtubednn_sampledsoftmax.py

Line 79 in 6d82098

# user_embs = user_embs[:, i, :] i in [0,k_max) if MIND

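When I uncomment this line and delete the trailing "if MIND", I still get a Python syntax error.
What is the purpose of this step? Knowing that would let me fix it in my own code. Thanks.

Operating environment:

python version [3.7.6]
tensorflow version [1.14.0]
deepmatch version [0.1.2]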
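
The commented line reads like shorthand rather than runnable code: "i in [0,k_max)" describes a loop over the interest axis. The sketch below shows one way to read it, assuming MIND returns user embeddings of shape (num_users, k_max, dim). The toy sizes and the name per_interest are illustrative and not from the example script.

```python
import numpy as np

# Assumed shapes: 4 users, k_max = 2 interests per user, embedding dim = 3.
k_max = 2
user_embs = np.arange(4 * k_max * 3, dtype=np.float32).reshape(4, k_max, 3)

# One 2-D matrix per interest; each can be used for ANN search separately.
per_interest = [user_embs[:, i, :] for i in range(k_max)]

for i, embs in enumerate(per_interest):
    print(i, embs.shape)
```

Each element of per_interest is an ordinary (num_users, dim) matrix, which is the shape a YoutubeDNN-style user embedding already has.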

Are the lengths in the SDM preprocessing swapped?

In the example, are prefer_sess_length and short_sess_length swapped?
train_model_input = {"user_id": train_uid, "movie_id": train_iid, "short_movie_id": train_short_item_pad,
"prefer_movie_id": train_prefer_item_pad, "prefer_sess_length": train_short_len, "short_sess_length":
train_prefer_len, 'short_genres': train_short_genres_pad, 'prefer_genres': train_prefer_genres_pad}

SDM raises AttributeError: 'Functional' object has no attribute 'user_input' after loading a previously saved model

Please refer to the FAQ in doc and search for the related issues before you ask the question.

Describe the question
I tried saving and loading the SDM model from deepmatch and got
AttributeError: 'Functional' object has no attribute 'user_input'. Any help is appreciated, thanks.

Additional context

Sample code:
import pandas as pd
from deepctr.feature_column import SparseFeat, VarLenSparseFeat
from preprocess import gen_data_set_sdm, gen_model_input_sdm
from sklearn.preprocessing import LabelEncoder
from tensorflow.python.keras import backend as K
from tensorflow.python.keras import optimizers
from tensorflow.python.keras.models import Model

from deepmatch.models import SDM
from deepmatch.utils import sampledsoftmaxloss

if __name__ == "__main__":
    data = pd.read_csv("./movielens_sample.txt")

sparse_features = ["movie_id", "user_id",
                   "gender", "age", "occupation", "zip", "genres"]

SEQ_LEN_short = 5
SEQ_LEN_prefer = 50

# 1.Label Encoding for sparse features,and process sequence features with `gen_date_set` and `gen_model_input`

features = ['user_id', 'movie_id', 'gender', 'age', 'occupation', 'zip', 'genres']
feature_max_idx = {}
for feature in features:
    lbe = LabelEncoder()
    data[feature] = lbe.fit_transform(data[feature]) + 1
    feature_max_idx[feature] = data[feature].max() + 1

user_profile = data[["user_id", "gender", "age", "occupation", "zip", "genres"]].drop_duplicates('user_id')
item_profile = data[["movie_id"]].drop_duplicates('movie_id')
user_profile.set_index("user_id", inplace=True)

# user_item_list = data.groupby("user_id")['movie_id'].apply(list)

train_set, test_set = gen_data_set_sdm(data, seq_short_len=SEQ_LEN_short, seq_prefer_len=SEQ_LEN_prefer)

train_model_input, train_label = gen_model_input_sdm(train_set, user_profile, SEQ_LEN_short, SEQ_LEN_prefer)
test_model_input, test_label = gen_model_input_sdm(test_set, user_profile, SEQ_LEN_short, SEQ_LEN_prefer)

# 2.count #unique features for each sparse field and generate feature config for sequence feature

embedding_dim = 32
# for sdm,we must provide `VarLenSparseFeat` with name "prefer_xxx" and "short_xxx" and their length
user_feature_columns = [SparseFeat('user_id', feature_max_idx['user_id'], 16),
                        SparseFeat("gender", feature_max_idx['gender'], 16),
                        SparseFeat("age", feature_max_idx['age'], 16),
                        SparseFeat("occupation", feature_max_idx['occupation'], 16),
                        SparseFeat("zip", feature_max_idx['zip'], 16),
                        VarLenSparseFeat(SparseFeat('short_movie_id', feature_max_idx['movie_id'], embedding_dim,
                                                    embedding_name="movie_id"), SEQ_LEN_short, 'mean',
                                         'short_sess_length'),
                        VarLenSparseFeat(SparseFeat('prefer_movie_id', feature_max_idx['movie_id'], embedding_dim,
                                                    embedding_name="movie_id"), SEQ_LEN_prefer, 'mean',
                                         'prefer_sess_length'),
                        VarLenSparseFeat(SparseFeat('short_genres', feature_max_idx['genres'], embedding_dim,
                                                    embedding_name="genres"), SEQ_LEN_short, 'mean',
                                         'short_sess_length'),
                        VarLenSparseFeat(SparseFeat('prefer_genres', feature_max_idx['genres'], embedding_dim,
                                                    embedding_name="genres"), SEQ_LEN_prefer, 'mean',
                                         'prefer_sess_length'),
                        ]

item_feature_columns = [SparseFeat('movie_id', feature_max_idx['movie_id'], embedding_dim)]

K.set_learning_phase(True)

import tensorflow as tf

if tf.__version__ >= '2.0.0':
    tf.compat.v1.disable_eager_execution()

# units must be equal to item embedding dim!
model = SDM(user_feature_columns, item_feature_columns, history_feature_list=['movie_id', 'genres'],
            units=embedding_dim, num_sampled=100, )

model.compile(optimizer='adam', loss=sampledsoftmaxloss)  # "binary_crossentropy")

history = model.fit(train_model_input, train_label,  # train_label,
                    batch_size=512, epochs=1, verbose=1, validation_split=0.0, )
model_name = './sdm_model.h5'
model.save(filepath=model_name)

K.set_learning_phase(False)
# from keras_bert import get_custom_objects

from deepmatch.layers import *
from deepctr.layers.utils import *

loaded_model = tf.keras.models.load_model(model_name,
                                          custom_objects={'EmbeddingIndex': EmbeddingIndex,
                                                          'AttentionSequencePoolingLayer': AttentionSequencePoolingLayer,
                                                          'DynamicMultiRNN': DynamicMultiRNN,
                                                          'SelfMultiHeadAttention': SelfMultiHeadAttention,
                                                          'UserAttention': UserAttention,
                                                          'PoolingLayer': PoolingLayer,
                                                          'SampledSoftmaxLayer': SampledSoftmaxLayer,
                                                          'NoMask': NoMask,
                                                          'sampledsoftmaxloss': sampledsoftmaxloss
                                                          })

# # 3.Define Model,train,predict and evaluate
test_user_model_input = test_model_input
all_item_model_input = {"movie_id": item_profile['movie_id'].values, }

user_embedding_model = Model(inputs=loaded_model.user_input, outputs=loaded_model.user_embedding)
item_embedding_model = Model(inputs=loaded_model.item_input, outputs=loaded_model.item_embedding)

user_embs = user_embedding_model.predict(test_user_model_input, batch_size=2 ** 12)
# user_embs = user_embs[:, i, :]  # i in [0,k_max) if MIND
item_embs = item_embedding_model.predict(all_item_model_input, batch_size=2 ** 12)

print(user_embs)
print(item_embs.shape)

Error message:
Traceback (most recent call last):
File "C:/Users/HP/Desktop/DeepMatch-master/examples/run_sdm_test.py", line 107, in <module>
user_embedding_model = Model(inputs=loaded_model.user_input, outputs=loaded_model.user_embedding)
AttributeError: 'Functional' object has no attribute 'user_input'

Operating environment:

python version [3.7.5]
tensorflow version [2.4.0]
deepmatch version [0.2.0]

from tensorflow.python.keras._impl.keras.layers import Lambda ImportError: No module named _impl.keras.layers

python run_dssm_negsampling.py
Traceback (most recent call last):
File "run_dssm_negsampling.py", line 7, in <module>
from deepmatch.models import *
File "/miniconda2/lib/python2.7/site-packages/deepmatch/__init__.py", line 1, in <module>
from .utils import check_version
File "/miniconda2/lib/python2.7/site-packages/deepmatch/utils.py", line 23, in <module>
from tensorflow.python.keras._impl.keras.layers import Lambda
ImportError: No module named _impl.keras.layers

Bug in the SDM preprocessing code

Describe the bug
The SDM preprocessing code is wrong.

To Reproduce
DeepMatch/examples/preprocess.py, lines 107-108:
"prefer_sess_length": train_short_len, "short_sess_length": train_prefer_len
Are the values of these two fields swapped, i.e. the short and prefer sequence lengths?

Environment setup fails when running the movielens YoutubeDNN demo on Colab

Describe the question
I ran the demo from the official site directly and it failed at the very first step.

After that, deepctr also ran into a package loading problem.

Operating environment:
Default Colab environment

The following problem appeared during installation:

Features question

user_feature_columns = [SparseFeat('user_id', feature_max_idx['user_id'], embedding_dim),
SparseFeat("gender", feature_max_idx['gender'], embedding_dim),
SparseFeat("age", feature_max_idx['age'], embedding_dim),
SparseFeat("occupation", feature_max_idx['occupation'], embedding_dim),
SparseFeat("zip", feature_max_idx['zip'], embedding_dim),
VarLenSparseFeat(SparseFeat('hist_movie_id', feature_max_idx['movie_id'], embedding_dim,
embedding_name="movie_id"), SEQ_LEN, 'mean', 'hist_len'),
]

item_feature_columns = [SparseFeat('movie_id', feature_max_idx['movie_id'], embedding_dim)]

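VarLenSparseFeat embeds the historical movie sequence and then averages the embeddings, right? Is the embedding vector it uses for each movie the same as the movie embedding vector in item_feature_columns?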
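
A minimal sketch of what the question describes, assuming that features declaring the same embedding_name ("movie_id" here) look up rows in one shared table and that the 'mean' combiner averages only the first hist_len positions. The table values and the helper mean_pool are hypothetical and only illustrate the idea.

```python
# Toy embedding table shared by "movie_id" and "hist_movie_id" (row 0 is padding).
movie_table = [
    [0.0, 0.0],   # id 0: padding
    [1.0, 2.0],   # id 1
    [3.0, 4.0],   # id 2
    [5.0, 6.0],   # id 3
]


def mean_pool(hist_ids, hist_len, table):
    """Average the embeddings of the first hist_len ids, ignoring padding."""
    dim = len(table[0])
    if hist_len == 0:
        return [0.0] * dim
    rows = [table[i] for i in hist_ids[:hist_len]]
    return [sum(r[d] for r in rows) / hist_len for d in range(dim)]


hist_movie_id = [1, 3, 0, 0]   # history padded to SEQ_LEN = 4
hist_len = 2

user_hist_vec = mean_pool(hist_movie_id, hist_len, movie_table)
item_vec = movie_table[3]      # the same row the history lookup used for id 3
print(user_hist_vec, item_vec)
```

Under this reading, a movie seen in the history and the same movie on the item side are represented by the same vector; only the pooling differs.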

Error when computing the predicted item vectors with item_embedding_model

_SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'pooling_layer_3/Identity:0' shape=(209, 16) dtype=float32>]

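I don't understand the last step of DSSM

Hello, I have some questions about the last step of DSSM.

At the end, DSSM computes the cosine similarity between the user output and the item output, applies a sigmoid, and then takes the cross entropy. Cosine similarity restricts the score to the range -1 to 1, so applying a sigmoid and cross entropy on top of it looks very odd.

I think it should take the inner product and pass it directly to sigmoid_cross_entropy to get the loss function.

Our own practice shows that this approach also works. Could the author explain the reasoning here?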
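
The concern can be checked with a few lines of arithmetic. The sketch below is not DeepMatch code; it only evaluates a sigmoid on the two extremes of a cosine score and shows the smallest cross entropy a positive pair can reach. The scale factor gamma is a hypothetical addition for comparison.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce(label, prob):
    """Binary cross entropy for a single example."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


# Cosine similarity is confined to [-1, 1], so the sigmoid output is bounded too.
p_min, p_max = sigmoid(-1.0), sigmoid(1.0)
best_pos_loss = bce(1, p_max)   # loss for a perfectly aligned positive pair

# A hypothetical scale factor widens the range of the logit.
gamma = 10.0
scaled_pos_loss = bce(1, sigmoid(gamma * 1.0))

print(round(p_min, 3), round(p_max, 3))
print(round(best_pos_loss, 3), round(scaled_pos_loss, 6))
```

With an unscaled cosine the predicted probability never leaves a narrow band around 0.5, so the loss has a floor well above zero; scaling the cosine or using an inner product removes that floor.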

Loading a saved run_youtubednn model fails with Unknown layer: EmbeddingIndex

Describe the bug
After saving the run_youtubednn model, loading it back raises Unknown layer: EmbeddingIndex.

To Reproduce
history = model.fit(train_model_input, train_label, # train_label,
batch_size=256, epochs=20, verbose=1, validation_split=0.0, )

model_path = './my_model'
model.save(model_path)
new_model = tf.keras.models.load_model(model_path)

python version 3.7
tensorflow version 1.15.0
deepmatch version 0.2.0

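How to handle multiple item features in MIND

I want to add multiple item features to MIND, for example movie_id and genres from movielens. I am a beginner, so I am trying to stay as close to your code as possible. The problem I have hit is that SampledSoftmaxLayer needs the embeddings of all item features, but the embeddings of different item features have different vocabulary_size values. How can the embeddings of several item features be concatenated?

Specifically, it is this code in mind.py. What should be done when item_features contains several features?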

item_inputs_list = list(item_features.values())
item_embedding_matrix = embedding_matrix_dict[item_feature_name]
item_index = EmbeddingIndex(list(range(item_vocabulary_size)))(item_features[item_feature_name])
item_embedding_weight = NoMask()(item_embedding_matrix(item_index))
pooling_item_embedding_weight = PoolingLayer()([item_embedding_weight])

output = SampledSoftmaxLayer(num_sampled=num_sampled)(
    inputs=(pooling_item_embedding_weight,user_embedding_final, item_features[item_feature_name]))

Many thanks.

A question about computing the user embedding with LabelAwareAttention

While reading the code that calls MIND, I had a small question about LabelAwareAttention: if this attention mechanism is used, shouldn't the returned user embedding be user_embedding_final (the user embedding at line 150 of deepmatch.models.mind)? That is, change model.__setattr__("user_embedding", user_embeddings) to model.__setattr__("user_embedding", user_embedding_final).
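
For context, here is a small sketch of label-aware attention as described in the MIND paper: each interest capsule is weighted by softmax(pow(score, p)), where score is its inner product with the target item. It is a plain-Python illustration under that assumption, not the library's layer; the function name and toy vectors are made up.

```python
import math


def label_aware_attention(interests, target, p=1.0):
    """Return (weights, pooled) where weights = softmax(score ** p)."""
    scores = [sum(a * b for a, b in zip(cap, target)) for cap in interests]
    powered = [s ** p for s in scores]          # toy data keeps scores >= 0
    m = max(powered)
    exps = [math.exp(x - m) for x in powered]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(target)
    pooled = [sum(w * cap[d] for w, cap in zip(weights, interests))
              for d in range(dim)]
    return weights, pooled


interests = [[1.0, 0.0], [0.0, 1.0]]   # K = 2 interest capsules for one user
target = [1.0, 0.0]                    # embedding of the label item
weights, pooled = label_aware_attention(interests, target, p=2.0)
print(weights, pooled)
```

Note that the pooled vector depends on the target item. At retrieval time no target is known yet, which may be why the un-pooled interest vectors are the ones exposed for export, although the issue itself does not confirm this.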

What is the item_embedding/user_embedding in ncf?

I wonder how ncf is being used in retrieving and which layer is its item_embedding/user_embedding? I didn't find it in ncf code.

a problem about EmbeddingIndex layer

Bug when running the YoutubeDNN movie dataset demo on TF 2.1 GPU

Running the YoutubeDNN movie dataset demo produces a bug.

Operating environment:

python version [3.6]
tensorflow version [2.1 - GPU]
deepmatch version [0.1.2]

When execution reaches this line:
item_embs = item_embedding_model.predict(all_item_model_input, batch_size=2 ** 12)
the following bug appears (after some searching, it may be caused by eager mode in TF 2.0+):

TypeError Traceback (most recent call last)
D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
60 op_name, inputs, attrs,
---> 61 num_outputs)
62 except core._NotOkStatusException as e:

TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
@tf.function
def has_init_scope():
my_constant = tf.constant(1.)
with tf.init_scope():
added = my_constant * 2
The graph tensor has name: pooling_layer/Identity:0

During handling of the above exception, another exception occurred:

_SymbolicException Traceback (most recent call last)
in
----> 1 item_embs = item_embedding_model.predict(all_item_model_input, batch_size=2)

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\keras\engine\training.py in predict(self, x, batch_size, verbose, steps, callbacks, max_queue_size, workers, use_multiprocessing)
1011 max_queue_size=max_queue_size,
1012 workers=workers,
-> 1013 use_multiprocessing=use_multiprocessing)
1014
1015 def reset_metrics(self):

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in predict(self, model, x, batch_size, verbose, steps, callbacks, max_queue_size, workers, use_multiprocessing, **kwargs)
496 model, ModeKeys.PREDICT, x=x, batch_size=batch_size, verbose=verbose,
497 steps=steps, callbacks=callbacks, max_queue_size=max_queue_size,
--> 498 workers=workers, use_multiprocessing=use_multiprocessing, **kwargs)
499
500

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in _model_iteration(self, model, mode, x, y, batch_size, verbose, sample_weight, steps, callbacks, max_queue_size, workers, use_multiprocessing, **kwargs)
473 mode=mode,
474 training_context=training_context,
--> 475 total_epochs=1)
476 cbks.make_logs(model, epoch_logs, result, mode)
477

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in run_one_epoch(model, iterator, execution_function, dataset_size, batch_size, strategy, steps_per_epoch, num_samples, mode, training_context, total_epochs)
126 step=step, mode=mode, size=current_batch_size) as batch_logs:
127 try:
--> 128 batch_outs = execution_function(iterator)
129 except (StopIteration, errors.OutOfRangeError):
130 # TODO(kaftan): File bug about tf function and errors.OutOfRangeError?

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py in execution_function(input_fn)
96 # numpy translates Tensors to values in Eager mode.
97 return nest.map_structure(_non_none_constant_value,
---> 98 distributed_function(input_fn))
99
100 return execution_function

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\eager\def_function.py in __call__(self, *args, **kwds)
566 xla_context.Exit()
567 else:
--> 568 result = self._call(*args, **kwds)
569
570 if tracing_count == self._get_tracing_count():

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\eager\def_function.py in _call(self, *args, **kwds)
604 # In this case we have not created variables on the first call. So we can
605 # run the first trace but we should fail if variables are created.
--> 606 results = self._stateful_fn(*args, **kwds)
607 if self._created_variables:
608 raise ValueError("Creating variables on a non-first call to a function"

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\eager\function.py in __call__(self, *args, **kwargs)
2361 with self._lock:
2362 graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
-> 2363 return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
2364
2365 @property

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\eager\function.py in _filtered_call(self, args, kwargs)
1609 if isinstance(t, (ops.Tensor,
1610 resource_variable_ops.BaseResourceVariable))),
-> 1611 self.captured_inputs)
1612
1613 def _call_flat(self, args, captured_inputs, cancellation_manager=None):

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\eager\function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1690 # No tape is watching; skip to running the function.
1691 return self._build_call_outputs(self._inference_function.call(
-> 1692 ctx, args, cancellation_manager=cancellation_manager))
1693 forward_backward = self._select_forward_and_backward_functions(
1694 args,

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\eager\function.py in call(self, ctx, args, cancellation_manager)
543 inputs=args,
544 attrs=("executor_type", executor_type, "config_proto", config),
--> 545 ctx=ctx)
546 else:
547 outputs = execute.execute_with_cancellation(

D:\Users\wangpeng.BEVOL\Anaconda3\envs\py36\lib\site-packages\tensorflow_core\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
73 raise core._SymbolicException(
74 "Inputs to eager execution function cannot be Keras symbolic "
---> 75 "tensors, but found {}".format(keras_symbolic_tensors))
76 raise e
77 # pylint: enable=protected-access

_SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'pooling_layer/Identity:0' shape=(3707, 32) dtype=float32>]

The label passed to model.fit should be movie_id rather than a 0/1 label

Describe the bug
tf.nn.sampled_softmax_loss is given movie_id, so shouldn't the label passed to model.fit be movie_id rather than a 0/1 label?

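AttributeError: 'Model' object has no attribute 'user_input' when running the DSSM sample

Running the DSSM sample raises AttributeError: 'Model' object has no attribute 'user_input'. How can this be solved?

Can the SDM model support dense features?

Line 60 of the SDM model code currently reports that dense features are not supported: raise ValueError("Now SDM don't support dense feature"). Could this support be added?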

Mind test failed

Describe the question

When running the MIND model, step 5 (ANN search and evaluating the results) does not run.

At the line D, I = index.search(np.ascontiguousarray(user_embs), 50), the error too many values to unpack (expected 2) is raised.

Operating environment:

python version 3.6.2
tensorflow version tensorflow-gpu==1.14.0
deepmatch version 0.2.0
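
One likely cause, offered as an assumption rather than a confirmed diagnosis: MIND produces user embeddings with three axes (num_users, k_max, dim), while the ANN search expects a 2-D matrix of query vectors. The sketch below shows two ways to obtain 2-D input with NumPy; the toy sizes and the owner array are illustrative.

```python
import numpy as np

num_users, k_max, dim = 3, 2, 4
user_embs = np.random.RandomState(0).rand(num_users, k_max, dim).astype(np.float32)

# Option 1: search with a single interest at a time.
first_interest = np.ascontiguousarray(user_embs[:, 0, :])

# Option 2: flatten all interests into rows, search once, then regroup by user.
flat = np.ascontiguousarray(user_embs.reshape(-1, dim))
owner = np.repeat(np.arange(num_users), k_max)   # row index -> user index

print(first_interest.shape, flat.shape, owner.tolist())
```

With option 2, the search results for rows that share the same owner value belong to one user and can be merged, for example by keeping the best-scoring items across that user's interests.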

run_youtubednn.py fails

Describe the question
When executing run_youtubednn.py, I get this error:

Traceback (most recent call last):
File "/data/gitlab/test/DeepMatch/examples/run_youtubednn.py", line 62, in <module>
model = YoutubeDNN(user_feature_columns, item_feature_columns, num_sampled=5, user_dnn_hidden_units=(64, embedding_dim))
File "/data/gitlab/tes/DeepMatch/deepmatch/models/youtubednn.py", line 57, in YoutubeDNN
dnn_use_bn, output_activation=output_activation, seed=seed)(user_dnn_input)
File "/data/tools/anacoda/anacoda-install/envs/tf-py3.7/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_v1.py", line 784, in __call__
outputs = call_fn(cast_inputs, *args, **kwargs)
File "/data/tools/anacoda/anacoda-install/envs/tf-py3.7/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py", line 699, in wrapper
raise e.ag_error_metadata.to_exception(e)
tensorflow.python.framework.errors_impl.FailedPreconditionError: in user code:

File "/data/tools/anacoda/anacoda-install/envs/tf-py3.7/lib/python3.7/site-packages/deepctr/layers/core.py", line 193, in call  *
    fc = self.activation_layers[i](fc)
File "/data/tools/anacoda/anacoda-install/envs/tf-py3.7/lib/python3.7/site-packages/keras/engine/base_layer_v1.py", line 732, in __call__  **
    base_layer_utils.create_keras_history(inputs)
File "/data/tools/anacoda/anacoda-install/envs/tf-py3.7/lib/python3.7/site-packages/keras/engine/base_layer_utils.py", line 175, in create_keras_history
    _, created_layers = _create_keras_history_helper(tensors, set(), [])
File "/data/tools/anacoda/anacoda-install/envs/tf-py3.7/lib/python3.7/site-packages/keras/engine/base_layer_utils.py", line 251, in _create_keras_history_helper
    constants[i] = backend.function([], op_input)([])
File "/data/tools/anacoda/anacoda-install/envs/tf-py3.7/lib/python3.7/site-packages/keras/backend.py", line 4187, in __call__
    run_metadata=self.run_metadata)

FailedPreconditionError: Could not find variable dnn/bias0. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status error message=Container localhost does not exist. (Could not find resource: localhost/dnn/bias0)
	 [[{{node dnn/BiasAdd/ReadVariableOp}}]]

Additional context
no

Operating environment:

python version 3.7
tensorflow version 2.7.0
deepmatch version 0.2.0

Is YoutubeDNN missing a hidden layer?

In the original paper, the model architecture is a DNN with three hidden layers, but here user_dnn_hidden_units has only two layers?
def YoutubeDNN(user_feature_columns, item_feature_columns, num_sampled=5,
user_dnn_hidden_units=(64, 32),
dnn_activation='relu', dnn_use_bn=False,
l2_reg_dnn=0, l2_reg_embedding=1e-6, dnn_dropout=0, output_activation='linear', seed=1024, ):
Does anyone know the reason? Thanks!

Is the output of YouTubeDNN the sample-softmax-loss?

Describe the question
Would anyone please help clarify my understanding of YouTubeDNN and its usage with negative sampling?

In the example "YoutubeDNN/MIND with sampled softmax", the model architecture is created using the following line,

model = YoutubeDNN(user_feature_columns, item_feature_columns, num_sampled=5, user_dnn_hidden_units=(64, embedding_dim))
model.compile(optimizer="adam", loss=sampledsoftmaxloss)

By checking the source code,

output = SampledSoftmaxLayer(num_sampled=num_sampled)( [pooling_item_embedding_weight, user_dnn_out, item_features[item_feature_name]])

In SampledSoftmaxLayer, during the calculation of the sampled softmax loss, the above three arguments correspond to

weights=pooling_item_embedding_weight, i.e., the movie id embedding matrix
labels=item_features[item_feature_name], i.e., the movie ids
inputs=user_dnn_out, i.e., the user embedding

Question 1. Am I right to say that the final output from the YouTubeDNN model is the loss value (i.e., 1-dimension)?

If the answer to the above question is yes, I am confused by the following line:
history = model.fit(train_model_input, train_label, batch_size=256, epochs=1, verbose=1, validation_split=0.0, )

Because in the example, train_label is defined as a binary variable containing 0 or 1, indicating whether the instance is a positive or negative sample. In addition, since negsample = 0, the values in train_label are all 1s (i.e., no negative samples). Thus,

Question 2. How is train_label being used during the training when calculating the loss?

Question 3. If I set negsample = 1 or any non-zero values, how am I supposed to use YouTubeDNN method?

Thanks. Any advice is appreciated.
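
The three questions can be explored with a simplified, framework-free sketch. It assumes what the quoted source line suggests: the layer receives the item ids as a model input and emits a per-sample loss, so a Keras-style loss function only needs to average that output. Both sampled_softmax_loss (uniform negatives, no sampling-probability correction) and passthrough_loss are hypothetical stand-ins, not the TensorFlow or DeepMatch implementations.

```python
import math
import random


def sampled_softmax_loss(user_vec, item_table, pos_id, num_sampled, rng):
    """Cross entropy over the positive item plus num_sampled random negatives."""
    candidates = [i for i in range(len(item_table)) if i != pos_id]
    neg_ids = rng.sample(candidates, num_sampled)
    ids = [pos_id] + neg_ids                      # positive sits at position 0
    logits = [sum(u * v for u, v in zip(user_vec, item_table[i])) for i in ids]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[0]


def passthrough_loss(y_true, y_pred):
    """Hypothetical Keras-style loss: ignores y_true and averages y_pred."""
    return sum(y_pred) / len(y_pred)


rng = random.Random(0)
item_table = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
user_vec = [2.0, 0.0]
movie_ids = [1, 3]        # the item ids carried by the model input

per_sample = [sampled_softmax_loss(user_vec, item_table, m, 2, rng)
              for m in movie_ids]
batch_loss = passthrough_loss([1, 1], per_sample)
print(per_sample, batch_loss)
```

In this sketch the positive class comes from movie_ids and the negatives are drawn inside the loss, so the 0/1 array passed as the label never influences the value. If the library behaves the same way, that would also explain why the example leaves negsample at 0.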

Operating environment:

python version [e.g. 3.7]
tensorflow version [e.g. 2.3.0,]
deepmatch version [e.g. 0.1.3,]

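MemoryError in the preprocess.py lists with tens of millions of rows

Symptom: when the training data exceeds 50 million rows, memory usage rises sharply while train_set is being built, and a machine with 64 GB of memory dies halfway through with the usual MemoryError. Reading the full dataframe into memory on its own is fine; my guess is that with SEQ_LEN = 50, attaching a sequence of up to 50 items to every record makes memory consumption grow too fast.

Where the problem occurs:
The data processing part of the examples: preprocess.py => def gen_data_set(data, negsample=0): => train_set.append((reviewerID, hist[::-1], pos_list[i], 1, len(hist[::-1]),rating_list[i]))

Question:
How do people solve this kind of memory problem? Move the preprocessing onto a cluster? Running preprocess.py locally on Windows 10 is currently very slow, so my initial plan is to move the preprocessing to a Spark cluster and have model training read the preprocessed data directly.

Thanks!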
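
One option that does not need a cluster is to stop materialising every sample in a Python list. The sketch below is a hypothetical generator, not the code in preprocess.py: it yields one sample at a time and caps each history at seq_len before it is stored, so a consumer can write samples to disk or feed them to training in chunks.

```python
def iter_samples(user_histories, seq_len):
    """Yield (user_id, history, target_item, hist_len) one sample at a time."""
    for user_id, items in user_histories.items():
        for i in range(1, len(items)):
            # Most recent item first, truncated so no sample exceeds seq_len.
            hist = items[:i][::-1][:seq_len]
            yield user_id, hist, items[i], len(hist)


user_histories = {1: [10, 11, 12, 13], 2: [20, 21]}   # items sorted by time
samples = list(iter_samples(user_histories, seq_len=2))
for sample in samples:
    print(sample)
```

The list() call is only there to show the output; in a real pipeline the generator would be consumed lazily. Unlike the quoted append call, this sketch stores the truncated length rather than the full history length, which is a design choice of the sketch.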

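Training with data read in batches

In a big-data setting, loading the full dataset onto the GPU for training is clearly unrealistic. With TF 2.4.0 I tried feeding preprocessed data to the DeepMatch model in batches and ran into the following problems:
1. Batching the data with tf.data.Dataset.from_tensor_slices for training does not work after tf.compat.v1.disable_eager_execution() has been called. The error is: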
D:\Anaconda3\envs\TF2GPU\lib\site-packages\tensorflow\python\keras\backend.py:434: UserWarning: tf.keras.backend.set_learning_phase is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the training argument of the __call__ method of your layer or model.
warnings.warn('tf.keras.backend.set_learning_phase is deprecated and '
Traceback (most recent call last):
File "F:/python/DeepMatch-master/examples/bp_mind_emr_distr_index_batch.py", line 77, in <module>
for batch_train_model_input,batch_train_label in tf.data.Dataset.from_tensor_slices((test_model_input)).batch(2): #.shuffle(1000):
File "D:\Anaconda3\envs\TF2GPU\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 424, in __iter__
raise RuntimeError("iter() is only supported inside of tf.function "
RuntimeError: iter() is only supported inside of tf.function or when eager execution is enabled.

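2. Reading batches by subclassing class DataGenerator(keras.utils.Sequence): this currently runs in single-machine mode, but CPU and GPU utilization stay low.

3. I then tried generate_arrays_from_file(data,batch_size=256) to read batches for training. It runs normally on a single Linux server, although CPU and GPU utilization stay low, but it does not run on Windows 10 with a GPU (it raises an error; I later learned that multiprocessing with workers=4,use_multiprocessing=True is not possible on Windows 10).

I also looked at how the DeepCTR code is written. It too loads nearly all of the data at once, and I did not see batches being fed to the GPU. How have others solved this kind of problem?

Thanks!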
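
As a framework-independent starting point, batches can be cut from the input dictionary with a plain generator. The helper below is hypothetical and only shows the slicing logic; whether its output can be passed to a per-batch training call with these models in graph mode is not something the issue confirms.

```python
def iter_batches(model_input, labels, batch_size):
    """Yield (input_dict, label_list) slices of at most batch_size rows."""
    n = len(labels)
    for start in range(0, n, batch_size):
        end = min(start + batch_size, n)
        batch_x = {name: values[start:end] for name, values in model_input.items()}
        yield batch_x, labels[start:end]


model_input = {"user_id": [1, 2, 3, 4, 5], "movie_id": [10, 20, 30, 40, 50]}
labels = [1, 1, 1, 1, 1]

batches = list(iter_batches(model_input, labels, batch_size=2))
for batch_x, batch_y in batches:
    print(batch_x, batch_y)
```

The same slicing works on NumPy arrays, and the generator can wrap a file reader so that only one batch is held in memory at a time.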

Is there any plan to make a PyTorch version, as was done for the DeepCTR project?

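run_sdm.py fails because SparseFeat is missing from the deepctr dependency

Describe the bug
When running run_sdm.py, SparseFeat is missing from the deepctr dependency.

To Reproduce
I did not install the deepctr package; instead I ran git clone on the DeepCTR repository.
In DeepMatch/example/run_sdm.py, the line from deepctr.inputs import SparseFeat, VarLenSparseFeat fails with ImportError: cannot import name SparseFeat.
Looking through DeepCTR/deepctr/inputs.py, I could not find SparseFeat or VarLenSparseFeat there either.
Is the deepctr package that needs to be installed out of sync with the deepctr code repository?

Operating environment: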

python version 2.7
tensorflow version 1.11
deepmatch version

Low recall: using the DSSM model with faiss retrieval, recall@100 is only about 0.2. Are there any ways to improve this?
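
Before tuning the model, it can help to pin down how recall@k is being computed. The sketch below uses one common definition, the share of each user's held-out items found in the top k retrieved items, averaged over users. The function and the toy data are illustrative and not taken from the DeepMatch examples.

```python
def recall_at_k(retrieved, relevant, k):
    """Mean over users of |top-k retrieved items that are relevant| / |relevant|."""
    total = 0.0
    for user, items in retrieved.items():
        truth = relevant[user]
        hits = len(set(items[:k]) & truth)
        total += hits / len(truth)
    return total / len(retrieved)


retrieved = {"u1": [3, 1, 2], "u2": [5, 6, 7]}   # ranked ANN results per user
relevant = {"u1": {1}, "u2": {9}}                # held-out items per user

print(recall_at_k(retrieved, relevant, k=1))
print(recall_at_k(retrieved, relevant, k=2))
```

With a single held-out item per user this reduces to a hit rate, so a value near 0.2 means the true item appears in the top 100 for roughly one user in five.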

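Some suggestions and questions about FM

Suggestion:

FM hard-codes binary and does not offer regression. On my dataset with tens of millions of rows, regressing on the rating score gave better AUC and other metrics than binary classification; the corresponding loss was "mse".

Questions:

I could not find the first-order linear part or the second-order interaction part of the model structure in fm.py; I have not yet had time to read the encapsulated parts carefully.
I changed fm.py to: score = tf.reduce_sum(Add()([user_vector_sum, item_vector_sum]), axis=1, keepdims=True). Convergence was slow and AUC dropped to 0.66, far below the original cosine similarity and also below matrix factorization.
Recall is actually lower than with matrix factorization. Although the test-set AUC reaches 0.92 (0.96 on the training set), negative samples score about 0.5 and positive samples about 0.85, while the top 500 recalled items all score 0.99, so offline recall cannot beat matrix factorization.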
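
On the question of where the second-order part is: in a two-tower setting the cross-side interactions can be hidden inside a single inner product. The sketch below checks two standard identities numerically, under the assumption that the matching score keeps only user-item pairs. The helper names and toy vectors are illustrative and not taken from fm.py.

```python
from itertools import combinations


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def vec_sum(vectors):
    return [sum(v[d] for v in vectors) for d in range(len(vectors[0]))]


def pairwise_interactions(vectors):
    """Explicit sum of <v_i, v_j> over all pairs i < j."""
    return sum(dot(a, b) for a, b in combinations(vectors, 2))


def fm_trick(vectors):
    """0.5 * (square of sum - sum of squares), the usual FM shortcut."""
    s = vec_sum(vectors)
    return 0.5 * (dot(s, s) - sum(dot(v, v) for v in vectors))


def cross_side_score(user_vectors, item_vectors):
    """Inner product of the summed user vectors and the summed item vectors."""
    return dot(vec_sum(user_vectors), vec_sum(item_vectors))


vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
user_vectors, item_vectors = vectors[:2], vectors[2:]

explicit_cross = sum(dot(u, v) for u in user_vectors for v in item_vectors)
print(pairwise_interactions(vectors), fm_trick(vectors))
print(explicit_cross, cross_side_score(user_vectors, item_vectors))
```

The second identity is the relevant one for retrieval: summing each side first and taking one inner product equals the sum of all user-item pairwise interactions, which is what makes the score usable with ANN search. Adding the two summed vectors instead, as in the modified score above, computes a different quantity.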

How to load saved SDM weights properly to reproduce embeddings?

Describe the question
After I saved the SDM weights and loaded them in another process, the model produced different user embeddings.

How can the SDM model be saved and then loaded so that the embeddings are reproduced?

Operating environment:

python version [e.g. 3.7.3]
tensorflow version [e.g. 2.2.0,]
deepmatch version [GPU e.g. 0.1.3,]

Which version of the movie data is used in the example? Is there more data?

Hi,
Where does the data used in the code come from?
I would like more data to try it out.
Thanks

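Hello, can the DSSM model also take continuous user-item features, such as how long a user watched a movie? How would such a feature be added? Also, when I try to print the model structure, I keep finding that some attribute name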

Saving the run_youtubednn model fails

Describe the bug
Saving the YoutubeDNN model with model.save raises an error.

To Reproduce
Add the two lines that save the model after the history line:
history = model.fit(train_model_input, train_label, # train_label,
batch_size=256, epochs=20, verbose=1, validation_split=0.0, )

model_path = './my_model'
model.save(model_path)

# 4. Generate user features for testing and full item features for retrieval
test_user_model_input = test_model_input
all_item_model_input = {"movie_id": item_profile['movie_id'].values}

Operating environment:

python version 3.7.3
tensorflow version 2.2.0
deepmatch version 0.2.0

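A question about distributed training

Does the current version support distributed training? Could you briefly describe what would need to change to support it?

From the code, samples are organized as (q,D+), (q,D-), ... rather than (q,D+,D-,D-,D-,D-) as in the paper. What is the difference between the two?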
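
The difference can be made concrete by writing both objectives for one query. The sketch below is a generic illustration, not DeepMatch code: pointwise_loss treats every (q, D) pair as an independent binary example, while listwise_loss applies a softmax over the group (q, D+, D-, ..., D-).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pointwise_loss(pos_score, neg_scores):
    """Sum of binary cross entropies over independent (q, D) pairs."""
    loss = -math.log(sigmoid(pos_score))
    loss += sum(-math.log(1.0 - sigmoid(s)) for s in neg_scores)
    return loss


def listwise_loss(pos_score, neg_scores):
    """Softmax cross entropy over one group (q, D+, D-, ..., D-)."""
    scores = [pos_score] + list(neg_scores)
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - pos_score


base = (2.0, [0.0, 0.0, 0.0, 0.0])
shifted = (7.0, [5.0, 5.0, 5.0, 5.0])   # every score raised by the same 5.0

print(listwise_loss(*base), listwise_loss(*shifted))
print(pointwise_loss(*base), pointwise_loss(*shifted))
```

The grouped form only cares about how the positive ranks against its own negatives, so shifting all scores leaves it unchanged. The pairwise-label form also pushes each score toward an absolute target, so the same shift changes its value.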

A question about the routing logits in MIND's dynamic routing

DeepMatch/deepmatch/layers/core.py

Line 175 in 6b059ca

self.routing_logits = self.add_weight(shape=[1, self.k_max, self.max_len],

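As the title says, are the routing logits in the capsule layer a global parameter? From the paper, they seem to be randomly initialized before the routing process.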

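Bug in the gen_data_set function in DeepMatch/example/preprocess.py

Describe the bug
This line, and the code below it where hist[::-1] appears, should not reverse the list, because after the following line has run:

data.sort_values("timestamp", inplace=True)

the user behaviors are already sorted by time in ascending order. Reversing again with hist[::-1] leads to a large amount of duplicated data in the variable train_seq_pad in the function gen_model_input below:

These samples are completely identical, which cannot happen.

To Reproduce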
Steps to reproduce the behavior:

Go to 'DeepMatch/example/'
Click on 'python run_dssm_negsampling.py'
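
The claim can be checked on a toy sequence. The sketch below assumes that histories are padded and truncated at the end (post-padding and post-truncation); pad_post and build_histories are hypothetical helpers, not the functions in preprocess.py. With a different truncation side the outcome would flip.

```python
def pad_post(seq, maxlen, value=0):
    """Keep the first maxlen items and pad at the end."""
    kept = seq[:maxlen]
    return kept + [value] * (maxlen - len(kept))


def build_histories(items, maxlen, reverse):
    """One padded history row per prediction step."""
    rows = []
    for i in range(1, len(items)):
        hist = items[:i]
        if reverse:
            hist = hist[::-1]
        rows.append(pad_post(hist, maxlen))
    return rows


items = [1, 2, 3, 4, 5, 6]   # one user's items, oldest first
reversed_rows = build_histories(items, maxlen=3, reverse=True)
forward_rows = build_histories(items, maxlen=3, reverse=False)

print(reversed_rows)
print(forward_rows)
```

Under this assumption the reversed histories stay distinct, because truncation drops the oldest items, whereas the non-reversed histories become identical once the history is longer than maxlen. Duplicates seen in practice may therefore come from another source, such as repeated items in the data.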

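DeepMatch discussion group

If you are interested, follow my personal WeChat official account 浅梦学习笔记 and reply 加群 to join our discussion group,
or add my WeChat deepctrbot with the note 加入DeepMatch交流群.

Official account: 浅梦学习笔记 | WeChat: deepctrbot | Study group: join the topic collection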

A question about the EmbeddingIndex layer

Describe the question
import tensorflow as tf
from tensorflow.python.keras.layers import Layer


class EmbeddingIndex(Layer):

    def __init__(self, index, **kwargs):
        self.index = index
        super(EmbeddingIndex, self).__init__(**kwargs)

    def build(self, input_shape):
        super(EmbeddingIndex, self).build(
            input_shape)  # Be sure to call this somewhere!

    def call(self, x, **kwargs):
        # The input x is ignored; the full index list is returned as a constant.
        return tf.constant(self.index)

    def get_config(self, ):
        config = {'index': self.index, }
        base_config = super(EmbeddingIndex, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

Does this layer return the embedding index? Judging from the logic in the call function, it returns the entire index list.
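
A plain-Python stand-in may help show why returning the whole list can be intentional. Assuming the layer exists so that the full item embedding matrix can be looked up regardless of which items are in the current batch, the behaviour looks like this; embedding_index and lookup are hypothetical names.

```python
embedding_table = [[0.0, 0.0], [1.0, 1.5], [2.0, 2.5], [3.0, 3.5]]
full_index = list(range(len(embedding_table)))


def embedding_index(batch_inputs):
    """Stand-in for the layer: the batch is ignored, every index is returned."""
    return full_index


def lookup(indices):
    return [embedding_table[i] for i in indices]


batch_item_ids = [2, 2, 3]   # whatever the current batch happens to hold
all_item_embeddings = lookup(embedding_index(batch_item_ids))
print(all_item_embeddings)
```

The result is the same for every batch, which is what a sampled softmax over the whole item vocabulary would need as its weight matrix.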

Expected Results of Demo

Describe the question
Hi - I am running the examples on a notebook both using the run_youtubednn_sampledsoftmax.py file as well as through notebook cells. I am not sure of what the expected results should be but it doesn't look right. Could you advise?

Additional context
Results for running run_youtubednn_sampledsoftmax.py with step 5 eval removing commented code block.

2020-04-10 04:47:57.512363: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2020-04-10 04:47:57.512464: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2020-04-10 04:47:57.512482: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
100% 3/3 [00:00<00:00, 1193.60it/s]
6 6
2020-04-10 04:47:58.282434: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
WARNING:tensorflow:
The following Variables were used a Lambda layer's call (lambda), but
are not present in its tracked objects:
  <tf.Variable 'sparse_seq_emb_hist_movie_id/embeddings:0' shape=(209, 16) dtype=float32>
It is possible that this is intended behavior, but it is more likely
an omission. This is a strong indication that this layer should be
formulated as a subclassed Layer rather than a Lambda layer.
Train on 227 samples
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/indexed_slices.py:433: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/indexed_slices.py:433: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
227/227 [==============================] - 1s 5ms/sample - loss: 1.1347
(3, 16)
(208, 16)
3it [00:00, 2086.72it/s]
recall 0.0
hr 0.0
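
For reference, the two numbers at the end of the log are commonly computed as recall@N and hit rate over held-out items. The sketch below is a generic, framework-free version; the function names and toy data are hypothetical, and it is not necessarily identical to what the example script computes.

```python
def recall_at_n(true_items, ranked_items, n):
    """Fraction of the true items that appear in the top-n ranked items."""
    top_n = set(ranked_items[:n])
    return len(top_n & set(true_items)) / len(true_items)


def hit_rate(true_by_user, ranked_by_user, n):
    """Share of users with at least one true item in their top-n list."""
    hits = sum(
        1 for user, true_items in true_by_user.items()
        if set(ranked_by_user[user][:n]) & set(true_items)
    )
    return hits / len(true_by_user)


# Toy data: three users with one held-out item each (all values are made up).
true_by_user = {"u1": [5], "u2": [9], "u3": [2]}
ranked_by_user = {"u1": [5, 1, 3], "u2": [4, 7, 8], "u3": [6, 2, 0]}

recalls = [recall_at_n(true_by_user[u], ranked_by_user[u], n=2) for u in true_by_user]
mean_recall = sum(recalls) / len(recalls)
hr = hit_rate(true_by_user, ranked_by_user, n=2)
print("recall", mean_recall)
print("hr", hr)
```

With only three test users, as the log above suggests, both metrics move in steps of one third, so 0.0 means that no held-out item appeared in any user's top-N list.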

Operating environment:

python version 3.6
tensorflow version 2.1.0 - GPU
deepmatch version 0.1.1

Dataset processing in preprocess

Please refer to the FAQ in doc and search for the related issues before you ask the question.

Describe the question
A clear and concise description of what the question is.

Additional context
Add any other context about the problem here.

Operating environment:

python version [e.g. 3.6]
tensorflow version [e.g. 1.4.0,]
deepmatch version [e.g. 0.2.0,]
Why doesn't the dataset preprocessing use the rating feature, or other item features such as genres and title? Is there a way to add them?

Also asking how EmbeddingIndex is used in YoutubeDNN

Describe the question
class EmbeddingIndex(Layer):

    def __init__(self, index, **kwargs):
        self.index = index
        super(EmbeddingIndex, self).__init__(**kwargs)

    def build(self, input_shape):
        super(EmbeddingIndex, self).build(
            input_shape)  # Be sure to call this somewhere!

    def call(self, x, **kwargs):
        return tf.constant(self.index)

    def get_config(self, ):
        config = {'index': self.index, }
        base_config = super(EmbeddingIndex, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

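It looks like this layer only returns the index list. In practice, though, if I obtain the index list directly without going through EmbeddingIndex, a masking error is raised: TypeError: Layer sampled_softmax_layer does not support masking, but was passed an input_mask: [<tf.Tensor 'sparse_seq_emb_hist_item_id_1/NotEqual:0' shape=(14781,) dtype=bool>, None, None]

Could someone who knows this well explain what is going on?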

Operating environment:

python version [e.g. 3.6]
tensorflow version [e.g. 2.0,]
deepmatch version [e.g. 0.2.0,]

Saving the example models in SavedModel format: DSSM saves successfully, but youtubematch, SDM, and MIND fail when saved the same way

Python 3.6.5, TensorFlow 2.2
I want to save the models in SavedModel format. The DSSM model saves successfully, but the youtubematch, SDM, and MIND models raise errors.
user_embedding_model = Model(inputs=model.user_input, outputs=model.user_embedding)
item_embedding_model = Model(inputs=model.item_input, outputs=model.item_embedding)

Saving the model with keras.models.save_model

from tensorflow import keras
keras.models.save_model(user_embedding_model,"./models")

The error is:
\Python36\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 566, in call_and_return_conditional_losses
return layer_call(inputs, *args, **kwargs), layer.get_losses_for(inputs)
TypeError: call() missing 1 required positional argument: 'state'

Saving the model with tf.saved_model.save

tf.saved_model.save(user_embedding_model,"./models")
The error is:
\Python36\site-packages\tensorflow\python\keras\saving\saved_model\save_impl.py", line 566, in call_and_return_conditional_losses
return layer_call(inputs, *args, **kwargs), layer.get_losses_for(inputs)
TypeError: call() missing 1 required positional argument: 'state'

Saving as h5 first, then loading it and re-saving as SavedModel

user_embedding_model.save("./models/models.h5")
pre_model = tf.keras.models.load_model("./models/models.h5")
pre_model.save("./models/output")
The error is:
\Python36\site-packages\tensorflow\python\keras\utils\generic_utils.py", line 321, in class_and_config_for_serialized_keras_object
raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
ValueError: Unknown layer: NoMask

Among the DeepMatch example models, DSSM saves successfully while the others fail. Serving a model online requires the SavedModel format rather than h5. Could this be fixed?

Only h5 models can be exported; pb export fails

Only h5 models can be exported; exporting to pb format fails.
The export code is:
tf.saved_model.save(model, outputDir + 'YouTubeNet_model2')
or:
from tensorflow.python.keras.models import Model, load_model, save_model
save_model(model, 'YouTubeNet_model.pb',save_format='tf')

The error is:
Traceback (most recent call last):
File "F:/python/DeepMatch-master/examples/run_youtubednn.py", line 70, in
tf.saved_model.save(model, outputDir + 'YouTubeNet_model2')
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\saved_model\save.py", line 1033, in save
obj, signatures, options, meta_graph_def)
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\saved_model\save.py", line 1198, in _build_meta_graph
return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\saved_model\save.py", line 1147, in _build_meta_graph_impl
_ = _SaveableView(checkpoint_graph_view, options)
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\saved_model\save.py", line 186, in __init__
self.checkpoint_view.objects_ids_and_slot_variables())
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\training\tracking\graph_view.py", line 444, in objects_ids_and_slot_variables
trackable_objects, path_to_root = self._breadth_first_traversal()
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\training\tracking\graph_view.py", line 222, in _breadth_first_traversal
for name, dependency in self.list_dependencies(current_trackable):
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\saved_model\save.py", line 120, in list_dependencies
for name, dep in super(_AugmentedGraphView, self).list_dependencies(obj):
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\training\tracking\graph_view.py", line 164, in list_dependencies
dependencies = obj._checkpoint_dependencies
File "D:\Anaconda3\envs\TF2\lib\site-packages\tensorflow\python\training\tracking\data_structures.py", line 510, in _checkpoint_dependencies
"non-trackable object; it will be subsequently ignored." % (self,)))
ValueError: Unable to save the object ListWrapper([<tensorflow.python.keras.layers.core.Activation object at 0x000002D77C1B6940>, <tensorflow.python.keras.layers.core.Activation object at 0x000002D77C1B6E80>]) (a list wrapper constructed to track trackable TensorFlow objects). A list element was replaced (setitem, setslice), deleted (delitem, delslice), or moved (sort). In order to support restoration on object creation, tracking is exclusively for append-only data structures.

If you don't need this list checkpointed, wrap it in a non-trackable object; it will be subsequently ignored.

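Question about loss and weight computation in the MIND model

Why doesn't this code in MIND need axis=1? Aren't the weights supposed to come from a softmax over the capsule dimension?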

weight = tf.nn.softmax(routing_logits_with_padding)
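
For context, tf.nn.softmax normalizes over the last axis (axis=-1) when no axis is given. Assuming the logits are shaped [batch, k_max, max_len], as in the add_weight call quoted in the routing logits issue above, the default therefore normalizes over the behavior dimension, while axis=1 would normalize over the capsules. The plain-Python sketch below only illustrates the difference between the two choices; the values are made up and it does not settle which one matches the paper.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Routing logits for one user: k_max = 2 capsules (rows), max_len = 3 behaviors (columns).
logits = [[1.0, 2.0, 3.0],
          [1.0, 1.0, 1.0]]

# Last axis: each capsule's weights over the behaviors sum to 1.
over_behaviors = [softmax(row) for row in logits]

# Capsule axis: each behavior's weights over the capsules sum to 1.
columns = [softmax(list(col)) for col in zip(*logits)]
over_capsules = [list(row) for row in zip(*columns)]

print(over_behaviors)
print(over_capsules)
```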

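Also, the recall stage is about learning representations. The mainstream approach today is negative sampling with a triplet loss, as in word2vec, so why does the code use sampled_softmax_loss everywhere? For representation learning, negative sampling has been shown not to hurt the final result compared with sampled_softmax_loss.

YouTubeDNN: question about SampledSoftmaxLayer

Shouldn't pooling_item_embedding_weight in SampledSoftmaxLayer be replaced with item_embedding_matrix? Any help would be appreciated.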

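import reports that preprocess is missing

Describe the question
When running the YoutubeDNN with sampled softmax example, the import fails because there is no preprocess module. I could not find this library anywhere either. Did I install the wrong version? Thanks for any answer.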

Operating environment:

python version 3.7
tensorflow version 2.2
deepmatch version 0.13

Would it be better to use tf.dataset for data processing and tf.feature_columns for feature processing?

Problems running run_youtubednn.py: first int_std was missing, then "**not a valid scope name"

Please refer to the FAQ in doc and search for the related issues before you ask the question.

Describe the question
ValueError: '’. seq. emb. hist movie. _id' is not a valid scope name
Additional context
Add any other context about the problem here.

Operating environment:

python version [e.g. 3.6]
tensorflow version [e.g. 1.14.0,]
deepmatch version [e.g. 0.2.0,]
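
The quoted error text is garbled, so the actual offending name cannot be recovered from it. As a rough aid, the sketch below checks candidate names against the pattern that TensorFlow 1.x is believed to apply to top-level scope names. The regular expression is recalled from the TensorFlow 1.x source and should be treated as an assumption, and the candidate names are hypothetical, not taken from the issue.

```python
import re

# Assumed pattern: recalled from TensorFlow 1.x source for top-level scope names.
VALID_TOP_LEVEL_NAME = re.compile(r"^[A-Za-z0-9.][A-Za-z0-9_.\-/]*$")


def is_valid_scope_name(name):
    """Return True if the name matches the assumed TF 1.x pattern."""
    return bool(VALID_TOP_LEVEL_NAME.match(name))


# Hypothetical candidate names, not taken from the issue.
candidates = ["hist_movie_id", "_hist_movie_id", "hist movie id", "用户_id"]
results = {name: is_valid_scope_name(name) for name in candidates}
for name, ok in results.items():
    print(repr(name), ok)
```

Under this assumption, names that start with an underscore, contain spaces, or use non-ASCII characters are rejected, so checking how feature names and layer name prefixes are built is a reasonable first step.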
