maciejkula / triplet_recommendations_keras

An example of doing MovieLens recommendations using triplet loss in Keras

License: Apache License 2.0

Python 45.75% Jupyter Notebook 54.25%

triplet_recommendations_keras's Introduction

Recommendations in Keras using triplet loss

Note: a much richer set of neural network recommender models is available as Spotlight.

Along the lines of BPR [1].

[1] Rendle, Steffen, et al. "BPR: Bayesian personalized ranking from implicit feedback." Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.

This is implemented (more efficiently) in LightFM (https://github.com/lyst/lightfm). See the MovieLens example (https://github.com/lyst/lightfm/blob/master/examples/movielens/example.ipynb) for results comparable to this notebook.

Set up the architecture

A simple dense layer for both users and items: this is exactly equivalent to a latent factor matrix when multiplied by binary (one-hot) user and item indices. There are three inputs: users, positive items, and negative items. In the triplet objective we try to make the positive item rank higher than the negative item for that user.
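
The equivalence between a dense layer applied to a binary index vector and a row lookup in a latent factor matrix can be checked with a small plain-Python sketch. The sizes and factor values here are made up for illustration; the notebook itself uses Keras Embedding layers, which perform the row lookup directly.

```python
# Hypothetical toy sizes; the notebook uses latent_dim = 100.
num_items, latent_dim = 4, 3

# A latent factor matrix: one row of factors per item (made-up values).
factors = [[0.1 * (row * latent_dim + col) for col in range(latent_dim)]
           for row in range(num_items)]


def one_hot(index, size):
    """Binary indicator vector with a single 1 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vector_times_matrix(vector, matrix):
    """Multiply a row vector by a matrix (what a bias-free dense layer does)."""
    return [sum(v * row[col] for v, row in zip(vector, matrix))
            for col in range(len(matrix[0]))]


item_id = 2
dense_result = vector_times_matrix(one_hot(item_id, num_items), factors)
lookup_result = factors[item_id]  # what an embedding lookup returns
```

Both `dense_result` and `lookup_result` hold the factors of item 2; the lookup simply skips the multiplications by zero.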

Because we want just one single embedding for the items, we use shared weights for the positive and negative item inputs (a siamese architecture).

This is all very simple but could be made arbitrarily complex, with more layers, conv layers and so on. I expect we'll be seeing a lot of papers doing just that.

"""
Triplet loss network example for recommenders
"""

from __future__ import print_function

import numpy as np

from keras import backend as K
from keras.models import Model
from keras.layers import Embedding, Flatten, Input, merge
from keras.optimizers import Adam

import data
import metrics


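def identity_loss(y_true, y_pred):
    # The model output is already the per-triplet loss, so just average it.
    # `0 * y_true` keeps y_true in the expression without changing the value.
    return K.mean(y_pred - 0 * y_true)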


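def bpr_triplet_loss(X):
    # X holds the three flattened embeddings, in the order passed to merge().
    positive_item_latent, negative_item_latent, user_latent = X

    # BPR loss: 1 - sigmoid(score(user, positive) - score(user, negative))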
    loss = 1.0 - K.sigmoid(
        K.sum(user_latent * positive_item_latent, axis=-1, keepdims=True) -
        K.sum(user_latent * negative_item_latent, axis=-1, keepdims=True))

    return loss


def build_model(num_users, num_items, latent_dim):

    positive_item_input = Input((1, ), name='positive_item_input')
    negative_item_input = Input((1, ), name='negative_item_input')

    # Shared embedding layer for positive and negative items
    item_embedding_layer = Embedding(
        num_items, latent_dim, name='item_embedding', input_length=1)

    user_input = Input((1, ), name='user_input')

    positive_item_embedding = Flatten()(item_embedding_layer(
        positive_item_input))
    negative_item_embedding = Flatten()(item_embedding_layer(
        negative_item_input))
    user_embedding = Flatten()(Embedding(
        num_users, latent_dim, name='user_embedding', input_length=1)(
            user_input))

    loss = merge(
        [positive_item_embedding, negative_item_embedding, user_embedding],
        mode=bpr_triplet_loss,
        name='loss',
        output_shape=(1, ))

    model = Model(
        input=[positive_item_input, negative_item_input, user_input],
        output=loss)
    model.compile(loss=identity_loss, optimizer=Adam())

    return model

Using Theano backend.
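
To get a feel for the loss defined above, here is a plain-Python sketch of the same formula applied to a single triplet. The vectors are made up for illustration, and the helper names `sigmoid` and `dot` are not part of the repository.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    """Dot product: elementwise multiply, then sum."""
    return sum(a * b for a, b in zip(u, v))


def bpr_triplet_loss(positive_item_latent, negative_item_latent, user_latent):
    """Same formula as the Keras version, for one triplet of plain lists."""
    score_difference = (dot(user_latent, positive_item_latent)
                        - dot(user_latent, negative_item_latent))
    return 1.0 - sigmoid(score_difference)


# Made-up 2-dimensional embeddings.
user = [1.0, 0.5]
liked = [2.0, 1.0]
not_liked = [-1.0, 0.0]

# Positive item scored above the negative item: loss close to 0.
well_ranked = bpr_triplet_loss(liked, not_liked, user)
# Roles swapped, so the "positive" is scored below: loss close to 1.
badly_ranked = bpr_triplet_loss(not_liked, liked, user)
```

The loss falls toward 0 as the positive item's score pulls ahead of the negative item's score, and sits at 0.5 when the two scores are equal.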

Load and transform data

We're going to load the MovieLens 100k dataset and create triplets of (user, known positive item, randomly sampled negative item).
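
The repository's `data.get_triplets` is not shown here, so the following is only a sketch of what such a sampler might look like. The function name `sample_triplets`, the dictionary input format, and the rejection of already-seen items are all assumptions made for illustration.

```python
import random


def sample_triplets(interactions, num_items, rng):
    """Build (user, positive item, negative item) triplets.

    `interactions` maps a user id to the set of item ids that user has
    interacted with. For each known positive, one negative item is drawn
    uniformly at random; items the user already interacted with are
    rejected (an assumption of this sketch).
    """
    user_ids, positive_ids, negative_ids = [], [], []
    for user, positives in sorted(interactions.items()):
        for positive in sorted(positives):
            negative = rng.randrange(num_items)
            # Resample until the item is not a known positive. This assumes
            # every user has at least one item they have not interacted with.
            while negative in positives:
                negative = rng.randrange(num_items)
            user_ids.append(user)
            positive_ids.append(positive)
            negative_ids.append(negative)
    return user_ids, positive_ids, negative_ids


# Made-up interactions: 3 users, 6 items.
toy_interactions = {0: {1, 4}, 1: {0}, 2: {2, 3, 5}}
uid, pid, nid = sample_triplets(toy_interactions, num_items=6,
                                rng=random.Random(0))
```

Calling the sampler again draws new negatives for the same positives, so sampling once per epoch pairs each positive with a fresh negative every time.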

The success metric is AUC: in this case, the probability that a randomly chosen known positive item from the test set is ranked higher for a given user than a randomly chosen negative item.
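
As a rough illustration of this definition, the AUC for one user can be computed by counting correctly ordered (positive, negative) pairs. This is a sketch, not the repository's `metrics.full_auc` (which is not shown here); the function name `user_auc`, the tie handling, and the scores are assumptions.

```python
def user_auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a correct pair, a common convention that is
    assumed here.
    """
    correct = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positive_scores) * len(negative_scores))


# Every positive outranks every negative.
perfect = user_auc([0.9, 0.8], [0.1, 0.2, 0.3])
# One of the four pairs is in the wrong order (0.2 vs 0.5).
mixed = user_auc([0.9, 0.2], [0.1, 0.5])
# Identical scores carry no ranking information.
uninformative = user_auc([0.0], [0.0])
```

Here `perfect` is 1.0, `mixed` is 0.75, and `uninformative` is 0.5, which is why an untrained model is expected to score around 0.5.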

latent_dim = 100
num_epochs = 10

# Read data
train, test = data.get_movielens_data()
num_users, num_items = train.shape

# Prepare the test triplets
test_uid, test_pid, test_nid = data.get_triplets(test)

model = build_model(num_users, num_items, latent_dim)

# Print the model structure
print(model.summary())

# Sanity check, should be around 0.5
print('AUC before training %s' % metrics.full_auc(model, test))

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
positive_item_input (InputLayer) (None, 1)             0                                            
____________________________________________________________________________________________________
negative_item_input (InputLayer) (None, 1)             0                                            
____________________________________________________________________________________________________
user_input (InputLayer)          (None, 1)             0                                            
____________________________________________________________________________________________________
item_embedding (Embedding)       (None, 1, 100)        168300      positive_item_input[0][0]        
                                                                   negative_item_input[0][0]        
____________________________________________________________________________________________________
user_embedding (Embedding)       (None, 1, 100)        94400       user_input[0][0]                 
____________________________________________________________________________________________________
flatten_7 (Flatten)              (None, 100)           0           item_embedding[0][0]             
____________________________________________________________________________________________________
flatten_8 (Flatten)              (None, 100)           0           item_embedding[1][0]             
____________________________________________________________________________________________________
flatten_9 (Flatten)              (None, 100)           0           user_embedding[0][0]             
____________________________________________________________________________________________________
loss (Merge)                     (None, 1)             0           flatten_7[0][0]                  
                                                                   flatten_8[0][0]                  
                                                                   flatten_9[0][0]                  
====================================================================================================
Total params: 262700
____________________________________________________________________________________________________
None
AUC before training 0.50247407966

Run the model

Run for a few epochs (10 here), checking the AUC after every epoch.

for epoch in range(num_epochs):

    print('Epoch %s' % epoch)

    # Sample triplets from the training data
    uid, pid, nid = data.get_triplets(train)

    X = {
        'user_input': uid,
        'positive_item_input': pid,
        'negative_item_input': nid
    }

    model.fit(X,
              np.ones(len(uid)),
              batch_size=64,
              nb_epoch=1,
              verbose=0,
              shuffle=True)

    print('AUC %s' % metrics.full_auc(model, test))

Epoch 0
AUC 0.905896400776
Epoch 1
AUC 0.908241780938
Epoch 2
AUC 0.909650205748
Epoch 3
AUC 0.910820451523
Epoch 4
AUC 0.912184845152
Epoch 5
AUC 0.912632057958
Epoch 6
AUC 0.91326604222
Epoch 7
AUC 0.913786881853
Epoch 8
AUC 0.914638438854
Epoch 9
AUC 0.915375014253

The AUC is in the low 0.90s (about 0.915 after ten epochs). At some point the model starts overfitting, so it would be a good idea to stop early or add some regularization.

triplet_recommendations_keras's Issues

Does not run with Keras 0.3.3

Hi,

I was testing the file with Keras 0.3.3, and it returned the following error:

Traceback (most recent call last):
  File "triplet_movielens.py", line 139, in <module>
    model = get_graph(num_users, num_items, 256)
  File "triplet_movielens.py", line 105, in get_graph
    model.compile(loss={'triplet_loss': identity_loss}, optimizer=Adam())#Adagrad(lr=0.1, epsilon=1e-06))
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 1265, in compile
    if self.outputs[self.output_order[0]].output_shape[-1] == 1:
TypeError: 'NoneType' object has no attribute '__getitem__'

Once I downgrade Keras to 0.3.2 and run the same file, I do not get any error.

Error in running triplet_movielens.py

I am trying to run triplet_movielens.py with the same movielens dataset and I get the following error.

Using TensorFlow backend.
Traceback (most recent call last):
  File "triplet_movielens.py", line 80, in <module>
    model = build_model(num_users, num_items, latent_dim)
  File "triplet_movielens.py", line 58, in build_model
    output_shape=(1, ))
TypeError: 'module' object is not callable

I am stuck and I need help.

AttributeError: 'NoneType' object has no attribute 'inbound_nodes'

When I define a new loss function like this:

def batch_all_triplet_loss(X):
    # Get the pairwise distance matrix
    print('22')
    labels, embeddings = X
    print('1')
    margin = 1.0
    pairwise_dist = _pairwise_distances(embeddings, squared=False)
    # shape (batch_size, batch_size, 1)
    anchor_positive_dist = tf.expand_dims(pairwise_dist, 2)
    assert anchor_positive_dist.shape[2] == 1, "{}".format(anchor_positive_dist.shape)
    # shape (batch_size, 1, batch_size)
    anchor_negative_dist = tf.expand_dims(pairwise_dist, 1)
    assert anchor_negative_dist.shape[1] == 1, "{}".format(anchor_negative_dist.shape)

    # Compute a 3D tensor of size (batch_size, batch_size, batch_size)
    # triplet_loss[i, j, k] will contain the triplet loss of anchor=i, positive=j, negative=k
    # Uses broadcasting where the 1st argument has shape (batch_size, batch_size, 1)
    # and the 2nd (batch_size, 1, batch_size)
    triplet_loss = anchor_positive_dist - anchor_negative_dist + margin

    # Put to zero the invalid triplets
    # (where label(a) != label(p) or label(n) == label(a) or a == p)
    mask = _get_triplet_mask(labels)
    mask = tf.to_float(mask)
    triplet_loss = tf.multiply(mask, triplet_loss)

    # Remove negative losses (i.e. the easy triplets)
    triplet_loss = tf.maximum(triplet_loss, 0.0)

    # add my loss
    triplet_loss = tf.multiply(0.5,triplet_loss)

    # Count number of positive triplets (where triplet_loss > 0)
    valid_triplets = tf.to_float(tf.greater(triplet_loss, 1e-16))
    num_positive_triplets = tf.reduce_sum(valid_triplets)
    # num_valid_triplets = tf.reduce_sum(mask)
    # fraction_positive_triplets = num_positive_triplets / (num_valid_triplets + 1e-16)

    # Get final mean triplet loss over the positive valid triplets
    triplet_loss = tf.reduce_sum(triplet_loss) / (num_positive_triplets + 1e-16)
    # return triplet_loss, fraction_positive_triplets
    return triplet_loss

and merge it:

triplet_losses = merge([label, final_rmac_a],
                       mode=batch_all_triplet_loss,
                       name='loss',
                       output_shape=(1,))

rmac_model = Model(
    inputs=[image_a, roi_a],
    outputs=triplet_losses)

with label and final_rmac_a defined as

label = Input(shape=(batch_size,))
final_rmac_a = BatchNormalization()(rmac_a)

Why is this error raised? I guess it is due to the Keras version.

Why not control the nb_epoch within the fit function?

The code has the following lines:

for epoch in range(num_epochs):
    ...
    model.fit(..., nb_epoch=1, ...)
    ...

Why not set nb_epoch to num_epochs directly?

Test set triplet sample

Hi, thank you for sharing! You sample the triplets from the test set the same way you do from the training set. But in the real world, we can't know which items are positive or negative. How can we sample triplets from the test set?

Dumb question: how do ratings on a scale of 1-5 map to positive examples?

Is any item a user gave a rating considered a positive example? Or only items they rated as a 5?

Exception: ('Invalid merge mode:', 'join')

Hi,

When running with the latest Keras, I get the following message:

File "", line 1, in
model = get_graph(num_users, num_items, 256)

File "", line 15, in get_graph
merge_mode='join')

File "D:_devs\Python01\WinPython-64-2710\python-2.7.10.amd64\lib\site-packages\keras\legacy\models.py", line 241, in add_shared_node
raise Exception('Invalid merge mode:', merge_mode)

Exception: ('Invalid merge mode:', 'join')

Any ideas?

Thanks

Extract Embedding Model after training

So we have a (shared) embedding network that receives 1 input and outputs 1 embedding. We then build a triplet loss network around it that takes 3 inputs and outputs the loss which we train to minimize the loss. After we've trained our triplet loss network I'd like to use the embedding network without the triplet network. How would we access the "inner" embedding network? My goal is to be able to generate 1 embedding for 1 input.

Loss and output functions are different.

Hi, it's very nice to see a post on pairwise ranking loss for Keras. Could you help me understand the thinking behind the triplet loss definition? As a new Keras user, I find it a little hard to understand why it works, since the output of the model is the "triplet loss" but the compiled model's loss is the "identity loss". Could you give more information or some other references that would help me?
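
One reading of the notebook code that may help here: the network's output is already the per-triplet loss, so the function passed to compile only has to average that output, and the targets passed to fit (np.ones in the notebook) act as placeholders. A plain-Python sketch of that idea, with made-up numbers standing in for the model output:

```python
def identity_loss(y_true, y_pred):
    """Mean of the predictions.

    `0 * t` keeps the targets in the expression, mirroring the Keras
    version, without letting them affect the value.
    """
    return sum(p - 0 * t for t, p in zip(y_true, y_pred)) / len(y_pred)


# Made-up per-triplet losses, standing in for the model output.
per_triplet_losses = [0.2, 0.4, 0.9]

# Two different sets of dummy targets.
loss_with_ones = identity_loss([1.0, 1.0, 1.0], per_triplet_losses)
loss_with_noise = identity_loss([0.0, 5.0, -3.0], per_triplet_losses)
```

Both calls return the same mean (0.5 here), so minimizing the compiled loss amounts to minimizing the average triplet loss that the network itself computes.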

Issues using items metadata

Hi @maciejkula ,

I was trying to embed item metadata (genre, category, etc) into this implementation, but I can't figure out how to deal with the different dimensionality of the user and item embedding layers.

If I concatenate, say, an "item genre" embedding to the item embedding layer (as you suggested in a previously closed question), the dimensionality of the resulting tensor becomes incompatible with that of the user embedding layer.

I can work around this by increasing the dimensionality of the user embedding layer to match the shape of the concatenated item one, but then I can't take the dot product of the item and user embedding layers to calculate the user's preferences.

I feel I'm missing something, but I can't figure out what.

Thanks,
Francesco

Both test loss and validation loss go to 0.5

I modified your code for my problem and added regularization terms to the Embedding layers. But when I train the model, both test loss and val loss go to 0.5. I guess this is because both the user and item latent vectors shrink to zero, which makes bpr_triplet_loss always equal 0.5.

Do you have any idea why this happens?
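
The arithmetic behind that guess can be checked directly: with all-zero embeddings the score difference is 0, and 1 - sigmoid(0) = 0.5. The sketch below shrinks a made-up triplet toward zero to show the loss approaching that value. It illustrates the formula only and does not diagnose this particular training run.

```python
import math


def bpr_loss(user, positive, negative):
    """1 - sigmoid(user . positive - user . negative) for plain lists."""
    diff = sum(u * (p - n) for u, p, n in zip(user, positive, negative))
    return 1.0 - 1.0 / (1.0 + math.exp(-diff))


# Made-up embeddings for one triplet.
user, positive, negative = [1.0, -2.0], [3.0, 0.5], [-1.0, 1.0]

# Scale every embedding toward zero, as a strong weight penalty might.
losses = []
for scale in (1.0, 0.1, 0.01, 0.0):
    shrunk = [[scale * x for x in vec] for vec in (user, positive, negative)]
    losses.append(bpr_loss(*shrunk))
```

The values in `losses` rise from near 0 toward exactly 0.5 at scale 0.0, so a loss stuck at 0.5 is consistent with embeddings that have collapsed to zero.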

Fixing NaNs with built-in Sigmoid

Howdy. I found that using the BPR loss from your code as-is led to a slew of NaNs after a couple of epochs. I modified it as follows to use the built-in sigmoid, and it seems to have fixed it:

def bpr_triplet_loss(X):

    user_latent, item_latent = X.values()
    positive_item_latent, negative_item_latent = item_latent.values()

    # BPR loss
    loss = -K.sigmoid(K.sum(user_latent * positive_item_latent, axis=-1, keepdims=True)
                                - K.sum(user_latent * negative_item_latent, axis=-1, keepdims=True))

    return loss

Thanks for the nice example code!
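
The issue does not say why the built-in sigmoid helped. One common source of numerical trouble with a hand-written sigmoid is overflow in the exponential, and the plain-Python sketch below illustrates that failure mode and a standard way around it. Whether this was the actual cause here is an assumption, and the function names are made up for illustration.

```python
import math


def naive_sigmoid(x):
    # exp(-x) overflows a float for large negative x.
    return 1.0 / (1.0 + math.exp(-x))


def stable_sigmoid(x):
    # Only ever exponentiate a non-positive number, which cannot overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


try:
    naive_sigmoid(-1000.0)
    naive_overflowed = False
except OverflowError:
    naive_overflowed = True

stable_low = stable_sigmoid(-1000.0)
stable_high = stable_sigmoid(1000.0)
```

In plain Python the naive version raises OverflowError, while the stable version returns 0.0 and 1.0 at the two extremes. Note also that the loss in this issue, -sigmoid(x), differs from the notebook's 1 - sigmoid(x) only by the constant 1, so both have the same gradients.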

Doubt about margin_triplet_loss

Hi! The code is a really good guide for using Keras with triplet prediction. But when I changed the loss function from bpr_triplet_loss to margin_triplet_loss, I found the following issue:

AUC before training 0.496249791422
Epoch 0
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.7926 - val_loss: 0.3801
AUC 0.801788456964
Inversions percentage 0.488023404644
Epoch 1
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.3477 - val_loss: 0.3517
AUC 0.806704502864
Inversions percentage 0.381605412324
Epoch 2
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.3233 - val_loss: 0.3500
AUC 0.784859436415
Inversions percentage 0.375022856098
Epoch 3
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.3044 - val_loss: 0.3491
AUC 0.762610735591
Inversions percentage 0.38142256354
Epoch 4
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.2818 - val_loss: 0.3493
AUC 0.746641303774
Inversions percentage 0.386725178278

The inversions percentage was INCREASING and the AUC was DECREASING. Why? Has the model been overfitting since epoch 3?

Please, I have a problem with your function "bpr_triplet_loss"...

I saw that you use "K.sum(user_latent * positive_item_latent, axis=-1, keepdims=True)", but in my Keras with the TensorFlow backend, this "*" means point-wise multiplication.
I think we should compute the cosine similarity between the user and positive items, so why don't we use "K.batch_dot(K.l2_normalize(x,axis = -1),K.l2_normalize(y,axis=-1))"?
Thank you for your answer and help!
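
For reference, a point-wise multiplication followed by a sum over the last axis is the dot product, and cosine similarity is the dot product of the L2-normalized vectors. The plain-Python sketch below uses made-up vectors to show the two side by side; which one suits a given recommender is a modelling choice that this sketch does not settle.

```python
import math


def dot(u, v):
    """Point-wise multiply, then sum: the dot product."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Dot product of the L2-normalized vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


user = [3.0, 4.0]
item = [6.0, 8.0]        # same direction as `user`, twice the length
longer_item = [12.0, 16.0]  # same direction again, four times the length

dot_score = dot(user, item)
dot_score_longer = dot(user, longer_item)
cosine_score = cosine(user, item)
cosine_score_longer = cosine(user, longer_item)
```

The dot-product score grows with the length of the item vector (50.0, then 100.0), while the cosine score stays at 1.0 for both items because it depends only on direction.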

Item features as inputs for Item embedding

Hi,
Thanks for sharing this. I just wanted to know the best way to plug item_features (genre, types, topics, etc.) into the item_embedding layer.