
triplet_recommendations_keras's Introduction

Recommendations in Keras using triplet loss

Note: a much richer set of neural network recommender models is available as Spotlight.

Along the lines of BPR [1].

[1] Rendle, Steffen, et al. "BPR: Bayesian personalized ranking from implicit feedback." Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.

This is implemented (more efficiently) in LightFM (https://github.com/lyst/lightfm). See the MovieLens example (https://github.com/lyst/lightfm/blob/master/examples/movielens/example.ipynb) for results comparable to this notebook.

Set up the architecture

A simple embedding layer for both users and items: looking up an embedding row is exactly equivalent to multiplying a latent factor matrix by binary (one-hot) user and item indicators. There are three inputs: users, positive items, and negative items. In the triplet objective we try to make the positive item rank higher than the negative item for that user.
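The equivalence between a row lookup and a multiplication by binary indicators can be checked with a small NumPy sketch. The matrix `user_factors` and the ids below are made up for illustration; they are not part of the repository.

```python
import numpy as np

rng = np.random.default_rng(0)
num_users, latent_dim = 5, 3

# Hypothetical latent factor matrix: one row of factors per user.
user_factors = rng.normal(size=(num_users, latent_dim))
user_ids = np.array([2, 0, 4])

# Embedding-style lookup: pick the rows directly.
looked_up = user_factors[user_ids]

# Dense-layer view: multiply one-hot indicator rows by the same matrix.
one_hot = np.eye(num_users)[user_ids]
multiplied = one_hot @ user_factors

# Both routes give the same latent vectors.
assert np.allclose(looked_up, multiplied)
```

The lookup avoids materialising the one-hot matrix, which is why an embedding layer is the usual way to express this.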

Because we want just one single embedding for the items, we use shared weights for the positive and negative item inputs (a siamese architecture).
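As a rough illustration of the shared-weights idea, here is a NumPy sketch in which positive and negative items are looked up in one and the same item table before the triplet loss is computed. The loss mirrors the `1 - sigmoid(positive score - negative score)` form used in this notebook; the function name `bpr_triplet_loss_np` and all the data are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_triplet_loss_np(user_vecs, pos_vecs, neg_vecs):
    # Score = dot product between user and item latent vectors.
    pos_score = np.sum(user_vecs * pos_vecs, axis=-1)
    neg_score = np.sum(user_vecs * neg_vecs, axis=-1)
    # Small when the positive item outscores the negative one.
    return 1.0 - sigmoid(pos_score - neg_score)

rng = np.random.default_rng(0)
user_factors = rng.normal(size=(3, 4))
item_factors = rng.normal(size=(6, 4))  # one table shared by both item inputs

uid = np.array([0, 1, 2])
pid = np.array([1, 3, 5])  # known positives (made up)
nid = np.array([0, 2, 4])  # sampled negatives (made up)

# Positive and negative items index the same item_factors array.
loss = bpr_triplet_loss_np(
    user_factors[uid], item_factors[pid], item_factors[nid])
print(loss)
```

Because both lookups read from `item_factors`, any update driven by a negative example changes the same vector that is used when that item appears as a positive.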

This is all very simple but could be made arbitrarily complex, with more layers, conv layers and so on. I expect we'll be seeing a lot of papers doing just that.

"""
Triplet loss network example for recommenders
"""

from __future__ import print_function

import numpy as np

from keras import backend as K
from keras.models import Model
from keras.layers import Embedding, Flatten, Input, merge
from keras.optimizers import Adam

import data
import metrics


def identity_loss(y_true, y_pred):

    return K.mean(y_pred - 0 * y_true)


def bpr_triplet_loss(X):

    positive_item_latent, negative_item_latent, user_latent = X

    # BPR loss
    loss = 1.0 - K.sigmoid(
        K.sum(user_latent * positive_item_latent, axis=-1, keepdims=True) -
        K.sum(user_latent * negative_item_latent, axis=-1, keepdims=True))

    return loss


def build_model(num_users, num_items, latent_dim):

    positive_item_input = Input((1, ), name='positive_item_input')
    negative_item_input = Input((1, ), name='negative_item_input')

    # Shared embedding layer for positive and negative items
    item_embedding_layer = Embedding(
        num_items, latent_dim, name='item_embedding', input_length=1)

    user_input = Input((1, ), name='user_input')

    positive_item_embedding = Flatten()(item_embedding_layer(
        positive_item_input))
    negative_item_embedding = Flatten()(item_embedding_layer(
        negative_item_input))
    user_embedding = Flatten()(Embedding(
        num_users, latent_dim, name='user_embedding', input_length=1)(
            user_input))

    loss = merge(
        [positive_item_embedding, negative_item_embedding, user_embedding],
        mode=bpr_triplet_loss,
        name='loss',
        output_shape=(1, ))

    model = Model(
        input=[positive_item_input, negative_item_input, user_input],
        output=loss)
    model.compile(loss=identity_loss, optimizer=Adam())

    return model
Using Theano backend.

Load and transform data

We're going to load the Movielens 100k dataset and create triplets of (user, known positive item, randomly sampled negative item).
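The `data.get_triplets` helper is not shown here, so the following is only a sketch of one way such triplets could be built, assuming a dense binary interaction matrix. The name `sample_triplets` and the toy matrix are hypothetical.

```python
import numpy as np

def sample_triplets(interactions, rng):
    """Build (user, positive item, random item) triplets.

    interactions: binary array of shape (num_users, num_items),
    where a non-zero entry marks a known positive.
    """
    # Every observed interaction yields one (user, positive item) pair.
    uid, pid = np.nonzero(interactions)
    # Negatives are drawn uniformly over all items; a draw can occasionally
    # hit a true positive, which this simple sketch does not filter out.
    nid = rng.integers(0, interactions.shape[1], size=len(uid))
    return uid, pid, nid

rng = np.random.default_rng(0)
toy = np.array([[1, 0, 1],
                [0, 0, 1]])
uid, pid, nid = sample_triplets(toy, rng)
```

Calling the sampler again gives fresh negatives for the same positives, so each pass over the data can see different triplets.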

The success metric is AUC: in this case, the probability that a randomly chosen known positive item from the test set is ranked higher for a given user than a randomly chosen negative item.
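To make the definition concrete, here is a small sketch that computes this probability for a single user by comparing every positive item with every negative item. It is not the `metrics.full_auc` implementation (which is not shown here and may, for example, average over users); `user_auc` and the scores are made up for illustration.

```python
import numpy as np

def user_auc(scores, positive_items):
    """Fraction of (positive, negative) item pairs ranked correctly."""
    is_positive = np.zeros(len(scores), dtype=bool)
    is_positive[positive_items] = True
    pos = scores[is_positive]
    neg = scores[~is_positive]
    # Pairwise differences: entry [i, j] compares positive i with negative j.
    diff = pos[:, None] - neg[None, :]
    # Ties count as half a correct ranking.
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

scores = np.array([0.9, 0.1, 0.5, 0.3])  # predicted scores for 4 items
print(user_auc(scores, [0, 3]))          # 3 of 4 pairs correct -> 0.75
```

A model that scores items at random gets about 0.5 on this measure, which is what the sanity check before training looks for.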

latent_dim = 100
num_epochs = 10

# Read data
train, test = data.get_movielens_data()
num_users, num_items = train.shape

# Prepare the test triplets
test_uid, test_pid, test_nid = data.get_triplets(test)

model = build_model(num_users, num_items, latent_dim)

# Print the model structure
print(model.summary())

# Sanity check, should be around 0.5
print('AUC before training %s' % metrics.full_auc(model, test))
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
positive_item_input (InputLayer) (None, 1)             0                                            
____________________________________________________________________________________________________
negative_item_input (InputLayer) (None, 1)             0                                            
____________________________________________________________________________________________________
user_input (InputLayer)          (None, 1)             0                                            
____________________________________________________________________________________________________
item_embedding (Embedding)       (None, 1, 100)        168300      positive_item_input[0][0]        
                                                                   negative_item_input[0][0]        
____________________________________________________________________________________________________
user_embedding (Embedding)       (None, 1, 100)        94400       user_input[0][0]                 
____________________________________________________________________________________________________
flatten_7 (Flatten)              (None, 100)           0           item_embedding[0][0]             
____________________________________________________________________________________________________
flatten_8 (Flatten)              (None, 100)           0           item_embedding[1][0]             
____________________________________________________________________________________________________
flatten_9 (Flatten)              (None, 100)           0           user_embedding[0][0]             
____________________________________________________________________________________________________
loss (Merge)                     (None, 1)             0           flatten_7[0][0]                  
                                                                   flatten_8[0][0]                  
                                                                   flatten_9[0][0]                  
====================================================================================================
Total params: 262700
____________________________________________________________________________________________________
None
AUC before training 0.50247407966

Run the model

Run for 10 epochs, checking the AUC after every epoch.

for epoch in range(num_epochs):

    print('Epoch %s' % epoch)

    # Sample triplets from the training data
    uid, pid, nid = data.get_triplets(train)

    X = {
        'user_input': uid,
        'positive_item_input': pid,
        'negative_item_input': nid
    }

    model.fit(X,
              np.ones(len(uid)),
              batch_size=64,
              nb_epoch=1,
              verbose=0,
              shuffle=True)

    print('AUC %s' % metrics.full_auc(model, test))
Epoch 0
AUC 0.905896400776
Epoch 1
AUC 0.908241780938
Epoch 2
AUC 0.909650205748
Epoch 3
AUC 0.910820451523
Epoch 4
AUC 0.912184845152
Epoch 5
AUC 0.912632057958
Epoch 6
AUC 0.91326604222
Epoch 7
AUC 0.913786881853
Epoch 8
AUC 0.914638438854
Epoch 9
AUC 0.915375014253

The AUC reaches about 0.91. At some point we start overfitting, so it would be a good idea to stop early or add some regularization.
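One simple way to stop early is to keep the epoch with the best test AUC and give up after a few epochs without improvement. The sketch below shows only that bookkeeping on a list of per-epoch AUC values; the function name, the `patience` parameter and the numbers are assumptions for illustration, not results from this notebook.

```python
def best_epoch_with_patience(auc_per_epoch, patience=2):
    """Return (best_epoch, best_auc), stopping after `patience`
    consecutive epochs without an improvement."""
    best_auc = float('-inf')
    best_epoch = -1
    waited = 0
    for epoch, auc in enumerate(auc_per_epoch):
        if auc > best_auc:
            best_auc, best_epoch, waited = auc, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs
    return best_epoch, best_auc

# Made-up AUC curve that peaks at epoch 1 and then degrades.
print(best_epoch_with_patience([0.80, 0.85, 0.84, 0.83, 0.90]))
```

In the training loop this would mean saving the model weights whenever the AUC improves and breaking out of the loop once the patience runs out.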

triplet_recommendations_keras's People

Contributors

maciejkula


triplet_recommendations_keras's Issues

Does not run with Keras 0.3.3

Hi,

I was testing the file with Keras 0.3.3, and it raises the following error:

Traceback (most recent call last):
  File "triplet_movielens.py", line 139, in <module>
    model = get_graph(num_users, num_items, 256)
  File "triplet_movielens.py", line 105, in get_graph
    model.compile(loss={'triplet_loss': identity_loss}, optimizer=Adam())#Adagrad(lr=0.1, epsilon=1e-06))
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 1265, in compile
    if self.outputs[self.output_order[0]].output_shape[-1] == 1:
TypeError: 'NoneType' object has no attribute '__getitem__'

Once I downgrade Keras to 0.3.2 and run the same file, I do not get any error.

Error in running triplet_movielens.py

I am trying to run triplet_movielens.py with the same movielens dataset and I get the following error.

Using TensorFlow backend.
Traceback (most recent call last):
File "triplet_movielens.py", line 80, in
model = build_model(num_users, num_items, latent_dim)
File "triplet_movielens.py", line 58, in build_model
output_shape=(1, ))
TypeError: 'module' object is not callable

I am stuck and I need help.

AttributeError: 'NoneType' object has no attribute 'inbound_nodes'

When I define a new loss function like this:

def batch_all_triplet_loss(X):
    # Get the pairwise distance matrix
    print('22')
    labels, embeddings = X
    print('1')
    margin = 1.0
    pairwise_dist = _pairwise_distances(embeddings, squared=False)
    # shape (batch_size, batch_size, 1)
    anchor_positive_dist = tf.expand_dims(pairwise_dist, 2)
    assert anchor_positive_dist.shape[2] == 1, "{}".format(anchor_positive_dist.shape)
    # shape (batch_size, 1, batch_size)
    anchor_negative_dist = tf.expand_dims(pairwise_dist, 1)
    assert anchor_negative_dist.shape[1] == 1, "{}".format(anchor_negative_dist.shape)

    # Compute a 3D tensor of size (batch_size, batch_size, batch_size)
    # triplet_loss[i, j, k] will contain the triplet loss of anchor=i, positive=j, negative=k
    # Uses broadcasting where the 1st argument has shape (batch_size, batch_size, 1)
    # and the 2nd (batch_size, 1, batch_size)
    triplet_loss = anchor_positive_dist - anchor_negative_dist + margin

    # Put to zero the invalid triplets
    # (where label(a) != label(p) or label(n) == label(a) or a == p)
    mask = _get_triplet_mask(labels)
    mask = tf.to_float(mask)
    triplet_loss = tf.multiply(mask, triplet_loss)

    # Remove negative losses (i.e. the easy triplets)
    triplet_loss = tf.maximum(triplet_loss, 0.0)

    # add my loss
    triplet_loss = tf.multiply(0.5,triplet_loss)

    # Count number of positive triplets (where triplet_loss > 0)
    valid_triplets = tf.to_float(tf.greater(triplet_loss, 1e-16))
    num_positive_triplets = tf.reduce_sum(valid_triplets)
    # num_valid_triplets = tf.reduce_sum(mask)
    # fraction_positive_triplets = num_positive_triplets / (num_valid_triplets + 1e-16)

    # Get final mean triplet loss over the positive valid triplets
    triplet_loss = tf.reduce_sum(triplet_loss) / (num_positive_triplets + 1e-16)
    # return triplet_loss, fraction_positive_triplets
    return triplet_loss

and merge it:

triplet_losses = merge([label, final_rmac_a],
                       mode=batch_all_triplet_loss,
                       name='loss',
                       output_shape=(1,))
rmac_model = Model(
    inputs=[image_a, roi_a],
    outputs=triplet_losses)

where label and final_rmac_a are defined as:

label = Input(shape=(batch_size,))
final_rmac_a = BatchNormalization()(rmac_a)

Why does this raise the error above? I guess it is due to the Keras version.

Test set triplet sample

Hi, thank you for sharing! You sample the triplets from the test set the same way you do from the training set. But in the real world, we can't know which items are positive or negative. How can we sample triplets from the test set?

Exception: ('Invalid merge mode:', 'join')

Hi,

When running with the latest Keras, I get the following message:

File "", line 1, in
model = get_graph(num_users, num_items, 256)

File "", line 15, in get_graph
merge_mode='join')

File "D:_devs\Python01\WinPython-64-2710\python-2.7.10.amd64\lib\site-packages\keras\legacy\models.py", line 241, in add_shared_node
raise Exception('Invalid merge mode:', merge_mode)

Exception: ('Invalid merge mode:', 'join')

Any ideas?

Thanks

Extract Embedding Model after training

So we have a (shared) embedding network that receives 1 input and outputs 1 embedding. We then build a triplet loss network around it that takes 3 inputs and outputs the loss, which we train to minimize. After we've trained our triplet loss network I'd like to use the embedding network without the triplet network. How would we access the "inner" embedding network? My goal is to be able to generate 1 embedding for 1 input.

Loss and output functions are different.

Hi, it's very nice to see a post on pairwise ranking loss for Keras. Could you help me understand the thinking behind the triplet loss definition? As a new user of Keras, I find it a little hard to understand why it works, since the output of the model is the "triplet loss" but the compiled model's loss is the "identity loss". Could you please give more information or some other references that could help me?

Issues using items metadata

Hi @maciejkula ,

I was trying to embed item metadata (genre, category, etc) into this implementation, but I can't figure out how to deal with the different dimensionality of the user and item embedding layers.

If I concatenate, say, an "item genre" embedding to the item embedding layer (as you suggested in a previously closed question), the dimensionality of the resulting tensor becomes incompatible with that of the user embedding layer.

I can do the trick of increasing the dimensionality of the user embedding layer to match the shape of the concatenated item one, but then I can't take the dot product of the item and user embedding layers to calculate the user's preferences.

I feel I'm missing something, but I can't figure out what.

Thanks,
Francesco

Both test loss and validation loss go to 0.5

I modified your code for my problem. I added regularization terms to the Embedding layers. But when I train the model, both test loss and val loss go to 0.5. I guess this is because both user and item latent vectors shrink to zero, which makes the bpr_triplet_loss always equal 0.5.

Do you have any idea why this happens?
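The guess about the 0.5 value can be checked numerically. This is a NumPy sketch of the `1 - sigmoid(positive score - negative score)` loss from the notebook, not the Keras code itself; `bpr_loss`, the random vectors and the scale factors are assumptions for illustration. It only confirms that vanishing latent vectors give a loss of 0.5; it does not show that regularization is what drives them there.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_loss(user, pos, neg):
    diff = np.sum(user * pos, axis=-1) - np.sum(user * neg, axis=-1)
    return 1.0 - sigmoid(diff)

rng = np.random.default_rng(0)
user, pos, neg = (rng.normal(size=(1000, 16)) for _ in range(3))

# Shrink all latent vectors towards zero and watch the mean loss.
for scale in (1.0, 0.1, 0.0):
    loss = bpr_loss(scale * user, scale * pos, scale * neg)
    print(scale, loss.mean())
```

With all vectors at zero, both scores are zero, the sigmoid evaluates to 0.5, and every sample contributes a loss of exactly 0.5.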

Fixing NaNs with built-in Sigmoid

Howdy. I found that using the BPR loss from your code as-is led to a slew of NaNs after a couple of epochs. I modified it as follows to use the built-in sigmoid, and it seems to have fixed it:

def bpr_triplet_loss(X):

    user_latent, item_latent = X.values()
    positive_item_latent, negative_item_latent = item_latent.values()

    # BPR loss
    loss = -K.sigmoid(K.sum(user_latent * positive_item_latent, axis=-1, keepdims=True)
                                - K.sum(user_latent * negative_item_latent, axis=-1, keepdims=True))

    return loss

Thanks for the nice example code!

Doubt about margin_triplet_loss

Hi! The code is a really good guide to using Keras for triplet prediction. But when I changed the loss function from bpr_triplet_loss to margin_triplet_loss, I found the following issue:

AUC before training 0.496249791422
Epoch 0
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.7926 - val_loss: 0.3801
AUC 0.801788456964
Inversions percentage 0.488023404644
Epoch 1
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.3477 - val_loss: 0.3517
AUC 0.806704502864
Inversions percentage 0.381605412324
Epoch 2
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.3233 - val_loss: 0.3500
AUC 0.784859436415
Inversions percentage 0.375022856098
Epoch 3
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.3044 - val_loss: 0.3491
AUC 0.762610735591
Inversions percentage 0.38142256354
Epoch 4
Train on 49906 samples, validate on 5469 samples
Epoch 1/1
3s - loss: 0.2818 - val_loss: 0.3493
AUC 0.746641303774
Inversions percentage 0.386725178278

The inversions percentage was INCREASING and the AUC was DECREASING. Why? Has the model been overfitting since epoch 3?

Please, I have a problem with your function "bpr_triplet_loss"...

I saw that you use "K.sum(user_latent * positive_item_latent, axis=-1, keepdims=True)", but in my Keras with the TensorFlow backend, this "*" means point-wise multiplication.
I think we should compute the cosine similarity between users and positive items, so why don't we use "K.batch_dot(K.l2_normalize(x,axis = -1),K.l2_normalize(y,axis=-1))"?
Thank you for your answer and help!
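For reference, a point-wise multiplication followed by a sum over the last axis is a row-wise dot product, and cosine similarity is the same quantity computed on L2-normalized vectors. The NumPy sketch below illustrates both; the arrays and the `l2_normalize` helper are made up for the example and stand in for the Keras backend calls quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)
user = rng.normal(size=(4, 3))
item = rng.normal(size=(4, 3))

# Point-wise multiply, then sum over the latent dimension.
elementwise_then_sum = np.sum(user * item, axis=-1, keepdims=True)

# Explicit row-wise dot product for comparison.
row_dots = np.einsum('ij,ij->i', user, item)[:, None]
assert np.allclose(elementwise_then_sum, row_dots)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Cosine similarity: the same dot product on unit-length vectors.
cosine = np.sum(l2_normalize(user) * l2_normalize(item),
                axis=-1, keepdims=True)
```

Whether to score with the raw dot product or with cosine similarity is a modelling choice: the raw dot product lets the vector norms carry information, while cosine similarity bounds every score to the range [-1, 1].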
