LightFM

A Python implementation of LightFM, a hybrid recommendation algorithm.

The LightFM model incorporates both item and user metadata into the traditional matrix factorization algorithm. It represents each user and item as the sum of the latent representations of their features, thus allowing recommendations to generalise to new items (via item features) and to new users (via user features).
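
As a minimal sketch of this idea (not the library's internal code), the snippet below builds user and item representations by summing feature embeddings with NumPy, then scores a pair as the dot product of the two representations plus the summed feature biases. All array names, sizes, and random values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
no_components = 4

# Hypothetical learned parameters: one embedding row and one bias per feature.
user_feature_embeddings = rng.normal(size=(3, no_components))  # 3 user features
item_feature_embeddings = rng.normal(size=(5, no_components))  # 5 item features
user_feature_biases = rng.normal(size=3)
item_feature_biases = rng.normal(size=5)


def represent(feature_weights, embeddings, biases):
    """Return (latent vector, bias) as the weighted sum over active features."""
    feature_weights = np.asarray(feature_weights, dtype=float)
    return feature_weights @ embeddings, float(feature_weights @ biases)


def score(user_feature_weights, item_feature_weights):
    """Dot product of the two representations plus both biases."""
    user_vec, user_bias = represent(
        user_feature_weights, user_feature_embeddings, user_feature_biases)
    item_vec, item_bias = represent(
        item_feature_weights, item_feature_embeddings, item_feature_biases)
    return float(user_vec @ item_vec) + user_bias + item_bias


# A brand-new item with no interactions still gets a score, because it is
# described only through item features 1 and 4.
known_user = [1, 0, 1]
new_item = [0, 1, 0, 0, 1]
print(score(known_user, new_item))
```

When every user and item has only its own indicator feature, the sums contain a single term each and the sketch reduces to ordinary matrix factorization.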

The details of the approach are described in the LightFM paper, available on arXiv.

The model can be trained using four methods:

  • logistic loss: useful when both positive (1) and negative (-1) interactions are present.
  • BPR: Bayesian Personalised Ranking [1] pairwise loss. Maximises the prediction difference between a positive example and a randomly chosen negative example. Useful when only positive interactions are present and optimising ROC AUC is desired.
  • WARP: Weighted Approximate-Rank Pairwise [2] loss. Maximises the rank of positive examples by repeatedly sampling negative examples until a rank-violating one is found. Useful when only positive interactions are present and optimising the top of the recommendation list (precision@k) is desired.
  • k-OS WARP: k-th order statistic loss [3]. A modification of WARP that uses the k-th positive example for any given user as a basis for pairwise updates.
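
The two pairwise ideas above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the margin of 1, the rank estimate `(number of items - 1) // trials`, the logarithmic weight, and the names `bpr_loss` and `warp_sample` are all choices made for this example.

```python
import math
import random


def bpr_loss(positive_score, negative_score):
    """BPR pairwise loss: -log(sigmoid(positive - negative))."""
    diff = positive_score - negative_score
    # log(1 + exp(-diff)) is the same quantity as -log(sigmoid(diff)).
    return math.log1p(math.exp(-diff))


def warp_sample(scores, positive_item, negative_items, rng, max_trials=10):
    """Draw negatives until one violates the margin, WARP style.

    Returns (negative_item, weight), or None when no violator is found
    within max_trials. Finding a violator quickly suggests the positive
    item is ranked poorly, so the update weight is larger.
    """
    for trial in range(1, max_trials + 1):
        negative_item = rng.choice(negative_items)
        if scores[negative_item] > scores[positive_item] - 1.0:  # assumed margin
            estimated_rank = (len(scores) - 1) // trial
            return negative_item, math.log(max(1, estimated_rank))
    return None


rng = random.Random(0)
scores = [0.2, 2.5, 0.1, 1.9, 0.3]  # hypothetical model scores for 5 items
print(bpr_loss(scores[1], scores[2]))
print(warp_sample(scores, positive_item=1, negative_items=[0, 2, 3, 4], rng=rng))
```

BPR always updates on the first sampled negative, whereas the WARP-style loop spends extra sampling effort to find negatives that actually outrank the positive, which is why it tends to focus on the top of the list.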

Two learning rate schedules are implemented:

  • adagrad: [4]
  • adadelta: [5]
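
For intuition, here is a minimal sketch of a per-parameter adagrad update [4] on a toy objective. The function name `adagrad_step`, the default learning rate, and the placement of `epsilon` are assumptions made for illustration and are not taken from the lightfm source.

```python
import math


def adagrad_step(weight, gradient, squared_gradient_sum,
                 learning_rate=0.05, epsilon=1e-6):
    """Apply one adagrad update and return (new weight, new accumulator)."""
    squared_gradient_sum += gradient * gradient
    # The effective step shrinks as squared gradients accumulate.
    step = learning_rate * gradient / math.sqrt(squared_gradient_sum + epsilon)
    return weight - step, squared_gradient_sum


weight, accumulator = 1.0, 0.0
for _ in range(3):
    gradient = 2.0 * weight  # gradient of the toy objective weight ** 2
    weight, accumulator = adagrad_step(weight, gradient, accumulator)
    print(weight)
```

adadelta [5] follows the same per-parameter idea but replaces the ever-growing sum with decaying averages, so the step size does not shrink monotonically.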

Installation

Install from PyPI using pip: pip install lightfm.

Note for OSX users: due to its use of OpenMP, lightfm does not compile under Clang. To install it, you will need a reasonably recent version of gcc (from Homebrew for instance). This should be picked up by setup.py; if it is not, please open an issue.

Usage

Model fitting is very straightforward.

Create a model instance with the desired latent dimensionality:

from lightfm import LightFM

model = LightFM(no_components=30)

Assuming train is a (no_users, no_items) sparse matrix (with 1s denoting positive, and -1s negative interactions), you can fit a traditional matrix factorization model by calling

model.fit(train, epochs=20)

This will train a traditional MF model, as no user or item features have been supplied.
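
If your interactions start out as a list of records, one way to build such a train matrix is with scipy.sparse. The sketch below assumes hypothetical (user_id, item_id, value) triples with contiguous integer ids starting at 0; the float32 dtype is also an assumption of this example.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical raw data: (user_id, item_id, value) with +1 / -1 feedback.
raw_interactions = [(0, 0, 1), (0, 2, -1), (1, 1, 1), (2, 0, 1), (2, 3, -1)]
no_users, no_items = 3, 4

# Split the triples into three parallel arrays.
user_ids, item_ids, values = (np.array(column) for column in zip(*raw_interactions))

train = sp.coo_matrix(
    (values.astype(np.float32), (user_ids, item_ids)),
    shape=(no_users, no_items),
)
print(train.toarray())
```

Passing the shape explicitly keeps users or items without any interactions in the matrix, so row and column indices continue to line up with your ids.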

To get predictions, call model.predict:

predictions = model.predict(test_user_ids, test_item_ids)

User and item features can be incorporated into training by passing them into the fit method. Assuming user_features is a (no_users, no_user_features) sparse matrix (and similarly for item_features), you can call

model.fit(train,
          user_features=user_features,
          item_features=item_features,
          epochs=20)
predictions = model.predict(test_user_ids,
                            test_item_ids,
                            user_features=user_features,
                            item_features=item_features)

to train the model and obtain predictions.
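
A feature matrix of this shape can be assembled with scipy.sparse. The sketch below is one possible layout, not a requirement of the library: it uses hypothetical genre tags as item metadata and prepends an identity block so that each item keeps its own indicator feature alongside the shared tags.

```python
import numpy as np
import scipy.sparse as sp

no_items = 4
# Hypothetical item metadata: each entry lists the genre tags of one item.
item_tags = [["comedy"], ["drama", "romance"], ["comedy", "romance"], ["drama"]]

# Map every distinct tag to a column index.
all_tags = sorted({tag for tags in item_tags for tag in tags})
tag_index = {tag: column for column, tag in enumerate(all_tags)}

rows, cols = [], []
for item_id, tags in enumerate(item_tags):
    for tag in tags:
        rows.append(item_id)
        cols.append(tag_index[tag])

metadata = sp.csr_matrix(
    (np.ones(len(rows), dtype=np.float32), (rows, cols)),
    shape=(no_items, len(tag_index)),
)

# Identity block first (one indicator per item), then the shared tag columns.
item_features = sp.hstack(
    [sp.identity(no_items, dtype=np.float32, format="csr"), metadata],
    format="csr",
)
print(item_features.shape)  # (no_items, no_items + number of distinct tags)
```

Dropping the identity block leaves items described by tags alone, which forces items with identical tags to share one representation; whether that is desirable depends on your data.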

Both training and prediction can employ multiple cores for speed:

model.fit(train, epochs=20, num_threads=4)
predictions = model.predict(test_user_ids, test_item_ids, num_threads=4)

This implementation uses asynchronous stochastic gradient descent [6] for training. This can lead to lower accuracy when the interaction matrix (or either feature matrix) is very dense and a large number of threads is used. In practice, however, training on a sparse dataset with 20 threads does not lead to a measurable loss of accuracy.

In an implicit feedback setting, the BPR, WARP, or k-OS WARP loss functions can be used. If train is a sparse matrix with positive entries representing positive interactions, the model can be trained as follows:

model = LightFM(no_components=30, loss='warp')
model.fit(train, epochs=20)
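
To check whether a ranking loss improves the top of the recommendation list, you can compute precision@k from predicted scores. The helper below is a plain-Python sketch with a hypothetical name and signature; `scores` stands in for one user's predictions over all items, and `exclude` lets you skip items already seen in training.

```python
def precision_at_k(scores, positive_items, k=10, exclude=()):
    """Fraction of the k highest-scoring items that are known positives."""
    excluded = set(exclude)
    candidates = [item for item in range(len(scores)) if item not in excluded]
    top_k = sorted(candidates, key=lambda item: scores[item], reverse=True)[:k]
    positives = set(positive_items)
    return sum(item in positives for item in top_k) / k


scores = [0.1, 0.9, 0.4, 0.8, 0.3]  # hypothetical scores for 5 items
print(precision_at_k(scores, positive_items=[1, 2], k=2))  # 0.5
```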

Examples

Check the examples directory for more examples.

The Movielens example shows how to use lightfm on the Movielens dataset, both with and without using movie metadata. Another example compares the performance of the adagrad and adadelta learning schedules.

Development

Pull requests are welcome. To install for development:

  1. Clone the repository: git clone git@github.com:lyst/lightfm.git
  2. Install it for development using pip: cd lightfm && pip install -e .
  3. You can run tests by running python setup.py test.

When making changes to the .pyx extension files, you'll need to run python setup.py cythonize in order to produce the extension .c files before running pip install -e ..

References

[1] Rendle, Steffen, et al. "BPR: Bayesian personalized ranking from implicit feedback." Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2009.

[2] Weston, Jason, Samy Bengio, and Nicolas Usunier. "Wsabie: Scaling up to large vocabulary image annotation." IJCAI. Vol. 11. 2011.

[3] Weston, Jason, Hector Yee, and Ron J. Weiss. "Learning to rank recommendations with the k-order statistic loss." Proceedings of the 7th ACM conference on Recommender systems. ACM, 2013.

[4] Duchi, John, Elad Hazan, and Yoram Singer. "Adaptive subgradient methods for online learning and stochastic optimization." The Journal of Machine Learning Research 12 (2011): 2121-2159.

[5] Zeiler, Matthew D. "ADADELTA: An adaptive learning rate method." arXiv preprint arXiv:1212.5701 (2012).

[6] Recht, Benjamin, et al. "Hogwild: A lock-free approach to parallelizing stochastic gradient descent." Advances in Neural Information Processing Systems. 2011.
