This repository contains a Python binding for an adaptation of the code from the Google Research IALS repository. The code implements the model described in the paper "iALS++: Speeding up Matrix Factorization with Subspace Optimization".
iALS++ is an efficient matrix factorization algorithm for collaborative filtering that speeds up training through subspace optimization. This project provides a Python binding to the C++ implementation of iALS++, making it easier to integrate into Python-based data processing and machine learning workflows.
- Efficient matrix factorization using iALS++.
- Python binding for easy integration with Python projects.
- Functions to save and load the trained model.
- Evaluation metrics for recommendation performance.
Make sure to install the necessary dependencies:
For macOS:
brew install eigen
brew install nlohmann-json
For Debian-based Linux:
sudo apt update
sudo apt install libeigen3-dev nlohmann-json3-dev
Finally, install the package directly from the GitHub repository:
pip install git+https://github.com/issilva5/ialspp_python.git
import ialspp
Create a Dataset object from your data file. The data must have two columns: user id and item id.
train_data = ialspp.Dataset('path/to/train.csv')
test_train_data = ialspp.Dataset('path/to/test_train.csv')
test_test_data = ialspp.Dataset('path/to/test_test.csv')
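The two-column input files can be prepared in many ways. The sketch below is one possibility and is not part of the package. It assumes that ids should be zero-based contiguous integers (which is what the use of `max_user() + 1` suggests), that the file is comma-separated, and that it starts with a header line. All three are assumptions to verify against the loader. `build_id_map` and `write_interactions` are hypothetical helper names.

```python
import csv


def build_id_map(raw_ids):
    """Map arbitrary raw ids to dense zero-based integers, in order of first appearance."""
    mapping = {}
    for raw in raw_ids:
        if raw not in mapping:
            mapping[raw] = len(mapping)
    return mapping


def write_interactions(path, interactions, header=("user", "item")):
    """Write (user, item) pairs as a two-column CSV and return (num_users, num_items).

    Assumption: comma-separated with one header line; pass header=None to omit it.
    """
    interactions = list(interactions)
    user_map = build_id_map(u for u, _ in interactions)
    item_map = build_id_map(i for _, i in interactions)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        for u, i in interactions:
            writer.writerow([user_map[u], item_map[i]])
    return len(user_map), len(item_map)
```

In practice the same id mappings would need to be shared across the training and test files so that a given user or item keeps the same integer id everywhere.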
recommender = ialspp.IALSppRecommender(
    embedding_dim=16,
    num_users=train_data.max_user() + 1,
    num_items=train_data.max_item() + 1,
    regularization=0.0001,
    regularization_exp=1.0,
    unobserved_weight=0.1,
    stddev=0.1,
    block_size=128
)
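To give some intuition for how `regularization`, `regularization_exp`, and `unobserved_weight` can interact, the iALS papers describe a frequency-scaled L2 penalty in which each user's weight is `regularization * (n_u + unobserved_weight * num_items) ** regularization_exp`, where `n_u` is that user's number of interactions (and symmetrically for items). The sketch below only evaluates that formula. Whether this binding applies it in exactly this form is an assumption, and `scaled_regularization` is a hypothetical helper, not part of the package.

```python
def scaled_regularization(regularization, regularization_exp,
                          unobserved_weight, num_observed, num_other_side):
    """Frequency-scaled L2 weight for one user (or item) embedding.

    num_observed:   number of interactions of this user (or item).
    num_other_side: total number of items (or users).
    Assumption: follows the formula from the iALS papers, not read from this package.
    """
    scale = num_observed + unobserved_weight * num_other_side
    return regularization * scale ** regularization_exp
```

With `regularization_exp=0.0` the scaling disappears and every embedding gets the same weight. With `regularization_exp=1.0`, as in the example above, users or items with more interactions are regularized more strongly.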
p = recommender.Train(train_data)
print(recommender.ComputeLosses(train_data, p)) # Print losses information
metrics = recommender.EvaluateDataset(test_train_data, test_test_data.by_user())
print(f"Rec20={metrics[0]:.4f}, Rec50={metrics[1]:.4f}, NDCG100={metrics[2]:.4f}")
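For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute Recall@K and NDCG@K for a single user from a ranked list and a set of held-out items. It normalizes by `min(k, number of held-out items)`, a convention used in several recommender benchmarks. The exact definitions used by `EvaluateDataset` are not stated here, so treat this as an illustration. `recall_at_k` and `ndcg_at_k` are hypothetical helpers.

```python
import math


def recall_at_k(ranked_items, held_out, k):
    """Hits in the top k, normalized by min(k, number of held-out items)."""
    held_out = set(held_out)
    if not held_out:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in held_out)
    return hits / min(k, len(held_out))


def ndcg_at_k(ranked_items, held_out, k):
    """Binary-relevance NDCG: discounted gain of hits divided by the ideal gain."""
    held_out = set(held_out)
    if not held_out:
        return 0.0
    # A hit at zero-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked_items[:k])
              if item in held_out)
    # Ideal ranking places all held-out items (up to k) at the top.
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(k, len(held_out))))
    return dcg / ideal
```

A dataset-level score would then be the average of the per-user values over all evaluated users.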
recommender.SaveModel('path/to/model.bin')
loaded_recommender = ialspp.LoadModel('path/to/model.bin')
For more detailed examples and use cases, please refer to the examples directory in the repository.
We welcome contributions! Please read our contributing guidelines to get started.
This project is licensed under the Apache 2.0 License. See the LICENSE file for more details.