maciejkula / sbr-rs Goto Github PK


123.0 5.0 7.0 946 KB

Deep recommender systems for Rust

License: MIT License

Rust 100.00%

rust recommender-systems deep-learning machine-learning

sbr-rs's Introduction

sbr

An implementation of sequence recommenders based on the wyrm autodifferentiation library.

sbr-rs

sbr implements efficient recommender algorithms which operate on sequences of items: given previous items a user has interacted with, the model will recommend the items the user is likely to interact with in the future.

Implemented models:

LSTM: a model that uses an LSTM network over the sequence of a user's interactions to predict their next action;
EWMA: a model that uses a simpler exponentially-weighted average of past actions to predict future interactions.

Which model performs the best will depend on your dataset. The EWMA model is much quicker to fit, and will probably be a good starting point.
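To make the EWMA idea concrete, here is a minimal sketch of how an exponentially weighted average can turn an interaction history into scores. It is an illustration only and not sbr's implementation: the function names, the single fixed `alpha`, and the dot-product scoring are assumptions chosen for brevity.

```rust
/// Folds a sequence of item embeddings into one user vector using an
/// exponentially weighted moving average, so recent items weigh more.
/// `alpha` is a hypothetical fixed decay in [0, 1]; a trained model would
/// typically learn its parameters rather than hard-code them.
fn ewma_state(history: &[Vec<f32>], alpha: f32) -> Vec<f32> {
    let dim = history.first().map_or(0, |v| v.len());
    let mut state = vec![0.0_f32; dim];
    for item in history {
        for (s, x) in state.iter_mut().zip(item.iter()) {
            // Decay the old state and blend in the newest item.
            *s = alpha * *s + (1.0 - alpha) * *x;
        }
    }
    state
}

/// Scores a candidate item as the dot product between the user vector
/// and the candidate's embedding (higher means more likely).
fn score(state: &[f32], candidate: &[f32]) -> f32 {
    state.iter().zip(candidate.iter()).map(|(a, b)| a * b).sum()
}
```

With a history of `[1.0, 0.0]` followed by `[0.0, 1.0]` and `alpha = 0.5`, the state becomes `[0.25, 0.5]`, so an item resembling the most recent interaction scores higher than one resembling the older interaction.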

Example

You can fit a model on the Movielens 100K dataset in about 10 seconds:

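// Imports the snippet relies on: `Instant` for timing and
// `SeedableRng` for the `from_seed` constructor.
use std::time::Instant;
use rand::SeedableRng;

let mut data = sbr::datasets::download_movielens_100k().unwrap();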

let mut rng = rand::XorShiftRng::from_seed([42; 16]);

let (train, test) = sbr::data::user_based_split(&mut data, &mut rng, 0.2);
let train_mat = train.to_compressed();
let test_mat = test.to_compressed();

println!("Train: {}, test: {}", train.len(), test.len());

let mut model = sbr::models::lstm::Hyperparameters::new(data.num_items(), 32)
    .embedding_dim(32)
    .learning_rate(0.16)
    .l2_penalty(0.0004)
    .lstm_variant(sbr::models::lstm::LSTMVariant::Normal)
    .loss(sbr::models::Loss::WARP)
    .optimizer(sbr::models::Optimizer::Adagrad)
    .num_epochs(10)
    .rng(rng)
    .build();

let start = Instant::now();
let loss = model.fit(&train_mat).unwrap();
let elapsed = start.elapsed();
let train_mrr = sbr::evaluation::mrr_score(&model, &train_mat).unwrap();
let test_mrr = sbr::evaluation::mrr_score(&model, &test_mat).unwrap();

println!(
    "Train MRR {} at loss {} and test MRR {} (in {:?})",
    train_mrr, loss, test_mrr, elapsed
);
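The example reports MRR (mean reciprocal rank). As a rough guide to what that number means, the sketch below computes the textbook definition from raw scores. The helper names are hypothetical, and how `sbr::evaluation::mrr_score` treats ties or previously seen items is not stated here, so treat this as an approximation of the metric rather than a reimplementation.

```rust
/// Reciprocal rank of the held-out item `target`, given one score per item
/// (higher is better). Rank 1 means no other item scored strictly higher.
fn reciprocal_rank(scores: &[f32], target: usize) -> f32 {
    let rank = 1 + scores.iter().filter(|&&s| s > scores[target]).count();
    1.0 / rank as f32
}

/// Mean reciprocal rank over (scores, held-out item index) pairs.
/// Returns 0.0 for an empty input to avoid dividing by zero.
fn mean_reciprocal_rank(cases: &[(Vec<f32>, usize)]) -> f32 {
    if cases.is_empty() {
        return 0.0;
    }
    let total: f32 = cases
        .iter()
        .map(|(scores, target)| reciprocal_rank(scores, *target))
        .sum();
    total / cases.len() as f32
}
```

An MRR of 1.0 would mean the held-out item was always ranked first; a value of 0.5 corresponds to it being ranked second on average in reciprocal terms.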

License: MIT

sbr-rs's People

Forkers

happy-ferret mindis timedcy scaevola stanxii baajur guccialex johnkinyanjui

sbr-rs's Issues

Not fully backpropagated error

Running the example given in the README:

use std::time::Instant;

use rand;
use rand::prng::XorShiftRng;
use rand::SeedableRng;
use sbr;

fn main() {
    let mut data = sbr::datasets::download_movielens_100k().unwrap();

    let mut rng = XorShiftRng::from_seed([42; 16]);

    let (train, test) = sbr::data::user_based_split(&mut data, &mut rng, 0.2);
    let train_mat = train.to_compressed();
    let test_mat = test.to_compressed();

    println!("Train: {}, test: {}", train.len(), test.len());

    let mut model = sbr::models::lstm::Hyperparameters::new(data.num_items(), 32)
        .embedding_dim(32)
        .learning_rate(0.16)
        .l2_penalty(0.0004)
        .lstm_variant(sbr::models::lstm::LSTMVariant::Normal)
        .loss(sbr::models::Loss::WARP)
        .optimizer(sbr::models::Optimizer::Adagrad)
        .num_epochs(10)
        .rng(rng)
        .build();

    let start = Instant::now();
    let loss = model.fit(&train_mat).unwrap();
    let elapsed = start.elapsed();
    let train_mrr = sbr::evaluation::mrr_score(&model, &train_mat).unwrap();
    let test_mrr = sbr::evaluation::mrr_score(&model, &test_mat).unwrap();

    println!(
        "Train MRR {} at loss {} and test MRR {} (in {:?})",
        train_mrr, loss, test_mrr, elapsed
    );
}

I get the following error:

Train: 82948, test: 17052
thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9

Unable to fit model when num_threads > number of logical cores on machine

Hi, I ran into an issue around multithreading support when running goodbooks-recommender (https://maciejkula.github.io/2018/07/27/recommending-books-with-rust/)

When instantiating hyperparameters with num_threads greater than the number of cores on my CPU, the task never completes, and goodbooks-recommender's CPU usage drops to 0 almost immediately.

After some println!()-driven debugging, I narrowed the issue down to this section of sbr: https://github.com/maciejkula/sbr-rs/blob/master/src/models/sequence_model.rs#L101-L168

I think the issue is caused by a combination of the following factors:

work in the optimizer (from wyrm) is synchronized with barriers (https://github.com/maciejkula/wyrm/blob/master/src/optim/mod.rs#L81);
rayon uses a thread pool to allocate work (https://github.com/rayon-rs/rayon/blob/master/FAQ.md#how-does-rayon-balance-work-between-threads) and defaults to running n threads, where n is the number of logical cores (https://github.com/rayon-rs/rayon/blob/master/FAQ.md#how-many-threads-will-rayon-spawn);
there will be m synchronized optimizers, where m is the number of threads specified (https://github.com/maciejkula/wyrm/blob/12715ae99ca531db6557dca786e4a480ec608101/src/optim/mod.rs#L100-L102), and work cannot start until their barriers are synchronized;
unfortunately, when num_threads is set higher than the number of logical cores, they will not all be synchronized, as rayon will only run n (number of cores) pieces of work at a time.

Inserting a println!() before and after this line in wyrm
(https://github.com/maciejkula/wyrm/blob/12715ae99ca531db6557dca786e4a480ec608101/src/optim/mod.rs#L81) illustrates this issue.

...
println!("pre barrier sync");
let _barrier = self.barrier_guard.synchronize();
println!("post barrier sync");
...

With thread count set to 4 all is fine and I see repeated "pre barrier sync", "post barrier sync" messages. If I set thread count to 5 I see 4 "pre barrier sync" messages followed by nothing, and the program hangs.
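The hang described above can be reproduced in isolation. The sketch below does not use rayon or wyrm: it assumes a hand-rolled pool of `workers` threads pulling `jobs` tasks from a queue, with every task waiting on a barrier sized for `jobs` participants. The function name and the timeout-based detection are illustrative choices, not part of either library.

```rust
use std::sync::{mpsc, Arc, Barrier, Mutex};
use std::thread;
use std::time::Duration;

/// Runs `jobs` tasks on a fixed pool of `workers` threads. Each task waits on
/// a barrier that expects `jobs` participants, mimicking m synchronized
/// optimizers on an n-thread pool. Returns true if every task finished before
/// `timeout` elapsed. On a hang, the blocked workers are simply left behind.
fn all_tasks_finish(workers: usize, jobs: usize, timeout: Duration) -> bool {
    let barrier = Arc::new(Barrier::new(jobs));
    let (job_tx, job_rx) = mpsc::channel::<usize>();
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (done_tx, done_rx) = mpsc::channel::<usize>();

    for _ in 0..workers {
        let job_rx = Arc::clone(&job_rx);
        let barrier = Arc::clone(&barrier);
        let done_tx = done_tx.clone();
        thread::spawn(move || loop {
            // Take the next queued job; exit once the queue is closed and empty.
            let job = match job_rx.lock().unwrap().recv() {
                Ok(job) => job,
                Err(_) => break,
            };
            // Blocks until `jobs` tasks are waiting at the same time, which
            // can never happen if only `workers < jobs` tasks run concurrently.
            barrier.wait();
            let _ = done_tx.send(job);
        });
    }

    for job in 0..jobs {
        job_tx.send(job).unwrap();
    }
    drop(job_tx);
    drop(done_tx);

    // Every job must report completion within the timeout.
    (0..jobs).all(|_| done_rx.recv_timeout(timeout).is_ok())
}
```

Calling `all_tasks_finish(4, 4, Duration::from_secs(1))` completes, while `all_tasks_finish(4, 5, Duration::from_secs(1))` times out: four workers sit at the barrier waiting for a fifth task that no free thread is left to pick up, which matches the 4-versus-5 behaviour reported above.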

I'm not sure how to solve this, but hopefully this report is helpful nonetheless.
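One possible workaround on the caller's side, offered as an untested assumption rather than a fix in sbr or wyrm, is to cap the requested thread count before passing it to the hyperparameters. The helper below is hypothetical and relies only on the standard library; note that the core count it reports may differ from the pool size rayon actually uses (for example, when that is overridden through configuration).

```rust
use std::thread;

/// Caps a requested thread count at the number of logical cores reported by
/// the standard library, and never returns less than 1. Falls back to a
/// single thread if the core count cannot be determined.
fn clamp_threads(requested: usize) -> usize {
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    requested.clamp(1, cores)
}
```

The clamped value would then be the one supplied as num_threads, so that the number of synchronized optimizers never exceeds the number of tasks the pool can run at once.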
