
Comments (15)

ibayer avatar ibayer commented on August 17, 2024

mcmc.fit_predict returns the average prediction over all MCMC samples. This

    y_pred[y_pred > 5] = 5
    y_pred[y_pred < 1] = 1

modifies the returned array in place, so the model ends up averaging the clipped predictions.

Better use copies in your comparison.

    import numpy as np
    from sklearn.metrics import mean_squared_error

    # inside the warm-start loop; `i` is the loop counter
    y_pred = fm.fit_predict(X_train, y_train, X_test, n_more_iter=10)
    y_pred_tmp = np.copy(y_pred)
    y_pred_tmp[y_pred_tmp > 5] = 5
    y_pred_tmp[y_pred_tmp < 1] = 1
    print(i, np.sqrt(mean_squared_error(y_pred_tmp, y_test)))
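
To illustrate the aliasing concern: if fit_predict handed back a reference to an internal buffer rather than a copy, clipping in place would mutate the model's state. A minimal numpy sketch (the `internal` buffer here is hypothetical, just to show the mechanism):

```python
import numpy as np

# Hypothetical internal prediction buffer that the model keeps reusing.
internal = np.array([0.5, 3.0, 6.2])

y_pred = internal            # returned without a copy: same memory
y_pred[y_pred > 5] = 5       # in-place clip also changes `internal`
assert internal[2] == 5.0

internal = np.array([0.5, 3.0, 6.2])
y_pred = np.copy(internal)   # clip a copy instead
y_pred[y_pred > 5] = 5
assert internal[2] == 6.2    # the buffer is left untouched
```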

from fastfm.

merrellb avatar merrellb commented on August 17, 2024

Are you saying that the clipping of the predictions actually impacts the underlying model? I tried using the copy as you suggest and I seem to get the same results.

Even if this is a potential problem how does this relate to the issue at hand? I don't see how this explains the discrepancy between running the model 100 times with a step of 1 vs running it 10 times with a step of 10.


ibayer avatar ibayer commented on August 17, 2024

Are you saying that the clipping of the predictions actually impacts the underlying model?

Yes, for mcmc the (n+1)-th prediction depends on the n-th prediction. So if you change the n-th prediction by clipping...

I don't see how this explains the discrepancy between running the model 100 times with a step of 1 vs running it 10 times with a step of 10.

Well, the mean over 100 clipped predictions doesn't have to be the same as the mean over 10 clipped predictions, even if the underlying number of iterations is the same.

You could simply run the experiment without clipping as a test; then you know whether clipping causes differences or not.
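
A toy numeric illustration of the point (not fastFM itself): when clipping feeds back into a running mean, clipping after every iteration and clipping after every block of 10 iterations can end at different values, even though the total iteration count is the same. The alternating draws here are chosen so that clipping actually kicks in:

```python
import numpy as np

# Per-iteration "predictions" for one test point (100 iterations).
draws = np.tile([8.0, 2.0], 50)

def running_mean(draws, clip_every):
    """Running mean, clipped to [1, 5] after every `clip_every` iterations."""
    avg = 0.0
    for n, x in enumerate(draws, 1):
        avg += (x - avg) / n
        if n % clip_every == 0:
            avg = min(max(avg, 1.0), 5.0)
    return avg

per_step  = running_mean(draws, clip_every=1)   # 100 runs of 1 iteration
per_block = running_mean(draws, clip_every=10)  # 10 runs of 10 iterations
print(per_step, per_block)  # approximately 4.97 vs 5.0: the schedules differ
```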

I can look into this if you narrow the issue down and provide a Short, Self-Contained, Correct Example.


merrellb avatar merrellb commented on August 17, 2024

I am not really seeing much (if any) difference by clipping the copy vs the original predictions. I had assumed the predictions provided by fit_predict were already a copy and not a mutable reference to the internal predictions used by the model.

Here is an example I've created. Please let me know if it meets your needs. Thanks!

https://gist.github.com/merrellb/1d5e1b9c2c2c03870c4d


ibayer avatar ibayer commented on August 17, 2024

Thanks for the example.

  1. Remove the clipping (since it doesn't make a difference).
  2. Reduce the dataset size as much as possible (a small artificial example is best); ideally something
    one can check by hand. You could also try sklearn.datasets.make_regression.
  3. Can you do the same comparison with the als solver? The mcmc solver reuses much of the als code,
    so if this is a bug in the warm-start code it should also be visible in an als warm-start comparison, and als is much easier to debug.
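
For point 2, a tiny synthetic dataset via sklearn.datasets.make_regression might look like this (the sizes are arbitrary; fastFM's solvers expect scipy-sparse input, so the dense matrix is converted):

```python
import scipy.sparse as sp
from sklearn.datasets import make_regression

# Tiny artificial regression problem, small enough to reason about.
X, y = make_regression(n_samples=20, n_features=5, noise=0.1, random_state=42)
X = sp.csc_matrix(X)                  # sparse format fastFM works with

X_train, X_test = X[:15], X[15:]      # simple 15/5 split
y_train, y_test = y[:15], y[15:]
print(X_train.shape, X_test.shape)    # (15, 5) (5, 5)
```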


merrellb avatar merrellb commented on August 17, 2024
  1. Done
  2. I will see if I can get the same issue to show up using make_regression. If not I will see if I can pare Movielens down small enough to embed in the example
  3. So far I have not been able to recreate this with the als solver. Ten steps of 1 seem to yield the same result as one step of 10.

Could the same reasons that we are unable to separate fit and predict with mcmc (seeming loss of state?) be impacting warm starts? I must admit I don't have a great understanding of mcmc.


ibayer avatar ibayer commented on August 17, 2024

I can reproduce your results. Great, a really small example can even go into the test suite later. Good to know that als works; that narrows things down. Can you compare against a restart too?

Could the same reasons that we are unable to separate fit and predict (seeming loss of state?) be impacting warm starts?

That's likely. I'll take a closer look later.


merrellb avatar merrellb commented on August 17, 2024

Can you explain in relative layman's terms the differences with mcmc that prevent good results when splitting fit and predict? It seems that if we have enough state to warm-start the algorithm we should have enough state to predict independently of fitting.

I'm not quite sure what you mean by "restart". I modified my script to explore the one other variation I see, and the results were unremarkable. The slightly modified script and its output illustrating this are at:

https://gist.github.com/merrellb/8086a166916a7353f896


ibayer avatar ibayer commented on August 17, 2024

Can you explain in relative layman's terms the differences with mcmc that prevent good results when splitting fit and predict? It seems that if we have enough state to warm-start the algorithm we should have enough state to predict independently of fitting.

MCMC returns the mean over the predictions at every iteration. The model parameters themselves are too expensive to keep, so the mean is calculated on a running basis.
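
The running-mean idea can be sketched in a few lines: the standard incremental update mean_n = mean_{n-1} + (x_n - mean_{n-1}) / n reproduces the batch mean without ever storing past samples, which is why no per-iteration model state survives for a separate predict step:

```python
import numpy as np

draws = np.array([2.0, 4.0, 3.5, 5.0])  # per-iteration predictions

# Incremental update: mean_n = mean_{n-1} + (x_n - mean_{n-1}) / n
mean = 0.0
for n, x in enumerate(draws, 1):
    mean += (x - mean) / n

assert np.isclose(mean, draws.mean())   # matches the batch mean (3.625)
```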

I'm not quite sure what you mean by "restart". I modified my script to explore the one other variation I see and the results were unremarkable.

You got it right, it's your "coldstart".

The slightly modified script and its output illustrating this is at:

Great, I think I now know what's wrong!
The formula for the mean update is only correct for n_more_iter=1.

See:
https://github.com/ibayer/fastFM-core/blob/master/src/ffm_als_mcmc.c#L255
https://github.com/ibayer/fastFM-core/blob/master/src/ffm_utils.c#L54
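
Purely as a toy illustration (hypothetical, not the actual fastFM-core code; the fix merged later in this thread shows the first guess about the cause was wrong): one way a chunked running mean can break under warm starts is if the sample counter resets with each chunk, so earlier chunks are effectively forgotten:

```python
import numpy as np

rng = np.random.default_rng(1)
draws = rng.normal(3.0, 1.0, size=100)   # per-iteration predictions

def chunked_running_mean(draws, chunk, reset_counter):
    """Running mean computed in warm-start chunks.

    If reset_counter is True, the sample counter restarts with each
    chunk, so the final value only reflects the last chunk.
    """
    mean, n = 0.0, 0
    for start in range(0, len(draws), chunk):
        if reset_counter:
            n = 0
        for x in draws[start:start + chunk]:
            n += 1
            mean += (x - mean) / n
    return mean

correct = chunked_running_mean(draws, chunk=10, reset_counter=False)
buggy = chunked_running_mean(draws, chunk=10, reset_counter=True)

# With the reset, the "mean" collapses to the mean of the last chunk.
print(correct, buggy)
```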

PR welcome, I won't have time to fix this before the weekend.


ibayer avatar ibayer commented on August 17, 2024

Oh, and there is this ToDo:
https://github.com/ibayer/fastFM-core/blob/67e49ea292d1dedacb591c5ba74e81b2f1acbb75/src/ffm_als_mcmc.c#L259


merrellb avatar merrellb commented on August 17, 2024

That is great you've narrowed down the issue, although I am a bit confused that "the formula for the mean updates is only correct for n_more_iter=1". Isn't n_more_iter=1 the value that my "hot" example uses to illustrate the problem?

Wish I could help on the C side of things but my skills are more than a decade rusty. Definitely happy to help where I can on the Python side of things.


ibayer avatar ibayer commented on August 17, 2024

I have just merged a fix for the bad mcmc performance that you observed when using the n_more_iter parameter. My earlier guess about the cause of this issue was wrong. Please let me know if the PR fixes your issue.
New binaries are available:
https://github.com/ibayer/fastFM/releases/tag/v0.2.4


merrellb avatar merrellb commented on August 17, 2024

My initial testing of the OSX Python 3.5 wheel seems to confirm the fix (some random variation but hot, warm and cold all perform similarly). Thanks!

One very minor issue is that the filename doesn't quite match up with the Github version tag.


ibayer avatar ibayer commented on August 17, 2024

Great, I have fixed the file names too.


ibayer avatar ibayer commented on August 17, 2024

Can we close this issue?

