Comments (7)
Hey @mijamo -- this is great feedback, thank you for taking the time!
In general, I think you're absolutely right about the documentation. I'm assigning this to myself to add more.
At a glance, getting very large positive/negative predictions is expected behavior for certain loss functions. Which loss function are you using? Only RMSE loss functions will try to reproduce the input interaction values, so if your loss isn't RMSE then the predictions are unbounded.
Having the items in similar order for many users tends to happen when the dataset has some items which are far more popular than others. A good way to correct for this is through selection of an appropriate loss function. I'd recommend using BalancedWMRBLossGraph.
Regarding your other questions, I will elaborate on them in the documentation. To answer quickly for you here:
Features can be any value.
The expected output depends on the loss function, but is unbounded. If you're using a learning-to-rank loss function, such as WMRB, the system is optimizing for output ranks, not output prediction values.
The system learns in general (using the default prediction graphs and a learning-to-rank loss) by using the dot product of the user_representation and item_representation as a prediction value. These predictions are then compared against the interactions, a loss is calculated, and that loss is propagated back through the representation graphs.
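A rough NumPy sketch of that default prediction step (illustrative shapes and random stand-in representations, not tensorrec's actual graph code):

```python
import numpy as np

# Hypothetical learned representations: 10 users and 9 items embedded in
# a shared 5-dimensional latent space.
rng = np.random.default_rng(0)
user_representation = rng.normal(size=(10, 5))
item_representation = rng.normal(size=(9, 5))

# The default prediction is the dot product of each user's representation
# with each item's representation: a (n_users, n_items) score matrix.
predictions = user_representation @ item_representation.T

# With a learning-to-rank loss, only the ordering within each row matters,
# not the raw score values.
ranks = (-predictions).argsort(axis=1)
```

During training, the loss compares these scores against the interactions and its gradient flows back through both representation graphs.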
from tensorrec.
Great suggestion -- I added the mixture of tastes and attention systems after reading this paper: https://arxiv.org/abs/1711.08379
I'll add better documentation for it. It probably also merits a blog post outlining the thinking.
Thank you very much for your additional detail.
My problem right now is that each predicted user representation seems to be more or less equal despite the input data being different. To be more precise: if user1 has representation R, all the other users seem to have a representation xR, where x is a number between 0 and 10. As a result, the item predictions for all the users are always in the same order, no matter which loss function I use.
This has been true when using LinearRepresentationGraph or ReLURepresentationGraph for the users. Using NormalizedLinearRepresentationGraph gives me a similar result, but there x is between 0.99 and 1.01, which I guess makes sense since it is normalized.
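A quick way to confirm that the user representations really are scalar multiples of one vector is to check their pairwise cosine similarities (the synthetic data below reproduces the symptom; in practice you would pass the model's actual user representations):

```python
import numpy as np

# Synthetic stand-in for the learned user representations: every row is
# a positive scalar multiple of the same base vector, matching the
# "user_i = x * R" symptom described above.
base = np.array([0.3, -1.2, 0.8, 0.5])
user_repr = np.outer(np.linspace(0.5, 5.0, 10), base)

# Normalize rows to unit length; the Gram matrix of the unit rows is the
# matrix of pairwise cosine similarities. Collinear rows give |cos| ~ 1.
unit = user_repr / np.linalg.norm(user_repr, axis=1, keepdims=True)
cosine = unit @ unit.T
print(np.abs(cosine).min())  # ~1.0 when all rows are parallel
```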
This is what I made as a dummy user feature matrix (a simplification of the real dataset that still produces the same issue):
[[1.0e+01 0.0e+00 2.0e+04 1.0e+00]
[5.0e+01 0.0e+00 1.5e+05 1.0e+00]
[3.0e+01 0.0e+00 4.5e+04 1.0e+00]
[3.0e+01 0.0e+00 4.2e+04 1.0e+00]
[4.0e+01 0.0e+00 6.0e+04 1.0e+00]
[8.0e+00 1.0e+00 1.2e+04 0.0e+00]
[1.5e+01 1.0e+00 2.0e+04 0.0e+00]
[2.5e+01 1.0e+00 2.3e+04 0.0e+00]
[1.4e+01 1.0e+00 1.8e+04 0.0e+00]
[6.0e+00 1.0e+00 8.0e+03 0.0e+00]]
This is the items matrix:
[[5000. 60. 150. 18. 59.3 ]
[4500. 150. 400. 18.1 59.8 ]
[3500. 40. 100. 17.9 58.9 ]
[4200. 200. 200. 18.15 59.5 ]
[3300. 125. 450. 18.08 59.015]
[2300. 60. 150. 11.9 57.7 ]
[2500. 250. 300. 11.8 57.6 ]
[2600. 300. 1000. 11.85 57.55 ]
[2200. 50. 150. 11.98 57.96 ]]
and finally the interaction matrix:
(0, 0) 1
(0, 2) 1
(1, 1) 1
(1, 4) 1
(9, 5) 1
(9, 8) 1
(7, 6) 1
(7, 7) 1
(6, 6) 1
(6, 7) 1
(4, 1) 1
(4, 4) 1
(3, 1) 1
(3, 4) 1
I tried different representation graphs and loss functions. They give me different prediction orders, but in every case the predictions are the same for all users.
My gut feeling is that there is something wrong with my input data, but I don't really know what right now.
The algorithm may be having difficulty because the item features are all nearly parallel. You may get better results by using a multi-layer neural network item representation (you can construct one using AbstractKerasRepresentationGraph) or, more easily, by normalizing the item features.
For example:
import numpy as np
import scipy.stats as st
item_features = np.array(
[[5000, 60, 150, 18, 59.3, ],
[4500, 150, 400, 18.1, 59.8, ],
[3500, 40, 100, 17.9, 58.9, ],
[4200, 200, 200, 18.15, 59.5, ],
[3300, 125, 450, 18.08, 59.015,],
[2300, 60, 150, 11.9, 57.7, ],
[2500, 250, 300, 11.8, 57.6, ],
[2600, 300, 1000, 11.85, 57.55, ],
[2200, 50, 150, 11.98, 57.96, ]]
)
norm_item_features = st.zscore(item_features, axis=0)
This yields new item features that are not parallel:
array([[ 1.70332997, -0.86065224, -0.64808756, 0.8791161 , 0.84503229],
[ 1.18890146, 0.14241008, 0.2926847 , 0.91175655, 1.44152567],
[ 0.16004443, -1.08355498, -0.83624201, 0.84647565, 0.36783759],
[ 0.88024435, 0.69966693, -0.45993311, 0.92807677, 1.08362964],
[-0.04572698, -0.13621834, 0.48083916, 0.90522846, 0.50503106],
[-1.07458401, -0.86065224, -0.64808756, -1.1119513 , -1.06374653],
[-0.8688126 , 1.25692378, -0.0836242 , -1.14459175, -1.18304521],
[-0.7659269 , 1.81418063, 2.55053813, -1.12827153, -1.24269455],
[-1.17746971, -0.97210361, -0.64808756, -1.08583894, -0.75356997]])
If you give that a shot, let me know if it works for you!
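To quantify "nearly parallel", here is a quick check of the minimum pairwise cosine similarity before and after z-scoring (my own diagnostic, not part of tensorrec):

```python
import numpy as np
from scipy.stats import zscore

item_features = np.array(
    [[5000, 60, 150, 18, 59.3],
     [4500, 150, 400, 18.1, 59.8],
     [3500, 40, 100, 17.9, 58.9],
     [4200, 200, 200, 18.15, 59.5],
     [3300, 125, 450, 18.08, 59.015],
     [2300, 60, 150, 11.9, 57.7],
     [2500, 250, 300, 11.8, 57.6],
     [2600, 300, 1000, 11.85, 57.55],
     [2200, 50, 150, 11.98, 57.96]]
)

def min_pairwise_cosine(x):
    # Normalize each row to unit length; the Gram matrix of the unit rows
    # is then the matrix of pairwise cosine similarities.
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return (unit @ unit.T).min()

print(min_pairwise_cosine(item_features))          # > 0.9: rows nearly parallel
print(min_pairwise_cosine(zscore(item_features)))  # negative: rows now point apart
```

The raw rows are dominated by the large first column, so they all point in almost the same direction; z-scoring puts every column on the same scale.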
One additional documentation request would be elaborating on n_tastes. Is this supposed to ferret out multiple representations for the same user? For instance, in the case when multiple people share a Netflix account?
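For what it's worth, my understanding of the mixture-of-tastes idea from the paper linked above is that each user gets several representation vectors, and an item's score is taken across all of them. A rough NumPy sketch of a max-based variant (my assumption for illustration, not tensorrec's actual graph):

```python
import numpy as np

n_tastes, n_users, n_items, dims = 3, 4, 6, 5
rng = np.random.default_rng(1)

# One representation per taste per user, e.g. to capture the different
# viewers sharing one Netflix account.
user_tastes = rng.normal(size=(n_tastes, n_users, dims))
item_repr = rng.normal(size=(n_items, dims))

# Score every item against every taste, then keep each user's best
# taste's score for each item.
per_taste_scores = user_tastes @ item_repr.T   # (n_tastes, n_users, n_items)
predictions = per_taste_scores.max(axis=0)     # (n_users, n_items)
```

Whether tensorrec combines tastes with a max, a sum, or attention weights I'd defer to the docs; this just shows the shape of the idea.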
Here is an example of setting up an RMSE cost function in TensorFlow, which can be used to keep predictions near the interaction values: https://stackoverflow.com/questions/33846069/how-to-set-rmse-cost-function-in-tensorflow
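For reference, RMSE is just the square root of the mean squared difference between predictions and interaction values, which is why an RMSE-trained model's predictions stay near those values. A minimal NumPy illustration (not the TensorFlow graph from the linked answer):

```python
import numpy as np

predictions = np.array([0.9, 0.1, 0.8, 0.3])
interactions = np.array([1.0, 0.0, 1.0, 0.0])

# RMSE penalizes predictions that drift away from the interaction values,
# so minimizing it bounds the outputs around those values.
rmse = np.sqrt(np.mean((predictions - interactions) ** 2))
print(rmse)  # ~0.194
```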
Hi @jfkirk
I am having the same problem as @mijamo. I have applied your suggestions, such as different loss functions and normalizing the matrices, but nothing has changed. The prediction is identical for every item, so the ranks just come out in sequential order:
model.predict(user_features=user_features[0], item_features=item_features[0:5]) result: [[0.02865639, 0.02865639, 0.02865639, 0.02865639, 0.02865639]]
model.predict_rank(user_features=user_features[0], item_features=item_features[0:5]) result: [1, 2, 3, 4, 5]
However, there is an interesting point: my items have no additional features. I just want to use user_features and interactions, so I created a dummy item_features matrix like item_features = [[0], [0], [0], [0], [0], [0], [0], [0]].
If I randomly create an item_features matrix instead, the predictions become meaningful. Is there a way to produce recommendations without item_features?
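One pattern worth noting (an assumption on my part; check the tensorrec examples): when items have no metadata, indicator features, i.e. an identity matrix, give each item its own learnable representation. An all-zeros feature matrix maps every item to the same representation under a linear graph, which would explain the identical scores.

```python
import numpy as np
from scipy import sparse

n_items = 8

# With item_features = [[0]] * n_items, every item looks identical to the
# model, so every item receives the same score. Indicator features give
# each item its own row, and therefore its own learnable representation.
item_indicator_features = sparse.identity(n_items, format="csr")
print(item_indicator_features.shape)  # (8, 8)
```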