shiba24 / learning2rank
Learning to rank with neuralnet - RankNet and ListNet
I did the following:
$ git clone https://github.com/shiba24/learning2rank.git
$ python learning2rank/__init__.py
Traceback (most recent call last):
File "learning2rank/__init__.py", line 1, in <module>
import utils, rank, regression
File "/mnt/E4481D43481D1640/Various/books/ML-Course/pw/learning2rank/rank/__init__.py", line 1, in <module>
import ListNet, RankNet
ModuleNotFoundError: No module named 'ListNet'
It shows this error.
What should I do?
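A note on this error: the traceback is consistent with Python 3 having dropped implicit relative imports, so a bare `import ListNet` inside a package is looked up as a top-level module and not found. The sketch below builds two throwaway packages in a temporary directory to show the difference. The names `demo_pkg_implicit` and `demo_pkg_explicit` are hypothetical and not part of the repository.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, init_source):
    """Create a package `name` under `root` with one submodule, ListNet.py."""
    pkg_dir = os.path.join(root, name)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(init_source)
    with open(os.path.join(pkg_dir, "ListNet.py"), "w") as f:
        f.write("NAME = 'ListNet'\n")


root = tempfile.mkdtemp()
sys.path.insert(0, root)

# Python 2 style: the sibling module is imported by its bare name.
make_package(root, "demo_pkg_implicit", "import ListNet\n")
# Python 3 style: the leading dot makes the import relative to the package.
make_package(root, "demo_pkg_explicit", "from . import ListNet\n")
importlib.invalidate_caches()

try:
    importlib.import_module("demo_pkg_implicit")
    implicit_ok = True
except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
    implicit_ok = False
    print("implicit import failed:", exc)

explicit = importlib.import_module("demo_pkg_explicit")
print("explicit import worked:", explicit.ListNet.NAME)
```

If this is the cause, running the code under Python 2.7 or changing the package's `__init__.py` files to the explicit relative form would be the two obvious options. Running `__init__.py` directly as a script, as in the post, is also unusual: a package is normally imported, not executed.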
How is the loss function computed? Is a single sequence enough, or are several different sequences needed to get the loss?
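For reference, the listwise loss from the ListNet formulation is computed on one list (one query's documents) at a time, and the per-list losses are then averaged or summed over queries. Below is a minimal pure-Python sketch of the top-one variant: a softmax turns scores into a distribution, and the loss is the cross entropy between the label distribution and the predicted one. The function names are illustrative, and this is not necessarily how the repository computes its loss.

```python
import math


def softmax(scores):
    """Top-one probabilities: softmax over one query's list of scores."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_top1_loss(true_scores, pred_scores):
    """Cross entropy between the top-one distributions of labels and predictions."""
    p_true = softmax(true_scores)
    p_pred = softmax(pred_scores)
    return -sum(pt * math.log(pp) for pt, pp in zip(p_true, p_pred))


labels = [3.0, 1.0, 0.0]   # relevance of three documents for one query
good = [2.5, 0.8, -0.3]    # predicted scores that agree with the label order
bad = [-0.3, 0.8, 2.5]     # predicted scores in the reversed order

loss_good = listnet_top1_loss(labels, good)
loss_bad = listnet_top1_loss(labels, bad)
print(loss_good, loss_bad)
```

The list whose predicted order agrees with the labels gets the smaller loss, which is the signal the optimizer follows.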
I wrote the following code:
import numpy as np
import utils, rank, regression
from rank import RankNet
Model = RankNet.RankNet()
X = np.array([[1, 2], [2, 3], [4, 5], [1, 3], [0, 0]])
y = np.array([1, 2, 3, 4, 5])
Model.fit(X, y)
Model.predict(X)
and got the following error. Could you please help me with it?
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "/Users/igladush/opensource-projects/learning2rank/__init__.py", line 3, in <module>
import utils, rank, regression
File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "/Users/igladush/opensource-projects/learning2rank/rank/__init__.py", line 1, in <module>
import ListNet, RankNet
File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "/Users/igladush/opensource-projects/learning2rank/rank/ListNet.py", line 18, in <module>
from learning2rank.utils import plot_result
File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_bundle/pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "/Users/igladush/opensource-projects/learning2rank/rank/../../learning2rank/__init__.py", line 11, in <module>
Model.fit(X, y);
File "/Users/igladush/opensource-projects/learning2rank/rank/../../learning2rank/rank/RankNet.py", line 123, in fit
self.trainModel(train_X, train_y, validate_X, validate_y, n_iter)
File "/Users/igladush/opensource-projects/learning2rank/rank/../../learning2rank/rank/RankNet.py", line 108, in trainModel
train_ndcg = self.ndcg(y_train, train_score)
File "/Users/igladush/opensource-projects/learning2rank/rank/../../learning2rank/rank/RankNet.py", line 84, in ndcg
ideal_dcg += (2 ** y_true_sorted[i] - 1.) / np.log2(i + 2)
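The last frame of this traceback is inside the `ndcg` helper, and the exception message itself is missing from the post. One possible cause, stated here only as an assumption, is that the loop runs up to a fixed k even when the list is shorter, so `y_true_sorted[i]` is indexed past its end. A minimal pure-Python sketch of NDCG@k that caps k at the list length follows. The function names are illustrative, not the repository's.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain with the (2**rel - 1) / log2(rank + 1) form."""
    k = min(k, len(relevances))  # never index past the end of a short list
    return sum((2 ** relevances[i] - 1.0) / math.log2(i + 2) for i in range(k))


def ndcg_at_k(y_true, y_score, k=100):
    """NDCG@k for one list: DCG of the predicted order over DCG of the ideal order."""
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    ranked = [y_true[i] for i in order]
    ideal = sorted(y_true, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant item at all; NDCG is defined as 0 here
    return dcg_at_k(ranked, k) / ideal_dcg


y_true = [1, 2, 3, 4, 5]
perfect = ndcg_at_k(y_true, [0.1, 0.2, 0.3, 0.4, 0.5])   # predicted order matches the ideal order
reversed_order = ndcg_at_k(y_true, [0.5, 0.4, 0.3, 0.2, 0.1])
print(perfect, reversed_order)
```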
Thank you for your code. But sorry, I don't understand how to use it. Could you please explain how to set up the training data in the vectors X and y? Could you please provide more details?
I want to train ListNet to re-rank retrieved documents, but I got this error:
Traceback (most recent call last):
File "ranking/rank/train.py", line 88, in <module>
model.fit(X_train, y_train, X_test, y_test, Query, Query2, batchsize, n_epoch, n_hidden1, n_hidden2)
File "/home/ama/mhidy/nn_ranking/RankNet_chainer/ranking/rank/ListNet.py", line 193, in fit
self.trainModel(train_X, train_y, validate_X, validate_y, query, query_validate, n_epoch, batchsize)
File "/home/ama/mhidy/nn_ranking/RankNet_chainer/ranking/rank/ListNet.py", line 160, in trainModel
self.optimizer.update(self.model, x, t)
File "/home/ama/mhidy/.local/lib/python2.7/site-packages/chainer/optimizer.py", line 392, in update
loss = lossfun(*args, **kwds)
File "/home/ama/mhidy/nn_ranking/RankNet_chainer/ranking/rank/ListNet.py", line 61, in __call__
self.loss = self.listwise_cost(y, t)
File "/home/ama/mhidy/nn_ranking/RankNet_chainer/ranking/rank/ListNet.py", line 91, in listwise_cost
return - np.sum(self.topkprob(list_ans) * np.log(self.topkprob(list_pred)))
File "/home/ama/mhidy/nn_ranking/RankNet_chainer/ranking/rank/ListNet.py", line 85, in topkprob
vec_sort = np.sort(vec)[-1::-1]
File "/home/ama/mhidy/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 824, in sort
a = asanyarray(a).copy(order="K")
File "/home/ama/mhidy/.local/lib/python2.7/site-packages/numpy/core/numeric.py", line 533, in asanyarray
return array(a, dtype, copy=False, order=order, subok=True)
File "/home/ama/mhidy/.local/lib/python2.7/site-packages/chainer/functions/array/get_item.py", line 71, in get_item
return GetItem(slices)(x)
File "/home/ama/mhidy/.local/lib/python2.7/site-packages/chainer/function.py", line 189, in __call__
self._check_data_type_forward(in_data)
File "/home/ama/mhidy/.local/lib/python2.7/site-packages/chainer/function.py", line 271, in _check_data_type_forward
six.raise_from(
AttributeError: 'module' object has no attribute 'raise_from'
Hi
I am facing two problems when using ListNet from this code with the LETOR dataset.
Problem 1:
My loss does not seem to be decreasing. This is the situation at the start:
epoch: 2
NDCG@100 | train: 0.2016394568476937, test: 0.19944033792067814
and these values are the same at epoch 200:
train mean loss=0.0
test mean loss=0.0
epoch: 201
NDCG@100 | train: 0.2016394568476937, test: 0.19944033792067814
Can you please comment on why the loss isn't changing at all?
Problem 2:
Chainer emits RuntimeWarnings from its math log functions. Can you please tell me how to get rid of the following warnings?
..Anaconda3\lib\site-packages\chainer\functions\math\exponential.py:47: RuntimeWarning: divide by zero encountered in log
return utils.force_array(numpy.log(x[0])),
..Anaconda3\lib\site-packages\chainer\functions\math\exponential.py:47: RuntimeWarning: invalid value encountered in log
return utils.force_array(numpy.log(x[0])),
..Anaconda3\lib\site-packages\chainer\functions\math\basic_math.py:240: RuntimeWarning: invalid value encountered in multiply
return utils.force_array(x[0] * x[1]),
Thank you
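On Problem 2: these are the warnings NumPy prints when `log` receives zero, since `log(0)` is `-inf` and `0 * -inf` is `nan`. Below is a minimal sketch of the effect and of one common workaround, clipping probabilities to a small epsilon before taking the log. The `eps` value and the function names are assumptions for illustration, and whether this is also related to the constant loss in Problem 1 cannot be told from the post.

```python
import numpy as np


def naive_cross_entropy(p_true, p_pred):
    """Cross entropy with an unguarded log."""
    return -np.sum(p_true * np.log(p_pred))


def safe_cross_entropy(p_true, p_pred, eps=1e-12):
    """Clip predictions away from 0 before the log (eps is an assumed value)."""
    return -np.sum(p_true * np.log(np.clip(p_pred, eps, 1.0)))


p_true = np.array([0.0, 1.0])
p_pred = np.array([0.0, 1.0])   # identical to p_true, but contains an exact zero

# Silence the warnings for the demo; the unguarded result is still nan.
with np.errstate(divide="ignore", invalid="ignore"):
    naive = naive_cross_entropy(p_true, p_pred)

safe = safe_cross_entropy(p_true, p_pred)
print(naive, safe)
```

Silencing the warnings with `np.errstate` only hides the symptom. The clipped version is the one that keeps `nan` out of the loss.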
I get the score as:
[[nan]
[nan]
[nan]
[nan]
[nan]]
and I have changed the loss function to:
def ndcg(self, y_true, y_score, k=1):
RESTART: G:\Implementation of the Project\Learning to Rank Algorithms\Python\ListNet+RankNet\learning2rank-master\rank\ListNet.py
Traceback (most recent call last):
File "G:\Implementation of the Project\Learning to Rank Algorithms\Python\ListNet+RankNet\learning2rank-master\rank\ListNet.py", line 18, in <module>
from learning2rank.utils import NNfuncs
ModuleNotFoundError: No module named 'learning2rank'
I don't know how to set up the parameters for different queries.
Could you tell me how to install this package?
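None of the posts here mention a packaged installer, so one workable approach, offered as an assumption and not as the project's documented procedure, is to make the directory that contains the clone visible to Python. The sketch below demonstrates the mechanism with a throwaway package named `l2r_demo` (hypothetical) in a temporary directory: the import fails until the parent directory is added to `sys.path`.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for the directory that contains the cloned repository (hypothetical layout).
parent_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(parent_dir, "l2r_demo", "utils")
os.makedirs(pkg_dir)
open(os.path.join(parent_dir, "l2r_demo", "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("VALUE = 42\n")

# Without the parent directory on sys.path, the package cannot be found.
try:
    importlib.import_module("l2r_demo.utils")
    found_before = True
except ImportError:
    found_before = False

# Adding the parent directory (not the package directory itself) makes it importable.
sys.path.insert(0, parent_dir)
importlib.invalidate_caches()
utils = importlib.import_module("l2r_demo.utils")
print(found_before, utils.VALUE)
```

For a real clone, `parent_dir` would be the directory in which `git clone` was run. Setting the `PYTHONPATH` environment variable to that directory has the same effect.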
Hi,
thanks a lot for sharing the code!
I am trying to use ListNet to learn a ranking problem:
import numpy as np
import random
from learning2rank.rank import ListNet
n = 1000
d = 200
X = np.random.rand(n,d)
y = np.random.rand(n)
model = ListNet.ListNet()
random.seed(1313)
model.fit(X, y,
batchsize=16,
n_epoch=10,
n_units1=256,
n_units2=256,
tv_ratio=0.67,
optimizerAlgorithm="Adam",
savefigName="result.pdf",
savemodelName="ListNet.model")
However, I run into the following error messages.
Start training and validation loop......
epoch 1
0%| | 0/42 [00:00<?, ?it/s]
C:\ProgramData\Anaconda3\lib\site-packages\chainer\functions\math\exponential.py:51: RuntimeWarning: invalid value encountered in log
return utils.force_array(numpy.log(x[0])),
C:\ProgramData\Anaconda3\lib\site-packages\chainer\functions\activation\relu.py:38: RuntimeWarning: invalid value encountered in maximum
return utils.force_array(numpy.maximum(x, 0, dtype=x.dtype)),
C:\ProgramData\Anaconda3\lib\site-packages\learning2rank\rank\ListNet.py:58: RuntimeWarning: invalid value encountered in greater
ind = vec_true.data * vec_compare.data > 0
C:\ProgramData\Anaconda3\lib\site-packages\chainer\functions\activation\relu.py:97: RuntimeWarning: invalid value encountered in greater
Any idea what's going on here? I am clueless...
I tried a simple regression example,
but I received the following error message.
import sys, os
import numpy as np
from learning2rank.regression import NN
X = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
y = np.array([[1], [2], [3]])
Model = NN.NN()
Model.fit(X, y)
X = np.array([[1, 1, 1]])
Model.predict(X)
AssertionError Traceback (most recent call last)
in ()
11
12 X = np.array([[1, 1, 1]])
---> 13 Model.predict(X)
/root/learning2rank/utils/NNfuncs.pyc in predict(self, predict_X)
66
67 def predict(self, predict_X):
---> 68 return self.model.predict(predict_X.astype(np.float32))
69
70 # def predict(self, predict_X, batchsize=100):
/root/learning2rank/regression/NN.pyc in predict(self, x)
41
42 def predict(self, x):
---> 43 h1 = F.relu(self.l1(x))
44 h2 = F.relu(self.l2(h1))
45 h = F.relu(self.l3(h2))
/root/.pyenv/versions/anaconda3-5.2.0/envs/python27/lib/python2.7/site-packages/chainer/links/connection/linear.pyc in __call__(self, x)
63
64 """
---> 65 return linear.linear(x, self.W, self.b)
/root/.pyenv/versions/anaconda3-5.2.0/envs/python27/lib/python2.7/site-packages/chainer/functions/connection/linear.pyc in linear(x, W, b)
79 return LinearFunction()(x, W)
80 else:
---> 81 return LinearFunction()(x, W, b)
/root/.pyenv/versions/anaconda3-5.2.0/envs/python27/lib/python2.7/site-packages/chainer/function.pyc in __call__(self, *inputs)
100 in_data = tuple([x.data for x in inputs])
101 if self.type_check_enable:
--> 102 self._check_data_type_forward(in_data)
103 # Forward prop
104 with cuda.get_device(*in_data):
/root/.pyenv/versions/anaconda3-5.2.0/envs/python27/lib/python2.7/site-packages/chainer/function.pyc in _check_data_type_forward(self, in_data)
134
135 def _check_data_type_forward(self, in_data):
--> 136 in_type = type_check.get_types(in_data, 'in_types', False)
137 try:
138 self.check_type_forward(in_type)
/root/.pyenv/versions/anaconda3-5.2.0/envs/python27/lib/python2.7/site-packages/chainer/utils/type_check.pyc in get_types(data, name, accept_none)
44
45 info = TypeInfoTuple(
---> 46 _get_type(name, i, x, accept_none) for i, x in enumerate(data))
47 # I don't know a method to set an attribute in an initializer of tuple.
48 info.name = name
/root/.pyenv/versions/anaconda3-5.2.0/envs/python27/lib/python2.7/site-packages/chainer/utils/type_check.pyc in ((i, x))
44
45 info = TypeInfoTuple(
---> 46 _get_type(name, i, x, accept_none) for i, x in enumerate(data))
47 # I don't know a method to set an attribute in an initializer of tuple.
48 info.name = name
/root/.pyenv/versions/anaconda3-5.2.0/envs/python27/lib/python2.7/site-packages/chainer/utils/type_check.pyc in _get_type(name, index, array, accept_none)
58
59 assert(isinstance(array, numpy.ndarray) or
---> 60 isinstance(array, cuda.ndarray))
61 return Variable(TypeInfo(array.shape, array.dtype), var)
62
AssertionError:
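Reading this traceback: the framework collects `x.data` from every input and then asserts that each result is an ndarray. When `x` is already a raw NumPy array, `x.data` is the array's buffer object, not an ndarray, so the assertion fails. The sketch below reproduces that with NumPy alone. `FakeVariable` is a hypothetical stand-in for a wrapper type such as `chainer.Variable`, which is assumed here to keep the array in its `.data` attribute.

```python
import numpy as np


class FakeVariable:
    """Hypothetical stand-in for a wrapper whose .data attribute holds the array."""

    def __init__(self, array):
        self.data = array


def collect_input_data(inputs):
    # Mirrors the line in the traceback: in_data = tuple([x.data for x in inputs])
    return tuple(x.data for x in inputs)


x = np.array([[1, 1, 1]], dtype=np.float32)

raw = collect_input_data([x])[0]                    # ndarray.data is a buffer object
wrapped = collect_input_data([FakeVariable(x)])[0]  # the wrapper hands back the array

print(type(raw).__name__, isinstance(raw, np.ndarray))
print(type(wrapped).__name__, isinstance(wrapped, np.ndarray))
```

If that reading is right, wrapping the float32 array in `chainer.Variable` before it reaches the first layer in `predict` would avoid the assertion on this Chainer version.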
Do you put the query and document data into X? I can't understand how to input my query and document data to train the model.
Can you please guide me on how to do the installation?
The README does not mention anything about setup. I would appreciate it if you could share how to install it.
How to deal with many queries?
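In the usual learning-to-rank data layout, each row of X is the feature vector of one query-document pair, and a separate query id says which rows belong together. Listwise losses and NDCG are then computed inside each query's group. A minimal sketch of that grouping step follows. The toy data and the function name are hypothetical, and whether this repository's `fit` accepts query ids is not established by the posts here.

```python
from collections import defaultdict


def group_by_query(query_ids, features, labels):
    """Collect (feature_vector, label) pairs per query id, keeping input order."""
    groups = defaultdict(list)
    for qid, x, y in zip(query_ids, features, labels):
        groups[qid].append((x, y))
    return dict(groups)


# Hypothetical toy data: 5 query-document pairs over 2 queries, 2 features each.
query_ids = ["q1", "q1", "q1", "q2", "q2"]
features = [[0.9, 0.1], [0.4, 0.3], [0.1, 0.8], [0.7, 0.2], [0.2, 0.6]]
labels = [2, 1, 0, 1, 0]

groups = group_by_query(query_ids, features, labels)
for qid, docs in groups.items():
    print(qid, len(docs), [y for _, y in docs])
```

Each group can then be scored and evaluated on its own, and the per-query losses or NDCG values averaged.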
Hi
Please explain the input and output formats required for this. Is the output of RankNet a probability or a rank?
Please clarify.
Hi, I am implementing the model on the MovieLens dataset, and I am facing an issue with model training.
When I start training on the dataset, it generates the following error:
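In the RankNet formulation the network produces one real-valued score per item. A probability only appears for a pair of items, as the sigmoid of their score difference, and a rank is obtained afterwards by sorting the scores. A minimal pure-Python sketch of both steps follows. The scores are made-up values, and it is an assumption that this repository's `predict` returns raw scores.

```python
import math


def pairwise_probability(s_i, s_j):
    """Modelled probability that item i should rank above item j."""
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))


def ranks_from_scores(scores):
    """Rank 1 goes to the highest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


scores = [0.2, 1.7, -0.5]   # hypothetical model outputs for three items
p_1_over_0 = pairwise_probability(scores[1], scores[0])
ranks = ranks_from_scores(scores)
print(p_1_over_0, ranks)
```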
`InvalidType:
Invalid operation is performed in: LinearFunction (Forward)
Expect: x.shape[1] == W.shape[1]
Actual: 5 != 950198`
The complete output of model.fit(X,y) is as follows:
`load dataset
The number of data, train: 950198 validate: 50011
prepare initialized model!
0%| | 0/5000 [00:00<?, ?it/s]
InvalidType Traceback (most recent call last)
in
----> 1 model.fit(X,y)
C:/Users/ppawar/Desktop/Genesys_PDP_code/ml-1m/learning2rank/rank\RankNet.py in fit(self, fit_X, fit_y, batchsize, n_iter, n_units1, n_units2, tv_ratio, optimizerAlgorithm, savefigName, savemodelName)
131 self.initializeModel(Model, train_X, n_units1, n_units2, optimizerAlgorithm)
132
--> 133 self.trainModel(train_X, train_y, validate_X, validate_y, n_iter)
134
135 plot_result.acc(self.train_loss, self.test_loss, savename=savefigName)
C:/Users/ppawar/Desktop/Genesys_PDP_code/ml-1m/learning2rank/rank\RankNet.py in trainModel(self, x_train, y_train, x_test, y_test, n_iter)
111 y_j = chainer.Variable(y_train[j])
112
--> 113 self.optimizer.update(self.model, x_i, x_j, y_i, y_j)
114
115 if (step + 1) % loss_step == 0:
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\optimizer.py in update(self, lossfun, *args, **kwds)
678 if lossfun is not None:
679 use_cleargrads = getattr(self, '_use_cleargrads', True)
--> 680 loss = lossfun(*args, **kwds)
681 if use_cleargrads:
682 self.target.cleargrads()
C:/Users/ppawar/Desktop/Genesys_PDP_code/ml-1m/learning2rank/rank\RankNet.py in __call__(self, x_i, x_j, t_i, t_j)
35 )
36 def __call__(self, x_i, x_j, t_i, t_j):
---> 37 s_i = self.l3(F.relu(self.l2(F.relu(self.l1(x_i)))))
38 s_j = self.l3(F.relu(self.l2(F.relu(self.l1(x_j)))))
39 s_diff = s_i - s_j
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\link.py in __call__(self, *args, **kwargs)
240 if forward is None:
241 forward = self.forward
--> 242 out = forward(*args, **kwargs)
243
244 # Call forward_postprocess hook
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\links\connection\linear.py in forward(self, x, n_batch_axes)
136 in_size = functools.reduce(operator.mul, x.shape[1:], 1)
137 self._initialize_params(in_size)
--> 138 return linear.linear(x, self.W, self.b, n_batch_axes=n_batch_axes)
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\functions\connection\linear.py in linear(x, W, b, n_batch_axes)
286 args = x, W, b
287
--> 288 y, = LinearFunction().apply(args)
289 if n_batch_axes > 1:
290 y = y.reshape(batch_shape + (-1,))
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\function_node.py in apply(self, inputs)
243
244 if configuration.config.type_check:
--> 245 self._check_data_type_forward(in_data)
246
247 hooks = chainer.get_function_hooks()
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\function_node.py in _check_data_type_forward(self, in_data)
328 in_type = type_check.get_types(in_data, 'in_types', False)
329 with type_check.get_function_check_context(self):
--> 330 self.check_type_forward(in_type)
331
332 def check_type_forward(self, in_types):
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\functions\connection\linear.py in check_type_forward(self, in_types)
25 x_type.ndim == 2,
26 w_type.ndim == 2,
---> 27 x_type.shape[1] == w_type.shape[1],
28 )
29 if type_check.eval(n_in) == 3:
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\utils\type_check.py in expect(*bool_exprs)
544 for expr in bool_exprs:
545 assert isinstance(expr, Testable)
--> 546 expr.expect()
547
548
~\AppData\Local\Continuum\anaconda3\envs\tensorflow_gpu_keras\lib\site-packages\chainer\utils\type_check.py in expect(self)
481 raise InvalidType(
482 '{0} {1} {2}'.format(self.lhs, self.exp, self.rhs),
--> 483 '{0} {1} {2}'.format(left, self.inv, right))
484
485
InvalidType:
Invalid operation is performed in: LinearFunction (Forward)
Expect: x.shape[1] == W.shape[1]
Actual: 5 != 950198
`
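The message says that a batch with 5 columns reached a layer whose weight matrix expects 950198 inputs, which is exactly the number of training rows reported in the log. That suggests the input size was taken from the number of rows instead of the number of features, although the post does not show how X was built, so this is a guess. A small check like the one below, run before `fit`, at least rules out the usual suspects: a non-array container, a 1-D X, or mismatched lengths. The function name is illustrative.

```python
import numpy as np


def check_training_arrays(X, y):
    """Return (X, y) as float32 arrays after basic shape checks."""
    X = np.asarray(X, dtype=np.float32)
    y = np.asarray(y, dtype=np.float32).ravel()
    if X.ndim != 2:
        raise ValueError("X must be 2-D (n_samples, n_features), got shape %s" % (X.shape,))
    if X.shape[0] != y.shape[0]:
        raise ValueError("X has %d rows but y has %d labels" % (X.shape[0], y.shape[0]))
    return X, y


# Hypothetical toy data: 4 samples with 5 features each.
X, y = check_training_arrays(np.random.rand(4, 5), [1, 0, 1, 0])
print(X.shape, X.dtype, y.shape)
```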
Hi @betterenvi
thank you for your modification of this repo!
I looked through your repository (and commits), and think it is a nice change. Could you send a PR to this master branch, if it doesn't bother you?
Thank you for reading!
Can you give an example? @shiba24
Hi,
Thanks so much for making your code available online!
I had a question: does your approach work if the y's are almost binary (very close to 0 or very close to 1)? Because I tried it and when I did
from learning2rank.rank import RankNet, ListNet
Model = RankNet.RankNet()
Model.fit(X, y)
predy = Model.predict(X)
np.min(predy),np.max(predy)
I got 0.0, 0.0.
My X data consist of 6 features (float rankings of objects according to 6 different approaches), and about 100,000 rows. The y's are close to either 0 or 1, depending on whether the objects appeared in a gold standard dataset. I am not sure if the code is designed to work for this type of setup?
Thank you!