tensorflow / ranking

Learning to Rank in TensorFlow

License: Apache License 2.0

ranking machine-learning deep-learning information-retrieval learning-to-rank recommender-systems


TensorFlow Ranking

TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform. It contains the following components:

  • Commonly used loss functions, including pointwise, pairwise, and listwise losses.
  • Commonly used ranking metrics, such as Mean Reciprocal Rank (MRR) and Normalized Discounted Cumulative Gain (NDCG).
  • Multi-item (also known as groupwise) scoring functions.
  • The LambdaLoss framework for direct ranking-metric optimization.
  • Unbiased Learning-to-Rank from biased feedback data.

We envision that this library will provide a convenient open platform for hosting and advancing state-of-the-art ranking models based on deep learning techniques, and thus facilitate both academic research and industrial applications.

Tutorial Slides

TF-Ranking was presented at two premier Information Retrieval conferences, SIGIR 2019 and ICTIR 2019! The slides are available here.

Demos

We provide a demo, with no installation required, to get started with TF-Ranking. The demo runs in a Colaboratory notebook, an interactive Python environment: Using sparse features and embeddings in TF-Ranking (Run in Google Colab). It shows how to:

  • Use sparse/embedding features
  • Process data in TFRecord format
  • Use TensorBoard integration in a Colab notebook, with the Estimator API

Also see Running Scripts for executable scripts.

Linux Installation

Stable Builds

To install the latest version from PyPI, run the following:

# Installing with the `--upgrade` flag ensures you'll get the latest version.
pip install --user --upgrade tensorflow_ranking

To force a Python 3-specific install, replace pip with pip3 in the above commands. For additional installation help, guidance on installing prerequisites, and (optionally) setting up virtual environments, see the TensorFlow installation guide.

Note: TensorFlow is now included as a dependency of the TensorFlow Ranking package (see setup.py). If you wish to use a different version of TensorFlow (e.g., tensorflow-gpu), you may need to uninstall the existing version and then install your desired version:

$ pip uninstall tensorflow
$ pip install tensorflow-gpu

Installing from Source

  1. To build TensorFlow Ranking locally, you will need to install:

    • Bazel, an open source build tool.

      $ sudo apt-get update && sudo apt-get install bazel
    • Pip, a Python package manager.

      $ sudo apt-get install python-pip
    • VirtualEnv, a tool to create isolated Python environments.

      $ pip install --user virtualenv
  2. Clone the TensorFlow Ranking repository.

    $ git clone https://github.com/tensorflow/ranking.git
  3. Build the TensorFlow Ranking wheel file and store it in the /tmp/ranking_pip folder.

    $ cd ranking  # The folder which was cloned in Step 2.
    $ bazel build //tensorflow_ranking/tools/pip_package:build_pip_package
    $ bazel-bin/tensorflow_ranking/tools/pip_package/build_pip_package /tmp/ranking_pip
  4. Install the wheel package using pip. Test in a virtualenv to avoid clashes with any system dependencies.

    $ ~/.local/bin/virtualenv -p python3 /tmp/tfr
    $ source /tmp/tfr/bin/activate
    (tfr) $ pip install /tmp/ranking_pip/tensorflow_ranking*.whl

    In some cases, you may want to install a specific version of TensorFlow, e.g., tensorflow-gpu or tensorflow==2.0.0. To do so, run either

    (tfr) $ pip uninstall tensorflow
    (tfr) $ pip install tensorflow==2.0.0

    or

    (tfr) $ pip uninstall tensorflow
    (tfr) $ pip install tensorflow-gpu
  5. Run all TensorFlow Ranking tests.

    (tfr) $ bazel test //tensorflow_ranking/...
  6. Invoke TensorFlow Ranking package in python (within virtualenv).

    (tfr) $ python -c "import tensorflow_ranking"

Running Scripts

For ease of experimentation, we also provide a TFRecord example and a LIBSVM example in the form of executable scripts. This is particularly useful for hyperparameter tuning, where the hyperparameters are supplied as flags to the script.

TFRecord Example

  1. Set up the data and directory.

    MODEL_DIR=/tmp/tf_record_model && \
    TRAIN=tensorflow_ranking/examples/data/train_elwc.tfrecord && \
    EVAL=tensorflow_ranking/examples/data/eval_elwc.tfrecord && \
    VOCAB=tensorflow_ranking/examples/data/vocab.txt
  2. Build and run.

    rm -rf $MODEL_DIR && \
    bazel build -c opt \
    tensorflow_ranking/examples/tf_ranking_tfrecord_py_binary && \
    ./bazel-bin/tensorflow_ranking/examples/tf_ranking_tfrecord_py_binary \
    --train_path=$TRAIN \
    --eval_path=$EVAL \
    --vocab_path=$VOCAB \
    --model_dir=$MODEL_DIR \
    --data_format=example_list_with_context

LIBSVM Example

  1. Set up the data and directory.

    OUTPUT_DIR=/tmp/libsvm && \
    TRAIN=tensorflow_ranking/examples/data/train.txt && \
    VALI=tensorflow_ranking/examples/data/vali.txt && \
    TEST=tensorflow_ranking/examples/data/test.txt
  2. Build and run.

    rm -rf $OUTPUT_DIR && \
    bazel build -c opt \
    tensorflow_ranking/examples/tf_ranking_libsvm_py_binary && \
    ./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary \
    --train_path=$TRAIN \
    --vali_path=$VALI \
    --test_path=$TEST \
    --output_dir=$OUTPUT_DIR \
    --num_features=136 \
    --num_train_steps=100

TensorBoard

Training results such as loss and metrics can be visualized using TensorBoard.

  1. (Optional) If you are working on a remote server, set up port forwarding with this command.

    $ ssh <remote-server> -L 8888:127.0.0.1:8888
  2. Install TensorBoard and invoke it with the following commands.

    (tfr) $ pip install tensorboard
    (tfr) $ tensorboard --logdir $OUTPUT_DIR

Jupyter Notebook

An example Jupyter notebook is available in tensorflow_ranking/examples/handling_sparse_features.ipynb.

  1. To run this notebook, first follow the steps in installation to set up a virtualenv environment with the tensorflow_ranking package installed.

  2. Install Jupyter within the virtualenv.

    (tfr) $ pip install jupyter
  3. Start a Jupyter notebook instance on the remote server.

    (tfr) $ jupyter notebook tensorflow_ranking/examples/handling_sparse_features.ipynb \
            --NotebookApp.allow_origin='https://colab.research.google.com' \
            --port=8888
  4. (Optional) If you are working on a remote server, set up port forwarding with this command.

    $ ssh <remote-server> -L 8888:127.0.0.1:8888
  5. Run the notebook.

    • Open http://localhost:8888/ on your local machine and browse to the Jupyter notebook.

    • Alternatively, use a Colaboratory notebook via colab.research.google.com and open the notebook in the browser. Choose a local runtime and connect to port 8888.

References

  • Rama Kumar Pasumarthi, Sebastian Bruch, Xuanhui Wang, Cheng Li, Michael Bendersky, Marc Najork, Jan Pfeifer, Nadav Golbandi, Rohan Anil, Stephan Wolf. TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank. KDD 2019.

  • Qingyao Ai, Xuanhui Wang, Sebastian Bruch, Nadav Golbandi, Michael Bendersky, Marc Najork. Learning Groupwise Scoring Functions Using Deep Neural Networks. ICTIR 2019.

  • Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. Learning to Rank with Selection Bias in Personal Search. SIGIR 2016.

  • Xuanhui Wang, Cheng Li, Nadav Golbandi, Mike Bendersky, Marc Najork. The LambdaLoss Framework for Ranking Metric Optimization. CIKM 2018.

Citation

If you use TensorFlow Ranking in your research and would like to cite it, we suggest you use the following citation:

@inproceedings{TensorflowRankingKDD2019,
   author = {Rama Kumar Pasumarthi and Sebastian Bruch and Xuanhui Wang and Cheng Li and Michael Bendersky and Marc Najork and Jan Pfeifer and Nadav Golbandi and Rohan Anil and Stephan Wolf},
   title = {TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank},
   booktitle = {Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
   year = {2019},
   pages = {2970--2978},
   location = {Anchorage, AK}
}


Issues

Unbiased Learning-to-Rank from biased feedback data

The README references the work from Joachims in "Unbiased Learning-to-Rank from biased feedback data." I was expecting to see some type of IPS implementation in the code, but I can't find any reference to the work. What am I missing?

Model export

Could you please give a code example of how to export a model for TensorFlow Serving? No luck with estimator.export_saved_model or tf.estimator.BestExporter. I must be doing something wrong with feature_spec.

feeding data too slow

Hi, I have a large training set, so I use libsvm_generator to construct my dataset. But the CPU usage is really low. Do you have any API or method like "Parallelize Data Transformation" in the TensorFlow Data Input Pipeline Performance guide, so I can feed data in parallel?

How does batch normalization work on the input

There is an example with batch normalization, but I am still confused about the process.

First, the input tensor has the shape [list size, num features]. Is a different normalization applied to each individual element of the [list size, num features] matrix? I think we should normalize over feature columns, not rows.

Second, there are padding elements filled with zeros. It looks like they will contribute to the normalization as well, which seems fishy. Can someone confirm?

How to interpret the predict output?

I'm new to learning-to-rank, and I need some help understanding the model's output.
Using the sample code and data, each iteration outputs a (100,) array for a single query; how do I interpret this result?

Does it indicate the relevance scores of the first 100 "documents" (ordered by document_id)? If so, how can I find the corresponding document in the dataset? (It seems we don't specify a document_id in the dataset.)

Thanks!

Siamese RankNet

I was wondering if it's possible to implement the Siamese Ranknet using TF Ranking: http://www.eggie5.com/130-learning-to-rank-siamese-network-pairwise-data


Figure 1: Example pairwise training routine (w/ feature extraction, Scoring function (MLP), and objective)


Figure 2: Keras diagram of objective (w/o scoring function)

Pointwise Scoring function:
F(x) = [f(x_1), f(x_2), ..., f(x_n)]

where f(x) is some type of scoring regression that maps the input feature x to a scalar, f: x -> \mathbb{R}. (This should be no problem w/ TF Ranking.)

Pairwise Loss:

The loss uses binary cross entropy to minimize this objective:

logits = f(x_i) - f(x_j)

which is the difference between the scores of the two items in a training pair. The respective label y in {-1, 0, 1} indicates whether f(x_i) or f(x_j) is larger, or whether they are equal.
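
To make this concrete, here is a tiny NumPy sketch of the RankNet-style formulation described above (my own illustration, not TF-Ranking internals; the scores are made up):

import numpy as np

# Scores from the pointwise function f for one training pair (x_i, x_j).
f_i, f_j = 2.0, 0.5
y = 1.0  # label: x_i is preferred over x_j

pairwise_logit = f_i - f_j
loss = np.log1p(np.exp(-y * pairwise_logit))  # log(1 + exp(-1.5)) ~= 0.20
print(loss)  # small, because the preferred item already scores higher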

I see this loss exists: pairwise_logistic_loss, but I'm not sure how to use it with TF Ranking. How do I handle data that comes in pairs? For example:

A > B
B > C
C > D
A > C

This type of data can come from logs, as in: Joachims, Thorsten. "Optimizing search engines using clickthrough data."

Pairwise Logistic Loss (paper typo)

In the TF-Ranking paper, equation 4 for the pairwise logistic loss reads:

log(1 + exp(pairwise_logits))

but I think it's missing a negative sign. It should be:

log(1 + exp(-pairwise_logits))

Otherwise a correct ranking would incur a large loss penalty.
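
A quick numeric check of this argument (my own sketch, independent of the library code): for a correctly ranked pair the pairwise logit is positive, and only the version with the negative sign assigns it a small loss.

import numpy as np

pairwise_logits = 3.0  # f(x_i) - f(x_j) > 0: the pair is correctly ordered

print(np.log1p(np.exp(pairwise_logits)))   # ~3.05: large penalty (as printed in eq. 4)
print(np.log1p(np.exp(-pairwise_logits)))  # ~0.05: small penalty (with the minus sign)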

How to use tensorflow ranking for prediction ?

As per my understanding of the sample dataset, the libsvm generator assumes that a relevance label will be present.

I am using the sample code.
Can you suggest what changes to make so that I can use my trained model later to predict relevance levels/rankings from the features and document list of a query?
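
For reference, a hedged sketch of what prediction can look like with the TF 1.x Estimator API; `ranker` and `predict_input_fn` are hypothetical names assumed to be built as in the example script, and the relevance labels in the prediction data can be dummy values (e.g., 0), since only the features are used:

# `ranker` is the tf.estimator.Estimator built by the TF-Ranking example code.
predictions = ranker.predict(input_fn=predict_input_fn)
for scores in predictions:
    # `scores` is a [list_size] array of per-document scores for one query;
    # a higher score means the model considers the document more relevant.
    ranked_indices = scores.argsort()[::-1]  # best document first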

Negative train and validation loss as training proceeds

For document relevance scores > 1, the loss decreases and becomes negative as training proceeds, which should not be the case. I am using sigmoid_cross_entropy_loss.

The NN architecture is similar to what is provided in the example, with group size 2.

Could you please assist in debugging what might be going wrong?

Below is the output log during training:

INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 1000 or save_checkpoints_secs None.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Use groupwise dnn v2.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into ./models_08Mar_T3/model.ckpt.
INFO:tensorflow:loss = 0.71311927, step = 1
INFO:tensorflow:global_step/sec: 2.5571
INFO:tensorflow:loss = 0.29717356, step = 101 (39.108 sec)
INFO:tensorflow:global_step/sec: 2.61819
INFO:tensorflow:loss = 0.18879782, step = 201 (38.194 sec)
INFO:tensorflow:global_step/sec: 2.6048
INFO:tensorflow:loss = -0.15803288, step = 301 (38.391 sec)
INFO:tensorflow:global_step/sec: 2.61858
INFO:tensorflow:loss = -0.27342975, step = 401 (38.189 sec)
INFO:tensorflow:global_step/sec: 2.62001
INFO:tensorflow:loss = -0.7961479, step = 501 (38.167 sec)
INFO:tensorflow:global_step/sec: 2.61745
INFO:tensorflow:loss = -2.0954611, step = 601 (38.205 sec)
INFO:tensorflow:global_step/sec: 2.62294
INFO:tensorflow:loss = -1.7541554, step = 701 (38.125 sec)
INFO:tensorflow:global_step/sec: 2.60777
INFO:tensorflow:loss = -2.5586243, step = 801 (38.347 sec)
INFO:tensorflow:global_step/sec: 2.62359
INFO:tensorflow:loss = -2.4577546, step = 901 (38.116 sec)

ValueError: could not convert string to float

Traceback (most recent call last):
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 338, in
tf.app.run()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 329, in main
train_and_eval()
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 283, in train_and_eval
FLAGS.list_size)
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 144, in load_libsvm_data
qid, features, label = _parse_line(line)
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 126, in _parse_line
features = {k: float(v) for (k, v) in kv_pairs}
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 126, in
features = {k: float(v) for (k, v) in kv_pairs}
ValueError: could not convert string to float:

Does that mean string features are not supported?
Thanks

Support for negative sampling with implicit interaction data

I would like to train a model capable of recommending products to users. My training data is implicit, so each query-document pair has no relevance score.

For example:

1 uid:10 32:0.14 48:0.97  51:0.45
1 uid:10 1:0.15  31:0.75  32:0.24  49:0.6
1 uid:10 1:0.71  2:0.36   31:0.58  51:0.12
1 uid:4 1:0.15  31:0.75  32:0.24  49:0.6
1 uid:4 1:0.71  2:0.36   31:0.58  51:0.12

Here, the dataset contains 2 users. User "10" has 3 interactions and user "4" has 2 interactions. All interactions are simply clicks, without a relevance score.

I want to randomly sample negatives (items a user did not click) and train the model with a pairwise ranking loss.

The paper states that TF-Ranking supports this; however, I cannot find any examples in the repo. Can someone point me in the right direction?
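
One simple offline approach (a sketch of my own, not a library feature): before writing the libsvm file, sample items each user did not interact with and emit them as label-0 rows, so a pairwise loss can contrast clicked (label 1) against sampled (label 0) items within the same query:

import random

def sample_negatives(clicked_items, all_items, k, seed=None):
    """Return up to k items the user did not click, sampled uniformly."""
    rng = random.Random(seed)
    candidates = list(set(all_items) - set(clicked_items))
    return rng.sample(candidates, min(k, len(candidates)))

# Example: user "10" clicked items 3 and 7 out of a catalog of 10 items.
print(sample_negatives(clicked_items=[3, 7], all_items=range(10), k=3))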

approx_ndcg_loss loss is not detected

hi,

I downloaded the latest repo and want to test approx_ndcg_loss. But the key is not picked up from the losses package, for some reason.

I tried debugging and found that "key_to_fn" doesn't list "RankingLossKey.APPROX_NDCG_LOSS" in debug mode. Not sure why.

Could someone please fix this or provide guidance ASAP?

Thanks,
Narasimha

How to interpret steps?

My eval dataset contains 39540 queries, with list size 10 and batch size 128. Inside the data, there are 231820 distinct comparable pairs such that label_i > label_j. I evaluated for 1 epoch. I use a dataset generator like this:
train_dataset.shuffle(5000).repeat(num_epochs).batch(_BATCH_SIZE)

But for a 1-epoch evaluation, I see 32475 global steps. This Stack Overflow post says steps indicate batches. If so, for 1 epoch I should have 39540/128 ≈ 309 steps, but the number of steps is way bigger than that and is on the same scale as the number of queries.

Can someone give me a hint how steps are defined?

function_utils does not exist in tensorflow.python.util

mc@prec:~/workplace/newsp/$ python -c "import tensorflow_ranking"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/workplace/newsp/tensorflow_ranking/__init__.py", line 20, in <module>
from tensorflow_ranking.python import * # pylint: disable=wildcard-import
File "/workplace/newsp/tensorflow_ranking/python/__init__.py", line 26, in <module>
from tensorflow_ranking.python import model
File "/workplace/newsp/tensorflow_ranking/python/model.py", line 32, in <module>
from tensorflow.python.util import function_utils
ImportError: cannot import name 'function_utils'

Could you double check about this?

Windows or Mac?

Hi, what about Windows or Mac? Are there any docs on installation for Windows or Mac?

Regularization

I wanted to run an experiment w/ a pairwise loss where my inputs are user and item/document embeddings; I should be able to recover BPR-style matrix factorization. Here is my scoring function:

import tensorflow as tf
from tensorflow.feature_column import categorical_column_with_identity, embedding_column

def make_score_fn():
    def _score_fn(context_features, group_features, mode, params, config):

        with tf.name_scope("input_layer"):
            item_id = categorical_column_with_identity(
                key='iid', num_buckets=params.item_buckets, default_value=0)
            item_emb = embedding_column(item_id, params.K)

            user_id = categorical_column_with_identity(
                key="uid", num_buckets=params.user_buckets, default_value=0)
            user_emb = embedding_column(user_id, params.K)

            dense_user = tf.feature_column.input_layer(group_features, [user_emb])
            dense_item = tf.feature_column.input_layer(group_features, [item_emb])

        # Dot product of user and item embeddings, one score per example.
        dot = tf.reduce_sum(tf.math.multiply(dense_user, dense_item), 1, keep_dims=True)
        logits = dot
        # Returns: Tensor of shape [batch_size, group_size] containing per-example scores.
        return logits

    return _score_fn

As you can see, it's the dot product of the embeddings. This seems to work fine, as it overfits the training set. Now I'd like to regularize the embeddings, but I can't figure out how to do that in this scoring function. If I had access to the loss, I would do something like this:

l2 = tf.contrib.layers.l2_regularizer(lambda_)  # `lambda` is a reserved word in Python
l2_reg = tf.contrib.layers.apply_regularization(l2, weights_list=[dense_user, dense_item])
loss += l2_reg
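
One possible TF 1.x pattern (a hedged sketch, not a confirmed TF-Ranking feature): compute the penalty inside _score_fn and register it in the standard regularization-losses collection. Whether the groupwise head actually picks this collection up needs to be verified.

# Inside _score_fn, after dense_user/dense_item are built:
l2 = tf.contrib.layers.l2_regularizer(scale=0.01)  # 0.01 is an arbitrary choice
l2_reg = tf.contrib.layers.apply_regularization(
    l2, weights_list=[dense_user, dense_item])
tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, l2_reg)
# If the head ignores REGULARIZATION_LOSSES, a custom head or model_fn that
# adds tf.losses.get_regularization_loss() to the ranking loss would be needed.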

Make export type configurable

Currently, when saving a trained estimator, the default ranking head uses tf.estimator.export.RegressionOutput, but this limits the saved model to string inputs. The ranking head can be extended, but it might be more convenient for downstream users if this were a providable option, allowing them to choose how to feed the models. In our system, the models execute in the same process that calculates the features, making a serialization step undesirable.

BatchNorm and DropOut layer control - Train vs Eval mode

Can someone please clarify whether batchnorm and dropout layer modes are switched automatically, or whether we have to control train vs. eval mode manually?

I presume that for custom tf.estimators we have to add tf.control_dependencies. Do we have to do this for TensorFlow Ranking as well?

Please clarify with an example if you could.

Thanks,
Narasimha
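
For reference, the standard TF 1.x pattern is to gate these layers on the Estimator mode yourself; a minimal sketch (params.feature_columns is a hypothetical name for your column list):

import tensorflow as tf

def _score_fn(context_features, group_features, mode, params, config):
    is_training = (mode == tf.estimator.ModeKeys.TRAIN)
    x = tf.feature_column.input_layer(group_features, params.feature_columns)
    # Neither layer switches modes automatically in a custom score_fn:
    x = tf.layers.batch_normalization(x, training=is_training)
    x = tf.layers.dropout(x, rate=0.5, training=is_training)
    return tf.layers.dense(x, units=1)

# Batch-norm moving averages are updated via UPDATE_OPS; a hand-written
# train_op must depend on them:
#   update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
#   with tf.control_dependencies(update_ops):
#       train_op = optimizer.minimize(loss)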

lambdaMART tutorials

Hi, would you please publish a tutorial on how to create a LambdaMART model using the TF-Ranking framework?

Bazel test failed

Hi,
After I installed tensorflow_ranking successfully, I ran the test script but found the following errors:
INFO: Invocation ID: e63b1cdf-a900-4899-b2e2-b85de852fa6b
INFO: Analysed 21 targets (8 packages loaded, 192 targets configured).
INFO: Found 13 targets and 8 test targets...
FAIL: //tensorflow_ranking/python:model_test (see /root/.cache/bazel/_bazel_root/197a28263a16f7d64afd921c986f3c1b/execroot/org_tensorflow_ranking/bazel-out/k8-fastbuild/testlogs/tensorflow_ranking/python/model_test/test.log)
FAIL: //tensorflow_ranking/python:head_test (see /root/.cache/bazel/_bazel_root/197a28263a16f7d64afd921c986f3c1b/execroot/org_tensorflow_ranking/bazel-out/k8-fastbuild/testlogs/tensorflow_ranking/python/head_test/test.log)
FAIL: //tensorflow_ranking/python:losses_test (see /root/.cache/bazel/_bazel_root/197a28263a16f7d64afd921c986f3c1b/execroot/org_tensorflow_ranking/bazel-out/k8-fastbuild/testlogs/tensorflow_ranking/python/losses_test/test.log)
FAIL: //tensorflow_ranking/python:data_test (see /root/.cache/bazel/_bazel_root/197a28263a16f7d64afd921c986f3c1b/execroot/org_tensorflow_ranking/bazel-out/k8-fastbuild/testlogs/tensorflow_ranking/python/data_test/test.log)
FAIL: //tensorflow_ranking/python:feature_test (see /root/.cache/bazel/_bazel_root/197a28263a16f7d64afd921c986f3c1b/execroot/org_tensorflow_ranking/bazel-out/k8-fastbuild/testlogs/tensorflow_ranking/python/feature_test/test.log)
FAIL: //tensorflow_ranking/python:metrics_test (see /root/.cache/bazel/_bazel_root/197a28263a16f7d64afd921c986f3c1b/execroot/org_tensorflow_ranking/bazel-out/k8-fastbuild/testlogs/tensorflow_ranking/python/metrics_test/test.log)
FAIL: //tensorflow_ranking/python:utils_test (see /root/.cache/bazel/_bazel_root/197a28263a16f7d64afd921c986f3c1b/execroot/org_tensorflow_ranking/bazel-out/k8-fastbuild/testlogs/tensorflow_ranking/python/utils_test/test.log)
FAIL: //tensorflow_ranking/examples:tf_ranking_libsvm_test (see /root/.cache/bazel/_bazel_root/197a28263a16f7d64afd921c986f3c1b/execroot/org_tensorflow_ranking/bazel-out/k8-fastbuild/testlogs/tensorflow_ranking/examples/tf_ranking_libsvm_test/test.log)

Can you please help identify the root cause?

Exporting LTR Model in SavedModel Format

Could we please get an example of how to save the model produced in https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/tf_ranking_libsvm.ipynb in SavedModel format (ref: https://www.tensorflow.org/guide/saved_model)? I've tried a bunch of variations of exporting from the TensorFlow API docs and GitHub examples with no luck.

feature_columns = example_feature_columns()
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns)
export_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
ranker.export_savedmodel("savedmodel", export_fn)

With the example above, I get: KeyError: NumericColumn(key='1', shape=(1,), default_value=(0.0,), dtype=tf.float32, normalizer_fn=None)

Some guidance on this problem would be greatly appreciated.

TensorBoard empty scalars

All the scalars in TensorBoard are almost empty; they only show either the 1st or the 100th step.

Here are a couple of examples of CSVs downloaded after running.

run_eval-tag-metric_ndcg@10

Wall time Step Value
1547130262.059631 100 0.792011022567749
1547130264.828965 100 0.792011022567749
1547133669.9526756 100 0.792011022567749

run_.-tag-loss

Wall time Step Value
1547130254.665982 1 1.5420621633529663

I ran the code as in the instructions, without any edits, and I still get these weird scalars.

The tf.summary namespace is included normally in the code, yet still gives ambiguous results:

tf.summary.scalar("input_sparsity", tf.nn.zero_fraction(input_layer))
tf.summary.scalar("input_max", tf.reduce_max(input_layer))
tf.summary.scalar("input_min", tf.reduce_min(input_layer))

I am running remotely on an EC2 instance, using ssh -L 16006:127.0.0.1:6006 ... and opening http://localhost:16006/ in Chrome.

Support for tensorflow gpu

Currently the bazel build hardcodes the TensorFlow dependency to the CPU version; it would be nice if this were configurable to support GPU builds.

Also, I'm interested to know whether you've benchmarked this library on TensorFlow GPU and what the performance characteristics look like for different dataset sizes.

libSVM Parser Shuffles Documents

Using the libsvm parser, I noticed my documents were getting shuffled. I found this code:

np.random.shuffle(doc_list)

Why would one shuffle the documents? Doesn't the order (position) of your relevance labels imply meaning? Wouldn't this throw off DCG?

DCG for reference:

DCG = \sum_{i=1}^{n} \frac{rel_i}{\log_2(i+1)}

dcg = lambda r: np.sum(r/np.log2(np.arange(2, r.size+2)))

training is too slow

Ubuntu 18, CPU mode

I have 44 features in libSVM format.
I ran the tf_ranking_libsvm.ipynb example on my dataset.
I guess the problem is the input_fn function that calls tfr.data.libsvm_generator, because the CPU load is minimal.

How can I fix it? Is that a problem with tf.data.Dataset.from_generator?
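
For what it's worth, the standard tf.data knobs here are parallel map and prefetch; a hedged sketch (parse_fn and batch_size are hypothetical names). Note that a Python generator itself stays single-threaded, so heavy parsing is better moved out of the generator and into .map, where it can be parallelized:

dataset = (dataset
           .map(parse_fn, num_parallel_calls=4)  # run the transformation in parallel
           .batch(batch_size)
           .prefetch(buffer_size=1))             # overlap the input pipeline with training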

Losses & Metrics Ignore Negative Numbers?

The ranking metrics in TF-Ranking seem to ignore negative numbers, which I think is a nice thing. Is this a feature? Where is this in the code?

# The ranking metrics seem to ignore -1s...
labels = [[-1, 1., 0.]]
scores = [[3, 1., 0.]]

dcg = tfr.metrics.make_ranking_metric_fn(tfr.metrics.RankingMetricKey.DCG)
val, op = dcg(labels, scores, None)  # no weights/features dict needed here

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    print(sess.run(op))
1.0

For example, a common DCG implementation will not ignore negatives:

labels=[[-1, 1., 0.]]
r = np.asarray(labels)
dcg = lambda r: np.sum(r/np.log2(np.arange(2, r.size+2)))
dcg2 = lambda r: np.sum((np.power(2,r)-1)/np.log2(np.arange(2, r.size+2)))
dcg(r), dcg2(r)
(-0.36907024642854247, 0.13092975357145753)

I just want to confirm, b/c I want to use this behavior to generate libsvm data for queries where I only have implicit judgements from clickthrough data. For example, the query:

A
B
C (clicked)
D
E

Means that C should be ranked above B and A, but we can't make any judgements about D and E. Encoded in libsvm format:

0 qid:1 [features] #A
0 qid:1  [features] #B
1 qid:1  [features] #C
-1 qid:1  [features] #D
-1 qid:1 [features] #E

If we use the pairwise logistic loss, I should be able to learn C>B and C>A, and then use the ranking metrics to evaluate my ranker against queries in my test set.

Are predictions stable over time?

I am wondering why I am seeing different results in all three of train (loss), evaluate (eval_fn metrics), and predict when I run the model at multiple points in time.

Background: I'm using MQ2008 and am having a hard time interpreting why results differ from run to run. (What's even more surprising to me is that even the predict output differs between runs.)

Please advise / help :)
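
Run-to-run variation usually comes from unseeded variable initialization and dataset shuffling; a hedged sketch of pinning both in the Estimator setup (seed values are arbitrary):

# Seed graph-level randomness (initializers, dropout) via RunConfig...
config = tf.estimator.RunConfig(tf_random_seed=42)
# ...and seed the input pipeline's shuffle explicitly:
dataset = train_dataset.shuffle(5000, seed=42).repeat(num_epochs).batch(_BATCH_SIZE)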

Failed to test my own data set

Below is part of the log. Can you please help identify the cause? Thanks.

........
........
INFO:tensorflow:Number of queries: 622450
INFO:tensorflow:Number of documents in total: 9436717
INFO:tensorflow:Number of documents discarded: 0
INFO:tensorflow:Loading data from /home/Downloads/train.tsv
INFO:tensorflow:Number of queries: 298651
INFO:tensorflow:Number of documents in total: 4519591
INFO:tensorflow:Number of documents discarded: 0
INFO:tensorflow:Loading data from /home/Downloads/evaluation.tsv
INFO:tensorflow:Number of queries: 298651
INFO:tensorflow:Number of documents in total: 4519591
INFO:tensorflow:Number of documents discarded: 0
INFO:tensorflow:Using config: {'_model_dir': '/tmp/output', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f4fa0320d30>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

WARNING:tensorflow:Estimator's model_fn (<function make_groupwise_ranking_fn.<locals>._model_fn at 0x7f4fa032a620>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps 1000 or save_checkpoints_secs None.
INFO:tensorflow:Skipping training since max_steps has already saved.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Use groupwise dnn v2.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-01-04-15:25:52
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/output/model.ckpt-100
Traceback (most recent call last):
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [18,256] rhs shape= [136,256]
[[{{node save/Assign_18}} = Assign[T=DT_FLOAT, _class=["loc:@group_score/dense/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](group_score/dense/kernel, save/RestoreV2:18)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1546, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [18,256] rhs shape= [136,256]
[[node save/Assign_18 (defined at /home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py:323) = Assign[T=DT_FLOAT, _class=["loc:@group_score/dense/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](group_score/dense/kernel, save/RestoreV2:18)]]

Caused by op 'save/Assign_18', defined at:
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 338, in
tf.app.run()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 329, in main
train_and_eval()
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 323, in train_and_eval
estimator.evaluate(input_fn=test_input_fn, hooks=[test_hook])
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 478, in evaluate
return _evaluate()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 467, in _evaluate
output_dir=self.eval_dir(name))
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1591, in _evaluate_run
config=self._session_config)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/evaluation.py", line 271, in _evaluate_once
session_creator=session_creator, hooks=hooks) as session:
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 921, in init
stop_grace_period_secs=stop_grace_period_secs)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 643, in init
self._sess = _RecoverableSession(self._coordinated_creator)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1107, in init
_WrappedSession.init(self, self._create_session())
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1112, in _create_session
return self._sess_creator.create_session()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 800, in create_session
self.tf_sess = self._session_creator.create_session()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 557, in create_session
self._scaffold.finalize()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 213, in finalize
self._saver = training_saver._get_saver_or_default() # pylint: disable=protected-access
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 886, in _get_saver_or_default
saver = Saver(sharded=True, allow_empty=True)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1102, in init
self.build()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1114, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1151, in _build
build_save=build_save, build_restore=build_restore)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 789, in _build_internal
restore_sequentially, reshape)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 459, in _AddShardedRestoreOps
name="restore_shard"))
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps
assign_ops.append(saveable.restore(saveable_tensors, shapes))
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 119, in restore
self.op.get_shape().is_fully_defined())
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 221, in assign
validate_shape=validate_shape)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 61, in assign
use_locking=use_locking, name=name)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in init
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [18,256] rhs shape= [136,256]
[[node save/Assign_18 (defined at /home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py:323) = Assign[T=DT_FLOAT, _class=["loc:@group_score/dense/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](group_score/dense/kernel, save/RestoreV2:18)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 338, in
tf.app.run()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 329, in main
train_and_eval()
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 323, in train_and_eval
estimator.evaluate(input_fn=test_input_fn, hooks=[test_hook])
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 478, in evaluate
return _evaluate()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 467, in _evaluate
output_dir=self.eval_dir(name))
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1591, in _evaluate_run
config=self._session_config)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/evaluation.py", line 271, in _evaluate_once
session_creator=session_creator, hooks=hooks) as session:
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 921, in init
stop_grace_period_secs=stop_grace_period_secs)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 643, in init
self._sess = _RecoverableSession(self._coordinated_creator)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1107, in init
_WrappedSession.init(self, self._create_session())
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1112, in _create_session
return self._sess_creator.create_session()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 800, in create_session
self.tf_sess = self._session_creator.create_session()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 566, in create_session
init_fn=self._scaffold.init_fn)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 288, in prepare_session
config=config)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/session_manager.py", line 202, in _restore_checkpoint
saver.restore(sess, checkpoint_filename_with_path)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1582, in restore
err, "a mismatch between the current graph and the graph")
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [18,256] rhs shape= [136,256]
[[node save/Assign_18 (defined at /home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py:323) = Assign[T=DT_FLOAT, _class=["loc:@group_score/dense/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](group_score/dense/kernel, save/RestoreV2:18)]]

Caused by op 'save/Assign_18', defined at:
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 338, in
tf.app.run()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 329, in main
train_and_eval()
File "/home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py", line 323, in train_and_eval
estimator.evaluate(input_fn=test_input_fn, hooks=[test_hook])
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 478, in evaluate
return _evaluate()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 467, in _evaluate
output_dir=self.eval_dir(name))
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1591, in _evaluate_run
config=self._session_config)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/evaluation.py", line 271, in _evaluate_once
session_creator=session_creator, hooks=hooks) as session:
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 921, in init
stop_grace_period_secs=stop_grace_period_secs)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 643, in init
self._sess = _RecoverableSession(self._coordinated_creator)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1107, in init
_WrappedSession.init(self, self._create_session())
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1112, in _create_session
return self._sess_creator.create_session()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 800, in create_session
self.tf_sess = self._session_creator.create_session()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 557, in create_session
self._scaffold.finalize()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 213, in finalize
self._saver = training_saver._get_saver_or_default() # pylint: disable=protected-access
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 886, in _get_saver_or_default
saver = Saver(sharded=True, allow_empty=True)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1102, in init
self.build()
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1114, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1151, in _build
build_save=build_save, build_restore=build_restore)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 789, in _build_internal
restore_sequentially, reshape)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 459, in _AddShardedRestoreOps
name="restore_shard"))
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 428, in _AddRestoreOps
assign_ops.append(saveable.restore(saveable_tensors, shapes))
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 119, in restore
self.op.get_shape().is_fully_defined())
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 221, in assign
validate_shape=validate_shape)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 61, in assign
use_locking=use_locking, name=name)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/tmp/tfr/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in init
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [18,256] rhs shape= [136,256]
[[node save/Assign_18 (defined at /home/Downloads/ranking/./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary.runfiles/org_tensorflow_ranking/tensorflow_ranking/examples/tf_ranking_libsvm.py:323) = Assign[T=DT_FLOAT, _class=["loc:@group_score/dense/kernel"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](group_score/dense/kernel, save/RestoreV2:18)]]

AttributeError: 'tuple' object has no attribute 'dtype'

Can I use the available metrics in a custom Keras metric? I tried to code a custom metric like this:

def nDCG(y_true, y_pred):
    return metrics.normalized_discounted_cumulative_gain(y_true, y_pred, topn=3)

but I get the error:

AttributeError: 'tuple' object has no attribute 'dtype'
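
The error itself comes from the fact that TF-Ranking metrics follow the tf.metrics contract and return a (value_tensor, update_op) pair, which Keras cannot treat as a single tensor. One possible workaround, as a sketch (TF 1.x graph mode; not an officially supported integration):

import tensorflow as tf
import tensorflow_ranking as tfr

def nDCG(y_true, y_pred):
    metric_fn = tfr.metrics.make_ranking_metric_fn(
        tfr.metrics.RankingMetricKey.NDCG, topn=3)
    value, update_op = metric_fn(y_true, y_pred, None)  # features dict unused here
    # Force the update to run, then report the accumulated value as one tensor.
    with tf.control_dependencies([update_op]):
        return tf.identity(value)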

What is the LambdaWeights for lambdarank?

Hi authors,
Please correct me if I am wrong. In my opinion:
(1) The LambdaLoss paper, "The LambdaLoss Framework for Ranking Metric Optimization," offers a unified framework for metric-driven optimization: a metric-driven loss can be expressed as a sum of weighted losses over data pairs.
(2) The weights in (1) are the LambdaWeights class in the tf-ranking code.
(3) So, to make a metric-driven loss, all I need to do is:
1. Subclass LambdaWeights and implement my custom weights.
2. Create a loss with my LambdaWeights object.

E.g., let's say I want to make a LambdaRank-style loss:
(1) Instantiate a DCGLambdaWeight object.
(2) Call _pairwise_logistic_loss, with the lambda_weight parameter set to the object from (1).

Is that correct?
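
That matches my reading of the public API as well; a hedged sketch using the factory functions (assuming the 0.x losses module, where make_loss_fn accepts a lambda_weight):

import tensorflow_ranking as tfr

# NDCG-flavored lambda weights paired with a pairwise logistic loss
# gives a LambdaRank-style, metric-driven loss.
lambda_weight = tfr.losses.create_ndcg_lambda_weight(topn=10)
loss_fn = tfr.losses.make_loss_fn(
    tfr.losses.RankingLossKey.PAIRWISE_LOGISTIC_LOSS,
    lambda_weight=lambda_weight)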

Pair-wise losses fail when providing dynamically sized logits

First, thanks so much for all the work, and also for sharing it. Highly appreciated!
When using some of the provided ranking losses, like _softmax_loss or _pairwise_hinge_loss, I run into problems with the shape.
The reason is that my logits depend on a placeholder in my graph that has a flexible shape:
self.array = tf.placeholder(tf.string, [None, None], name='array') # batch_size x C
In _softmax_loss and several other loss functions, you use unstack(...), which cannot work with variable shapes (see details discussed here: Using tf.unpack() when first dimension of Variable is None).
The unstack call is used within the loss functions to determine topn.
Help is much appreciated.

My trace

File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
  "__main__", mod_spec)
File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
  exec(code, run_globals)
File "/opt/workspace/1.55233553599e+12/XXX-307d542f0b148f74ebd68f9b94c2d0432ab21d9f/models/training/runner.py", line 50, in <module>
  trainer.run()
File "/opt/workspace/1.55233553599e+12/XXX-307d542f0b148f74ebd68f9b94c2d0432ab21d9f/models/training/trainer.py", line 109, in run
  **self.model_params
File "/opt/workspace/1.55233553599e+12/XXX-307d542f0b148f74ebd68f9b94c2d0432ab21d9f/models/models_common/model_factory.py", line 44, in create_model
  **kwargs
File "/opt/workspace/1.55233553599e+12/XXX-307d542f0b148f74ebd68f9b94c2d0432ab21d9f/models/models_common/model/model.py", line 10, in __init__
  self.create(*args, **kwargs)
File "/opt/workspace/1.55233553599e+12/XXX-307d542f0b148f74ebd68f9b94c2d0432ab21d9f/models/models/catalog/catalog_v1.py", line 202, in create
  unfiltered_cross_entropy_loss = tfr.losses._softmax_loss(target_array, self.logits)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_ranking/python/losses.py", line 759, in _softmax_loss
  labels, logits, weights)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_ranking/python/losses.py", line 481, in _sort_and_normalize
  _, topn = array_ops.unstack(array_ops.shape(logits))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py", line 1026, in unstack
  raise ValueError("Cannot infer num from shape %s" % value_shape)
ValueError: Cannot infer num from shape (?,)
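
A minimal reproduction plus two workarounds, as a sketch (TF 1.x): unstack fails when the rank of logits is unknown, because tf.shape(logits) is then a vector of unknown length. Indexing tf.shape directly, or telling unstack the length explicitly, both avoid the error.

import tensorflow as tf

logits = tf.placeholder(tf.float32)  # rank unknown
# _, topn = tf.unstack(tf.shape(logits))       # ValueError: Cannot infer num from shape (?,)
_, topn = tf.unstack(tf.shape(logits), num=2)  # works: length given explicitly

logits2 = tf.placeholder(tf.float32, [None, None])  # rank 2, dims dynamic
topn2 = tf.shape(logits2)[1]                        # also works with dynamic dims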

GSOC project proposal

Hi,

TensorFlow is participating in GSoC (Google Summer of Code) this year, and students were told they can make their own project proposals based on issues that have the "contributions welcome" label. Based on this issue opened by Paige Bailey, I'm drafting my project proposal, and I've already picked up some learning-to-rank concepts.

Would that be okay? I thought I should talk to the project owners before going ahead with the proposal. Please let me know if there are important changes or expectations around the library migration.

tensorflow/tensorflow#25445

