lucasmaystre / choix

Inference algorithms for models based on Luce's choice axiom

License: MIT License

Python 49.90% Jupyter Notebook 50.10%

machine-learning python ranking

choix's Issues

Dynamically choosing the pairs to compare in an experiment to limit the number of comparisons

Hi,

I have about 30,000 images that I need to rank on a linear scale (let's say from 1 to 10). The naive approach would dictate to do a full pairwise comparison (n² comparisons), making the task impossible, however I'm searching for a method that would allow me to only do about nlogn comparisons by "wisely" choosing the pairs to compare. Of course, using such a method we can only hope to get "near" the ranking ground truth, but it should be sufficient for my needs.

I'm under the impression that choix could be used to attain my goal, but I can't determine how exactly at this point. Any pointers that might help? Thanks!
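
One way to stay near n log n comparisons, sketched here under the strong assumption that answers are consistent (transitive), is to let a comparison sort decide which pairs to ask about. The `prefers` callback and the `hidden_score` table below are hypothetical stand-ins for a human judge; nothing here is part of choix. With noisy judges, the collected pairs could instead be fed to a pairwise model afterwards.

```python
from functools import cmp_to_key


def rank_with_oracle(items, prefers):
    """Sort items best-first, calling prefers(a, b) only when the sort needs it."""
    n_calls = 0

    def compare(a, b):
        nonlocal n_calls
        n_calls += 1
        # A negative value means "a goes before b", i.e. a is preferred.
        return -1 if prefers(a, b) else 1

    ranked = sorted(items, key=cmp_to_key(compare))
    return ranked, n_calls


# Hypothetical judge: each image has a hidden, distinct quality score.
hidden_score = {f"img{i}": (i * 37) % 101 for i in range(100)}
ranked, n_calls = rank_with_oracle(
    list(hidden_score), lambda a, b: hidden_score[a] > hidden_score[b]
)
print(n_calls, "comparisons instead of", 100 * 99 // 2)
```

The sorted order can then be cut into ten equal buckets to get a 1 to 10 scale.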

Weights for the importance of each observation

A nice feature would be a weights vector alongside the data vector, so that each observation can be assigned a different importance.
With this feature, regularisation would also be easy to implement: add one extra observation for each pair (in the pairwise comparison models) and give it a small weight.
Currently, when the directed comparison graph is acyclic, the maximum-likelihood estimate gives the root essentially infinite strength, and regularisation fixes that.
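
To make the proposal concrete, here is a minimal pure-Python sketch of a weighted Bradley-Terry fit using the classic minorization-maximization (MM) update. It is not choix's implementation; `fit_weighted_bt`, `add_virtual_observations` and `eps` are hypothetical names chosen for illustration. The regularisation follows the idea above: every observed pair gets a reversed observation with a small weight.

```python
import math
from collections import defaultdict


def add_virtual_observations(data, weights, eps=0.01):
    """Append a reversed copy of every observation, each with small weight eps."""
    reversed_pairs = [(loser, winner) for winner, loser in data]
    return list(data) + reversed_pairs, list(weights) + [eps] * len(data)


def fit_weighted_bt(n_items, data, weights, n_iter=200):
    """Return log-strengths fitted to weighted (winner, loser) observations."""
    wins = [0.0] * n_items
    pair_weight = defaultdict(float)  # total weight per unordered pair
    for (winner, loser), w in zip(data, weights):
        wins[winner] += w
        pair_weight[frozenset((winner, loser))] += w
    strengths = [1.0] * n_items
    for _ in range(n_iter):
        denom = [0.0] * n_items
        for pair, w in pair_weight.items():
            i, j = tuple(pair)
            share = w / (strengths[i] + strengths[j])
            denom[i] += share
            denom[j] += share
        # MM update: weighted wins divided by the expected comparison load.
        strengths = [
            wins[i] / denom[i] if denom[i] > 0 else strengths[i]
            for i in range(n_items)
        ]
        total = sum(strengths)
        strengths = [s * n_items / total for s in strengths]
    return [math.log(s) for s in strengths]


# Acyclic data: item 0 never loses, so the unregularised fit would diverge.
data = [(0, 1), (0, 2), (1, 2)]
reg_data, reg_weights = add_virtual_observations(data, [1.0, 1.0, 1.0])
params = fit_weighted_bt(3, reg_data, reg_weights)
print(params)
```

Without the virtual observations, item 2 would have zero weighted wins and its strength would collapse to zero, which is the divergence described above.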

Using features as extra evidence for preference selection

Thanks for the great package!

Is it possible in choix to attach a feature vector to each observation (or pair, if you wish) and use it for inference? The feature vectors, not the outcomes alone, would then be used to learn the preferences.

Thanks

Thank you for the excellent library!

No real issue and you can close this issue, but thank you so much for the library! I knew I needed a Bradley-Terry implementation in Python and went searching expecting to find just code snippets, not a whole nice library! The API and documentation are great.

Probability of win given pairwise probabilities

Hi @lucasmaystre,

Thanks for taking the time to build this package. It's been of great help.

We are struggling with one particular problem and hope you can assist. We are given a matrix of pairwise probabilities and need to compute, for each query, the probability of winning against all other queries in the matrix (i.e. the probability of ranking first). How can we use your package for this problem?
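
A rough sketch of one way to approach this, assuming the pairwise probabilities are (approximately) consistent with a Bradley-Terry model and lie strictly between 0 and 1: recover log-strengths from the log-odds, then apply Luce's choice rule over the full set. The function name `first_place_probabilities` is made up for this example and is not a choix function.

```python
import math


def first_place_probabilities(pairwise):
    """pairwise[i][j] = P(i beats j). Returns P(i ranks first) for each i."""
    n = len(pairwise)
    log_strength = []
    for i in range(n):
        # Under Bradley-Terry, log(P_ij / P_ji) = theta_i - theta_j, so the
        # row sum of log-odds divided by n gives theta_i up to a constant.
        log_odds = [
            math.log(pairwise[i][j] / pairwise[j][i]) for j in range(n) if j != i
        ]
        log_strength.append(sum(log_odds) / n)
    # Luce's rule: P(i first) is a softmax over the log-strengths.
    peak = max(log_strength)
    unnormalised = [math.exp(t - peak) for t in log_strength]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Matrix generated from strengths 3, 2, 1 (diagonal entries are ignored).
w = [3.0, 2.0, 1.0]
pairwise = [[w[i] / (w[i] + w[j]) for j in range(3)] for i in range(3)]
print(first_place_probabilities(pairwise))
```

If the matrix is not exactly Bradley-Terry consistent, the row averaging acts as a least-squares fit of the log-odds, so the result is only an approximation.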

Thanks.

Where can I find the function for "partial ranking"?

Can you provide the function documentation?

Thanks,

Citation?

Do you have a paper I can cite if I use choix in my work?

Using probabilistic comparisons as training data

Hi! Thanks for the great package. Is there a way to train the Bradley-Terry model starting from probabilistic data? E.g. one training sample might be: "item A is preferred to item B with probability 0.7."
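
One possible reading of such data, offered as a sketch rather than a supported feature, is to treat each sample as a fractional count: 0.7 of a win for A over B and 0.3 of a win for B over A. The helper `soft_win_matrix` below is hypothetical. The resulting matrix has the layout (entry [i][j] = how often i beat j) used by dense estimators such as `choix.ilsr_pairwise_dense`; whether that function behaves well with non-integer entries is an assumption to verify.

```python
def soft_win_matrix(n_items, soft_data):
    """Convert (a, b, p) triples, read as "a beats b with probability p",
    into a matrix whose entry [i][j] is the expected number of wins of i over j."""
    mat = [[0.0] * n_items for _ in range(n_items)]
    for a, b, p in soft_data:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        mat[a][b] += p
        mat[b][a] += 1.0 - p
    return mat


# "Item 0 is preferred to item 1 with probability 0.7", and so on.
soft_data = [(0, 1, 0.7), (1, 2, 0.6), (0, 2, 0.9)]
comp_mat = soft_win_matrix(3, soft_data)
for row in comp_mat:
    print(row)
```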

Use Choix for table-like heterogeneous data for comparison?

I'm trying to determine if Choix can be used to compare two rows of data that look like this:

  Item         1        2      3     4         5      6       7       8
0    A  369248.0  12757.0  3.45%  0.83  10569.60  104.0  101.63  0.820%
1    B   35621.0    245.0  0.69%  0.90    219.74    3.0   73.25  1.220%

What I want to do is add my own weight (bias) to several of these columns to say that one feature is more important than another.

I looked at your examples and the data is in integer format. I was thinking maybe I can just rank this above data for each column to get an output of 0 - n-1 for each column but then how would I use choix to compare each row to get an output of "Best" to "Worst"?

Handling ties

In your paper you claim to extend the BTL model to ties using the paper by P. V. Rao and L. L. Kupper. I was wondering whether it's possible to implement that here.

Thanks!

Understanding the ilsr_pairwise behaviour

I followed the code in your intro notebook:

# Infer Bradley-Terry model parameters.
est = choix.ilsr_pairwise(n_items, data, alpha=1e-3)

# Ranking of the items (from worst to best)
ranking = np.argsort(est)

using the following data:

data = [(0, 1),
(0, 2),
(0, 3),
(1, 2),
(1, 3),
(2, 3)]

which produced the following output:

[ 0.02440682 0.00827314 -0.00812072 -0.02478326 0.00011201 0.00011201]
('ranking (worst to best):', array([3, 2, 4, 5, 1, 0]))

I am mainly wondering how the ranking part of the algorithm works:

My understanding was that this algorithm should produce a rank of [0, 1, 2], since 0 is the best player in this case. Clearly this is not what is happening, so what are the params actually calculating, and how would I get the ranking as I want above?
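
Two things seem to be going on in the output above, illustrated in the plain-Python sketch below. First, the estimate has six entries, which suggests n_items was set to 6 even though the data only mentions items 0 to 3 (an inference from the printed output, not something stated in the issue); items 4 and 5 sit near zero because nothing was observed about them. Second, np.argsort sorts ascending, so the ranking is listed from worst to best.

```python
# Values copied from the output reported in the issue.
est = [0.02440682, 0.00827314, -0.00812072, -0.02478326, 0.00011201, 0.00011201]
data = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

# Sort item indices by estimated parameter, largest (best) first.
best_to_worst = sorted(range(len(est)), key=lambda i: est[i], reverse=True)

# Keep only items that actually appear in the comparison data.
seen = {item for pair in data for item in pair}
best_to_worst_seen = [i for i in best_to_worst if i in seen]

print(best_to_worst)       # [0, 1, 4, 5, 2, 3]
print(best_to_worst_seen)  # [0, 1, 2, 3]
```

With NumPy, the same reversal is `np.argsort(est)[::-1]`.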

Is "Team" ranking possible with LSR and I-LSR?

First, congratulations on your paper and the awesome piece of code you provided with it!

It looks like Plackett-Luce ratings can outperform other ranking algorithms such as TrueSkill (see here).

Some nice features of TrueSkill are the ability to rank any number of "teams" of any size and to update the ratings of a team's members according to the team's results.

TrueSkill can also handle "partial play" for each member of a team and adjust the ratings according to the "participation ratio" of the members.

Do you think such behaviors, especially the ability to build teams, are possible with LSR / I-LSR, or do you have any idea how I could do that?

Edit: for partial play, I guess we could simply multiply the initial weight of an edge by the participation ratio...

Missing link to Bayesian example network

The link to the example (https://pypi.org/project/choix/notebooks/ep-example.ipynb) doesn't exist.

How to provide probabilities as well when going from pairwise comparisons to a list?

Hi Author,

I came to this library from the Stack Exchange question that you answered at https://datascience.stackexchange.com/questions/18828/from-pairwise-comparisons-to-ranking-python .

I have probabilities for each pairwise comparison. Can those also be passed as input to any of the lsr_pairwise / ilsr_pairwise methods? For example, the probabilities might come from a pairwise classifier trained separately.

Thanks,
Hasan

Using Python sets to represent top-1 data leads to TypeError

The documentation states that to represent a top-1 list, a Python list with an integer and a Python set should be used. This leads to a TypeError:

% python3 -m venv venv
% . venv/bin/activate
% pip install choix   
Collecting choix
  Using cached choix-0.3.5.tar.gz (63 kB)
  Preparing metadata (setup.py) ... done
Collecting numpy
  Using cached numpy-1.22.1-cp39-cp39-macosx_11_0_arm64.whl (12.8 MB)
Collecting scipy
  Using cached scipy-1.7.3-1-cp39-cp39-macosx_12_0_arm64.whl (27.0 MB)
Using legacy 'setup.py install' for choix, since package 'wheel' is not installed.
Installing collected packages: numpy, scipy, choix
    Running setup.py install for choix ... done
Successfully installed choix-0.3.5 numpy-1.22.1 scipy-1.7.3
(venv) michi@lappy ~ % python
Python 3.9.10 (main, Jan 20 2022, 11:41:00) 
[Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import choix
>>> choix.ilsr_top1(3, [[0, {1, 2}]], alpha=0.1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/michi/venv/lib/python3.9/site-packages/choix/lsr.py", line 391, in ilsr_top1
    return _ilsr(fun, initial_params, max_iter, tol)
  File "/Users/michi/venv/lib/python3.9/site-packages/choix/lsr.py", line 30, in _ilsr
    params = fun(initial_params=params)
  File "/Users/michi/venv/lib/python3.9/site-packages/choix/lsr.py", line 350, in lsr_top1
    val = 1 / (weights.take(losers).sum() + weights[winner])
TypeError: int() argument must be a string, a bytes-like object or a number, not 'set'

Using another list instead works:

>>> choix.ilsr_top1(3, [[0, [1, 2]]], alpha=0.1)
array([ 0.97755805, -0.48877902, -0.48877902])

Also, IMHO a tuple is a better fit to represent a top-1 list, e.g. [(0, {1, 2})]. This can be accurately represented using the types from the typing module, i.e. List[Tuple[int, Set[int]]] where the currently documented convention can't.
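
Until sets are accepted, a small conversion step works around the error. The helper below is hypothetical (not part of choix) and only relies on what the transcript above shows: the nested-list form `[[0, [1, 2]]]` is accepted by `choix.ilsr_top1`.

```python
def normalize_top1(data):
    """Convert top-1 observations such as (winner, {losers}) into the
    [winner, [losers]] list form that worked in the session above."""
    return [[int(winner), sorted(losers)] for winner, losers in data]


# Sets and tuples are both fine on the input side.
raw = [(0, {1, 2}), [2, {0}]]
clean = normalize_top1(raw)
print(clean)  # [[0, [1, 2]], [2, [0]]]
```

The cleaned data can then be passed on, e.g. `choix.ilsr_top1(3, clean, alpha=0.1)`.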

Ties in rankings

I see in #17 some suggestions for handling ties in pairwise comparisons. Is there a way to do that in rankings? Specifically, I have a case where several competitors in a race may not finish, and all of them should be ranked as number of finishers + 1. Is there a way to handle such a condition?
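
One modeling choice for this case, sketched here as an assumption rather than an endorsed method, uses the fact that a Plackett-Luce ranking is a sequence of top-1 choices: each finisher is "chosen" over everyone still in the race, and the sequence stops after the last finisher, leaving the non-finishers unordered among themselves. The helper `race_to_top1` is hypothetical; its output uses the [winner, [losers]] list form shown to work with `choix.ilsr_top1` in the top-1 issue above.

```python
def race_to_top1(finishers, non_finishers):
    """Turn one race into top-1 observations.

    finishers: item ids in finishing order (best first).
    non_finishers: item ids that did not finish (tied for last).
    """
    observations = []
    remaining = list(finishers) + list(non_finishers)
    for winner in finishers:
        remaining.remove(winner)
        if remaining:  # skip the degenerate "choice" among a single item
            observations.append([winner, list(remaining)])
    return observations


# Items 2 and 0 finished in that order; items 1 and 3 did not finish.
print(race_to_top1([2, 0], [1, 3]))  # [[2, [0, 1, 3]], [0, [1, 3]]]
```

This treats not finishing as evidence of lower strength, which may or may not be appropriate for a given sport.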

Type Error in choicerank-tutorial.ipynb

networkx version == 2.5
choix version == 0.3.4

Running the code cell in 1. Generating sample data led to the following error:

This can be fixed using:

neighbors = list(graph.successors(src))

The same error also occurred for me in 2. Estimating transitions using choicerank and can be fixed in the same way.

partial ranking aggregation

I still do not understand which part/algorithm I should use for my problem.
I have several partial ranking lists for n objects (some objects may be missing from a given list), and I want to aggregate the lists into one final ranking.
Which method should I use, and how?
How should I represent the rankings, both for objects that appear in a list and for objects that do not?
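
As a sketch of the data-preparation step: to my understanding, choix's ranking functions (for example `choix.ilsr_rankings`) expect each ranking as a sequence of integer item ids ordered from best to worst, and an object that is absent from a list is simply left out of that list. The helper `encode_rankings` below is hypothetical and only handles the name-to-id mapping.

```python
def encode_rankings(named_rankings):
    """Map object names to consecutive integer ids and encode each partial
    ranking (best first) as a list of ids. Missing objects are just omitted."""
    index = {}
    encoded = []
    for ranking in named_rankings:
        row = []
        for name in ranking:
            if name not in index:
                index[name] = len(index)
            row.append(index[name])
        encoded.append(row)
    return encoded, index


named = [["a", "b", "c"], ["c", "a"], ["d", "b"]]
data, index = encode_rankings(named)
print(data)   # [[0, 1, 2], [2, 0], [3, 1]]
print(index)  # {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```

The encoded lists and `len(index)` would then be the `data` and `n_items` arguments of the ranking estimator, and sorting the returned parameters from largest to smallest gives the aggregated ranking.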

Bradley Terry Model

I'd like to build a Bradley Terry model. Could you provide an example please?

Speed up parameter inference

Currently the various parameter inference algorithms are implemented in pure python (with some vectorized operations via numpy).

Since most of these algorithms are iterative, the implementation is still relatively slow and inefficient. From experience, numba could potentially speed up the code by several orders of magnitude.

I hence intend to speed up some of the inference algorithms with numba, starting with the opt_* functions.

Unexpected argument 'alpha' for opt_pairwise

Hi @lucasmaystre,

When using choix.opt_pairwise, setting alpha raises an error:

TypeError: opt_pairwise() got an unexpected keyword argument 'alpha'

But the doc suggests it should be possible to set it :)

Example code:

import choix

n_items = 5
data = [
    (1, 0), (0, 4), (3, 1),
    (0, 2), (2, 4), (4, 3),
]

choix.opt_pairwise(n_items, data, alpha=0.1)
