
parameter selection (kickscore) · 12 comments · closed

lucasmaystre commented on June 26, 2024
parameter selection

from kickscore.

Comments (12)

ioannis12 commented on June 26, 2024

For the selection of kernels: I think Constant + Exponential is a good starting point and usually gets you 99% of the performance of more complex combinations. And I would always have a Constant-only baseline to see whether time matters at all.

That was a good tip! I tried Constant + Exponential with a basic grid search, and the results improved a lot. They outperform TrueSkill by a small margin!


victorkristof commented on June 26, 2024

Hehe, indeed, there is theoretically an infinite number of combinations, especially when you consider that each kernel comes with its own hyperparameters (see Tables 6 and 7 in the paper)...

I would suggest selecting a small number of kernels that "make sense to you", i.e., that let you capture some of your hypotheses and intuition in the model. I would also suggest focusing on "simple" models first.

For example, if you include a "home advantage" parameter, you could make the hypothesis that there is no clear reason why this parameter should vary over time, and therefore use a constant kernel instead of a fancy combination of different kernels.

I hope this helps!


amirbachar commented on June 26, 2024

If I may jump into the discussion, I suggest using a library such as scikit-optimize (https://scikit-optimize.github.io/stable/) or hyperopt (https://github.com/hyperopt/hyperopt) to find good hyperparameters more efficiently. Grid search is really expensive, and should most likely be used only in extreme cases.


victorkristof commented on June 26, 2024

Thanks Amir for your suggestion!

Ioannis:

I guess some kernels converge faster than others, any ideas which ones to prefer?

I don't recall observing different convergence rates for different kernels, so I guess it's really up to your modeling assumptions. But note that if computational efficiency is important to you, you could use our implementation of Kickscore in Go!

And maybe @lucasmaystre would like to comment on this discussion? :)


lucasmaystre commented on June 26, 2024

@ioannis12 great question overall. Automatic model selection is not yet possible with kickscore, but it's something I'd like to add in the future.

Agreed with @victorkristof: for now, the best you can do is try different configurations and use model.log_likelihood() to select the best-performing model (at least this avoids having to do cross-validation).

For the selection of kernels: I think Constant + Exponential is a good starting point and usually gets you 99% of the performance of more complex combinations. And I would always have a Constant-only baseline to see whether time matters at all.

These choices "converge" fast in the sense that there are few hyperparameters: fewer knobs to turn, and they're usually more robust to a wide range of hyperparameter values.

@amirbachar using a hyperparameter-optimization library is indeed a principled way to explore the space of hyperparameter values.


lucasmaystre commented on June 26, 2024

Hi @tha23rd sorry for the delay. Here's a snippet.

import kickscore as ks
model = ks.BinaryModel()
k = ks.kernel.Constant(var=1.0)

# Add items.
model.add_item("A", kernel=k)
model.add_item("B", kernel=k)
model.add_item("home-adv", kernel=k)

# A wins against B in a "home" game.
model.observe(winners=["A", "home-adv"], losers=["B"], t=0.0)

# A wins against B in an "away" game.
model.observe(winners=["A"], losers=["B", "home-adv"], t=0.0)

Hope this helps.


victorkristof commented on June 26, 2024

Hi @ioannis12,

Thanks for reaching out!
It is true that the hyperparameters have a big influence on the model performance in general.

In our case, we usually run a grid-search (or randomized search) for some ranges of values for all hyperparameters.
We select the combination of hyperparameters that gives the highest log-likelihood on the training data only.
You can use the model.log_likelihood() function for that.
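That search loop can be sketched generically; the scoring callable below is a hypothetical stand-in for fitting a kickscore model on the training data and returning model.log_likelihood():

```python
import itertools

def grid_search(param_grid, score):
    """Return the hyperparameter combination maximizing `score`.

    `param_grid` maps hyperparameter names to candidate values;
    `score` fits a model for one combination and returns its training
    log-likelihood (e.g. kickscore's model.log_likelihood()).
    """
    names = sorted(param_grid)
    best, best_ll = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        ll = score(**params)
        if ll > best_ll:
            best, best_ll = params, ll
    return best, best_ll
```

With kickscore, `score` would build a fresh BinaryModel using the proposed kernel variances and length-scales, replay all observations, call model.fit(), and return model.log_likelihood().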

Does that help?

Victor.


ioannis12 commented on June 26, 2024

Thanks for the prompt reply.
Yes, I was planning to do that: create a loop and test all the kernels.
I'm not sure what to include, though. I see three variations in your paper (affine + Wiener, constant + Matérn, constant + Wiener), but there are hundreds of possible combinations, right?


ioannis12 commented on June 26, 2024

I see. Actually, the difficulty with my data is that most players have just 3-4 games. I guess some kernels converge faster than others; any ideas which ones to prefer?


tha23rd commented on June 26, 2024

For example, if you include a "home advantage" parameter

Hi - I don't want to derail the discussion too much, but how would I do the above? I tried to find some simple examples of modeling phenomena like this, but failed to find anything. Even just a nudge in the right direction would be much appreciated :)


ioannis12 commented on June 26, 2024

Hello,

Coming back to the last question: how do you include the home advantage? For example, if you have a team that wins 55% of its home games, what do you put in the model parameter? 0.55, or some other value?


lucasmaystre commented on June 26, 2024

Hi @ioannis12 apologies for the belated reply.

The value of the home-advantage parameter is learned as part of the inference process; you simply need to provide a kernel (e.g., a constant kernel with a given variance, and usually a variance roughly on the same scale as that used for the teams works well).

The fitted value of the home-advantage parameter is usually not very interpretable. It is optimized in such a way that, once all items are combined together, the predicted probabilities match the outcomes observed in the data as well as possible.

