
parameter selection (kickscore) · 12 comments · closed

lucasmaystre commented on June 26, 2024
parameter selection

from kickscore.

Comments (12)

ioannis12 commented on June 26, 2024

For the selection of kernels: I think Constant + Exponential is a good starting point and usually gets you 99% of the performance of more complex combinations. And I would always have a Constant-only baseline to see whether time matters at all.

That was a good tip! I tried Constant + Exponential with a basic grid search, and the results improved a lot. They outperform TrueSkill by a small margin!


victorkristof commented on June 26, 2024

Hehe, indeed, there is theoretically an infinite number of combinations, especially when you consider that each kernel comes with its own hyperparameters (see Tables 6 and 7 in the paper)...

I would suggest selecting a small number of kernels that "make sense to you", i.e., that let you capture some of your hypotheses and intuition in the model. I would also suggest focusing on "simple" models first.

For example, if you include a "home advantage" parameter, you could make the hypothesis that there is no clear reason why this parameter should vary over time, and therefore use a constant kernel instead of a fancy combination of different kernels.

I hope this helps!


amirbachar commented on June 26, 2024

If I may jump into the discussion, I suggest using a library such as scikit-optimize (https://scikit-optimize.github.io/stable/) or hyperopt (https://github.com/hyperopt/hyperopt) to find good hyperparameters more efficiently. Grid search is really expensive, and should most likely be used only in extreme cases.


victorkristof commented on June 26, 2024

Thanks Amir for your suggestion!

Ioannis:

I guess some kernels converge faster than others, any ideas which ones to prefer?

I don't recall observing different convergence rates for different kernels, so I guess it's really up to your modeling assumptions. But note that if computational efficiency is important to you, you could use our implementation of Kickscore in Go!

And maybe @lucasmaystre would like to comment on this discussion? :)


lucasmaystre commented on June 26, 2024

@ioannis12 great question overall. Automatic model selection is not yet possible with kickscore, but it's something I'd like to add in the future.

Agreed with @victorkristof: for now, the best you can do is try different configurations and use model.log_likelihood() to select the best-performing model (at least this avoids having to do cross-validation).

For the selection of kernels: I think Constant + Exponential is a good starting point and usually gets you 99% of the performance of more complex combinations. And I would always have a Constant-only baseline to see whether time matters at all.

These choices "converge" fast in the sense that there are few hyperparameters: fewer knobs to turn, and they're usually more robust to a wide range of hyperparameter values.

@amirbachar using a hyperparameter-optimization library is indeed a principled way to explore the space of hyperparameter values.


lucasmaystre commented on June 26, 2024

Hi @tha23rd sorry for the delay. Here's a snippet.

import kickscore as ks
model = ks.BinaryModel()
k = ks.kernel.Constant(var=1.0)

# Add items.
model.add_item("A", kernel=k)
model.add_item("B", kernel=k)
model.add_item("home-adv", kernel=k)

# A wins against B in a "home" game.
model.observe(winners=["A", "home-adv"], losers=["B"], t=0.0)

# A wins against B in an "away" game.
model.observe(winners=["A"], losers=["B", "home-adv"], t=0.0)

Hope this helps.


victorkristof commented on June 26, 2024

Hi @ioannis12,

Thanks for reaching out!
It is true that the hyperparameters have a big influence on the model performance in general.

In our case, we usually run a grid-search (or randomized search) for some ranges of values for all hyperparameters.
We select the combination of hyperparameters that gives the highest log-likelihood on the training data only.
You can use the model.log_likelihood() function for that.
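That search loop can be sketched generically; the scoring callable below is a hypothetical stand-in for fitting a kickscore model on the training data and returning model.log_likelihood():

```python
import itertools

def grid_search(param_grid, score):
    """Return the hyperparameter combination maximizing `score`.

    `param_grid` maps hyperparameter names to candidate values;
    `score` fits a model for one combination and returns its training
    log-likelihood (e.g. kickscore's model.log_likelihood()).
    """
    names = sorted(param_grid)
    best, best_ll = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        ll = score(**params)
        if ll > best_ll:
            best, best_ll = params, ll
    return best, best_ll
```

With kickscore, `score` would build a fresh BinaryModel using the proposed kernel variances and length-scales, replay all observations, call model.fit(), and return model.log_likelihood().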

Does that help?

Victor.


ioannis12 commented on June 26, 2024

Thanks for the prompt reply.
Yes, I was planning to do that: create a loop and test all the kernels.
I'm not sure what to include, though. I see three variations in your paper (affine + Wiener, constant + Matérn, constant + Wiener), but there are hundreds of possible combinations, right?


ioannis12 commented on June 26, 2024

I see. Actually, the difficulty with my data is that most players have just 3-4 games. I guess some kernels converge faster than others; any ideas which ones to prefer?


tha23rd commented on June 26, 2024

For example, if you include a "home advantage" parameter

Hi - I don't want to derail the discussion too much, but how would I do the above? I tried to find some simple examples of modeling phenomena like this, but failed to find anything. Even just a nudge in the right direction would be much appreciated :)


ioannis12 commented on June 26, 2024

Hello,

Coming back to the last question: how do you include the home advantage? For example, if you have a team that wins 55% of its home games, what do you put in the model parameter? 0.55, or some other value?


lucasmaystre commented on June 26, 2024

Hi @ioannis12 apologies for the belated reply.

The value of the home-advantage parameter is learned as part of the inference process; you simply need to provide a kernel (e.g., a constant kernel with a given variance, and usually a variance roughly on the same scale as that used for the teams works well).

The fitted value of the home-advantage parameter is usually not very interpretable. It is optimized in such a way that, once all items are combined together, the predicted probabilities match the outcomes observed in the data as well as possible.

