Comments (9)

ablaom commented on May 14, 2024

Deterministic vs Probabilistic predictions for ensemble models

For each atomic model type Atom, we have an ensemble model type Ensemble{Atom}. (I'm omitting the fitresult type parameter). Here's a proposal for ensembles:

If Atom is Deterministic, then so is Ensemble{Atom}. Predicting a distribution does not make any sense here, as far as I can see. The variability of the individual atomic predictions is an artifact of the algorithm used to generate the ensemble (e.g. bagging) and does not directly reflect the uncertainty of the ensemble model's final prediction. (Edit: That said, the random forest classifier in scikit-learn and elsewhere predicts a probability and not the ensemble mode.)

If Atom is Probabilistic there are two cases:

(i) Atom has nominal target. In this case it makes sense to average the discrete probability distributions (i.e. average the underlying measures) and so Ensemble{Atom} becomes Probabilistic.

(ii) Atom has numeric target. In this case averaging the measures almost always delivers a distribution of a different form from that of the atoms (even in the normal case: averaging the measures is not the same thing as averaging the associated random variables, and the latter doesn't make sense to me here). So I guess we make Ensemble{Atom} Deterministic in this case. We could take the mean of the means to get a point estimate, or randomly sample each atom's predicted probability distribution (exactly once, at the time of fitting the ensemble) and take the mean of those samples. A sketch of both cases is given below.
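To make the two cases concrete, here is a minimal sketch using Distributions.jl; the variable names and toy atomic predictions are my own illustration, not code from MLJ:

    using Distributions, Statistics

    # Case (i): nominal target. Each atom predicts a pmf over the same
    # classes; the ensemble averages the probability vectors, which
    # yields another valid pmf, so the ensemble stays Probabilistic.
    atomic_pmfs = [Categorical([0.7, 0.2, 0.1]),
                   Categorical([0.5, 0.4, 0.1])]
    avg_probs = mean([probs(d) for d in atomic_pmfs])  # still sums to 1
    ensemble_pmf = Categorical(avg_probs)

    # Case (ii): numeric target. The two point-prediction options:
    atomic_pdfs = [Normal(1.0, 0.5), Normal(1.4, 0.6)]
    point_a = mean(mean.(atomic_pdfs))  # mean of the means
    point_b = mean(rand.(atomic_pdfs))  # one sample per atom, then average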

What do others think?

ablaom commented on May 14, 2024

Further to (ii): I suppose in the normal case, not expecting large variations, we could just approximate the averaged pdf by a normal one. The mean for this approximation would be the mean of the means. What is the most natural way to combine the standard deviations?
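One natural answer, assuming the goal is a moment-matched normal approximation to the equally weighted mixture of the atomic normals N(μᵢ, σᵢ²): the law of total variance gives

    \sigma^2 = \frac{1}{n}\sum_{i=1}^n \sigma_i^2
             + \frac{1}{n}\sum_{i=1}^n (\mu_i - \bar\mu)^2,
    \qquad \bar\mu = \frac{1}{n}\sum_{i=1}^n \mu_i,

so the combined standard deviation is the square root of the average atomic variance plus the empirical variance of the means; it is not the plain average of the σᵢ.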

fkiraly commented on May 14, 2024

"If Atom is Deterministic, then so is Ensemble{Atom}."
"Predicting a distribution does not make any sense here, as far as I can see. "

Hm, but it could make sense to attach a transformer to the deterministic ensemble of predictions to make it into a distribution (which need not be identical)? I.e., instead of using the re-sampled distribution as a probabilistic prediction, one could use it as a (distributional) feature in fitting one?
Though this is a bit hypothetical and non-standard.

But 150% agreed that the re-sampled distribution is not a good probabilistic supervised prediction in general. A lot of people make that mistake - I applaud you for being aware of the issue.

fkiraly commented on May 14, 2024

The natural way to do bagging on a probabilistic estimator is to average the pdfs/pmfs; see section 6.3 of https://arxiv.org/abs/1801.00753
(again, a round of applause for @ablaom)

In case (ii), I would still average - in skpro, we've used a mixture distribution type for this.

Of course you may want to fuse this with a transformer-adaptor which makes it a simple parametric distribution, e.g., normal, with the mixture distribution's mean and variance. I'd consider this a composite strategy though, not the "natural" bagging ensembler.
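A minimal sketch of both steps in Distributions.jl, with toy atomic predictions (an analogue of skpro's mixture type, not code from either package):

    using Distributions

    # Equally weighted mixture of the atoms' predicted distributions:
    # the "natural" bagging ensemble for case (ii).
    atoms = [Normal(1.0, 0.5), Normal(1.4, 0.6)]
    mix = MixtureModel(atoms)  # uniform weights by default

    # Composite step: collapse to a simple parametric form by matching
    # the mixture's first two moments.
    simple = Normal(mean(mix), std(mix))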

fkiraly commented on May 14, 2024

PS: the expected value of the one-sample-per-atom average is the same as the mean of the atoms' means, so the two point estimates proposed in (ii) agree in expectation.
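A quick Monte Carlo check of this identity, with made-up atoms:

    using Distributions, Statistics

    atoms = [Normal(1.0, 0.5), Normal(1.4, 0.6)]
    mean(mean.(atoms))                          # mean of the means: 1.2
    mean([mean(rand.(atoms)) for _ in 1:10^5])  # sample average: ≈ 1.2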

fkiraly commented on May 14, 2024

Actually, come to think of it,

"Hm, but it could make sense to attach a transformer to the deterministic ensemble of predictions to make into a distribution (which need not be identical)? I.e., instead of using the re-sampled distribution as a probabilistic prediction, one could use it as a (distributional) feature in fitting one?"

is not non-standard:
probability calibration is an instance of this!
https://scikit-learn.org/stable/modules/calibration.html
The natural type of probability calibration is a distribution->distribution target transformer.
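For a concrete flavor of such a distribution-to-distribution transformer, here is a hedged sketch of one simple recalibration scheme for numeric targets (variance scaling); the function names are made up, and this is not scikit-learn's classifier calibration:

    using Distributions, Statistics

    # Fit a single scale factor s on held-out data so that the
    # standardized residuals have unit spread, then widen (or narrow)
    # every predicted Normal(μ, σ) into Normal(μ, s*σ).
    function fit_scale(preds::Vector{<:Normal}, y::AbstractVector)
        z = (y .- mean.(preds)) ./ std.(preds)  # standardized residuals
        return std(z)  # s > 1 means the predictions were too narrow
    end

    recalibrate(d::Normal, s::Real) = Normal(mean(d), s * std(d))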

ablaom commented on May 14, 2024

I see that Distributions has mixture models, yay. So I'll look into that for ensembles of numeric probabilistic models.

fkiraly commented on May 14, 2024

Well, that should make it pretty easy then?

ablaom commented on May 14, 2024

Yes, now done. Here is the current docstring for the basic ensembling as now implemented:

EnsembleModel(atom=nothing, weights=Float64[], bagging_fraction=0.8, rng_seed=0, n=100, parallel=true)

Create a model for training an ensemble of n learners, with optional
bagging, each with associated model atom. Useful if
fit!(machine(atom, data...)) does not create identical models on
repeated calls (i.e., is a stochastic model, such as a decision tree
with randomized node selection criteria), or if bagging_fraction is
set to a value not equal to 1.0 (or both). The constructor fails if no
atom is specified.

Predictions are weighted according to the vector weights (to allow
for external optimization) except in the case that atom is a
Deterministic classifier. Uniform weights are used if weights has zero
length.

The ensemble model is Deterministic or Probabilistic, according to
the corresponding supertype of atom. In the case of classifiers, the
predictions are majority votes, and for regressors they are ordinary
averages. Probabilistic predictions are obtained by averaging the
atomic probability distribution functions; in particular, for
regressors, the ensemble prediction on each input pattern has the type
MixtureModel{VF,VS,D} from the Distributions.jl package, where D
is the type of predicted distribution for atom.
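Going by this docstring, a minimal usage sketch; here atom, X and y are placeholders for a previously constructed atomic model and training data, not part of the docstring:

    using MLJ

    ensemble = EnsembleModel(atom=atom, n=100, bagging_fraction=0.8)
    mach = machine(ensemble, X, y)
    fit!(mach)
    yhat = predict(mach, X)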
