Comments (19)
Well, if you want to use `skpro` for probabilistic predictions, the package itself needs an update; it's a somewhat larger project than just adding a method. If you're interested in doing that, we should set up a call. Also (I believe you are part of the mentoring programme?) discuss it with your mentor.
from skpro.
Actually yes!
We've been re-working the probabilistic forecasting interface:
sktime/sktime#4359
This will enable using probabilistic supervised learners in compositors like `make_reduction` much more easily.
Want to help work on this, @drackham ? It's a bit of an engineering project, but there's a step-by-step roadmap.
Would be much appreciated!
We'll probably move this forward a bit over the Easter holidays, when the volunteer contributors tend to have more time.
What would also be helpful is testing the probabilistic forecasting interface and reporting your experiences or any design suggestions (in sktime/sktime#4359). It will be released as experimental in 0.17.0 and in full in 0.18.0.
from skpro.
(will continue on sktime/sktime#4359 for architecture discussion)
from skpro.
I'll try to start working on adding an interface to `ngboost` in `skpro` soon, after going through its docs.
from skpro.
Interesting!
NGBoost is a probabilistic supervised prediction algorithm (for tabular data), not a forecaster, so we would first have to build a full probabilistic interface. That would be a great thing to have for `sktime`.
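To make the distinction concrete, here is a minimal sketch of what a "probabilistic supervised" interface for tabular data could look like. The class `GaussianLinearRegressor` and its `predict_proba` return format are hypothetical and only for illustration; this is not the `skpro` or `ngboost` API, and the constant-variance Gaussian model is a simplifying assumption.

```python
import numpy as np


class GaussianLinearRegressor:
    """Hypothetical toy probabilistic regressor (not skpro or ngboost API).

    Fits ordinary least squares for the mean and uses the residual
    standard deviation as a constant predictive scale.
    """

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # add an intercept column and solve the least-squares problem
        A = np.column_stack([np.ones(len(X)), X])
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        residuals = y - A @ self.coef_
        # unbiased residual scale: degrees of freedom = number of coefficients
        self.sigma_ = float(residuals.std(ddof=A.shape[1]))
        return self

    def predict(self, X):
        """Point prediction: one number per row."""
        X = np.asarray(X, dtype=float)
        A = np.column_stack([np.ones(len(X)), X])
        return A @ self.coef_

    def predict_proba(self, X):
        """Probabilistic prediction: Gaussian parameters per row."""
        mu = self.predict(X)
        return {"mu": mu, "sigma": np.full_like(mu, self.sigma_)}
```

A forecaster, by contrast, predicts future values of a time series; a reduction compositor would wrap a tabular learner like the one above to obtain a forecaster.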
Now about something that's partly funny and partly not funny...
"NGBoost" is basically identical to the probabilistic boosting algorithm I proposed (earlier) in section 6.4 of my 2018 paper https://arxiv.org/abs/1801.00753
The probabilistic prediction interface of the NGBoost python package closely follows principles in the `sktime` companion package `skpro` (scikit-learn like probabilistic prediction), or the R package `mlr3proba`.
I'm reasonably certain that Ng et al. know of both the methodological paper and the software interface designs, but still don't cite them... that's not very nice.
Anyway, we should develop `skpro` into the probabilistic scikit-learn that it was meant to be, but unfortunately it's currently without a maintainer. Might that be something you would be interested in?
from skpro.
Yes, section 6.4 of the above-mentioned paper deals with a similar concept.
That sounds good to me, I can use `skpro` to build it. The question is how to build that probabilistic interface; if I can get a starting point, I can carry it from there. I was planning to use `make_reduction` (which I now assume is not the correct way?).
from skpro.
Yes, I will discuss this with my mentor. Thanks
from skpro.
I'm very interested in the capability discussed here. It doesn't appear like there has been any progress here. Would you correct me if that is in fact not the case? Thank you!
from skpro.
Speaking of which, @frthjf, are you still around?
I would like to move the probabilistic interface into `skpro` within the next year or so and use it as an import.
That would be step number 7 or 8 in sktime/sktime#4359 (not there yet, but see context above).
from skpro.
@fkiraly thank you!
I discovered sktime recently, and these types of models are not really in my core areas of competency, so I'd likely be unable to contribute effectively. That said, I'll take a look at the contribution documentation and see if my apprehension is unwarranted.
I'll also take a look at the probabilistic forecasting interface and report back. Thanks again!
from skpro.
@fkiraly I am still around :-)
Just to clarify, do I understand correctly that the plan is to resurrect the `skpro` package, which in turn becomes an optional dependency of `sktime`? I could certainly help with updating the `skpro` code.
from skpro.
> @fkiraly I am still around :-)

Nice to hear from you again! Let's catch up, on Discord perhaps?

> Just to clarify, do I understand correctly that the plan is to resurrect the skpro package which in turn becomes an optional dependency of sktime? I could certainly help with updating the skpro code.

Yes!
For now, I've been working in the `sktime/proba` module. Have a look and let me know what you think!
The design is a mix of `pandas`, `sklearn` (base interface), and `tensorflow-probability` (parameter broadcast).
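As a rough illustration of that mix, the sketch below shows a distribution object that is indexed like a `pandas.DataFrame` and broadcasts its parameters against each other. The class name `Normal`, its constructor arguments, and its methods are assumptions made for this example; they are not taken from the actual module.

```python
import numpy as np
import pandas as pd


class Normal:
    """Hypothetical array-like normal distribution with pandas-style labels."""

    def __init__(self, mu, sigma, index=None, columns=None):
        # broadcast parameters to a common 2D shape, e.g. (3, 1) and scalar
        mu, sigma = np.broadcast_arrays(np.atleast_2d(mu), np.atleast_2d(sigma))
        # astype copies, so the stored arrays are writable and independent
        self.mu = mu.astype(float)
        self.sigma = sigma.astype(float)
        n_rows, n_cols = self.mu.shape
        self.index = pd.RangeIndex(n_rows) if index is None else pd.Index(index)
        self.columns = pd.RangeIndex(n_cols) if columns is None else pd.Index(columns)

    def _frame(self, values):
        # wrap raw values in a DataFrame carrying the distribution's labels
        return pd.DataFrame(values, index=self.index, columns=self.columns)

    def mean(self):
        return self._frame(self.mu)

    def var(self):
        return self._frame(self.sigma ** 2)

    def pdf(self, x):
        # x is broadcast against the parameter arrays
        x = np.asarray(x, dtype=float)
        z = (x - self.mu) / self.sigma
        return self._frame(np.exp(-0.5 * z ** 2) / (self.sigma * np.sqrt(2 * np.pi)))


# three rows, one column; the scalar sigma is broadcast to all rows
d = Normal(mu=[[0.0], [1.0], [2.0]], sigma=1.0, index=["a", "b", "c"], columns=["y"])
print(d.mean())
print(d.pdf(0.0))
```

Returning `DataFrame` objects keeps the row and column labels of the predictions attached to every derived quantity, which is the `pandas`-like part of the design described above.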
I'd like to move it out to `skpro`, and make `skpro` a core (not optional) dependency of `sktime`. Both `sktime` and `skpro` would eventually depend on `skbase`, which has the base class framework.
from skpro.
For comments, the topical issue is here, @frthjf:
sktime/sktime#4359
(this issue is about a specific probabilistic forecaster)
from skpro.
I see. In that case, why move this out of `sktime` into an independent `skpro` package if it's required back in anyway? Wouldn't it be easier to port the `skpro` features into `sktime` instead? It seems to me that the `sktime` package already has all the CI and package infrastructure we would have to recreate for a `skpro` re-release.
from skpro.
The `skpro` package is now sufficiently mature to accommodate an interface to `ngboost`, so I have moved the issue to `skpro`.
from skpro.
FYI @drackham, @satya-pattnaik, @frthjf - I have updated the issue with instructions on "how-to". Together with the `skpro` machinery and the existing integration with `sktime`, this should now be pretty straightforward.
Would be great if one of you would like to implement this, I'm happy to advise!
Also FYI @Alex-JG3, @Ram0nB, in case one of you is interested, this intersects with your previous contribution topics.
from skpro.
Have you considered XGBoostLSS and LightGBMLSS as well? Both offer great flexibility and are based on the two most commonly used boosting machines for tabular data. There is no sklearn API available yet, but there is a PR for it.
Shall I open a new issue for this?
from skpro.
Excellent suggestion, @KiwiAthlete - opened an issue here: #135
from skpro.
Great! I notice I linked the wrong issue, fixed the link.
I'd recommend you continue discussing in issue #135 what the interface would look like.
from skpro.