In 2014, a paper published in JMLR reported the results of more than 100 classificati


Is there any future plan for supporting classification benchmarks? about srbench HOT 5 CLOSED

cavalab commented on August 11, 2024

Is there any future plan for supporting classification benchmarks?

from srbench.

Comments (5)

athril commented on August 11, 2024

I don't think we currently plan to work on large-scale benchmarking of classifiers with a specific focus on GP-based ones.
Please note that the datasets included in srbench are regression problems; PMLB also covers over 165 classification problems.
Nonetheless, you might be interested in taking a look at the following papers that cover more recent methods:
https://biodatamining.biomedcentral.com/articles/10.1186/s13040-017-0154-4
https://www.worldscientific.com/doi/pdf/10.1142/9789813235533_0018
https://arxiv.org/abs/2107.06475


lacava commented on August 11, 2024

We have certainly thought about incorporating symbolic classification algorithms into our benchmarking. They are a bit less common in the SR literature, but nonetheless I agree such a benchmark would be very useful. I could see it being an addition to this repo.


hengzhe-zhang commented on August 11, 2024

Are there any specific plans with respect to this matter? In my opinion, all the analysis scripts can be reused: we only need to replace the experimental datasets with the classification datasets in the PMLB database, and replace the machine learning estimators with their classification counterparts.
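
A rough sketch of the swap described above, under stated assumptions: `ExperimentConfig`, `to_classification`, and the choice of balanced accuracy as the metric are hypothetical names made up for illustration and are not part of srbench. The estimator strings are scikit-learn class names, and `"iris"` stands in for any PMLB classification dataset.

```python
from dataclasses import dataclass, replace

# Hypothetical mapping from scikit-learn regressors to their classification
# counterparts; a real benchmark would need an entry for every method it runs.
REGRESSOR_TO_CLASSIFIER = {
    "RandomForestRegressor": "RandomForestClassifier",
    "GradientBoostingRegressor": "GradientBoostingClassifier",
    "KNeighborsRegressor": "KNeighborsClassifier",
    "LinearRegression": "LogisticRegression",
}


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical description of one benchmark run (not an srbench type)."""
    estimator: str
    dataset: str
    metric: str


def to_classification(config: ExperimentConfig, dataset: str) -> ExperimentConfig:
    """Return a copy of `config` retargeted at a classification dataset."""
    if config.estimator not in REGRESSOR_TO_CLASSIFIER:
        raise KeyError(f"no classification counterpart known for {config.estimator}")
    return replace(
        config,
        estimator=REGRESSOR_TO_CLASSIFIER[config.estimator],
        dataset=dataset,
        # Assumed replacement for a regression metric such as r2.
        metric="balanced_accuracy",
    )


# The regression dataset name below is a placeholder, not a real PMLB entry.
regression_run = ExperimentConfig("RandomForestRegressor", "placeholder_regression_dataset", "r2")
classification_run = to_classification(regression_run, "iris")
```

Keeping the change in one mapping like this would leave the analysis scripts untouched, which is the point of the proposal; methods without an obvious counterpart fail loudly instead of being silently skipped.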


hengzhe-zhang commented on August 11, 2024

By the way, I'm not sure whether we should reuse the results reported in the previous large-scale benchmark paper. The classifiers used in that paper are rather old, and it doesn't include SOTA classifiers such as XGBoost and LightGBM, so it is questionable whether reusing its results is worthwhile. Even worse, some papers have pointed out that those results were obtained under a flawed experimental protocol [1], e.g., the paper used test data to tune the hyperparameters. Consequently, the results reported in that article are not reliable.

[1] Wainberg, M., Alipanahi, B., Frey, B. J. Are random forests truly the best classifiers? Journal of Machine Learning Research, 2016, 17(1): 3837-3841.
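
To make the protocol concern in [1] concrete, here is a minimal sketch of tuning a hyperparameter without touching the test data. Everything in it is an assumption for illustration: the toy 1-D dataset, the hand-written k-NN classifier, and the 60/20/20 split are not taken from either paper.

```python
import random


def make_toy_data(n=200, seed=0):
    """Toy 1-D binary problem: class 0 centred at 0.0, class 1 centred at 2.0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, data, k):
    """Fraction of points in `data` classified correctly using `train`."""
    return sum(knn_predict(train, x, k) == y for x, y in data) / len(data)


def tune_then_test(data, candidate_ks, seed=0):
    """Pick k on a validation split, then score once on the held-out test split."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    train = shuffled[: int(0.6 * n)]
    valid = shuffled[int(0.6 * n): int(0.8 * n)]
    test = shuffled[int(0.8 * n):]
    # Hyperparameter selection sees only the train and validation splits.
    best_k = max(candidate_ks, key=lambda k: accuracy(train, valid, k))
    # The test split is used exactly once, after the choice of k is final.
    return best_k, accuracy(train + valid, test, best_k)


best_k, test_accuracy = tune_then_test(make_toy_data(), [1, 3, 5, 7])
```

Selecting `best_k` by its accuracy on `test` instead would be the kind of leak described above: the reported score would then be optimistically biased, because the same data both chose the model and graded it.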


hengzhe-zhang commented on August 11, 2024

@athril Thank you for providing the DIGEN package. This is exactly what I am looking for, excellent work!

