Comments (5)

athril commented on August 11, 2024

I don't think we currently plan to work on large-scale benchmarking of classifiers with a specific focus on GP-based ones.
Please note that the datasets included in srbench are regression problems; PMLB also covers over 165 classification problems.
Nonetheless, you might be interested in the following papers, which cover more recent methods:
https://biodatamining.biomedcentral.com/articles/10.1186/s13040-017-0154-4
https://www.worldscientific.com/doi/pdf/10.1142/9789813235533_0018
https://arxiv.org/abs/2107.06475

from srbench.

lacava commented on August 11, 2024

We have certainly thought about incorporating symbolic classification algorithms into our benchmarking. They are a bit less common in SR literature, but nonetheless I agree such a benchmark would be very useful. I could see it being an addition to this repo.

hengzhe-zhang commented on August 11, 2024

Are there any specific plans with respect to this matter? In my opinion, all of the analysis scripts can be reused; we only need to swap the experimental datasets for the classification datasets in the PMLB database and replace the machine learning estimators with their classification counterparts.
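A minimal sketch of the swap described above, assuming a simple name-based registry. The mapping, the function name `plan_classification_runs`, and the dataset names are hypothetical and are not taken from srbench; the estimator names merely follow scikit-learn's naming and are kept as plain strings so the sketch has no dependencies.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical pairing of regressors with classification counterparts.
# Names follow scikit-learn's class naming but are only strings here.
CLASSIFIER_COUNTERPART: Dict[str, str] = {
    "RandomForestRegressor": "RandomForestClassifier",
    "KNeighborsRegressor": "KNeighborsClassifier",
    "MLPRegressor": "MLPClassifier",
    "LinearRegression": "LogisticRegression",
}


def plan_classification_runs(
    regressors: Iterable[str], datasets: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return (dataset, classifier) pairs for a classification benchmark.

    Regressors without a known counterpart are skipped rather than guessed.
    """
    classifiers = [
        CLASSIFIER_COUNTERPART[name]
        for name in regressors
        if name in CLASSIFIER_COUNTERPART
    ]
    return [(dataset, clf) for dataset in datasets for clf in classifiers]


if __name__ == "__main__":
    # Placeholder dataset names; real names would come from the benchmark suite.
    for run in plan_classification_runs(
        ["RandomForestRegressor", "LinearRegression"], ["dataset_a", "dataset_b"]
    ):
        print(run)
```

In practice the dataset list would presumably come from the PMLB classification collection rather than from placeholders, and each classifier name would be resolved to an actual estimator with its own hyperparameter grid.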

hengzhe-zhang commented on August 11, 2024

By the way, I'm not sure whether we should reuse the results reported in the previous large-scale benchmark paper. The classifiers used in that paper are rather old, and it does not include state-of-the-art classifiers such as XGBoost and LightGBM, so it is questionable whether reusing its results is worthwhile. Worse, some papers have pointed out that those results were obtained under a flawed experimental protocol [1]; for example, that paper used test data to tune hyperparameters. Consequently, the results reported in that article are not reliable.

[1] Wainberg M, Alipanahi B, Frey BJ. Are random forests truly the best classifiers? Journal of Machine Learning Research, 2016, 17(1): 3837-3841.
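To illustrate the protocol issue, here is a minimal sketch of tuning a hyperparameter on a separate validation split and touching the test split only once. It is not the procedure of any paper mentioned here; the toy 1-D k-nearest-neighbour classifier, the 60/20/20 split, and all function names are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[float, int]  # (feature value, binary label)


def knn_predict(train: Sequence[Row], x: float, k: int) -> int:
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > len(neighbours) else 0


def accuracy(train: Sequence[Row], rows: Sequence[Row], k: int) -> float:
    """Fraction of rows whose label is predicted correctly."""
    return sum(knn_predict(train, x, k) == y for x, y in rows) / len(rows)


def tune_then_test(rows: List[Row], ks: Sequence[int], seed: int = 0) -> Tuple[int, float]:
    """Pick k on the validation split, then report accuracy on the test split."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_hold = len(shuffled) // 5  # 20% test, 20% validation, rest training
    test = shuffled[:n_hold]
    val = shuffled[n_hold:2 * n_hold]
    train = shuffled[2 * n_hold:]
    # Model selection uses validation data only.
    best_k = max(ks, key=lambda k: accuracy(train, val, k))
    # The test split is used exactly once, after k is fixed.
    return best_k, accuracy(train, test, best_k)


if __name__ == "__main__":
    data = [(i / 100, int(i >= 50)) for i in range(100)]
    print(tune_then_test(data, ks=[1, 3, 5]))
```

Selecting `best_k` by test accuracy instead would let information from the test split leak into model selection, which is the kind of optimistic bias the criticism above refers to.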

hengzhe-zhang commented on August 11, 2024

@athril Thank you for providing the DIGEN package. This is exactly what I am looking for, excellent work!