It would be great if you could do benchmarks with different data set sizes and with ta

dataset sizes for benchmarks about scikit-learn_bench HOT 5 CLOSED

intelpython commented on May 30, 2024

dataset sizes for benchmarks

from scikit-learn_bench.

Comments (5)

amueller commented on May 30, 2024

And again, it's also an issue of how you display the results. I'm much more likely to believe a speedup of 20x from 0.1s to 0.005s than from 1h to 3m. If something is instantaneous, we usually don't try to optimize it much further.

amueller commented on May 30, 2024

It would also be great to have the absolute times, not only the relative times. Some of these algorithms take 0.5s. In that case, our input validation overhead is probably dominating the work.
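One way to report absolute and relative times side by side is sketched below. It uses only the Python standard library; `time_call`, `format_row`, and the two toy workloads are hypothetical names chosen for illustration and are not part of scikit-learn_bench.

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Return the median wall-clock time of fn() in seconds over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def format_row(name, baseline_s, patched_s):
    """Format one result row with both absolute times and the speedup."""
    # Guard against a zero reading on timers with coarse resolution.
    speedup = baseline_s / patched_s if patched_s > 0 else float("inf")
    return f"{name:<12} {baseline_s:>10.4f}s {patched_s:>10.4f}s {speedup:>8.1f}x"


# Toy workloads standing in for a baseline and an optimized implementation.
def baseline_sum_of_squares(n=200_000):
    return sum(i * i for i in range(n))


def patched_sum_of_squares(n=200_000):
    return (n - 1) * n * (2 * n - 1) // 6  # closed form for sum of i^2, i < n


baseline_s = time_call(baseline_sum_of_squares)
patched_s = time_call(patched_sum_of_squares)
print(f"{'benchmark':<12} {'baseline':>11} {'patched':>11} {'speedup':>9}")
print(format_row("sum_squares", baseline_s, patched_s))
```

Printing the absolute columns next to the ratio lets a reader see at a glance whether a large speedup comes from a workload that was already near-instantaneous.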

bibikar commented on May 30, 2024

Hi @amueller,

We can definitely try both tall and wide data and report absolute timing. As for input validation, we disable it entirely here. That basically calls sklearn.set_config(assume_finite=True).

Currently, sparse inputs will always cause our patches to fall back to scikit-learn or convert the sparse matrix to a dense one.
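A minimal sketch of what such a run could look like is shown below, assuming NumPy and scikit-learn are installed. Only `sklearn.set_config(assume_finite=True)` comes from the comment above; the choice of `KMeans`, the array shapes, and the variable names are assumptions made for illustration and do not reflect the actual scikit-learn_bench configuration.

```python
import time

import numpy as np
import sklearn
from sklearn.cluster import KMeans

# Skip the finiteness check during input validation, as described above.
sklearn.set_config(assume_finite=True)

rng = np.random.default_rng(0)

# Hypothetical shapes: one "tall" dataset (many samples, few features)
# and one "wide" dataset (few samples, many features).
shapes = {"tall": (20_000, 10), "wide": (200, 1_000)}

timings = {}
for label, (n_samples, n_features) in shapes.items():
    X = rng.standard_normal((n_samples, n_features))
    start = time.perf_counter()
    KMeans(n_clusters=8, n_init=1, random_state=0).fit(X)
    timings[label] = time.perf_counter() - start
    # Report the absolute fit time together with the data shape.
    print(f"{label:<5} {n_samples:>7} x {n_features:<5} {timings[label]:.4f}s")
```

These sizes are deliberately small so the sketch finishes quickly; a run meant to answer the concerns in this thread would scale them up until a single fit takes seconds or minutes.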

amueller commented on May 30, 2024

@bibikar enabling assume_finite is definitely the right way to go. Still, I don't expect anything that takes 0.5s to be optimized in sklearn. Can you run something that takes like 10s or 1m?

napetrov commented on May 30, 2024

Over the last several years, the dataset sizes have become more varied, and we are working on including more datasets with the introduction of GPU support.
