Giter Site home page Giter Site logo

Comments (3)

declanvk avatar declanvk commented on August 21, 2024

Initial attempt using memchr crate caused a performance regression of about 20% in all benchmarks which used the search functions:

Benchmark is comparing 62dec88 versus 7979569:

skewed/search/first_key time:   [228.0429 cycles 228.0435 cycles 228.0440 cycles]                    
                        change: [+11.766% +11.768% +11.771%] (p = 0.00 < 0.05)
                        Performance has regressed.
skewed/search/middle_key                                                                              
                        time:   [17035.9464 cycles 17036.2015 cycles 17036.4510 cycles]
                        change: [+23.159% +23.163% +23.168%] (p = 0.00 < 0.05)
                        Performance has regressed.
skewed/search/last_key  time:   [33660.2670 cycles 33660.3431 cycles 33660.4204 cycles]             
                        change: [+23.257% +23.260% +23.263%] (p = 0.00 < 0.05)
                        Performance has regressed.
skewed/minimum          time:   [10985.9127 cycles 10985.9386 cycles 10985.9655 cycles]    
                        change: [-0.0017% +0.0008% +0.0032%] (p = 0.55 > 0.05)
                        No change in performance detected.
skewed/maximum          time:   [108.0185 cycles 108.0187 cycles 108.0190 cycles]           
                        change: [-0.0021% +0.0004% +0.0028%] (p = 0.76 > 0.05)
                        No change in performance detected.

fixed_length/search/first_key                                                                             
                        time:   [1163.2561 cycles 1163.2592 cycles 1163.2622 cycles]
                        change: [+20.767% +20.771% +20.774%] (p = 0.00 < 0.05)
                        Performance has regressed.
fixed_length/search/middle_key                                                                             
                        time:   [1219.2575 cycles 1219.2607 cycles 1219.2639 cycles]
                        change: [+18.693% +18.697% +18.701%] (p = 0.00 < 0.05)
                        Performance has regressed.
fixed_length/search/last_key                                                                             
                        time:   [1275.2677 cycles 1275.2709 cycles 1275.2742 cycles]
                        change: [+18.606% +18.609% +18.612%] (p = 0.00 < 0.05)
                        Performance has regressed.
fixed_length/minimum    time:   [406.2254 cycles 406.2330 cycles 406.2404 cycles]                 
                        change: [+0.0260% +0.0319% +0.0376%] (p = 0.00 < 0.05)
                        Change within noise threshold.
fixed_length/maximum    time:   [423.1099 cycles 423.1113 cycles 423.1128 cycles]                 
                        change: [-0.0035% +0.0007% +0.0047%] (p = 0.74 > 0.05)
                        No change in performance detected.

large_prefixes/search/first_key                                                                             
                        time:   [1255.2685 cycles 1255.2718 cycles 1255.2752 cycles]
                        change: [+18.953% +18.958% +18.963%] (p = 0.00 < 0.05)
                        Performance has regressed.
large_prefixes/search/middle_key                                                                             
                        time:   [1326.2755 cycles 1326.2791 cycles 1326.2826 cycles]
                        change: [+16.924% +16.928% +16.931%] (p = 0.00 < 0.05)
                        Performance has regressed.
large_prefixes/search/last_key                                                                             
                        time:   [1382.2774 cycles 1382.2809 cycles 1382.2844 cycles]
                        change: [+16.925% +16.928% +16.931%] (p = 0.00 < 0.05)
                        Performance has regressed.
large_prefixes/minimum  time:   [406.1022 cycles 406.1036 cycles 406.1052 cycles]                   
                        change: [-0.0023% +0.0019% +0.0071%] (p = 0.44 > 0.05)
                        No change in performance detected.
large_prefixes/maximum  time:   [423.1083 cycles 423.1098 cycles 423.1113 cycles]                   
                        change: [-0.0052% +0.0000% +0.0042%] (p = 0.99 > 0.05)
                        No change in performance detected.

My current guess is that the memchr optimizations are not useful for the small vector sizes (4,16), so the linear search wins.

from blart.

declanvk avatar declanvk commented on August 21, 2024

Next step for this would be implementing a generalized search function using std::simd

from blart.

declanvk avatar declanvk commented on August 21, 2024

Initial attempt at optimizing with std::simd resulted in significant slowdown. Benchmark is comparing commit b1b1cac to b75697d :

skewed/search/first_key time:   [16.128 ns 16.159 ns 16.191 ns]                                      
                        change: [+19.492% +19.649% +19.783%] (p = 0.00 < 0.05)
                        Performance has regressed.
skewed/search/middle_key                                                                              
                        time:   [1.1508 µs 1.1509 µs 1.1510 µs]
                        change: [+49.440% +49.519% +49.593%] (p = 0.00 < 0.05)
                        Performance has regressed.
skewed/search/last_key  time:   [2.2543 µs 2.2549 µs 2.2555 µs]                                     
                        change: [+42.548% +42.761% +42.974%] (p = 0.00 < 0.05)
                        Performance has regressed.
skewed/minimum          time:   [901.53 ns 901.70 ns 901.92 ns]                            
                        change: [+0.1286% +0.1762% +0.2229%] (p = 0.00 < 0.05)
                        Change within noise threshold.
skewed/maximum          time:   [7.3368 ns 7.3502 ns 7.3632 ns]                             
                        change: [-6.4921% -6.3671% -6.2193%] (p = 0.00 < 0.05)
                        Performance has improved.

fixed_length/search/first_key                                                                              
                        time:   [85.108 ns 85.174 ns 85.248 ns]
                        change: [+32.576% +32.684% +32.800%] (p = 0.00 < 0.05)
                        Performance has regressed.
fixed_length/search/middle_key                                                                              
                        time:   [84.978 ns 84.995 ns 85.017 ns]
                        change: [+23.260% +23.433% +23.600%] (p = 0.00 < 0.05)
                        Performance has regressed.
fixed_length/search/last_key                                                                              
                        time:   [87.592 ns 87.658 ns 87.729 ns]
                        change: [+26.340% +26.450% +26.549%] (p = 0.00 < 0.05)
                        Performance has regressed.
fixed_length/minimum    time:   [24.959 ns 25.004 ns 25.052 ns]                                   
                        change: [-7.5806% -7.4452% -7.3023%] (p = 0.00 < 0.05)
                        Performance has improved.
fixed_length/maximum    time:   [24.572 ns 24.586 ns 24.601 ns]                                   
                        change: [-10.081% -10.012% -9.9429%] (p = 0.00 < 0.05)
                        Performance has improved.

large_prefixes/search/first_key                                                                              
                        time:   [89.627 ns 89.644 ns 89.667 ns]
                        change: [+29.362% +29.514% +29.665%] (p = 0.00 < 0.05)
                        Performance has regressed.
large_prefixes/search/middle_key                                                                              
                        time:   [89.319 ns 89.342 ns 89.377 ns]
                        change: [+27.225% +27.362% +27.492%] (p = 0.00 < 0.05)
                        Performance has regressed.
large_prefixes/search/last_key                                                                              
                        time:   [93.617 ns 93.786 ns 93.966 ns]
                        change: [+29.486% +29.680% +29.881%] (p = 0.00 < 0.05)
                        Performance has regressed.
large_prefixes/minimum  time:   [24.519 ns 24.535 ns 24.553 ns]                                     
                        change: [-11.876% -11.768% -11.663%] (p = 0.00 < 0.05)
                        Performance has improved.
large_prefixes/maximum  time:   [24.548 ns 24.563 ns 24.581 ns]                                     
                        change: [-13.087% -12.959% -12.832%] (p = 0.00 < 0.05)
                        Performance has improved.

from blart.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.