Giter Site home page Giter Site logo

Comments (4)

oxinabox avatar oxinabox commented on June 16, 2024 2

@willtebbutt has a prototype for not doing this.
Not sure if he has run timing.
But it is extremely fast, and he is dealing with data that is explictly 1 array containing elements that mix boundedness.

from intervals.jl.

rofinn avatar rofinn commented on June 16, 2024 2

Okay, so my understanding as it relates to JuliaMath/IntervalSets.jl#67 is that we probably want:

  1. A shared AbstractInterval type in IntervalSets.jl which isn't parameterized by the bounds, which would make it easier for Intervals.jl to extend.
  2. A required API for grabbing bounds which folks can use as a fallback for writing generic algorithms.
  3. Both parameterized and non-parameterized concrete interval types depending on the application.
  4. Update LibPQ.jl to use the non-parameterized type.

from intervals.jl.

willtebbutt avatar willtebbutt commented on June 16, 2024 1

Yup. It's on a WIP MR to an internal package but there's not much code, so here's a snippet for reference:

struct DynamicInterval{T}
    infimum::T
    supremum::T
    lhs_is_open::Bool
    rhs_is_open::Bool
end

function Base.issubset(x::DynamicInterval, y::DynamicInterval)
    lower_is_in = if x.lhs_is_open
        x.infimum >= y.infimum
    else
        y.lhs_is_open ? (x.infimum > y.infimum) : (x.infimum >= y.infimum)
    end
    upper_is_in = if x.rhs_is_open
        x.supremum <= y.supremum
    else
        y.rhs_is_open ? (x.supremum < y.supremum) : (x.supremum <= y.supremum)
    end
    return lower_is_in && upper_is_in
end

Performance of construction of arrays of intervals with mixed open-ness / closed-ness and issubset is pretty good. For example:

using BenchmarkTools

_opens = [rand([true, false]) for _ in 1:1_000_000]
_closeds = [rand([true, false]) for _ in 1:1_000_000]
@benchmark [DynamicInterval(0.0, 1.0, o, c) for(o, c) in zip($_opens, $_closeds)]
BenchmarkTools.Trial: 794 samples with 1 evaluation.
 Range (min  max):  3.330 ms  14.258 ms  ┊ GC (min  max): 0.00%   6.20%
 Time  (median):     3.416 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.290 ms ±  4.255 ms  ┊ GC (mean ± σ):  7.31% ± 14.75%

  █              ▄                                 ▁▃▂     ▄  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄███▅▆▄▁▄█ ▇
  3.33 ms      Histogram: log(frequency) by time     14.1 ms <

 Memory estimate: 22.89 MiB, allocs estimate: 2.

xs = [DynamicInterval(0.0, 1.0, rand([true, false]), rand([true, false])) for _ in 1:1_000_000]
x = DynamicInterval(0.0, 1.0, false, false)

@benchmark map(Base.Fix1(issubset, $x), $xs)
BenchmarkTools.Trial: 2046 samples with 1 evaluation.
 Range (min  max):  2.130 ms    3.905 ms  ┊ GC (min  max): 0.00%  13.12%
 Time  (median):     2.389 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.436 ms ± 203.115 μs  ┊ GC (mean ± σ):  0.49% ±  2.54%

       ▁▂▅▆▅█▇▆▅▃▄▂▁                                           
  ▂▂▃▄▅█████████████▇▇▅▅▅▄▃▃▄▂▃▂▃▂▁▂▂▂▂▂▂▂▃▂▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▁▂ ▄
  2.13 ms         Histogram: frequency by time        3.36 ms <

 Memory estimate: 976.67 KiB, allocs estimate: 2.

Constrast this with:

using Intervals

_opens_typed = [rand([Open, Closed]) for _ in 1:1_000_000]
_closed_typed = [rand([Open, Closed]) for _ in 1:1_000_000]
@benchmark [Interval{o, c}(0.0, 1.0) for (o, c) in zip($_opens_typed, $_closed_typed)]
BenchmarkTools.Trial: 3 samples with 1 evaluation.
 Range (min  max):  1.727 s     2.053 s  ┊ GC (min  max): 13.63%  22.27%
 Time  (median):     1.938 s               ┊ GC (median):    18.28%
 Time  (mean ± σ):   1.906 s ± 165.230 ms  ┊ GC (mean ± σ):  18.31% ±  4.32%

  █                                    █                   █  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.73 s         Histogram: frequency by time         2.05 s <

 Memory estimate: 640.87 MiB, allocs estimate: 12000018.

xs_typed = [Interval{rand([Open, Closed]), rand([Open, Closed])}(0.0, 1.0) for _ in 1:1_000_000]
x_typed = Interval{Closed, Closed}(0.0, 1.0)

@benchmark map(Base.Fix1(issubset, $x_typed), $xs_typed)
BenchmarkTools.Trial: 85 samples with 1 evaluation.
 Range (min  max):  52.160 ms  72.713 ms  ┊ GC (min  max):  0.00%  0.00%
 Time  (median):     59.560 ms              ┊ GC (median):    10.26%
 Time  (mean ± σ):   59.213 ms ±  4.144 ms  ┊ GC (mean ± σ):   6.27% ± 4.96%

       █           ▂▂  ▂  ▂▂  ▂  ▂                             
  ▄▄█▆▄█▄▄█▄▆▆▄▄▁▆▄██▁▆█████▆███▄█▆▆▄▁▁▄▁▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
  52.2 ms         Histogram: frequency by time        72.6 ms <

 Memory estimate: 31.47 MiB, allocs estimate: 1000002.

As you would hope, if you're working with a vector of intervals with fixed open/closed-ness, then the performance is better than the dynamic intervals (presumably just because it's able to eliminate a bunch of branching at compile time), but only by a factor of 3:

vs = [rand() for _ in 1:1_000_000]
@benchmark [Interval{Closed, Closed}(v - 0.5, 5.0) for v in $vs]
BenchmarkTools.Trial: 2499 samples with 1 evaluation.
 Range (min  max):  1.762 ms    9.897 ms  ┊ GC (min  max): 0.00%   9.05%
 Time  (median):     1.834 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.995 ms ± 505.555 μs  ┊ GC (mean ± σ):  7.18% ± 11.85%

    ▂█▆                                                        
  ▂▅████▅▃▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▃▄▄▄▃▃▂ ▃
  1.76 ms         Histogram: frequency by time        2.76 ms <

 Memory estimate: 15.26 MiB, allocs estimate: 2.

# change end-points so that the comparison isn't entirely vaccuous
xs_typed_fixed = [Interval{Closed, Closed}(rand() - 0.5, 5.0) for _ in 1:1_000_000]
@benchmark map(Base.Fix1(issubset, $x_typed), $xs_typed_fixed)
BenchmarkTools.Trial: 6879 samples with 1 evaluation.
 Range (min  max):  685.808 μs    2.068 ms  ┊ GC (min  max): 0.00%  40.18%
 Time  (median):     702.151 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   721.751 μs ± 116.212 μs  ┊ GC (mean ± σ):  1.76% ±  6.55%

  ██▅▃▁                                                         ▁
  █████▇▇▅▄▃▁▅▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▃▇▇▅▃▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▆█ █
  686 μs        Histogram: log(frequency) by time       1.53 ms <

 Memory estimate: 976.67 KiB, allocs estimate: 2.

These kind of performance profiles suggest that you probably want dynamic intervals if you either have mixed open / closed-ness, and presumably it's going to be useful in cases where you don't know whether you have constant open / closed-ness and therefore can't just make a collection of things up front. From the above, I would characterise the gains to be had my knowing the open / closed-ness up front as "modest", but definitely something you want to have if you can.

from intervals.jl.

rofinn avatar rofinn commented on June 16, 2024

FWIW, we do dispatch on those parameters in a few places, but rewriting that logic with an explicit condition or by passing those as arguments to subfunction could be a reasonable solution.

https://github.com/invenia/Intervals.jl/blob/master/src/interval.jl#L190

This would also make it easier to break up Intervals.jl into a set of IntervalSet.jl extensions as discussed here.

JuliaMath/IntervalSets.jl#67

from intervals.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.