Postgres has a tzrange type, which LibPQ.jl returns a

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Okay, so my understanding as it relates to <a class="issue-link js-issue-link" data-er

Storing open-ness as a type parameter makes Postgres parsing type unstable about intervals.jl HOT 4 OPEN

oxinabox commented on June 16, 2024 3

Storing open-ness as a type parameter makes Postgres parsing type unstable

from intervals.jl.

Comments (4)

oxinabox commented on June 16, 2024 2

@willtebbutt has a prototype for not doing this.
Not sure if he has run timing.
But it is extremely fast, and he is dealing with data that is explictly 1 array containing elements that mix boundedness.

from intervals.jl.

rofinn commented on June 16, 2024 2

Okay, so my understanding as it relates to JuliaMath/IntervalSets.jl#67 is that we probably want:

A shared AbstractInterval type in IntervalSets.jl which isn't parameterized by the bounds, which would make it easier for Intervals.jl to extend.
A required API for grabbing bounds which folks can use as a fallback for writing generic algorithms.
Both parameterized and non-parameterized concrete interval types depending on the application.
Update LibPQ.jl to use the non-parameterized type.

from intervals.jl.

willtebbutt commented on June 16, 2024 1

Yup. It's on a WIP MR to an internal package but there's not much code, so here's a snippet for reference:

struct DynamicInterval{T}
    infimum::T
    supremum::T
    lhs_is_open::Bool
    rhs_is_open::Bool
end

function Base.issubset(x::DynamicInterval, y::DynamicInterval)
    lower_is_in = if x.lhs_is_open
        x.infimum >= y.infimum
    else
        y.lhs_is_open ? (x.infimum > y.infimum) : (x.infimum >= y.infimum)
    end
    upper_is_in = if x.rhs_is_open
        x.supremum <= y.supremum
    else
        y.rhs_is_open ? (x.supremum < y.supremum) : (x.supremum <= y.supremum)
    end
    return lower_is_in && upper_is_in
end

Performance of construction of arrays of intervals with mixed open-ness / closed-ness and issubset is pretty good. For example:

using BenchmarkTools

_opens = [rand([true, false]) for _ in 1:1_000_000]
_closeds = [rand([true, false]) for _ in 1:1_000_000]
@benchmark [DynamicInterval(0.0, 1.0, o, c) for(o, c) in zip($_opens, $_closeds)]
BenchmarkTools.Trial: 794 samples with 1 evaluation.
 Range (min … max):  3.330 ms … 14.258 ms  ┊ GC (min … max): 0.00% …  6.20%
 Time  (median):     3.416 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.290 ms ±  4.255 ms  ┊ GC (mean ± σ):  7.31% ± 14.75%

  █              ▄                                 ▁▃▂     ▄  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄███▅▆▄▁▄█ ▇
  3.33 ms      Histogram: log(frequency) by time     14.1 ms <

 Memory estimate: 22.89 MiB, allocs estimate: 2.

xs = [DynamicInterval(0.0, 1.0, rand([true, false]), rand([true, false])) for _ in 1:1_000_000]
x = DynamicInterval(0.0, 1.0, false, false)

@benchmark map(Base.Fix1(issubset, $x), $xs)
BenchmarkTools.Trial: 2046 samples with 1 evaluation.
 Range (min … max):  2.130 ms …   3.905 ms  ┊ GC (min … max): 0.00% … 13.12%
 Time  (median):     2.389 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.436 ms ± 203.115 μs  ┊ GC (mean ± σ):  0.49% ±  2.54%

       ▁▂▅▆▅█▇▆▅▃▄▂▁                                           
  ▂▂▃▄▅█████████████▇▇▅▅▅▄▃▃▄▂▃▂▃▂▁▂▂▂▂▂▂▂▃▂▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▁▂ ▄
  2.13 ms         Histogram: frequency by time        3.36 ms <

 Memory estimate: 976.67 KiB, allocs estimate: 2.

Constrast this with:

using Intervals

_opens_typed = [rand([Open, Closed]) for _ in 1:1_000_000]
_closed_typed = [rand([Open, Closed]) for _ in 1:1_000_000]
@benchmark [Interval{o, c}(0.0, 1.0) for (o, c) in zip($_opens_typed, $_closed_typed)]
BenchmarkTools.Trial: 3 samples with 1 evaluation.
 Range (min … max):  1.727 s …    2.053 s  ┊ GC (min … max): 13.63% … 22.27%
 Time  (median):     1.938 s               ┊ GC (median):    18.28%
 Time  (mean ± σ):   1.906 s ± 165.230 ms  ┊ GC (mean ± σ):  18.31% ±  4.32%

  █                                    █                   █  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.73 s         Histogram: frequency by time         2.05 s <

 Memory estimate: 640.87 MiB, allocs estimate: 12000018.

xs_typed = [Interval{rand([Open, Closed]), rand([Open, Closed])}(0.0, 1.0) for _ in 1:1_000_000]
x_typed = Interval{Closed, Closed}(0.0, 1.0)

@benchmark map(Base.Fix1(issubset, $x_typed), $xs_typed)
BenchmarkTools.Trial: 85 samples with 1 evaluation.
 Range (min … max):  52.160 ms … 72.713 ms  ┊ GC (min … max):  0.00% … 0.00%
 Time  (median):     59.560 ms              ┊ GC (median):    10.26%
 Time  (mean ± σ):   59.213 ms ±  4.144 ms  ┊ GC (mean ± σ):   6.27% ± 4.96%

       █           ▂▂  ▂  ▂▂  ▂  ▂                             
  ▄▄█▆▄█▄▄█▄▆▆▄▄▁▆▄██▁▆█████▆███▄█▆▆▄▁▁▄▁▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
  52.2 ms         Histogram: frequency by time        72.6 ms <

 Memory estimate: 31.47 MiB, allocs estimate: 1000002.

As you would hope, if you're working with a vector of intervals with fixed open/closed-ness, then the performance is better than the dynamic intervals (presumably just because it's able to eliminate a bunch of branching at compile time), but only by a factor of 3:

vs = [rand() for _ in 1:1_000_000]
@benchmark [Interval{Closed, Closed}(v - 0.5, 5.0) for v in $vs]
BenchmarkTools.Trial: 2499 samples with 1 evaluation.
 Range (min … max):  1.762 ms …   9.897 ms  ┊ GC (min … max): 0.00% …  9.05%
 Time  (median):     1.834 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.995 ms ± 505.555 μs  ┊ GC (mean ± σ):  7.18% ± 11.85%

    ▂█▆                                                        
  ▂▅████▅▃▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▃▄▄▄▃▃▂ ▃
  1.76 ms         Histogram: frequency by time        2.76 ms <

 Memory estimate: 15.26 MiB, allocs estimate: 2.

# change end-points so that the comparison isn't entirely vaccuous
xs_typed_fixed = [Interval{Closed, Closed}(rand() - 0.5, 5.0) for _ in 1:1_000_000]
@benchmark map(Base.Fix1(issubset, $x_typed), $xs_typed_fixed)
BenchmarkTools.Trial: 6879 samples with 1 evaluation.
 Range (min … max):  685.808 μs …   2.068 ms  ┊ GC (min … max): 0.00% … 40.18%
 Time  (median):     702.151 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   721.751 μs ± 116.212 μs  ┊ GC (mean ± σ):  1.76% ±  6.55%

  ██▅▃▁                                                         ▁
  █████▇▇▅▄▃▁▅▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▃▇▇▅▃▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▆█ █
  686 μs        Histogram: log(frequency) by time       1.53 ms <

 Memory estimate: 976.67 KiB, allocs estimate: 2.

These kind of performance profiles suggest that you probably want dynamic intervals if you either have mixed open / closed-ness, and presumably it's going to be useful in cases where you don't know whether you have constant open / closed-ness and therefore can't just make a collection of things up front. From the above, I would characterise the gains to be had my knowing the open / closed-ness up front as "modest", but definitely something you want to have if you can.

from intervals.jl.

rofinn commented on June 16, 2024

FWIW, we do dispatch on those parameters in a few places, but rewriting that logic with an explicit condition or by passing those as arguments to subfunction could be a reasonable solution.

https://github.com/invenia/Intervals.jl/blob/master/src/interval.jl#L190

This would also make it easier to break up Intervals.jl into a set of IntervalSet.jl extensions as discussed here.

JuliaMath/IntervalSets.jl#67

from intervals.jl.

Storing open-ness as a type parameter makes Postgres parsing type unstable about intervals.jl HOT 4 OPEN

Comments (4)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent