jszymon / pacal Goto Github PK

View Code? Open in Web Editor NEW

22.0 22.0 7.0 1.05 MB

PaCAL - ProbAbilistic CALculator

License: GNU General Public License v3.0

Shell 0.02% Python 81.61% Jupyter Notebook 18.37%

pacal's People

Contributors

Stargazers

Watchers

Forkers

dkasak rossjjennings ligon mkorzen1729 sczesla yuntianyeaws naveensaigit ricardov94

pacal's Issues

Making a Fourier Transfers Program.

Hello, I am making a python program that can write a data file that represents a narrow slit with a width of 1 mm, representing a one dimensional aperture screen with a length of 81.92 mm, with a step size (or pixel size) of 20 μm. The total number of data points along the screen should be 4,096. Thus, your aperture should represent 50 pixels.

I wrote the program, using one of Python’s built-in FFT functions to apply a fast Fourier transform (FFT) to my
data and display a plot of the Fourier transformed data. However, I have issues with the FFT functions here. Can anyone give me a tip?

Thank you.

import numpy as np
#from scipy.fftpack import ftt, ifft
#import np.ftt as ftt
from numpy.fft import fft, ifft
pi=np.pi; sin=np.sin; cos=np.cos; fft=np.fft

import matplotlib.pyplot as plt

lam = 1.0e-6
wavenumber=2.0*pi/lam

z0=100.0
nseg=1
zseg_start=np.zeros(nseg)
zseg_end=np.zeros(nseg)
zseg_start[0]=-0.5e-3
zseg_end[0]=0.5e-3

nobserve = 4096
step = 20e-6
observe_lower = -40.96e-3
observe_upper = 40.96e-3

zx = np.zeros(nobserve)
zint = np.zeros(nobserve)

for i in range(nobserve):

Ramp = 0.0
Iamp = 0.0

x0 = observe_lower + i*step

for iseg in range(nseg):
    
    x = zseg_start[iseg] + i/step
    relz = z0
    
    dist = np.sqrt(x0*x0+z0*z0)
    Ramp=Ramp + (1.0/dist)*cos(wavenumber*dist)
    Iamp=Iamp + (1.0/dist)*sin(wavenumber*dist)

Intensity=Ramp*Ramp + Iamp*Iamp
zx[i] = x0
zint[i] = Intensity
print(" %10i %10i %10.6f %10.5f" % (nobserve,i,Intensity,zx[i]))

print(zx)
fx=fft.fftfreq(zx)

fi = fft.fft(zint)
#fi=np.ftt.ftt(zint)
#fx=np.ftt.fftfreq(zx)

plt.figure(figsize = (12, 6))
plt.subplot(121)

plt.plot(zx,zint,'b')
plt.xlabel('x')
plt.ylabel('Intensity')

plt.subplot(122)

plt.plot(fx,fi,'o')
plt.xlabel('freq x')
plt.ylabel('FTT Intensity')

Requirements not specified correctly

pacal's installation doesn't resolve its own dependencies, and complains that numpy is missing. Should they be in install_requires rather than requires, and are they formatted correctly?

Improving `Distr.hist()`

Hello,

I have several suggestions for improving the Distr.hist() method. I would have done this as a pull request, but I didn't know which branch to pick (python3 or master).

In distr.py, line 359: I don't understand why we divide by dw? Wouldn't we obtain the bin frequencies simply as counts * (float(n)/float(allDrawn)) / n ?

pacal/pacal/distr.py

Line 359 in b38333c

w = (float(n)/float(allDrawn)) / n / dw
I strongly suggest adding axis labels to the plot. A title would also be welcome. Here's some code to produce a visual example:

import pacal
pacal.BinomialDistr(50, 0.6).hist()
ax = plt.gca()
ax.set_title("Histogram")
ax.set_xlabel("Value")
ax.set_ylabel("Bin frequency")
pacal.show()

Of course, the label of the y-axis depends on the resolution of point 1.

Messages about missing compiled routines should go to stderr

DiracSegment & Segment with 0 probability ... can we skip them?

hello,

I notice that sometimes a PiecewiseFunction distribution has DiracSegment and Segment with 0 probability.

I was wondering if these could not be skipped entirely (ie not added).

For instance in PiecewiseFunction.add_diracs, could we have

    def add_diracs(self, other):
        """Pointwise sum of discrete part of two piecewise functions"""
        for other_dirac in other.getDiracs():
            self_dirac = self.getDirac(other_dirac.a)
            if self_dirac is None:
                if other_dirac.f > 0: # this test condition is added
                    self.addSegment(other_dirac)
            else:
                self_dirac.f += other_dirac.f

AssertionError: [MInfSegment, [-inf, 0.05648840000000001]]==MInfSegment, [-inf, 0.05648840000000001]==Segment, [0.05648840000000001, -0.08108632]==-inf==0.0564884000000000

I try to use pacal to create pdf of summation of three random variables. Then I use Nelder-Mead to minimize the negative loglikelihood. However, after several iterations, I came across this error message raised by segments.py in pacal package.

param: [2. 0. 0.2]
processed in 7.367134094238281e-05 seconds ---
5289.1225545465795
param: [2.1 0. 0.2]
processed in 9.1552734375e-05 seconds ---
5413.69569454499
param: [2.0e+00 2.5e-04 2.0e-01]
processed in 0.0001373291015625 seconds ---
5289.2735727455265
param: [2. 0. 0.21]
processed in 7.677078247070312e-05 seconds ---
5291.688762705109
param: [1.90000000e+00 1.66666667e-04 2.06666667e-01]
processed in 0.00013709068298339844 seconds ---
5162.96097874654
param: [1.8e+00 2.5e-04 2.1e-01]
processed in 0.0030710697174072266 seconds ---
5032.661772965377
param: [1.86666667e+00 3.33333333e-04 1.96666667e-01]
processed in 9.465217590332031e-05 seconds ---
5116.824460425303
param: [1.77777778e+00 1.38888889e-04 2.04444444e-01]
processed in 0.00011873245239257812 seconds ---
5001.204618755317
param: [1.66666667e+00 8.33333333e-05 2.06666667e-01]
processed in 0.00010085105895996094 seconds ---
4851.356632260949
param: [1.55555556e+00 4.44444444e-04 2.08888889e-01]
processed in 8.20159912109375e-05 seconds ---
4698.843942739831
param: [1.33333333e+00 6.66666667e-04 2.13333333e-01]
processed in 0.00016498565673828125 seconds ---
4389.884695448772
param: [1.33333333e+00 3.33333333e-04 2.23333333e-01]
processed in 9.131431579589844e-05 seconds ---
4394.404234423141
param: [1.08888889e+00 4.72222222e-04 2.18888889e-01]
processed in 7.963180541992188e-05 seconds ---
4069.8662612067
param: [7.33333333e-01 5.83333333e-04 2.23333333e-01]
processed in 0.00010180473327636719 seconds ---
3861.2544906000508
param: [0.6 0.00097222 0.23333333]
processed in 8.392333984375e-05 seconds ---
4055.4810152942673
param: [0.44444444 0.00114815 0.22333333]
processed in 0.0001323223114013672 seconds ---
4867.380442059709
param: [1.11111111e+00 5.37037037e-04 2.23333333e-01]
processed in 8.893013000488281e-05 seconds ---
4099.60266263902
param: [0.2962963 0.0007284 0.24 ]
processed in 9.965896606445312e-05 seconds ---
6952.753469381866
param: [1.07407407e+00 6.82098765e-04 2.20000000e-01]
processed in 9.369850158691406e-05 seconds ---
4053.412098162286
param: [0.49382716 0.00095473 0.22777778]
processed in 9.465217590332031e-05 seconds ---
4496.614564486516
param: [9.56790123e-01 6.41460905e-04 2.24444444e-01]
processed in 0.0001659393310546875 seconds ---
3933.7490338058774
param: [1.24279835e+00 2.99039781e-04 2.11851852e-01]
processed in 0.00011301040649414062 seconds ---
4264.013825788384
param: [0.76069959 0.00080393 0.22796296]
processed in 0.00011706352233886719 seconds ---
3854.024649773798
param: [0.55980796 0.00067038 0.23049383]
processed in 8.177757263183594e-05 seconds ---
4179.071532320201
param: [9.45507545e-01 6.79169524e-04 2.22623457e-01]
processed in 7.414817810058594e-05 seconds ---
3922.888324502457
param: [0.66957019 0.00073616 0.22483539]
processed in 9.107589721679688e-05 seconds ---
3920.1594117114482
param: [0.49689453 0.00073644 0.228131 ]
processed in 8.392333984375e-05 seconds ---
4476.307250583266
param: [8.33354291e-01 6.93487877e-04 2.24000343e-01]
processed in 8.225440979003906e-05 seconds ---
3855.589457865707
param: [8.82021287e-01 6.51006473e-04 2.25362369e-01]
processed in 0.00012493133544921875 seconds ---
3878.5581421462252
param: [8.28908512e-01 6.72294540e-04 2.25230624e-01]
processed in 8.392333984375e-05 seconds ---
3854.914557579754
param: [8.81974928e-01 8.63139352e-04 2.28129287e-01]
processed in 0.00015616416931152344 seconds ---
3880.874955501708
param: [7.70493732e-01 6.53284838e-04 2.24532322e-01]
processed in 0.0001010894775390625 seconds ---
3849.8283418083765
param: [7.40046931e-01 7.26182784e-04 2.27816930e-01]
processed in 8.344650268554688e-05 seconds ---
3860.5974550157202
param: [8.10027451e-01 7.01661603e-04 2.24954490e-01]
processed in 7.557868957519531e-05 seconds ---
3850.5827846618563
param: [0.73190533 0.00076695 0.22640256]
processed in 0.00021266937255859375 seconds ---
3864.0475105394494
param: [8.04657718e-01 6.95959446e-04 2.25523608e-01]
processed in 0.0002453327178955078 seconds ---
3850.1306397142275
param: [8.29419679e-01 5.63343980e-04 2.22043983e-01]
processed in 8.988380432128906e-05 seconds ---
3852.830156121806
param: [8.12239656e-01 6.23489638e-04 2.23523728e-01]
processed in 7.796287536621094e-05 seconds ---
3849.8769339170635
param: [7.81566620e-01 6.13494344e-04 2.24098615e-01]
processed in 9.965896606445312e-05 seconds ---
3848.289064454587
param: [7.67336205e-01 5.69410715e-04 2.23670678e-01]
processed in 7.510185241699219e-05 seconds ---
3849.618741910505
param: [7.71542287e-01 5.64219768e-04 2.22579502e-01]
processed in 9.179115295410156e-05 seconds ---
3848.3443620554585
param: [7.36828770e-01 5.97176329e-04 2.23949898e-01]
processed in 0.0001723766326904297 seconds ---
3859.835450980743
param: [7.93386935e-01 6.16911311e-04 2.23630271e-01]
processed in 0.00013828277587890625 seconds ---
3847.90997596603
param: [7.93836829e-01 5.43132110e-04 2.22339937e-01]
processed in 0.00010633468627929688 seconds ---
3846.965369220417
param: [8.05508378e-01 4.88055746e-04 2.21243745e-01]
processed in 7.462501525878906e-05 seconds ---
3847.1230922372747
param: [8.07651302e-01 6.18138742e-04 2.24133046e-01]
processed in 8.988380432128906e-05 seconds ---
3849.4799136312854
param: [7.80569541e-01 5.77699511e-04 2.22967888e-01]
processed in 0.0001895427703857422 seconds ---
3847.624068337901
param: [7.96962250e-01 5.45000944e-04 2.21860115e-01]
processed in 0.00012063980102539062 seconds ---
3846.8595405953224
param: [8.04660065e-01 5.10754244e-04 2.20740865e-01]
processed in 7.963180541992188e-05 seconds ---
3846.8092937871143
param: [7.92657356e-01 4.70812600e-04 2.20402190e-01]
processed in 0.00019478797912597656 seconds ---
3845.616883105677
param: [7.92292566e-01 3.97763244e-04 2.18788150e-01]
processed in 0.00010895729064941406 seconds ---
3844.477681428711
param: [8.13290099e-01 3.90066887e-04 2.18278080e-01]
processed in 9.059906005859375e-05 seconds ---
3846.2981525343926
param: [8.12991657e-01 3.22590806e-04 2.16198126e-01]
processed in 0.00011730194091796875 seconds ---
3844.841948197754
param: [8.07722817e-01 2.29526382e-04 2.14768705e-01]
processed in 7.653236389160156e-05 seconds ---
3842.915472472104
param: [8.09254193e-01 8.89124506e-05 2.11782625e-01]
processed in 7.677078247070312e-05 seconds ---
3840.999820460356
param: [7.96402178e-01 1.49444113e-04 2.12901187e-01]
processed in 7.796287536621094e-05 seconds ---
3840.5714275841674
param: [7.87958218e-01 2.91327265e-05 2.10212741e-01]
processed in 8.0108642578125e-05 seconds ---
3838.60465661308
param: [7.80011660e-01 2.12814745e-05 2.10990884e-01]
processed in 7.748603820800781e-05 seconds ---
3839.422861748212
param: [ 7.92523481e-01 -3.04878810e-04 2.03202684e-01]
processed in 8.559226989746094e-05 seconds ---
3833.724530052107
param: [ 7.92638939e-01 -6.56199837e-04 1.95409951e-01]
processed in 9.489059448242188e-05 seconds ---
3828.458931342521
param: [ 7.64485019e-01 -4.92769541e-04 1.99293092e-01]
processed in 0.00016760826110839844 seconds ---
3834.26295517022
param: [ 7.83376457e-01 -7.67839242e-04 1.92286305e-01]
processed in 0.00013875961303710938 seconds ---
3826.8256800975155
param: [ 0.78505885 -0.0011624 0.18293402]
processed in 7.748603820800781e-05 seconds ---
3820.6920915476903
param: [ 0.77349699 -0.00157005 0.17487863]
processed in 9.608268737792969e-05 seconds ---
3816.8808179592206
param: [ 0.76626638 -0.00236963 0.15721158]
processed in 0.0001201629638671875 seconds ---
3807.7752830896975
param: [ 0.79815776 -0.00229939 0.15774394]
processed in 7.915496826171875e-05 seconds ---
3804.5435945323006
param: [ 0.81499413 -0.00320269 0.13696936]
processed in 0.00013756752014160156 seconds ---
3793.957732312433
param: [ 0.7849073 -0.00383362 0.12266668]
processed in 0.00015473365783691406 seconds ---
3785.3001119282135
param: [ 0.78104149 -0.00542233 0.08629505]
processed in 0.0001010894775390625 seconds ---
3768.599582643904
param: [ 0.78980914 -0.00616737 0.07071664]
processed in 7.772445678710938e-05 seconds ---
3761.2207949550857
param: [ 0.79218429 -0.00866986 0.01460795]
processed in 0.0001475811004638672 seconds ---
3744.789138756524
param: [ 0.82588023 -0.00916029 0.00137 ]
processed in 7.534027099609375e-05 seconds ---
3746.9002257836974
param: [ 0.78440987 -0.01229896 -0.06878736]
AssertionError Traceback (most recent call last)
in ()
----> 1 res2 = minimize(loglikelihood, [2.0,0,0.2], method='Nelder-Mead', tol=1e-6)

/usr/local/lib/python3.4/dist-packages/scipy/optimize/_minimize.py in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
473 callback=callback, **options)
474 elif meth == 'nelder-mead':
--> 475 return _minimize_neldermead(fun, x0, args, callback, **options)
476 elif meth == 'powell':
477 return _minimize_powell(fun, x0, args, callback, **options)

/usr/local/lib/python3.4/dist-packages/scipy/optimize/optimize.py in _minimize_neldermead(func, x0, args, callback, maxiter, maxfev, disp, return_all, initial_simplex, xatol, fatol, **unknown_options)
546 xbar = numpy.add.reduce(sim[:-1], 0) / N
547 xr = (1 + rho) * xbar - rho * sim[-1]
--> 548 fxr = func(xr)
549 doshrink = 0
550

/usr/local/lib/python3.4/dist-packages/scipy/optimize/optimize.py in function_wrapper(*wrapper_args)
290 def function_wrapper(wrapper_args):
291 ncalls[0] += 1
--> 292 return function((wrapper_args + args))
293
294 return ncalls, function_wrapper

in loglikelihood(param)
28 for i in range(0, len(data)):
29 start_time = time.time()
---> 30 LL = LL - np.log(findYpdf(data.ix[i,'TC'], data.ix[i,'tsigt']))
31 #print("--- data",i,"is processed in %s seconds ---" % (time.time() - start_time))
32 print("processed in %s seconds ---" % (time.time() - start_time))

in findYpdf(x, tsigt)
20 def findYpdf(x, tsigt):
21 index = bisect(grid, tsigt)
---> 22 pdf_l = Y[index-1].pdf(x)
23 pdf_u = Y[index].pdf(x)
24 pdf_final = pdf_l + (pdf_u - pdf_l)*(tsigt - grid[index-1])/(grid[index] - grid[index-1]) #interpolate

/usr/local/lib/python3.4/dist-packages/pacal/distr.py in pdf(self, x)
124
125 def pdf(self,x):
--> 126 return self.get_piecewise_pdf()(x)
127 def cdf(self,x):
128 """Cumulative piecewise function."""

/usr/local/lib/python3.4/dist-packages/pacal/distr.py in get_piecewise_pdf(self)
78 """return PDF function as a PiecewiseDistribution object"""
79 if self.piecewise_pdf is None:
---> 80 self.init_piecewise_pdf()
81 return self.piecewise_pdf
82 def get_piecewise_cdf(self):

/usr/local/lib/python3.4/dist-packages/pacal/distr.py in init_piecewise_pdf(self)
1126 return r1 + r2
1127 def init_piecewise_pdf(self):
-> 1128 self.piecewise_pdf = conv(self.d1.get_piecewise_pdf(), self.d2.get_piecewise_pdf())
1129 class SubDistr(SubRV, OpDistr):
1130 """Difference of distributions."""

/usr/local/lib/python3.4/dist-packages/pacal/distr.py in init_piecewise_pdf(self)
584 return self.f(self.d.rand(n, cache))
585 def init_piecewise_pdf(self):
--> 586 self.piecewise_pdf = self.d.get_piecewise_pdf().copyComposition(self.f, self.f_inv, self.f_inv_deriv, pole_at_zero = self.pole_at_zero)
587
588 class ShiftedScaledDistr(ShiftedScaledRV, OpDistr):

/usr/local/lib/python3.4/dist-packages/pacal/standard_distr.py in init_piecewise_pdf(self)
154 wrapped_pdf = wrap_pdf(self.pdf)
155 self.piecewise_pdf.addSegment(MInfSegment(b1, wrapped_pdf))
--> 156 self.piecewise_pdf.addSegment(Segment(b1, b2, wrapped_pdf))
157 self.piecewise_pdf.addSegment(PInfSegment(b2, wrapped_pdf))
158 def pdf(self, x):

/usr/local/lib/python3.4/dist-packages/pacal/segments.py in addSegment(self, segment)
1016 if segment.isDirac():
1017 segment = DiracSegment(seg.a, segment.f)
-> 1018 assert seg>segment or segment>seg, "{}=={}=={}=={}=={}".format(self.segments, seg, segment, seg.a, segment.a)
1019 bisect.insort(self.segments, segment)
1020 self.breaks = unique(append(self.breaks, [segment.a, segment.b]))

AssertionError: [MInfSegment, [-inf, 0.05648840406515476]]==MInfSegment, [-inf, 0.05648840406515476]==Segment, [0.05648840406515476, -0.08108631573253092]==-inf==0.05648840406515476

Assertion error when adding two r.vs (Normal)

Hi,

I just setup Pacal from your repo (running on Python3). Thank you so much for adding in Python3 support.

The following piece of code gives me an assertion error

X1 = pc.NormalDistr(400, 1)
X2 = pc.NormalDistr(400, 1)
X3 = X1 + X2
X3.summary()

The output I get is:

============= summary =============
ASSERT x is None y= 0.5 (MInfSegment, [-inf, 798.0]),(Segment, [798.0, 800.0]),(Segment, [800.0, 802.0]),(PInfSegment, [802.0, inf])
ASSERT x is None vals= [(nan, nan), (nan, nan), (nan, nan), (nan, nan)]
ASSERT x is None y= 0.975 (MInfSegment, [-inf, 798.0]),(Segment, [798.0, 800.0]),(Segment, [800.0, 802.0]),(PInfSegment, [802.0, inf])
ASSERT x is None vals= [(nan, nan), (nan, nan), (nan, nan), (nan, nan)]
ASSERT x is None y= 0.025 (MInfSegment, [-inf, 798.0]),(Segment, [798.0, 800.0]),(Segment, [800.0, 802.0]),(PInfSegment, [802.0, inf])
ASSERT x is None vals= [(nan, nan), (nan, nan), (nan, nan), (nan, nan)]
ASSERT x is None y= 0.5 (MInfSegment, [-inf, 798.0]),(Segment, [798.0, 800.0]),(Segment, [800.0, 802.0]),(PInfSegment, [802.0, inf])
ASSERT x is None vals= [(nan, nan), (nan, nan), (nan, nan), (nan, nan)]
  N(400,1)+N(400,1)
                mean  =  nan
                 var  =  nan
            skewness  =  nan
            kurtosis  =  nan
             entropy  =  nan
              median  =  nan
                mode  =  800.0000192897513
--- medianad
      iqrange(0.025)  =  nan
--- interval(0.95)
               range  =  (-inf, inf)
             tailexp  =  (0, 0)
             int_err  =  nan
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pacal/distr.py", line 267, in summary_map
    r['medianad'] = self.medianad()
  File "/usr/local/lib/python3.5/dist-packages/pacal/distr.py", line 187, in medianad
    return self.get_piecewise_pdf().medianad()
  File "/usr/local/lib/python3.5/dist-packages/pacal/segments.py", line 1887, in medianad
    f1 = self.copyShiftedAndScaled(shift = -median, scale = 1.0)
  File "/usr/local/lib/python3.5/dist-packages/pacal/segments.py", line 1325, in copyShiftedAndScaled
    copyFunction.addSegment(seg.shiftAndScale(shift, scale))
  File "/usr/local/lib/python3.5/dist-packages/pacal/segments.py", line 1018, in addSegment
    assert seg>segment or segment>seg, "{}=={}=={}=={}=={}".format(self.segments, seg, segment, seg.a, segment.a)
AssertionError: [MInfSegment, [-inf, nan]]==MInfSegment, [-inf, nan]==Segment, [nan, nan]==-inf==nan

Is this an issue with Python3 package? I would appreciate any quick fix I can apply. Thanks again for supporting this package!

Inverse CDF (PPF) fails closer to zero (outputs nan)

Hi,

Another issue that I found on Pacal (built from master after your current fix) running on Python3.

Setup: Define a variable dist ~ N(2,1). Use pacal functions to get the cdf, and then the ppf using cdf.invfun.

dist = pacal.NormalDistr(2, 1)
cdf = dist.get_piecewise_cdf()
ppf = cdf.invfun(rangeY=None)

Then, I try to calculate the ppf are various points:

[ppf(x/10) for x in range(11)]

This outputs:

[0,
 nan,
 1.1583787664270961,
 1.4755994872920493,
 1.7466528968641983,
 1.999999999999999,
 2.2533471031358014,
 2.5244005127079476,
 2.841621233572903,
 3.281551558981633,
 9.952639588985395]

I see that it's failing closer to zero. When I try to plot it, I see this:

ppf.plot()

Output:

For reference, this is the expected output I get by using scipy:

from scipy.stats import norm
[norm.ppf(x/10, loc=2,scale=1) for x in range(11)]

[-inf,
 0.7184484344553996,
 1.1583787664270857,
 1.475599487291959,
 1.7466528968642003,
 2.0,
 2.2533471031357997,
 2.5244005127080404,
 2.8416212335729143,
 3.2815515655446004,
 inf]

Let me know if you need any other information! Thanks again for supporting this package in Python3.

An import "numpy.fft.fftpack" is missing

Hello, all. After installing Pacal on a Python 3.7.3, Mac OS 10.11 environment, I was eager to start using it, but I encounter an error when importing.

import pacal as co gives back a ModuleNotFoundError: No module named 'numpy.fft.fftpack'.

Is there any potentially easy fix for this? This seems like one of the only convolution libraries in Python, so I hope this works.

On computing Pr( Y-const1 < X <= Y+const2) for random vars X and Y

Dear authors of Pacal, thank you for such an interesting tool.

I was playing with Pacal trying to compute for two random vars X and Y the probablity P( Y-const1 < X <= Y+const2) and it seems I stuck into some weird behaviour... (edited:) Most likely I'm doing something wrong, but I can't find what is it. Could someone please help?

The context of the use-case under a spoiler

Click here to unroll (you may skip that section entirely, it's just to understand a context of the task)

Imagine a game of bowling. There is a fixed width lane (s), on one side of it there's a pin of width (f) placed according to some distribution D1 and there's a bowling ball of width (h) thrown parallel to the lane boundaries (ball's position is also some random distribution D2):

=upper-lane-boundary============================
 ^                          o <-- bowling pin of width f
 |   ball width h --> O
 |
 lane width S
 |
 v
=lower-lane-boundary============================

The question is what's the probability of ball hitting the pin?

If D1 and D2 are uniform distributions, it's easy to calculate the probability analytically. So knowing the answer beforehand, I started to test Pacal:

from pacal import *

s=20
f=2
h=3

# ball is a coordinate of topmost point of the ball. Lowmost is (ball+h)
ball = UniformDistr(0, s-h)

# pin is a coordinate of topmost point of the pin. Lowmost is (pin+f)
pin = UniformDistr(0, s-f)

# when ball coordinate satisfies `pin-h < ball <= pin+f` inequality, ball hits the pin, so we
# need to find probability of that: P(pin-h < ball <= pin+f) which can be calculated using CDF
# as `ball.cdf(pin+f) - ball.cdf(pin-h)`.
print( ball.cdf(pin+f) - ball.cdf(pin-h) )

# and that line fails (details below).
# it probably anyway "incorrect" because of how Pacal handles dependent variables
# but I can't find a way how to rephrase it.

# btw, computing just P(ball <= pin+f) which doesn't introduce dependencies
# also fails because of the same reason
print( ball.cdf(pin+f) )

Here's the error just for the case, but that not really why I started that issue

  File "D:\utils\Anaconda3\lib\site-packages\pacal\distr.py", line 129, in cdf
    return self.get_piecewise_cdf()(x)

  File "D:\utils\Anaconda3\lib\site-packages\pacal\segments.py", line 997, in __call__
    ind = ((x>=seg.a) & (x<seg.b))

  TypeError: '>=' not supported between instances of 'ShiftedScaledDistr' and 'float'

(I guess it is because cdf() doesn't expect it's argument to be a distribution, rather than a float)

Ok, leave it as it is and try to calculate the probability in an old fashioned way via integration of joint pdf of distribution.

The issue

I tried to compute my probability directly by integrating pdf's of distribution and I obtain wrong results when I'm using pdf's given by Pacal, thought pdf's from sympy.stats or when coded directly by hand are fine.

Here's the full code:

import sympy as sp
import sympy.stats as st
import pacal as pc
import timeit
import scipy.integrate as scpi

ttrun = 1

s=20
f=2
h=3

anProb = 1 - (s-f-h)**2/((s-f)*(s-h))

def showres(n,pr,tm):
    if type(pr) is tuple:
        print('{0}-{1:.06f}s: computed={2}, an={3:.06f}'.format(n, tm, pr, anProb))
    else:
        print('{0}-{1:.06f}s: computed={2} ({3:.06f}), an={4:.06f}'.format(n, tm, pr,\
            pr if type(pr) is float else float(pr), anProb))

'''
ball = UniformDistr(0, s-h)
pin = UniformDistr(0,s-f)

print( ball.cdf(pin+f)-ball.cdf(pin-h) )
print( ball.cdf(pin+f) )
# doesn't work, should integrate by hand
'''

pr=0.0

def pacal_scipy_numint():
    global pr
    ball = pc.UniformDistr(0, s-h)
    pin = pc.UniformDistr(0,s-f)
    
    ball_pdf = ball.get_piecewise_pdf()
    pin_pdf = pin.get_piecewise_pdf()
    
    fxy = lambda x,y: ball.pdf(x)*pin.pdf(y)
    #fxy = lambda x,y: ball_pdf(x)*pin_pdf(y)   #same result

    ###### important
    #fxy2 = ball_pdf*pin_pdf
    #fxy = lambda x,y: fxy2(x)  #this is obviously wrong pdf, but it produces the same result as before, so most likely I'm doing something wrong, but what exactly?
    ###### /important
    
    #fxy = lambda x,y: 1/((s-h)*(s-f))   # works correctly (need to subtract 2 more integrals to make equal with anProb)
    
    pr = scpi.dblquad(fxy, 0, s-f, lambda y: y-h, lambda y: y+f)
    
    #def bnd_x(y):
    #    return [y-h, y+f]
    #pr = scpi.nquad(fxy, [bnd_x, [0, s-f]])  # same result
    
    return pr

bht = timeit.timeit(pacal_scipy_numint, number=ttrun)/ttrun
showres('pacal_scipy_numint', pr, bht)

#############################

def sympy_byhand():
    global pr
    #compute joint pdf and cdf value by hand
    ball_pdf = 1/(s-h)
    pin_pdf = 1/(s-f)
    fxy = ball_pdf*pin_pdf
    
    x = sp.Symbol('x')
    y = sp.Symbol('y')
    
    # need either to update pdfs to make it zero in out-of-domain area or to subtract false tails
    pr = sp.integrate(fxy,(x,y-h,y+f),(y,0,s-f)) \
        - sp.integrate(fxy,(x,y-h,0),(y,0,h)) \
           - sp.integrate(fxy,(x,s-h,y+f),(y,s-f-h,s-f))
    return pr

bht = timeit.timeit(sympy_byhand, number=ttrun)/ttrun
showres('sympy_byhand', pr, bht)

#############################

def sympy_stat():
    global pr
    x = sp.Symbol('x')
    y = sp.Symbol('y')
    
    od = st.Uniform('od', 0, s-h)
    fd = st.Uniform('od', 0, s-f)
    
    opdf = st.density(od)(x)
    fpdf = st.density(fd)(y)
    
    pr = sp.integrate(opdf * fpdf, (x,y-h,y+f),(y,0,s-f))
    return pr

bht = timeit.timeit(sympy_stat, number=ttrun)/ttrun
showres('sympy_stat', pr, bht)

The function that uses Pacal is pacal_scipy_numint(). Note the comments - I tried .pdf(), .get_piecewise_pdf() and a direct pdf specification. Only the latter works correctly.

Here's the results of run:

pacal_scipy_numint-1.638697s: computed=(0.014537426519383745, 1.4811726520553338e-08), an=0.264706
sympy_byhand-0.070829s: computed=0.264705882352941 (0.264706), an=0.264706
sympy_stat-1.090513s: computed=9/34 (0.264706), an=0.264706

Only pacal_scipy_numint() returns wrong result, all other implementations are fine... It looks that I missed something (it's quite possible, I'm newbie to python), but I don't get what's wrong.

Any ideas what is going on?

Thank you for your time.

apnd: it's definitely my fault somewhere with integration, because even when I set fxy = lambda x,y: float((x>=0) & (x<=(s-h)) & (y>=0) & (y<=(s-f)))/((s-h)*(s-f)) it gives the same incorrect result. I guess I'm overheated heavily))), but I still don't see what's wrong. Only unconstrained fxy = lambda x,y: 1/((s-h)*(s-f)) with subsequent subtraction (like in sympy_byhand) works properly.

Log logistic function

Hi, it's a lovely and nice library. Is it too difficult to add other functions such as log logistic? If following the other examples of distribution, would it be possible to do it?