Comments (2)
Thank you for the feedback! I agree. I will change everything to lowercase.
Furthermore, I have been looking into your issue. Many of the distributions, such as lognorm, come from scipy, so the loc/scale parameters are probably best described in the scipy documentation.
For the lognormal distribution, the mean and std dev of the underlying normal correspond to log(scale) and the shape parameter, respectively.
For demonstration:
import numpy as np
import scipy.stats as st
from distfit import distfit

# Underlying normal parameters: "scale" here plays the role of mu
loc = 5
scale = 10

# scipy's lognorm takes a shape parameter (here 3), loc, and scale = exp(mu)
sample_dist = st.lognorm.rvs(3, loc=loc, scale=np.exp(scale), size=10000)

dfit = distfit('parametric', todf=True, distr=["lognorm"])
dfit.fit_transform(sample_dist)

print('Estimated loc: %g, input loc: %g' % (dfit.model['loc'], loc))
print('Estimated mu or scale: %g, input scale: %g' % (np.log(dfit.model['scale']), scale))
[distfit] >INFO> fit
[distfit] >INFO> transform
[distfit] >INFO> [lognorm] [0.36 sec] [RSS: 1.76437e-10] [loc=5.069 scale=22043.122]
[distfit] >INFO> Compute confidence intervals [parametric]
Estimated loc: 5.06934, input loc: 5
Estimated mu or scale: 10.0008, input scale: 10
The loc and scale are nicely estimated.
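The same mu = log(scale) relation can be checked with scipy alone, without distfit. A minimal sketch using the same parameter values as above (mu=10, sigma=3, loc=5); loc is fixed via floc to keep the fit stable:

```python
import numpy as np
import scipy.stats as st

rng = np.random.default_rng(0)
mu, sigma, loc = 10.0, 3.0, 5.0

# A shifted lognormal sample: exp(Normal(mu, sigma)) + loc
x = loc + np.exp(rng.normal(mu, sigma, size=100_000))

# scipy's parameterization: shape s = sigma, scale = exp(mu)
s_hat, loc_hat, scale_hat = st.lognorm.fit(x, floc=loc)
print('Estimated sigma: %.2f, estimated mu: %.2f' % (s_hat, np.log(scale_hat)))
```

With the shift fixed, log of the fitted scale recovers mu and the fitted shape recovers sigma.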
If I now do the same for your case, but first without the filters, the estimated mu seems pretty close.
import numpy as np
from distfit import distfit

mu = 13.8
loc = 47.55

# Note: this draws from a *normal* distribution with a very large scale
x_sim = np.random.normal(loc=loc, scale=np.exp(mu), size=10000)
# x_sim = np.append([*filter(lambda x: x <= 80, x_sim)], np.random.normal(loc=90, scale=10, size=50))
# x_sim = np.array([*filter(lambda x: x >= 0, x_sim)])

dfit = distfit('parametric', todf=True, distr=["lognorm"])
dfit.fit_transform(x_sim)
dfit.bootstrap(x_sim, n_boots=1)

print('Estimated mu or scale: %g, input scale: %g' % (np.log(dfit.model['scale']), mu))
Estimated mu or scale: 17.3597, input scale: 13.8
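As a side note on the commented-out filters: truncating a sample before fitting changes the moments any fit tries to match. A minimal numpy sketch with made-up numbers (mean 50, std 10, cutoff 60 are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=50, scale=10, size=100_000)

# Dropping everything above a cutoff (as the filters above would) pulls the
# sample mean down, so a fit on the filtered data sees different moments
x_trunc = x[x <= 60]
print('full mean: %.2f, truncated mean: %.2f' % (x.mean(), x_trunc.mean()))
```

This is why fitting first without the filters makes the comparison cleaner.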
Check out this thread on Stack Overflow.
I will read more into the resources you provided. Regarding the "manually" simulated 2nd plot (where the pdf basically looks like a corner and one cannot see the bars of the contained histogram), I now understand why it looks this way: the upper-limit confidence interval explodes. E.g., the empirical values in my distribution range up to …, but after fitting the popular distribution and then executing the bootstrap test, the 95% upper confidence interval boundary (for, e.g., the Pareto distribution) is estimated to be at ….
So I made sure to reread the information you provided. Thank you very much for clarifying the relation between mean, SD and log(loc) and log(scale). Still, as far as I understand it, negative values should not be possible under the distribution, since log(negative) results in a complex number with an imaginary component.
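That understanding matches what numpy and scipy actually do: the real-valued log of a negative number is undefined (nan), and scipy's lognorm assigns zero density outside its support. A minimal sketch (with a nonzero loc the support shifts to x > loc):

```python
import numpy as np
import scipy.stats as st

# Real log of a negative number is undefined (nan; the warning is suppressed here)
with np.errstate(invalid='ignore'):
    print(np.log(-1.0))        # nan

# The complex-aware log returns the principal value with an imaginary part
print(np.emath.log(-1.0))      # approximately pi*1j

# Hence the (unshifted) lognormal has zero density for x <= 0
print(st.lognorm.pdf(-1.0, s=1.0))
print(st.lognorm.pdf(0.0, s=1.0))
```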