Python package to calculate the weighted average with Bayesian methods

Bayesian average

Version:

0.2.4

Authors

Martino Trassinelli
CNRS, Institute of NanoSciences of Paris
emails: trassinelli AT cnrs.fr, m.trassinelli AT gmail.com

Marleen Maxton
Max Planck Institute for Nuclear Physics, Heidelberg

Homepage

https://github.com/martinit18/bayesian_average

License

Type: X11, see LICENCE.txt

Short description

This package calculates a robust weighted average and its uncertainty from a set of data points and their uncertainties based on Bayesian statistical methods. The proposed weighted average is particularly adapted for inconsistent data sets and the presence of outliers, both of which can distort the results of standard methods.

Basic principles

Given the arrays data and sigma, representing a set of data points $x_i$ and their associated uncertainties $\sigma_i$, this package calculates a weighted average that is well suited to inconsistent data sets (with a spread larger than the associated error bars) and to the presence of outliers.

This robust weighted average is based on Bayesian statistics, assuming a normal distribution for each $x_i$ and considering $\sigma_i$ as a lower bound of the possibly larger real uncertainty $\sigma'$. Two different priors are proposed for $\sigma'$: the non-informative Jeffreys' prior $p(\sigma') \propto 1/ \sigma'$ (more precisely its limit, see Ref. [1]), and a modified version of it $p(\sigma') \propto 1/ (\sigma')^2$ proposed in Ref. [2]. The probability distribution is obtained by marginalizing over $\sigma'$, resulting in a modified Gaussian distribution for each $x_i$ that still depends on $\sigma_i$ and is characterized by smoothly decreasing wings.
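As a worked illustration of the marginalisation described above (a sketch derived from the stated priors, not a transcription of the package code), take the modified prior normalised on $[\sigma_i, \infty)$, that is $p(\sigma') = \sigma_i/(\sigma')^2$. Integrating the Gaussian over $\sigma'$ gives

$$
p(x_i \mid \mu, \sigma_i) = \int_{\sigma_i}^{\infty} \frac{\sigma_i}{(\sigma')^2}\, \frac{1}{\sqrt{2\pi}\,\sigma'} \exp\left[-\frac{(x_i-\mu)^2}{2(\sigma')^2}\right] d\sigma' = \frac{\sigma_i}{\sqrt{2\pi}}\, \frac{1 - \exp\left[-\frac{(x_i-\mu)^2}{2\sigma_i^2}\right]}{(x_i-\mu)^2} .
$$

For Jeffreys' prior $p(\sigma') \propto 1/\sigma'$ on $[\sigma_i, \sigma_\mathrm{max}]$, the same integral in the limit $\sigma_\mathrm{max} \to \infty$ gives, up to a normalisation constant that vanishes in that limit,

$$
p(x_i \mid \mu, \sigma_i) \propto \frac{1}{|x_i-\mu|}\, \mathrm{erf}\left(\frac{|x_i-\mu|}{\sqrt{2}\,\sigma_i}\right) .
$$

Both expressions behave like a Gaussian of width comparable to $\sigma_i$ near $x_i = \mu$ but decay only as a power law far from it ($1/(x_i-\mu)^2$ and $1/|x_i-\mu|$, respectively), which is why a single outlier pulls the average much less than in the standard method.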

For both priors, the weighted average and its associated uncertainty are obtained numerically using the basinhopping minimisation algorithm.
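The following is a minimal, standard-library-only sketch of this idea, not the package's implementation. It assumes the marginalised per-point distributions that follow from the two priors above, and it replaces the basinhopping minimisation with a plain grid search over the data range, so it returns only the location of the maximum and no uncertainty. The function names `log_like_cons`, `log_like_jeffreys`, and `robust_average` are hypothetical.

```python
import math

def log_like_cons(mu, x, s):
    """Log of the marginalised distribution for the prior s / sigma'^2."""
    r2 = ((x - mu) / s) ** 2
    if r2 < 1e-12:
        core = 0.5  # limit of (1 - exp(-r2 / 2)) / r2 for r2 -> 0
    else:
        core = -math.expm1(-0.5 * r2) / r2
    return math.log(core / (s * math.sqrt(2.0 * math.pi)))

def log_like_jeffreys(mu, x, s):
    """Log of the unnormalised limiting distribution for the prior 1 / sigma'."""
    d = abs(x - mu)
    if d < 1e-12 * s:
        # limit of erf(d / (sqrt(2) s)) / d for d -> 0
        return math.log(math.sqrt(2.0 / math.pi) / s)
    return math.log(math.erf(d / (math.sqrt(2.0) * s)) / d)

def robust_average(data, sigma, log_like, n_grid=20001):
    """Grid search for the mu that maximises the summed log-likelihood."""
    lo, hi = min(data), max(data)
    best_mu, best_val = lo, -math.inf
    for k in range(n_grid):
        mu = lo + (hi - lo) * k / (n_grid - 1)
        val = sum(log_like(mu, x, s) for x, s in zip(data, sigma))
        if val > best_val:
            best_mu, best_val = mu, val
    return best_mu

# Illustrative data (made up): three consistent points and one outlier.
data = [9.9, 10.0, 10.1, 15.0]
sigma = [0.1, 0.1, 0.1, 0.1]
print(robust_average(data, sigma, log_like_jeffreys))
print(robust_average(data, sigma, log_like_cons))
```

With these made-up inputs both estimates stay close to 10, whereas the standard weighted average of the same equal-uncertainty points is 11.25. A grid search is used here only because it is easy to verify; a global optimiser is preferable in practice since the summed log-likelihood can have several local maxima.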

For comparison, both the standard (inverse-variance) weighted average and its value corrected by the Birge ratio are included.

How to install it

In your terminal, run:

pip install bayesian_average

How to use it

For the calculation of the weighted average, simply type in your Python shell:

import bayesian_average as ba
ba.average(data, sigma)

data and sigma are two arrays of the same dimension containing the data points and their associated uncertainties, respectively. The averaging mode is selected with the keyword mode; the default is Jeffreys' prior (jeffreys). The other available modes are cons, standard, and birge.

ba.average(data, sigma, mode='cons')

Details on the different methods are presented below.

The typical output is:

(6.6742395674538315, 9.74833292573106e-5)

where the first number represents the weighted average and the second represents its estimated uncertainty.

To plot the resulting probability distribution, the weighted average, and the input data, use the following command:

ba.plot_average(data, sigma)

By default, the plot shows Jeffreys' weighted average and its associated probability distribution on a logarithmic scale. Additional plotting options are available, for example:

ba.plot_average(data, sigma, jeffreys_val=True, jeffreys_like=True, plot_data=True)

The option xxx_val=True displays the value of the weighted average obtained with the xxx method.
xxx_like=True plots the likelihood function of the xxx method (on a logarithmic scale by default).
plot_data=True shows the input data with their error bars.
legendon=True displays the legend.
linear=True plots the likelihood function on a linear scale.
normalize=True normalises the likelihood function.
showon=True can be used if the plot is not displayed.

Details on the available weighted average modes

  • jeffreys: Jeffreys' weighted average (default average, recommended, see Ref. [1])
    The prior on the real uncertainty is the non-informative Jeffreys' prior, proportional to $1/\sigma'$. Because this prior cannot be normalised, the weighted average is defined as the limiting case with prior bounds $[\sigma_i, \sigma_\mathrm{max}]$ and $\sigma_\mathrm{max} \to \infty$, where $\sigma_i$ is the uncertainty of the data point. The resulting distribution is, however, not a proper probability distribution.
  • cons: Conservative weighted average (adapted for proper probability distributions, see Ref. [2])
    The prior on the real uncertainty is proportional to $\sigma_i/(\sigma')^2$, where $\sigma_i$ is the uncertainty of the data point. The bounds of the prior are $[\sigma_i, \sigma_\mathrm{max}]$ with $\sigma_\mathrm{max} \to \infty$. This is a modified, normalisable version of the non-informative Jeffreys' prior.
  • standard: Standard weighted average
    The standard inverse-variance weighted average, included for comparison.
  • birge: Standard weighted average corrected with the Birge ratio
    The uncertainty of the weighted average is enlarged by a factor determined by the $\chi^2$ of the data with respect to the weighted average when $\chi^2 > 1$, following Ref. [3].
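For orientation, the two comparison modes can be sketched with the textbook formulas. This is an assumption-laden illustration, not the package's code: it takes the Birge ratio in its common form $\sqrt{\chi^2/(n-1)}$ and applies it only when it exceeds 1, and the exact convention used by the package may differ. The function names `standard_average` and `birge_average` are hypothetical.

```python
import math

def standard_average(data, sigma):
    """Inverse-variance weighted mean and its uncertainty."""
    weights = [1.0 / s**2 for s in sigma]
    wsum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, data)) / wsum
    return mean, math.sqrt(1.0 / wsum)

def birge_average(data, sigma):
    """Same mean, with the uncertainty scaled by the Birge ratio if it is > 1.

    Assumes the common definition sqrt(chi2 / (n - 1)); requires n >= 2.
    """
    mean, err = standard_average(data, sigma)
    chi2 = sum(((x - mean) / s) ** 2 for x, s in zip(data, sigma))
    birge_ratio = math.sqrt(chi2 / (len(data) - 1))
    return mean, err * max(1.0, birge_ratio)

# Illustrative data (made up): three consistent points and one outlier.
data = [9.9, 10.0, 10.1, 15.0]
sigma = [0.1, 0.1, 0.1, 0.1]
print(standard_average(data, sigma))
print(birge_average(data, sigma))
```

In this made-up example the outlier shifts the mean to 11.25 in both cases; the Birge correction only widens the quoted uncertainty and does not move the central value.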

Reference articles:

[1] M. Trassinelli and M. Maxton, A minimalistic and general weighted average for inconsistent data, arXiv:2406.08293, submitted to Metrologia
[2] D. S. Sivia and J. Skilling, Data Analysis: A Bayesian Tutorial, 2nd ed., Oxford Univ. Press (2006)
[3] R. T. Birge, The Calculation of Errors by the Method of Least Squares, Phys. Rev. 40, 207 (1932)

Version history

  • 0.2: rearrangement of the average function(s), Birge ratio added.
  • 0.1.5: First version available on GitHub with documentation.
  • 0.0.1: First version published in PyPI with conservative, Jeffreys' and standard weighted averages.
