reactivebayes / reactivemp.jl

High-performance reactive message-passing based Bayesian inference engine

License: MIT License

Julia 99.91% Makefile 0.09%
bayesian-inference message-passing probabilistic-programming probabilistic-graphical-models inference julia variational-bayes variational-inference

reactivemp.jl's People

Contributors

a-vzer, abpolym, albertpod, bartvanerp, bvdmitri, chengfeng-jia, cscherrer, github-actions[bot], hoangmhnguyen, ismailsenoz, magnuskoudahl, martindeq98, mhidalgoaraya, nimrais, pitmonticone, raphael-tresor, sepideh-adamiat, thijnhermsen, wmkouw, wouterwln


reactivemp.jl's Issues

StringIndexError: invalid index [80], valid nearby indices [79]=>'Λ', [81]=>':'

I'm getting the following StringIndexError after completing the inference procedure:

StringIndexError: invalid index [80], valid nearby indices [79]=>'Λ', [81]=>':'

It occurs while printing the InferenceResult mimestring:

Stacktrace:
  [1] string_index_err(s::String, i::Int64)
    @ Base .\strings\string.jl:12
  [2] SubString{String}(s::String, i::Int64, j::Int64)
    @ Base .\strings\substring.jl:32
  [3] SubString
    @ .\strings\substring.jl:38 [inlined]
  [4] SubString
    @ .\strings\substring.jl:40 [inlined]
  [5] view
    @ .\strings\substring.jl:50 [inlined]
  [6] show(io::IOContext{IOBuffer}, result::InferenceResult{Dict{Symbol, Array}, Vector{Real}, FactorGraphModel, Tuple{Vector{DataVariable{PointMass{Float64}, Rocket.RecentSubjectInstance{Message{PointMass{Float64}}, Subject{Message{PointMass{Float64}}, AsapScheduler, AsapScheduler}}}}, RandomVariable, Vector{RandomVariable}, Matrix{RandomVariable}, Vector{RandomVariable}}})
    @ ReactiveMP C:\Users\kouww\.julia\packages\ReactiveMP\bdzHB\src\inference.jl:90
  [7] show
    @ .\multimedia.jl:47 [inlined]
  [8] limitstringmime(mime::MIME{Symbol("text/plain")}, x::InferenceResult{Dict{Symbol, Array}, Vector{Real}, FactorGraphModel, Tuple{Vector{DataVariable{PointMass{Float64}, Rocket.RecentSubjectInstance{Message{PointMass{Float64}}, Subject{Message{PointMass{Float64}}, AsapScheduler, AsapScheduler}}}}, RandomVariable, Vector{RandomVariable}, Matrix{RandomVariable}, Vector{RandomVariable}}})
    @ IJulia C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\inline.jl:43
  [9] display_mimestring
    @ C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\display.jl:71 [inlined]
 [10] display_dict(x::InferenceResult{Dict{Symbol, Array}, Vector{Real}, FactorGraphModel, Tuple{Vector{DataVariable{PointMass{Float64}, Rocket.RecentSubjectInstance{Message{PointMass{Float64}}, Subject{Message{PointMass{Float64}}, AsapScheduler, AsapScheduler}}}}, RandomVariable, Vector{RandomVariable}, Matrix{RandomVariable}, Vector{RandomVariable}}})
    @ IJulia C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\display.jl:102
 [11] #invokelatest#2
    @ .\essentials.jl:716 [inlined]
 [12] invokelatest
    @ .\essentials.jl:714 [inlined]
 [13] execute_request(socket::ZMQ.Socket, msg::IJulia.Msg)
    @ IJulia C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\execute_request.jl:112
 [14] #invokelatest#2
    @ .\essentials.jl:716 [inlined]
 [15] invokelatest
    @ .\essentials.jl:714 [inlined]
 [16] eventloop(socket::ZMQ.Socket)
    @ IJulia C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\eventloop.jl:8
 [17] (::IJulia.var"#15#18")()
    @ IJulia .\task.jl:423

I can't share the model publicly. This issue is just to archive this bug. Once we find a fix, we'll post it here.
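For context, this is the generic Julia behaviour when a string is sliced at a byte offset that falls inside a multi-byte character (here the two-byte 'Λ'); my assumption is that the truncation in inference.jl:90 hits exactly that case. A minimal, self-contained illustration (not the ReactiveMP code itself) of the failure and of how thisind snaps the offset back to a valid character boundary:

s = "q(Λ)"                    # 'Λ' occupies bytes 3 and 4
# view(s, 1:4)                # StringIndexError: invalid index [4]
view(s, 1:thisind(s, 4))      # "q(Λ": the index is snapped back to the character's first byte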

Versions:

Julia: 1.7.2
[336ed68f] CSV v0.10.4
[8f4d0f93] Conda v1.7.0
[a93c6f00] DataFrames v1.3.4
[31c24e10] Distributions v0.25.58
[b3f8163a] GraphPPL v2.0.1
[7073ff75] IJulia v1.23.3
[b964fa9f] LaTeXStrings v1.3.0
[91a5bcdd] Plots v1.29.0
[92933f4c] ProgressMeter v1.7.2
[d330b81b] PyPlot v2.10.0
[a194aa59] ReactiveMP v2.0.3
[295af30f] Revise v3.3.3
[df971d30] Rocket v1.3.22
[37e2e46d] LinearAlgebra

Problems with Laplace approximation method

There are a couple of problems with the Laplace approximation:

  1. For the Laplace approximation we use the Optim.jl package to evaluate derivatives; however, in the current version of Optim.jl the derivative syntax for 1D functions has changed, leading to failures in ReactiveMP.jl.
  2. The current implementation of Laplace is not numerically stable.

Display user-friendly error when indexing random variables within the `@model` macro

Currently Julia simply gives a big, somewhat unreadable stack trace for the following model:

@model function mymodel()
    r ~ MvNormalMeanPrecision(..., ...)
    ...
    ... ~ r[i] ...
end

We should inform users that indexing of random variables is not allowed. Indexing (with slices included) of constvars and possibly datavars should be allowed though.

A model with switching dynamics silently yields an empty buffer

Hey everyone!

Thanks for the work put in ReactiveMP and the surrounding projects, it looks very interesting!

This is, perhaps, more of a question than an issue, so please feel free to redirect it to a more suitable platform, if any.

I've been trying to build a form of a Switching Linear Dynamical System (SLDS), which has the transition of a continuous latent variable (x) conditioned on the value of a categorical latent variable (s). The attached DAG attempts to illustrate it (note that I am using a continuous auxiliary variable α to condition x).

The complete attempt with some details can be found in this notebook, but in a nutshell:

  • I am trying to use Transition to condition α on s (much like the emitted value y in your HMM demo)
    α ~ Transition(s[t], Tα) # typeof(Tx) == MatrixDirichlet{Float64, Matrix{Float64}}
  • then I construct a node Tx to compute µ for the next x
    Tx ~ 1 + (-dt) * α # typeof(dt) == Float64
    x[t] ~ NormalMeanVariance(Tx * x_prev, σx) 

My most naive implementation of the inference silently returns empty buffers and I am clueless about how to debug it. The notebook includes a working, slightly simpler form of the model without the latent s, which is the only reason I assume I am misusing α ~ Transition or the node generation for Tx... Please let me know if there is a piece of documentation I am missing.

[Figure dag-1: DAG of the model with switching dynamics]

Outgoing BP rules for :var interface of NormalMeanVariance node.

Discussed in https://github.com/biaslab/ReactiveMP.jl/discussions/95

Hi guys,
(@ismailsenoz @semihakbayrak @bvdmitri)
How sure are we about the BP message for the :var interface that is associated with the variance?

We had a look a few days ago with @bartvanerp and it seemed to us that the normalization of the message is incorrect.
I see there are tests for these rules, but I am still concerned about some missing factors in:

ContinuousUnivariateLogPdf(DomainSets.HalfLine(), (x) -> -log(m_μ_cov + x) / 2 - 1 / (2x) * (m_out_mean - m_μ_mean) ^ 2)

There is an issue with the BP message from the GaussianMeanVariance node. The correct version, when both the :mean and :out inbound messages are Gaussian, should be:

(x) -> -log(m_μ_cov + m_out_cov + x) / 2 - 1 / (2 * (m_μ_cov + m_out_cov + x)) * (m_out_mean - m_μ_mean) ^ 2
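For reference, here is the sum-product sketch that I believe underlies the corrected expression above (my own derivation, not taken from the package tests):

m(v) \propto \iint \mathcal{N}(y; \mu, v)\, \mathcal{N}(y; m_{\mathrm{out}}, v_{\mathrm{out}})\, \mathcal{N}(\mu; m_{\mu}, v_{\mu})\, \mathrm{d}y \, \mathrm{d}\mu = \mathcal{N}(m_{\mathrm{out}}; m_{\mu}, v_{\mathrm{out}} + v_{\mu} + v),

so that, up to an additive constant,

\log m(v) = -\tfrac{1}{2} \log(v_{\mathrm{out}} + v_{\mu} + v) - \frac{(m_{\mathrm{out}} - m_{\mu})^2}{2 (v_{\mathrm{out}} + v_{\mu} + v)}.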

Skill Model, Basic usage

I am trying to apply ReactiveMP to Winn and Bishop's skill model. I am starting ultra simple and will build up. I just learned to do this in ForneyLab with help from the developers. I hope to do the same here. Here is what I have so far, modeled on the Coin Flip example since my model is discrete. Of course, I get an error.

using Rocket, GraphPPL, ReactiveMP, Distributions, Random

@model function toy3(n)
    y = datavar(n)  # 3 questions
    
    csharp ~ Bernoulli(0.5)
    sql ~ Bernoulli(0.5)
    
    isCorrect1 ~ Bernoulli(Addnoise(csharp))
    isCorrect2 ~ Bernoulli(Addnoise(sql))
    
    return y, isCorrect1, isCorrect2
end
dataset = [1., 0.]

result = inference(
    model = Model(toy3, length(dataset)),
    data  = (y = dataset, )
)

Error:

Valid type object was expected but '2' has been found

Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] ensure_type(x::Int64)
    @ GraphPPL ~/.julia/packages/GraphPPL/KNm3q/src/model.jl:22
  [3] macro expansion
    @ ./In[56]:4 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GraphPPL/KNm3q/src/model.jl:347 [inlined]
  [5] create_model(::Type{toy3}, toy3constraints_in#451::Nothing, toy3meta_in#453::Nothing, toy3options_in#455::NamedTuple{(), Tuple{}}, n::Int64)
    @ Main ~/.julia/packages/GraphPPL/KNm3q/src/backends/reactivemp.jl:73
  [6] create_model(generator::ReactiveMP.ModelGenerator{toy3, Tuple{Int64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Nothing, Nothing, Nothing}, constraints::Nothing, meta::Nothing, options::NamedTuple{(), Tuple{}})
    @ ReactiveMP ~/src/2022/ReactiveMP.jl/src/model.jl:42
  [7] inference(; model::ReactiveMP.ModelGenerator{toy3, Tuple{Int64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Nothing, Nothing, Nothing}, data::NamedTuple{(:y,), Tuple{Vector{Float64}}}, initmarginals::Nothing, initmessages::Nothing, constraints::Nothing, meta::Nothing, options::NamedTuple{(), Tuple{}}, returnvars::Nothing, iterations::Int64, free_energy::Bool, showprogress::Bool, callbacks::Nothing, warn::Bool)
    @ ReactiveMP ~/src/2022/ReactiveMP.jl/src/inference.jl:274
  [8] top-level scope
    @ In[57]:3
  [9] eval
    @ ./boot.jl:373 [inlined]
 [10] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
    @ Base ./loading.jl:1196

Obviously, there is some basic concept I do not understand, although I thought I did. What could have gone amiss? Thanks for any input!

Gordon
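The stack trace points at GraphPPL's ensure_type receiving the integer 2, which suggests that datavar(n) is missing an element type; in the other models in this thread the first argument of datavar is always a type. A minimal sketch of that single change (an assumption about the cause; the rest of the model, including Addnoise, is left untouched):

y = datavar(Float64, n)  # element type first, then the number of observations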

Fit at the beginning of an array takes 20x more iterations than the rest

As discussed, there is an odd artefact where the fit becomes great almost immediately across the model, except for the beginning, which then takes up to 20x more iterations to converge (linked more to the array size).

Result with 20 iterations (observe the beginning): [figure omitted]

Result with 365 iterations (= N): [figure omitted]

MWE for a simple curve-fitting scenario

# Packages used by this MWE
using Distributions, Plots, Splines2
using Rocket, GraphPPL, ReactiveMP

# Generate data to fit
time_max=365
noise_scale=0.05

time_index=collect(1:time_max)/time_max
p=180/time_max
growth=2
offset=0
y=offset .+ sin.(time_index*2π/p) .+ growth.*time_index .+rand(Normal(0,noise_scale),time_max)
plot(y)

# Generate splines to approximate the function
# Note: boundary knots are important to be outside of the needed range to avoid a row of all zeros (which breaks the backprop)
X=Splines2.bs(time_index,df=10,boundary_knots=(-0.01,1.01));

# Build the model
@model function linreg(X,n,dim_x)

    T=Float64
    
    y = datavar(T, n)
    aux = randomvar(n)

    sigma  ~ GammaShapeRate(1.0, 1.0)
    intercept ~ NormalMeanVariance(0.0, 2.0)
    beta  ~ MvNormalMeanPrecision(zeros(dim_x), diageye(dim_x))
    
    for i in 1:n
        aux[i] ~ intercept + dot(X[i,:], beta)
        y[i] ~ NormalMeanPrecision(aux[i], sigma) 
    end

    return beta,aux,y
end

constraints = @constraints begin 
    q(aux, sigma) = q(aux)q(sigma)
end

# Run inference
@time results = inference(
    model = Model(linreg,X,size(X)...),
    data  = (y = y,),
    constraints = constraints,
    initmessages = (intercept = vague(NormalMeanVariance),),
    initmarginals = (sigma = GammaShapeRate(1.0, 1.0),),
    returnvars   = (sigma = KeepLast(),beta = KeepLast(), aux = KeepLast()),#,y=KeepLast()),
    iterations   = 20,
    warn = true,
    free_energy=true
)

# Plot results
# Note: observe the divergence in the first 50 data points
# It disappears as you increase number of iterations
plot(mean.(results.posteriors[:aux]), ribbon = (results.posteriors[:sigma]|>mean|>inv|>sqrt),label="Fitted")
plot!(y,label="Observed data")

Linear combination is inconsistent / numerically unstable

I have a linear Gaussian state-space model with D separate states evolving according to a random walk,
z_dk ~ N(z_dk-1, 1.0).
That is, a set of D Gaussian random variables instead of a D-dimensional Gaussian random vector. This is a workaround for constraining the states to be independent (constraining the covariance matrix to be diagonal).

My measurement is a linear combination of these individual states:
x_k = \sum_{d=1}^D H_d z_{d,k}.
GraphPPL does not support indexing a column of a randomvar object, i.e., H*z[:,k], and also does not support multiple addition operations during variable definition, i.e., x_k ~ H[1]*z[1,k] + H[2]*z[2,k] + ... As such, I have constructed the linear combination with the following for-loop:

x[1,k] ~ H[1]*z[1,k]
for d = 2:D
    x[d,k] ~ x[d-1,k] + H[d]*z[d,k]
end 

where x = randomvar(D,T) and z = randomvar(D,T). So, basically, the rows of x serve as the intermediate variables in the linear combination. The final x[D,k] should be the result I need.

However, inference returns a mismatch between manually taking the linear combination H*mean.(qz[:,k]) and mean(qx[k]). H*z explodes as I increase the number of iterations (which makes me suspect some form of numerical instability). But everything is Gaussian, so I should be able to do sum-product message passing here.

It is best explained with an example (the opening @model line below is my reconstruction; it was missing from the snippet, and its signature is inferred from the Model(mwe, H, D=D, T=T) call further down):

@model function mwe(H; D, T)

    # Preallocate
    z = randomvar(D,T)
    x = randomvar(D,T)
    y = datavar(Float64, T)
    
    ### k = 1        

    # Initial state
    for d = 1:D
        z[d,1] ~ NormalMeanVariance(0.0, 1.0)
    end
    
    # Measurement function
    x[1,1] ~ H[1]*z[1,1]
    for d = 2:D
        x[d,1] ~ x[d-1,1] + H[d]*z[d,1]
    end
    
    # Likelihood
    y[1] ~ NormalMeanVariance(x[D,1], 1.0)
    
    ### k >= 1
    
    for k = 2:T
        
        # State transition
        for d = 1:D
            z[d,k] ~ NormalMeanVariance(z[d,k-1], 1.0)
        end
        
        # Measurement function
        x[1,k] ~ H[1]*z[1,k]
        for d = 2:D
            x[d,k] ~ x[d-1,k] + H[d]*z[d,k]
        end 
        
        # Likelihood
        y[k] ~ NormalMeanVariance(x[D,k], 1.0)
        
    end  
    
    return y, x, z
end

ReactiveMP.make_actor(x::Matrix{ <: RandomVariable }, ::ReactiveMP.KeepLast) = buffer(Marginal, size(x))
ReactiveMP.make_actor(::Matrix{ <: RandomVariable }, ::ReactiveMP.KeepEach) = keep(Matrix{Marginal})


# Test data
T = 100
D = 7
H = randn(7)
y = randn(T)

# Define factorization
constraints = @constraints begin
    q(x,z) = q(x)q(z)
end

# Automatic inference API
results = inference(
    model         = Model(mwe, H, D=D, T=T), 
    data          = (y = y,),
    constraints   = constraints,
    options       = (limit_stack_depth = 100,),
    initmessages  = (
        z = NormalMeanVariance(0.0, 1.0),
        x = NormalMeanVariance(0.0, 1.0)
    ),
    initmarginals = (
        z = NormalMeanVariance(0.0, 1.0),
        x = NormalMeanVariance(0.0, 1.0)
    ),
    returnvars    = (z = KeepLast(), x = KeepLast()), 
    free_energy   = true,
    showprogress  = true,
    iterations    = 100
)

# Extract approximate posteriors
qz = results.posteriors[:z]
qx = results.posteriors[:x]

# Plot difference between H*z and x
plot(1:T, [H'*mean.(qz[:,k]) for k=1:T], color="purple", label="H*qz")
plot!(1:T, [mean(qx[D,k]) for k=1:T], color="green", label="qx")

Versions:

  • Julia: 1.7.2
  • GraphPPL: 2.2.0
  • ReactiveMP: 2.2.0 (dev - master branch)
  • Rocket: 1.4.0

Extract specific messages for inspection during inference

Do we have a procedure for inspecting specific messages during inference?

In ForneyLab, the step!(data, marginals, messages) functions accepted an empty Array{Message}() that would be populated during inference. That array could be inspected and visualized, which was useful for debugging and educational purposes (see for instance here).

I had a quick look through ReactiveMP's API (e.g., https://biaslab.github.io/ReactiveMP.jl/stable/lib/message/) but couldn't find a method for accessing specific messages sent by nodes. I know you can expose individual nodes through node, variable ~ Node(args...) (GraphPPL API), but does that also let you access messages belonging to that node?

I imagine that if we have a reference to a specific message, then we can define an observable that subscribes to it and keeps the messages computed during inference for later inspection.

Use safe domains for point mass constraints optimisations

Found by @ismailsenoz

We should use stricter lower and upper bounds for various distributions in the point mass form constraint optimisation. For example, for a Gamma distribution a safe optimisation domain should probably be [ shape, Inf ], where shape refers to the shape of the prior distribution. Right now we use [ 0.0, Inf ], which may cause instability problems.

We need to support custom optimisation domains in the same way we support a custom starting optimisation point.

Refactor approximation methods

The approximation methods section is a bit outdated in comparison to other parts of the package and was implemented long ago. It needs revision, especially methods like GaussHermite, GaussLaguerre, approximate_meancov and others. We need to come up with a uniform interface for all approximation methods, or maybe just use another package?

The approximation methods and their code are located in the src/approximations/ folder.

We use it here:
https://github.com/biaslab/ReactiveMP.jl/blob/c8fab6b4d78f704ab3ba1311005ce6204efdda15/src/nodes/probit.jl#L34
https://github.com/biaslab/ReactiveMP.jl/blob/c7e60f907393bd732933ffb1503af88a38e2e0b0/src/distributions/exp_linear_quadratic.jl#L18
https://github.com/biaslab/ReactiveMP.jl/blob/c7e60f907393bd732933ffb1503af88a38e2e0b0/src/nodes/kernel_gcv.jl#L26
and possibly elsewhere.

Turing.jl benchmarks: LinearGaussianSSM

First I want to say: this is a sick package! Nice work!

In benchmarks/notebooks you have a notebook benchmarking ReactiveMP.jl vs. Turing.jl on a Linear Gaussian State-Space Model.

If you make the following changes to the Turing.jl section, you get a 30x speedup.

First off, don't leave the x variable untyped, i.e. instead define the model as

Turing.@model function LinearGaussianSSM(y, A, B, P, Q, ::Type{TV} = Vector{Vector{Float64}}) where {TV}
    n = length(y)

    # State sequence.
    x = TV(undef, n)

    # Observe each point of the input.
    x[1] ~ MvNormal([ 0.0, 0.0 ], 1e1)
    y[1] ~ MvNormal(B * x[1], Q)

    for t in 2:n
        x[t] ~ MvNormal(A * x[t - 1], P)
        y[t] ~ MvNormal(B * x[t], Q)
    end
end;

Secondly, use ReverseDiff.jl + Memoization.jl to get reverse-mode AD with a compiled tape, i.e. add:

using ReverseDiff, Memoization
Turing.setadbackend(:reversediff)
Turing.setrdcache(true)

before the sampling statements.

On my end, this brings the two runs down to roughly 1 s and 2 s, respectively.

Of course, ReactiveMP.jl is much faster than this :) But I was reviewing the paper and had a look at the benchmarks, upon which I noticed that the current implementation was somewhat lacking.

And for future reference, these things are described more in detail here: https://turing.ml/dev/docs/using-turing/performancetips.

As a side-note (from one PPL-dev to another 😉), I'd recommend trying to ping someone from the PPLs you're comparing to when making benchmarks. This way you can ensure that you get a fair comparison 👍 Of course you have to be a bit careful about what you allow, e.g. as a Turing.jl dev, I could further optimize the heck out of this model, but I refrain from doing that as it's not really representative of what you're trying to demonstrate/what a regular user would be doing. But I think the change I proposed above falls within "reasonable alterations", as we would expect performance-sensitive users to look at the docs for performance tips.

EDIT: There are also alternative samplers which would be very well-suited for this problem, e.g. Elliptical Slice Sampling (Turing.ESS), which would scale even better than HMC with reverse-mode wrt. number of data-points.

MethodError: no method matching keys(::KeepLast)

This error might occur while using the automatic inference block. For example:

result = inference(
    model         = Model(model), 
    data          = (y = 0.3,),
    returnvars    = (x = KeepLast())
)
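A likely culprit (an assumption based on the error, not a confirmed diagnosis): in Julia, (x = KeepLast()) is parsed as a plain assignment wrapped in parentheses, so returnvars receives the bare KeepLast object instead of a NamedTuple, and keys then gets called on it. A trailing comma turns it into a one-element NamedTuple:

returnvars = (x = KeepLast(), )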

Julia: 1.7.2

  [b3f8163a] GraphPPL v2.0.1
  [bdcacae8] LoopVectorization v0.12.108
  [92933f4c] ProgressMeter v1.7.2
  [a194aa59] ReactiveMP v2.0.1
  [df971d30] Rocket v1.3.20

Missing cell in demo Coin Flip Example

I would like to report that Cell [3]:

result = inference(
    model = Model(coin_model, length(dataset)), 
    data  = (y = dataset, )
)

does not run, because dataset is not defined. Thanks.

Generalise Gaussian rules to support all Gaussian parametrisations

Currently, nodes from the Gaussian distributions family only accept messages of the same parametrisation.
E.g.

@rule NormalMeanPrecision(:μ, Marginalisation) (m_out::NormalMeanPrecision, q_τ::Any) = NormalMeanPrecision(mean(m_out), cholinv( cov(m_out) + cholinv(mean(q_τ)) ))

We need to generalise it with UnivariateNormalDistributionsFamily.
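A sketch of what the widened rule could look like, simply relaxing the message type annotation of the rule quoted above (an illustration of the idea, not the final implementation):

@rule NormalMeanPrecision(:μ, Marginalisation) (m_out::UnivariateNormalDistributionsFamily, q_τ::Any) = NormalMeanPrecision(mean(m_out), cholinv(cov(m_out) + cholinv(mean(q_τ))))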

Adding API for inference routines

As ReactiveMP.jl uses Rocket.jl under the hood, the inference part (unlike model specification) might look a little frightening to new users. Therefore, to make ReactiveMP.jl more user-friendly, we must develop clean and flexible APIs for inference procedures.

First, we should notice that every inference procedure comprises a set of routines almost independent of the model specification.
Generally speaking, all inference functions take two primary inputs: data and initial marginals.
data is typically used for supplying priors and observations to the model.
initial marginals speak for themselves. They carry the information about initial marginal distributions used in "non-belief-propagation" inference algorithms.

Typically, all inference functions exhibit the following routine (a rough code sketch follows the list):

  1. Retrieve random variables (RV) from the graphical model.
  2. Initialize buffers for the free-energy (FE) and RVs of interest.
  3. Subscribe to the RVs and FE.
  4. Set marginals according to initial marginals
  5. Run inference by updating observed RVs with data
  6. Unsubscribe and return the quantities of interest, i.e., RVs and FE.
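A rough sketch of steps 2-6 for a single random variable, assuming step 1 (retrieving the variables x and y from the graphical model) has already been done; getmarginal, setmarginal!, update! and vague are ReactiveMP primitives, keep, subscribe!, unsubscribe! and getvalues come from Rocket, and the free-energy subscription is omitted here because its API is still in flux:

using Rocket, ReactiveMP

function manual_inference(x, y, data; initmarginal = vague(NormalMeanVariance), niters = 10)
    posteriors   = keep(Marginal)                          # 2. buffer for the RV of interest
    subscription = subscribe!(getmarginal(x), posteriors)  # 3. subscribe to the RV
    setmarginal!(x, initmarginal)                          # 4. set the initial marginal
    for _ in 1:niters
        update!(y, data)                                   # 5. feed the observations
    end
    unsubscribe!(subscription)                             # 6. release the subscription
    return getvalues(posteriors)
end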

We can't overlook the possibility of automation of this routine.
As a start, we propose to have the following API:

rvs_, fe_ = inference(model, data, marginals, constraints=nothing, options=nothing)

model returns random variables of the graphical model
data and marginals carry, for example, dictionaries needed for the algorithm to run; constraints will serve as a placeholder (at some point, constraints will be separated from the model definition).
options should carry the information about the number of variational iterations, the parameters that inference should return, and the type of the return, e.g., whether the user wants to obtain the posterior of the last VMP iteration or the posteriors over all iterations.

I suggest that we first settle on these things:

  • what is the right way of passing data and marginals?
  • how to set marginals?
  • which options should be specified?

FE computation for Categoricals fails when the parameter vector has 0 entries

When running an HMM, the FE computation fails due to 0 entries. For example, in the model

@model function t_maze(A,D,B,T)
    Ac = constvar(A)
    z_0 ~ Categorical(D)
    z = randomvar(T)
    x = datavar(Vector{Float64}, T)
    z_prev = z_0
    for t in 1:T
        z[t] ~ Transition(z_prev, B[t])
        x[t] ~ Transition(z[t], Ac)
        z_prev = z[t]
    end
end

We are unable to compute the FE if D contains 0's.
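My guess (an assumption, not a verified diagnosis) is that the failure comes from terms of the form p * log(p) with p = 0 in the average energy / entropy computations of the Categorical and Transition factors. A common convention is to define such terms as 0, e.g. with an xlogx helper (LogExpFunctions.jl ships one):

xlogx(x) = iszero(x) ? zero(x) : x * log(x)  # defines 0 * log(0) as 0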

average energy computation for the Wishart node

I've been looking into the computation of the average energy for the Inverse-Wishart node.
So I had a look at the computation of this quantity for the Wishart node as well.
It appears to me that the average energy computation for the Wishart node is incorrect, i.e.,

@average_energy Wishart (q_out::Any, q_ν::Any, q_S::Any)

You would normally expect q_out to be a Wishart distribution, meaning that we'd need to take the expectation of the log-determinant under q_out, not just logdet(m_q_out) where m_q_out = mean(q_out).

It seems to me that ForneyLab.jl does it properly. Is there something I'm missing?
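For reference, the expectation that would replace logdet(m_q_out) when q_out is a p × p Wishart distribution with ν degrees of freedom and scale matrix S is the standard result

E_q[\ln |X|] = \psi_p(\nu / 2) + p \ln 2 + \ln |S|,

where \psi_p is the multivariate digamma function (my sketch of the point being made here, not the package's code).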

Feature request: Broadcast distributions / mean-field MvNormal?

At the moment it's challenging to define a mean-field multivariate Normal distribution (i.e., with a diagonal covariance matrix).

E.g., assuming I have a regression y = beta * X + eps, where beta is MvNormal of dimension dim_x.
At the moment I can easily use MvNormal and fit the full covariance matrix:

beta  ~ MvNormalMeanPrecision(zeros(dim_x), diageye(dim_x))

for i in 1:n
    y[i] ~ NormalMeanPrecision(dot(X[i,:],beta), 1.)
end

If I wanted a mean-field MvNormal instead (e.g., to get faster fitting), I would have to jump through a lot of hoops (e.g., unroll the dot product, etc.), which would ultimately give me no benefits:

# create a RV with dimension `dim_x`
beta = randomvar(dim_x)

# since I cannot use dot() with randomvar(), I would need to unroll it
# I would need some temporary variable 
# because I cannot accumulate (+=), I would need to save intermediate results into a helper dimension
temp = randomvar(n, dim_x)

for i in 1:n
    # unsafe! just to show the needed steps
    # assumes dim_x>1
    temp[i,1] ~ X[i,1] * beta[1]
    for j in 2:dim_x
        temp[i,j] ~ temp[i,j-1] + X[i,j] * beta[j]
    end
    
    # and then use it for the normal likelihood
    y[i] ~ NormalMeanPrecision(temp[i,dim_x], 1.)
end

As a user, I'd like to define something like

beta ~ randomvar(Normal(0,1),n) 

# or ala Turing when all dists are the same
beta ~ filldist(Normal(0,1),n)
# or if they are different
beta ~ arraydist([Normal(mu,1) for mu in mu_array])

Ideally, this object would then support all the operators that MvNormal does (e.g., dot product).

It would be helpful when dim_x gets large (e.g., into the 1000s).

distribution parameterization by keyword arguments

This is not really a feature request but rather a discussion item that concerns distributions like NormalMeanVariance(), NormalMeanPrecision() etc.
Rather than

xt_min ~ NormalMeanVariance(xt_min_mean, xt_min_var)

is there a reason why we don't write

xt_min ~ Normal( mean=xt_min_mean, var=xt_min_var)

This way you can also write

xt_min ~ Normal( natural_mean=xt_m, precision=xt_v)

You get the idea. The parameter list determines whether you have the moment or information parameterization scheme.

I think it's cleaner and more consistent with other notation. E.g., we don't write GammaShapeScale but just Gamma for the Gamma distribution, and all the exponential family distributions have moment and canonical parameterizations, so the user would not know whether you use GammaShapeScale or GammaShapeRate. Here, I would also recommend

Gamma(shape= ..., scale= ...)

to make the chosen parameterization clear.
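A hypothetical sketch of how keyword arguments could dispatch onto the existing explicitly named nodes. The helper name NormalByKeywords is mine (the proposal above would spell it Normal, but that name would clash with Distributions.Normal in a self-contained snippet), and whether such a call plays well with the @model macro's parsing is a separate question:

using ReactiveMP

function NormalByKeywords(; mean = 0.0, var = nothing, precision = nothing)
    var       !== nothing && return NormalMeanVariance(mean, var)
    precision !== nothing && return NormalMeanPrecision(mean, precision)
    return error("specify either `var` or `precision`")
end

NormalByKeywords(mean = 0.0, precision = 2.0)  # returns a NormalMeanPrecision instance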

Question: Can you set returnvars to return KeepLast() from all RVs?

Is it possible to set the returnvars keyword of the inference() function so that it keeps only the last distribution for all random variables?

Eg,

results = inference(
    model = Model(m, x),
    data  = (y = y,),
    initmessages = (
        a = vague(NormalMeanVariance)
        ),
    initmarginals = (sigma = GammaShapeRate(1.0, 1.0),),
    
    # a way to KeepLast() for all variables
    returnvars   = KeepLast()
 
)

Add the possibility of blocking updates of the initialized marginals.

When running variational message passing, ReactiveMP.jl updates all random variables.
For prototyping/experimenting, it is sometimes useful to block some of the marginal updates.
For now, I see only one way of enforcing this behavior: substituting randomvar with datavar.
This option is undesirable as it requires some changes in the model/inference specification.

randomvar()

I have what seems to be a trivial issue with randomvar(). The stable ReactiveMP 2.0 documentation shows many examples in the Model Specification section that use randomvar(). Yet, in the REPL of Atom, this function is not recognized. Instead, there are 12 variations of the function with different signatures. I could use one of these of course, but I am wondering about the apparent discrepancy between the documentation and the actual methods. Thanks.

GLM with an exponential Link function

Hi,
I am trying to figure out how to convert, as much as possible, a model that uses non-conjugate priors.
I very much like the Statistical Rethinking book and use its resources to guide my modeling.
For example, how could you represent the following Statistical Rethinking Turing model in ReactiveMP?
It uses a GLM with an exponential link function.

click here for this Turing Model

Inference on DAG

I am thinking of using ReactiveMP.jl for inference in an AR-HMM, which requires inference on a DAG. So my question is: can ReactiveMP handle general DAGs?

I'm new to this field, but I did some reading, and now know that a special type of message passing algorithm, called the junction tree algorithm, is required for inference on a DAG, rather than the normal message passing used on a tree. I grepped around the files for the term "junction" and couldn't find the word, but I couldn't determine from that whether inference on a DAG is possible, due to my limited knowledge, which is why I am asking this question.

Just for information, I paste a graphical model of an AR-HMM. (Stanculescu, Ioan, et al., "Autoregressive hidden Markov models for the early detection of neonatal sepsis." IEEE Journal of Biomedical and Health Informatics 18.5 (2013): 1560-1570.)
[Figure: AR-HMM graphical model]

Docs should contain list of exported methods

Feedback from users:

It is bad practice (common to many programming languages) to import all methods implicitly, e.g., from numpy import *, as it might silently overwrite a reference. So, users would like to write import ReactiveMP: update!, ... explicitly instead of using ReactiveMP.

However, the documentation does not list all methods that ReactiveMP exports. Adding the remainder would be helpful.

Update README

Given the recent (and old) updates of ReactiveMP.jl, the current README does not fully represent what our framework is capable of.

The framework works well beyond conjugate SSMs, so we need to highlight how to achieve inference in these model classes.

Things to change:

  1. Add a short passage on non-conjugate models to README.md
  2. Update video on API

Add SGCV node

We should add the switching Gaussian controlled variance (SGCV) node.

Updating parameters

I was trying to update a parameter and I faced this error:

'x = datavar(PointMass{Float64}, ...)' accepts data of type PointMass{Float64}, but Irrational{:π} has been supplied. Check 'update!(x, data::Irrational{:π})' and explicitly convert data to type PointMass{Float64}.

This is how I updated the x parameter with the goal:
goal = [π for t in 1:T]
(x, x_0, z, znodes, u, zcm, zcc) = model.model_variables
update!(x, goal)

And this is my model:

@model function pendulum_controller(T, A, B, Q, C, R, m_u, v_u)

    cA = constvar(A) # Transition matrix
    cB = constvar(B) # Control matrix
    cC = constvar(C) # Observation matrix
    cQ = constvar(Q) # Process noise covariance matrix
    cR = constvar(R) # Measurement noise covariance matrix
    
    # Vector of reference to factor nodes of latent states
    znodes = Vector{FactorNode}(undef, T)
    
    z = randomvar(T)        # Hidden states
    u = randomvar(T)        # Controls
    x = datavar(Float64, T) # Goal observations

    # Parameters of prior state
    zcm = datavar(Vector{Float64})
    zcc  = datavar(Matrix{Float64})
    
    # Prior state
    z_prior ~ MvNormalMeanCovariance(zcm, zcc)
    z_prev = z_prior
    
    # Current observation for filtering
    x_0 = datavar(Float64)
    x_0 ~ NormalMeanVariance(dot(cC, z_prior), cR)

    # Extend the model T steps into the future
    for t in 1:T
        
        # Control prior at t
        u[t] ~ NormalMeanVariance(m_u, v_u)
        
        # State transition at t
        znodes[t], z[t] ~ MvNormalMeanCovariance(cA * z_prev + cB * u[t], cQ)
        
        # Likelihood at t
        x[t] ~ NormalMeanVariance(dot(cC, z[t]), cR)
        
        # Update previous state
        z_prev = z[t]
    end

    return x, x_0, z, znodes, u, zcm, zcc
end 
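As an aside (an observation about the error text, not necessarily the intended resolution): [π for t in 1:T] produces a Vector{Irrational{:π}}, which is exactly the type the error complains about; converting the goal to Float64 up front sidesteps that particular complaint:

goal = fill(float(π), T)  # Vector{Float64} instead of Vector{Irrational{:π}}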

`+` node should use `convolve` from Distributions.jl

help?> convolve
search: convolve ConcurrencyViolationError

  convolve(d1::Distribution, d2::Distribution)

  Convolve two distributions and return the distribution corresponding to the sum of
  independent random variables drawn from the underlying distributions.

  Currently, the function is only defined in cases where the convolution has a closed
  form. More precisely, the function is defined if the distributions of d1 and d2 are
  the same and one of

    •  Bernoulli

    •  Binomial

    •  NegativeBinomial

    •  Geometric

    •  Poisson

    •  Normal

    •  Cauchy

    •  Chisq

    •  Exponential

    •  Gamma

    •  MvNormal

  External links: List of convolutions of probability distributions on Wikipedia
  (https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions)

GCV node needs update

In the current implementation of the GCV node, the parameters kappa and omega are not being updated.
The update rules lack a mean-field approximation, which makes it impossible to connect the output of the GCV node to a point-mass variable.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Constraints on a covariance matrix

I want to specify a linear Gaussian dynamical system with constraints on the process noise covariance matrix.

I am starting out with the constraint that the matrix should be diagonal. For the static case, this can be done with the following MWE model:

@model function diagN(; D=2)
    
    Id = diageye(D)
        
    # Allocate variables
    τ = randomvar(D)
    X = randomvar(D)
    x = randomvar(D)
    y = datavar(Float64)
    
    # Define variances
    for d = 1:D
       τ[d] ~ Gamma(1.0, 1.0) 
    end
    
    # Define first element
    x[1] ~ NormalMeanPrecision(0.0, τ[1]) where { q = MeanField() }

    # Expand to vector
    X[1] ~ Id[:,1]*x[1]
    
    for d = 2:D
        
        # Define later elements
        x[d] ~ NormalMeanPrecision(0.0, τ[d]) where { q = MeanField() }

        # Expanded to vector and accumulate
        X[d] ~ X[d-1] + Id[:,d]*x[d]
        
    end
    
    # Likelihood
    y ~ NormalMeanPrecision(dot(ones(D), X[D]), 1.0) where { q = MeanField() }
        
    return y, x, τ
end

result = inference(
    model         = Model(diagN, D=2), 
    data          = (y = 0.3,),
    options       = (limit_stack_depth = 100,),
    returnvars    = (x = KeepLast(), τ = KeepEach()),
    initmarginals = (x = NormalMeanPrecision(0.0, 1.0),),
    free_energy   = true,
    iterations    = 100, 
    showprogress  = true
)

The multivariate Gaussian X is constructed from a cumulative sum of basis-expanded univariate Gaussians x, each with its own Gamma-distributed noise precision parameter (the final X[D] is the complete r.v.).

Now, I'd like to extend this to the dynamical setting. I've defined arrays of randomvars with dynamics on the univariate states. I've used the earlier construction to build a multivariate Gaussian at each time step, which is fed to the likelihood. My MWE model is:

@model function SSM_diagN(; D=2, T=100)
    
    # Aux matrix
    Id = diageye(D)
    
    # Allocate
    Z = randomvar(D,T)
    z = randomvar(D,T)
    y = datavar(Float64, T)
    
    ### k=1
    
    # Define first element
    z[1,1] ~ NormalMeanPrecision(0.0, 1.0) where { q = MeanField() }
    
    # Expand to vector
    Z[1,1] ~ Id[:,1]*z[1,1]

    for d = 2:D

        # Define later elements
        z[d,1] ~ NormalMeanPrecision(0.0, 1.0) where { q = MeanField() }

        # Expand to vector and accumulate
        Z[d,1] ~ Z[d-1,1] + Id[:,d]*z[d,1]

    end

    # Emit observation
    y[1] ~ NormalMeanPrecision(dot(ones(D), Z[D,1]), 1.0) where { q = MeanField() }
    
    ### k>=1
    
    for k = 2:T
        
        # Define first element
        z[1,k] ~ NormalMeanPrecision(z[1,k-1], 1.0) where { q = MeanField() }
        
        # Expand to vector
        Z[1,k] ~ Id[:,1]*z[1,k]

        for d = 2:D

            # Define later elements
            z[d,k] ~ NormalMeanPrecision(z[d,k-1], 1.0) where { q = MeanField() }

            # Expand to vector and accumulate
            Z[d,k] ~ Z[d-1,k] + Id[:,d]*z[d,k]

        end
    
        # Emit observation
        y[k] ~ NormalMeanPrecision(dot(ones(D), Z[D,k]), 1.0) where { q = MeanField() }
        
    end    
    return y, z, Z
end

D = 2
T = 10

result = inference(
    model         = Model(SSM_diagN, D=D, T=T), 
    data          = (y = sin.(1:T),),
    options       = (limit_stack_depth = 100,),
    initmessages  = (z = NormalMeanPrecision(0.0, 1.0), Z = MvNormalMeanPrecision(zeros(D), diageye(D))),
    initmarginals = (z = NormalMeanPrecision(0.0, 1.0), Z = MvNormalMeanPrecision(zeros(D), diageye(D))),
    returnvars    = (z = KeepLast(),),
    free_energy   = true,
    iterations    = 10, 
    showprogress  = true
)

My problem is that I'm apparently feeding a Matrix{Marginal} to a Vector{Marginal} r.v. I'm having trouble seeing where exactly. My stack trace:

Actor of type BufferActor{Marginal, Vector{Marginal}} expects data to be of type Vector{Marginal}, while subscribable of type ProxyObservable{Matrix{Marginal}, Rocket.CollectLatestObservable{Marginal, Matrix{ProxyObservable{Marginal, ReactiveMP.MarginalObservable, Rocket.FilterProxy{ReactiveMP.var"#53#54"}}}, Matrix{Marginal}, typeof(copy)}, Rocket.TapProxy{ReactiveMP.var"#614#615"{ReactiveMP.MarginalHasBeenUpdated}}} produces data of type Matrix{Marginal}.

I've done a few checks.

  1. typeof(Id[:,1]) = Vector{Float64} (alias for Array{Float64, 1})
  2. I've tried various combinations of initmessages and initmarginals.

But I'm at a loss as to what to try next. Any suggestions?
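The error resembles the Matrix{Marginal} vs Vector{Marginal} mismatch in the linear-combination issue above, which was worked around there by extending ReactiveMP.make_actor for matrix-shaped random variables. The same overloads, copied from that issue (and only an assumption that they apply here as well):

ReactiveMP.make_actor(x::Matrix{ <: RandomVariable }, ::ReactiveMP.KeepLast) = buffer(Marginal, size(x))
ReactiveMP.make_actor(::Matrix{ <: RandomVariable }, ::ReactiveMP.KeepEach) = keep(Matrix{Marginal})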

Versions:

Julia: 1.7.2
[b3f8163a] GraphPPL v2.0.1
[bdcacae8] LoopVectorization v0.12.108
[92933f4c] ProgressMeter v1.7.2
[a194aa59] ReactiveMP v2.0.1
[df971d30] Rocket v1.3.20

Variables connected to Uninformative node not inferred

I was working on creating an efficient Kalman filter in ReactiveMP and wanted to combine the equality and multiplication nodes according to Table 4 in (Loeliger, 2007, The factor graph approach to model-based signal processing), as in ReactiveBayes/GraphPPL.jl#28. The message towards the future then corresponds to the new prior, etc. In order to terminate the graph I used the Uninformative node to "close" the future. For efficiency it would not suffice here to put an uninformative normal prior on the future, as this would lead to unnecessary inversions.

Currently the Uninformative node is a Deterministic node. As a result, the variables connected to it are constant variables and are not inferred during inference. In my case I call:

x ~ Uninformative()
y ~ Node(x)

so perhaps this particular ordering is what causes the issue.

A fix would be to convert the Uninformative node to be a Stochastic node. By definition it should not influence the calculations in the rest of the model and therefore I propose to set its average energy to 0. However, this differs from the average energy of Normal priors with a very large variance:
https://github.com/biaslab/ReactiveMP.jl/blob/cee5ccd9d944c7746a9bc7e2a80b9e75bcfc9753/src/nodes/normal_mean_precision.jl#L15-L18
I think that it might be okay to set its average energy to 0, as it truly does not influence inference, contrary to the uninformative Gaussian case where the zero mean could be propagated (informatively) using VMP.

Support of message passing for Boolean functions

Currently, ReactiveMP.jl does not support inference in Belief networks composed of Bernoulli random variables (RVs).

One good example of such a network is shown in Chapter 2 of Bishop's Model-Based Machine Learning book.

The dependencies between these variables are enforced by logic functions, e.g., AND, NOT, OR, etc.

ForneyLab.jl has a demo on building an AND node.

A related issue was initially opened in ForneyLab.jl.
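For what it's worth, the sum-product rules for such Boolean nodes with Bernoulli inbound messages have simple closed forms; e.g. for out = z1 AND z2, the forward message is Bernoulli(p1 * p2). A textbook sketch of that forward rule (plain Julia, not ReactiveMP's @rule syntax):

bernoulli_and_forward(p1::Real, p2::Real) = p1 * p2  # P(out = 1) = P(z1 = 1) * P(z2 = 1)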

Implement RequireInboundMarginal

In the same way that we have RequireInbound for messages, we may want to add RequireInboundMarginal or something similar to request marginals on inbound edges for further message computations.

Networks of discrete random variables

This is more of a question than an issue. (If there's a better place to ask it please let me know.)

I'm hoping to do inference on networks of (mostly) discrete random variables. ReactiveMP sounded like it might be well suited to my applications, because message passing is likely to perform a lot better than the MCMC algorithms that other PPLs tend to use.

However, I can't get some simple examples to work, and I'm wondering whether ReactiveMP can be used for this sort of thing or if I'm just barking up the wrong tree.

Here's the simplest thing I tried, modifying the coin flip example from the getting started page:

@model function coin_model()
    y = datavar(Float64, 1)
    θ ~ Bernoulli(0.5)
    y[1] ~ Bernoulli(θ)
    return y, θ
end

result = inference(
    model = Model(coin_model),
    data  = (y = Vector([1.0]), )
)

Here the 'parameter' θ is drawn from a Bernoulli distribution and y is just a copy of θ. We condition on y=1, so we hope to find that the posterior for θ is also concentrated on 1. However, it gives this error message instead (plus a stack trace):

No analytical rule available to compute a product of distributions Bernoulli{Float64}(p=0.5) and Beta{Float64}(α=2.0, β=1.0).

I understand this error, though it seems that calculating the product of those distributions analytically should be straightforward (you just evaluate the beta distribution at each point in the support of the Bernoulli one). Does this mean this kind of thing isn't supported?

I also tried some variations on this slightly less silly model, where y is a noisy copy of θ instead of an exact copy:

@model function coin_model()
    y = datavar(Float64, 1)
    θ ~ Bernoulli(0.5)
    p ~ 0.2 + (θ * 0.6)
    y[1] ~ Bernoulli(p)
    return y, θ
end

result = inference(
    model = Model(coin_model),
    data  = (y = Vector([1.0]), )
)

but I couldn't get any version of this to work either. (This version doesn't seem to like the +.)

In short my question is whether I can use ReactiveMP to build up networks of discrete random variables and perform inference on them, or is this not supported?
