reactivebayes / reactivemp.jl Goto Github PK
View Code? Open in Web Editor NEWHigh-performance reactive message-passing based Bayesian inference engine
License: MIT License
High-performance reactive message-passing based Bayesian inference engine
License: MIT License
I'm getting the following StringIndexError after completing the inference procedure:
StringIndexError: invalid index [80], valid nearby indices [79]=>'Λ', [81]=>':'
It occurs while printing the InferenceResult mimestring:
Stacktrace:
[1] string_index_err(s::String, i::Int64)
@ Base .\strings\string.jl:12
[2] SubString{String}(s::String, i::Int64, j::Int64)
@ Base .\strings\substring.jl:32
[3] SubString
@ .\strings\substring.jl:38 [inlined]
[4] SubString
@ .\strings\substring.jl:40 [inlined]
[5] view
@ .\strings\substring.jl:50 [inlined]
[6] show(io::IOContext{IOBuffer}, result::InferenceResult{Dict{Symbol, Array}, Vector{Real}, FactorGraphModel, Tuple{Vector{DataVariable{PointMass{Float64}, Rocket.RecentSubjectInstance{Message{PointMass{Float64}}, Subject{Message{PointMass{Float64}}, AsapScheduler, AsapScheduler}}}}, RandomVariable, Vector{RandomVariable}, Matrix{RandomVariable}, Vector{RandomVariable}}})
@ ReactiveMP C:\Users\kouww\.julia\packages\ReactiveMP\bdzHB\src\inference.jl:90
[7] show
@ .\multimedia.jl:47 [inlined]
[8] limitstringmime(mime::MIME{Symbol("text/plain")}, x::InferenceResult{Dict{Symbol, Array}, Vector{Real}, FactorGraphModel, Tuple{Vector{DataVariable{PointMass{Float64}, Rocket.RecentSubjectInstance{Message{PointMass{Float64}}, Subject{Message{PointMass{Float64}}, AsapScheduler, AsapScheduler}}}}, RandomVariable, Vector{RandomVariable}, Matrix{RandomVariable}, Vector{RandomVariable}}})
@ IJulia C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\inline.jl:43
[9] display_mimestring
@ C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\display.jl:71 [inlined]
[10] display_dict(x::InferenceResult{Dict{Symbol, Array}, Vector{Real}, FactorGraphModel, Tuple{Vector{DataVariable{PointMass{Float64}, Rocket.RecentSubjectInstance{Message{PointMass{Float64}}, Subject{Message{PointMass{Float64}}, AsapScheduler, AsapScheduler}}}}, RandomVariable, Vector{RandomVariable}, Matrix{RandomVariable}, Vector{RandomVariable}}})
@ IJulia C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\display.jl:102
[11] #invokelatest#2
@ .\essentials.jl:716 [inlined]
[12] invokelatest
@ .\essentials.jl:714 [inlined]
[13] execute_request(socket::ZMQ.Socket, msg::IJulia.Msg)
@ IJulia C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\execute_request.jl:112
[14] #invokelatest#2
@ .\essentials.jl:716 [inlined]
[15] invokelatest
@ .\essentials.jl:714 [inlined]
[16] eventloop(socket::ZMQ.Socket)
@ IJulia C:\Users\kouww\.julia\packages\IJulia\AQu2H\src\eventloop.jl:8
[17] (::IJulia.var"#15#18")()
@ IJulia .\task.jl:423`
I can't share the model publicly. This issue is just to archive this bug. Once we find a fix, we'll post it here.
Versions:
Julia: 1.7.2
[336ed68f] CSV v0.10.4
[8f4d0f93] Conda v1.7.0
[a93c6f00] DataFrames v1.3.4
[31c24e10] Distributions v0.25.58
[b3f8163a] GraphPPL v2.0.1
[7073ff75] IJulia v1.23.3
[b964fa9f] LaTeXStrings v1.3.0
[91a5bcdd] Plots v1.29.0
[92933f4c] ProgressMeter v1.7.2
[d330b81b] PyPlot v2.10.0
[a194aa59] ReactiveMP v2.0.3
[295af30f] Revise v3.3.3
[df971d30] Rocket v1.3.22
[37e2e46d] LinearAlgebra
The current implementation of PointMassFormConstraint aka EM-like MP doesn't support multivariate or matrix variate distributions.
There are couple of problems with the Laplace approximation:
This might be a meta flag for Gaussian nodes.
For example by default we might assume Hermittian structure and use meta = AssumeHermittian()
that fallbacks to fastcholesky
.
cholinv(::AssumeHermittian, something) = fastcholesky(something)
In other cases we might prever pinv()
, e.g. meta = AssumeNonHermittian()
cholinv(::AssumeNonHermittian, something) = pinv(something)
We have (tw @bvdmitri ) spotted an issue with running an important sampling strategy around the GammaMixture node.
We proposed this strategy n this paper.
The API for the form constraint has changed i.e.
as = randomvar(nmixtures, prod_constraint = ProdGeneric(), form_constraint = SampleListFormConstraint(rng, 1000, LeftProposal()))
compared to
as = randomvar(nmixtures, prod_strategy = as_prod_strategy, constraint = as_constraint)
Currently Julia simply gives big, somewhat unreadable stacktrace on the following model:
@model function mymodel()
r ~ MvNormalMeanPrecision(..., ...)
...
... ~ r[i] ...
end
We should inform users that indexing of random variables is not allowed. Indexing (with slices included) of constvar
s and possibly datavar
s should be allowed though.
Currently DistProduct
creates a 'product-tree' even if all types of distributions are identical. That leads to a significant (and redundant!) compilation latencies. We should implement similar product linearisation procedure as we already have for logpdf-messages here:
https://github.com/biaslab/ReactiveMP.jl/blob/master/src/distributions/function.jl#L183
Hey everyone!
Thanks for the work put in ReactiveMP and the surrounding projects, it looks very interesting!
This is, perhaps, more of a question than an issue, so please feel free to redirect it to a more suitable platform, if any.
I've been trying to build a form of a Switching Linear Dynamical System (SLDS), which has the transition of a continuous latent variable (x
) conditioned on the value of a categorical latent variable (s
). Attached DAG attempts to illustrates it (note that I am using a continuous auxiliary variable α
to condition x
).
The complete attempt with some details can be found in this notebook, but in the very nutshell:
Transition
to condition α
on s
(much like the emitted value y
in your HMM demo) α ~ Transition(s[t], Tα) # typeof(Tx) == MatrixDirichlet{Float64, Matrix{Float64}}
Tx
to compute µ for the next x
Tx ~ 1 + (-dt) * α # typeof(dt) == Float64
x[t] ~ NormalMeanVariance(Tx * x_prev, σx)
My most naive implementation of the inference silently returns empty buffers and I am clueless how to debug it. The notebook includes a working, slightly simpler form of the model without the latent s
, and this is the only reason I assume I am misusing α ~ Transition
or node generation for Tx
... Please, let me know if there is a piece of documentation I am missing.
Hi guys,
(@ismailsenoz @semihakbayrak @bvdmitri)
How sure we are about the BP message for :var
interface that is associated with variance?
We had a look a few days ago with @bartvanerp and it seemed to us that the normalization of the message is incorrect.
I see there are tests for these rules, but I am still concerned about some missing factors in:
ContinuousUnivariateLogPdf(DomainSets.HalfLine(), (x) -> -log(m_μ_cov + x) / 2 - 1 / (2x) * (m_out_mean - m_μ_mean) ^ 2)
There is an issue with BP message from GaussianMeanVariance
node. The correct version when both :mean
and :out
inbounds messages are Gaussians should be:
(x) -> -log(m_μ_cov + m_out_cov + x) / 2 - 1 / (2*(m_μ_cov + m_out_cov + x)) * (m_out_mean - m_μ_mean) ^ 2)
I am trying to apply ReactiveMP to Winn and Bishop's skill model. I am starting ultra simple and will build up. I just learned to do this in ForneyLab with help from the developers. I hope to do the same here. What I have so far, modeling my efforts to mimic the Coin Flip problem since my model is discrete. Of course I get an error.
using Rocket, GraphPPL, ReactiveMP, Distributions, Random
@model function toy3(n)
y = datavar(n) # 3 questions
csharp ~ Bernoulli(0.5)
sql ~ Bernoulli(0.5)
isCorrect1 ~ Bernoulli(Addnoise(csharp))
isCorrect2 ~ Bernoulli(Addnoise(sql))
return y, isCorrect1, isCorrect2
end
dataset = [1., 0.]
result = inference(
model = Model(toy3, length(dataset)),
data = (y = dataset, )
)
Error:
lid type object was expected but '2' has been found
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] ensure_type(x::Int64)
@ GraphPPL ~/.julia/packages/GraphPPL/KNm3q/src/model.jl:22
[3] macro expansion
@ ./In[56]:4 [inlined]
[4] macro expansion
@ ~/.julia/packages/GraphPPL/KNm3q/src/model.jl:347 [inlined]
[5] create_model(::Type{toy3}, toy3constraints_in#451::Nothing, toy3meta_in#453::Nothing, toy3options_in#455::NamedTuple{(), Tuple{}}, n::Int64)
@ Main ~/.julia/packages/GraphPPL/KNm3q/src/backends/reactivemp.jl:73
[6] create_model(generator::ReactiveMP.ModelGenerator{toy3, Tuple{Int64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Nothing, Nothing, Nothing}, constraints::Nothing, meta::Nothing, options::NamedTuple{(), Tuple{}})
@ ReactiveMP ~/src/2022/ReactiveMP.jl/src/model.jl:42
[7] inference(; model::ReactiveMP.ModelGenerator{toy3, Tuple{Int64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Nothing, Nothing, Nothing}, data::NamedTuple{(:y,), Tuple{Vector{Float64}}}, initmarginals::Nothing, initmessages::Nothing, constraints::Nothing, meta::Nothing, options::NamedTuple{(), Tuple{}}, returnvars::Nothing, iterations::Int64, free_energy::Bool, showprogress::Bool, callbacks::Nothing, warn::Bool)
@ ReactiveMP ~/src/2022/ReactiveMP.jl/src/inference.jl:274
[8] top-level scope
@ In[57]:3
[9] eval
@ ./boot.jl:373 [inlined]
[10] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base ./loading.jl:1196
Obviously, there is some basic concept I do not understand, although I thought I did. What could have gone amiss? Thanks for any input!
Gordon
inference
function should check and report at what node or edge Free Energy computation fails with NaN
As discussed, there is an odd artefact where the fit gets great almost immediately across the model except for the beginning, which then takes up to 20x iterations to converge (linked more to the array size).
Result with 20 iterations (observe the beginning):
Result with 365 iterations (=N
)
MWE for a simple curve-fitting scenario
# Generate data to fit
time_max=365
noise_scale=0.05
time_index=collect(1:time_max)/time_max
p=180/time_max
growth=2
offset=0
y=offset .+ sin.(time_index*2π/p) .+ growth.*time_index .+rand(Normal(0,noise_scale),time_max)
plot(y)
# Generate splines to approximate the function
# Note: boundary knots are important to be outside of the needed range to avoid a row of all zeros (which breaks the backprop)
X=Splines2.bs(time_index,df=10,boundary_knots=(-0.01,1.01));
# Build the model
@model function linreg(X,n,dim_x)
T=Float64
y = datavar(T, n)
aux = randomvar(n)
sigma ~ GammaShapeRate(1.0, 1.0)
intercept ~ NormalMeanVariance(0.0, 2.0)
beta ~ MvNormalMeanPrecision(zeros(dim_x), diageye(dim_x))
for i in 1:n
aux[i] ~ intercept + dot(X[i,:], beta)
y[i] ~ NormalMeanPrecision(aux[i], sigma)
end
return beta,aux,y
end
constraints = @constraints begin
q(aux, sigma) = q(aux)q(sigma)
end
# Run inference
@time results = inference(
model = Model(linreg,X,size(X)...),
data = (y = y,),
constraints = constraints,
initmessages = (intercept = vague(NormalMeanVariance),),
initmarginals = (sigma = GammaShapeRate(1.0, 1.0),),
returnvars = (sigma = KeepLast(),beta = KeepLast(), aux = KeepLast()),#,y=KeepLast()),
iterations = 20,
warn = true,
free_energy=true
)
# Plot results
# Note: observe the divergence in the first 50 data points
# It disappears as you increase number of iterations
plot(mean.(results.posteriors[:aux]), ribbon = (results.posteriors[:sigma]|>mean|>inv|>sqrt),label="Fitted")
plot!(y,label="Observed data")
I have a linear Gaussian state-space model with D
separate states evolving according to a random walk,
z_dk ~ N(z_dk-1, 1.0)
.
That is, a set of D
Gaussian random variables instead of a D
-dimensional Gaussian random vector. This is a workaround for constraining the states to be independent (constraining the covariance matrix to be diagonal).
My measurement is a linear combination of these individual states:
x_k = \sum_{d=1}^D H_d z_dk
.
GraphPPL does not support indexing a column of a randomvar object, i.e., H*z[:,k]
and also does not support multiple addition operations during variable definition, i.e., x_k ~ H[1]*z[1,k] + H[1]*z[2,k] + ..
. As such, I have constructed the linear combination with the following for-loop:
x[1,k] ~ H[1]*z[1,k]
for d = 2:D
x[d,k] ~ x[d-1,k] + H[d]*z[d,k]
end
where x = randomvar(D,T)
and z = randomvar(D,T)
. So, basically, the rows of x
serve as the intermediate variables in the linear combination. The final x[D,k]
should be the result I need.
However, inference
returns a mismatch between manually taking the linear combination H*mean.(qz[:,k])
and mean(qx[k])
.H*z
explodes as I increase the number of iterations (which makes me suspect some form of numerical instability). But everything is Gaussian and I should be able to do sum-product message passing here.
It is best explained with an example:
# Preallocate
z = randomvar(D,T)
x = randomvar(D,T)
y = datavar(Float64, T)
### k = 1
# Initial state
for d = 1:D
z[d,1] ~ NormalMeanVariance(0.0, 1.0)
end
# Measurement function
x[1,1] ~ H[1]*z[1,1]
for d = 2:D
x[d,1] ~ x[d-1,1] + H[d]*z[d,1]
end
# Likelihood
y[1] ~ NormalMeanVariance(x[D,1], 1.0)
### k >= 1
for k = 2:T
# State transition
for d = 1:D
z[d,k] ~ NormalMeanVariance(z[d,k-1], 1.0)
end
# Measurement function
x[1,k] ~ H[1]*z[1,k]
for d = 2:D
x[d,k] ~ x[d-1,k] + H[d]*z[d,k]
end
# Likelihood
y[k] ~ NormalMeanVariance(x[D,k], 1.0)
end
return y, x, z
end
ReactiveMP.make_actor(x::Matrix{ <: RandomVariable }, ::ReactiveMP.KeepLast) = buffer(Marginal, size(x))
ReactiveMP.make_actor(::Matrix{ <: RandomVariable }, ::ReactiveMP.KeepEach) = keep(Matrix{Marginal})
# Test data
T = 100
D = 7
H = randn(7)
y = randn(T)
# Define factorization
constraints = @constraints begin
q(x,z) = q(x)q(z)
end
# Automatic inference API
results = inference(
model = Model(mwe, H, D=D, T=T),
data = (y = y,),
constraints = constraints,
options = (limit_stack_depth = 100,),
initmessages = (
z = NormalMeanVariance(0.0, 1.0),
x = NormalMeanVariance(0.0, 1.0)
),
initmarginals = (
z = NormalMeanVariance(0.0, 1.0),
x = NormalMeanVariance(0.0, 1.0)
),
returnvars = (z = KeepLast(), x = KeepLast()),
free_energy = true,
showprogress = true,
iterations = 100
)
# Extract approximate posteriors
qz = results.posteriors[:z]
qx = results.posteriors[:x]
# Plot difference between H*z and x
plot(1:T, [H'*mean.(qz[:,k]) for k=1:T], color="purple", label="H*qz")
plot!(1:T, [mean(qx[D,k]) for k=1:T], color="green", label="qx")
Versions:
Do we have a procedure for inspecting specific messages during inference?
In ForneyLab, the step!(data, marginals, messages)
functions accepted an empty Array{Message}()
that would be populated during inference. That array could be inspected and visualized, which was useful for debugging and education purposes (see for instance here).
I had a quick look through ReactiveMP's API (e.g., https://biaslab.github.io/ReactiveMP.jl/stable/lib/message/) but couldn't find a method for accessing specific messages sent by nodes. I know you can expose individual nodes through node, variable ~ Node(args...)
(GraphPPL API), but does that also let you access messages belonging to that node?
I imagine that if we have a reference to a specific message, then we can define an observable that subscribes to it and keeps the messages computed during inference for later inspection.
We now propagate vectors with elements in [0, 1], but what if we would instead propagate the log of these elements for better robustness? I think this might also get rid of having to clamp the variables.
Found by @ismailsenoz
We should use stricter lower and upper bounds for various distributions in point mass form constraint optimisation. For example for a Gamma
distributions a safe optimisation domain should be probably [ shape, Inf ]
where shape
refers to the shape
of the prior distribution. Now we use [ 0.0, Inf ]
and it may cause instability problems.
We need to support custom optimisation domains in the same way we support custom starting optimisation point.
Related to make_node
function. It may not error though if variable is AutoVar
.
Approximation methods section is a bit outdated in comparison to other parts of the package and has been implemented long ago. It need revision, especially methods like GaussHermite
, GaussLaguerre
, approximate_meancov
and others. We need to come up with a similar interface for all approximation methods or maybe just exploit another package?
Approximation methods and their code is located under src/approximations/
folder:
We use it here:
https://github.com/biaslab/ReactiveMP.jl/blob/c8fab6b4d78f704ab3ba1311005ce6204efdda15/src/nodes/probit.jl#L34
https://github.com/biaslab/ReactiveMP.jl/blob/c7e60f907393bd732933ffb1503af88a38e2e0b0/src/distributions/exp_linear_quadratic.jl#L18
https://github.com/biaslab/ReactiveMP.jl/blob/c7e60f907393bd732933ffb1503af88a38e2e0b0/src/nodes/kernel_gcv.jl#L26
and maybe somewhere else..
First I want to say: this is sick package! Nice work!
In benchmarks/notebooks
you have a notebook benchmarking ReactiveMP.jl vs. Turing.jl on a Linear Gaussian State-Space Model.
If you make the following changes to the Turing.jl-section, you get a 30X speedup.
First off, don't make the x
variable untyped, i.e. instead define the model as
Turing.@model function LinearGaussianSSM(y, A, B, P, Q, ::Type{TV} = Vector{Vector{Float64}}) where {TV}
n = length(y)
# State sequence.
x = TV(undef, n)
# Observe each point of the input.
x[1] ~ MvNormal([ 0.0, 0.0 ], 1e1)
y[1] ~ MvNormal(B * x[1], Q)
for t in 2:n
x[t] ~ MvNormal(A * x[t - 1], P)
y[t] ~ MvNormal(B * x[t], Q)
end
end;
Secondly, use ReverseMode.jl + Memoization.jl to use reverse-mode AD with the tape compiled, i.e. add:
using ReverseDiff, Memoization
Turing.setadbackend(:reversediff)
Turing.setrdcache(true)
before the sampling statements.
On my end, this brings us to 1s and 2s for the respective.
Of course, ReactiveMP.jl is much faster than is:) But I was reviewing the paper and had a look at the benchmarks, upon which I noticed that the current implementation was somewhat lacking.
And for future reference, these things are described more in detail here: https://turing.ml/dev/docs/using-turing/performancetips.
As a side-note, (from one PPL-dev to another 😉) I'd recommend trying to ping someone from the PPLs you're comparing to when making benchmarks. This way you can ensure that you get a fair comparison 👍 Of course you have to be a bit careful in what you allow, e.g. as I Turing.jl, I could further optimize the heck out this model, but I refrain from doing that as it's not really representative of what your trying to demonstrate/what a regular user would be doing. But I think the change I proposed above falls within "reasonable alterations", as we would expect performance-sensitive users to look at the docs for performance tips.
EDIT: There are also alternative samplers which would be very well-suited for this problem, e.g. Elliptical Slice Sampling (Turing.ESS
), which would scale even better than HMC
with reverse-mode wrt. number of data-points.
This error might occur while using the automatic inference block. For example:
result = inference(
model = Model(model),
data = (y = 0.3,),
returnvars = (x = KeepLast())
)
Julia: 1.7.2
[b3f8163a] GraphPPL v2.0.1
[bdcacae8] LoopVectorization v0.12.108
[92933f4c] ProgressMeter v1.7.2
[a194aa59] ReactiveMP v2.0.1
[df971d30] Rocket v1.3.20
I would like to report that Cell [3]:
result = inference(
model = Model(coin_model, length(dataset)),
data = (y = dataset, )
)
does not run, because dataset
is not defined. Thanks.
Currently nodes from Gaussian distributions family accept messages of the same parametrisation only.
E.g.
@rule NormalMeanPrecision(:μ, Marginalisation) (m_out::NormalMeanPrecision, q_τ::Any) = NormalMeanPrecision(mean(m_out), cholinv( cov(m_out) + cholinv(mean(q_τ)) ))
We need to generalise it with UnivariateNormalDistributionsFamily
Are the operations as Add, Subtract, Multiply available only for Gaussian nodes?
As ReactiveMP.jl
uses Rocket.jl
under the hood, the inference (unlike model specification) might look a little frightening for new users. Therefore, to make "ReactiveMP.jl" more user-friendly, we must develop clean and flexible APIs for inference procedures.
First, we should notice that every inference procedure comprises a set of routines almost independent of the model specification.
Generally speaking, all inference functions take two primary inputs: data
and initial marginals
.
data
is typically used for supplying priors and observations to the model.
initial marginals
speak for themselves. They carry the information about initial marginal distributions used in "non-belief-propagation" inference algorithms.
Typically, all inference functions exhibit the following routine:
initial marginals
We can't overlook the possibility of automation of this routine.
As a start, we propose to have the following API:
rvs_, fe_ = inference(model, data, marginals, constraints=nothing, options=nothing)
model
returns random variables of the graphical model
data
and marginals
carry, for example, dictionaries needed for the algorithm to run; constraints will serve as a placeholder (at some point, constraints will be separated from the model definition).
options
should carry the information about the number of variational iterations, the parameters that inference should return, and the type of the return, e.g., whether the user wants to obtain the posterior of the last VMP iteration or the posteriors over all iterations.
I suggest that we firstly settle on these things:
When running an HMM, FE computation fails due to 0 entries. For example in the model
@model function t_maze(A,D,B,T)
Ac = constvar(A)
z_0 ~ Categorical(D)
z = randomvar(T)
x = datavar(Vector{Float64}, T)
z_prev = z_0
for t in 1:T
z[t] ~ Transition(z_prev,B[t])
x[t] ~ Transition(z[t], Ac)
z_prev = z[t]
end
end
We are unable to compute FE if D contains 0's
I've been looking into the computation of average energy for the Inverse-Wishart
node.
So I had a look at the computation of this quantity for Wishart
node as well.
It appears to me that the average energy computation for the Wishart
node is incorrect, i.e.,
@average_energy Wishart (q_out::Any, q_ν::Any, q_S::Any)
You would normally expect q_out
to be Wishart distribution, meaning that we'd need to take the log expectation of q_out
, not just logdet(m_q_out)
where m_q_out = mean(q_out)
.
It seems to me that ForneyLab.jl does it properly. Is there something I miss?
At the moment it's challenging to define a mean-field Multivariate Normal distribution (ie, with a diagonal covariance matrix).
Eg, assuming I have a regression y=beta * X + eps
, where beta is MvNormal of dimension dim_x
At the moment I would can easily use MvNormal and fit the full covariance matrix
beta ~ MvNormalMeanPrecision(zeros(dim_x), diageye(dim_x))
for i in 1:n
y[i] ~ NormalMeanPrecision(dot(X[i,:],beta), 1.)
end
If I wanted a mean-field MvNormal instead (eg, to have faster fitting), I would have to jump through a lot of hoops (eg, unroll dot product etc), which would ultimately give me no benefits
# create a RV with dimension `dim_x`
beta ~ randomvar(dim_x)
# since I cannot use dot() with randomvar(), I would need to unroll it
# I would need some temporary variable
# because I cannot accumulate (+=), I would need to save intermediate results into a helper dimension
temp ~ randomvar(n,dim_x)
for i in 1:n
# unsafe! just to show the needed steps
# assumes dim_x>1
temp[i,1]=X[i,1] * beta[1]
for j in 2:dim_x
temp[i,j] ~ temp[i,j-1] + X[i,j] * beta[j]
end
# and the use it for the normal likelihood
y[i] ~ NormalMeanPrecision(loc[i,dim_x], 1.)
end
As a user, I'd like to define something like
beta ~ randomvar(Normal(0,1),n)
# or ala Turing when all dists are the same
beta ~ filldist(Normal(0,1),n)
# or if they are different
beta ~ arraydist([Normal(mu,1) for mu in mu_array])
Ideally, this object would then support all the operators like MvNormal (eg, dot product)
It would be helpful when dim_x
gets large (eg, 1000s)
This is not really a feature request but rather a discussion item that concerns distributions like NormalMeanVariance()
, NormalMeanPrecision()
etc.
Rather than
xt_min ~ NormalMeanVariance(xt_min_mean, xt_min_var)
is there a reason why we dont write
xt_min ~ Normal( mean=xt_min_mean, var=xt_min_var)
This way you can also write
xt_min ~ Normal( natural_mean=xt_m, precision=xt_v)
You get the idea. The parameter list determines whether you have the moment or information parameterization scheme.
I think it's cleaner and more consistent with other notation. Eg, we dont write GammaShapeScale
but just Gamma
for the Gamma distribution and all the exponential family distributions have moment and canonical parameterizations, so the user would not know if you use GammaShapeScale
or GammaShapeRate
. Here, I would also recommend
Gamma(shape= ..., scale= ...)
to make the chosen parameterization clear.
Is it possible to set the keyword returnvars
in inference()
function to keep only the last distribution from all random variables?
Eg,
results = inference(
model = Model(m, x),
data = (y = y,),
initmessages = (
a = vague(NormalMeanVariance)
),
initmarginals = (sigma = GammaShapeRate(1.0, 1.0),),
# a way to KeepLast() for all variables
returnvars = KeepLast()
)
When running variational message passing, ReactiveMP.jl
updates all random variables.
For prototyping/experimenting, it is sometimes useful to block some of the marginal updates.
For now, I see only one way of enforcing this behavior by substituting randomvar
with datavar
.
This is option is undesirable as it requires some changes in model/inference specification.
I have what seems to be a trivial issue with randomvar()
. The Stable ReactiveMP 2.0 documentation shows many examples in the Model Specification section that use randomvar()
. Yet, in the REPL of Atom, this function is not recognized. Instead, there are 12 variations of the function with different signatures. I could use one of these of course, but I am wondering why the an apparent discrepancy between the documentation and the actual methods. Thanks.
Hi,
I am trying to figure out how to convert , as much as possible, a model that are using non-Conjugate Priors.
I am like very much Statistical Rethinking book and resources to guide me on modeling.
For example, how you could represent the following Statistical Rethinking Turing model in ReactiveMP?
It uses a GLM with an exponential Link function.
I am thinking using ReactiveMP.jl
for inference of AR-HMM, which requires inference on DAG. So ,my question is, can ReactiveMP handles general DAG.
I'm new to this field, but I did some reading, and now know that there is a special type of message passing algorithm called junction tree algorithm is required to inference on a DAG, rather than normal message passing for a tree. I greeped around the files with the term "junction" and couldn't find the word, but I couldn't determine if inference on a DAG is possible from it due to my shortage of the knowledge, which is why I am asking this question.
Just for an information, I paste a graphical model of AR-HMM. (Stanculescu, Ioan et al, Autoregressive hidden Markov models for the early detection of neonatal sepsis." IEEE journal of biomedical and health informatics 18.5 (2013): 1560-1570. )
Feedback from users:
It is bad practice (common to many programming languages) to import all methods implicitly, e.g., from numpy import *
, as it might silently overwrite a reference. So, users would like to explicitly import ReactiveMP: update!, ...
instead of using ReactiveMP
.
However, the documentation does not list all methods that ReactiveMP exports. Adding the remainder would be helpful.
Given the recent (and old) updates of ReactiveMP.jl
, the current README does not fully represent what our framework is capable of.
The framework works beyond only conjugate SSMs, so we need to highlight how to achieve inference in these model classes.
Things to change:
We should add the switching gaussian controlled variance SGCV
node.
I was trying to update a parameter and I faced this error:
x = datavar(PointMass{Float64}, ...)' accepts data of type PointMass{Float64}, but Irrational{:π} has been supplied. Check 'update!(x, data::Irrational{:π})' and explicitly convert data to type PointMass{Float64}.
This is how I updated the x parameter with the goal:
goal = [π for t in 1:T]
(x, x_0, z, znodes, u, zcm, zcc) = model.model_variables
update!(x, goal)
And this is my model:
@model function pendulum_controller(T, A, B, Q, C, R, m_u, v_u)
cA = constvar(A) # Transition matrix
cB = constvar(B) # Control matrix
cC = constvar(C) # Observation matrix
cQ = constvar(Q) # Process noise covariance matrix
cR = constvar(R) # Measurement noise covariance matrix
# Vector of reference to factor nodes of latent states
znodes = Vector{FactorNode}(undef, T)
z = randomvar(T) # Hidden states
u = randomvar(T) # Controls
x = datavar(Float64, T) # Goal observations
# Parameters of prior state
zcm = datavar(Vector{Float64})
zcc = datavar(Matrix{Float64})
# Prior state
z_prior ~ MvNormalMeanCovariance(zcm, zcc)
z_prev = z_prior
# Current observation for filtering
x_0 = datavar(Float64)
x_0 ~ NormalMeanVariance(dot(cC, z_prior), cR)
# Extend the model T steps into the future
for t in 1:T
# Control prior at t
u[t] ~ NormalMeanVariance(m_u, v_u)
# State transition at t
znodes[t], z[t] ~ MvNormalMeanCovariance(cA * z_prev + cB * u[t], cQ)
# Likelihood at t
x[t] ~ NormalMeanVariance(dot(cC, z[t]), cR)
# Update previous state
z_prev = z[t]
end
return x, x_0, z, znodes, u, zcm, zcc
end
help?> convolve
search: convolve ConcurrencyViolationError
convolve(d1::Distribution, d2::Distribution)
Convolve two distributions and return the distribution corresponding to the sum of
independent random variables drawn from the underlying distributions.
Currently, the function is only defined in cases where the convolution has a closed
form. More precisely, the function is defined if the distributions of d1 and d2 are
the same and one of
• Bernoulli
• Binomial
• NegativeBinomial
• Geometric
• Poisson
• Normal
• Cauchy
• Chisq
• Exponential
• Gamma
• MvNormal
External links: List of convolutions of probability distributions on Wikipedia
(https://en.wikipedia.org/wiki/List_of_convolutions_of_probability_distributions)
In the current implementation of the GCV
node the parameters kappa
and omega
are not being updated.
Update rules lack mean-field approximation which makes it impossible to connect the output of the GCV
node to a pointmass variable.
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
https://github.com/biaslab/ReactiveMP.jl/blob/master/src/rule.jl#L737
in case of structured rules we do not have a direct index of interface
I want to specify a linear Gaussian dynamical system with constraints on the process noise covariance matrix.
I am starting out with the constraint that the matrix should be diagonal. For the static case, this can be done with the following MWE model:
@model function diagN(; D=2)
Id = diageye(D)
# Allocate variables
τ = randomvar(D)
X = randomvar(D)
x = randomvar(D)
y = datavar(Float64)
# Define variances
for d = 1:D
τ[d] ~ Gamma(1.0, 1.0)
end
# Define first element
x[1] ~ NormalMeanPrecision(0.0, τ[1]) where { q = MeanField() }
# Expand to vector
X[1] ~ Id[:,1]*x[1]
for d = 2:D
# Define later elements
x[d] ~ NormalMeanPrecision(0.0, τ[d]) where { q = MeanField() }
# Expanded to vector and accumulate
X[d] ~ X[d-1] + Id[:,d]*x[d]
end
# Likelihood
y ~ NormalMeanPrecision(dot(ones(D), X[D]), 1.0) where { q = MeanField() }
return y, x, τ
end
result = inference(
model = Model(diagN, D=2),
data = (y = 0.3,),
options = (limit_stack_depth = 100,),
returnvars = (x = KeepLast(), τ = KeepEach()),
initmarginals = (x = NormalMeanPrecision(0.0, 1.0),),
free_energy = true,
iterations = 100,
showprogress = true
)
The multivariate Gaussian X
is constructed from a cumulative sum of basis expanded univariate Gaussians x
, each with its own Gamma distributed noise precision parameter (the final X[D]
is the complete r.v.).
Now, I'd like to extend this to the dynamical setting. I've defined arrays of randomvars with dynamics on the univariate states. I've used the earlier construction to construct a multivariate Gaussian each time-step, which is fed to the likelihood. My MWE model is:
@model function SSM_diagN(; D=2, T=100)
# Aux matrix
Id = diageye(D)
# Allocate
Z = randomvar(D,T)
z = randomvar(D,T)
y = datavar(Float64, T)
### k=1
# Define first element
z[1,1] ~ NormalMeanPrecision(0.0, 1.0) where { q = MeanField() }
# Expand to vector
Z[1,1] ~ Id[:,1]*z[1,1]
for d = 2:D
# Define later elements
z[d,1] ~ NormalMeanPrecision(0.0, 1.0) where { q = MeanField() }
# Expand to vector and accumulate
Z[d,1] ~ Z[d-1,1] + Id[:,d]*z[d,1]
end
# Emit observation
y[1] ~ NormalMeanPrecision(dot(ones(D), Z[D,1]), 1.0) where { q = MeanField() }
### k>=1
for k = 2:T
# Define first element
z[1,k] ~ NormalMeanPrecision(z[1,k-1], 1.0) where { q = MeanField() }
# Expand to vector
Z[1,k] ~ Id[:,1]*z[1,k]
for d = 2:D
# Define later elements
z[d,k] ~ NormalMeanPrecision(z[d,k-1], 1.0) where { q = MeanField() }
# Expand to vector and accumulate
Z[d,k] ~ Z[d-1,k] + Id[:,d]*z[d,k]
end
# Emit observation
y[k] ~ NormalMeanPrecision(dot(ones(D), Z[D,k]), 1.0) where { q = MeanField() }
end
return y, z, Z
end
D = 2
T = 10
result = inference(
model = Model(SSM_diagN, D=D, T=T),
data = (y = sin.(1:T),),
options = (limit_stack_depth = 100,),
initmessages = (z = NormalMeanPrecision(0.0, 1.0), Z = MvNormalMeanPrecision(zeros(D), diageye(D))),
initmarginals = (z = NormalMeanPrecision(0.0, 1.0), Z = MvNormalMeanPrecision(zeros(D), diageye(D))),
returnvars = (z = KeepLast(),),
free_energy = true,
iterations = 10,
showprogress = true
)
My problem is that I'm apparently feeding a Matrix{Marginal}
to a Vector{Marginal}
r.v. I'm having trouble seeing where exactly. My stack trace:
Actor of type BufferActor{Marginal, Vector{Marginal}} expects data to be of type Vector{Marginal}, while subscribable of type ProxyObservable{Matrix{Marginal}, Rocket.CollectLatestObservable{Marginal, Matrix{ProxyObservable{Marginal, ReactiveMP.MarginalObservable, Rocket.FilterProxy{ReactiveMP.var"#53#54"}}}, Matrix{Marginal}, typeof(copy)}, Rocket.TapProxy{ReactiveMP.var"#614#615"{ReactiveMP.MarginalHasBeenUpdated}}} produces data of type Matrix{Marginal}.
I've done a few checks.
typeof(Id[:,1])
= Vector{Float64} (alias for Array{Float64, 1})
initmessages
and initmarginals
.But I'm at a loss of what to try next. Any suggestions?
Versions:
Julia: 1.7.2
[b3f8163a] GraphPPL v2.0.1
[bdcacae8] LoopVectorization v0.12.108
[92933f4c] ProgressMeter v1.7.2
[a194aa59] ReactiveMP v2.0.1
[df971d30] Rocket v1.3.20
I was working on creating a efficient Kalman filter in ReactiveMP and wanted to combine the equality and multiplication node are combined according to Table 4 in (Loeliger, 2007, The factor graph approach to model-based signal processing) as in ReactiveBayes/GraphPPL.jl#28 . The message towards the future then corresponds to the new prior, etc.. In order to terminate the graph I used the Uninformative
node to "close" the future. For efficiency it would here not suffice to put an uninformative normal prior on the future, as this would lead to unnecessary inversions.
Currently the Uninformative
node is a Deterministic
node. As a result, the variables connected to it are constant variables and are not inferred during inference. In my case I call:
x ~ Uninformative()
y ~ Node(x)
so perhaps this particular ordering is something which causes the issues.
A fix would be to convert the Uninformative
node to be a Stochastic
node. By definition it should not influence the calculations in the rest of the model and therefore I propose to set its average energy to 0. However, this differs from the average energy of Normal priors with a very large variance:
https://github.com/biaslab/ReactiveMP.jl/blob/cee5ccd9d944c7746a9bc7e2a80b9e75bcfc9753/src/nodes/normal_mean_precision.jl#L15-L18
I think that it might be okay to set its average energy to 0 as it truly does not influence inference, contrary to the uninformative gaussian case where the zero mean could be propagated (informatively) using VMP.
Currently, ReactiveMP.jl
does not support inference in Belief networks composed of Bernoulli random variables (RVs).
One good example of such a network is shown in Chapter 2 of Bishop's Model-Based Machine Learning book.
The dependencies between these variables are enforced by logic functions, e.g., AND
, NOT
, OR
, etc.
ForneyLab.jl
has the demo on building AND
node.
A related issue was initially opened in ForneyLab.jl
.
In the same way as we have RequireInbound
for messages we may want to add RequireInboundMarginal
or smth similar to request marginals on inbound edges for further messages computation.
This is more of a question than an issue. (If there's a better place to ask it please let me know.)
I'm hoping to do inference on networks of (mostly) discrete random variables. ReactiveMP sounded like it might be well suited to my applications, because message passing is likely to perform a lot better than the MCMC algorithms that other PPLs tend to use.
However, I can't get some simple examples to work, and I'm wondering whether ReactiveMP can be used for this sort of thing or if I'm just barking up the wrong tree.
Here's the simplest thing I tried, modifying the coin flip example from the getting started page:
@model function coin_model()
y = datavar(Float64, 1)
θ ~ Bernoulli(0.5)
y[1] ~ Bernoulli(θ)
return y, θ
end
result = inference(
model = Model(coin_model),
data = (y = Vector([1.0]), )
)
Here the 'parameter' θ is drawn from a Bernoulli distribution and y is just a copy of θ. We condition on y=1, so we hope to find that the posterior for θ is also concentrated on 1. However, it gives this error message instead (plus a stack trace):
No analytical rule available to compute a product of distributions Bernoulli{Float64}(p=0.5) and Beta{Float64}(α=2.0, β=1.0).
I understand this error, though it seems that calculating the product of those distributions analytically should be straightforward (you just evaluate the beta distribution at each point in the support of the Bernoulli one). Does this mean this kind of thing isn't supported?
I also tried some variations on this slightly less silly model, where y is a noisy copy of θ instead of an exact copy:
@model function coin_model()
y = datavar(Float64, 1)
θ ~ Bernoulli(0.5)
p ~ 0.2 + (θ*0.6)
y[1] ~ Bernoulli(p)
return y, θ
end
result = inference(
model = Model(coin_model),
data = (y = Vector([1.0]), )
)
but I couldn't get any version of this to work either. (This version doesn't seem to like the +
.)
In short my question is whether I can use ReactiveMP to build up networks of discrete random variables and perform inference on them, or is this not supported?
Thanks to @svilupp
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.