Giter Site home page Giter Site logo

ecmerkle / blavaan Goto Github PK

View Code? Open in Web Editor NEW
77.0 11.0 22.0 261.95 MB

An R package for Bayesian structural equation modeling

Home Page: https://ecmerkle.github.io/blavaan

R 83.48% Stan 14.22% C++ 0.01% TeX 2.29%
structural-equation-modeling latent-variables factor-analysis growth-curve-models path-analysis multilevel-models multivariate-analysis psychometrics missing-data bayesian-statistics

blavaan's Introduction

blavaan

R build status

blavaan is a free, open source R package for Bayesian latent variable analysis. It relies on JAGS and Stan to estimate models via MCMC.

The blavaan functions and syntax are similar to lavaan. For example, consider the Political Democracy example from Bollen (1989):

library(lavaan) # for the PoliticalDemocracy data
library(blavaan)

model <- '
   # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + y2 + y3 + y4
     dem65 =~ y5 + y6 + y7 + y8
   # regressions
     dem60 ~ ind60
     dem65 ~ ind60 + dem60
   # residual covariances
     y1 ~~ y5
     y2 ~~ y4 + y6
     y3 ~~ y7
     y4 ~~ y8
     y6 ~~ y8
'
fit <- bsem(model, data = PoliticalDemocracy)
summary(fit)

The development version of blavaan (containing updates not yet on CRAN) can be installed via the command below. Compilation is required; this may be a problem for users who currently rely on a binary version of blavaan from CRAN.

remotes::install_github("ecmerkle/blavaan", INSTALL_opts = "--no-multiarch")

For further information, see:

Merkle, E. C., Fitzsimmons, E., Uanhoro, J., & Goodrich, B. (2021). Efficient Bayesian structural equation modeling in Stan. Journal of Statistical Software, 100(6), 1–22.

Merkle, E. C., & Rosseel, Y. (2018). blavaan: Bayesian structural equation models via parameter expansion. Journal of Statistical Software, 85(4), 1–30.

blavaan is supported by the Institute of Education Sciences, U.S. Department of Education, Grant R305D210044, as well as NSF grants SES-1061334 and 1460719.

blavaan's People

Contributors

andrjohns avatar ecmerkle avatar lstmemery avatar maugavilla avatar olivroy avatar rdf310ca avatar tdjorgensen avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

blavaan's Issues

stan examples

Can you add a few Stan examples with bsem and bcfa for beginners?

small change needed in tests/tests.blavaanobject-methods.R

In lavaan dev 0.6-7, I changed the order of the arguments of lavPredict(). The second argument is now newdata= (to be consistent with predict()), instead of type=. Therefore, line 90 in tests.blavaanobject-methods.R:

lavPredict(fitlav, 'lv')[,1]) > .95) 

should be replaced by

lavPredict(fitlav, type='lv')[,1]) > .95)

Maybe there are more places in blavaan where this is needed (but they did not show up in my tests).
Yves.

Bug with multiple groups

It seems the latest version crashes when I add labels or equality constraints in a multiple group model. Without constraints it works well, but when I add things like weak invariance constraints I get this

This only happens with std.lv=T , with std.lv=F it runs fine

mod2 <- '
Positive =~ c(l1,l1)*PosAFF1 + c(l2,l2)*PosAFF2 + c(l3,l3)*PosAFF3
Agency =~ c(l4,l4)*Agency1 + c(l5,l5)*Agency2 + c(l6,l6)*Agency3

Positive ~~ c(1, NA)*Positive
Agency ~~ c(1, NA)*Agency
            '

> fit2_stan <-  bcfa(mod2, data=dat, std.lv=T, 
+               group="Gender",
+               n.chains = 3, burnin=5000,
+               sample=1000, target="stan", meanstructure=T)
[1] "Error in lamsign[l1, 1] : subscript out of bounds\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<subscriptOutOfBoundsError in lamsign[l1, 1]: subscript out of bounds>
Error in blavaan(mod2, data = dat, std.lv = T, group = "Gender", meanstructure = T,  : 
  blavaan ERROR: problem with translation from lavaan to MCMC syntax.

blavFitIndices & multiple groups

@maugavilla @TDJorgensen I believe there is a small issue with counting parameters, around the line referenced below. Basically, there is a variable "nvar" that I think is assumed to be a single number, but it contains multiple numbers when there are multiple groups (the value is being pulled from object@Model@nvar). It is also used in bgammahat and other computations, and I am not immediately sure whether you intended multiple values there. So I think either of your eyes would be helpful!

p <- ((nvar * (nvar + 1)) / 2)

One other thing: lavaan 0.6-13 does some robust fitMeasures computations, and there is a bug that leads to some errors here. It goes away if you use the github version of lavaan.

seed different chains while using Stan

I have written some basic tutorials on how to use blavaan (with jags) in the past: https://www.rensvandeschoot.com/tutorials/bayesian-regression-blavaan/ and now I am trying to also create a blavaan (Stan) version of these. However, some of the commands when switching from jags to Stan no longer work.

for example this works:

model.informative.priors1 <- 
                    '#the regression model with priors
                    diff ~ prior("dnorm(3,2.5)")*age + prior("dnorm(0,10)")*age2

                    #show that dependent variable has variance
                    diff ~~ diff

                    #we want to have an intercept (with normal prior)
                    diff ~ 1'

fit.bayes.infprior1 <- blavaan(model.informative.priors1, data=data, convergence= "auto",  test="none", seed=c(21,08,2018))

summary(fit.bayes.infprior1, fit.measures=TRUE, ci = TRUE, rsquare=TRUE)

But this no longer works:

model.informative.priors1 <- 
                    '#the regression model with priors
                    diff ~ prior("normal(3, 0.632)")*age + prior("normal(0, 0.316 )")*age2

                    #show that dependent variable has variance
                    diff ~~ diff

                    #we want to have an intercept (with normal prior)
                    diff ~ 1'

fit.bayes.infprior1 <- blavaan(model.informative.priors1, data=data,  target= "stan",  test="none", seed=c(21,08,2018))

summary(fit.bayes.infprior1, fit.measures=TRUE, ci = TRUE, rsquare=TRUE)

I understand from error messages that autoconvergence is not surported using stan, which is no problem, but I can also not het the seed to work. If I try to set a seed, when using stan, I get the following error:

Error in rsmcmc[1, 1, ] : incorrect number of dimensions

How can I still set a seed for the chains, in order to get a reproducible example while using Stan as a sampler?

Consider estimating marginal log-likelihood in Stan?

Ed

I was looking at the Stan code for the pre-compile model. Have you consider add the estimation of the marginal log-likelihood in the Stan model, instead of doing that in R?

Adding in the generated quantities block something like this (i think my indexing is incorrect)

##### row data log-likelihood
  if (has_data) {
    int obsidx[p + q];
    int r1;
    int r2;
    int grpidx;
    for (mm in 1:Np) {
      obsidx = Obsvar[mm,];
      r1 = startrow[mm];
      r2 = endrow[mm];
      grpidx = grpnum[mm];
       log_lik[mm] = multi_normal_lpdf(YX[r1:r2,1:Nobs[mm]] | Mu[grpidx, obsidx[1:Nobs[mm]]], Sigma[grpidx, obsidx[1:Nobs[mm]], obsidx[1:Nobs[mm]]]);
    }
#### model marginal log-likelihood
marg_log_lik = sum(log_lik); // 

This could be particularly useful with categorical indicators, I think would be easier to estimate this in Stan than outside

avoid nested lists

The bcontrol and mcmcextra arguments currently involve nested lists that are not entirely user friendly. Allow the user to directly add arguments in blavaan() that would automatically be passed to stan() or run.jags() or etc, without the nested lists.

Different MCMC sampler

Hey,

Do you plan to add support for a different MCMC sampler, such as the one used by rstanarm and rstan? In particular, a sampler that would not need external depencies :)

Thanks!!

Standardized Posteriors

Hi, is it possible to standardize the posterior distribution (all the draws) instead of just having the standardization of the point-estimates? Thanks a lot :)

convert custom jags model to a lavaan object?

I'm developing a custom jags model through blavaan using the mcmcfile=T option, then editing that file directly. I have successfully run that file in jags, but it doesn't return a lavaan object. My code later down the line requires it to be a lavaan object. Is there a way to pass a jags file through blavaan rather than lavaan syntax?

problem with translation from lavaan to MCMC syntax

Hi Ed, @bgoodri suggested that I come to you again for this question:

See the simulated dataset (test.csv) and a minimal example below for a pretty complex model that we are trying. We got this error message when trying to run the chunk below. The lavaan model syntax seems to be okay since it runs with lavaan::sem().

blavaan NOTE: ordinal models are new, please report bugs!
https://github.com/ecmerkle/blavaan/issues

[1] "Error in if (lamsign[l2, 1] == 1) l2 <- lamsign[l2, 2] : \n  argument is of length zero\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in if (lamsign[l2, 1] == 1) l2 <- lamsign[l2, 2]: argument is of length zero>
Error in blavaan(model_all_1, data = test, ordered = c("PSRA1
[test.csv](https://github.com/ecmerkle/blavaan/files/8751283/test.csv)
_1", "PSRA2_1",  : 
  blavaan ERROR: problem with translation from lavaan to MCMC syntax.
model_all_1 <- '
    # CFA
    PSRA_1 =~ fl1*PSRA1_1 + fl2*PSRA2_1 + fl3*PSRA3_1 + fl4*PSRA4_1 + fl5*PSRA5_1 + fl6*PSRA6_1 + fl7*PSRA7_1 + fl8*PSRA8_1 + fl9*PSRA9_1 + fl10*PSRA10_1 + fl11*PSRA11_1 + fl12*PSRA12_1 + fl13*PSRA13_1
    
    PSRA_2 =~ fl1*PSRA1_2 + fl2*PSRA2_2 + fl3*PSRA3_2 + fl4*PSRA4_2 + fl5*PSRA5_2 + fl6*PSRA6_2 + fl7*PSRA7_2 + fl8*PSRA8_2 + fl9*PSRA9_2 + fl10*PSRA10_2 + fl11*PSRA11_2 + fl12*PSRA12_2 + fl13*PSRA13_2
    
    PSRA_3 =~ fl1*PSRA1_3 + fl2*PSRA2_3 + fl3*PSRA3_3 + fl4*PSRA4_3 + fl5*PSRA5_3 + fl6*PSRA6_3 + fl7*PSRA7_3 + fl8*PSRA8_3 + fl9*PSRA9_3 + fl10*PSRA10_3 + fl11*PSRA11_3 + fl12*PSRA12_3 + fl13*PSRA13_3
    
    # thresholds

PSRA1_1|a1*t1 + a2*t2 + a3*t3
PSRA1_2|a1*t1 + a2*t2 + a3*t3
PSRA1_3|a1*t1 + a2*t2 + a3*t3

PSRA2_1|b1*t1 + b2*t2 + b3*t3
PSRA2_2|b1*t1 + b2*t2 + b3*t3
PSRA2_3|b1*t1 + b2*t2 + b3*t3

PSRA3_1|c1*t1 + c2*t2 + c3*t3
PSRA3_2|c1*t1 + c2*t2 + c3*t3
PSRA3_3|c1*t1 + c2*t2 + c3*t3

PSRA4_1|d1*t1 + d2*t2 + d3*t3
PSRA4_2|d1*t1 + d2*t2 + d3*t3
PSRA4_3|d1*t1 + d2*t2 + d3*t3

PSRA5_1|e1*t1 + e2*t2 + e3*t3
PSRA5_2|e1*t1 + e2*t2 + e3*t3
PSRA5_3|e1*t1 + e2*t2 + e3*t3

PSRA6_1|f1*t1 + f2*t2 + f3*t3
PSRA6_2|f1*t1 + f2*t2 + f3*t3
PSRA6_3|f1*t1 + f2*t2 + f3*t3

PSRA7_1|g1*t1 + g2*t2 + g3*t3
PSRA7_2|g1*t1 + g2*t2 + g3*t3
PSRA7_3|g1*t1 + g2*t2 + g3*t3

PSRA8_1|h1*t1 + h2*t2 + h3*t3
PSRA8_2|h1*t1 + h2*t2 + h3*t3
PSRA8_3|h1*t1 + h2*t2 + h3*t3

PSRA9_1|i1*t1 + i2*t2 + i3*t3
PSRA9_2|i1*t1 + i2*t2 + i3*t3
PSRA9_3|i1*t1 + i2*t2 + i3*t3

PSRA10_1|j1*t1 + j2*t2 + j3*t3
PSRA10_2|j1*t1 + j2*t2 + j3*t3
PSRA10_3|j1*t1 + j2*t2 + j3*t3

PSRA11_1|k1*t1 + k2*t2 + k3*t3
PSRA11_2|k1*t1 + k2*t2 + k3*t3
PSRA11_3|k1*t1 + k2*t2 + k3*t3

PSRA12_1|l1*t1 + l2*t2 + l3*t3
PSRA12_2|l1*t1 + l2*t2 + l3*t3
PSRA12_3|l1*t1 + l2*t2 + l3*t3

PSRA13_1|m1*t1 + m2*t2 + m3*t3
PSRA13_2|m1*t1 + m2*t2 + m3*t3
PSRA13_3|m1*t1 + m2*t2 + m3*t3

  # CFA
  
  # measurement model
  EGRA_1 =~ VOC_1 + LTRC_1 + GRPH_1 + INV_1 + OPR_1 + RDCP_1
  GRPH_1 ~~ LTRC_1
  RDCP_1 ~~ VOC_1
  RDCP_1 ~~ OPR_1
  
    # measurement model
  EGRA_2 =~ VOC_2 + LTRC_2 + GRPH_2 + INV_2 + OPR_2 + RDCP_2
  GRPH_2 ~~ LTRC_2
  RDCP_2 ~~ VOC_2
  RDCP_2 ~~ OPR_2
  
    # measurement model
  EGRA_3 =~ VOC_3 + LTRC_3 + GRPH_3 + INV_3 + OPR_3 + RDCP_3
  GRPH_3 ~~ LTRC_3
  RDCP_3 ~~ VOC_3
  RDCP_3 ~~ OPR_3
 
# cross-lagged effects

    # region
    IC_2 ~ IC_1
    IC_3 ~ IC_1 + IC_2
    WM_2 ~ WM_1
    WM_3 ~ WM_1 + WM_2
    
    PSRA_2 ~ PSRA_1 + IC_1
    PSRA_3 ~ PSRA_1 + PSRA_2 + IC_2
    
    EGRA_2 ~ EGRA_1 + PSRA_1 + IC_1
    EGRA_3 ~ EGRA_1 + EGRA_2 + PSRA_2 + IC_2
    
# fully correlated at each wave
    EGRA_1 ~~ PSRA_1 + IC_1 + WM_1
    PSRA_1 ~~ IC_1 + WM_1
    IC_1 ~~ WM_1

    EGRA_2 ~~ PSRA_2 + IC_2 + WM_2
    PSRA_2 ~~ IC_2 + WM_2
    IC_2 ~~ WM_2

    EGRA_3 ~~ PSRA_3 + IC_3 + WM_3
    PSRA_3 ~~ IC_3 + WM_3
    IC_3 ~~ WM_3
'
mydp <- dpriors(tau = "normal(0, 0.5)",
                lambda = "normal(0, 2)",
                beta = "normal(0.5,0.2)",
                rho = "beta(1,1)",
                nu = "normal(0, 5)")

fit_all_1 <- bsem(model_all_1, data = test,
            ordered = c("PSRA1_1", "PSRA2_1", "PSRA3_1",
                        "PSRA4_1", "PSRA5_1", "PSRA6_1",
                        "PSRA7_1", "PSRA8_1", "PSRA10_1",
                        "PSRA11_1", "PSRA12_1", "PSRA13_1",
                        "PSRA1_2", "PSRA2_2", "PSRA3_2",
                        "PSRA4_2", "PSRA5_2", "PSRA6_2",
                        "PSRA7_2", "PSRA8_2", "PSRA10_2",
                        "PSRA11_2", "PSRA12_2", "PSRA13_2",
                        "PSRA1_3", "PSRA2_3", "PSRA3_3",
                        "PSRA4_3", "PSRA5_3", "PSRA6_3",
                        "PSRA7_3", "PSRA8_3", "PSRA10_3",
                        "PSRA11_3", "PSRA12_3", "PSRA13_3"
                        ),
           dp = mydp,
           std.lv=TRUE,
           seed = 1234)

Model fit and ppmc() with ordinal data

Hi Ed,

@bgoodri and I are working on a project together where we wish to build a Bayesian CFA model for a few measures with items on an ordinal scale (e.g., 1-4 Likert scale). We have a few questions:

  1. I am a bit puzzled by the fact that the model fit of the Bayesian CFA (given by blavFitIndices()) is very different from that of the Frequentist CFA (given by lavaan::cfa()). I understand that the Frequentist model may be overfitting the data by giving CFI = 0.99 and RMSEA = 0.04. But the Bayesian model fit is very different. I checked the predicted ordinal values from the model and they seem to correspond well with the raw distribution of the items. Could it be because these Bayesian fit indices do not work too well with ordinal data?
Posterior mean (EAP) of devm-based fit indices:

      BRMSEA    BGammaHat adjBGammaHat          BMc 
       0.637        0.246       -0.497        0.000 
  1. When I tried to use ppmc() on my fitted Bayesian CFA model, I got this error. Is this because ppmc() does not work too well with ordinal data? Would adding mcmcextra = list(data = list(emiter = 50)) in bcfa() help? I am using blavaan_0.3-18.853.
Error in "mcmcdata" %in% names(lavobject@external) : 
  trying to get slot "external" from an object of a basic class ("NULL") with no slots
  1. The MargLogLik is still NA after adding mcmcextra = list(data = list(llnsamp = 200)). Is there another way to compute the likelihood?

Thanks!

blavaan failed to compile (Mac M1)

I was trying to troubleshoot another problem (all missing predicted latent variables from blavPredict() and tried to install the current version from GitHub with remotes::install_github("ecmerkle/blavaan"). I was prompted to update rstan and StanHeaders, which I did compiling them from source, which worked. When blavaan tried to compile, I got an error (long output below). My session info is:

R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.2.2

The output for the warning and error messages is too long, but here's a link to the text file.

Some plots when using stan do not work

Dear @ecmerkle,
Once again I have written some basic tutorials on how to use blavaan (with jags) in the past: https://www.rensvandeschoot.com/tutorials/wambs-blavaan-tutorial/ and now I am trying to also create a blavaan (Stan) version of these. However, some of the commands when switching from jags to Stan no longer work. when I try to make some convergence plots I get the following error for some, but not all of them. Error in .local(x, ...) : Plotting function not found. Maybe this bug can be fixed in the future or maybe you can point out what I am doing wrong.

This is some reproducible code that gives the bug when using Stan, but not Jags.


# load packages
library(rstan)
library(blavaan)
library(coda)
library(rjags)
library(tidyverse)
library(coda)

# create data
x <-runif(333, 20, 80)
y <- -45 + 2.5*x- 0.025*x^2 +rnorm(333,0,20)
data <- data.frame("y"=y, "x"=x, "x2"=x^2 )


# define models in jags and stan
model.regression.jags <- '#the regression model
                    y ~ prior("dnorm(0.8, 0.2)")*x + prior("dnorm(0, 0.1)")*x2

#show that dependent variable has variance
y ~~ prior("dgamma(.5, .5)")*y

#we want to have an intercept
y ~ prior("dnorm(-35, 0.05)")*1'

model.regression.stan <- '#the regression model
                    y ~ prior("normal(0.8, 2.24)")*x + prior("normal(0, 3.16)")*x2

                    #show that dependent variable has variance
                    y ~~ prior("gamma(.5, .5)")*y

                    #we want to have an intercept
                    y ~ prior("normal(-35, 4.47)")*1'


# run models

fit.bayes.jags <- blavaan(model.regression.jags, data=data, n.chains=3, burnin =1000, sample=2000, test="none",  target= "jags", seed=c(20,01,2019)) 
fit.bayes.stan <- blavaan(model.regression.stan, data=data, n.chains=3, burnin =1000, sample=2000, test="none",  target= "stan", seed=20012019) 

par(mfrow=c(2,2))
# these do all work
plot(fit.bayes.jags, pars=1:4, plot.type="trace")
plot(fit.bayes.stan, pars=1:4, plot.type="trace") # only 'stan' plot that does work
plot(fit.bayes.jags, pars=1:4, plot.type="autocorr")
plot(fit.bayes.jags, pars=1:4, plot.type="density")
plot(fit.bayes.jags, pars=1:4, plot.type="histogram")


# these do not work
plot(fit.bayes.stan, pars=1:4, plot.type="autocorr")
plot(fit.bayes.stan, pars=1:4, plot.type="histogram")
plot(fit.bayes.stan, pars=1:4, plot.type="density")


# These are solution to still get similar plots when using Stan from the output.

# histograms
blavInspect(fit.bayes.stan, what="mcmc") %>%
  as.matrix()%>%
  as_tibble()%>%
  gather()%>%
  ggplot(aes(x=value))+
  geom_histogram()+
  facet_wrap(~key, scales="free")

# densities
blavInspect(fit.bayes.stan, what="mcmc") %>%
  as.matrix()%>%
  as_tibble()%>%
  gather()%>%
  ggplot(aes(x=value))+
  geom_density()+
  facet_wrap(~key, scales="free")

# autocorr
coda::autocorr.plot(blavInspect(fit.bayes.stan, what="mcmc"))

blavaan ERROR: problem with translation from lavaan to MCMC syntax

Hi, I´m running a measurement invariance testing with blavaan but I´m getting de next error:
[1] "Error in lamsign[l1, 1] : subscript out of bounds\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<subscriptOutOfBoundsError in lamsign[l1, 1]: subscript out of bounds>
Error in blavaan(mod_sem_1, ordered = c("y3", "y4", "y6", "y7", "y8", :
blavaan ERROR: problem with translation from lavaan to MCMC syntax.

I´m working on Rstudio with the code:
fit_mg_g3 <- bsem(mod_sem_1,
ordered=c("y3","y4","y6","y7","y8","y9","y10","y11","y12","y13",
"x14","x16","x17","x18","x19","x20","x21","x22","x23","x24"),
std.lv=TRUE, dp=dpriors(lambda="normal(1,1)"),
group = "genero",
group.equal = "loadings",
n.chains = 3, burnin = 9000, sample = 1000,
data=datos_imputados)

I already tried to re-install blavaan package and I´m sure that the code, variable and data are fine. Actually, the multigroup model without constraints run succesfully, only after I add the line "group.equal = "loadings" the error show up. I really hope you can help me fix it. Thanks.

Error in lav2standata(lavobject) : object 'ptot' not found

Hi, I am running a 2 level model in blavaan but I am getting the following error:
[1] "Error in lav2standata(lavobject) : object 'ptot' not found\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in lav2standata(lavobject): object 'ptot' not found>
Error in blavaan(bmsem2, pisa_sgstusch, cluster = "CNTSCHID", target = "stan", :
blavaan ERROR: problem with translation from lavaan to MCMC syntax.

I'm working on Rstudio with the code:

model<- '
level: within
eta1_w=~y1 + y2 + y3
eta2_w=~y4 + y5 + y6

eta2_w~ eta1_w

level:between
eta1_b=~ y1 + y2 + y3
eta2_b=~y4 + y5 + y6
eta3_b=~y7+y8+y9

eta2_b~ eta1_b + eta3_b
eta3_b~eta1_b
'
bsem(model, data, cluster')

It seems like the error is about an object 'ptot' in lav2standata which is not part of my codes or data.
Would greatly appreciate your kind support please. Thanks!

Growth Mrxture Modelling in blavaan

Hi Ed,

is it possible to do Growth Mixture Modelling with blavaan? If this is the case, could you tell me which function I should use for this task?

Many thanks in advance
Andreas

feature request: blavInspect(fit, "lir")

Hi Ed,

I noticed recent changes are prepping for handing ordinal outcomes. Not to rush anything, but once that is implemented, could you make it possible to save and return draws of latent item responses, similar to making factor scores available from each iteration? Mplus makes these available via SAVE: LRESPONSES; so some users might like to see the same feature in blavaan. For my part, I think it would be cool to open the possibility of model-based imputation of missing item-level ordinal data by saving the latent item responses instead of the imputed binary data. (Of course, you could just make your inferences from the fitted BSEM, but one might want to capitalize on established frequentist model comparison statistics by fitting competing models to the same data imputed under the most general model.)

Thanks for considering.

Error when attempting to set Stan's adapt_delta option

I am having issues changing the adapt_delta option for Stan. I tried this as an argument to bcfa as well as an argument in the bcontrol list. Both returned errors.

library(lavaan)
library(blavaan)
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit1 <- lavaan::cfa(HS.model, data = HolzingerSwineford1939, group = "school")
bfit1 <- bcfa(
  parTable(fit1), 
  data = HolzingerSwineford1939, 
  group = "school",
  bcontrol = list(cores = 3, adapt_delta = 0.9)
)
Error in checkForRemoteErrors(val) : 
  3 nodes produced errors; first error: passing unknown arguments: adapt_delta.
Error in blavaan(parTable(fit1), data = HolzingerSwineford1939, group = "school",  : 
  blavaan ERROR: problem with MCMC estimation.  The model syntax and data have been exported.

Modification indices

Hi,

thanks a lot for a very useful package! I tried to get modification indices via summary(object, modindices=T) but keep getting an error (arguments imply differing number of rows) when fitting the extended lavaan object in lav_object_extended().

Do you have any idea why it's not working?

Thanks,
Adam

lavExport

lavaan uses lavExport() to write model syntax to a file. blavaan currently uses the argument mcmcfile=TRUE. Create a blavExport() function that accepts target="stan" or target="jags".

add relative effective sample size to loo

Hi Ed

I saw that one of the new features of the loo package was to include the relative effective sample size to the estimation of loo. By adding this to the estimation of loo, it improves the PSIS effective sample size, and the Monte Carlo error estimates.

I edited the blavcompre function to include this for the calculation of loo, and automatically accounts for it in the comparison. This could also be added to the fitMeasures()

blavCompare_2 <- function (object1, object2, ...) {
if (blavInspect(object1, "Options")$test == "none" | blavInspect(object2, "Options")$test == "none") {
stop("blavaan ERROR: Models cannot be compared when test='none'")
}
bf <- object1@test[[1]]$stat - object2@test[[1]]$stat
res <- c(bf, object1@test[[1]]$stat, object2@test[[1]]$stat)
names(res) <- c("bf", "mll1", "mll2")
lavopt1 <- object1@Options
lavopt1$estimator <- "ML"
ll1 <- case_lls(object1@external$mcmcout, object1@Model,
object1@ParTable, object1@SampleStats, lavopt1, object1@Cache,
object1@Data, make_mcmc(object1@external$mcmcout))
lavopt2 <- object2@Options
lavopt2$estimator <- "ML"
ll2 <- case_lls(object2@external$mcmcout, object2@Model,
object2@ParTable, object2@SampleStats, lavopt2, object2@Cache,
object2@Data, make_mcmc(object2@external$mcmcout))

if(packageVersion("loo") >= "2.0.0") {
### for loo package 2 or higher
nchain1 <- length(object1@external$mcmcout$mcmc)
nchain2 <- length(object2@external$mcmcout$mcmc)
niter1 <- nrow(ll1)
niter2 <- nrow(ll2)

chainid1 <- NULL ## set chain id for each iteration
for(i in 1:nchain1){
  temp <- rep(i, niter1/nchain1)
  chainid1 <- do.call(c, list(chainid1, temp))
}

chainid2 <- NULL ## set chain id for each iteration
for(i in 1:nchain2){
  temp <- rep(i, niter2/nchain2)
  chainid2 <- do.call(c, list(chainid2, temp))
}

ref1 <- relative_eff(ll1, chain_id=chainid1) ## compute MCMC relative effective sample size
ref2 <- relative_eff(ll2, chain_id=chainid2)

loo1 <- loo(ll1, r_eff=ref1) ## compute loo including relative effective sample size
loo2 <- loo(ll2, r_eff=ref2)
waic1 <- waic(ll1)
waic2 <- waic(ll2)

looobj <- list(loo1$estimates, loo2$estimates)
waicobj <- list(waic1$estimates, waic2$estimates)

}
else {
loo1 <- loo(ll1)
loo2 <- loo(ll2)
waic1 <- waic(ll1)
waic2 <- waic(ll2)

looobj <- rbind(loo1[1:6], loo2[1:6])
waicobj <- rbind(loo1[1:6], loo2[1:6])

}

diff_loo <- compare(loo1, loo2)
diff_waic <- compare(waic1, waic2)
cat("\nWAIC difference & SE (positive values favor object2): \n",
sprintf("%8.3f", diff_waic[1]), sprintf("%8.3f", diff_waic[2]),
"\n\n")
cat("LOO difference & SE (positive values favor object2): \n",
sprintf("%8.3f", diff_loo[1]), sprintf("%8.3f", diff_loo[2]),
"\n\n")
cat("Laplace approximation to the log-Bayes factor\n(experimental; positive values favor object1):",
sprintf("%8.3f", bf), "\n\n")

res <- list(bf = res, loo = looobj, diff_loo = diff_loo,
waic = waicobj, diff_waic = diff_waic)

invisible(res)
}

improve output when prisamp=TRUE

When sampling from the prior with prisamp=TRUE, summary() displays the prior means/SDs, but they are labeled as Post.SD. This and possibly other labels should be changed. The ppp and fitMeasures are also fairly useless, so some warning should be issued.

adjBGammaHat > 1 and BRMSEA is NA

Hi Ed,

For the following model, I got NAs for BRMSEA and a value over 1 for adjBGammaHat (which should be bounded between 0 and 1 right?) Do you know why that could be the case?

For BTLI, I understand it can be over 1 from the formula, but is it the larger the better or the closer to 1 the better?

Thanks!

> summary(FI_fit_egra_test_2, central.tendency = c("mean","median","mode"), prob = .90)

Posterior summary statistics and highest posterior density (HPD) 90% credible intervals for devm-based fit indices:

               EAP Median   MAP    SD lower upper
BRMSEA         NaN     NA    NA    NA   NaN    NA
BGammaHat    0.783  0.784 0.784 0.002 0.781 0.786
adjBGammaHat 1.468  1.467 1.466 0.003 1.462 1.473
BMc          0.436  0.437 0.438 0.003 0.431 0.442
BCFI         0.632  0.632 0.633 0.004 0.626 0.637
BTLI         1.082  1.082 1.082 0.001 1.081 1.083
BNFI         0.648  0.649 0.649 0.003 0.642 0.654
Warning message:
In (function (..., deparse.level = 1)  :
  number of columns of result is not a multiple of vector length (arg 1)
blavaan (0.4-2.912) results of 1000 samples after 500 adapt/burnin iterations

  Number of observations                           611

  Number of missing patterns                         1

  Statistic                                 MargLogLik         PPP
  Value                                      -9681.216       0.000

Latent Variables:
                   Estimate  Post.SD pi.lower pi.upper     Rhat    Prior       
  EGRA_2m1 =~                                                                  
    VOC_2m1           0.106    0.044    0.022    0.195    1.000    normal(0, 2)
    LTRC_2m1          0.257    0.062    0.172    0.347    1.000    normal(0, 2)
    GRPH_2m1          0.288    0.065    0.207    0.377    1.000    normal(0, 2)
    INV_2m1           0.599    0.114    0.530    0.690    1.000    normal(0, 2)
    OPR_2m1           0.921    0.168    0.851    1.006    1.000    normal(0, 2)
    RDCP_2m1          0.680    0.129    0.587    0.781    1.000    normal(0, 2)
  EGRA_3m1 =~                                                                  
    VOC_3m1           0.158    0.045    0.072    0.245    1.000    normal(0, 2)
    LTRC_3m1          0.225    0.042    0.144    0.307    1.000    normal(0, 2)
    GRPH_3m1          0.243    0.043    0.160    0.325    1.001    normal(0, 2)
    INV_3m1           0.738    0.045    0.667    0.809    1.000    normal(0, 2)
    OPR_3m1           0.872    0.050    0.793    0.947    1.004    normal(0, 2)
    RDCP_3m1          0.790    0.051    0.707    0.871    0.999    normal(0, 2)

Covariances:
                   Estimate  Post.SD pi.lower pi.upper     Rhat    Prior       
 .LTRC_2m1 ~~                                                                  
   .GRPH_2m1          0.447    0.042    0.367    0.534    1.000       beta(1,1)
 .VOC_2m1 ~~                                                                   
   .RDCP_2m1          0.023    0.038   -0.051    0.096    1.000       beta(1,1)
 .OPR_2m1 ~~                                                                   
   .RDCP_2m1         -0.188    0.033   -0.244   -0.113    1.000       beta(1,1)
 .LTRC_3m1 ~~                                                                  
   .GRPH_3m1          0.475    0.044    0.394    0.566    1.001       beta(1,1)
 .VOC_3m1 ~~                                                                   
   .RDCP_3m1          0.028    0.035   -0.042    0.095    1.000       beta(1,1)
 .OPR_3m1 ~~                                                                   
   .RDCP_3m1         -0.241    0.029   -0.290   -0.178    1.002       beta(1,1)
  EGRA_2m1 ~~                                                                  
    EGRA_3m1          0.620    0.030    0.563    0.679    1.000     lkj_corr(1)

Intercepts:
                   Estimate  Post.SD pi.lower pi.upper     Rhat    Prior       
   .VOC_2m1          -0.000    0.040   -0.079    0.080    0.999    normal(0, 5)
   .LTRC_2m1          0.001    0.041   -0.076    0.082    1.001    normal(0, 5)
   .GRPH_2m1         -0.000    0.040   -0.080    0.077    1.003    normal(0, 5)
   .INV_2m1          -0.001    0.041   -0.084    0.079    1.001    normal(0, 5)
   .OPR_2m1          -0.001    0.041   -0.082    0.079    1.001    normal(0, 5)
   .RDCP_2m1         -0.001    0.041   -0.079    0.082    1.001    normal(0, 5)
   .VOC_3m1          -0.000    0.041   -0.078    0.080    0.999    normal(0, 5)
   .LTRC_3m1         -0.001    0.040   -0.080    0.077    1.000    normal(0, 5)
   .GRPH_3m1         -0.000    0.040   -0.077    0.078    1.001    normal(0, 5)
   .INV_3m1          -0.001    0.041   -0.083    0.077    1.001    normal(0, 5)
   .OPR_3m1          -0.002    0.041   -0.082    0.078    1.002    normal(0, 5)
   .RDCP_3m1         -0.002    0.043   -0.087    0.079    1.000    normal(0, 5)
    EGRA_2m1          0.000                                                    
    EGRA_3m1          0.000                                                    

Variances:
                   Estimate  Post.SD pi.lower pi.upper     Rhat    Prior       
   .VOC_2m1           0.995    0.057    0.892    1.111    0.999 gamma(1,.5)[sd]
   .LTRC_2m1          0.938    0.053    0.844    1.050    0.999 gamma(1,.5)[sd]
   .GRPH_2m1          0.922    0.054    0.822    1.034    1.000 gamma(1,.5)[sd]
   .INV_2m1           0.640    0.041    0.561    0.723    1.001 gamma(1,.5)[sd]
   .OPR_2m1           0.140    0.043    0.071    0.242    1.003 gamma(1,.5)[sd]
   .RDCP_2m1          0.533    0.053    0.432    0.643    1.001 gamma(1,.5)[sd]
   .VOC_3m1           0.982    0.056    0.880    1.096    1.000 gamma(1,.5)[sd]
   .LTRC_3m1          0.957    0.055    0.857    1.074    1.000 gamma(1,.5)[sd]
   .GRPH_3m1          0.949    0.056    0.847    1.065    1.000 gamma(1,.5)[sd]
   .INV_3m1           0.467    0.034    0.401    0.534    1.004 gamma(1,.5)[sd]
   .OPR_3m1           0.251    0.042    0.177    0.341    1.008 gamma(1,.5)[sd]
   .RDCP_3m1          0.384    0.043    0.303    0.470    1.001 gamma(1,.5)[sd]
    EGRA_2m1          1.000                                                    
    EGRA_3m1          1.000                                                    

Priors applied to the wrong parameters (in certain conditions) using the rstan interface

Hi Ed,

I noticed that, when setting informed priors for specific parameters, "bsem" applies those priors to the wrong parameters.

Specifically, this happens under certain conditions when you run a path model with multiple dependent variables.

For example, you set a prior for the path
Predictor1 -> DepVar1
but it ends up being applied to the path
Predictor 1 -> DepVar2.

The issue is described in the attached documents which include:

  • PNG screenshot that depicts the issue
  • R code for example and a simulated dataset (zipped)

This happens when using the rstan interface.
Note: I use blavaan version 0.3-7, and R version 3.6.1

Thanks,
Enrico

PS. I wrote to Y Rosseel who encouraged me to report the issue here. He also noted that this happens when using the rstan interface but not the rjags interface.

example blavaan output

[example_code_and_dataset.zip](https://github.com/ecmerkle/blavaan/files/3840546/example_code_and_dataset.zip)

Latent variable names missing from blavPredict(type = "lv")

ls there any way to add variable names to the output of blavPredict(type = "lv")? Looks like it's not returning them and it can be unclear which column in the output matrix is which latent variable. The base lavPredict() doesn't seem to have this issue. Tried looking around the source code, but would need to dig in deep to find out how naming works and (perhaps falsely) assume this is something that would be pretty easy to address if I already knew my way around.


ibrary(dplyr)
library(blavaan)
set.seed(123)

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Load data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

data(HolzingerSwineford1939)
dat <- HolzingerSwineford1939

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Specify SEM
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

model <- "

# Measurement Model
lv1 =~ NA*x1 + x2 + x3
lv2 =~ NA*x4 + x5 + x6
lv3 =~ NA*x7 + x8 + x9

# Residual Variances of Latent Factors
lv1 ~~ 1*lv1
lv2 ~~ 1*lv2
lv3 ~~ 1*lv3

# Add some arbitrary covariances
lv1 ~~ lv2
x1 ~~ x4

# Regression
lv1 ~ lv3 + sex + grade + ageyr
lv2 ~ lv3 + sex + grade + ageyr
"

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Estimation
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Fit model with default priors
m_out <- model %>% blavaan::bsem(data = dat, 
                                 sample = 1000, 
                                 burnin = 1000, 
                                 n.chains = 2,
                                 save.lvs= TRUE, 
                                 seed = 123)
                                 
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Post-estimation Issue : blavPredict() dropping variable names
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# In lavaan, lavPredict returns a lavaan.matrix with a column name attribute
# for identifying which column contain's the scores for which factor scores.
m_out %>% lavPredict(type = "lv") %>% str()

# However, In blavaan, blavPredict returns a list of matrices WITHOUT the
# column name attribute. This makes it unclear in many cases which column 
# corresponds to which factor scores.
m_out %>% blavPredict(type = "lv") %>% str()

package install on windows

Current version of blavaan only installs on windows using the command

remotes::install_github("ecmerkle/blavaan", INSTALL_opts = "--no-multiarch")

and Rcpp needs to be built on the current version of R. When the --no-multiarch option is not set, R complains that no DLL was created. I don't know enough about this stuff to give good recommendations, but this workaround may help some people until the makevars.win file is edited appropriately!

`blavCompare` errors

This code use to work, but now errors, any idea why?

library(bayestestR)
library(blavaan)

data("PoliticalDemocracy", package = "lavaan")

model <- "
  # latent variable definitions
  dem60 =~ y1 + a*y2
  dem65 =~ y5 + a*y6

  # regressions
  dem65 ~ dem60

  # residual correlations
  y1 ~~ y5
"

model2 <- "
  # latent variable definitions
  dem60 =~ y1 + a*y2
  dem65 =~ y5 + a*y6

  # regressions
  dem65 ~ 0*dem60

  # residual correlations
  y1 ~~ 0*y5
"
suppressWarnings(capture.output({
  bfit <- blavaan::bsem(model,
                        data = PoliticalDemocracy,
                        n.chains = 1, burnin = 50, sample = 100
  )
  bfit2 <- blavaan::bsem(model2,
                         data = PoliticalDemocracy,
                         n.chains = 1, burnin = 50, sample = 100
  )
}))

blavaan::blavCompare(bfit, bfit2)
#> Error in apply(diffs, 2, sum) : dim(X) must have a positive length

not saving object because of delta parameterization

Hi Ed

Found that the new version is not working simple examples, it fails to save the results objects due to missing arguments I think

when I run the example
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '

fit <- bcfa(HS.model, data=HolzingerSwineford1939)
summary(fit)

I get this error
Error in lav_model_implied(lavmodel, delta = (lavmodel@parameterization == :
unused argument (delta = (lavmodel@parameterization == "delta"))

And no result object is saved

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] blavaan_0.4-1 Rcpp_1.0.8 lavaan_0.6-10.1660

loaded via a namespace (and not attached):
[1] mvtnorm_1.1-3 lattice_0.20-45 listenv_0.8.0 runjags_2.2.0-3
[5] prettyunits_1.1.1 ps_1.6.0 zoo_1.8-9 assertthat_0.2.1
[9] digest_0.6.29 utf8_1.2.2 parallelly_1.30.0 plyr_1.8.6
[13] R6_2.5.1 nonnest2_0.5-5 ggridges_0.5.3 MatrixModels_0.5-0
[17] stats4_4.1.2 coda_0.19-4 ggplot2_3.3.5 pillar_1.7.0
[21] rlang_1.0.1 SparseM_1.81 callr_3.7.0 Matrix_1.3-4
[25] pbivnorm_0.6.0 loo_2.4.1 munsell_0.5.0 compiler_4.1.2
[29] rstan_2.21.3 pkgconfig_2.0.3 mnormt_2.0.2 pkgbuild_1.3.1
[33] tmvnsim_1.0-2 rstantools_2.1.1 globals_0.14.0 mcmc_0.9-7
[37] tidyselect_1.1.2 tibble_3.1.6 gridExtra_2.3 matrixStats_0.61.0
[41] codetools_0.2-18 fansi_1.0.2 future_1.24.0 crayon_1.5.0
[45] dplyr_1.0.8 MASS_7.3-54 grid_4.1.2 DBI_1.1.2
[49] gtable_0.3.0 lifecycle_1.0.1 magrittr_2.0.2 StanHeaders_2.21.0-7
[53] scales_1.1.1 RcppParallel_5.1.5 future.apply_1.8.1 cli_3.2.0
[57] ellipsis_0.3.2 vctrs_0.3.8 generics_0.1.2 sandwich_3.0-1
[61] tools_4.1.2 CompQuadForm_1.4.3 glue_1.6.1 purrr_0.3.4
[65] processx_3.5.2 parallel_4.1.2 inline_0.3.19 colorspace_2.0-3
[69] bayesplot_1.8.1 quantreg_5.88 MCMCpack_1.6-0

model results become different when save.lvs = TRUE

Hi Ed,

I am wondering why the model result is different when save.lvs = T, even under the same random seed. My main model converges well, but when I added save.lvs = T to save the factor scores, it fails to converge. I am attaching below a minimal example. Although both model converges in the example, they do have different results. All variables are Likert-scale survey items on an ordinal scale. Thanks!

test.csv

test_model <- '
  # measurement model
  PSRA =~ PSRA1_1 + PSRA2_1 + PSRA3_1 + PSRA4_1 + PSRA5_1 + PSRA6_1 + PSRA7_1 + PSRA8_1 + PSRA10_1 + PSRA11_1 + PSRA12_1 + PSRA13_1
'

mydp <- dpriors(tau = "normal(0, 0.5)",
                lambda = "normal(0, 2)")

# model 1
fit_1 <- bcfa(test_model, data = test,
            ordered = c("PSRA1_1", "PSRA2_1", "PSRA3_1",
                        "PSRA4_1", "PSRA5_1", "PSRA6_1",
                        "PSRA7_1", "PSRA8_1", "PSRA10_1",
                        "PSRA11_1", "PSRA12_1", "PSRA13_1"),
           dp = mydp,
           std.lv=TRUE,
           # save.lvs=TRUE,
           mcmcextra = list(data = list(llnsamp = 200)),
           n.chains = 4,
           seed = 1234)

# model 2
fit_2 <- bcfa(test_model, data = test,
            ordered = c("PSRA1_1", "PSRA2_1", "PSRA3_1",
                        "PSRA4_1", "PSRA5_1", "PSRA6_1",
                        "PSRA7_1", "PSRA8_1", "PSRA10_1",
                        "PSRA11_1", "PSRA12_1", "PSRA13_1"),
           dp = mydp,
           std.lv=TRUE,
           save.lvs=TRUE,
           mcmcextra = list(data = list(llnsamp = 200)),
           n.chains = 4,
           seed = 1234)

SemPaths for blavaan

Hey,

First, thanks for your package. Do you have any idea how to convert a blavaan model into something that could be plotted with the SemPlot package. I assume it would just need to somehow convert the blavaan object into a lavaan-like object?

Thanks a lot!!

Add keyword arguments to blavFitIndices

Hi there,

Thank you for this incredible software. When I run blavFitIndices on one of my models I get the warning: Warning: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.

I was able to see that there is a single k value that is 0.7, which I believe is a good candidate for moment matching. Instead of solving this issue post hoc, I would like to add a ... argument to blavFitIndices. The ... argument would pass through to loo, waic, etc. If this works for you, I could start on a PR right away.

Thanks again,

Matt

longitudinal measurement invariance doesnt work with target="stan"

When testing longitudinal measurement invariance, target="stan" doesnt work, while "stanclassic" does work. Does work when I add the factor loadings or intercept equality labels across time

The error is gives is
SAMPLING FOR MODEL 'stanmarg' NOW (CHAIN 1).
Chain 1: Unrecoverable error evaluating the log probability at the initial value.
Chain 1: Exception: normal_lpdf: Random variable has dimension = 3, expecting dimension = 4; a function was called with arguments of different scalar, array, vector, or matrix types, and they were not consistently sized; all arguments must be scalars or multidimensional values of the same shape. (in 'model_stanmarg' at line 689)

[1] "Error in sampler$call_sampler(args_list[[i]]) : "
[2] " Exception: normal_lpdf: Random variable has dimension = 3, expecting dimension = 4; a function was called with arguments of different scalar, array, vector, or matrix types, and they were not consistently sized; all arguments must be scalars or multidimensional values of the same shape. (in 'model_stanmarg' at line 689)"
error occurred during calling the sampler; sampling not done
Stan model 'stanmarg' does not contain samples.
Error in rowvec[tmpw[, 1] == 0] <- grep(stanvec[j], names(b.est)) :
replacement has length zero

Here is an example that reproduce the error with simulated data

library(blavaan)


mod_sim <- '
f1 =~ 0.7?y1_t1 + 0.7?y2_t1 + 0.7?y3_t1
f2 =~ 0.7?y1_t2 + 0.7?y2_t2 + 0.7?y3_t2

f1 ~~ 1?f1
f2 ~~ 1.5?f2

f1 ~0?1
f2 ~0.7?1

f1 ~~ 0.4?f2

y1_t1 ~~ 0.3?y1_t2
y2_t1 ~~ 0.3?y2_t2
y3_t1 ~~ 0.3?y3_t2
'


dat_sim <- simulateData(mod_sim, sample.nobs = 300)
head(dat_sim)

mod_conf <- '
f1 =~ y1_t1 + y2_t1 + y3_t1
f2 =~ y1_t2 + y2_t2 + y3_t2

y1_t1 ~~ y1_t2
y2_t1 ~~ y2_t2
y3_t1 ~~ y3_t2
'


fit_conf <- bcfa(mod_conf, data=dat_sim, std.lv=T)
summary(fit_conf)


mod_weak <- '
f1 =~ l1*y1_t1 + l2*y2_t1 + l3*y3_t1
f2 =~ l1*y1_t2 + l2*y2_t2 + l3*y3_t2

f1 ~~ 1*f1
f2 ~~ NA*f2

y1_t1 ~~ y1_t2
y2_t1 ~~ y2_t2
y3_t1 ~~ y3_t2
'

fit_weak <- bcfa(mod_weak, data=dat_sim, std.lv=T)
summary(fit_weak)


mod_strong <- '
f1 =~ l1*y1_t1 + l2*y2_t1 + l3*y3_t1
f2 =~ l1*y1_t2 + l2*y2_t2 + l3*y3_t2

f1 ~~ 1*f1
f2 ~~ NA*f2

f1 ~0*1
f2 ~NA*1

y1_t1 ~~ y1_t2
y2_t1 ~~ y2_t2
y3_t1 ~~ y3_t2

y1_t1 ~i1*1
y2_t1 ~i2*1
y3_t1 ~i3*1
y1_t2 ~i1*1
y2_t2 ~i2*1
y3_t2 ~i3*1
'

fit_strong <- bcfa(mod_strong, data=dat_sim, std.lv=T)
summary(fit_strong)

modeling with ordered categorical variables in blavaan

Hi, as suggested by Prof. Ben Goodrich, I am reaching out to ask what would be the best way to model ordered categorical variables in blavaan.

For instance, I wish to build a measurement model, e.g.:

y =~ x1 + x2 + x3 where x1, x2, x3 are three items from the same measure measured with a Likert scale from 1-5. And my lavaan syntax is something like:

fit <- cfa(model, data = dat,
            ordered = c("x1", "x2", "x3"),
           estimator = "WLSMV",
           std.lv=TRUE)
  1. Would I still be able to use the WLSMV estimator for ordered categorical variables if I change cfa to bcfa? The model seems to run, but I am not sure if the estimator means the same thing in a Bayesian framework.
  2. Is there a way to estimate monotonic effects with blavaan, such as what is available in brms using the mo() function?
  3. I am also uncertain about the prior syntax I should use for the three items. My belief is that there are more people scoring higher scores on the Likert scale. Is ordered_logistic() supported in blavaan, or is there a better way?

Thank you so much!

Option to install without JAGS (i.e. only stan)

I'm having problems installing runjags at the moment, because of issues with the new GCC. I don't use JAGS though (just Stan) so wonder if it's possible to include an option to install choosing which backend will be used?

blav_object_methods.R summary back compatibility

Hi,

When applying the summary method on a "blavaan" object computed in a previous version (0.3-15), we get this error

Error in if (blavInspect(object, "options")$prisamp) { :
argument is of length zero

Because the object@Options$prisamp option was not part of the blavaan object, the most recent summary function fails.

You could include a check for the prisamp option in the object, and maybe set it to FALSE if not available, so the summary works fine with fit objects computed with previous versions.

Something like (near Line 320)
if(is.null(object@Options$prisamp)) { ... }

Thanks for the great work!

use of parallel computing

For mac & linux users, mclapply() is used once or twice during post-estimation computations (like in postpred()). Give the user an option to turn off parallelization in case there is some problem with parallelization (use lapply() instead).

Problem with translation from lavaan to MCMC syntax

Please tell me how I can overcome this error. I am using R version 4.3.1 in MacOS. Tried reinstalling Matrix package, still I get the same error.

[1] "Error in Matrix::t(A) : invalid 'lazy' to 'R_sparse_transpose'\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in Matrix::t(A): invalid 'lazy' to 'R_sparse_transpose'>
Error in blavaan(HS.model, data = HolzingerSwineford1939, auto.var = TRUE, :
blavaan ERROR: problem with translation from lavaan to MCMC syntax.

'newdata' argument in blavPredict function

I saw that the blavPredict funciton of your great package blavaan has a 'newdata' argument which is currently unused. I think having factor scores for new data based on an existing blavaan model would be a nice feature.

Many thanks!

Dealing with missing data in blavaan

Hi sorry I am not sure if this is a good question to ask on GitHub, but professor Ben Goodrich suggested that I reach out to the authors (see my original StackOverflow post here).

I am trying to run Bayesian SEM on my data using bsem in blavaan. However, it doesn't seem like it has arguments like missing = "FIML" as in the case of the lavaan package. What would be the best practice of dealing with missingness other than list-wise deletion?

This is the paragraph where Garnier-Villarreal and Jorgensen (2020) talked about handling missing data, but I am not sure how exactly I should actualize their recommendations: "While it is often useful and desirable to directly model the missing values with the rest of the model (e.g., Merkle 2011; O’Muircheartaigh and Moustaki 1999), blavaan employs a "missing at random" approach to missing data that differs across JAGS and Stan. In JAGS, one can include NA values in the data, and JAGS will sample these missing values as if they were extra model parameters. In contrast, Stan does not allow NA values in the data, so that one must handle the missing data manually. We utilize a "full information" likelihood (e.g., Wothke 2000) in our Stan models, which is the same likelihood that is used to handle missing data in lavaan and other software that performs maximum likelihood SEM estimation. This requires some additional overhead in preparing the data to be sent to Stan, because each case's observed values must be indexed, and cases are sorted by missing data pattern to speed up computations. Missing values could also be directly sampled ("imputed") in Stan, though this functionality is not currently available."

Thank you!

parameterEstimates() credible interval

Dear creators,

I noticed that there are differences between summary(fit) and parameterEstimates(fit).

The parameterEstimates() function seems to be identical to its lavaan counterpart, and thus assumes Gaussian priors for all parameters. It would be very useful to convert the ci.lower and ci.upper output from this function to HPD.

Thanks,
David

Bug when doing partial measurement invariance

Ed

I was running a couple of examples of longitudinal measurement invariance. The issue only happened with factor loading invariance, didint happened with intercept invariance. And when I did partial measurement invariance I get issues

  • If between time points, 2 of them are equal and only 1 factor loading is different it works fine (s_g3)
    ft3 =~ l1as_g3 + l2r_g3 + l3m_g3
    ft5 =~ l1b
    s_g5 + l2r_g5 + l3m_g5
    ft8 =~ l1bs_g8 + l2r_g8 + l3*m_g8

  • But when for an indicator, all factor loadings are different over time it doesnt work
    ft3 =~ l1as_g3 + l2r_g3 + l3m_g3
    ft5 =~ l1b
    s_g5 + l2r_g5 + l3m_g5
    ft8 =~ l1cs_g8 + l2r_g8 + l3*m_g8

Stan reports this issue over and over and the chains dont sample
Chain 2: Exception: Exception: []: accessing element out of range. index 7 out of range; expecting index to be between 1 and 5; index position = 1load_par2 (in 'model_stanmarg' at line 115)
(in 'model_stanmarg' at line 823)
small_example_data.xlsx

With the data set I am attaching this code produce the error for me

weak_mod <- '

factor loadings

ft3 =~ l1as_g3 + l2r_g3 + l3m_g3
ft5 =~ l1b
s_g5 + l2r_g5 + l3m_g5
ft8 =~ l1cs_g8 + l2r_g8 + l3*m_g8

residual covariances

s_g3 ~~ s_g5 + s_g8
s_g5 ~~ s_g8
r_g3 ~~ r_g5 + r_g8
r_g5 ~~ r_g8
m_g3 ~~ m_g5 + m_g8
m_g5 ~~ m_g8

Factor means

ft3 ~01
ft5 ~0
1
ft8 ~0*1

factor variances

ft3 ~~ 1ft3
ft5 ~~ NA
ft5
ft8 ~~ NA*ft8

item intercepts

s_g3 ~1
r_g3 ~1
m_g3 ~1
s_g5 ~1
r_g5 ~1
m_g5 ~1
s_g8 ~1
r_g8 ~1
m_g8 ~1
'

fit_weak <- bcfa(weak_mod,
std.lv=T,
data = dat2,
n.chains = 3,
burnin = 3000,
sample = 1000,
target="stan",
bcontrol = list(cores=3))

summary(fit_weak,
standardize=T,
rsquare=T,
neff=T)

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] blavaan_0.3-11.649 Rcpp_1.0.5 lavaan_0.6-7

loaded via a namespace (and not attached):
[1] TH.data_1.0-10 colorspace_1.4-1 ellipsis_0.3.1 ggridges_0.5.2
[5] brms_2.13.5 rsconnect_0.8.16 estimability_1.3 markdown_1.1
[9] base64enc_0.1-3 rstudioapi_0.11 listenv_0.8.0 rstan_2.21.2
[13] MatrixModels_0.4-1 DT_0.15 fansi_0.4.1 mvtnorm_1.1-1
[17] bridgesampling_1.0-0 codetools_0.2-16 splines_4.0.2 mnormt_2.0.2
[21] shinythemes_1.1.2 bayesplot_1.7.2 jsonlite_1.7.1 mcmc_0.9-7
[25] shiny_1.5.0 compiler_4.0.2 backports_1.1.10 emmeans_1.5.1
[29] assertthat_0.2.1 Matrix_1.2-18 fastmap_1.0.1 cli_2.0.2
[33] later_1.1.0.1 htmltools_0.5.0 quantreg_5.67 prettyunits_1.1.1
[37] tools_4.0.2 igraph_1.2.5 coda_0.19-3 gtable_0.3.0
[41] glue_1.4.2 reshape2_1.4.4 dplyr_1.0.2 V8_3.2.0
[45] vctrs_0.3.4 nlme_3.1-149 conquer_1.0.2 crosstalk_1.1.0.1
[49] stringr_1.4.0 globals_0.13.0 ps_1.3.4 mime_0.9
[53] miniUI_0.1.1.1 CompQuadForm_1.4.3 lifecycle_0.2.0 runjags_2.0.4-6
[57] semTools_0.5-3 gtools_3.8.2 future_1.18.0 MASS_7.3-51.6
[61] zoo_1.8-8 scales_1.1.1 colourpicker_1.1.0 promises_1.1.1
[65] Brobdingnag_1.2-6 parallel_4.0.2 sandwich_2.5-1 inline_0.3.16
[69] SparseM_1.78 shinystan_2.5.0 curl_4.3 gridExtra_2.3
[73] ggplot2_3.3.2 loo_2.3.1 StanHeaders_2.21.0-6 stringi_1.5.3
[77] dygraphs_1.1.1.6 pkgbuild_1.1.0 rlang_0.4.7 pkgconfig_2.0.3
[81] matrixStats_0.56.0 lattice_0.20-41 purrr_0.3.4 rstantools_2.1.1
[85] htmlwidgets_1.5.1 processx_3.4.4 tidyselect_1.1.0 plyr_1.8.6
[89] magrittr_1.5 R6_2.4.1 generics_0.0.2 multcomp_1.4-13
[93] pillar_1.4.6 withr_2.2.0 xts_0.12.1 survival_3.1-12
[97] abind_1.4-5 tibble_3.0.3 future.apply_1.6.0 crayon_1.3.4
[101] nonnest2_0.5-5 tmvnsim_1.0-2 grid_4.0.2 pbivnorm_0.6.0
[105] callr_3.4.4 threejs_0.3.3 digest_0.6.25 xtable_1.8-4
[109] httpuv_1.5.4 MCMCpack_1.4-9 RcppParallel_5.0.2 stats4_4.0.2
[113] munsell_0.5.0 shinyjs_2.0.0

lavoptions$categorical

Would it be possible in a future update of blavaan to replace all references to lavoptions$categorical by lavInspect(object, "categorical")?

The reason is that 'categorical' is not really an option, and is only in the lavoptions list because it is used internally in lavaan's lav_options_set(). I recently renamed it to .categorical, and noticed this broke blavaan. (I will revert this change until this is fixed).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.