Comments (12)
Thanks for the report, I will take a look soon. I am also looping in the two people who know this part of the code best:
@maugavilla @TDJorgensen : in recent months, Zezhen has helpfully found a variety of bugs and unclear issues in blavaan. I am wondering whether you have seen the issues that he describes here. The NAs could possibly be related to recent changes that I made to logl computations, but I am not certain about that.
from blavaan.
@littlehifive could you share more information so we can reporduce the error? Something like sample size, variable means, covariance matrix, and full code? With this we could simulate similar data and reproduce it.
I think I know what might be but would like to see more ifnormation before
from blavaan.
I created a simulated dataset based on my data, and I was able to reproduce the problems with BRMSEA and adjusted GammaHat using the following model. Let me know if you need anything else from me, thanks!
library(blavaan)
# load test_dat.csv here
# lavaan
model <- '
# measurement model
EGRA_2m1 =~ VOC_2m1 + LTRC_2m1 + GRPH_2m1 + INV_2m1 + OPR_2m1 + RDCP_2m1
GRPH_2m1 ~~ LTRC_2m1
RDCP_2m1 ~~ VOC_2m1
RDCP_2m1 ~~ OPR_2m1
# measurement model
EGRA_3m1 =~ VOC_3m1 + LTRC_3m1 + GRPH_3m1 + INV_3m1 + OPR_3m1 + RDCP_3m1
GRPH_3m1 ~~ LTRC_3m1
RDCP_3m1 ~~ VOC_3m1
RDCP_3m1 ~~ OPR_3m1
# correlation
EGRA_3m1 ~~ EGRA_2m1
'
# priors
mydp <- dpriors(tau = "normal(0, 0.5)",
lambda = "normal(0, 2)",
beta = "normal(0.5,0.2)",
rho = "beta(1,1)",
nu = "normal(0, 5)")
# fit model
fit <- bcfa(model, data = test_dat,
dp = mydp,
std.lv=TRUE,
seed = 1234)
# fit indices
blavFitIndices(fit)
# Posterior mean (EAP) of devm-based fit indices:
#
# BRMSEA BGammaHat adjBGammaHat BMc
# NaN 0.801 1.680 0.475
# Warning messages:
# 1:
# 41 (20.5%) p_waic estimates greater than 0.4. We recommend trying loo instead.
# 2: Some Pareto k diagnostic values are too high. See help('pareto-k-diagnostic') for details.
#
# 3: In sqrt(nonc/(dif.ppD * N)) : NaNs produced
from blavaan.
Thnsk, I see what the problem is. I think this change is related to the logl change that @ecmerkle mentioned earlier. I think that change affected the pD calculation for loo and waic. As you can see in the fitMeasures() function, the p_dic, p_waic, and p_loo are different calculation, and vary, but are usually close to each other, but now p_dic=36, and p_waic=126, and p_loo=121
So, these differences should not be this large
`
fitMeasures(fit)
npar logl ppp bic dic p_dic waic p_waic se_waic
43.000 -3081.635 0.000 6390.883 6235.755 36.242 6343.645 126.500 271.855
looic p_loo se_loo margloglik
6333.824 121.589 264.850 -3217.457
`
The think is that pD is used in the fit indices calculations, for now the quick fix is to change function to use DIC pD like this
bfits <- blavFitIndices(fit, pD="dic") summary(bfits)
With this change works fine
@ecmerkle could you check why the p_waic and p_loo might be so much larger now? Curiosly, this doesnt happens with the example model, so my guess would be that is related to the residual correlations?
`HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
fit1 <- bcfa(HS.model, data = HolzingerSwineford1939,
n.chains = 2, burnin = 1000, sample = 1000)
fitMeasures(fit1)`
> fitMeasures(fit1) npar logl ppp bic dic p_dic waic p_waic se_waic 30.000 -3738.344 0.000 7647.802 7535.950 29.631 7539.915 33.019 86.116 looic p_loo se_loo margloglik 7540.114 33.119 86.137 -3862.744
from blavaan.
Thanks for looking at it. This is an interesting issue that I haven't yet figured out. I have drastically simplified the model, and I still see inflated p_waic
and p_loo
:
model <- ' EGRA_2m1 =~ VOC_2m1 + LTRC_2m1 + GRPH_2m1 '
fit <- bcfa(model, data = test_dat, dp = mydp, std.lv = TRUE)
I have also tried it with older versions of blavaan (0.4-1 and 0.3-17) and with an older version of loo (2.4-1). For all versions, I see similar behavior. And, like Mauricio mentions, the usual test models always seem to work as expected.
One thing I noticed is that, for the simplified model above, the estimated factor variance can be close to 0 if you use std.lv = FALSE
. I will look at it some more tomorrow.
from blavaan.
For BTLI, I understand it can be over 1 from the formula
In frequentist SEM, adjusted GFI (and its small-sample improvement: adjusted-GammaHat) can be negative in very poorly fitting models. But values > 1 only happen in this case of trying to derive a Bayesian analogous (not equivalent) quantity.
adjBGammaHat
can exceed 1 whenever the analogous quantity to "degrees of freedom" (DF
below) is negative:
adjBGammaHat <- 1 - (pStar / DF) * (1 - BGammaHat)
In our paper, we use the canonical number of sufficient statistics minus the (estimated) number of estimated parameters (pD)
# number of sufficient stats
pStar <- ((nvar * (nvar + 3))/2) # multiplied by number of groups when relevant
# estimated number of estimated parameters
pD # borrowed from DIC, WAIC, or LOO-IC
# analogous to degrees of freedom
DF <- pStar - pD
The negative DF
is occurring with your data, and it is difficult to understand why.
- I initially thought it might be the residual correlations, which were handled by adding factors in
blavaan
's JSS paper, but I think that is no longer relevant with the faster Stan model that marginalizes out the latent variables by using summary stats. Also, as @ecmerkle found with a smaller model without residual correlations, the issue persists. - I also noticed the indicators are not all highly correlated, and their pattern is not easy to discern. I tried a different set of indicators that were substantially correlated (the last 3 indicators of the 2m1 factor), but the issue persists.
- I noticed some extreme outliers and set them to
NA
, but there is still a larger pD from WAIC and LOO than pStar (=9). Looking at their histograms, there is heavy inflation among the last 3 indicators(> 50 onOPR_2m1
, > 70% onINV_2m1
andRDCP_2m1
). The first 3 indicators look more normal, but the issue persists in all these cases. - Note from our paper that these indices are not designed to be used in combination with informative priors, and doing so changed their behavior in our applied examples. However, I tried your model (and smaller combinations) without your
mydp
priors, and the issue persisted.
Some syntax I wrote to try the above:
## try getting rid of extreme outliers
dat <- test_dat
vv <- c("VOC_2m1","LTRC_2m1","GRPH_2m1","INV_2m1","OPR_2m1","RDCP_2m1")
for (i in vv) {
sc <- abs(scale(dat[,i]))
dat[,i] <- ifelse(sc > 3, NA, dat[,i])
}
## model with first 3 indicators (more normally distributed, but low correlations)
model <- ' EGRA_2m1 =~ VOC_2m1 + LTRC_2m1 + GRPH_2m1 '
## or try fitting model to most highly correlated indicators (but highly inflated)
model <- ' EGRA_2m1 =~ INV_2m1 + OPR_2m1 + RDCP_2m1 '
#fit <- cfa(model, data = test_dat, std.lv = TRUE) # quick check with ML
bfit <- bcfa(model, data = dat, dp = mydp, std.lv = TRUE)
blavFitIndices(bfit)
So I can't rule out that there is something pathological in these simulated data, but I can't be sure that is what is causing the pD estimates to exceed pStar in all these cases. I hope some of these explorations might help @ecmerkle or @maugavilla by triggering ideas about where to look, but I am not convinced it is caused by the newer logl
calculations. If I interpreted Ed's last message correctly, the issue existed before the logl
change.
Instead of a problem with blavaan
(or loo
) source code, I expect it is an issue that can arise with particular data characteristics (or their interaction with model characteristics) that we haven't discovered yet. It is possible that the normal likelihood function is inappropriate for your data, which causes more variation across the parameter space, leading to greater pD estimates (especially in the WAIC and LOO cases, which more completely utilize the available posterior information).
from blavaan.
Thank you all for your comments! I am looping in Ben Goodrich (@bgoodri) here too.
I am not sure if some background information about my data would be helpful to solving this issue:
- EGRA means early grade reading assessment, and the items are specific domains of reading skills (e.g., VOC is vocabulary, LTRC is letter recognition).
- Because the assessment was done a quite sensitive child population in a crisis-affected country, many assessed kids had zero score on many tested domains (e.g., they were tested on French when they did not know anything about French). The simulated variables here were standardized, but in the original data most variables are very zero-inflated.
- Because of this issue, Ben and I weren't able to build a CFA model that converges for these zero-inflated EGRA measures at each wave of the three-wave study, and he suggested that I try to subtract wave 1 score from wave 2 and wave 3 scores, and model the change scores as if they were continuous.
- But as it turns out, even when we model the change scores, some variables still have quite extreme distributions because many of those who had zero in baseline also had zero in endline (i.e., they didn't learn French at all due to issues like non-schooling or poverty).
from blavaan.
Terrence's interpretation is right: because this also happens in older versions of blavaan, I don't think that recent changes broke something. I will double-check the log-likelihood computations to make sure, but I am leaning towards "particular data characteristics" as the explanation here.
Instead of defining outliers as Terrence did, I used the Pareto k values coming out of loo (removing cases where k is > .5). This leads to about 5 cases being removed (sorry I didn't set the seed). But after removing them, the DIC p_d drops to around -9. The old winbugs user manual says that negative p_d can happen with prior-data conflict, or when posterior means fall in an area of low density (say, when there is a bimodal posterior). I think this is some evidence for the "data characteristics" explanation.
If you compute DIC in the JAGS way (using jags.ic = TRUE), the p_d for DIC seems more reasonable regardless of whether the cases are included.
Code is below:
model_ll <- loo::extract_log_lik(fit@external$mcmcout)
loo_out <- loo(model_ll)
outl <- which(loo_out$diagnostics$pareto_k > .5)
reduced_dat <- test_dat[-outl,]
fitr1 <- bcfa(model, data = reduced_dat, jags.ic = TRUE)
fitMeasures(fitr1)
# npar logl ppp bic dic p_dic dic_jags
# 9.000 -747.657 0.482 1542.678 1477.012 -9.151 1511.448
#p_dic_jags waic p_waic se_waic looic p_loo se_loo
# 8.067 1498.936 12.245 55.656 1499.041 12.298 55.685
#margloglik
# -781.064
from blavaan.
Giving the exaplanation about the original data, I find it interesting that this seems to be a good diagnostic of the data mismatching the models likelihood, and that loo and waic are sensnitive to this but not dic
When I thry ecluding outliers with the larger model, it needs several rounds of excluding them based on the pareto_k, which doesnt seem like a sustainable solution to me
If the data is actually zero-inflated, I think you should try to run the indicators as categorical, with one category for 0, and then ordered the rest of the answers. This would make it similar to a Graded response Model, which is better for this cases than treating as continuous.
In another project we modeled zero-inflation as a mixture IRT model, modeling the zero-inflation as a correlated mixture of the GRM model, and can find the Stan code here https://osf.io/hvkfn/
Magnus, B. E., & Garnier-Villarreal, M. (2022). A multidimensional zero-inflated graded response model for ordinal symptom data. Psychological Methods, 27(2), 261–279. https://doi.org/10.1037/met0000395
from blavaan.
I agree that excluding people on the basis of pareto_k is not a solution... I was just looking at how sensitive the information criteria were to those observations.
It seems like we are agreeing that there is not a bug here, but that the behavior of these fit metrics can be suboptimal for models that do not fit so well.
So it seems a solution could be a warning from fitMeasures
if the effective number of parameters are negative, or if effective number of parameters have large variability across metrics. Also, related to the above post from @TDJorgensen , does it make sense to restrict DF to never go below 0 there?
from blavaan.
does it make sense to restrict DF to never go below 0
I would not advise that. It could end up hiding problems that should be brought to attention.
a warning from
fitMeasures
if the effective number of parameters are negative, or if effective number of parameters have large variability across metrics
That sounds like a good solution. By "different metrics" do you mean DIC vs. WAIC vs. LOO_IC? We don't have a lot to go on here, so I don't think the warning can be very informative about what to do (maybe there is nothing to do) or what to investigate (there might be other issues). Perhaps something like:
warning("The estimated number of model parameters exceeds the number of sample statistics (covariances, etc.), so fit-index calculations may lead to uninterpretable results.")
from blavaan.
Yes, I mean that we look at effective number of parameters under DIC, WAIC, LOO. If DIC is negative, or if they differ by "alot" (maybe some percentage away from a simple parameter count?), then a warning appears.
I am not sure this is exactly the same as the negative DF issue (definitely related, maybe not exactly the same). Maybe we warn about negative DF (in blavFitIndices
) separately from the effective number of parameter warning (from fitMeasures
).
from blavaan.
Related Issues (20)
- modeling with ordered categorical variables in blavaan HOT 3
- Add keyword arguments to blavFitIndices HOT 9
- not saving object because of delta parameterization HOT 2
- Model fit and ppmc() with ordinal data HOT 2
- model results become different when save.lvs = TRUE HOT 5
- problem with translation from lavaan to MCMC syntax HOT 3
- Latent variable names missing from blavPredict(type = "lv") HOT 2
- lavoptions$categorical HOT 2
- Growth Mrxture Modelling in blavaan HOT 1
- avoid nested lists
- blavFitIndices & multiple groups HOT 1
- blavaan failed to compile (Mac M1) HOT 4
- Bug with multiple groups HOT 1
- `blavCompare` errors HOT 2
- Problem with translation from lavaan to MCMC syntax HOT 9
- blavaan ERROR: problem with translation from lavaan to MCMC syntax HOT 11
- Error in lav2standata(lavobject) : object 'ptot' not found HOT 5
- Problem with multilevel model HOT 1
- error in help-page examples for wrappers HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from blavaan.