Comments (11)
Which paper are you referring to?
from loo.
Our paper on subsampling LOO with the difference estimator http://proceedings.mlr.press/v108/magnusson20a.html (I thought you would still remember that 😆 )
from loo.
Ha ha! Yes. I just need to refresh my memory. It has been four years since I wrote that code. =)
from loo.
Additional notes
In the supplementary material, eq (6)
- has a minus sign, but eq (9) in the main text doesn't
- has a curly bracket denoting a, but that curly bracket doesn't include 1/n. Should it include that?
- below it, \sum_i^n \pi_i^2 - \tilde{\pi}_i^2 seems to be missing parentheses or another sum
- below it, \hat{a} includes 1/n, but the curly bracket did not
The supplementary material eq (7) has
- 1/n^2, but that doesn't appear in eq (9) of the main text
Currently, it is possible that a is negative even if it is estimating something positive, which can make \hat{\sigma}_{diff,loo}^2 (eq (9) in the main text) also negative. Also, I think that \hat{\sigma}_{diff,loo}^2 (eq (9)) should never be smaller than eq (8) (in the main text), or it should include eq (8) to take the uncertainty into account.
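As a side note on why this can happen: an estimator that is unbiased for a non-negative quantity can still take negative values in finite samples, because unbiasedness forces it to subtract a correction term. A minimal Python sketch (purely illustrative, not the estimator from the paper) using the classic unbiased estimator of mu^2, which uses that E[xbar^2] = mu^2 + sigma^2/n:

```python
# Illustration: an unbiased estimator of a non-negative quantity
# (here mu^2 >= 0, estimated by xbar^2 - s^2/n) can be negative.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10, 2000
mu = 0.0  # true target mu^2 = 0

ests = []
for _ in range(reps):
    x = rng.normal(mu, 1.0, size=n)
    ests.append(x.mean() ** 2 - x.var(ddof=1) / n)
ests = np.asarray(ests)

print(ests.mean())        # close to the true value 0 (unbiased)
print((ests < 0).mean())  # a large share of the estimates are negative
```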
from loo.
Yes. I will need to check this ASAP after New Year. I think the supplementary material is the best place to start, since that is where we prove that the estimator is unbiased.
I think, theoretically, that we can get negative estimates, but I need to check the proof.
from loo.
I have now gone through it, and I think the derivation of the unbiased estimator in the supplementary material makes sense. There is one minor error: the a-bracket should also cover 1/n (as you mention). Also, t_e is not formally defined, but between Eq. 10 and Eq. 11 in the supplementary material it is implicitly used as the total residual error. So, when I went through the supplementary material, it looks correct to me, except for the bracket.
The next step is going from the supplementary material to see if there is an error between the supplementary material and Eq. 9. The first line of Eq. 9 seems to be \hat{a} in the supplementary material. Hence there should be a minus, not a plus, in Eq. 9. This error also means that the second + in Eq. 9 should be a -, as I see it.
Also, there is a slight difference between sigma_loo and sigma_{loo,diff}: sigma_{loo,diff} = n sigma_loo (the aggregated variance of the total vs. per observation).
I have tried to write this down in this document (mainly pp. 4-5). Please look at the derivation of the estimator and see if you agree with it.
Below I try to answer your comments:
In the supplementary material eq (6) has a minus sign but eq (9) in the main text doesn't
I think Eq. 6 is correct given that the definition of sigma^2_loo in Eq. 5 is correct, which I think it is.
has a curly bracket denoting a, but that curly bracket doesn't include 1/n. Should it include that?
Yes. I think that is an error. If we look at the estimate of a, that is \hat{a}, it is a mean (i.e., it has the 1/n). We also see it in the expectation of \hat{a}, which includes 1/n. Hence, the bracket in Eq. 6 is missing the 1/n.
below \sum_i^n \pi_i^2 - \tilde{\pi}_i^2 seems to be missing parens or another sum
Which equation is this?
below \hat{a} includes 1/n, but the curly bracket did not
Yes. I think the curly bracket in Eq. 6 is missing the 1/n.
The supplementary material eq (7) has 1/n^2, but that doesn't show in eq (9) of the main text
This is the difference between the aggregated variance (the total) and the variance per observation.
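For readers following along, the 1/n^2 factor is the usual rescaling between the variance of a total and the variance of a mean. A quick Python check of the standard identities for n iid terms (illustrative only, not the loo code):

```python
# For n iid terms with per-term variance v:
#   Var(sum) = n * v   and   Var(mean) = v / n = Var(sum) / n**2,
# so 1/n^2 converts between the total scale and the mean scale.
import numpy as np

rng = np.random.default_rng(3)
n, reps = 50, 20000
x = rng.normal(0.0, 1.0, size=(reps, n))  # per-term variance v = 1

var_sum = x.sum(axis=1).var()
var_mean = x.mean(axis=1).var()
print(var_sum)   # close to n * v = 50
print(var_mean)  # close to v / n = 0.02
```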
from loo.
Thanks!
below \sum_i^n \pi_i^2 - \tilde{\pi}_i^2 seems to be missing parens or another sum
Which equation is this?
This is the reason you should number all the equations. After eq (6) there is a three-line paragraph, and the last line of that paragraph has \sum_i^n \pi_i^2 - \tilde{\pi}_i^2. As there are no parens, the sum is only over \pi_i^2, but that doesn't make sense, as the latter term also depends on i. So it should be either \sum_i^n (\pi_i^2 - \tilde{\pi}_i^2) or \sum_i^n \pi_i^2 - \sum_i^n \tilde{\pi}_i^2 (which are equivalent).
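Just to spell out the equivalence of the two correctly parenthesized forms numerically (a trivial Python check with arbitrary example vectors, not values from the paper):

```python
# The two correctly parenthesized readings are the same quantity:
#   sum_i (pi_i^2 - tilde_pi_i^2) == sum_i pi_i^2 - sum_i tilde_pi_i^2
# (pi and pi_tilde are just arbitrary example vectors)
import numpy as np

rng = np.random.default_rng(0)
pi = rng.random(5)
pi_tilde = rng.random(5)

lhs = np.sum(pi**2 - pi_tilde**2)
rhs = np.sum(pi**2) - np.sum(pi_tilde**2)
print(lhs, rhs)  # equal up to floating-point rounding
```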
What do you think of my comment on modifying the estimate to guarantee positivity?
from loo.
Yes, in that paragraph there is a missing parenthesis. As you say, it is given by the context, but it is unclear.
Re: ensure positivity.
So, we prove that the estimator is unbiased wrt sigma_loo. Hence, if we ensure positivity, it will no longer be unbiased. That said, it would probably have a smaller MSE. However, in this setting, the easiest solution is to recommend taking a larger sample since potential negative estimates would come from the sampling variance of the estimator.
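To make the bias/MSE trade-off concrete, here is a small Python simulation (purely illustrative, not the paper's estimator): we estimate mu^2 >= 0 unbiasedly by xbar^2 - s^2/n and compare it with its truncation at zero.

```python
# Truncating an unbiased estimator at zero introduces upward bias
# but can reduce MSE when the true value is near zero.
import numpy as np

rng = np.random.default_rng(2)
n, reps, mu = 10, 5000, 0.0  # true target mu^2 = 0
x = rng.normal(mu, 1.0, size=(reps, n))

est = x.mean(axis=1) ** 2 - x.var(axis=1, ddof=1) / n  # unbiased
est_trunc = np.maximum(est, 0.0)                       # biased upward

bias = est.mean() - mu**2
bias_trunc = est_trunc.mean() - mu**2
mse = ((est - mu**2) ** 2).mean()
mse_trunc = ((est_trunc - mu**2) ** 2).mean()
print(bias, bias_trunc)  # near zero vs clearly positive
print(mse, mse_trunc)    # truncation reduces MSE in this regime
```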
Have you been able to check my derivation to see if you also agree? When you are on board with the derivation there, I can take a pass on the code. I try to avoid looking at the code before we trust the derivation.
from loo.
So, we prove that the estimator is unbiased wrt sigma_loo. Hence, if we ensure positivity, it will no longer be unbiased. That said, it would probably have a smaller MSE. However, in this setting, the easiest solution is to recommend taking a larger sample since potential negative estimates would come from the sampling variance of the estimator.
In projpred, rerunning a search can be very costly, and thus taking a larger sample is not the easy option. It would be better to include the sampling variance of the estimator, as our uncertainty about the accuracy should also be affected by the sampling variance of the estimator.
The main article Section 2 says: "we propose to use the difference estimator and simple random sampling without replacement (SRS)", and the code agrees with that. The supplement and the new pdf have "and the probability of subsampling observation i is 1/n, i.e. the subsample is uniform with replacement."
In sigma_loo.pdf, eqs (20) and (21) both have \hat{b}, but eq (20) is the first two terms of \hat{b} and (21) is the last two terms. The same holds for (23) and (24).
I did not find other errors.
from loo.
In projpred, rerunning a search can be very costly, and thus taking a larger sample is not the easy option. It would be better to include the sampling variance of the estimator, as our uncertainty about the accuracy should also be affected by the sampling variance of the estimator.
Ok. I guess you can do that in the projpred setting? From the function, I think you get both.
The main article Section 2 says: "we propose to use the difference estimator and simple random sampling without replacement (SRS)", and the code agrees with that. The supplement and the new pdf have "and the probability of subsampling observation i is 1/n, i.e. the subsample is uniform with replacement."
Hmmm. Yes. That is a discrepancy, and I think we would need to change this to the inclusion probability instead. I need to look in my old sampling theory books.
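For reference, the textbook distinction being discussed can be stated as a short computation (standard sampling-theory formulas, not taken from the paper):

```python
# Under SRS without replacement of m out of n, each unit's inclusion
# probability is exactly m/n. Under m uniform draws WITH replacement,
# it is 1 - (1 - 1/n)**m, and units can be drawn more than once.
n, m = 100, 20

p_wor = m / n                 # SRS without replacement
p_wr = 1 - (1 - 1 / n) ** m   # uniform sampling with replacement

print(p_wor)  # 0.2
print(p_wr)   # slightly below 0.2
```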
In sigma_loo.pdf, eqs (20) and (21) both have \hat{b}, but eq (20) is the first two terms of \hat{b} and (21) is the last two terms. The same holds for (23) and (24).
Yes. That is because I split \hat{b} into the two components to reflect the estimator in the main text. I could call them \hat{b}_1 and \hat{b}_2 to make it clearer.
from loo.
Closed by #238
from loo.