
Comments (8)

osofr commented on August 9, 2024

Scratch that.

I believe what you originally did would accomplish exactly the same effect as the weight-merging procedure I described above. It would be nice to confirm it, but I don't think it's necessary.

Basically, I think what you were doing is fine. Your initial propensity score (for t=0) would be evaluated as P(A(0)|L(0)) * P(C(0)=1|L(0)). Your propensity scores for t>0 would be evaluated as P(C(t)=1|L(t)).

So the cumulative weight at t=1 would be (automatically) evaluated as
[1 / {P(A(0)|L(0)) * P(C(0)=1|L(0))}] x [1 / P(C(1)=1|L(1))]

I believe this corresponds exactly to the intervention that you are after: intervene on the baseline treatment only, and intervene on the censoring process across all time-points. You might also want to double-confirm this with @romainkp.
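
For concreteness, here is a minimal sketch of that cumulative-weight arithmetic in R/data.table, with made-up fitted values and hypothetical column names:

```r
library(data.table)

# Toy example for one subject (values and column names are made up):
dt <- data.table(
  ID  = 1L,
  gA0 = 0.7,   # estimated P(A(0) | L(0))
  gC0 = 0.9,   # estimated P(C(0) = 1 | L(0))
  gC1 = 0.85   # estimated P(C(1) = 1 | L(1))
)

dt[, w0 := 1 / (gA0 * gC0)]   # weight contribution at t = 0
dt[, w1 := w0 * (1 / gC1)]    # cumulative weight at t = 1
dt
```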


osofr commented on August 9, 2024

Hm, this is a good one. Let me think about it for a second. It is definitely possible; the question is just what the easiest hack for this is within the existing functionality.


ck37 commented on August 9, 2024

Thanks, appreciate it. Maybe there is a hack to fit the estimator manually outside of stremr, add the predictions to the data.table as a new column (replicating the predicted values across all time points), and then configure stremr to use that column as the treatment propensity score (manually creating it or integrating it into the fitPropensity object)?
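
Something like this sketch, maybe (the data.table long_data and its columns ID, t, A, L are hypothetical, and the glm is just a placeholder for the real estimator):

```r
library(data.table)

# Fit the treatment model on baseline rows only:
baseline <- long_data[t == 0]
fit0 <- glm(A ~ L, family = binomial(), data = baseline)
baseline[, gA0_hat := predict(fit0, type = "response")]

# Merge by ID so each subject's t = 0 prediction is replicated across
# all of that subject's time points:
long_data <- merge(long_data, baseline[, .(ID, gA0_hat)], by = "ID")
```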


ck37 commented on August 9, 2024

I guess, longer term, perhaps there could be a subset_TRT formula for fitPropensity() to identify the subset of observations used for estimation.

Or perhaps, if I used an sl3 estimator, I could manually subset the data within a wrapper, or include a separate data-processing step in the pipeline that subsets to t == 0? I would also need to handle the replication of predicted values, possibly in a custom prediction wrapper.
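
For example, something like this plain-function sketch of the idea (not an actual sl3 learner class, which would require subclassing sl3's learner machinery; all names here are hypothetical):

```r
# Fit on the t == 0 subset only, then replicate each subject's baseline
# prediction across all of that subject's rows. Assumes one t == 0 row
# per subject.
fit_baseline_only <- function(long_data, fit_fun, pred_fun) {
  base_rows <- long_data$t == 0
  fit    <- fit_fun(long_data[base_rows, ])
  preds0 <- pred_fun(fit, long_data[base_rows, ])
  preds0[match(long_data$ID, long_data$ID[base_rows])]
}

# e.g., with a glm standing in for the learner:
# preds <- fit_baseline_only(dat,
#   fit_fun  = function(d) glm(A ~ L, binomial(), data = d),
#   pred_fun = function(f, d) predict(f, newdata = d, type = "response"))
```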


osofr commented on August 9, 2024

But do you really want to replicate the actual predicted values across all time-points, or rather use the model fit from t==0 to obtain predictions of propensity scores across all the other time-points? I was under the impression it was the latter, whereas you seem to suggest it's the former?

Can you please confirm that this is exactly what you are trying to accomplish: fit the model A(0) ~ L(0) and use that model fit to obtain the predictions pi_0 = P(A(0)|L(0)) (with a hat on pi_0, to emphasize that these are fits, not true probabilities). Then multiply all time-varying weights (e.g., censoring) across all time-points t=0,...,K by this additional weight 1/pi_0? What is the purpose of that? I am not sure I follow. So this additional weight only changes by observation, but it is time-invariant, right?

On the other hand, you may want to use the above model fit to obtain predictions pi_t = P(A(t)|L(t)) or P(A(0)|L(t)), which would imply that the weights are also time-varying. It seems to me this is not what you are trying to do. The sketch below contrasts the two options.
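
In data.table terms (a sketch only; long_data and its columns ID, t, A, L are hypothetical):

```r
library(data.table)

# Same model fit in both cases: A(0) ~ L(0).
fit0 <- glm(A ~ L, family = binomial(), data = long_data[t == 0])

# Option 1 (time-invariant): predict once per subject at t = 0 and recycle
# that single value across all of the subject's rows.
long_data[, pi0 := predict(fit0, newdata = .SD[t == 0], type = "response"),
          by = ID]

# Option 2 (time-varying): evaluate the same fit at L(t) for every row,
# i.e., P(A(0) = 1 | L(t)).
long_data[, pi_t := predict(fit0, newdata = .SD, type = "response")]
```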


ck37 commented on August 9, 2024

My interpretation of the article is that for the ITT parameter the treatment propensity score (and the associated weights) is time-invariant and based on exposure status at t == 0. So yes, that would be what you have in paragraph 2. I could definitely be misinterpreting it, though; here are some key excerpts from the article:

We fit pooled logistic marginal structural models (MSMs) to estimate discrete time hazards of CVD comparing ART regimens with and without abacavir using IPW estimation to account for baseline confounding and time-varying selection bias in intention-to-treat and per-protocol analyses, analogous to a randomized trial. The data were structured to allow the exposure, outcome, right-censoring, and time-dependent covariates to change every 30 days after ART initiation. Propensity score logistic models predicted exposure at baseline and censoring over time as a result of death, health plan disenrollment, switching exposure groups, or end of study.

The first approach was intention-to-treat, which focused on the comparison of the effect of ART initiation with or without abacavir, regardless of how long subjects remained on the initial regimen. For these analyses, switching exposure groups at any time was ignored, thus maximizing follow-up and statistical power. This approach estimates the legacy effect of a 1-time decision to initiate regimens with or without abacavir.

What do you think?


osofr commented on August 9, 2024

If the above is correct, then by far the easiest thing to do is to evaluate this subject-specific weight 1/pi_0 outside stremr. Then create a dataset with this additional weight column, structured as (ID, t, weight), and pass this weight dataset as an additional argument to the relevant estimation routine (I know survNPMSM can handle it, but I would have to double-check that TMLE can as well).

Here is the code that basically shows how additional weights would be integrated into the weights dataset:
https://github.com/osofr/stremr/blob/master/R/main_estimation.R#L11-L27
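
A rough sketch of that outside-of-stremr step (column names are hypothetical, the glm is a placeholder for whatever learner you use, and the exact extra-weights argument should be checked against the code linked above):

```r
library(data.table)

# Evaluate 1/pi_0 outside stremr:
baseline <- long_data[t == 0]
fit0 <- glm(A ~ L, family = binomial(), data = baseline)
baseline[, pi0 := predict(fit0, type = "response")]

# Shape it as an (ID, t, weight) table, replicating the subject-level
# weight at every time point via a join on ID:
addl_wts <- baseline[, .(ID, weight = 1 / pi0)]
addl_wts <- addl_wts[long_data[, .(ID, t)], on = "ID"]
setcolorder(addl_wts, c("ID", "t", "weight"))
```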

Another, much more robust option is to manually incorporate the additional weight 1/pi_0 into your weights dataset and then continue to use that new, adjusted weights dataset within stremr. This is how you could do it manually (see the sketch after the list):

  1. Treat stremr as if it only has time-varying censoring and has absolutely no knowledge about your baseline time-invariant treatment A(0).
  2. Fit the propensity scores and obtain the weights dataset (getIPWeights) for your intervention of interest, based on time-varying censoring alone. (For now the weights dataset excludes the weights based on the baseline treatment propensity score, but we will add them shortly.)
  3. Estimate the additional weight 1/pi_0 outside stremr using whatever sl3 learner you want. Alternatively, run stremr again (separately from steps 1 & 2) with input data that includes only a single time-point, i.e., (A(0), L(0)). Then fit the propensity score for A(0) ~ L(0) on that data and extract the weights dataset, which now contains (ID, t=0, 1/pi_0).
  4. Create an additional weight dataset with rows replicated across all time-points. One way to do that is to exclude t from the weights data.table in step 3 and then do a simple merge of the weights data.table from step 2 with the weights data.table from step 3. That way (if the merge is done right) the additional column 1/pi_0 is simply replicated across all time-points.
  5. Over-write the cumulative weight column, multiplying it by 1/pi_0, as is basically done here: https://github.com/osofr/stremr/blob/545642c0326b14f191b034d7b55018c9b80424bc/R/main_estimation.R#L23
  6. Continue using this modified weights data.table throughout stremr. Make sure to pass this weights dataset to all TMLE routines! Otherwise TMLE will create its own weights dataset, which will obviously ignore 1/pi_0.
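
Putting steps 2-5 together, roughly (a sketch only: the stremr calls are from memory of the package interface, the glm is a placeholder, and the column names, including the cumulative-weight column, are assumptions to be checked against the line of main_estimation.R linked in step 5):

```r
library(data.table)
library(stremr)

# Step 2: weights based on time-varying censoring alone (exact getIPWeights
# arguments depend on how OData was set up; see its documentation).
wts <- getIPWeights(OData)

# Step 3: estimate pi_0 = P(A(0) | L(0)) outside stremr.
baseline <- long_data[t == 0]
fit0 <- glm(A ~ L, family = binomial(), data = baseline)
baseline[, pi0 := predict(fit0, type = "response")]
wt0 <- baseline[, .(ID, inv_pi0 = 1 / pi0)]   # (ID, 1/pi_0); note: no t column

# Step 4: merge on ID only, so 1/pi_0 is replicated across all time points.
wts <- merge(wts, wt0, by = "ID")

# Step 5: fold 1/pi_0 into the cumulative weight column (name assumed here).
wts[, cum.IPAW := cum.IPAW * inv_pi0]

# Step 6: keep passing `wts` to the estimation routines, including TMLE,
# e.g., survNPMSM(wts, OData).
```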

Hope this helps. Let me know when I can close the issue, or if you have any questions.


ck37 commented on August 9, 2024

Thanks a bunch. OK, I will give this a try and report back.
