pedrohcgs / drdid

Doubly Robust Difference-in-Differences Estimators

Home Page: https://psantanna.com/DRDID

R 99.30% C++ 0.70%

drdid's Introduction

Hi there👋

Welcome to my GitHub profile!

I am Pedro H. C. Sant'Anna, an Associate Professor at the Department of Economics at Emory University.

I'm an Applied Econometrician, and I am very curious about how econometric things work (in theory and in practice).

My main research interests lie at the intersection of:

Causal Inference;
Data-adaptive methods (aka Machine Learning);
Semiparametric methods;
Applied Microeconomics.

In the last few years, I have devoted much of my time to developing, better understanding, and improving Difference-in-Differences (DiD) methods.

I have also spent some time in Tech (Microsoft and Amazon), which I've greatly enjoyed.

In my GitHub, you will find:

📦 Packages that I have developed with co-authors (usually in R);
💾 Replication packages for some of my published papers;
📝 Some lecture notes from courses that I teach.

If you have any questions, please feel free to contact me:

📧 Email: [email protected]
💼 LinkedIn: https://www.linkedin.com/in/pedrohcsantanna/
🐦 Twitter: https://twitter.com/pedrohcgs

drdid's Issues

Issue with multi level factor covariate in model fit

Hello DrDID team,

I am having trouble running a repeated cross-section model that includes a factor covariate with many levels. Some quick background: we're trying to estimate the effect of an intervention that is implemented at the level of individual organizations, and we have individual-level data. These organizations work in multiple local markets and must comply with state and local laws. We include individual- and market-level variables for each respondent, but we also think market fixed effects are necessary: essentially, a single factor variable with roughly 80 levels.

When I run `drdid_imp_rc` without the factor, everything works well. When I include the factor, which we believe is very important in the data-generating process (DGP) of our dependent variable, I get this error:

"Error in [[<-.data.frame(*tmp*, i, value = c(484L, 484L, 484L, 484L, : replacement has 626843747 rows, data has 13337101"

What do you think could be causing this? Is there a different specification that might avoid the issue? We have a random sample in each year for both treated and control units, so this is decidedly not panel data.

Thanks!

All the best,
Josh
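
The error message alone does not pin down the cause, but a mismatch between the number of rows in the covariates and in the data is worth ruling out. Below is a minimal sketch in base R, assuming the covariates are passed as a numeric matrix. The data frame `toy` and its columns are hypothetical. It expands the factor into dummy columns with `model.matrix()` and checks the dimensions before any estimator is called.

```r
# Hypothetical toy data: names and sizes are illustrative only.
set.seed(1)
n <- 500
toy <- data.frame(
  age    = rnorm(n, mean = 40, sd = 10),
  market = factor(sample(paste0("m", 1:8), n, replace = TRUE))
)

# Expand the factor into dummy columns (one level is dropped as the reference).
# model.matrix() adds an intercept column by default.
X <- model.matrix(~ age + market, data = toy)

# The covariate matrix must have exactly one row per observation.
# Note: model.matrix() silently drops rows with missing values.
stopifnot(nrow(X) == nrow(toy))
dim(X)

# Dummy columns that are almost always zero flag sparse markets, which can
# make the propensity score or outcome regressions unstable.
market_cols <- grep("^market", colnames(X), value = TRUE)
sparse <- market_cols[colSums(X[, market_cols, drop = FALSE]) < 5]
sparse
```

If the row counts already disagree at this stage, the problem lies in how the covariates are built rather than in the estimator itself.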

Allow for clustered std errors (like in the `did` package)

Non-panel estimation requires ID

Just a quick note that the idname argument is not optional, even when panel = FALSE. I got panel = FALSE to run anyway by using the row index (i.e. 1:N) as the identifier, but I am not sure whether this affected the results. I did get different results in Stata vs. R, but I am not certain this is the reason.
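
The workaround described above can be sketched in base R as follows. The data frame `rc` and the column name `row_id` are hypothetical, and whether a row-index identifier changes the estimates is not established here.

```r
# Hypothetical repeated cross-section: each row is a different individual.
set.seed(2)
rc <- data.frame(
  year    = rep(c(2010, 2011), each = 50),
  treated = rbinom(100, 1, 0.4),
  y       = rnorm(100)
)

# Use the row index as a stand-in identifier, so that no individual is
# (incorrectly) treated as observed in more than one period.
rc$row_id <- seq_len(nrow(rc))

# Sanity check: every identifier appears exactly once.
stopifnot(!anyDuplicated(rc$row_id))
head(rc)
```

The column `row_id` could then be supplied as the idname argument.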

NA results if matching on categorical variable with no equivalent value in non-treated group

Hi,

I noticed that if one of the weighting variables in the formula is categorical (e.g. taking values 1 to 4) and some of its values have no match in the non-treated group (e.g. some treated units have value 1 while no non-treated unit does), the ATT and SE are returned as NA, without any warning message.
This was useful for me because it forced me to double-check covariate balance before performing IPW.
But it would be helpful to have a warning such as: "No match for value 1 in categorical variable X1"

I assume the solution is to drop the treated units with this specific value, unless there is a way to keep these units while ignoring this variable in the weighting.

Best,

Sebbych
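
Until such a warning exists, the check can be done by hand before estimation. Below is a minimal sketch in base R. The helper `unmatched_levels()` and the toy vectors are hypothetical and are not part of the package.

```r
# Return the values of a categorical covariate that appear among treated
# units (d == 1) but never among non-treated units (d == 0).
unmatched_levels <- function(x, d) {
  x <- as.character(x)
  setdiff(unique(x[d == 1]), unique(x[d == 0]))
}

# Toy example: value 1 occurs only in the treated group.
x1 <- c(1, 2, 3, 4, 2, 3, 4, 4)
d  <- c(1, 1, 1, 1, 0, 0, 0, 0)

missing_in_controls <- unmatched_levels(x1, d)

# Report the problem; a package would more likely call warning() here.
if (length(missing_in_controls) > 0) {
  cat("No match for value(s)",
      paste(missing_in_controls, collapse = ", "),
      "in categorical variable X1\n")
}
```

Running this kind of check for each categorical covariate shows which treated units lack comparison units before any weights are computed.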

Allow for different covariates in the pscore and OR models

Is IPW denominator correct in ipw_did_panel.R?

Consider the following block:

  #Compute IPW estimator
  # First, the weights
  w.treat <- i.weights * D
  w.cont <- i.weights * ps.fit * (1 - D)/(1 - ps.fit)

  att.treat <- w.treat * deltaY
  att.cont <- w.cont * deltaY

  eta.treat <- mean(att.treat) / mean(i.weights * D)
  eta.cont <- mean(att.cont) / mean(i.weights * D)

For eta.cont we divide by mean(i.weights * D), but shouldn't we divide by mean(w.cont)? I ran some simulations and found that the two quantities are the same when i.weights=1 (i.e., no weights), but would they differ if we used non-trivial i.weights?
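
One way to explore the question is to compute both normalizations side by side on simulated data with non-trivial weights. The sketch below uses base R only. The data-generating process, the sample size, and the use of `quasibinomial()` for the weighted logit are assumptions made for illustration and do not reproduce the package's internals. Dividing by mean(i.weights * D) is a Horvitz-Thompson-type normalization, while dividing by mean(w.cont) is a self-normalized (Hajek-type) one.

```r
set.seed(3)
n <- 2000
x <- rnorm(n)
D <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))
deltaY <- 1 + x + 2 * D + rnorm(n)   # true ATT is 2 in this toy design
i.weights <- runif(n, 0.5, 2)        # non-trivial sampling weights

# Weighted logit propensity score; quasibinomial() avoids the
# non-integer-weights warning that binomial() would emit.
ps.fit <- fitted(glm(D ~ x, family = quasibinomial(), weights = i.weights))

w.treat <- i.weights * D
w.cont  <- i.weights * ps.fit * (1 - D) / (1 - ps.fit)

eta.treat <- mean(w.treat * deltaY) / mean(w.treat)

# (a) Normalization used in the quoted block.
eta.cont.ht <- mean(w.cont * deltaY) / mean(i.weights * D)
# (b) Self-normalized alternative raised in the question.
eta.cont.hajek <- mean(w.cont * deltaY) / mean(w.cont)

# Compare the two denominators and the resulting ATT estimates.
c(denom_a = mean(i.weights * D),
  denom_b = mean(w.cont),
  att_a   = eta.treat - eta.cont.ht,
  att_b   = eta.treat - eta.cont.hajek)
```

By construction the two control terms differ exactly by the factor mean(w.cont) / mean(i.weights * D), so how far that ratio is from 1 in a given sample determines how much the two estimates diverge.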
