Hi professor, Looking at your econometrics cheatsheet, I am wonderin


Possible typo on partial effects about econometricslabs HOT 3 CLOSED

tyleransom commented on July 22, 2024

Possible typo on partial effects

from econometricslabs.

Comments (3)

tyleransom commented on July 22, 2024

Nice work! I agree with you that the SE difference is likely due to some degrees of freedom discrepancy.

If it's all right with you, I'll close this issue. I believe that the way I present it in the lab is consistent with how it's written in Wooldridge's book. While this implementation is not perfectly correct with respect to the FWL theorem, I believe it's more accessible to introductory students.

Feel free to re-open the issue if you have more questions or proposed corrections. Thanks!


tyleransom commented on July 22, 2024

Hi, thanks for bringing this to my attention. You are correct that the FWL theorem uses residualized y (see the second equation on the Wikipedia page), whereas the example in Lab 4 only residualizes the x.

Let's see how this differs from the example in my lab using R:

# load the data
library(tidyverse)
library(modelsummary)
df <- mtcars %>% as_tibble()

Basic regression using the mtcars dataset, where the coefficient on cyl is the one of interest:

summary(lm(mpg ~ cyl + disp + hp, data = df))

Call:
lm(formula = mpg ~ cyl + disp + hp, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.0889 -2.0845 -0.7745  1.3972  6.9183 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 34.18492    2.59078  13.195 1.54e-13 ***
cyl         -1.22742    0.79728  -1.540   0.1349    
disp        -0.01884    0.01040  -1.811   0.0809 .  
hp          -0.01468    0.01465  -1.002   0.3250    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.055 on 28 degrees of freedom
Multiple R-squared:  0.7679,	Adjusted R-squared:  0.743 
F-statistic: 30.88 on 3 and 28 DF,  p-value: 5.054e-09

Now residualize cyl only:

est1 <- lm(cyl ~ disp + hp, data = df)
summary(lm(mpg ~ est1$residuals, data = df))

Call:
lm(formula = mpg ~ est1$residuals, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-10.8351  -3.5281  -0.5277   1.8950  13.7914 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)      20.091      1.072  18.735   <2e-16 ***
est1$residuals   -1.227      1.583  -0.775    0.444    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.066 on 30 degrees of freedom
Multiple R-squared:  0.01965,	Adjusted R-squared:  -0.01303 
F-statistic: 0.6012 on 1 and 30 DF,  p-value: 0.4442

Now residualize both mpg and cyl:

est2 <- lm(mpg ~ disp + hp, data=df)
summary(lm(est2$residuals ~ est1$residuals, data = df))

Call:
lm(formula = est2$residuals ~ est1$residuals, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.0889 -2.0845 -0.7745  1.3972  6.9183 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)     1.695e-16  5.218e-01   0.000    1.000
est1$residuals -1.227e+00  7.702e-01  -1.594    0.122

Residual standard error: 2.952 on 30 degrees of freedom
Multiple R-squared:  0.07804,	Adjusted R-squared:  0.04731 
F-statistic: 2.539 on 1 and 30 DF,  p-value: 0.1215

In the end, they each give the same coefficient estimate, but different standard errors. I'll have to think more about why this is the case, but for now it seems that there is nothing incorrect. But please engage further if you disagree or if something else is unclear! Thanks again for bringing this up.
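One way to probe the standard error differences is to compare the three fits directly. The following is a minimal sketch, not a definitive explanation: it assumes the default homoskedastic standard errors from `lm`, uses the built-in `mtcars` data without the tidyverse, and the object names (`cars`, `rx`, `ry`, `cs`, `fwl`) are introduced here for illustration only.

```r
# Sketch: compare standard errors across the three regressions in this thread.
cars <- mtcars

full <- lm(mpg ~ cyl + disp + hp, data = cars)   # full regression
est1 <- lm(cyl ~ disp + hp, data = cars)         # first stage for cyl
est2 <- lm(mpg ~ disp + hp, data = cars)         # first stage for mpg
rx   <- unname(resid(est1))                      # residualized cyl
ry   <- unname(resid(est2))                      # residualized mpg

cs  <- lm(cars$mpg ~ rx)   # only cyl residualized (the lab's approach)
fwl <- lm(ry ~ rx)         # both mpg and cyl residualized (FWL)

se_full <- coef(summary(full))["cyl", "Std. Error"]
se_cs   <- coef(summary(cs))["rx", "Std. Error"]
se_fwl  <- coef(summary(fwl))["rx", "Std. Error"]

# The FWL residuals coincide with the residuals of the full regression,
# so the two fits share the same residual sum of squares.
same_resid <- all.equal(unname(resid(fwl)), unname(resid(full)))

# Rescaling the FWL standard error by the degrees-of-freedom ratio
# (30 vs. 28 in the outputs above) should recover the full-regression SE.
se_fwl_adj <- se_fwl * sqrt(df.residual(fwl) / df.residual(full))

# The CS residuals also carry the part of mpg explained by disp and hp,
# which inflates the residual sum of squares and hence the SE.
ssr_full <- sum(resid(full)^2)
ssr_cs   <- sum(resid(cs)^2)
ess_x1   <- sum((fitted(est2) - mean(cars$mpg))^2)

print(c(full = se_full, cs = se_cs, fwl = se_fwl, fwl_adjusted = se_fwl_adj))
print(same_resid)
print(all.equal(ssr_cs, ssr_full + ess_x1))
```

If the checks hold, the gap between the FWL and full-regression standard errors is purely a degrees-of-freedom issue, while the larger SE from residualizing only cyl reflects a larger residual variance estimate.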


pollytatouin commented on July 22, 2024

To explain why the coefficient values are the same, I get:

$$\beta_{2FWL} = (X'_2M'_1M_1X_2)^{-1}X'_2M_1Y = (X'_2M'_1M_1X_2)^{-1}X'_2M'_1M_1Y$$

by the idempotency and symmetry of $M_1$ (so that $M'_1M_1 = M_1$).

The cheatsheet (CS) model implies this regression (Y is not residualized):

$$Y = M_1X_2\beta_2 + U$$

$$ \min_{\beta_2} (Y-M_1X_2\beta_2)'(Y-M_1X_2\beta_2) = \min_{\beta_2} \left( Y'Y - Y'M_1X_2\beta_2 - \beta'_2X'_2M'_1Y + \beta'_2X'_2M'_1M_1X_2\beta_2 \right) $$

$$ \text{FOC:} $$

$$ -2X'_2M'_1Y + 2X'_2M'_1M_1X_2\beta_2 = 0 $$

$$ \beta_{2CS} = (X'_2M'_1M_1X_2)^{-1}X'_2M'_1Y = (X'_2M'_1M_1X_2)^{-1}X'_2M'_1M_1Y $$

Again, idempotency lets us reinsert the missing $M_1$, so the two expressions are equivalent: $\beta_{2CS} = \beta_{2FWL}$.

For the variance, though, I haven't yet managed to explain the difference mathematically. From the regression outputs, the FWL approach gives an SE (0.770) much closer to the full-regression value (0.797) than the CS approach does (1.583). I suspect the remaining gap between the full regression and FWL comes from incorrect degrees of freedom: the second-stage regression doesn't know that other parameters were estimated in a first step.
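A possible sketch of the variance comparison, under assumptions that are not stated in the thread: homoskedastic OLS variance formulas, $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$ with $P_1 = I - M_1$, $n$ observations, $k$ parameters in the full regression, and $k_2$ parameters counted by the second-stage regression. Let $\hat{e}$ denote the residuals of the full regression, which equal the FWL residuals:

$$ \hat{e} = M_1Y - M_1X_2\hat{\beta}_2 $$

The CS residuals keep the part of $Y$ explained by $X_1$, and the two pieces are orthogonal because $P_1M_1 = 0$:

$$ Y - M_1X_2\hat{\beta}_2 = P_1Y + \hat{e}, \qquad (P_1Y)'\hat{e} = 0 $$

The three estimated variances would then share the factor $(X'_2M_1X_2)^{-1}$ and differ only in the estimate of the error variance:

$$ \widehat{Var}_{full} = \frac{\hat{e}'\hat{e}}{n-k}(X'_2M_1X_2)^{-1}, \qquad \widehat{Var}_{FWL} = \frac{\hat{e}'\hat{e}}{n-k_2}(X'_2M_1X_2)^{-1}, \qquad \widehat{Var}_{CS} = \frac{\hat{e}'\hat{e} + Y'P_1Y}{n-k_2}(X'_2M_1X_2)^{-1} $$

When the second stage includes its own intercept, as in the R outputs, the extra term in the CS numerator becomes the sum of squares of $P_1Y$ around the mean of $Y$ rather than $Y'P_1Y$. With the degrees of freedom reported in those outputs (28 for the full regression, 30 for the second stage), the FWL case works out as:

$$ \frac{SE_{FWL}}{SE_{full}} = \sqrt{\frac{28}{30}} \approx 0.966, \qquad 0.79728 \times 0.966 \approx 0.7702 $$

which matches the reported FWL standard error, consistent with the degrees-of-freedom explanation.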

