sergiocorreia / ppmlhdfe

Poisson pseudo-likelihood regression with multiple levels of fixed effects

Home Page: http://scorreia.com/software/ppmlhdfe/

License: MIT License

HTML 88.48% Stata 11.46% TeX 0.06%

stata poisson-regression fixed-effects separation high-dimensional-data

ppmlhdfe's Introduction

`ppmlhdfe`: Poisson pseudo-likelihood regression with multiple levels of fixed effects

ppmlhdfe is a Stata package that implements Poisson pseudo-maximum likelihood regressions (PPML) with multi-way fixed effects, as described in Correia, Guimarães, Zylkin (2019a). The estimator employed is robust to statistical separation and convergence issues, due to the procedures developed in Correia, Guimarães, Zylkin (2019b).

Recent updates

Version 2.3 27jun2021: minor changes due to reghdfe's v6 update. Currently, ppmlhdfe is still using the code from reghdfe v5, which the new version ships with. A port is planned at some point in the future, but because some Mata functions changed their behavior, this needs to be done carefully.
Version 2.2 02aug2019: major speedups due to improved IRLS acceleration (see page 7 of the paper) and due to faster separation checks.
Version 2.1 04apr2019: added experimental step-halving. It is not as useful for Poisson models as for other GLMs, so it is turned off by default. You can enable it by including the option use_step_halving(1). Other options you can set are step_halving_memory(0.9) and max_step_halving(2) (default values in parentheses).
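
As a minimal sketch of the step-halving options just described, the calls below reuse the auto dataset and the specification from the test program in the installation instructions. The option names and values are the ones listed in the note above; the choice of dataset and regressors is only illustrative.

```stata
* Illustrative only: same data and model as the README test program
sysuse auto, clear

* Default behaviour: step-halving is turned off
ppmlhdfe price weight, a(turn)

* Enable the experimental step-halving, spelling out the documented defaults
ppmlhdfe price weight, a(turn) use_step_halving(1) step_halving_memory(0.9) max_step_halving(2)
```

On a well-behaved model like this one, both calls would be expected to reach the same estimates; the option matters only when the IRLS iterations struggle to converge.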

Citation

As text

Sergio Correia, Paulo Guimarães, Thomas Zylkin: “Verifying the existence of maximum likelihood estimates for generalized linear models”, 2019; arXiv:1903.01633.

Sergio Correia, Paulo Guimarães, Thomas Zylkin. Fast Poisson estimation with high-dimensional fixed effects. The Stata Journal. 2020;20(1):95-115. doi:10.1177/1536867X20909691

As BibTex

@Misc{ExistenceGLM,
  Author = {Correia, Sergio and Guimar{\~a}es, Paulo and Zylkin, Thomas},
  Title = {Verifying the existence of maximum likelihood estimates for generalized linear models},
  Year = {2019},
  Eprint = {arXiv:1903.01633},
}

@article{ppmlhdfe,
  Author = {Correia, Sergio and Guimar{\~a}es, Paulo and Zylkin, Thomas},
  Title ={{Fast Poisson estimation with high-dimensional fixed effects}},
  Journal = {The Stata Journal},
  Volume = {20},
  Number = {1},
  Pages = {95-115},
  Year = {2020},
  DOI = {10.1177/1536867X20909691},
  URL = {https://doi.org/10.1177/1536867X20909691},
  eprint = {https://doi.org/10.1177/1536867X20909691}
}

References

Quick information on the command can be gleaned from the help file.

For detailed information:

The ppmlhdfe paper explains the command in depth, provides examples, etc.
The paper on statistical separation discusses the crucial step of solving the separation issue, which can otherwise lead to incorrect convergence (or no convergence) in Poisson and other GLMs.

For introductory guides on separation, and on how ppmlhdfe internally addresses it, see the following documents:

Separation primer: a quick practical introduction to separation in Poisson models.
Separation benchmarks: shows how separation affects all common statistical packages.
Undocumented options: this page briefly lists otherwise undocumented options of ppmlhdfe, which might be useful for advanced users.
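
To make the separation issue concrete, here is a small sketch with a made-up dataset (the variables y, x and g and all their values are hypothetical). Every observation in group g == 1 has y == 0, so a Poisson fixed effect for that group has no finite maximum likelihood estimate: it drifts toward minus infinity. These observations are separated by the fixed effect.

```stata
* Hypothetical toy data: group 1 contains only zeros of the outcome
clear
input y x g
0 1.0 1
0 2.0 1
0 0.7 1
1 1.5 2
3 2.5 2
2 0.5 2
4 1.0 3
5 3.0 3
1 0.2 3
end

* Check which groups have an all-zero outcome before estimating
bysort g: egen double ysum = total(y)
tabulate g if ysum == 0

* ppmlhdfe is expected to detect and drop the separated observations
ppmlhdfe y x, absorb(g)
```

If detection works as described in the separation paper, the ppmlhdfe call should print a note about dropped observations, similar to the "dropped ... observations that are either singletons or separated by a fixed effect" message quoted in the issues further down.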

Installation

ppmlhdfe requires the latest versions of ftools and reghdfe.

To install stable versions from SSC:

cap ado uninstall ftools
cap ado uninstall reghdfe
cap ado uninstall ppmlhdfe

ssc install ftools
ssc install reghdfe
ssc install ppmlhdfe

clear all
ftools, compile
reghdfe, compile

* Test program
sysuse auto, clear
reghdfe price weight, a(turn)
ppmlhdfe price weight, a(turn)

To install the latest versions from Github:

* Install ftools
cap ado uninstall ftools
net install ftools, from("https://raw.githubusercontent.com/sergiocorreia/ftools/master/src/")

* Install reghdfe
cap ado uninstall reghdfe
net install reghdfe, from("https://raw.githubusercontent.com/sergiocorreia/reghdfe/master/src/")

* Install ppmlhdfe
cap ado uninstall ppmlhdfe
net install ppmlhdfe, from("https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/src/")

* Create compiled files
ftools, compile
reghdfe, compile

* Check versions
ppmlhdfe, version

* Clear programs already in memory
program drop _all

* Test program
sysuse auto, clear
reghdfe price weight, a(turn)
ppmlhdfe price weight, a(turn)

ppmlhdfe's Issues

Algorithm not working on example separation datasets

Hi to all,

I was implementing the separation algorithm myself and I was testing the example datasets. I just followed the example code in https://github.com/sergiocorreia/ppmlhdfe/blob/master/guides/separation_primer.md and found differences in the results. I checked the example datasets and there were differences between what the example datasets say is separated and the output of the algorithm. Please see these two examples (3 and 4):

import delimited https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/test/separation_datasets/03.csv, clear


* Run IR (iterative rectifier) algorithm
loc tol = 1e-5
gen u =  !y
su u, mean
loc K = ceil(r(sum) / `tol' ^ 2)
gen w = cond(y, `K', 1) 

while 1 {
	qui reghdfe u [fw=w], absorb(id1 id2 id3) resid(e)
	predict double xb, xbd
	qui replace xb = 0 if abs(xb) < `tol'

	* Stop once all predicted values become non-negative
	qui cou if xb < 0
	if !r(N) {
		continue, break
	}

	replace u = max(xb, 0)
	drop xb w
}

rename xb z
gen is_sep = z > 0
list
assert separated == is_sep

(1 contradictions)

import delimited https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/test/separation_datasets/04.csv, clear


* Run IR (iterative rectifier) algorithm
loc tol = 1e-5
gen u =  !y
su u, mean
loc K = ceil(r(sum) / `tol' ^ 2)
gen w = cond(y, `K', 1) 

while 1 {
	qui reghdfe u [fw=w], absorb(id1 id2) resid(e)
	predict double xb, xbd
	qui replace xb = 0 if abs(xb) < `tol'

	* Stop once all predicted values become non-negative
	qui cou if xb < 0
	if !r(N) {
		continue, break
	}

	replace u = max(xb, 0)
	drop xb w
}

rename xb z
gen is_sep = z > 0
list
assert separated == is_sep

(2 contradictions)

Can you please tell me if 1) there is something more to the algorithm not captured in the example code provided, and having that would flag those observations differently; 2) or whether there is something wrong in the example datasets; 3) or those observations are flagged differently by one of the other methods and if so, how to interpret that?

Thanks again for this package. It's great!

Luís

Constant reported if option noconst specified

Code

ppmlhdfe $GLMdep $asymControl, abs (FE3=$fet FE4=$feid, savefe) vce(robust) noconstant keepsingleton

Package not able to be accessed in STATA

When I try to install this package, it tells me the host cannot be found. Has the file moved?

net install ppmlhdfe, from("https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/src/")
host not found
https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/src/ either
  1)  is not a valid URL, or
  2)  could not be contacted, or
  3)  is not a Stata download site (has no stata.toc file).

Failure to detect reghdfe, which has already been installed (Stata 17)

In Stata 17, I have installed ftools, reghdfe, and ppmlhdfe. I use "which + command" to check their versions as follows:

. which ftools
*! version 2.49.1 08aug2023

. which reghdfe
*! version 6.12.3 08aug2023


. which ppmlhdfe
*! version 2.3.0 25feb2021
*! Authors: Sergio Correia, Paulo Guimarães, Thomas Zylkin
*! URL: https://github.com/sergiocorreia/ppmlhdfe

However, when I run regression using ppmlhdfe, I got the following message:

ppmlhdfe requires the reghdfe package (version 6 or newer), which is not installed
    - install from SSC
    - install from Github
(error occurred while loading ppmlhdfe.ado)
r(9);

What should I do to solve this problem? Thanks!

Importance weights in ppmlhdfe

Hi @sergiocorreia,

Thanks so much for writing this amazing software! It's made it so much easier and faster to incorporate poisson regression in my work.

I had a question / request: I have been running some linear regressions with propensity score reweighting in reghdfe and I'd like to run equivalent poisson regressions that also incorporate these weights. When I use the poisson or xtpoisson commands, I believe the analog is "importance weights". However, ppmlhdfe does not support iweights. Would it be possible to add support for iweights, or is there some way to replicate iweights in the current program?

This is sort of related to issue #2, but is distinct from it.

Thank you again,
Adam Sacarny

Add support for options: indiv() group() aggregation()?

Thank you for this great tool!

Is it possible to add these new options from reghdfe to ppmlhdfe as well?

Thank you!

Unable to run Wild Bootstrap

Dear Sergio,

Thanks for this package, it's a real time saver.

I'm trying to use ppmlhdfe clustering on month of birth. Because I only have 12 groups, I want to use wild bootstrapping for my inference, but upon running
boottest treat, boottype(wild)
I get the message:
boottype(wild) not accepted after GMM or Maximum Likelihood-based estimation.

Do you know if that's an issue with ppmlhdfe per se or perhaps with boottest?

Thank you in advance!

The number of dropped singletons in reghdfe and ppmlhdfe

Hi Sergio,

I notice that the number of dropped singletons can be different in reghdfe and ppmlhdfe when the only thing that differs is the command and everything else in cmdline is the same. I note your technical note on singletons as well as your reply on the singletons in a separate issue (#7 (comment)). I can't quite wrap my mind around why the number of dropped singletons is different. Any thoughts/suggestions?

I am using the latest versions from the github repositories:

. ppmlhdfe, version
c:\ado\plus\p\ppmlhdfe.ado
*! version 2.3.0 25feb2021
*! Authors: Sergio Correia, Paulo Guimarães, Thomas Zylkin
*! URL: https://github.com/sergiocorreia/ppmlhdfe

Required packages installed?

ftools yes; version: 2.48.0 29mar2021
reghdfe yes; version: 6.12.2 02Nov2021

Thank you!!

ppmlhdfe aweights

Hi Sergio,

it would be great if you could program the command "aweights" for ppmlhdfe. The commands "fweight" and "pweight" work well.

Thanks!
Corinna

Interpreting ppmlhdfe (Poisson FE) fixed effect estimates

Dear Sergio, Paulo and Thomas,
I have posted this on Statalist, but GitHub might be the better place to do so.

I am using your ppmlhdfe command (thanks for the time spent putting this together!)
The goal is estimating a Poisson model with many levels of fixed effects (i.e. 4 categorical variables some of which are also interacted) that fails to converge using the conventional Poisson command, or even glm .. family(Poisson).

The ppmlhdfe command works well in the sense that i) it converges and ii) it is very fast. It does so by dropping singletons/separated observations.
In the specifications where the Poisson command also converged, the point estimates are identical.

Next, I am trying to do some out of sample prediction, which the command does not allow for, so must be done manually by adding the estimated fixed effects.
Here I am having some trouble understanding the output.

Example with only one binary FE and no other covariate:
Code:

sysuse auto.dta, clear
ppmlhdfe price, absorb(foreign, savefe) d(sumFE)

The options
absorb(..., savefe) save all fixed effect estimates with __hdfe as prefix
d(newvar) save sum of fixed effects as newvar; mandatory if running predict afterwards (except for predict,xb)
Code:

. ppmlhdfe price, absorb(foreign, savefe) d(sumFE)
Iteration 1:   deviance = 8.7262e+04  eps = .         iters = 1    tol = 1.0e-04  min(eta) =   0.75  PS  
Iteration 2:   deviance = 8.6958e+04  eps = 3.50e-03  iters = 1    tol = 1.0e-04  min(eta) =   0.72   S  
Iteration 3:   deviance = 8.6958e+04  eps = 6.06e-07  iters = 1    tol = 1.0e-04  min(eta) =   0.72   S  
Iteration 4:   deviance = 8.6958e+04  eps = 2.13e-14  iters = 1    tol = 1.0e-05  min(eta) =   0.72   S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
Converged in 4 iterations and 4 HDFE sub-iterations (tol = 1.0e-08)
 
HDFE PPML regression                              No. of obs      =         74
Absorbing 1 HDFE group                            Residual df     =         72
                                                  Wald chi2(0)    =          .
Deviance             =  86958.07836               Prob > chi2     =          .
Log pseudolikelihood =  -43866.7452               Pseudo R2       =     0.0028
------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   8.726951   .0555475   157.11   0.000      8.61808    8.835822
------------------------------------------------------------------------------
 
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
     foreign |         2           0           2     |
-----------------------------------------------------+
 
. tab __hdfe1__
 
       [FE] |
  1.foreign |      Freq.     Percent        Cum.
------------+-----------------------------------
  -.0154382 |         52       70.27       70.27
   .0347057 |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00
 
 
. tab sumFE
 
     Sum of |
      fixed |
    effects |      Freq.     Percent        Cum.
------------+-----------------------------------
  -.0160663 |          1        1.35        1.35
  -.0159765 |          1        1.35        2.70
  -.0159186 |          1        1.35        4.05
  -.0159104 |          1        1.35        5.41
  -.0157847 |          1        1.35        6.76
  -.0157775 |          1        1.35        8.11
  -.0157128 |          1        1.35        9.46
  -.0157128 |          1        1.35       10.81
  -.0156133 |          1        1.35       12.16
  -.0155503 |          1        1.35       13.51
  -.0154646 |          1        1.35       14.86
  -.0154554 |          1        1.35       16.22
  -.0154529 |          1        1.35       17.57
  -.0154441 |          1        1.35       18.92
  -.0154263 |          1        1.35       20.27
  -.0154207 |          1        1.35       21.62
    -.01542 |          1        1.35       22.97
  -.0154147 |          1        1.35       24.32
  -.0153939 |          1        1.35       25.68
  -.0153839 |          1        1.35       27.03
  -.0153818 |          1        1.35       28.38
  -.0153807 |          1        1.35       29.73
  -.0153764 |          1        1.35       31.08
  -.0153655 |          1        1.35       32.43
  -.0153627 |          1        1.35       33.78
   -.015358 |          1        1.35       35.14
  -.0153537 |          1        1.35       36.49
  -.0153527 |          1        1.35       37.84
   -.015352 |          1        1.35       39.19
  -.0153472 |          1        1.35       40.54
  -.0153388 |          1        1.35       41.89
   -.015338 |          1        1.35       43.24
  -.0153366 |          1        1.35       44.59
  -.0153348 |          1        1.35       45.95
   -.015333 |          1        1.35       47.30
  -.0153329 |          1        1.35       48.65
  -.0153307 |          1        1.35       50.00
  -.0153183 |          1        1.35       51.35
  -.0153178 |          1        1.35       52.70
  -.0153174 |          1        1.35       54.05
  -.0153168 |          1        1.35       55.41
  -.0153122 |          1        1.35       56.76
  -.0153111 |          1        1.35       58.11
  -.0153097 |          1        1.35       59.46
  -.0153065 |          1        1.35       60.81
  -.0153048 |          1        1.35       62.16
   -.015303 |          1        1.35       63.51
  -.0152949 |          1        1.35       64.86
   -.015293 |          1        1.35       66.22
  -.0152846 |          1        1.35       67.57
  -.0152611 |          1        1.35       68.92
  -.0152606 |          1        1.35       70.27
   .0345068 |          1        1.35       71.62
   .0345368 |          1        1.35       72.97
   .0346048 |          1        1.35       74.32
   .0346062 |          1        1.35       75.68
   .0346532 |          1        1.35       77.03
   .0346829 |          1        1.35       78.38
   .0346917 |          1        1.35       79.73
   .0347084 |          1        1.35       81.08
   .0347104 |          1        1.35       82.43
   .0347203 |          1        1.35       83.78
   .0347233 |          1        1.35       85.14
   .0347257 |          1        1.35       86.49
   .0347354 |          1        1.35       87.84
    .034745 |          1        1.35       89.19
   .0347565 |          1        1.35       90.54
   .0347597 |          1        1.35       91.89
   .0347624 |          1        1.35       93.24
   .0347686 |          1        1.35       94.59
   .0347776 |          1        1.35       95.95
   .0347806 |          1        1.35       97.30
   .0347836 |          1        1.35       98.65
   .0347851 |          1        1.35      100.00
------------+-----------------------------------
      Total |         74      100.00

So, I don’t understand why:

Despite foreign being a binary variable, there is a __hdfe1__ estimate for each of its two values as well as an estimate for the constant. How are the values determined?
Despite only one FE being used, __hdfe1__ and sumFE are not the same
sumFE seems to be different for every observation
when comparing against a standard Poisson command, the predicted values are (slightly) different, using either estimate of the FE. Note that in this simple example ppmlhdfe does not drop any observation.

Code:

predict yhat_ppmlhdfe
gen yhat_ppmlhdfe_manual = exp(_b[_cons] + __hdfe1__)
 
poisson price i.foreign
predict yhat_poisson
 
. su yhat*
 
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
yhat_ppmlh~e |         74    6165.257    143.7014    6068.61   6385.188
yhat_ppmlh~l |         74    6165.257    143.6977   6072.423   6384.682
yhat_poisson |         74    6165.257    143.6979   6072.423   6384.682
 
. 
. compare  yhat_ppmlhdfe yhat_poisson
 
                                        ---------- difference ----------
                            count       minimum      average     maximum
------------------------------------------------------------------------
yhat_pp~e<yhat_po~n            21     -3.812361    -1.270896   -.0355738
yhat_pp~e>yhat_po~n            53      .0171664     .5039798    1.079185
                       ----------
jointly defined                74     -3.812361     .0002987    1.079185
                       ----------
total                          74
 
. compare  yhat_ppmlhdfe_manual yhat_poisson
 
                                        ---------- difference ----------
                            count       minimum      average     maximum
------------------------------------------------------------------------
yhat_pp~l=yhat_po~n            22
yhat_pp~l>yhat_po~n            52      .0004883     .0004883    .0004883
                       ----------
jointly defined                74             0     .0003431    .0004883
                       ----------
total                          74

Any help would be great.

Does not run with Linux OS

Hi Sergio,

Thanks for developing such a great command! It runs perfectly on my Desktop computer, which has a Windows 10 OS. However, when I try to run it on the NBER cluster (Redhat Linux OS), I get the following error message:

accuracy not found in class FixedEffects
(113 lines skipped)
(error occurred while loading ppmlhdfe.ado)
r(3000);

I am happy to provide you with any other info you might want.

Joe Staudt

Out of sample prediction

Reported on Statalist. Not completely sure if it is a bug, but I would be grateful if you could take a look:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1586918-poisson-out-of-sample-prediction

Error occurred in loading ppmlhdfe.ado

Hi all,

I am seeing the following error message when trying to run ppmlhdfe :

Would you have advice how to proceed? Both ftools and reghdfe are installed; reghdfe has been working fine.

Thank you in advance for your help.

Error in Stata 14

Whether I ssc install or net install, I get the following error when I try to run ppmlhdfe in Stata 14:
(error occurred while loading ppmlhdfe.ado)
r(9611);

The hyperlink on "r(9611);" yields no information. I have installed ftools and reghdfe and have no problem implementing reghdfe.

Thanks.

Not working with updated reghdfe/ppml

I updated to the latest reghdfe/ppmlhdfe on SSC. reghdfe (*! version 6.12.3 08aug2023) seems to be working fine. However, ppmlhdfe (*! version 2.3.0 25feb2021) issues the following error message:

ppmlhdfe requires the reghdfe package (version 6 or newer), which is not installed
- install from SSC
- install from Github
(error occurred while loading ppmlhdfe.ado)
r(9);

However, as noted, version 6.12.3 of reghdfe is installed. Please let me know if you need any additional information.

boottest error: Could not generate scores from ppmlhdfe with predict

Hi Sergio,

Thanks a lot for your great work!

I am trying to get the boottest command to work but without much luck. Of course, I might be doing something wrong. Illustration using the sample data from the ppmlhdfe help file (I remove the interaction in the fixed effects because my use case has no interactions):

use "http://fmwww.bc.edu/RePEc/bocode/e/EXAMPLE_TRADE_FTA_DATA" if category=="TOTAL", clear
egen imp = group(isoimp)
egen exp = group(isoexp)
ppmlhdfe trade fta, a(year imp exp) cluster(imp exp) d
boottest fta

This is what I get:

. boottest fta
(bootcluster(imp exp) assumed)
Re-running regression with null imposed.
option constraints not allowed
Error imposing null. Perhaps ppmlhdfe does not accept the constraints(), from(), and iterate() options, as needed.
r(198);

So I tried this:

boottest fta, nonull

And I get:

. boottest fta, nonull
(bootcluster(imp exp) assumed)
Could not generate scores from ppmlhdfe with predict.

Someone posted the same question on Statalist, but has not received a response yet:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1500935-ppmlhdfe-and-bootstrap

Am I doing something wrong?

Thanks a lot,
Greg

Question about validity of Margins Results post PPMLHDFE estimation

Hello Sergio,

I have a question regarding the validity of margins command after ppmlhdfe estimation results when I have fixed effects.
From my online searches, it seems like when you run margins command after you fit a Poisson estimation with fixed effects, the results can be invalid because margins command doesn't incorporate fixed effects. From one of the comments I found from the Stata Forum, margins after xtpoisson and xtlogit with fixed effects produces results that are meaningless because the marginal effects depend on the fixed effects and these are not estimated when you use these commands.

I followed the instructions to calculate marginal effect after ppmlhdfe from here: https://github.com/sergiocorreia/ppmlhdfe/blob/master/guides/undocumented.md#esttab-and-margins-options

I was wondering if the same problems apply to the margins command after ppmlhdfe estimation. Also, if there is a problem, is there any way I can incorporate fixed effects in the margins command after ppmlhdfe estimation?

Thank you very much in advance!

Cannot Install From Github

I had the old files you e-mailed me, so I thought I'd upgrade to this new version. But attempting to execute the installation instructions from the readme yields:


. clear

. cap ado uninstall ppmlhdfe

. net install ppmlhdfe, from("https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/src/")
https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/src/ either
  1)  is not a valid URL, or
  2)  could not be contacted, or
  3)  is not a Stata download site (has no stata.toc file).

current site is still https://raw.githubusercontent.com/sergiocorreia/reghdfe/master/src/
--Break--
r(1);

.

I was able to execute the same commands to (re-)install ftools and gtools from Github without any problems directly before this. I've also verified that I can access "https://raw.githubusercontent.com/sergiocorreia/ppmlhdfe/master/src/stata.toc" with a browser. Any clue what might be going on? I am using Stata/MP 13.1 for Mac (64-bit Intel).

P.S. Off-topic, but thank you so much for this project. PPMLHDFE has been a big time-saver for me -- and even more important that it runs regressions that previously just would not run! Really a great piece of software.

error 9003 in Stata, mu has infinite values on iteration 10;

Dear Sergio Correia,

I am trying to run the manual RESET Test and I keep getting the following error:

.  qui asdoc ppmlhdfe gvc_total regulatory_distance applied_tariff dist_w rta if exp
> !=imp, a(exp_time imp_time pair_id) vce(cluster pair_id) replace

. predict fit, xb
(222,075 missing values generated)

. generate fit2 = fit^2
(222,075 missing values generated)



.  asdoc ppmlhdfe gvc_total regulatory_distance applied_tariff dist_w rta fit2  
> if exp!=imp, a(exp_time imp_time pair_id) vce(cluster pair_id) replace
warning: dependent variable takes very low values after standardizing (4.3599e-2
> 1)
Iteration 1:   deviance = 1.6027e+11  eps = .         iters = 6    tol = 1.0e-04
>   min(eta) =  -6.95  P   
Iteration 2:   deviance = 5.8990e+10  eps = 1.72e+00  iters = 9    tol = 1.0e-04
>   min(eta) =  -9.40      
Iteration 3:   deviance = 2.1738e+10  eps = 1.71e+00  iters = 8    tol = 1.0e-04
>   min(eta) = -12.59      
Iteration 4:   deviance = 8.0361e+09  eps = 1.71e+00  iters = 8    tol = 1.0e-04
>   min(eta) = -14.38      
Iteration 5:   deviance = 4.2509e+43  eps = 5.29e+33  iters = 8    tol = 1.0e-04
>   min(eta) = -274.18      
Iteration 6:   deviance = 1.5638e+43  eps = 1.72e+00  iters = 43   tol = 1.0e-04
>   min(eta) = -275.69      
Iteration 7:   deviance = 1.2081e+44  eps = 6.73e+00  iters = 32   tol = 1.0e-04
>   min(eta) = -46.21   S  
Iteration 8:   deviance = 4.4444e+43  eps = 1.72e+00  iters = 47   tol = 1.0e-04
>   min(eta) = -47.68      
Iteration 9:   deviance = 0.0000e+00  eps = 0.00e+00  iters = 162  tol = 1.0e-04
>   min(eta) =      .      
mu has infinite values on iteration 10; aborting
r(9003);

Would you know the reason why this error occurs?
Thank you in advance for your time

Does ppmlhdfe use the same sample size as poisson?

Dear Sergio,

I compared the results of the ppmlhdfe and poisson commands in Stata by running the code below:

sysuse auto, clear
ppmlhdfe price weight, a(turn)
poisson price weight i.turn,vce(robust)

The two commands report different numbers of observations.

. ppmlhdfe price weight, a(turn)
(dropped 4 observations that are either singletons or separated by a fixed effect)
Iteration 1:   deviance = 2.7681e+04  eps = .         iters = 1    tol = 1.0e-04  min(eta) =   0.12  P
Iteration 2:   deviance = 2.7317e+04  eps = 1.33e-02  iters = 1    tol = 1.0e-04  min(eta) =   0.09
Iteration 3:   deviance = 2.7317e+04  eps = 7.90e-06  iters = 1    tol = 1.0e-04  min(eta) =   0.09
Iteration 4:   deviance = 2.7317e+04  eps = 8.09e-12  iters = 1    tol = 1.0e-05  min(eta) =   0.09
Iteration 5:   deviance = 2.7317e+04  eps = 1.91e-16  iters = 1    tol = 1.0e-07  min(eta) =   0.09   S O
------------------------------------------------------------------------------------------------------------
(legend: p: exact partial-out   s: exact solver   h: step-halving   o: epsilon below tolerance)
Converged in 5 iterations and 5 HDFE sub-iterations (tol = 1.0e-08)

HDFE PPML regression                              No. of obs      =         70
Absorbing 1 HDFE group                            Residual df     =         55
                                                  Wald chi2(1)    =      71.89
Deviance             =  27316.76429               Prob > chi2     =     0.0000
Log pseudolikelihood = -14025.19017               Pseudo R2       =     0.6592
------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   .0007682   .0000906     8.48   0.000     .0005906    .0009457
       _cons |    6.33763   .2750445    23.04   0.000     5.798553    6.876707
------------------------------------------------------------------------------

. poisson price weight i.turn,vce(robust)

Iteration 0:   log pseudolikelihood =  -14047.47
Iteration 1:   log pseudolikelihood = -14046.088
Iteration 2:   log pseudolikelihood = -14046.088

Poisson regression                              Number of obs     =         74
                                                Wald chi2(14)     =          .
                                                Prob > chi2       =          .
Log pseudolikelihood = -14046.088               Pseudo R2         =     0.6807

------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |   .0007682   .0000906     8.48   0.000     .0005907    .0009457

As you can see, the reported numbers of observations differ, although the "weight" coefficient and standard error are the same. Does this mean ppmlhdfe uses the same sample as poisson? That is, although poisson reports 74 observations, does it effectively use only 70 of them to estimate the coefficient and standard errors?

My point is that if two estimations used different samples, they would be unlikely to produce the same coefficients and standard errors, right? Is it true that, in both regressions, 4 observations are omitted as a result of including i.turn (the fixed effect) in the regression? Thanks.
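
One way to examine this question empirically is sketched below. It rests on two assumptions that are not stated in the thread: that ppmlhdfe marks its estimation sample in e(sample), as Stata estimation commands conventionally do, and that the 4 dropped observations are singletons (price is strictly positive in the auto data, so separation seems unlikely here). The variable names n_in_group and in_sample are hypothetical. The underlying idea is that a singleton group with its own dummy is fitted exactly by that dummy in a Poisson model, so it carries no information about the slope on weight.

```stata
sysuse auto, clear

* Count observations whose value of turn appears only once (singleton groups);
* compare this count with the number ppmlhdfe reports as dropped
bysort turn: gen byte n_in_group = _N
count if n_in_group == 1

* Estimate with ppmlhdfe and flag the observations it kept
* (assumes e(sample) is set by ppmlhdfe)
ppmlhdfe price weight, a(turn)
gen byte in_sample = e(sample)

* Re-run poisson on the full sample and on the ppmlhdfe sample
poisson price weight i.turn, vce(robust)
poisson price weight i.turn if in_sample, vce(robust)
```

If the assumptions hold, both poisson runs should give the same coefficient on weight as ppmlhdfe, which would suggest that the extra observations counted by poisson do not influence that estimate.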
