help for hdfe.ado
Title
hdfe -- Partial-out variables with respect to a series of fixed effects
Syntax
hdfe varlist [weight], absorb(absvars) [generate(stubname) | clear]
        [clustervars(varlist) partial(varlist) dropsingletons
        sample(newvarname) cores(#) verbose(#) tolerance(#)
        maxiterations(#) maximize_options]
Notes:
- This is a programmers' command, like -avar-. For a detailed explanation
and comments, see the help and website for the reghdfe package.
- It does not accept time-series operators or factor variables.
- varlist and clustervars must be fully spelled out (no abbreviations or
wildcards; use unab beforehand). This is not required for the absvars.
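As a sketch of the unab step mentioned in the notes, a calling program might expand wildcards into full variable names before invoking hdfe. The auto dataset, the macro names vars and clust, and the stub R_ are illustrative choices, not part of hdfe, and the example has not been checked against a specific hdfe release.

```stata
sysuse auto, clear
* Expand wildcards into fully spelled-out variable names
unab vars : pri* wei* len*
unab clust : for*
display "`vars'"
* Pass the expanded lists to hdfe (the hdfe package must be installed)
hdfe `vars', a(turn trunk) gen(R_) clustervars(`clust')
```

The absvars inside a() are left as typed, since the notes state that they do not need to be expanded.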
Absvars:
absvar              Description
--------------------------------------------------------------------------
i.varname           indicators for each level of varname (the i.
                    prefix is tacit and can be omitted).
var1#var2           indicators for each combination of levels of var1
                    and var2 (same as i.var1#i.var2).
var1#c.var2         indicators for each level of var1, multiplied by
                    var2
var1##c.var2        equivalent to "i.var1 i.var1#c.var2", but much
                    faster (the two sets of fixed effects are
                    absorbed jointly at each iteration)
--------------------------------------------------------------------------
Notes:
- Each absvar in the absvars list represents a fixed effect that you wish
to absorb (like individual, firm or time).
- It is good practice to put the absvars with more dimensions first.
- Interactions (e.g. x#z) are supported. Using categorical interactions
is faster than running egen group(...) beforehand.
- To partial-out fixed slopes (and not just fixed intercepts), use
continuous interactions (e.g. x#c.z).
- Each absvar can contain any number of categorical interactions (e.g.
i.var1#i.var2#i.var3) but at most one continuous interaction (thus,
i.var1#c.var2#c.var3 is not allowed).
- The first absvar cannot contain a continuous variable (i.var1#c.var2 is
not allowed, although i.var1##c.var2 is ok).
- When saving fixed effects and using ## interactions, remember that
newvar=varname1##c.varname2 will be expanded to "newvar=varname1
newvar_slope=varname1#c.varname2"
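The absvar forms described above might be combined as in the following sketch. It assumes the auto dataset and uses made-up stub names (A_, B_, C_, D_) and the made-up name FE; it follows the syntax documented here but has not been run against a specific hdfe release.

```stata
sysuse auto, clear
* Fixed intercepts for turn and for each foreign-by-trunk combination
hdfe price weight, a(turn foreign#trunk) gen(A_)
* Fixed intercepts for turn, plus foreign-specific slopes on length
hdfe price weight, a(turn foreign#c.length) gen(B_)
* Intercepts and slopes by foreign, absorbed jointly
hdfe price weight, a(foreign##c.length) gen(C_)
* Saving the fixed effects: per the note above, FE=foreign##c.length
* should expand to FE (intercepts) and FE_slope (slopes)
hdfe price, a(FE=foreign##c.length) gen(D_)
describe FE FE_slope
```

In the second call the continuous interaction is placed second, because the first absvar cannot contain a continuous variable unless it uses the ## form.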
Summary of Options:
options                 Description
--------------------------------------------------------------------------
Model
* absorb(absvars)       identifiers of the fixed effects that will be
                        absorbed
  absorb(..., savefe)   save the fixed effects with autogenerated names
                        __hdfe*__; useful when running predict
                        afterwards
  generate(stubname)    do not overwrite the original variables; instead
                        create new demeaned variables with the stubname
                        prefix
  clear                 overwrite the dataset, leaving the transformed
                        variables as well as some ancillary ones (such
                        as the fixed effects, weights, and cluster
                        variables). Use char list to see details of
                        those ancillary variables.
  clustervars(varlist)  list of variables containing cluster categories.
                        These are used to give a more accurate count of
                        the degrees of freedom lost due to the fixed
                        effects, as reported in r(df_a).
  partial(varlist)      partial-out the variables in the given varlist,
                        in addition to the fixed effects indicated in
                        absorb(). Also returns r(df_partial) with the
                        number of partialled-out variables (excluding
                        collinear ones).
  dropsingletons        remove singleton groups from the sample; once per
                        absvar.
  sample(newvarname)    save the equivalent of e(sample) in this
                        variable; useful when dropping singletons. Used
                        with the generate() option.
  cores(#)              run the demeaning algorithm in # parallel
                        instances.
  verbose(#)            amount of debugging information to show (0=None,
                        1=Some, 2=More, 3=Parsing/convergence details,
                        4=Every iteration)
  maxiterations(#)      specify the maximum number of iterations; default
                        is maxiterations(1000); 0 means run forever
                        until convergence
  maximize_options      there are several advanced maximization options,
                        useful for tweaking the iteration. See the help
                        for reghdfe for details.
  version               report the version number and date of hdfe, and
                        save it in e(version). This is a standalone
                        option.
--------------------------------------------------------------------------
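A possible use of dropsingletons, sample() and partial() is sketched below. The auto dataset, the variable name insample, and the stubs R_ and P_ are illustrative assumptions; only options listed in the table are used, and the example has not been run against a specific hdfe release.

```stata
sysuse auto, clear
* Drop singleton groups and record which observations were kept
hdfe price weight, a(turn trunk) gen(R_) dropsingletons sample(insample)
* Inspect the sample marker (tabulate avoids assuming how excluded
* observations are coded)
tabulate insample, missing
* Partial out a control variable together with the fixed effects
hdfe price weight, a(turn trunk) gen(P_) partial(length)
display r(df_partial)
```

Because sample() is documented as used with generate(), both calls create new variables rather than using clear.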
Recovering Fixed Effects
You can use hdfe again to recover the fixed effects. For instance, in the
least-squares case:
. sysuse auto, clear
. * Demean variables
. hdfe price weight length, a(turn trunk) gen(RESID_)
. * Run regression
. reg RESID_*
. * Predict using original variables
. drop RESID_*
. rename (price weight length) RESID_=
. predict double resid, resid
. rename RESID_* *
. * Obtain fixed effects
. hdfe resid, a(FE1=turn FE2=trunk) gen(temp_)
. * Benchmark and verification
. reghdfe price weight length, a(BENCH1=turn BENCH2=trunk)
. gen double delta = abs(BENCH1-FE1) + abs(BENCH2-FE2)
. su delta
Stored results
hdfe stores the following in r():
Scalars
  r(df_a)       degrees of freedom lost due to the fixed effects
                (taking into account the cluster structure and
                whether the FEs are nested within the clusters)
  r(N_hdfe)     number of sets of fixed effects
  r(df_a#)      degrees of freedom lost due to the #th fixed effect
                (excluding those collinear with the first #-1 FEs)
Macros
  r(hdfe#)      canonical expansion of the fixed effects (e.g.
                for#turn is expanded into i.foreign#i.turn)
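The stored results might be inspected right after a call, as in this sketch. It assumes the auto dataset and the illustrative stub R_, and it assumes that r(hdfe1) and r(df_a1) refer to the first absvar in the list; it has not been run against a specific hdfe release.

```stata
sysuse auto, clear
hdfe price weight, a(foreign#turn trunk) gen(R_)
* Show everything hdfe left in r()
return list
display "Sets of fixed effects: " r(N_hdfe)
display "Total df absorbed:     " r(df_a)
display "df for first absvar:   " r(df_a1)
display "First absvar expanded: `r(hdfe1)'"
```

Results in r() are replaced by the next r-class command, so copy any values needed later into local macros or scalars immediately after the hdfe call.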
Author
Sergio Correia
Fuqua School of Business, Duke University
Email: sergio.correia@duke.edu
Latest Updates
reghdfe and hdfe are updated frequently, and upgrades or minor bug fixes
may not be immediately available on SSC. To check or contribute to the
latest version of reghdfe, explore the GitHub repository.