Panel data
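For panel data, the usual setup is:
$$y_{it} = \alpha + x_{it}\beta + u_i + e_{it},$$
where $i$ indexes individuals, $t$ indexes time periods, $u_i$ is the unobserved individual effect, and $e_{it}$ is the idiosyncratic error.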
Fixed effect
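A fixed-effects model can be estimated with OLS on
$$y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)\beta + (e_{it} - \bar{e}_i),$$
where $\bar{y}_i$, $\bar{x}_i$ and $\bar{e}_i$ are the means over time for individual $i$.
This is simply OLS on de-meaned $y$ and $x$; the de-meaning removes any time-invariant individual effect.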
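The de-meaning step can be sketched by hand in Stata. This is only an illustration under stated assumptions: it uses the nlswork data, restricts the sample to complete cases so that the means are taken over the estimation sample, and the variable suffixes `_bar` and `_dm` are names chosen for this sketch.

```stata
* Sketch only: within (de-meaning) transformation by hand
webuse nlswork, clear
* Keep complete cases so the individual means match the estimation sample
keep if !missing(ln_wage, tenure, age)
foreach v of varlist ln_wage tenure age {
    * Individual-level mean, then deviation from that mean
    egen double `v'_bar = mean(`v'), by(idcode)
    generate double `v'_dm = `v' - `v'_bar
}
* OLS on de-meaned y and x; de-meaning also removes the constant
regress ln_wage_dm tenure_dm age_dm, noconstant
* Built-in within estimator for comparison
xtreg ln_wage tenure age, fe
```

The slope estimates from the two commands should agree. The standard errors reported by `regress` will not match, because it does not account for the degrees of freedom used up by the individual means.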
Random effect
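The random-effects model looks at this from a different angle: it treats the individual effect $u_i$ as part of the error term, so the error term has two components. Suppose their standard deviations are estimated as $\hat\sigma_e$ (idiosyncratic component) and $\hat\sigma_u$ (individual component). Then we can apply the GLS transformation:
$$y_{it}^{*} = y_{it} - \hat\theta_i \bar{y}_i$$
and
$$\hat\theta_i = 1 - \sqrt{\frac{\hat\sigma_e^2}{T_i \hat\sigma_u^2 + \hat\sigma_e^2}},$$
where $T_i$ is the number of observations for individual $i$ and $\bar{y}_i$ is the mean of $y_{it}$ for individual $i$.
Given estimates of $\sigma_e$ and $\sigma_u$, we can run OLS on the transformed variables (including $y$, all $x$’s, and the constant, each transformed in the same way). We can iterate the process, updating the estimates of $\sigma_e$ and $\sigma_u$ from the new residuals.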
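The GLS transformation can be sketched by hand as follows. This is an illustration, not a definitive implementation: it takes the variance-component estimates from `xtreg, re` (stored in `e(sigma_u)` and `e(sigma_e)`) rather than estimating them from scratch, computes the weight $1 - \sqrt{\sigma_e^2 / (T_i \sigma_u^2 + \sigma_e^2)}$ for each individual, and then runs OLS on the transformed variables. The names `theta`, `T_i`, `cons_q` and the `_q` suffix are chosen for this sketch.

```stata
* Sketch only: random-effects GLS transformation by hand
webuse nlswork, clear
* Keep complete cases so T_i counts the observations actually used
keep if !missing(ln_wage, tenure, age)
* Take the variance-component estimates from the built-in RE estimator
xtreg ln_wage tenure age, re
scalar s_u = e(sigma_u)
scalar s_e = e(sigma_e)
* Number of observations per individual and the resulting weight
bysort idcode: generate double T_i = _N
generate double theta = 1 - sqrt(s_e^2 / (T_i*s_u^2 + s_e^2))
foreach v of varlist ln_wage tenure age {
    * Subtract theta times the individual mean (quasi-de-meaning)
    egen double `v'_bar = mean(`v'), by(idcode)
    generate double `v'_q = `v' - theta*`v'_bar
}
* The constant is transformed too
generate double cons_q = 1 - theta
* OLS on the transformed variables
regress ln_wage_q tenure_q age_q cons_q, noconstant
```

Under these assumptions the coefficients on `tenure_q`, `age_q` and `cons_q` should closely reproduce the `xtreg, re` estimates for `tenure`, `age` and `_cons`. When the weight is 0 this collapses to pooled OLS, and as it approaches 1 it approaches the within estimator.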
Example
Stata 19 implements the correlated random-effects (CRE) model in the xtreg command through the cre option. The following example is from Stata’s website and uses the nlswork dataset.
webuse nlswork
xtreg ln_wage tenure age i.race, cre vce(cluster idcode)
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)
note: 2.race omitted from xt_means because of collinearity.
note: 3.race omitted from xt_means because of collinearity.
Correlated random-effects regression            Number of obs     =     28,101
Group variable: idcode                          Number of groups  =      4,699

R-squared:                                      Obs per group:
     Within  = 0.1296                                         min =          1
     Between = 0.2346                                         avg =        6.0
     Overall = 0.1890                                         max =         15

                                                Wald chi2(4)      =    1685.18
corr(xit_vars*b, xt_means*γ) = 0.5474           Prob > chi2       =     0.0000

                             (Std. err. adjusted for 4,699 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
xit_vars     |
      tenure |   .0211313   .0012113    17.44   0.000     .0187572    .0235055
         age |   .0121949   .0007414    16.45   0.000     .0107417     .013648
             |
        race |
      Black  |  -.1312068   .0117856   -11.13   0.000    -.1543061   -.1081075
      Other  |   .1059379   .0593177     1.79   0.074    -.0103225    .2221984
             |
       _cons |     1.2159   .0306965    39.61   0.000     1.155736    1.276064
-------------+----------------------------------------------------------------
xt_means     |
      tenure |   .0376991    .002281    16.53   0.000     .0332283    .0421698
         age |  -.0011984   .0013313    -0.90   0.368    -.0038077    .0014109
             |
        race |
      Black  |          0  (omitted)
      Other  |          0  (omitted)
-------------+----------------------------------------------------------------
     sigma_u |  .33334407
     sigma_e |  .29808194
         rho |  .55567161   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Mundlak test (xt_means = 0): chi2(2) = 331.5144           Prob > chi2 = 0.0000
For comparison, here is the corresponding fixed-effects model:
webuse nlswork
xtreg ln_wage tenure age i.race, fe vce(cluster idcode)
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)
note: 2.race omitted because of collinearity.
note: 3.race omitted because of collinearity.
Fixed-effects (within) regression               Number of obs     =     28,101
Group variable: idcode                          Number of groups  =      4,699

R-squared:                                      Obs per group:
     Within  = 0.1296                                         min =          1
     Between = 0.1916                                         avg =        6.0
     Overall = 0.1456                                         max =         15

                                                F(2, 4698)        =     766.79
corr(u_i, Xb) = 0.1302                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,699 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      tenure |   .0211313   .0012112    17.45   0.000     .0187568    .0235059
         age |   .0121949   .0007414    16.45   0.000     .0107414    .0136483
             |
        race |
      Black  |          0  (omitted)
      Other  |          0  (omitted)
             |
       _cons |   1.256467   .0194187    64.70   0.000     1.218397    1.294537
-------------+----------------------------------------------------------------
     sigma_u |  .39034493
     sigma_e |  .29808194
         rho |  .63165531   (fraction of variance due to u_i)
------------------------------------------------------------------------------
The coefficient estimates for tenure and age are the same in both models, but the CRE model also lets you estimate the effect of race, which the fixed-effects model omits because race does not vary within an individual.
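We can also do this manually by fitting an RE model (the xtreg default) on the time-varying regressors, their individual-level means, and the time-invariant regressors: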
webuse nlswork
egen age_mean = mean(age), by(idcode)
egen tenure_mean = mean(tenure), by(idcode)
xtreg ln_wage tenure tenure_mean age age_mean i.race, vce(cluster idcode)
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)
(1 missing value generated)
(12 missing values generated)
Random-effects GLS regression                   Number of obs     =     28,101
Group variable: idcode                          Number of groups  =      4,699

R-squared:                                      Obs per group:
     Within  = 0.1296                                         min =          1
     Between = 0.2346                                         avg =        6.0
     Overall = 0.1890                                         max =         15

                                                Wald chi2(6)      =    2688.49
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                             (Std. err. adjusted for 4,699 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      tenure |   .0211328    .001211    17.45   0.000     .0187592    .0235064
 tenure_mean |   .0376962   .0022835    16.51   0.000     .0332207    .0421718
         age |   .0121935   .0007412    16.45   0.000     .0107409    .0136462
    age_mean |  -.0011922   .0013353    -0.89   0.372    -.0038094     .001425
             |
        race |
      Black  |  -.1312062    .011785   -11.13   0.000    -.1543044    -.108108
      Other  |   .1059734   .0593175     1.79   0.074    -.0102868    .2222337
             |
       _cons |   1.215738   .0307699    39.51   0.000      1.15543    1.276046
-------------+----------------------------------------------------------------
     sigma_u |  .33338665
     sigma_e |  .29808194
         rho |  .55573468   (fraction of variance due to u_i)
------------------------------------------------------------------------------
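This is essentially what Stata’s cre option does behind the scenes. The estimates differ slightly from the cre output (for example, .0211328 versus .0211313 for tenure), most likely because egen computes the means over all observations with nonmissing values of each variable rather than over the estimation sample only.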
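The Mundlak test reported at the bottom of the cre output can be approximated from the manual regression with a joint Wald test on the mean terms. This is a minimal sketch that reuses the commands above and the variable names age_mean and tenure_mean. Because the means are computed by hand, the statistic should be close to, but not exactly, the 331.5144 reported by the cre option.

```stata
* Sketch only: joint test that the coefficients on the individual means are zero
webuse nlswork, clear
egen age_mean = mean(age), by(idcode)
egen tenure_mean = mean(tenure), by(idcode)
xtreg ln_wage tenure tenure_mean age age_mean i.race, vce(cluster idcode)
* Wald test with 2 restrictions, one for each mean term
test tenure_mean age_mean
```

Rejecting the null suggests that the individual effects are correlated with the regressors, which favors the fixed-effects or CRE specification over the plain random-effects model.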