This post is a demo for doing the same multi-level models with R and stata. R’s multilevel models are mostly done with lme4 and nlme. Stata’s multilevel models are mostly done with mixed and menl. The two packages are not exactly the same. For example, mixed does not allow complicated structure for the residual. That’s why we need menl. Likewise, in R lme4’s “lmer” does not allow this specification. We have to use nlme. But it’s good to know how to do the same thing in both packages.
For example, we want to do a consensus emergence model (CEM) as described in Lang et al. 2012(https://core.ac.uk/download/pdf/158346339.pdf). The CEM model is basically a growth model with the residual being a exponential function of the time variable. We want to do the same thing in both R and stata.
We start with simpler models.
Stata and R syntax comparison for the same three level model
We run a three level model specified in stata with mixed and menl. In R with nlme and lme4. This is a random intercept model. Only intercept is varying for each unit at each level. In this data set supposedly state is nested within region. There are situations that levels are not nested, then we need to specify cross effect models. We do not discuss that here. What bothers me is that nlme’s lme function takes the first level as higher level. So I have to put region before state. Otherwise, it seems like it treated regions nested within states?
There are different ways to specify this model in nlme. I show two ways here (including lme2 which is commented out).
Stata
* let's see how to use menl
webuse productivity
* to replicate mixed with menl
mixed gsp || region: || state:
menl gsp = {b1:}, define(b1: U1[region] UU1[region>state])
(Public capital productivity)
Performing EM optimization Performing gradient-based optimization:
Iteration 0: Log likelihood = 200.86162
Iteration 1: Log likelihood = 200.86162
Computing standard errors ...
Mixed-effects ML regression Number of obs = 816
Grouping information
-------------------------------------------------------------
| No. of Observations per group
Group variable | groups Minimum Average Maximum
----------------+--------------------------------------------
region | 9 51 90.7 136
state | 48 17 17.0 17
-------------------------------------------------------------
Wald chi2(0) = .
Log likelihood = 200.86162 Prob > chi2 = .
------------------------------------------------------------------------------
gsp | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
_cons | 10.65961 .2503806 42.57 0.000 10.16887 11.15035
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects parameters | Estimate Std. err. [95% conf. interval]
-----------------------------+------------------------------------------------
region: Identity |
var(_cons) | .4376123 .2697616 .13073 1.464887
-----------------------------+------------------------------------------------
state: Identity |
var(_cons) | .6080626 .1382733 .3893904 .9495357
-----------------------------+------------------------------------------------
var(Residual) | .0246633 .0012586 .0223159 .0272577
------------------------------------------------------------------------------
LR test vs. linear model: chi2(2) = 2750.56 Prob > chi2 = 0.0000
Note: LR test is conservative and provided only for reference.
Obtaining starting values by EM:
Alternating PNLS/LME algorithm:
Iteration 1: Linearization log likelihood = 200.86162
Iteration 2: Linearization log likelihood = 200.86162
Computing standard errors:
Mixed-effects ML nonlinear regression Number of obs = 816
Grouping information
------------------------------------------------------------------
| No. of Observations per group
Path | groups Minimum Average Maximum
---------------------+--------------------------------------------
region | 9 51 90.7 136
region>state | 48 17 17.0 17
------------------------------------------------------------------
Linearization log likelihood = 200.86162
b1: U1[region] UU1[region>state]
------------------------------------------------------------------------------
gsp | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
b1 |
_cons | 10.65961 .2505341 42.55 0.000 10.16857 11.15065
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects parameters | Estimate Std. err. [95% conf. interval]
-----------------------------+------------------------------------------------
region: Identity |
var(U1) | .4376119 .26865 .1313821 1.457612
-----------------------------+------------------------------------------------
region>state: Identity |
var(UU1) | .6080625 .1381509 .389544 .9491612
-----------------------------+------------------------------------------------
var(Residual) | .0246633 .0012586 .0223159 .0272577
------------------------------------------------------------------------------
R
These two R packages return same results. Note that lmer reports variance, while lme reports standard deviation.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.3 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(haven)
library(dplyr)
library(readxl)
library(nlme)
Attaching package: 'nlme'
The following object is masked from 'package:dplyr':
collapse
library(lme4)
Loading required package: Matrix
Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':
expand, pack, unpack
Attaching package: 'lme4'
The following object is masked from 'package:nlme':
lmList
data <- read_dta('https://www.stata-press.com/data/r18/productivity.dta')
# lme4 for three level model
lmer1 <- lmer(gsp ~ 1 + (1|region) + (1|state), data=data)
summary(lmer1)
Linear mixed model fit by REML ['lmerMod']
Formula: gsp ~ 1 + (1 | region) + (1 | state)
Data: data
REML criterion at convergence: -400.9
Scaled residuals:
Min 1Q Median 3Q Max
-2.8328 -0.6757 0.0474 0.6382 3.3928
Random effects:
Groups Name Variance Std.Dev.
state (Intercept) 0.60775 0.7796
region (Intercept) 0.50977 0.7140
Residual 0.02466 0.1570
Number of obs: 816, groups: state, 48; region, 9
Fixed effects:
Estimate Std. Error t value
(Intercept) 10.665 0.266 40.1
lme1 <- lme(gsp ~ 1, random= list( region=~1, state=~1), data=data)
#or
#lme1 <- lme(gsp ~ 1, random= list(region = pdSymm(~1), state=pdSymm(~1)), data=data)
#but
#lme1 <- lme(gsp ~ 1, random= list( state=~1, region=~1), data=data)
#would be a different result!
#The other way to specify:
#lme2 <- lme(gsp ~ 1, random= ~ 1 | region/state, data=data)
summary(lme1)
Linear mixed-effects model fit by REML
Data: data
AIC BIC logLik
-392.8509 -374.0381 200.4254
Random effects:
Formula: ~1 | region
(Intercept)
StdDev: 0.7139792
Formula: ~1 | state %in% region
(Intercept) Residual
StdDev: 0.7795814 0.1570457
Fixed effects: gsp ~ 1
Value Std.Error DF t-value p-value
(Intercept) 10.66483 0.2659842 768 40.09574 0
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.83282326 -0.67572390 0.04744387 0.63817953 3.39281358
Number of Observations: 816
Number of Groups:
region state %in% region
9 48
A growth model in R and stata
We implement a growth model in R and stata. The results are somewhat different. The results may not be that reliable given that the year slope random component is tiny.
Stata
* same growth model with mixed and menl
webuse productivity
mixed gsp year || region: year|| state:
menl gsp = {b2:} + {U1[region]}*year, define(b2: year U0[region] U11[region>state])
(Public capital productivity)
Performing EM optimization ...
Performing gradient-based optimization:
Iteration 0: Log likelihood = 784.59336
Iteration 1: Log likelihood = 784.64562
Iteration 2: Log likelihood = 784.65341
Iteration 3: Log likelihood = 784.65343
Iteration 4: Log likelihood = 784.65344
Computing standard errors ...
Mixed-effects ML regression Number of obs = 816
Grouping information
-------------------------------------------------------------
| No. of Observations per group
Group variable | groups Minimum Average Maximum
----------------+--------------------------------------------
region | 9 51 90.7 136
state | 48 17 17.0 17
-------------------------------------------------------------
Wald chi2(1) = 2744.51
Log likelihood = 784.65344 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
gsp | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
year | .0274903 .0005247 52.39 0.000 .0264618 .0285188
_cons | -43.71613 1.067739 -40.94 0.000 -45.80886 -41.6234
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects parameters | Estimate Std. err. [95% conf. interval]
-----------------------------+------------------------------------------------
region: Independent |
var(year) | 5.44e-16 5.06e-13 0 .
var(_cons) | .4380961 .2700722 .1308672 1.466587
-----------------------------+------------------------------------------------
state: Identity |
var(_cons) | .6091766 .1382625 .3904355 .9504672
-----------------------------+------------------------------------------------
var(Residual) | .0053926 .0002752 .0048793 .0059598
------------------------------------------------------------------------------
LR test vs. linear model: chi2(3) = 3903.81 Prob > chi2 = 0.0000
Note: LR test is conservative and provided only for reference.
Obtaining starting values by EM:
Alternating PNLS/LME algorithm:
Iteration 1: Linearization log likelihood = 784.65134
Iteration 2: Linearization log likelihood = 784.65343
Iteration 3: Linearization log likelihood = 784.65343
Computing standard errors:
Mixed-effects ML nonlinear regression Number of obs = 816
Grouping information
------------------------------------------------------------------
| No. of Observations per group
Path | groups Minimum Average Maximum
---------------------+--------------------------------------------
region | 9 51 90.7 136
region>state | 48 17 17.0 17
------------------------------------------------------------------
Wald chi2(1) = 2737.78
Linearization log likelihood = 784.65343 Prob > chi2 = 0.0000
b2: year U0[region] U11[region>state]
------------------------------------------------------------------------------
gsp | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
b2 |
year | .0274903 .0005254 52.32 0.000 .0264605 .02852
_cons | -43.71615 1.069039 -40.89 0.000 -45.81143 -41.62087
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects parameters | Estimate Std. err. [95% conf. interval]
-----------------------------+------------------------------------------------
region: Independent |
var(U0) | .4378219 .2687845 .131442 1.458347
var(U1) | 5.42e-13 2.67e-10 0 .
-----------------------------+------------------------------------------------
region>state: Identity |
var(U11) | .6091976 .1381491 .3905974 .9501388
-----------------------------+------------------------------------------------
var(Residual) | .0053926 .0002752 .0048793 .0059598
------------------------------------------------------------------------------
R
# lme4 for three level model, growth model
lmer_growth1 <- lmer(gsp ~ 1 + year + (1|region: year) + (1|region) + (1|state), data=data, REML=FALSE)
summary(lmer_growth1)
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: gsp ~ 1 + year + (1 | region:year) + (1 | region) + (1 | state)
Data: data
AIC BIC logLik deviance df.resid
-1772.4 -1744.1 892.2 -1784.4 810
Scaled residuals:
Min 1Q Median 3Q Max
-3.8497 -0.4675 -0.0397 0.5098 4.0558
Random effects:
Groups Name Variance Std.Dev.
region:year (Intercept) 0.002360 0.04858
state (Intercept) 0.609352 0.78061
region (Intercept) 0.437469 0.66141
Residual 0.003019 0.05494
Number of obs: 816, groups: region:year, 153; state, 48; region, 9
Fixed effects:
Estimate Std. Error t value
(Intercept) -4.190e+01 1.802e+00 -23.25
year 2.657e-02 9.021e-04 29.46
Correlation of Fixed Effects:
(Intr)
year -0.990
# nlme for three level model, growth model
#lme_growth1<-lme(gsp ~ year, random = list(region=pdSymm(~year), state=pdIdent(~1)),data=data, method="ML")
lme_growth1<-lme(gsp ~ year, random = list(region=~year, state=~1),data=data, method="ML")
summary(lme_growth1)
Linear mixed-effects model fit by maximum likelihood
Data: data
AIC BIC logLik
-1555.307 -1522.376 784.6534
Random effects:
Formula: ~year | region
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 6.615230e-01 (Intr)
year 8.029895e-09 0.001
Formula: ~1 | state %in% region
(Intercept) Residual
StdDev: 0.7805102 0.0734343
Fixed effects: gsp ~ year
Value Std.Error DF t-value p-value
(Intercept) -43.71617 1.0690287 767 -40.89336 0
year 0.02749 0.0005254 767 52.32366 0
Correlation:
(Intr)
year -0.972
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.24606518 -0.59087173 0.03087456 0.62062441 4.27790886
Number of Observations: 816
Number of Groups:
region state %in% region
9 48
cem in R and stata
“CEM” is a growth model with the residual being a exponential function of the time variable.
stata
This mode has some convergence problem. I still use it for demonstration purpose.
webuse productivity, clear
* this replicates the results from R
menl gsp = {b2:} + {U1[region]}*year, define(b2: year U0[region] U11[region>state]) resvariance(exponential year) nolog
(Public capital productivity)
Mixed-effects ML nonlinear regression Number of obs = 816
Grouping information
------------------------------------------------------------------
| No. of Observations per group
Path | groups Minimum Average Maximum
---------------------+--------------------------------------------
region | 9 51 90.7 136
region>state | 48 17 17.0 17
------------------------------------------------------------------
Wald chi2(1) = 102.57
Linearization log likelihood = 877.9289 Prob > chi2 = 0.0000
b2: year U0[region] U11[region>state]
------------------------------------------------------------------------------
gsp | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
b2 |
year | .0274697 .0027124 10.13 0.000 .0221535 .0327858
_cons | -43.62751 5.530577 -7.89 0.000 -54.46724 -32.78778
------------------------------------------------------------------------------
------------------------------------------------------------------------------
Random-effects parameters | Estimate Std. err. [95% conf. interval]
-----------------------------+------------------------------------------------
region: Independent |
var(U0) | 265.6364 129.373 102.2649 689.9991
var(U1) | .0000638 .0000311 .0000245 .000166
-----------------------------+------------------------------------------------
region>state: Identity |
var(U11) | .6046005 .1369615 .3878323 .9425253
-----------------------------+------------------------------------------------
Residual variance: |
Exponential year |
sigma2 | 2.4e-102 . . .
gamma | .0577332 .000013 .0577078 .0577587
------------------------------------------------------------------------------
Warning: Convergence not achieved.
R
As in stata, it has some convergence problem. So the results can differ somewhat between R and stata.
lme_growth1<-lme(gsp ~ year, random = list(region=pdSymm(~year), state=pdIdent(~1)),data=data, method="ML")
lme_growth2<-update(lme_growth1,weights=varExp( form = ~ year), method="ML", control = lmeControl(opt = "optim"))
summary(lme_growth2)
Linear mixed-effects model fit by maximum likelihood
Data: data
AIC BIC logLik
-1548.244 -1510.608 782.1218
Random effects:
Formula: ~year | region
Structure: General positive-definite
StdDev Corr
(Intercept) 0.2937615870 (Intr)
year 0.0001436049 0.995
Formula: ~1 | state %in% region
(Intercept) Residual
StdDev: 0.8046618 0.0736431
Variance function:
Structure: Exponential of variance covariate
Formula: ~year
Parameter estimates:
expon
-2.165027e-16
Fixed effects: gsp ~ year
Value Std.Error DF t-value p-value
(Intercept) -43.80206 1.0533833 767 -41.58226 0
year 0.02752 0.0005291 767 52.01290 0
Correlation:
(Intr)
year -0.976
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.24679735 -0.58919213 0.03104216 0.61843888 4.27977261
Number of Observations: 816
Number of Groups:
region state %in% region
9 48
delta1<-lme_growth2$modelStruct$varStruct
delta1
Variance function structure of class varExp representing
expon
-2.165027e-16