This post is a demo for doing the same multi-level models with R and stata. R’s multilevel models are mostly done with lme4 and nlme. Stata’s multilevel models are mostly done with mixed and menl. The two packages are not exactly the same. For example, mixed does not allow complicated structure for the residual. That’s why we need menl. Likewise, in R lme4’s “lmer” does not allow this specification. We have to use nlme. But it’s good to know how to do the same thing in both packages.

For example, we want to do a consensus emergence model (CEM) as described in Lang et al. 2012(https://core.ac.uk/download/pdf/158346339.pdf). The CEM model is basically a growth model with the residual being a exponential function of the time variable. We want to do the same thing in both R and stata.

We start with simpler models.

Stata and R syntax comparison for the same three level model

We run a three level model specified in stata with mixed and menl. In R with nlme and lme4. This is a random intercept model. Only intercept is varying for each unit at each level. In this data set supposedly state is nested within region. There are situations that levels are not nested, then we need to specify cross effect models. We do not discuss that here. What bothers me is that nlme’s lme function takes the first level as higher level. So I have to put region before state. Otherwise, it seems like it treated regions nested within states?

There are different ways to specify this model in nlme. I show two ways here (including lme2 which is commented out).

Stata


* let's see how to use menl
webuse productivity
* to replicate mixed with menl
mixed gsp || region: || state:
menl gsp = {b1:},  define(b1: U1[region] UU1[region>state])

(Public capital productivity)


Performing EM optimization Performing gradient-based optimization: 
Iteration 0:  Log likelihood =  200.86162  
Iteration 1:  Log likelihood =  200.86162  

Computing standard errors ...

Mixed-effects ML regression                                Number of obs = 816

        Grouping information
        -------------------------------------------------------------
                        |     No. of       Observations per group
         Group variable |     groups    Minimum    Average    Maximum
        ----------------+--------------------------------------------
                 region |          9         51       90.7        136
                  state |         48         17       17.0         17
        -------------------------------------------------------------

                                                           Wald chi2(0)  =   .
Log likelihood =  200.86162                                Prob > chi2   =   .

------------------------------------------------------------------------------
         gsp | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   10.65961   .2503806    42.57   0.000     10.16887    11.15035
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
region: Identity             |
                  var(_cons) |   .4376123   .2697616        .13073    1.464887
-----------------------------+------------------------------------------------
state: Identity              |
                  var(_cons) |   .6080626   .1382733      .3893904    .9495357
-----------------------------+------------------------------------------------
               var(Residual) |   .0246633   .0012586      .0223159    .0272577
------------------------------------------------------------------------------
LR test vs. linear model: chi2(2) = 2750.56               Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.


Obtaining starting values by EM:

Alternating PNLS/LME algorithm:

Iteration 1:  Linearization log likelihood =  200.86162
Iteration 2:  Linearization log likelihood =  200.86162

Computing standard errors:

Mixed-effects ML nonlinear regression           Number of obs     =        816

      Grouping information
      ------------------------------------------------------------------
                           |     No. of        Observations per group
                      Path |     groups    Minimum    Average    Maximum
      ---------------------+--------------------------------------------
                    region |          9         51       90.7        136
              region>state |         48         17       17.0         17
      ------------------------------------------------------------------


Linearization log likelihood =  200.86162

          b1: U1[region] UU1[region>state]

------------------------------------------------------------------------------
         gsp | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
b1           |
       _cons |   10.65961   .2505341    42.55   0.000     10.16857    11.15065
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
region: Identity             |
                     var(U1) |   .4376119     .26865      .1313821    1.457612
-----------------------------+------------------------------------------------
region>state: Identity       |
                    var(UU1) |   .6080625   .1381509       .389544    .9491612
-----------------------------+------------------------------------------------
               var(Residual) |   .0246633   .0012586      .0223159    .0272577
------------------------------------------------------------------------------

R

These two R packages return same results. Note that lmer reports variance, while lme reports standard deviation.

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(haven)
library(dplyr)
library(readxl)
library(nlme)


Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse

library(lme4)

Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'lme4'

The following object is masked from 'package:nlme':

    lmList

data <- read_dta('https://www.stata-press.com/data/r18/productivity.dta')
# lme4 for three level model
lmer1 <- lmer(gsp ~ 1 + (1|region) + (1|state), data=data)
summary(lmer1)

Linear mixed model fit by REML ['lmerMod']
Formula: gsp ~ 1 + (1 | region) + (1 | state)
   Data: data

REML criterion at convergence: -400.9

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.8328 -0.6757  0.0474  0.6382  3.3928 

Random effects:
 Groups   Name        Variance Std.Dev.
 state    (Intercept) 0.60775  0.7796  
 region   (Intercept) 0.50977  0.7140  
 Residual             0.02466  0.1570  
Number of obs: 816, groups:  state, 48; region, 9

Fixed effects:
            Estimate Std. Error t value
(Intercept)   10.665      0.266    40.1

lme1 <- lme(gsp ~ 1, random= list( region=~1, state=~1), data=data)
#or
#lme1 <- lme(gsp ~ 1, random= list(region = pdSymm(~1), state=pdSymm(~1)), data=data)
#but
#lme1 <- lme(gsp ~ 1, random= list( state=~1, region=~1), data=data)
#would be a different result!
#The other way to specify:
#lme2 <- lme(gsp ~ 1, random= ~ 1 | region/state, data=data)
summary(lme1)

Linear mixed-effects model fit by REML
  Data: data 
        AIC       BIC   logLik
  -392.8509 -374.0381 200.4254

Random effects:
 Formula: ~1 | region
        (Intercept)
StdDev:   0.7139792

 Formula: ~1 | state %in% region
        (Intercept)  Residual
StdDev:   0.7795814 0.1570457

Fixed effects:  gsp ~ 1 
               Value Std.Error  DF  t-value p-value
(Intercept) 10.66483 0.2659842 768 40.09574       0

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-2.83282326 -0.67572390  0.04744387  0.63817953  3.39281358 

Number of Observations: 816
Number of Groups: 
           region state %in% region 
                9                48

A growth model in R and stata

We implement a growth model in R and stata. The results are somewhat different. The results may not be that reliable given that the year slope random component is tiny.

Stata


* same growth model with mixed and menl
webuse productivity
mixed gsp year || region: year|| state:
menl gsp =  {b2:} + {U1[region]}*year, define(b2: year U0[region]  U11[region>state])

(Public capital productivity)


Performing EM optimization ...

Performing gradient-based optimization: 
Iteration 0:  Log likelihood =  784.59336  
Iteration 1:  Log likelihood =  784.64562  
Iteration 2:  Log likelihood =  784.65341  
Iteration 3:  Log likelihood =  784.65343  
Iteration 4:  Log likelihood =  784.65344  

Computing standard errors ...

Mixed-effects ML regression                            Number of obs =     816

        Grouping information
        -------------------------------------------------------------
                        |     No. of       Observations per group
         Group variable |     groups    Minimum    Average    Maximum
        ----------------+--------------------------------------------
                 region |          9         51       90.7        136
                  state |         48         17       17.0         17
        -------------------------------------------------------------

                                                       Wald chi2(1)  = 2744.51
Log likelihood =  784.65344                            Prob > chi2   =  0.0000

------------------------------------------------------------------------------
         gsp | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |   .0274903   .0005247    52.39   0.000     .0264618    .0285188
       _cons |  -43.71613   1.067739   -40.94   0.000    -45.80886    -41.6234
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
region: Independent          |
                   var(year) |   5.44e-16   5.06e-13             0           .
                  var(_cons) |   .4380961   .2700722      .1308672    1.466587
-----------------------------+------------------------------------------------
state: Identity              |
                  var(_cons) |   .6091766   .1382625      .3904355    .9504672
-----------------------------+------------------------------------------------
               var(Residual) |   .0053926   .0002752      .0048793    .0059598
------------------------------------------------------------------------------
LR test vs. linear model: chi2(3) = 3903.81               Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.


Obtaining starting values by EM:

Alternating PNLS/LME algorithm:

Iteration 1:  Linearization log likelihood =  784.65134
Iteration 2:  Linearization log likelihood =  784.65343
Iteration 3:  Linearization log likelihood =  784.65343

Computing standard errors:

Mixed-effects ML nonlinear regression           Number of obs     =        816

      Grouping information
      ------------------------------------------------------------------
                           |     No. of        Observations per group
                      Path |     groups    Minimum    Average    Maximum
      ---------------------+--------------------------------------------
                    region |          9         51       90.7        136
              region>state |         48         17       17.0         17
      ------------------------------------------------------------------

                                                Wald chi2(1)      =    2737.78
Linearization log likelihood =  784.65343       Prob > chi2       =     0.0000

          b2: year U0[region] U11[region>state]

------------------------------------------------------------------------------
         gsp | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
b2           |
        year |   .0274903   .0005254    52.32   0.000     .0264605      .02852
       _cons |  -43.71615   1.069039   -40.89   0.000    -45.81143   -41.62087
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
region: Independent          |
                     var(U0) |   .4378219   .2687845       .131442    1.458347
                     var(U1) |   5.42e-13   2.67e-10             0           .
-----------------------------+------------------------------------------------
region>state: Identity       |
                    var(U11) |   .6091976   .1381491      .3905974    .9501388
-----------------------------+------------------------------------------------
               var(Residual) |   .0053926   .0002752      .0048793    .0059598
------------------------------------------------------------------------------

R

# lme4 for three level model, growth model
lmer_growth1 <- lmer(gsp ~ 1 + year + (1|region: year) + (1|region) +  (1|state), data=data,  REML=FALSE)
summary(lmer_growth1)

Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: gsp ~ 1 + year + (1 | region:year) + (1 | region) + (1 | state)
   Data: data

     AIC      BIC   logLik deviance df.resid 
 -1772.4  -1744.1    892.2  -1784.4      810 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.8497 -0.4675 -0.0397  0.5098  4.0558 

Random effects:
 Groups      Name        Variance Std.Dev.
 region:year (Intercept) 0.002360 0.04858 
 state       (Intercept) 0.609352 0.78061 
 region      (Intercept) 0.437469 0.66141 
 Residual                0.003019 0.05494 
Number of obs: 816, groups:  region:year, 153; state, 48; region, 9

Fixed effects:
              Estimate Std. Error t value
(Intercept) -4.190e+01  1.802e+00  -23.25
year         2.657e-02  9.021e-04   29.46

Correlation of Fixed Effects:
     (Intr)
year -0.990

# nlme for three level model, growth model
#lme_growth1<-lme(gsp ~ year, random = list(region=pdSymm(~year), state=pdIdent(~1)),data=data, method="ML")
lme_growth1<-lme(gsp ~ year, random = list(region=~year, state=~1),data=data, method="ML")
summary(lme_growth1)

Linear mixed-effects model fit by maximum likelihood
  Data: data 
        AIC       BIC   logLik
  -1555.307 -1522.376 784.6534

Random effects:
 Formula: ~year | region
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev       Corr  
(Intercept) 6.615230e-01 (Intr)
year        8.029895e-09 0.001 

 Formula: ~1 | state %in% region
        (Intercept)  Residual
StdDev:   0.7805102 0.0734343

Fixed effects:  gsp ~ year 
                Value Std.Error  DF   t-value p-value
(Intercept) -43.71617 1.0690287 767 -40.89336       0
year          0.02749 0.0005254 767  52.32366       0
 Correlation: 
     (Intr)
year -0.972

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-3.24606518 -0.59087173  0.03087456  0.62062441  4.27790886 

Number of Observations: 816
Number of Groups: 
           region state %in% region 
                9                48

cem in R and stata

“CEM” is a growth model with the residual being a exponential function of the time variable.

stata

This mode has some convergence problem. I still use it for demonstration purpose.


webuse productivity, clear

* this replicates the results from R
menl gsp =  {b2:} + {U1[region]}*year, define(b2: year U0[region]  U11[region>state]) resvariance(exponential year) nolog

(Public capital productivity)


Mixed-effects ML nonlinear regression           Number of obs     =        816

      Grouping information
      ------------------------------------------------------------------
                           |     No. of        Observations per group
                      Path |     groups    Minimum    Average    Maximum
      ---------------------+--------------------------------------------
                    region |          9         51       90.7        136
              region>state |         48         17       17.0         17
      ------------------------------------------------------------------

                                                Wald chi2(1)      =     102.57
Linearization log likelihood =   877.9289       Prob > chi2       =     0.0000

          b2: year U0[region] U11[region>state]

------------------------------------------------------------------------------
         gsp | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
b2           |
        year |   .0274697   .0027124    10.13   0.000     .0221535    .0327858
       _cons |  -43.62751   5.530577    -7.89   0.000    -54.46724   -32.78778
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
region: Independent          |
                     var(U0) |   265.6364    129.373      102.2649    689.9991
                     var(U1) |   .0000638   .0000311      .0000245     .000166
-----------------------------+------------------------------------------------
region>state: Identity       |
                    var(U11) |   .6046005   .1369615      .3878323    .9425253
-----------------------------+------------------------------------------------
Residual variance:           |
  Exponential year           |
                      sigma2 |   2.4e-102          .             .           .
                       gamma |   .0577332    .000013      .0577078    .0577587
------------------------------------------------------------------------------
Warning: Convergence not achieved.

R

As in stata, it has some convergence problem. So the results can differ somewhat between R and stata.

lme_growth1<-lme(gsp ~ year, random = list(region=pdSymm(~year), state=pdIdent(~1)),data=data, method="ML")
lme_growth2<-update(lme_growth1,weights=varExp( form = ~ year), method="ML", control = lmeControl(opt = "optim"))
summary(lme_growth2)

Linear mixed-effects model fit by maximum likelihood
  Data: data 
        AIC       BIC   logLik
  -1548.244 -1510.608 782.1218

Random effects:
 Formula: ~year | region
 Structure: General positive-definite
            StdDev       Corr  
(Intercept) 0.2937615870 (Intr)
year        0.0001436049 0.995 

 Formula: ~1 | state %in% region
        (Intercept)  Residual
StdDev:   0.8046618 0.0736431

Variance function:
 Structure: Exponential of variance covariate
 Formula: ~year 
 Parameter estimates:
        expon 
-2.165027e-16 
Fixed effects:  gsp ~ year 
                Value Std.Error  DF   t-value p-value
(Intercept) -43.80206 1.0533833 767 -41.58226       0
year          0.02752 0.0005291 767  52.01290       0
 Correlation: 
     (Intr)
year -0.976

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-3.24679735 -0.58919213  0.03104216  0.61843888  4.27977261 

Number of Observations: 816
Number of Groups: 
           region state %in% region 
                9                48

delta1<-lme_growth2$modelStruct$varStruct
delta1

Variance function structure of class varExp representing
        expon 
-2.165027e-16

Doing the same multi-level model with R and stata

Stata and R syntax comparison for the same three level model

Stata

R

A growth model in R and stata

Stata

R

cem in R and stata

stata

R