R-Sessions 17: Generalized Multilevel {lme4}



Although introductions to regression usually assume that the data are normally distributed, in practice this is often not the case. Many other distributions exist, for example the binomial and the Poisson distribution next to the normal (Gaussian) one. The lmer() function in the lme4 package can estimate models based on these distributions. This is done by adding the ‘family’ argument to the command, which specifies that a generalized linear multilevel model is to be estimated instead of a linear one. (In current versions of lme4 this job is done by the separate glmer() function, which takes the same ‘family’ argument.)
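To make the ‘family’ argument a little more concrete, the short sketch below lists three family objects from base R together with their default link functions. It only uses functions from the stats package that ships with R; the names `families` and `links` are chosen here purely for illustration.

```r
# Family objects from base R's stats package; each carries a default link.
families <- list(
  gaussian = gaussian(),  # identity link: the ordinary linear model
  binomial = binomial(),  # logit link: binary or proportion outcomes
  poisson  = poisson()    # log link: count outcomes
)

# Extract the name of the default link function of each family
links <- sapply(families, function(f) f$link)
print(links)
```

Any of these family objects can be passed as the ‘family’ argument, and a different link can be requested explicitly, as in binomial(link = "probit").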

Logistic Multilevel Regression

Let us say we want to estimate the chance that a student in a specific school succeeds on a test. For this, we can use the Exam data set in the mlmRev package, which contains standardized test scores. Here, we’ll define success as having a standardized score of 0 or larger. The score is recoded to a 0-1 variable below, using the ifelse() function, and the recoding is checked with summary(). The needed packages are loaded first, using the library() function.

library(lme4)
library(mlmRev)
names(Exam)

Exam$success <- ifelse(Exam$normexam >= 0,1,0)
summary(Exam$normexam)
summary(Exam$success)

> library(lme4)
Loading required package: Matrix
Loading required package: lattice
> library(mlmRev)
> names(Exam)
 [1] "school"   "normexam" "schgend"  "schavg"   "vr"       "intake"  
 [7] "standLRT" "sex"      "type"     "student" 
> 
> Exam$success <- ifelse(Exam$normexam >= 0,1,0)
> summary(Exam$normexam)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-3.6660000 -0.6995000  0.0043220 -0.0001138  0.6788000  3.6660000 
> summary(Exam$success)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0000  0.0000  1.0000  0.5122  1.0000  1.0000 
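The mean of 0.5122 reported by summary() is simply the proportion of students coded as successful, because the mean of a 0-1 variable equals the share of ones. A slightly stricter check of the recoding is to cross-tabulate the rule against the result. The sketch below does this on a small vector of made-up scores (the `scores` values are hypothetical and not taken from the Exam data), so it runs without any packages.

```r
# Hypothetical standardized scores, made up purely for illustration
scores <- c(-1.2, -0.3, 0.0, 0.4, 1.5)

# Same recoding rule as in the text: 1 if the score is 0 or larger, else 0
success <- ifelse(scores >= 0, 1, 0)

# Cross-tabulate the rule against the new variable;
# all cases should fall on the FALSE/0 and TRUE/1 cells
check <- table(nonnegative = scores >= 0, success = success)
print(check)

# The mean of a 0/1 variable equals the proportion of ones
prop_success <- mean(success)
print(prop_success)
```

The same table() call can be applied to Exam$normexam and Exam$success to verify the recoding on the real data.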

To make proper use of the binary ‘success’ variable created this way, a logistic regression model needs to be estimated. This is done by specifying the binomial family with the logit as link function: family = binomial(link = "logit"). The rest of the specification is exactly the same as for a normal linear multilevel regression model using the lmer() function.

lmer(success ~ schavg + (1|school), data=Exam, family=binomial(link = "logit"))
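For readers on a current lme4 release, the same model is written with glmer(), since passing ‘family’ to lmer() is deprecated there. The following is a minimal sketch under that assumption; the object name `fit` is chosen here for illustration, and the printed summary will be laid out differently from the older lmer() output shown in this post.

```r
library(lme4)    # provides glmer()
library(mlmRev)  # provides the Exam data

# Recode as in the text: success = standardized score of 0 or larger
Exam$success <- ifelse(Exam$normexam >= 0, 1, 0)

# Random-intercept logistic model: same formula, data and family as above
fit <- glmer(success ~ schavg + (1 | school),
             data = Exam,
             family = binomial(link = "logit"))

summary(fit)
```

The fixed effects can then be pulled out with fixef(fit) and the random-effect variances with VarCorr(fit).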

> lmer(success~ schavg + (1|school), 
+ 	data=Exam, 
+ 	family=binomial(link = "logit"))
Generalized linear mixed model fit using Laplace 
Formula: success ~ schavg + (1 | school) 
   Data: Exam 
 Family: binomial(logit link)
  AIC  BIC logLik deviance
 5323 5342  -2658     5317
Random effects:
 Groups Name        Variance Std.Dev.
 school (Intercept) 0.23113  0.48076 
number of obs: 4059, groups: school, 65

Estimated scale (compare to  1 )  0.9909287 

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  0.08605    0.07009   1.228    0.220    
schavg       1.60548    0.21374   7.511 5.86e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Correlation of Fixed Effects:
       (Intr)
schavg 0.072 
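The estimates in this output are on the log-odds (logit) scale, which is hard to read directly. The sketch below converts the printed numbers into more familiar quantities using base R only. The intraclass correlation uses the common latent-variable approximation, which fixes the level-1 variance of a logistic model at pi^2/3; that is an assumption added here and not something the output itself reports.

```r
# Estimates copied from the model output above
b_intercept <- 0.08605  # fixed intercept (log-odds)
b_schavg    <- 1.60548  # fixed slope of schavg (log-odds per unit)
var_school  <- 0.23113  # variance of the school-level random intercept

# Probability of success when schavg = 0, in a school whose random
# intercept is zero (inverse logit of the fixed intercept)
p_typical <- plogis(b_intercept)

# Odds ratio associated with a one-unit increase in schavg
or_schavg <- exp(b_schavg)

# Approximate intraclass correlation on the latent (logit) scale,
# assuming a level-1 variance of pi^2 / 3
icc <- var_school / (var_school + pi^2 / 3)

print(c(p_typical = p_typical, or_schavg = or_schavg, icc = icc))
```

Read this way, the intercept corresponds to a success probability of roughly 0.52, a one-unit increase in schavg multiplies the odds of success by about 5, and under the stated approximation somewhat less than 7 percent of the latent variation lies between schools.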

– – — — —– ——–
R-Sessions is a collection of manual chapters for R-Project, which are maintained on Curving Normality. All posts are linked to the chapters from the R-Project manual on this site. The manual is free to use, for it is paid by the advertisements, but please refer to it in your work inspired by it. Feedback and topic requests are highly appreciated.
——– —– — — – –
