R-Sessions 32: Forward.lmer: Basic stepwise function for mixed effects in R

Intended to be a customized solution, it may have grown to be a little more. forward.lmer is an early installment of a full stepwise function for mixed effects regression models in R-Project. I may put in some work to extend it, or I may not. Nevertheless, in a ‘forward sense of stepwise’, I think it can be pretty useful as it is. Also, it has an interesting take on the stepwise concept, I think.

Most stepwise functions (as far as I know) take a base model and a bunch of variables, and then iteratively adds and/or subtracts some variables, according to various criteria, to come to the best fitting regression model. All very interesting, but how to deal with interaction variables? And moreover: most existing functions do not work with mixed effects models ((I use the term ‘mixed effects model’ to describe this stepwise function to refer to what is often referred to as hierarchical or multilevel regression models, as well)).

Built around the lme4 package in R, forward.lmer provides a forward stepwise procedure to mixed effects models. Also, it allows the user not only to enter single variables to models, but also to do the same with blocks of variables. This opens up many options: users can add the complete interactions at once (i.e. both the original and the multiplicative terms), or add these consequetively. Future development will focus on additional selection criteria for interactions, such as the criterium that at least the multiplicative term needs to be statistically significant.

The user provides a starting model and a set of variables to evaluate. The procedure then updates the starting model with the addition of every single variable (or block of variables). The models are ordered based on their LogLikelihood (other criteria, i.e. BIC and AIC following soon), after which the best fitting model is evaluated against one of two criteria. The first criterium is that at least one of the added parameters is statistically significant. The other criterium is that the addition of the parameters together is statistically significant.

There are several parameters to be specified:

  • start.model: The starting model the procedure starts with. This can be a null-model, or a model already containing several variables. All lmer-models (i.e. logistic, poisson, linear) are supported.
  • blocks: a vector of variable names (as character strings) to be added to a model. Several variables can a concatenated within the same character string, so that these are added as a block of variables, instead of a single variables at once.
  • max.iter: The maximum number of variables that are evaluated. If max.iter is reached, the procedure stops without adding more variables.
  • sig.level: This is the p-value against which it is tested whether the new model fits better than a base model. Either sig.level or zt needs to be specified, but not both at once.
  • zt: This is either the T or Z value that is used to test whether (at least) one of the added variables is statistically significant. T values are used for linear regression, Z values for binary response models.
  • print.log: Should a log be printed? The log contains information on which variables (and on which criteria) were added in each step.

The forward.lmer function returns the best fitting model (according to the given criteria). Of course, one can use this resulting model as a starting model for a new stepwise procedure.

forward.lmer <- function(
start.model, blocks,
max.iter=1, sig.level=FALSE,
zt=FALSE, print.log=TRUE)

# forward.lmer: a function for stepwise regression using lmer mixed effects models
# Author: Rense Nieuwenhuis

# Initialysing internal variables
log.step <- 0
log.LL <- log.p <- log.block <- zt.temp <- log.zt <- NA
model.basis <- start.model

# Maximum number of iterations cannot exceed number of blocks
if (max.iter > length(blocks)) max.iter <- length(blocks)

# Setting up the outer loop
for(i in 1:max.iter)

models <- list()

# Iteratively updating the model with addition of one block of variable(s)
# Also: extracting the loglikelihood of each estimated model
for(j in 1:length(blocks))
models[[j]] <- update(model.basis, as.formula(paste(". ~ . + ", blocks[j])))

LL <- unlist(lapply(models, logLik))

# Ordering the models based on their loglikelihood.
# Additional selection criteria apply
for (j in order(LL, decreasing=TRUE))

############## Selection based on ANOVA-test

if(sig.level != FALSE)
if(anova(model.basis, models[[j]])[2,7] < sig.level)

model.basis <- models[[j]]

# Writing the logs
log.step <- log.step + 1
log.block[log.step] <- blocks[j]
log.LL[log.step] <- as.numeric(logLik(model.basis))
log.p[log.step] <- anova(model.basis, models[[j]])[2,7]

blocks <- blocks[-j]


############## Selection based significance of added variable-block

if(zt != FALSE)
b.model <- summary(models[[j]])@coefs
diff.par <- setdiff(rownames(b.model), rownames(summary(model.basis)@coefs))
if (length(diff.par)==0) break
sig.par <- FALSE

for (k in 1:length(diff.par))
if(abs(b.model[which(rownames(b.model)==diff.par[k]),3]) > zt)
sig.par <- TRUE
zt.temp <- b.model[which(rownames(b.model)==diff.par[k]),3]

model.basis <- models[[j]]

# Writing the logs
log.step <- log.step + 1
log.block[log.step] <- blocks[j]
log.LL[log.step] <- as.numeric(logLik(model.basis))
log.zt[log.step] <- zt.temp
blocks <- blocks[-j]


## Create and print log
log.df <- data.frame(log.step=1:log.step, log.block, log.LL, log.p, log.zt)
if(print.log == TRUE) print(log.df, digits=4)

## Return the 'best' fitting model

As always, you're invited to use this function, or to adapt it and use that. However, it is required to make mention of this function and its author. Additionally, since I intend to continue working on this function (perhaps even evolve it to a 'package' on CRAN), I would love to hear about any experiences in using it.

14 comment on “R-Sessions 32: Forward.lmer: Basic stepwise function for mixed effects in R

  • Hi! I was quite happy to find your forward function. Unfortunately, I can’t get it to work!

    This is the list of variables and interactions I would like to test:


    This is the starting model:

    bib.lmer.min<-lmer(RespBin ~ verger+ (1 | Bloc), family=quasibinomial)

    This is the “call”

    Result =

    Error in mer_finalize(ans) : Downdated X’X is not positive definite, 1.

    Can you help out?

    Best regards,


    • Hi there, Vincent,

      thanks for trying to use my function. Strictly speaking, the problem you’re running into is a ‘problem’ with the lmer estimation procedure, meaning that apparently lmer attempts to fit a model that does not converge well.

      This, however, raises an issue about doing stepwise procedures on mixed effects models. Since not all mixed effects models can converge, how to deal with those? I’m thinking about extending this function to allow it to continue after a model did not converge. It may take a while before I find the time to do that, though.

      In the meanwhile, the only ‘solution’ to your problem might be to start with different sets of variables and to go from there. For instance, first add all the ‘main’ effects, and in a second run the ‘interaction’ effects. This may however diverge from your original plan.

      • He Rense! Thanks for sharing this wonderful function. I’ve found it to be quite useful. Are you aware of any papers that have used and cited it?

      • Hi Frank,

        thanks for using this function. I am aware of one journal article that used this function (and later this article was incorporated in a dissertation). The paper is by Jochem Tolsma, Tom van der Meer and Maurice Gesthuizen, and was published in Acta Politica. It is titled “The impact of neighbourhood and municipality characteristics on social cohesion in the Netherlands”.

        You can find the article here: http://www.jtolsma.nl/uploads/Tolsma%202009%20AP.pdf

  • Hoi Rense,

    how would you like me to cite that function, if I use it for a paper?

    Dank je wel,


    (I left my RL email address in the comment)

    • Hello, perceval,

      I am considering to build a small R package based on this function, but this may take some while. R packages are easily put in the reference list of a paper/chapter. Till that time, you can just refer to this function by mentioning my name, the name of the function, and perhaps the URL of the website.


  • Dear Rense,

    Thanks for putting this out – I’ve been thinking for a while how I’d code this. I should’ve realised someone would have done it already.

    I’m intending to use the simpler ‘lowest AIC’ criterion to choose the best model. I will start modifying the code next week. Do let me know if you’ve done this already, but if not I will let you have the result.

    Best wishes, Nick

  • Hi Rense,

    very helpful indeed.

    I did not go through the entire code, but I think I have noticed a bug so I wanted to let you know

    when writing the logs for if(sig.level != FALSE), your anova calculation for log.p[log.step] comes after you already replaces model.basis by model[[j]], so you do not get the appropriate p.value.
    I suggest you store anova(model.basis, models[[j]]) before you apply model.basis length(blocks)) max.iter = length(blocks)

    for(i in 1:max.iter){ # 2a

    models = list()
    aic = numeric(length(blocks))

    for(j in 1 : length(blocks)){ # 3a

    models[[j]] = update(model.basis, as.formula(paste(“. ~ . + “, blocks[j])))

    LL = logLik(models[[j]]) ; df = as.numeric(attr(LL, “df”))
    aic[j] = as.numeric(-2* LL) + 2*df

    } # 3b

    LL_mb = logLik(model.basis) ; df_mb = as.numeric(attr(LL_mb, “df”))
    aic_mb = as.numeric(-2* LL_mb) + 2*df_mb

    j = order(aic, decreasing=FALSE)[1]

    if(aic[j] 0) log.df = data.frame(log.step = 1:log.step, log.block = log.block, log.AIC.bef = log.AIC.bef, log.AIC.aft = log.AIC.aft) else log.df = data.frame(log.step = NA, log.block = log.block, log.AIC.bef = log.AIC.bef, log.AIC.aft = log.AIC.aft)
    return(log.df)} # 1b

  • would your function work for cox proportional hazard
    models? I am fitting one of those with 3 sets of possible
    dependent variables and some 100 possible regressors.
    stepwise seems to be needed.

  • Hi Rense,

    Thanks for sharing this function! I was wondering if you have any example data/code available using this function? I’m trying to run it on my own and it would be really helpful to work through an example as I run into errors.


  • Fantastic!!

Leave a Reply to Alban Guillaumet Cancel reply