Peter Dalgaard is associate professor at the Department of Biostatistics at the University of Copenhagen in Denmark, and a member of the R-Project Core Development team. He is also an active and respected member of the R-help mailing list. Drawing on these experiences, he set out to write an introductory book on statistics and R.



Multilevel models, or mixed effects models, can easily be estimated in R. Several packages are available. Here, the lme() function from the nlme-package is described. The specification of several types of models is shown, using a fictitious example. A detailed description of the specification rules is given. Output of the specified models is given, but not described or interpreted.
Please note that this description is very closely related to the description of the specification of the lmer() function of the lme4-package. The results are similar, and the same types of models are covered here.
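To give an impression of what such a specification looks like, here is a minimal sketch. The data are simulated only for illustration (they are not the exam data used in the manual), and the object names exam, m_null and m_intercept are my own assumptions.

```r
library(nlme)  # a recommended package that ships with standard R installations

# Simulated stand-in data: 20 schools with 25 students each (illustration only)
set.seed(1)
n_school <- 20
n_student <- 25
school <- factor(rep(seq_len(n_school), each = n_student))
u0 <- rep(rnorm(n_school, sd = 0.4), each = n_student)  # school-level deviations
standLRT <- rnorm(n_school * n_student)
schavg <- ave(standLRT, school)                         # school mean of standLRT
normexam <- 0.5 * standLRT + 0.3 * schavg + u0 +
  rnorm(n_school * n_student, sd = 0.7)
exam <- data.frame(school, normexam, standLRT, schavg)

# Null model: only a random intercept for school
m_null <- lme(fixed = normexam ~ 1, random = ~ 1 | school, data = exam)

# Random intercept model with one predictor on each level
m_intercept <- lme(fixed = normexam ~ standLRT + schavg,
                   random = ~ 1 | school, data = exam)

# A random slope for standLRT would be requested with:
#   random = ~ standLRT | school

summary(m_intercept)
```

The fixed part is an ordinary model formula. The random part names the effects that vary on the left of the bar and the grouping variable on the right.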

In this example, the dependent variable is the standardized result of a student on a specific exam. This variable is called “normexam”. In estimating the score on the exam, two levels are distinguished: student and school. On each level, one explanatory variable is present. At the individual level, we take into account the standardized score of the student on an LR-test (“standLRT”). At the school level, we take into account the average intake score (“schavg”).

Plotting the results of a multilevel analysis without the extension package ‘lattice’ can be quite complicated in R. Neither the basic packages nor the multilevel packages (nlme and lme4) offer functions readily available for this task. So, this is a good point in this manual to put some of our programming skills to use. Doing so also makes clear exactly how the results of a multilevel analysis are stored in R.

Unlike most statistical software packages, R often stores the results of an analysis in an object. The advantage of this is that, while not all output is shown on the screen at once, it is not necessary to estimate the statistical model again if different output is required.

This paragraph will show the kind of data that is stored in a multilevel model estimated by R-Project and introduce some functions that make use of this data.
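As a rough illustration of what a fitted model object holds, the sketch below fits a random intercept model with lme() from the nlme-package on simulated data and then reads several pieces back out of the object. The data and the names exam, model and school_coef are assumptions made for this example only.

```r
library(nlme)

# Simulated stand-in data: 20 schools with 25 students each (illustration only)
set.seed(2)
school <- factor(rep(1:20, each = 25))
u0 <- rep(rnorm(20, sd = 0.4), each = 25)
standLRT <- rnorm(500)
normexam <- 0.5 * standLRT + u0 + rnorm(500, sd = 0.7)
exam <- data.frame(school, normexam, standLRT)

# Estimate once, store the result in an object
model <- lme(fixed = normexam ~ standLRT, random = ~ 1 | school, data = exam)

names(model)               # the components stored in the object
fixef(model)               # fixed effects
head(ranef(model))         # school-specific deviations from the fixed intercept
VarCorr(model)             # variance components
head(fitted(model))        # fitted values
head(resid(model))         # residuals

# coef() combines fixed and random effects: one row per school
school_coef <- coef(model)

# Draw one regression line per school using base graphics only
plot(exam$standLRT, exam$normexam, col = "grey",
     xlab = "standLRT", ylab = "normexam")
for (i in seq_len(nrow(school_coef))) {
  abline(a = school_coef[i, "(Intercept)"], b = school_coef[i, "standLRT"])
}
```

Because everything is read from the stored object, none of these steps requires the model to be estimated again.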

Several functions already present in R-Project are very useful when analyzing multilevel models or when preparing data to do so. Three of these helper functions will be described: aggregating data, the behavior of the plot() function when applied to a multilevel model, and finally setting contrasts for categorical variables. Note that none of these functions is restricted to multilevel analysis.
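Two of these helpers can be sketched with base R alone. The example below uses a small simulated data set (an assumption for illustration, as are the names exam, gender and school_means) to compute school means with aggregate() and to change the contrasts of a factor.

```r
# Simulated stand-in data: 5 schools with 10 students each (illustration only)
set.seed(3)
exam <- data.frame(
  school   = factor(rep(1:5, each = 10)),
  gender   = factor(rep(c("boy", "girl"), times = 25)),
  standLRT = rnorm(50)
)

# Aggregate the student-level score to one mean per school
school_means <- aggregate(standLRT ~ school, data = exam, FUN = mean)
names(school_means)[2] <- "schavg"

# Attach the school-level variable to the student-level data again
exam <- merge(exam, school_means, by = "school")

# Inspect the default (treatment) contrasts of a categorical variable
contrasts(exam$gender)

# Switch to sum-to-zero contrasts
contrasts(exam$gender) <- contr.sum(2)
contrasts(exam$gender)
```

After the merge, every student carries the mean of his or her school, which is the usual way to prepare a second-level predictor.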

Although most introductions to regression assume that the data are normally distributed, in practice this is often not the case. Many other types of distributions exist, such as the binomial and the Poisson distribution. The lmer()-function in the lme4-package can easily estimate models based on these distributions. This is done by adding the ‘family’-argument to the command syntax, thereby specifying that a generalized linear multilevel model needs to be estimated rather than a linear one. Note that in current versions of lme4 such models are estimated with glmer(), which takes the same ‘family’-argument.


Multilevel models, or mixed effects models, can easily be estimated in R. Several packages are available. Here, the lmer() function from the lme4-package is described. The specification of several types of models is shown, using a fictitious example. A detailed description of the specification rules is given. Output of the specified models is given, but not described or interpreted.
Please note that this description is very closely related to the description of the specification of the lme() function of the nlme-package. The results are similar, and the same types of models are covered here.
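For comparison, a minimal sketch of the lmer() notation follows. The data are simulated for illustration only, and the names exam, m_null and m_intercept are assumptions. Since lme4 is not part of a standard R installation, the sketch only fits the models when the package is available.

```r
# Simulated stand-in data: 20 schools with 25 students each (illustration only)
set.seed(4)
school <- factor(rep(1:20, each = 25))
u0 <- rep(rnorm(20, sd = 0.4), each = 25)
standLRT <- rnorm(500)
schavg <- ave(standLRT, school)  # school mean of standLRT
normexam <- 0.5 * standLRT + 0.3 * schavg + u0 + rnorm(500, sd = 0.7)
exam <- data.frame(school, normexam, standLRT, schavg)

# Fit only when lme4 is installed
if (requireNamespace("lme4", quietly = TRUE)) {
  # Null model: random intercept for school
  m_null <- lme4::lmer(normexam ~ 1 + (1 | school), data = exam)

  # Random intercept model with one predictor on each level
  m_intercept <- lme4::lmer(normexam ~ standLRT + schavg + (1 | school),
                            data = exam)

  # A random slope for standLRT would be written as (standLRT | school)

  print(lme4::fixef(m_intercept))
}
```

In contrast to lme(), the random part is written inside the model formula itself, between parentheses.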

Building on the basic graphics that were created in the previous paragraph of this manual, we will now create more advanced graphics. We are going to add two other sets of data, one represented by an additional line, the other by four large green symbols of varying shape. Then, in order to keep the graph easy to read, a basic legend is added to the plot. Finally, we let R draw a curved line based on a quadratic function.
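The steps described here could look roughly like the sketch below. All values are made up for illustration, since the data of the previous paragraph are not repeated here, and the object names are my own assumptions.

```r
# Hypothetical data for the basic graph
x  <- 1:10
y1 <- c(2, 4, 3, 6, 7, 6, 9, 10, 9, 12)
y2 <- c(5, 5, 6, 8, 8, 10, 11, 13, 14, 15)

# The basic graph
plot(x, y1, type = "b", xlab = "x", ylab = "y", ylim = c(0, 30))

# A second set of data, drawn as an additional line
lines(x, y2, lty = 2, col = "blue")

# A third set of data: four large green symbols, each with a different shape
px <- c(2, 4, 6, 8)
py <- c(20, 16, 22, 18)
points(px, py, pch = 15:18, col = "green", cex = 2)

# A basic legend
legend("topleft",
       legend = c("First series", "Second series", "Extra points"),
       lty = c(1, 2, NA), pch = c(1, NA, 15),
       col = c("black", "blue", "green"))

# A curved line based on a quadratic function
quad_curve <- curve(0.25 * x^2, from = 1, to = 10, add = TRUE, col = "red")
```

The functions lines(), points(), legend() and curve(..., add = TRUE) all draw on top of the existing plot, so the order of the calls determines what ends up on top.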

Sometimes it takes more than just one graph to illustrate your arguments. In those cases, you’ll often want these graphs to be held closely together in the output. R has several ways of doing so. The results that can be achieved with them differ a great deal, but their syntax differs only slightly. These different functions cannot be used together on the same graph window though.
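Two of the ways R offers for this, par(mfrow = ...) and layout(), are sketched below with made-up data. I cannot say for certain that these are the functions the session covers, so take this as an illustration only.

```r
# Hypothetical data
set.seed(5)
x <- rnorm(100)
y <- x + rnorm(100)

# Option 1: par(mfrow = ...) gives a regular grid, here 1 row and 2 columns
old_par <- par(mfrow = c(1, 2))
plot(x, y, main = "Scatterplot")
hist(x, main = "Histogram")
par(old_par)  # restore the previous setting

# Option 2: layout() allows graphs of unequal size.
# The first graph spans the whole top row, the other two share the bottom row.
n_fig <- layout(matrix(c(1, 1,
                         2, 3), nrow = 2, byrow = TRUE))
plot(x, y, main = "Scatterplot")
hist(x, main = "Histogram of x")
boxplot(y, main = "Boxplot of y")
layout(1)  # back to a single graph per window
```

Note that each option is reset before the next one is used, in line with the remark that they cannot be combined on the same graph window.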

In many cases, multiple points on a scatterplot have exactly the same coordinates. When these are simply plotted, the visual representation of the data may be unsatisfactory. Today’s R-Session is on how to present this type of data in neatly arranged plots in R-Project.
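A few base R options for this situation are sketched below: jittering the points, drawing a sunflower plot, and scaling the symbols by the number of overlapping observations. The data are simulated, and the selection of techniques is my own assumption rather than a summary of the session.

```r
# Hypothetical data: 200 observations on a 5 x 5 grid, so many points coincide
set.seed(6)
x <- sample(1:5, 200, replace = TRUE)
y <- sample(1:5, 200, replace = TRUE)

# Plotted as is, at most 25 distinct points are visible
plot(x, y, main = "Overlapping points")

# Option 1: add a little random noise to each coordinate
plot(jitter(x), jitter(y), main = "Jittered")

# Option 2: a sunflower plot draws one 'petal' per overlapping observation
sunflowerplot(x, y, main = "Sunflower plot")

# Option 3: count the observations per coordinate and scale the symbols
tab <- xyTable(x, y)
plot(tab$x, tab$y, cex = sqrt(tab$number), pch = 16,
     xlab = "x", ylab = "y", main = "Symbol size by count")
```

Taking the square root of the counts makes the area of a symbol, rather than its diameter, roughly proportional to the number of observations.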

Curving Normality

Curving Normality is an academic website and blog maintained by Rense Nieuwenhuis.

Rense is a Ph.D. Candidate at the Institute for Innovation and Governance Studies (IGS) of the University of Twente.

His work is forthcoming in the Journal of Marriage and Family and the European Sociological Review.
