Mixed Effects Tutorial 2: Fun with merMod Objects

Update: Since this post was released I have co-authored an R package to make some of the items in this post easier to do. This package is called merTools and is available on CRAN and on GitHub. To read more about it, read my new post here and check out the package on GitHub.

Fun with merMod Objects

Introduction

First of all, be warned, the terminology surrounding multilevel models is vastly inconsistent. For example, multilevel models themselves may be referred to as hierarchical linear models, random effects models, multilevel models, random intercept models, random slope models, or pooling models. Depending on the discipline, software used, and the academic literature many of these terms may be referring to the same general modeling strategy. In this tutorial I will attempt to provide a user guide to multilevel modeling by demonstrating how to fit multilevel models in R and by attempting to connect the model fitting procedure to commonly used terminology used regarding these models.

We will cover the following topics:

  • The structure and methods of merMod objects
  • Extracting random effects of merMod objects
  • Plotting and interpreting merMod objects

If you haven’t already, make sure you head over to the Getting Started With Multilevel Models tutorial in order to ensure you have set up your environment correctly and installed all the necessary packages. The tl;dr is that you will need:

  • A current version of R (2.15 or greater)
  • The lme4 package (install.packages("lme4"))

Read in the data

Multilevel models are appropriate for a particular kind of data structure where units are nested within groups (generally 5+ groups) and where we want to model the group structure of the data. We will use data from Jon Starkweather at the University of North Texas on student social attributes within schools and classes. Visit the excellent tutorial available here for more.

library(lme4) # load library
library(arm) # convenience functions for regression in R
lmm.data <- read.table("http://www.unt.edu/rss/class/Jon/R_SC/Module9/lmm.data.txt",
   header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
#summary(lmm.data)
head(lmm.data)
##   id extro  open agree social class school
## 1  1 63.69 43.43 38.03  75.06     d     IV
## 2  2 69.48 46.87 31.49  98.13     a     VI
## 3  3 79.74 32.27 40.21 116.34     d     VI
## 4  4 62.97 44.41 30.51  90.47     c     IV
## 5  5 64.25 36.86 37.44  98.52     d     IV
## 6  6 50.97 46.26 38.83  75.22     d      I

Here we have data on the extroversion of subjects nested within classes and within schools.

Let’s understand the structure of the data a bit before we begin:

str(lmm.data)
## 'data.frame':    1200 obs. of  7 variables:
##  $ id    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ extro : num  63.7 69.5 79.7 63 64.2 ...
##  $ open  : num  43.4 46.9 32.3 44.4 36.9 ...
##  $ agree : num  38 31.5 40.2 30.5 37.4 ...
##  $ social: num  75.1 98.1 116.3 90.5 98.5 ...
##  $ class : Factor w/ 4 levels "a","b","c","d": 4 1 4 3 4 4 4 4 1 2 ...
##  $ school: Factor w/ 6 levels "I","II","III",..: 4 6 6 4 4 1 3 4 3 1 ...

Here we see we have two possible grouping variables – class and school. Let’s explore them a bit further:

table(lmm.data$class)
## 
##   a   b   c   d 
## 300 300 300 300
table(lmm.data$school)
## 
##   I  II III  IV   V  VI 
## 200 200 200 200 200 200
table(lmm.data$class, lmm.data$school)
##    
##      I II III IV  V VI
##   a 50 50  50 50 50 50
##   b 50 50  50 50 50 50
##   c 50 50  50 50 50 50
##   d 50 50  50 50 50 50

This is a perfectly balanced dataset. In all likelihood you aren’t working with a perfectly balanced dataset, but we’ll explore the implications for that in the future. For now, let’s plot the data a bit. Using the excellent xyplot function in the lattice package, we can explore the relationship between schools and classes across our variables.

require(lattice)
xyplot(extro ~ open + social + agree | class, data = lmm.data, 
                 auto.key = list(x = .85, y = .035, corner = c(0, 0)), 
       layout = c(4,1), main = "Extroversion by Class")