0.1 Overview

A few years ago, I decided that it might be useful to introduce the models for repeated measures data (e.g., RM ANOVA, RM MANOVA, Growth models) along a “continuum”. The idea is that this presentation, wherein the models are all linked together, simplifies the whole world - and makes the similarities and differences among these models easier to identify and understand. I do not have a totally streamlined implementation of my vision for presenting all of this yet, but it is getting more refined with successive attempts. In Part 1, we worked through “traditional” RM ANOVA set-ups and briefly touched on the multilevel implementation. In Part 2, we work in the multilevel framework, starting with RM ANOVA, through RM MANOVA, and into Growth Models.

0.2 Outline for Part 2

  1. Data Preparation and Description
  2. Intro to Multilevel Model
  3. Repeated Measures ANOVA
  4. Repeated Measures MANOVA
  5. Growth Model

0.2.0.1 Prelim - Loading libraries used in this script.

library(psych)
library(ggplot2)
library(nlme) #for mixed effects models

0.3 1. Data Preparation and Description

For our examples, we use 3-occasion WISC data that are equally spaced.

Load the repeated measures data

############################
####### Reading in the Data
############################
#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/wisc3raw.csv"
#read in the .csv file using the url() function
wisc3raw <- read.csv(file=url(filepath),header=TRUE)

Subsetting to the variables of interest. 3-occasion equally spaced repeated measures + a person-level “grouping” variable (+ an id variable). Specifically, we include the id, verb2, verb4, verb6, and grad variables.

#Creating a list of variables of interest
varnames <- c("id","verb2","verb4","verb6","grad")
#sub-setting the columns of interest
wiscsub <- wisc3raw[ ,varnames]

#Looking at the sample-level descriptives for the chosen variables
describe(wiscsub)
vars n mean sd median trimmed mad min max range skew kurtosis se
id 1 204 102.5000000 59.0338886 102.500 102.5000000 75.612600 1.00 204.00 203.00 0.0000000 -1.2176609 4.1331989
verb2 2 204 25.4153431 6.1063772 25.980 25.4026220 6.567918 5.95 39.85 33.90 -0.0639307 -0.3445494 0.4275319
verb4 3 204 32.6077451 7.3198841 32.820 32.4240244 7.183197 12.60 52.84 40.24 0.2317998 -0.0776524 0.5124944
verb6 4 204 43.7499020 10.6650511 42.545 43.4560976 11.297412 17.35 72.59 55.24 0.2356459 -0.3605210 0.7467029
grad 5 204 0.2254902 0.4189328 0.000 0.1585366 0.000000 0.00 1.00 1.00 1.3040954 -0.3007374 0.0293312

Multilevel modeling analyses typically require a long data set. So, we also reshape from wide to long in order to have a long data set.

#reshaping wide to long
verblong <- reshape(data=wiscsub, 
                     varying=c("verb2","verb4","verb6"), 
                     timevar="grade", idvar="id", 
                     direction="long", sep="")
#reordering for convenience
verblong <- verblong[order(verblong$id,verblong$grade),c("id","grade","verb","grad")]
#looking at dataset
head(verblong,12)
id grade verb grad
1.2 1 2 26.98 0
1.4 1 4 39.61 0
1.6 1 6 55.64 0
2.2 2 2 14.38 0
2.4 2 4 21.92 0
2.6 2 6 37.81 0
3.2 3 2 33.51 1
3.4 3 4 34.30 1
3.6 3 6 50.18 1
4.2 4 2 28.39 1
4.4 4 4 42.16 1
4.6 4 6 44.72 1

For clarity, lets consider the basic information representation of the 3-occasion repeated measures data. In particular, data (even non-repeated measures data) are summarized (at the sample-level) as …
Means Vector,
Variances-Covariance Matrix

#mean vector (from wide data)
meanvector <- sapply(wiscsub[ ,c("verb2","verb4","verb6")], mean, na.rm=TRUE)
meanvector
##    verb2    verb4    verb6 
## 25.41534 32.60775 43.74990
#variance-covariance matrix (from wide data)
varcovmatrix <- cov(wiscsub[ ,c("verb2","verb4","verb6")], use="pairwise.complete.obs")
varcovmatrix
##          verb2    verb4     verb6
## verb2 37.28784 33.81957  47.40488
## verb4 33.81957 53.58070  62.25489
## verb6 47.40488 62.25489 113.74332

Lets make a note of this vector and matrix (i.e., on the board) as we will come back to them. Making visual counterparts can also be extremely useful - especially for faciliating higher-level conversations in a research group.
Basic Sample-level descriptions in visual form.

#boxplot by grade (from long data)
#note that the time variable has been converted to a factor = categorical
ggplot(data=verblong, aes(x=factor(grade), y=verb)) + 
  geom_boxplot(notch = TRUE) +
  stat_summary(fun.y="mean", geom="point", shape=23, size=3, fill="white") +
  labs(x = "Grade", y = "Verbal Ability")

#correlations across grade
#in the psych library (from wide data)
pairs.panels(wiscsub[,c("verb2","verb4","verb6")])

Always be careful about the scaling of the x- and y-axes in these plots.

OK - all of that is the same as we did in Part 1.

One additional recoding for convenience … (we will treat this in more detail later) is to center and scale our time variable. This gives us a specific 0 point and an intuitive 0,1,2 scale that is useful for our didactic purposes.

unique(verblong$grade)
## [1] 2 4 6
#Making a new time variable that recenters and rescales grade 
# from 2,4,6 to 0,1,2
verblong$time0 <- (verblong$grade-2)/2
unique(verblong$time0)
## [1] 0 1 2
#looking at dataset
head(verblong,12)
id grade verb grad time0
1.2 1 2 26.98 0 0
1.4 1 4 39.61 0 1
1.6 1 6 55.64 0 2
2.2 2 2 14.38 0 0
2.4 2 4 21.92 0 1
2.6 2 6 37.81 0 2
3.2 3 2 33.51 1 0
3.4 3 4 34.30 1 1
3.6 3 6 50.18 1 2
4.2 4 2 28.39 1 0
4.4 4 4 42.16 1 1
4.6 4 6 44.72 1 2

Plotting the raw data along this new time variable.

#plotting intraindividual change RAW DATA
ggplot(data = verblong, aes(x = time0, y = verb, group = id)) +
  ggtitle("Raw Data") +
  geom_point() + 
  geom_line() +
  xlab("Time") + 
  ylab("WISC Verbal Score") + 
  ylim(0,100) + xlim(0,2)