A few years ago, I decided that it might be useful to introduce the models for repeated measures data (e.g., RM ANOVA, RM MANOVA, Growth models) along a “continuum”. The idea is that this presentation, wherein the models are all linked together, simplifies the whole world - and makes the similarities and differences among these models easier to identify and understand. I do not have a totally streamlined implementation of my vision for presenting all of this yet, but am giving it a shot. In Part 1, we work through “traditional” RM ANOVA set-ups and briefly touch on the multilevel implementation. In Part 2, we work in the multilevel framework, starting with RM ANOVA, through RM MANOVA, and into Growth Models.

- Data Preparation and Description
- Analysis of Variance (ANOVA)
- Repeated Measures Analysis of Variance (RM ANOVA)
- RM ANOVA in the Multilevel Framework

```
library(psych)    #for describe() and pairs.panels()
library(ggplot2)  #for plotting
library(ez)       #for ezANOVA() and sphericity tests/corrections
library(car)      #for Anova() with Type III sums of squares
library(nlme)     #for mixed effects models (gls)
```

For our examples, we use 3-occasion WISC data that are *equally spaced*.

Load the repeated measures data

```
############################
####### Reading in the Data
############################
#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/wisc3raw.csv"
#read in the .csv file using the url() function
wisc3raw <- read.csv(file=url(filepath),header=TRUE)
```

Subsetting to the variables of interest: *3-occasion equally spaced repeated measures + a person-level “grouping” variable (+ an id variable).* Specifically, we include the `id`, `verb2`, `verb4`, `verb6`, and `grad` variables.

```
#Creating a list of variables of interest
varnames <- c("id","verb2","verb4","verb6","grad")
#sub-setting the columns of interest
wiscsub <- wisc3raw[ ,varnames]
#Looking at the sample-level descriptives for the chosen variables
describe(wiscsub)
```

| | vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | 1 | 204 | 102.5000000 | 59.0338886 | 102.500 | 102.5000000 | 75.612600 | 1.00 | 204.00 | 203.00 | 0.0000000 | -1.2176609 | 4.1331989 |
| verb2 | 2 | 204 | 25.4153431 | 6.1063772 | 25.980 | 25.4026220 | 6.567918 | 5.95 | 39.85 | 33.90 | -0.0639307 | -0.3445494 | 0.4275319 |
| verb4 | 3 | 204 | 32.6077451 | 7.3198841 | 32.820 | 32.4240244 | 7.183197 | 12.60 | 52.84 | 40.24 | 0.2317998 | -0.0776524 | 0.5124944 |
| verb6 | 4 | 204 | 43.7499020 | 10.6650511 | 42.545 | 43.4560976 | 11.297412 | 17.35 | 72.59 | 55.24 | 0.2356459 | -0.3605210 | 0.7467029 |
| grad | 5 | 204 | 0.2254902 | 0.4189328 | 0.000 | 0.1585366 | 0.000000 | 0.00 | 1.00 | 1.00 | 1.3040954 | -0.3007374 | 0.0293312 |

Multilevel modeling analyses typically require a long data set, so we also reshape the data from wide to long format.

```
#reshaping wide to long
verblong <- reshape(data = wiscsub,
                    varying = c("verb2","verb4","verb6"),
                    timevar = "grade", idvar = "id",
                    direction = "long", sep = "")
#reordering for convenience
verblong <- verblong[order(verblong$id, verblong$grade), c("id", "grade", "verb", "grad")]
#looking at dataset
head(verblong,12)
```

| | id | grade | verb | grad |
|---|---|---|---|---|
| 1.2 | 1 | 2 | 26.98 | 0 |
| 1.4 | 1 | 4 | 39.61 | 0 |
| 1.6 | 1 | 6 | 55.64 | 0 |
| 2.2 | 2 | 2 | 14.38 | 0 |
| 2.4 | 2 | 4 | 21.92 | 0 |
| 2.6 | 2 | 6 | 37.81 | 0 |
| 3.2 | 3 | 2 | 33.51 | 1 |
| 3.4 | 3 | 4 | 34.30 | 1 |
| 3.6 | 3 | 6 | 50.18 | 1 |
| 4.2 | 4 | 2 | 28.39 | 1 |
| 4.4 | 4 | 4 | 42.16 | 1 |
| 4.6 | 4 | 6 | 44.72 | 1 |

For clarity, let's consider the basic information representation of the 3-occasion repeated measures data. In particular, data (even non-repeated measures data) are summarized (at the sample level) by a …

*Means Vector*, and a

*Variance-Covariance Matrix* …

```
#mean vector (from wide data)
meanvector <- sapply(wiscsub[ , c("verb2", "verb4", "verb6")], mean, na.rm=TRUE)
meanvector
```

```
## verb2 verb4 verb6
## 25.41534 32.60775 43.74990
```

```
#variance-covariance matrix (from wide data)
varcovmatrix <- cov(wiscsub[ , c("verb2", "verb4", "verb6")], use = "pairwise.complete.obs")
varcovmatrix
```

```
## verb2 verb4 verb6
## verb2 37.28784 33.81957 47.40488
## verb4 33.81957 53.58070 62.25489
## verb6 47.40488 62.25489 113.74332
```

Making visual counterparts can also be extremely useful - especially for facilitating higher-level conversations in a research group.

Basic Sample-level descriptions in visual form.

```
#boxplot by grade (from long data)
#note that the time variable has been converted to a factor = categorical
ggplot(data = verblong, aes(x = factor(grade), y = verb)) +
  geom_boxplot(notch = TRUE) +
  stat_summary(fun.y = "mean", geom = "point", shape = 23, size = 3, fill = "white") +
  labs(x = "Grade", y = "Verbal Ability")
```

```
#correlations across grade
#in the psych library (from wide data)
pairs.panels(wiscsub[, c("verb2", "verb4", "verb6")])
```

```
#require(GGally)
#ggpairs(wiscsub[ ,c("verb2","verb4","verb6")])
```

*Always be careful about the scaling of the x- and y-axes in these plots.* Side note: What would these correlations be if we had sampled *different* children at each grade (i.e., in a full between-person design)?
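
To build intuition for that side note, here is a small, purely illustrative sketch (the shuffling scheme is mine, not part of the WISC analysis): independently permuting each column mimics sampling *different* children at each grade, which breaks the within-person linkage and should push the cross-grade correlations toward zero.

```
#illustrative only: independently shuffle each column to mimic a between-person design
set.seed(1234)
shuffled <- data.frame(lapply(wiscsub[ , c("verb2", "verb4", "verb6")], sample))
round(cor(shuffled), 2)  #cross-grade correlations should now hover near zero
```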

**The objective of all of our analyses is to ‘deconstruct’ this information into meaningful/interpretable pieces. Each ‘model’ does it in a different way, with different assumptions.**

Key practical differences among models, and within the R analysis framework, have to do with how variables are conceived and treated. For example, the *time* variable can be considered as having been measured on either a categorical scale or a continuous scale. Different conceptions of *time* suggest use of different models, and different interpretations and inferences.
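
As a small, purely illustrative sketch of that distinction (not part of the analyses that follow), the same outcome can be modeled with time treated either way:

```
#illustrative only: the same time variable, treated two different ways
#categorical time: one mean per grade (ANOVA-style)
lm(verb ~ factor(grade), data = verblong)
#continuous time: a single straight-line slope in grade (growth-model flavor)
lm(verb ~ grade, data = verblong)
```

The first model estimates a separate mean for grades 2, 4, and 6; the second forces a straight line across grade - different parameters, and different interpretations.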

We can check how specific variables are ‘coded’ (*‘structured’*) within R. Note that *you control R*. It is possible, and often necessary, to change the structure of an object. You can tell R to do what you want!

Let’s check how R is holding our variables - the ‘structure’ of the objects.

`str(verblong)`

```
## 'data.frame': 612 obs. of 4 variables:
## $ id : int 1 1 1 2 2 2 3 3 3 4 ...
## $ grade: num 2 4 6 2 4 6 2 4 6 2 ...
## $ verb : num 27 39.6 55.6 14.4 21.9 ...
## $ grad : int 0 0 0 0 0 0 1 1 1 1 ...
```

And let’s look at some specific variables.

The “id” variable …

`str(verblong$id)`

`## int [1:612] 1 1 1 2 2 2 3 3 3 4 ...`

We see that “verblong$id” is a vector of integers, and has mean …

`mean(verblong$id)`

`## [1] 102.5`

But, we actually treat “id” as a *factor*, a categorical variable with *N* = 204 levels. Let’s make this explicit.

```
verblong$id <- factor(verblong$id)
str(verblong$id)
```

`## Factor w/ 204 levels "1","2","3","4",..: 1 1 1 2 2 2 3 3 3 4 ...`

Now, R will treat “id” as a categorical variable. This has consequences. For example, it is no longer appropriate to calculate an arithmetic mean.

`mean(verblong$id)`

```
## Warning in mean.default(verblong$id): argument is not numeric or logical:
## returning NA
```

`## [1] NA`

How about the structure of our time variable, “grade”?

`str(verblong$grade)`

`## num [1:612] 2 4 6 2 4 6 2 4 6 2 ...`

*Numeric*. Interesting … (and good to know!).

Now let's get into the models. We start with a traditional ANOVA framework …

In the ANOVA modeling world, we are interested in *mean differences* across levels of one or more *factors*.

Research questions are phrased as, for example, …

Does the mean verbal ability differ between children with mothers who did or did not graduate from high school?

Does the mean verbal ability differ between children across grade levels?

For review, let's do a one-way ANOVA with a between-person factor. This is not our real situation, so *we make a FAKE example* …

Preliminaries: In the traditional ANOVA world we generally treat *time as a categorical variable* (think experimental design, with random assignment to groups - the full between-person design we considered earlier). We must tell R that “grade” is a categorical variable = factor()

`str(verblong$grade)`

`## num [1:612] 2 4 6 2 4 6 2 4 6 2 ...`

```
verblong$grade <- factor(verblong$grade)
str(verblong$grade)
```

`## Factor w/ 3 levels "2","4","6": 1 2 3 1 2 3 1 2 3 1 ...`

We must also tell R that “id” is a categorical variable = factor()

`verblong$id <- factor(verblong$id)`

Let’s attach the data for convenience. (Be careful doing this! For good programming practice, whenever you make an attach(), ALSO make a detach() and carry it with you)

`attach(verblong)`

Let’s first pretend that these are not repeated measures and do a **One-Way ANOVA**

```
model_foneway <- aov(verb ~ grade, data = verblong)
summary(model_foneway)
```

```
## Df Sum Sq Mean Sq F value Pr(>F)
## grade 2 34818 17409 255.3 <2e-16 ***
## Residuals 609 41536 68
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

Check that the degrees of freedom make sense! In our case we have 612 complete records, which we are pretending are from 612 *different* children.
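
A quick sanity check on those degrees of freedom, using the objects already in hand:

```
#sanity check on the degrees of freedom
nrow(verblong)                                       #612 observations
nlevels(verblong$grade) - 1                          #2 df for grade
nrow(verblong) - 1 - (nlevels(verblong$grade) - 1)   #609 residual df
```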

Significance of the overall test pushes us to do follow-up tests …

`TukeyHSD(model_foneway, "grade")`

```
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = verb ~ grade, data = verblong)
##
## $grade
## diff lwr upr p adj
## 4-2 7.192402 5.271206 9.113598 0
## 6-2 18.334559 16.413362 20.255755 0
## 6-4 11.142157 9.220960 13.063353 0
```

Or as a set of pairwise t.tests, *with adjustment* of p-values …

```
with(verblong, pairwise.t.test(verb, grade,
                               p.adjust.method = "holm", paired = TRUE))
```

```
##
## Pairwise comparisons using paired t tests
##
## data: verb and grade
##
## 2 4
## 4 <2e-16 -
## 6 <2e-16 <2e-16
##
## P value adjustment method: holm
```

Okay - that was for review about one-way ANOVA (with a between-person factor) …

Really, we should better accommodate the **repeated measures** design. The “grade” variable is a *within-person factor*.

Let’s run the one-way RM ANOVA

```
model_aov1 <- aov(verb ~ grade + Error(id), data = verblong)
summary(model_aov1)
```

```
##
## Error: id
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 203 33263 163.9
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## grade 2 34818 17409 854.3 <2e-16 ***
## Residuals 406 8273 20
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```


ANOVA models are typically reported along with the means. Get a table of all means (e.g., grand mean, level means, cell means)

`model.tables(model_aov1, type = "means", se = TRUE)`

```
## Warning in model.tables.aovlist(model_aov1, type = "means", se = TRUE): SEs
## for type 'means' are not yet implemented
```

```
## Tables of means
## Grand mean
##
## 33.92433
##
## grade
## grade
## 2 4 6
## 25.42 32.61 43.75
```

`model.tables(model_aov1, type = "effects", se = TRUE)`

```
## Tables of effects
##
## grade
## grade
## 2 4 6
## -8.509 -1.317 9.826
##
## Standard errors of effects
## grade
## 0.3161
## replic. 204
```

Note that formally, here, the standard error being reported is for the *mean difference* effect, not for the means themselves.
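
If it helps to see where that number comes from, here is a minimal by-hand sketch (my own arithmetic, using the Within-stratum residual MS from the RM ANOVA table above): the SE of a grade effect is sqrt(MS_within / number of persons).

```
#reconstructing the reported standard error of effects
ms_within <- 8273 / 406   #residual mean square from the Within stratum (~20.4)
sqrt(ms_within / 204)     #~0.316, matching the "Standard errors of effects" output
```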

After doing the RMANOVA, the significance of the “grade” effect suggests that we should locate where the difference is …

How should we do the contrasts between levels of the within-person factor?

**BASED ON THEORY!!**

(if we are running all possible contrasts, we are in an exploratory world)

Studies are always designed with a purpose (i.e., based on some research questions), so there is usually some theory available. Hopefully, going back to the research question, there were some hypotheses about how mean levels of verbal ability would differ across levels.

But, … we may also take a more data-driven approach. This puts us in an “exploratory” analysis world. Let’s see …

Well, we know that these are ordered, equal-interval repeated measures …

We also know that mathematical functions for *change* are often based on polynomials (linear, quadratic, cubic, …) …

Let’s go back and tell R that “grade” is an *ordered factor*. We need this, or the polynomial contrasts won’t work.

R examines the values and uses them to determine the ordering of the levels. The values for “grade” are 2 4 6, so R gets them in the correct order. However, if the values had been words, R would have ordered them alphabetically - BE CAREFUL!

(to reorder use the levels option of factor(), see e.g., http://stackoverflow.com/questions/18413756/re-ordering-factor-levels-in-data-frame)
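
A tiny illustration of that trap, with hypothetical word labels (not our data):

```
#default level ordering is alphabetical - probably not what you want
factor(c("two", "four", "six"), ordered = TRUE)
#explicitly supply the levels to get the intended order
factor(c("two", "four", "six"), levels = c("two", "four", "six"), ordered = TRUE)
```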

```
verblong$grade <- factor(verblong$grade, ordered = TRUE)
str(verblong$grade)
```

`## Ord.factor w/ 3 levels "2"<"4"<"6": 1 2 3 1 2 3 1 2 3 1 ...`

It seems to be a good idea to set the default contrasts as below; these are orthogonal contrasts and will give a better sphericity test.

`options(contrasts = c("contr.helmert", "contr.poly"))`

Now let’s run the RM Anova model again …

RM Anova with 3 equidistant occasions

```
model_aov2 <- aov(verb ~ grade + Error(id), data = verblong,
                  contrasts = c("contr.helmert", "contr.poly"))
summary(model_aov2)
```

```
##
## Error: id
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 203 33263 163.9
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## grade 2 34818 17409 854.3 <2e-16 ***
## Residuals 406 8273 20
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

This is all the same as above. But the ordering of grade facilitates construction of polynomial contrasts.

To get the polynomial contrasts, we use the summary function and the split= option to “split out” the degrees of the polynomial.

Note: in the code, the words “Linear” and “Quadratic” are just convenient labels

`summary(model_aov2, split = list(grade = list("Linear"=1, "Quadratic"=2)))`

```
##
## Error: id
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 203 33263 163.9
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## grade 2 34818 17409 854.33 < 2e-16 ***
## grade: Linear 1 34288 34288 1682.64 < 2e-16 ***
## grade: Quadratic 1 530 530 26.03 5.17e-07 ***
## Residuals 406 8273 20
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

Side note: For the contrasts, the df, SS, and MS values agree with SAS, but the F values do not. SAS decomposes the Residual SS to get a separate error term for each degree of freedom. R uses the single Residual SS and Residual MS in each F test. Not sure whether the SAS or R approach is “best.”
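
For the curious, one way to mimic the SAS-style behavior is to test each set of per-person trend scores with its own error term (a sketch along the lines of the trend-score block further below; for a single-df contrast, the squared t equals the F with 1 and 203 df).

```
#sketch: give the linear contrast its own error term via per-person trend scores
lin_scores <- (-1*wiscsub$verb2 + 0*wiscsub$verb4 + 1*wiscsub$verb6)/sqrt(2)
t.test(lin_scores)   #t^2 is the F for the linear contrast tested against its own error
```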

What did the analysis do? What contrasts did it use? …

In the background, R assigned the polynomial contrasts for the 3 ordered levels (i.e., `contr.poly(3)`) to the grade factor (which is an Ord.factor w/ 3 levels “2”<“4”<“6”).

```
#explicitly assign (and display) the polynomial contrasts for the 3 ordered levels
contrasts(grade) <- contr.poly(3)
contrasts(grade)
```

```
## .L .Q
## 2 -7.071068e-01 0.4082483
## 4 -7.850462e-17 -0.8164966
## 6 7.071068e-01 0.4082483
```


These are the orthonormal polynomial contrasts. I assume that contrast codes are covered better in other courses. When I need them for specific applications, I look them up.

Note: In making these contrasts, R does not know the true “distances” between the levels - it only knows the levels are ordered. We are making an *interpretation* that the 2, 4, 6 indicate equivalent distances. In all cases we must consider what assumptions are embedded in our interpretation of time variables - and try to make them explicit.
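
Two quick, illustrative checks along those lines: the contrast columns really are orthonormal, and `contr.poly()` has a `scores` argument that can build contrasts for unequally spaced occasions if the equal-distance interpretation is not appropriate.

```
#the contrast columns are orthonormal: crossprod gives the identity matrix
round(crossprod(contr.poly(3)), 10)
#polynomial contrasts for hypothetical unequally spaced occasions (e.g., grades 2, 4, 7)
contr.poly(3, scores = c(2, 4, 7))
```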

**The Sphericity Assumption**

RMANOVA models make a strong assumption about the variance-covariance structure (we will get to the specifics further down).

**We need to test for Sphericity = Mauchly’s criterion**

(Yes, someone built an R function that does the test, and the traditional corrections - in the `ez` package.)

```
#from the ez package
ezANOVA(data = verblong, dv = .(verb), within = .(grade), wid = .(id), type = 3, detailed = TRUE)
```

```
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 203 704326.42 33262.940 4298.4253 1.423624e-138 *
## 2 grade 2 406 34818.34 8273.268 854.3326 3.211724e-146 *
## ges
## 1 0.9443112
## 2 0.4560087
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 2 grade 0.7790063 1.110806e-11 *
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 2 grade 0.819005 2.49192e-120 * 0.8247685 3.732699e-121 *
```

Annotated output below …

```
# $ANOVA
# Effect DFn DFd SSn SSd F p p<.05 ges = generalized eta square, effect size measure
# 1 (Intercept) 1 203 704326.42 33262.940 4298.4253 1.423624e-138 * 0.9443112
# 2 grade 2 406 34818.34 8273.268 854.3326 3.211724e-146 * 0.4560087
### The above test assumes sphericity
# $`Mauchly's Test for Sphericity`
# Effect W p p<.05
# 2 grade 0.7790063 1.110806e-11 *
### This indicates that sphericity was violated
#
# $`Sphericity Corrections`
# Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
# 2 grade 0.819005 2.49192e-120 * 0.8247685 3.732699e-121 *
### Correction for sphericity
### GGe = Greenhouse-Geisser epsilon (in this case still significant)
### HFe = Huynh-Feldt epsilon (in this case still significant)
### generally if epsilon > .8 then violation not too severe
```
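
As a rough check on where the GGe value comes from, here is a minimal by-hand sketch applying the standard Greenhouse-Geisser formula to the observed covariance matrix (`varcovmatrix` from earlier); it should land close to the 0.819 that `ezANOVA()` reports.

```
#by-hand Greenhouse-Geisser epsilon from the observed covariance matrix
Cmat <- t(contr.poly(3))                       #2 x 3 orthonormal contrast matrix
CSC  <- Cmat %*% varcovmatrix %*% t(Cmat)      #covariance matrix of the contrast scores
sum(diag(CSC))^2 / ((3 - 1) * sum(CSC^2))      #~0.82
```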

This next block of code goes into some RM ANOVA procedures that I am now skipping.

```
#Let's do this same set of contrasts a different way
#Starting with the wide data
#Make some new variables -
#Trend scores, calculated using orthonormal effect codes (for 3 equally spaced occasions)
#This is basically transforming our data
wiscsub$verbave <- ( 1*wiscsub$verb2 + 1*wiscsub$verb4 + 1*wiscsub$verb6)/sqrt(3) #flat line
wiscsub$verblin <- (-1*wiscsub$verb2 + 0*wiscsub$verb4 + 1*wiscsub$verb6)/sqrt(2) #straight line
wiscsub$verbqua <- ( 1*wiscsub$verb2 - 2*wiscsub$verb4 + 1*wiscsub$verb6)/sqrt(6) #curved line
describe(wiscsub[ ,c("verbave","verblin","verbqua")])
# vars n mean sd median trimmed mad min max range skew kurtosis se
# verbave 1 204 58.76 12.80 57.88 58.52 14.58 20.73 91.80 71.07 0.10 -0.44 0.90
# verblin 2 204 12.96 5.30 12.71 12.76 4.85 1.61 28.14 26.54 0.38 -0.06 0.37
# verbqua 3 204 1.61 3.56 1.76 1.59 3.15 -6.57 11.79 18.36 0.14 -0.08 0.25
#Setting up the models
#Using the car package to get Type III sum of squares
# For average
model_ave <- lm(verbave ~ 1, data = wiscsub)
summary(model_ave) #this gives us regression results
Anova(model_ave, type="III") #this gives us anova results
# For linear trend scores
model_lin <- lm(verblin ~ 1, data = wiscsub)
summary(model_lin) #this gives us regression results
Anova(model_lin, type="III") #this gives us anova results
# For quadratic trend scores
model_qua <- lm(verbqua ~ 1, data = wiscsub)
summary(model_qua) #this gives us regression results
Anova(model_qua, type="III") #this gives us anova results
#Notes on above - the Sum of Squares is the same as above analysis
#The F-tests are different - because the residual is treated differently (separate) as noted above
```

There is also a shorter way using Anova() from the `car` package …

Prelim: check the form of the time variable - it needs to be an ordered factor.

`str(verblong$grade)`

`## Ord.factor w/ 3 levels "2"<"4"<"6": 1 2 3 1 2 3 1 2 3 1 ...`

Anova() analysis:

```
#from car package
model_quick <- lm(verb ~ C(grade, contr.poly), data = verblong)
summary(model_quick) #but note that these are t-tests and not F-tests
```

```
##
## Call:
## lm(formula = verb ~ C(grade, contr.poly), data = verblong)
##
## Residuals:
## Min 1Q Median 3Q Max
## -26.3999 -5.8524 -0.1076 4.9842 28.8401
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.9243 0.3338 101.621 < 2e-16 ***
## C(grade, contr.poly).L 12.9645 0.5782 22.422 < 2e-16 ***
## C(grade, contr.poly).Q 1.6125 0.5782 2.789 0.00546 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.259 on 609 degrees of freedom
## Multiple R-squared: 0.456, Adjusted R-squared: 0.4542
## F-statistic: 255.3 on 2 and 609 DF, p-value: < 2.2e-16
```

`Anova(model_quick, type="III") #This lumps them all together into 1 test`

| | Sum Sq | Df | F value | Pr(>F) |
|---|---|---|---|---|
| (Intercept) | 704326.42 | 1 | 10326.7683 | 0 |
| C(grade, contr.poly) | 34818.34 | 2 | 255.2516 | 0 |
| Residuals | 41536.21 | 609 | NA | NA |

Ok - that is all good - although it does not give us any information about (or formal tests of) the sphericity assumption.

Let’s move towards a “modern” multilevel framework. I write “modern” here because all of my colleagues who are doing RMANOVAs these days, mostly with data from laboratory experiments, are placing them within the multilevel framework. Some good resources …

- http://www.ats.ucla.edu/stat/r/seminars/Repeated_Measures/repeated_measures.htm
- http://www.statpower.net/Content/GCM/Lectures/SW07.pdf
- https://www.zoology.ubc.ca/~schluter/R/fit-model/

Here, we make use of the `nlme` (linear and nonlinear mixed effects) package. We run the typical model with the `gls()` function.

We set the model up with the usual Compound Symmetry error structure.

Preliminary: First, let's remove the orderedness of grade. Note, however, that the default contrasts we set earlier (Helmert) are still in effect, so in the output below the intercept is the grand mean rather than the mean at the first level.

```
verblong$grade <- factor(verblong$grade, ordered = FALSE)
str(verblong$grade)
```

`## Factor w/ 3 levels "2","4","6": 1 2 3 1 2 3 1 2 3 1 ...`

Ok - and now run the RMANOVA model.

```
gls_fit1 <- gls(verb ~ grade,
                correlation = corCompSymm(form = ~1|id),
                data = verblong)
summary(gls_fit1)
```

```
## Generalized least squares fit by REML
## Model: verb ~ grade
## Data: verblong
## AIC BIC logLik
## 4016.76 4038.819 -2003.38
##
## Correlation Structure: Compound symmetry
## Formula: ~1 | id
## Parameter estimate(s):
## Rho
## 0.7012269
##
## Coefficients:
## Value Std.Error t-value p-value
## (Intercept) 33.92433 0.5174359 65.56238 0
## grade1 3.59620 0.2234835 16.09157 0
## grade2 4.91279 0.1290283 38.07527 0
##
## Correlation:
## (Intr) grade1
## grade1 0
## grade2 0 0
##
## Standardized residuals:
## Min Q1 Med Q3 Max
## -3.19666804 -0.70864605 -0.01303162 0.60352054 3.49214250
##
## Residual standard error: 8.258568
## Degrees of freedom: 612 total; 609 residual
```

`anova(gls_fit1)`

| | numDF | F-value | p-value |
|---|---|---|---|
| (Intercept) | 1 | 4298.4253 | 0 |
| grade | 2 | 854.3326 | 0 |

Let’s look at the model implied **means vector**.

```
#Make a fake dataset
evaldata <- data.frame(grade = c("2","4","6")) #remember grade is a factor
#evaluate the model at those values
predict(gls_fit1, newdata = evaldata)
```

```
## [1] 25.41534 32.60775 43.74990
## attr(,"label")
## [1] "Predicted values"
```

Let’s look at the model implied **variance-covariance matrix** (note: this is a difficult-to-find function)

`getVarCov(gls_fit1)`

```
## Marginal variance covariance matrix
## [,1] [,2] [,3]
## [1,] 68.204 47.826 47.826
## [2,] 47.826 68.204 47.826
## [3,] 47.826 47.826 68.204
## Standard Deviations: 8.2586 8.2586 8.2586
```

The “spherical” assumption - the *Compound Symmetry* structure embedded in this model - is evident.
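
To see exactly how that matrix is built, here is a small sketch reconstructing it from the two numbers in the `gls` summary (the residual SD and the compound-symmetry Rho):

```
#rebuilding the compound symmetry matrix from the gls_fit1 estimates
sigma2 <- 8.258568^2    #residual variance, ~68.20
rho    <- 0.7012269     #common within-person correlation
cs <- matrix(rho * sigma2, nrow = 3, ncol = 3)
diag(cs) <- sigma2
cs                      #matches the getVarCov(gls_fit1) output above
```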

How do these compare with the observed means and variance-covariances.

```
#mean vector of observed data
meanvector
```

```
## verb2 verb4 verb6
## 25.41534 32.60775 43.74990
```

```
#variance-covariance matrix of observed data
varcovmatrix
```

```
## verb2 verb4 verb6
## verb2 37.28784 33.81957 47.40488
## verb4 33.81957 53.58070 62.25489
## verb6 47.40488 62.25489 113.74332
```

We see that the means vector is recovered precisely, but that the variance-covariances are not.

With this basic background/review in place, we move into Part 2: a multilevel approach from the ground up - embedded with some Matrix Algebra.

Don’t forget the `detach()`!

`detach(verblong)`