This session we work through the probing of interactions (moderation) in a growth modeling framework. In particular we provide exposure to the Johnson-Neyman method for identifying zones of significance. We used existing packages and plotting function.
library(psych)
library(ggplot2)
library(nlme) #for mixed effects models
library(reghelper) #for probing interactions
For our example, we use 4-occasion WISC data.
Load the repeated measures data
############################
####### Reading in the Data
############################
#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/wisc3raw.csv"
#read in the .csv file using the url() function
wisc3raw <- read.csv(file=url(filepath),header=TRUE)
Subsetting to the variables of interest. Specifically, we include the id
variable; the repeated measures outcome variables verb1
, verb2
, verb4
, verb6
; and the predictors grad
and momed
variables.
#Creating a list of variables of interest
varnames <- c("id","verb1","verb2","verb4","verb6","grad","momed")
#sub-setting the columns of interest
wiscsub <- wisc3raw[ ,varnames]
#Looking at the sample-level descriptives for the chosen variables
describe(wiscsub)
## vars n mean sd median trimmed mad min max range skew
## id 1 204 102.50 59.03 102.50 102.50 75.61 1.00 204.00 203.00 0.00
## verb1 2 204 19.59 5.81 19.34 19.50 5.41 3.33 35.15 31.82 0.13
## verb2 3 204 25.42 6.11 25.98 25.40 6.57 5.95 39.85 33.90 -0.06
## verb4 4 204 32.61 7.32 32.82 32.42 7.18 12.60 52.84 40.24 0.23
## verb6 5 204 43.75 10.67 42.55 43.46 11.30 17.35 72.59 55.24 0.24
## grad 6 204 0.23 0.42 0.00 0.16 0.00 0.00 1.00 1.00 1.30
## momed 7 204 10.81 2.70 11.50 11.00 2.97 5.50 18.00 12.50 -0.36
## kurtosis se
## id -1.22 4.13
## verb1 -0.05 0.41
## verb2 -0.34 0.43
## verb4 -0.08 0.51
## verb6 -0.36 0.75
## grad -0.30 0.03
## momed 0.01 0.19
We note the characteristics of our person-level predictors, grad
and momed
. We will want to retain information about the standard deviation, range, and plausible values of these variables (most easily done in a wide data set).
#descriptives of person-level predictors
table(wiscsub$grad)
##
## 0 1
## 158 46
describe(wiscsub$momed)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 204 10.81 2.7 11.5 11 2.97 5.5 18 12.5 -0.36 0.01
## se
## X1 0.19
Growth models (in the multilevel framework) require a long data set. So, we also reshape from wide to long in order to have a long data set.
#reshaping wide to long
verblong <- reshape(data=wiscsub,
varying=c("verb1","verb2","verb4","verb6"),
timevar="grade", idvar="id",
direction="long", sep="")
#reordering for convenience
verblong <- verblong[order(verblong$id,verblong$grade),c("id","grade","verb","grad","momed")]
#looking at dataset
head(verblong,12)
## id grade verb grad momed
## 1.1 1 1 24.42 0 9.5
## 1.2 1 2 26.98 0 9.5
## 1.4 1 4 39.61 0 9.5
## 1.6 1 6 55.64 0 9.5
## 2.1 2 1 12.44 0 5.5
## 2.2 2 2 14.38 0 5.5
## 2.4 2 4 21.92 0 5.5
## 2.6 2 6 37.81 0 5.5
## 3.1 3 1 32.43 1 14.0
## 3.2 3 2 33.51 1 14.0
## 3.4 3 4 34.30 1 14.0
## 3.6 3 6 50.18 1 14.0
#plotting the individual trajectories
#plotting intraindividual change
ggplot(data = verblong, aes(x = grade, y = verb, group = id)) +
geom_point() +
geom_line() +
xlab("Grade") +
ylab("WISC Verbal Score") + ylim(0,100) +
scale_x_continuous(breaks=seq(1,6,by=1))
We look at the within-person linear relation beween the time variable (grade) and the outcome variable (verb).
Linear growth model (MLM): random intercepts and random slopes.
model0 <- lme(fixed = verb ~ 1 + grade,
random = ~ 1 + grade|id,
data=verblong,
na.action = na.exclude)
summary(model0)
## Linear mixed-effects model fit by REML
## Data: verblong
## AIC BIC logLik
## 5053.632 5081.844 -2520.816
##
## Random effects:
## Formula: ~1 + grade | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 3.915536 (Intr)
## grade 1.241299 0.321
## Residual 3.581590
##
## Fixed effects: verb ~ 1 + grade
## Value Std.Error DF t-value p-value
## (Intercept) 15.15099 0.3686513 611 41.09844 0
## grade 4.67339 0.1087023 611 42.99255 0
## Correlation:
## (Intr)
## grade -0.155
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.61615621 -0.54270691 -0.02898115 0.52402090 3.16468334
##
## Number of Observations: 816
## Number of Groups: 204
VarCorr(model0)
## id = pdLogChol(1 + grade)
## Variance StdDev Corr
## (Intercept) 15.331421 3.915536 (Intr)
## grade 1.540823 1.241299 0.321
## Residual 12.827783 3.581590
intervals(model0)
## Approximate 95% confidence intervals
##
## Fixed effects:
## lower est. upper
## (Intercept) 14.427016 15.15099 15.874970
## grade 4.459914 4.67339 4.886865
## attr(,"label")
## [1] "Fixed effects:"
##
## Random Effects:
## Level: id
## lower est. upper
## sd((Intercept)) 3.255506646 3.9155359 4.7093810
## sd(grade) 1.057964280 1.2412989 1.4564035
## cor((Intercept),grade) 0.006320221 0.3205334 0.5771394
##
## Within-group standard error:
## lower est. upper
## 3.343438 3.581590 3.836705
We can obtain predicted scores and plot them
#obtaining predicted scores for individuals
verblong$pred.model0 <- predict(model0)
#obtaining predicted scores for prototype
verblong$proto.model0 <- predict(model0, level=0)
#plotting predicted trajectories
ggplot(data = verblong, aes(x = grade, y = pred.model0, group = id)) +
#geom_point() +
geom_line(color="gray", alpha=.8) +
geom_line(aes(x = grade, y = proto.model0), color="red",size=2) +
xlab("Grade") +
ylab("WISC Verbal Score") + ylim(0,100) +
scale_x_continuous(breaks=seq(1,6,by=1))
We can add momed
as a predictor of intercept and slope
Momed as a (continuous) predictor
Need to center momed variable at sample-level mean. Note this is done using the wide data set
describe(wiscsub$momed)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 204 10.81 2.7 11.5 11 2.97 5.5 18 12.5 -0.36 0.01
## se
## X1 0.19
#Calculating the mean
momed.mean <- mean(wiscsub$momed)
momed.mean
## [1] 10.81127
#Calculating the sd for later use in plots
momed.sd <- sd(wiscsub$momed)
momed.sd
## [1] 2.698279
#Computing centered variable in long data
verblong$momed.c <- (verblong$momed-momed.mean)
describe(verblong$momed.c)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 816 0 2.69 0.69 0.19 2.97 -5.31 7.19 12.5 -0.36 0.03
## se
## X1 0.09
Fitting conditional growth model with momed (centered) as predictor
model1 <- lme(fixed= verb ~ 1 + grade + momed.c + grade:momed.c,
random= ~ 1 + grade|id,
data=verblong,
na.action = na.exclude)
summary(model1)
## Linear mixed-effects model fit by REML
## Data: verblong
## AIC BIC logLik
## 5000.914 5038.51 -2492.457
##
## Random effects:
## Formula: ~1 + grade | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 3.394993 (Intr)
## grade 1.158314 0.163
## Residual 3.581594
##
## Fixed effects: verb ~ 1 + grade + momed.c + grade:momed.c
## Value Std.Error DF t-value p-value
## (Intercept) 15.150993 0.3424175 610 44.24713 0
## grade 4.673390 0.1041157 610 44.88652 0
## momed.c 0.734066 0.1272144 202 5.77030 0
## grade:momed.c 0.169838 0.0386809 610 4.39075 0
## Correlation:
## (Intr) grade momd.c
## grade -0.301
## momed.c 0.000 0.000
## grade:momed.c 0.000 0.000 -0.301
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.56783078 -0.54388373 -0.01862405 0.51810895 3.19394034
##
## Number of Observations: 816
## Number of Groups: 204
We see that there is a significant “interaction”. grad is a predictor of the linear slope, or said differently, momed moderates the within-person relation between grade and verb.
We can obtain predicted scores and plot them
#obtaining predicted scores for individuals
verblong$pred.model1 <- predict(model1)
#obtaining predicted scores for prototype
verblong$proto.model1 <- predict(model1, level=0)
Here, this has calculated the population-level model for every level of momed that is in the data (not just the -1SD and +1SD prototypes)
#plotting predicted trajectories
ggplot(data = verblong, aes(x = grade, y = pred.model1, group = id, color=momed.c)) +
#geom_point() +
geom_line(alpha=.2) +
geom_line(aes(x = grade, y = proto.model1),size=2) +
xlab("Grade") +
ylab("WISC Verbal Score") + ylim(0,100) +
scale_x_continuous(breaks=seq(1,6,by=1))
#or without the distraction of the individual lines
#plotting predicted trajectories
ggplot(data = verblong, aes(x = grade, y = pred.model1, group = id, color=momed.c)) +
#geom_point() +
#geom_line(alpha=.2) +
geom_line(aes(x = grade, y = proto.model1),size=2) +
xlab("Grade") +
ylab("WISC Verbal Score") + ylim(0,100) +
scale_x_continuous(breaks=seq(1,6,by=1))
We can see how the moderation works, but would want to do create specific set of lines for parsimonious communication (the moderation effect is a linear effect, so we really only need 2 lines to show how it works).
We can use the ‘reghelper’ package to calculate simple slopes for interactions from MLMs (nlme and lme4 output) as well as from lm, aov and glm output. Package specifics are here https://cran.r-project.org/web/packages/reghelper/reghelper.pdf
For the broader set of procedures an informative set of code and explanation is here, as presented for identical functions in the jtools package https://cran.r-project.org/web/packages/jtools/vignettes/interactions.html
Neither of these packages has yet implemented the Johnson-Neyman technique for the multilevel model situation (only for lm and glm models so far, see https://cran.r-project.org/web/packages/reghelper/README.html)
Johnson-Neyman implementation for multilevel models can be done through the quantpsy.org calculator at http://www.quantpsy.org/interact/index.htm .
Evaluating simple slopes.
# simple slopes evaluated at mean, +/- 1SD using library(reghelper)
#checking the standard deviation of momed variable
momed.sd
## [1] 2.698279
#calculating the simple slopes using the helper function
simple_slopes(model1) #only need to give model name; the package will extract simple slopes for the first interaction in the model if there are multiple
## grade momed.c Test Estimate Std. Error t value df Pr(>|t|) Sig.
## 1 1.328536 sstest 0.9597 0.1220 7.8641 202 2.194e-13 ***
## 2 3.25 sstest 1.2860 0.1496 8.5987 202 2.227e-15 ***
## 3 5.171464 sstest 1.6124 0.2022 7.9733 202 1.123e-13 ***
## 4 sstest -2.693308 4.2160 0.1473 28.6241 610 < 2.2e-16 ***
## 5 sstest 0 4.6734 0.1041 44.8865 610 < 2.2e-16 ***
## 6 sstest 2.693308 5.1308 0.1473 34.8355 610 < 2.2e-16 ***
These are simple slope tests (sstest), meaning that the Test Estimate is the estimate of the “slope”, or better, the relation between the target variable and outcome at each level of the moderator. Here, at -1SD of grade (grade = 1.33), the relation between momed.c and verb is 0.96, which is significnatly different than zero. For us, this evaluation is not informative, as we do not think of grade as the moderator. Rather, we consider momed as the moderator. So, at -1SD of momed.c (momed.c = -2.64), the relation between grade and verb is 4.22, which is significnatly different than zero. At +1SD of of momed.c (momed.c = +2.64), the relation between grade and verb is 5.13, which is significnatly different than zero. This aligns with what we saw in the plot, where the lighter blue lines (higher values of momed.c) had steeper slopes.
There is also a graphmodel()
helper function for graphing
# graph_model(model, y, x, lines = NULL, split = NULL,
# errorbars = c("CI", "SE", "none"), ymin = NULL, ymax = NULL,
# labels = NULL, bargraph = FALSE, draw.legend = TRUE, dodge = 0,
# exp = FALSE, ...)
#graphing interaction
graph_model(model1, y=verb, x=grade, lines=momed.c)
#in other situations it may be useful to use a bargraph option
graph_model(model1, y=verb, x=grade, lines=momed.c, bargraph = TRUE)
This plot structure may be more useful in non-growth model situations. However, since per place such a priority on the full trajectory (i.e., on the time variable), we may want to make versions that retain the full range on the x-axis.
We can change the levels of the predictor and moderator variables at which we test the significance of the simple slopes.
#setting values at which to evaluate
simple_slopes(model1, levels=list(momed.c=c(-5, 5, 'sstest'), grade=c(1, 2, 4, 6, 'sstest')))
## grade momed.c Test Estimate Std. Error t value df Pr(>|t|) Sig.
## 1 sstest -5 3.8242 0.2196 17.4106 610 < 2.2e-16 ***
## 2 sstest 5 5.5226 0.2196 25.1428 610 < 2.2e-16 ***
## 3 1 sstest 0.9039 0.1213 7.4501 202 2.665e-12 ***
## 4 2 sstest 1.0737 0.1275 8.4233 202 6.780e-15 ***
## 5 4 sstest 1.4134 0.1682 8.4041 202 7.652e-15 ***
## 6 6 sstest 1.7531 0.2287 7.6665 202 7.293e-13 ***
We can attempt to do the same thing with grad
as a predictor of intercept and slope
Grad is a (categorial) predictor (which is coded 0,1)
model2 <- lme(fixed= verb ~ 1 + grade + grad + grade*grad,
random= ~ 1 + grade|id,
data=verblong,
na.action = na.exclude)
summary(model2)
## Linear mixed-effects model fit by REML
## Data: verblong
## AIC BIC logLik
## 5031.753 5069.349 -2507.876
##
## Random effects:
## Formula: ~1 + grade | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 3.761291 (Intr)
## grade 1.194941 0.255
## Residual 3.581590
##
## Fixed effects: verb ~ 1 + grade + grad + grade * grad
## Value Std.Error DF t-value p-value
## (Intercept) 14.533795 0.4098491 610 35.46133 0.0000
## grade 4.483657 0.1205883 610 37.18152 0.0000
## grad 2.737137 0.8630981 202 3.17129 0.0018
## grade:grad 0.841424 0.2539460 610 3.31340 0.0010
## Correlation:
## (Intr) grade grad
## grade -0.215
## grad -0.475 0.102
## grade:grad 0.102 -0.475 -0.215
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.57955639 -0.53942830 -0.01662113 0.51067259 3.19891926
##
## Number of Observations: 816
## Number of Groups: 204
We see that there is a significant “interaction”. grad is a predictor of the linear slope, or said differently, grad moderates the within-person relation between grade and verb.
We can obtain predicted scores and plot them
#obtaining predicted scores for individuals
verblong$pred.model2 <- predict(model2)
#obtaining predicted scores for prototype
verblong$proto.model2 <- predict(model2, level=0)
#plotting predicted trajectories
ggplot(data = verblong, aes(x = grade, y = pred.model2, group = id, color=factor(grad))) +
#geom_point() +
geom_line(alpha=.2) +
geom_line(aes(x = grade, y = proto.model2),size=2) +
xlab("Grade") +
ylab("WISC Verbal Score") + ylim(0,100) +
scale_x_continuous(breaks=seq(1,6,by=1))
Evaluating simple slopes. The simple_slopes() and graph_model() functions do not appear to work in the binary case.
# #setting values at which to evaluate interaction
# simple_slopes(model2, levels=list(grad=c(0, 1, 'sstest')))
#graphing interaction
graph_model(model2, y=verb, x=grade, lines=grad)
We will need to make our own plots and evaluations.