1 Overview

This tutorial reviews the Actor-Partner Interdependence Model (APIM; Kashy & Kenny, 2000; Kenny, Kashy, & Cook, 2006), which is often used to examine the association (1) between two constructs for two people using cross-sectional data, or (2) between the same construct from two people across two time points.

2 Outline

In this tutorial, we’ll cover…

Useful descriptive statistics for dyadic/bivariate data
Dyadic/bivariate data preparation
APIM model using nlme package
Other resources

Before we begin, let’s load the libraries we need.

library(ggplot2)   #for plotting
library(nlme)      #for model fitting
library(psych)     #for general functions
library(reshape)   #for data management

We’ll be using our trusty WISC data, so we’ll read that data set in.

#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/wisc3raw_gender.csv"

#read in the .csv file using the url() function
wisc3raw <- read.csv(file=url(filepath),header=TRUE)

#subset to variables of interest
data <- wisc3raw[, c("id", "verb1", "verb6", "perfo1", "perfo6")]
head(data)

id	verb1	verb6	perfo1	perfo6
1	24.42	55.64	19.84	44.19
2	12.44	37.81	5.90	40.38
3	32.43	50.18	27.64	77.72
4	22.69	44.72	33.16	61.66
5	28.23	70.95	27.64	64.22
6	16.06	39.94	8.45	39.08

We’ll also make a long version of the data set for later use.

data_long <- reshape(data = data,
                    varying = c("verb1", "verb6",
                                "perfo1", "perfo6"),
                    timevar = c("grade"), 
                    idvar = c("id"),
                    direction = "long", sep="")

#sort for easy viewing - reorder by id and grade
data_long <- data_long[order(data_long$id, data_long$grade), ]

3 Problem Definition

In this tutorial, we are going to examine the association between verbal and performance ability using measures from first grade and sixth grade. We are interested in simultaneously examining whether (1) verbal ability in the first grade is predictive of verbal ability in the sixth grade, (2) performance ability in the first grade is predictive of performance ability in the sixth grade, (3) verbal ability in the first grade is predictive of performance ability in the sixth grade, and (4) performance ability in the first grade is predictive of verbal ability in the sixth grade.

When working with dyads, the above points 1 and 2 are often referred to as “actor effects” and points 3 and 4 are often referred to as “partner effects.”

While this example is not a “traditional dyad” - i.e., two distinguishable people - the analytic processes demonstrated here for bivariate data are applicable to the examination of dyadic data.

4 Descriptive Statistics for Dyadic Data

Before we run our models, it is often useful to get more familiar with the data via plotting and descriptives statistics.

Let’s begin with descriptive statistics of our four variables of interest: first grade verbal and performance ability, and sixth grade verbal and performance ability.

describe(data$verb1)

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
X1	1	204	19.58505	5.807695	19.335	19.49768	5.41149	3.33	35.15	31.82	0.1301071	-0.0528376	0.40662

describe(data$verb6)

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
X1	1	204	43.7499	10.66505	42.545	43.4561	11.29741	17.35	72.59	55.24	0.2356459	-0.360521	0.7467029

describe(data$perfo1)

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
X1	1	204	17.97706	8.349782	17.66	17.69116	8.295147	0	46.58	46.58	0.3510232	-0.1089726	0.5846017

describe(data$perfo6)

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
X1	1	204	50.93162	12.47987	51.765	51.07073	13.26927	10.26	89.01	78.75	-0.0637532	0.1836101	0.873766

We can see that both the mean and standard deviation of verbal and performance ability increase from first to sixth grade. While this is worth noting, the APIM will not be examining changes in mean levels of verbal and performance ability.

Let’s also examine the correlations (i.e., associations and rank order stability) among the four variables.

#correlations
cor(data[, 2:5])

##            verb1     verb6    perfo1    perfo6
## verb1  1.0000000 0.6541040 0.6101379 0.4779672
## verb6  0.6541040 1.0000000 0.6183155 0.6106694
## perfo1 0.6101379 0.6183155 1.0000000 0.6958321
## perfo6 0.4779672 0.6106694 0.6958321 1.0000000

#plot
pairs.panels(data[, c("verb1", "verb6", "perfo1", "perfo6")])

We can see that all four varibles are relatively nomrally distirbuted.

We can see there are strong, positive correlations both across time within-construct and between-constructs.

Let’s also look at a few individuals to examine the within-person association between verbal and performance ability from the first to sixth grade.

ggplot(data = subset(data_long, id <= 9), aes(x = grade, group = id), legend = FALSE) +
  geom_point(aes(x = grade, y = verb), shape = 17, size = 3, color = "purple") + 
  geom_point(aes(x = grade, y = perfo), shape = 19, size = 3, color = "pink") + 
  geom_line(aes(x = grade, y = verb), lty = 1, size=1, color = "purple") + 
  geom_line(aes(x = grade, y = perfo), lty = 1, size=1, color = "pink") + 
  xlab("Grade") + 
  ylab("Verbal and Performance Ability") + ylim(0, 80) +
  scale_x_continuous(breaks=seq(0, 7, by = 1)) + 
  theme_classic() +
  facet_wrap( ~ id)

Within these nine participants, there is some inter-individual variability. For example, some participants’ verbal and performance ability increase more than others, some participants have higher levels of verbal ability than performance ability (or vice versa) across time. The APIM model is all about rank-order interindividual differences, so these intra-individual change plots are not quite matched to the APIM-type research questions. (see the tutorial on two types of 2-occasion change models).

5 Dyadic data preparation

We have learned to manipulate data from “wide” to “long,” and vice versa. Dyadic/bivariate analyses require further manipulation in order to get the data in the correct format for our analyses. We will walk through the data prep in two steps.

First, we need to create one column that has the information for both outcome variables - i.e., for each person, the verb6 and perfo6 values will alternate. This is almost like repeated measures data, but instead of having multiple time points nested within person, we have multiple (two) variables to nest within person.

data_melt <- reshape::melt(data = data,
                      id.vars = c("id", "verb1", "perfo1"),
                      na.rm=FALSE)
head(data_melt)

id	verb1	perfo1	variable	value
1	24.42	19.84	verb6	55.64
2	12.44	5.90	verb6	37.81
3	32.43	27.64	verb6	50.18
4	22.69	33.16	verb6	44.72
5	28.23	27.64	verb6	70.95
6	16.06	8.45	verb6	39.94

#rename "variable" and "value" variables to "grade6_variable" and "grade6_outcome"
colnames(data_melt)[4:5] <- c("grade6_variable", "grade6_outcome")

#re-order for convenience
data_melt <- data_melt[order(data_melt$id, data_melt$grade6_variable), ]

#look at updated data set
head(data_melt)

	id	verb1	perfo1	grade6_variable	grade6_outcome
1	1	24.42	19.84	verb6	55.64
205	1	24.42	19.84	perfo6	44.19
2	2	12.44	5.90	verb6	37.81
206	2	12.44	5.90	perfo6	40.38
3	3	32.43	27.64	verb6	50.18
207	3	32.43	27.64	perfo6	77.72

Second, we need to create two dummy variables (each 0/1) that will be useful in our analyses to “turn on/off” a row (more on this later). We will create a new variable, “verb_on” that indicates which rows are for the “verb6” outcome, We will create a second variable, “perform_on” that indicates which rows are for the “perfo6” oucome.

data_melt$verb_on <- ifelse(data_melt$grade6_variable == "verb6", 1, 0)
data_melt$perform_on <- ifelse(data_melt$grade6_variable == "perfo6", 1, 0)

head(data_melt)

	id	verb1	perfo1	grade6_variable	grade6_outcome	verb_on	perform_on
1	1	24.42	19.84	verb6	55.64	1	0
205	1	24.42	19.84	perfo6	44.19	0	1
2	2	12.44	5.90	verb6	37.81	1	0
206	2	12.44	5.90	perfo6	40.38	0	1
3	3	32.43	27.64	verb6	50.18	1	0
207	3	32.43	27.64	perfo6	77.72	0	1

There are many alternative ways to prepare dyadic data. For example, see https://github.com/RandiLGarcia/2day-dyad-workshop/blob/master/Day%201/R%20Code/Day%201-Data%20Restructuring.Rmd). It will depend on how you choose to run your analysis (described further later).

6 APIM using `nlme` package

Now that we know a bit more about the data we are working with and have the data prepared in a single-outcome (double-entry) format, we can set up our APIM model. We’ll run this model in the nlme package.

Specifically, we’ll examine whether:

verbal ability in the first grade is predictive of verbal ability in the sixth grade (verbal “actor” effect),
performance ability in the first grade is predictive of performance ability in the sixth grade (performance “actor” effect),
verbal ability in the first grade is predictive of performance ability in the sixth grade (verbal “partner” effect), and
performance ability in the first grade is predictive of verbal ability in the sixth grade (performance “partner” effect).

Before running this full model, we will examine the empty model to determine how much variability there is within- and between-persons. Specifically,

\[Grade6Outcome_{i} = \beta_{0V}VerbOn_{it} + \beta_{0P}PerformOn_{it} + e_{Vi} +e_{Pi}\]

Empty model.

apim_empty <- gls(grade6_outcome ~  -1 +               #turnd OFF the intercept 
                                 verb_on +             #intercept for verbal scores
                                 perform_on,           #intercept for performance scores
               correlation = corSymm(form= ~1|id),
               weights = varIdent(form= ~1|verb_on),    #allows for different error terms for the two variables
               data = data_melt,
               na.action = na.exclude)

summary(apim_empty)

## Generalized least squares fit by REML
##   Model: grade6_outcome ~ -1 + verb_on + perform_on 
##   Data: data_melt 
##        AIC      BIC    logLik
##   3063.861 3083.893 -1526.931
## 
## Correlation Structure: General
##  Formula: ~1 | id 
##  Parameter estimate(s):
##  Correlation: 
##   1    
## 2 0.611
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | verb_on 
##  Parameter estimates:
##        1        0 
## 1.000000 1.170165 
## 
## Coefficients:
##               Value Std.Error  t-value p-value
## verb_on    43.74990 0.7467029 58.59077       0
## perform_on 50.93162 0.8737660 58.28977       0
## 
##  Correlation: 
##            verb_n
## perform_on 0.611 
## 
## Standardized residuals:
##         Min          Q1         Med          Q3         Max 
## -3.25897660 -0.72994511 -0.02108775  0.72002989  3.05118321 
## 
## Residual standard error: 10.66505 
## Degrees of freedom: 408 total; 406 residual

We examine the correlation of the verbal and performance error terms to determine the degree of non-independence in the data. We can see that Rho = 0.61, indicating the correlation across dyad members, children who have higher verbal ability tend to also have higher performance ability.

Other things to note in this output…

The results for “verb_on” indicate the average or expected verbal score at grade 6 is 43.75.
The results for “perform_on” indicate the average or expected performance score at grade 6 is 50.93.
Both of these expected values correspond to their respective means in the raw data (see descriptives above).
The grade 6 verbal and performance scores also each have estimated “standard error” values. The estimated standard error for Grade 6 verbal scores is 10.67 and the estimated standard error for Grade 6 performance scores is 12.48 = (1.17*10.67). Note that these estiamtes exactly match the respective standard devaitions in the raw data (see descriptives above).

For completeness of reppresenting the data we can get the means and variances of, and correlation between, the two predictors from the raw data.

Next, we are going to run our full APIM model using the two-intercept approach. Specifically,

\[Grade6Outcome_{i} = \beta_{0V}VerbOn_{it} + \beta_{1V}VerbOn_{it}*Verb1_{it} + \beta_{2V}VerbOn_{it}*Perform1_{it} + \beta_{0P}PerformOn_{it} + \beta_{1P}PerformOn_{it}*Perform1_{it} + \beta_{2P}PerformOn_{it}*Verb1_{it} + e_{Vi} +e_{Pi}\]

So when “verb_on” is equal to 0:

\[Grade6Outcome_{i} = \beta_{0P}PerformOn_{it} + \beta_{1P}PerformOn_{it}*Perform1_{it} + \beta_{2P}PerformOn_{it}*Verb1_{it} + e_{Vi} +e_{Pi}\]

and when “perform_on” is equal to 0:

\[Grade6Outcome_{it} = \beta_{0V}VerbOn_{it} + \beta_{1V}VerbOn_{it}*Verb1_{it} + \beta_{2V}VerbOn_{it}*Perform1_{it} + e_{Vi} +e_{Pi}\]

Full model.

apim_full <- gls(grade6_outcome ~  -1 +              #turns OFF intercept 
                                 verb_on +           #verbal intercept
                                 verb_on:verb1 +     #verbal "actor" effect
                                 verb_on:perfo1 +    #performance "partner" effect
                   
                                 perform_on +        #performance intercept
                                 perform_on:perfo1 + #performance "actor" effect
                                 perform_on:verb1,  #verbal "partner" effect
               correlation = corSymm(form=~1|id),
               weights = varIdent(form=~1|verb_on), #allows for different error terms for the two variables
               data = data_melt,
               na.action = na.exclude)

summary(apim_full)

## Generalized least squares fit by REML
##   Model: grade6_outcome ~ -1 + verb_on + verb_on:verb1 + verb_on:perfo1 +      perform_on + perform_on:perfo1 + perform_on:verb1 
##   Data: data_melt 
##        AIC      BIC    logLik
##   2879.029 2914.997 -1430.514
## 
## Correlation Structure: General
##  Formula: ~1 | id 
##  Parameter estimate(s):
##  Correlation: 
##   1    
## 2 0.312
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | verb_on 
##  Parameter estimates:
##        1        0 
## 1.000000 1.188537 
## 
## Coefficients:
##                       Value Std.Error   t-value p-value
## verb_on           19.869325 1.8634459 10.662679  0.0000
## perform_on        30.049124 2.2147736 13.567582  0.0000
## verb_on:verb1      0.809886 0.1150893  7.037017  0.0000
## verb_on:perfo1     0.446064 0.0800504  5.572292  0.0000
## perfo1:perform_on  0.962419 0.0951429 10.115507  0.0000
## verb1:perform_on   0.182846 0.1367879  1.336709  0.1821
## 
##  Correlation: 
##                   verb_n prfrm_ vrb_n:v1 vrb_n:p1 prf1:_
## perform_on         0.312                                
## verb_on:verb1     -0.738 -0.230                         
## verb_on:perfo1    -0.034 -0.011 -0.610                  
## perfo1:perform_on -0.011 -0.034 -0.190    0.312         
## verb1:perform_on  -0.230 -0.738  0.312   -0.190   -0.610
## 
## Standardized residuals:
##         Min          Q1         Med          Q3         Max 
## -2.55877996 -0.66034371 -0.02441614  0.62804539  3.69258907 
## 
## Residual standard error: 7.545255 
## Degrees of freedom: 408 total; 402 residual

Let’s interpret the results!

The expected verbal score at grade 6 = 19.87 and the expected performance score at grade 6 = 30.05 when verbal and performance scores at grade 1 are equal to zero.
Actor effects:
- The “actor effect” of verbal ability is 0.81, indicating that every 1.o point difference in children’s verbal ability at Grade 1 is associated with a 0.81 point difference in Grade 6. Stated differently, earlier differences in verbal ability “influence” later differences in verbal ability.
- The “actor effect” of performance ability is 0.96, indicating that every 1.0 point difference in children’s performance ability at Grade 1 is associated with a 0.96 point difference in Grade 6. Stated differently, earlier differences in performance ability “influence” later differences in performance ability.
Partner effects:
- The “partner effect” of performance ability on verbal ability is 0.45, indicating that every 1.0 point difference in children’s performance ability at Grade 1 is associated with a 0.45 point difference in verbal ability at Grade 6. Stated differently, earlier differences in performance ability “influence” later differences in verbal ability.
- The “partner effect” of verbal ability on performance ability is not significantly different than zero, indicating that earlier differences in performance ability at Grade 1 do not “influence” later differences in verbal ability at Grade 6.
Other things to note:
- Rho = 0.31, indicating that the correlation across “dyad members” or variables. Here, higher verbal ability scores are tend to be partnered with higher performance ability scores.
- The verbal and performance scores each have their own estimated error values. The estimated standard error for verbal scores is 7.55 and the estimated standard error for performance scores is 8.98 = (1.19*7.55).

7 Other resources

We’ve walked through one way of running an APIM model in R. There are, of course, lots of alternatives. Here are a few resources if you’d like to learn more about running an APIM model or dyadic data analyses in general.

David Kenny’s website, where he and his colleagues have created some useful shiny apps for running dyadic analyses in R: http://davidakenny.net/DyadR/DyadRweb.htm
Randi Garcia’s github page: https://github.com/RandiLGarcia
Another QuantDev tutorial for repeated measures dyadic data: https://quantdev.ssri.psu.edu/sites/qdev/files/ILD_Ch05_2017_DyadicMLM.html

Thanks for playing!

Introduction to Actor-Partner Interdependence Model (APIM)