Overview

Repeated measures data, often obtained from experience sampling or daily diary studies, require analytic methods that accommodate the inherent nesting in the data. One special case of repeated measures data are those from dyad (e.g., couples, parent/child). Dyadic data is structured such that repeated measures are nested within a person, and each person is nested within a dyad. The dyads are assumed to be independently sampled. Non-independece of the members of the dyad and across the repeated occasions are modeled explicitly. Multivariate multilevel modeling is one technique for effectively handling this type of data. Formal names for the dayad-specific model include Actor Partner Interdependence Model (APIM) (see Kenny, Kashy, & Cook, 2006).

In this tutorial, we will follow the example from Bolger and Laurenceau (2013) Chapter 8: Design and Analysis of Intensive Longitudinal Longitudinal Study of Distingiushable Dyads. Specifically, we will use the simulated dyadic process data set (p. 150). The data were simulated to represent 100 dual-career heterosexual couples where each partner provided diary reports twice daily over the course of 21 consecutive days. The first report is an end-of-workday report that includes (a) the number of stressors that occurred at work, and (b) work dissatisfaction. The data are complete and clean - with Bolger and Laurenceau’s goal being to “create a pedagologically uncomplicated data set.”

Note that we are using distinguishable dyads in this example. Distinguishability is typically determined conceptually, based upon a stable, differentiating characteristic (e.g., gender, age, role). Note that notions of “dyadic” can be generalized from persons to any reasonable combination of two variables (e.g., emotion and behavior). The next tutorial will illustrate application of the multivariate multilevel model for examination of intraindividual coupling.

Outline

  1. Introduction to the research questions, model, and data.
  2. Plotting the data.
  3. The multilevel model.
    • Males only.
    • Females only.
    • The dyad.
  4. Cautions.
  5. Conclusion.

Prelim - Loading libraries used in this script.

library(ggplot2)
library(nlme)
library(psych)

1. Introduction to the Research Questions, Model, and Data.

The Research Questions.

We are going to address:

  • Is the number of daily work stressors associated with end-of-day relationship satisfaction?
    • What is the extent of association for the typical male partner and for the typical female partner (fixed effects)?
    • Is there heterogeneity in the strength of the association across male partners in couples and across female partners in couples (random effects)?

Notice that, so far, the questions are stated separately for the two types of dyad members (males and females). The dyadic longitudinal model provides for another type of question - questions related to non-independence.

  • Are dyad members’ relationship satisfaction on any given day related, after accounting for other explanatory variables?

Here, these relations manifest as correlations/covariances between intercepts, slopes, and residuals.

The Modeling Enterprise.

The basic multilevel model is designed as a model with a univariate outcome. The ability to model multiple outcomes simultaneoulsy used to be a distinguishing feature of structural equation models (SEM). However, researches discovered that the multilevel model can be adapted for examination of multivariate outcomes quite easily. One simply has to “trick” the program into thinking that two (or more) variables are one variable.

The Data.

The data are organized as follows:

  • There should be N (number of individuals) x measurement occasions rows per dyad in the data set. In this case, we should have a data set with 4,200 rows (100 dyads x 2 persons x 21 occasions).

  • Columns:
    • Couple ID
    • Person ID (e.g., 1 = partner 1, 2 = partner 2)
    • Time (e.g., day within the daily diary study)
    • Centered version of time (optional)
    • Gender (or whatever feature is distinguishing the dyad; in this case, 0 = male and 1 = female)
    • An indicator variable for each partner (e.g., husband, wife) - dichotomous (0/1)
    • Outcome variable (in this case, “reldis”)
    • Predictor variable (in this case, “wrkstrs”)
    • Centered version of the predictor variable (“wrkstrsc”)
    • Trait component of the predictor variable (“wrkstrsb”)
    • State component of the predictor variable (“wrkstrsw”)

    • for a total of 12 columns in this data set

The “trick” is to stack the data so that there are two records for each repeated observation. The data file is twice as long, and we structure the model to “turn on” and “turn off” the double records to invoke parameter estimation for each variable. Bolger & Laurenceau’s example data are already prepared data. Let’s take a look.

Load data and needed libraries.

#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/B%26Ldyads.csv"


#read in the .csv file using the url() function
dyads <- read.csv(file=url(filepath),header=TRUE)

# Look at the variable names
names(dyads)
##  [1] "coupleid"  "personid"  "time"      "time7c"    "gender"   
##  [6] "female"    "male"      "reldis"    "wrkstrs"   "wrkstrsc" 
## [11] "wrkstrscb" "wrkstrscw"
# Re-ordering for easy viewing
dyads <- dyads[order(dyads$coupleid, dyads$time, dyads$personid),]

# Examine first few rows of the data set
head(dyads)
##    coupleid personid time    time7c gender female male reldis wrkstrs
## 1         1        1    0 -1.500000      1      1    0   3.03       3
## 22        1        2    0 -1.500000      0      0    1   4.46       3
## 2         1        1    1 -1.357143      1      1    0   4.62       3
## 23        1        2    1 -1.357143      0      0    1   4.88       3
## 3         1        1    2 -1.214286      1      1    0   2.85       3
## 24        1        2    2 -1.214286      0      0    1   4.58       3
##     wrkstrsc  wrkstrscb wrkstrscw
## 1  0.0095238 -0.3238095 0.3333333
## 22 0.0095238 -0.1333333 0.1428571
## 2  0.0095238 -0.3238095 0.3333333
## 23 0.0095238 -0.1333333 0.1428571
## 3  0.0095238 -0.3238095 0.3333333
## 24 0.0095238 -0.1333333 0.1428571

2. Plotting the Data.

Before we begin running our models, it is always a good idea to look at our data.

We start with examining the distribution of our outcome variable end-of-day relationship dissatisfaction, reldis. Let’s look at the histogram by gender.

ggplot(data = dyads, aes(x = reldis)) +
  geom_histogram(fill = "white", color = "black") + 
  labs(x = "Relationship Dissatisfaction") +
  facet_grid(. ~ gender) # creating a separate plot for each gender

The outcome variable for each gender looks approximately normally distributed, which is good news for when we run our models.

Next, let’s plot a few dyads’ reports of relationship dissatisfaction through the course of the study. Since the the predictor variable has already been split into time-varying (state) and time-invariant (trait) components, we use the time-varying predictor wrkstrscw as the “background” context variable.

ggplot(data = subset(dyads, coupleid <= 9), aes(x = time, group = personid), legend = FALSE) +
  geom_rect(mapping = aes(xmin = time-.5, xmax = time+.5, ymin = 0, ymax = 10, fill = wrkstrscw), alpha = 0.6) + # creating rectangles in the background of the plot colored by work stressors
  geom_point(aes(x = time, y = reldis, color = factor(gender)), shape = 17, size = 3) + # creating a different colored point for each gender
  geom_line(aes(x = time, y = reldis, color = factor(gender)), lty = 1, size=1) + # creating a different colored line for each gender
  xlab("Time") + 
  ylab("Relationship Dissatisfaction") + ylim(0, 10) +
  scale_x_continuous(breaks=seq(0, 20, by = 5)) + 
  facet_wrap( ~ coupleid) # creating a separate plot for each dyad