1 Overview

This is an illustrative example of using dynr.mi to implement multiple imputation with a vector autoregressive model.

References:

[1] Schermerhorn, A. C., Chow, S.-M., & Cummings, E. M. (2010). Developmental family processes and interparental conflict: Patterns of microlevel influences. Developmental Psychology, 46(4), 869-885.

[2] Ji, L., Chow, S.-M., Schermerhorn, A. C., Jacobson, N. C., & Cummings, E.M. (2018). Handling missing data in the modeling of intensive longitudinal data. Structural Equation Modeling: A Multidisciplinary Journal, 1-22.

[3] Li, Y., Ji, L., Oravecz, Z., Brick, T.R., Hunter, M.D., & Chow, S-M. (2019). dynr.mi: An R program for Multiple Imputation in Dynamic Modeling. International Journal of Computer, Electrical, Automation, Control and Information Engineering, 13(5), 302 - 311.

2 Preliminary

Loading the libraries used in this script.

library(dynr)
## Loading required package: ggplot2

Getting the simulated data.

data(VARsim)
VARsim$ca <- as.factor(VARsim$ca)

3 Data description

This example is inspired by a published study aimed at exploring the dynamics of inter-parental emotion states and behaviors at the end of conflicts, and associations with child emotions and behaviors during conflicts (Schermerhorn, Chow, & Cummings, 2010). To analyze over-time and lagged dependencies in the couples’ dynamics and possible associations with child-related variable, we considered the following model:

\[ \begin{bmatrix} wp_{i,t} \\ hp_{i,t} \end{bmatrix} = \begin{bmatrix} a & b\\ b_1 & a_1 \end{bmatrix} \begin{bmatrix} wp_{i,t-1}\\ hp_{i,t-1} \end{bmatrix}+ \begin{bmatrix} c & d\\ c_1 & d_1 \end{bmatrix} \begin{bmatrix} ca_{i,t}\\ cn_{i,t} \end{bmatrix}+ \begin{bmatrix} we_{i,t}\\ he_{i,t} \end{bmatrix}\]

\[\begin{bmatrix} we_{i,t}\\ he_{i,t} \end{bmatrix} \sim N( \begin{bmatrix} 0\\ 0 \end{bmatrix}, \begin{bmatrix} v_{wp} & c_{hw}\\ c_{hw} & v_{hp} \end{bmatrix} )\]

where \(wp\)and \(hp\) are two dependent variables representing the emotional ratings of wife and husband respectively; \(ca\) is a covariate representing an aggregate measure of the child’s agentic behavior (e.g. helping out, taking sides, and comforting the parents); \(cn\) is a covariate representing an aggregate measure of the child’s negative emotions and dysregulated behaviors (e.g., anger, sadness, and fear).

The simulated data were generated from the above model. x1 and x2 are two auxiliary varialbes used in the generation of missing data. The data contain 100 subjects (N = 100) and 100 time points (T = 100) for each subject. Missing data were under missing at random (MAR) condtion with around 30% missing rate.

The trajectories of emotional ratings of wife and husband for two families are shown below.

ggplot(data = subset(VARsim,ID>=5 & ID <=6)) +
  geom_line(aes(x = Time, y = wp, col = "wp")) + 
  geom_line(aes(x = Time, y = hp, col = "hp")) + 
  labs(col="Variable", 
       x="Time",
       y="wp and hp")+ 
  theme_classic() +
  #scale_colour_manual(values=c("pink", "purple")) +
  facet_wrap( ~ paste0("ID:",ID),dir = "v")

4 An example of usage of dynr.mi()

4.1 Declare the data with the dynr.data() function

rawdata <- dynr.data(VARsim, id="ID", time="Time", 
                     observed=c("wp","hp"),covariates=c("ca","cn"))

4.2 Define elements of the measurement model

meas <- prep.measurement(
  values.load=matrix(c(1,0,
                       0,1),ncol=2,byrow=T), 
  params.load=matrix(rep("fixed",4),ncol=2),
  state.names=c("wp","hp"),
  obs.names=c("wp","hp")
)

4.3 Define elements of the dynamic model

formula =list(wp ~ a*wp + b*hp + c*ca + d*cn,
       hp ~ a1*hp + b1*wp +c1*ca + d1*cn)

dynm  <- prep.formulaDynamics(formula=formula,
                              startval=c(a = .4, b = -.3, b1=-.2, a1=.3, 
                                         c = .3, c1=.3, d=-.5, d1=-.4
                              ), isContinuousTime=FALSE)

4.4 Define the initial conditions of the model

initial <- prep.initial(
  values.inistate=c(.15,.15),
  params.inistate=c('mu_wp', 'mu_hp'),
  values.inicov=matrix(c(1,0.1,
                         0.1,1),byrow=T,ncol=2),
  params.inicov=matrix(c("v_11","c_21",
                         "c_21","v_22"),byrow=T,ncol=2))

4.5 Define the covariance structures of the measurement noise covariance matrix and the dynamic noise covariance matrix

mdcov <- prep.noise(
  values.latent=matrix(c(1,0.3,
                         0.3,1),byrow=T,ncol=2), 
  params.latent=matrix(c("v_wp","c_hw",
                         "c_hw","v_hp"),byrow=T,ncol=2), 
  values.observed=diag(rep(0,2)), 
  params.observed=diag(c('fixed','fixed'),2))

4.6 Pass data and submodels to dynrModel object

model <- dynr.model(dynamics=dynm, measurement=meas,
                    noise=mdcov, initial=initial, data=rawdata,
                    outfile=paste("trial.c",sep=""))

4.7 Plot the Formula

plotFormula(dynrModel = model, ParameterAs = model$param.names, printDyn = TRUE, printMeas = TRUE)

4.8 Apply dynr.mi()

dynr.mi() function is used to implement multiple imputation and parameter estimation procedures. In this particular trial, imp.obs was set as FALSE to request a partial MI approach.

Two auxiliary variables (i.e., x1 and x2) were included in the imputation model given their associations with missing data. We also included lagged information for dependent variables up to the previous time point (i.e., lag of 1) in the imputation model.

The number of imputations was set as 5, which is the default number of imputations in the mice() function in MICE package. The number of MCMC iterations in each imputation was set as 10 for the diagnostic purpose.

result <- dynr.mi(model, which.aux=c("x1","x2"), 
  which.lag=c("wp","hp"), lag=1,
  which.lead=NULL, lead=0,
  m=5, iter=10, 
  imp.obs=FALSE, imp.exo=TRUE,
  diag = TRUE, Rhat=1.1,
  conf.level=0.95,
  verbose=FALSE, seed=12345)

5 Compare true parameters to estimated ones

truep <- c(a=0.4, b=-0.3, b1=-0.2, a1=0.3,
           c=0.3, c1=0.3, d=-0.5, d1=-0.4,
           v_wp = 1, c_hw = 0.3, v_hp = 1)
estp <- result$estimation.result[1:11, ]
data.frame(truep, estp)
truep Estimate Std..Error t.value ci.lower ci.upper Pr…t..
a 0.4 0.3860217 0.0105734 36.50869 0.3652956 0.4067477 0
b -0.3 -0.2967944 0.0118591 -25.02674 -0.3200406 -0.2735482 0
b1 -0.2 -0.2060782 0.0106906 -19.27651 -0.2270340 -0.1851224 0
a1 0.3 0.2872139 0.0118920 24.15176 0.2639030 0.3105247 0
c 0.3 0.2972987 0.0186882 15.90838 0.2606661 0.3339312 0
c1 0.3 0.2818793 0.0184995 15.23710 0.2456165 0.3181421 0
d -0.5 -0.4974293 0.0110962 -44.82897 -0.5191800 -0.4756786 0
d1 -0.4 -0.3989903 0.0110657 -36.05641 -0.4206813 -0.3772992 0
v_wp 1.0 1.0463369 0.0188458 55.52100 1.0093953 1.0832784 0
c_hw 0.3 0.3156262 0.0149351 21.13313 0.2863503 0.3449020 0
v_hp 1.0 1.0225720 0.0182833 55.92941 0.9867331 1.0584109 0

With dynr.mi(), we obtained good recovery of parameters of interest.

6 Convergence diagnostic check

#Rhat plot
result$Rhat.plot

From the Rhat plot, we can see that 6 iterations in each imputation would be sufficient to obtain stable imputation results.