Overview

This tutorial introduces a hybrid method that combines multivariate time-series methods and network analysis methods in order to model individuals as complex dynamic systems. This hybrid method is designed and tested to quantify the extent of interaction in a multivariate system, and applicable on experience sampling data.

This turorial corresponds to the following paper (under review): Yang, X., Ram, N., Gest, S. D., Lydon, D. M., Conroy, D. E., Pincus, A. L., Molenaar, P. M. C.. Socioemotional Dynamics of Emotion Regulation and Depressive Symptoms: A Person-Specific Network Approach.

Outline

  1. Functions to fit uSEM model
  2. Prepare simulation of time-series data
  3. Apply multivariate time-series methods (unified Structural Equation Modeling, uSEM) on the simulated data, to construct person-specific networks
  4. Apply network analysis (also called impulse reponse analysis) on the person-specific network
  5. Calculate an innovative network metric (recovery time)

Introduction

Lifespan developmental theories view persons as complex dynamic systems, with feelings, thoughts and actions that are interconnected and change over time. In order to understand this aspect of individual’s psychological functioning, we need repeatedly measured multivariate psychological data, as well as methods to quantify the interrelations among variables and quantify the property of the system. Based upon the property of the system, we can examine the long-term change of the interrelations by applying network analysis methods.

In the paper aforementioned, we forward an approach that merges multivariate time-series methods , network analysis methods, and measurement-burst designs in order to describe the interplay among many aspects of functioning and the change in this interplay over time.

In the paper, we used experience sampling data collected from 150 indviduals over the span of a year, average measurement was 145.7 per burst.

Here, I used a simulated dataset to demonstrate the method in this tutorial, keeping the dimension of the system as the same (p = 13) and number of measurement similar (T = 150).

Step 0. Functions to fit uSEM model

First, import the libraries required for this tutorial.

library(lavaan)
library(ggplot2)
library(reshape2)
library(qgraph)
library(MASS)
library(matrixcalc)

Here are the functions to fit a uSEM model.

Function1: find.path.to.free.up

This is a recursive function that find the path that satisfies two conditions:

  1. either direction was not included in the model before, to avoid bidirectional paths

  2. has the largest modification indices of all who satisfy 1).

return value: the path to free up in the format of one row of modification indices, which is an output of function “modindices” in lavaan

find.path.to.free.up <- function(beta.mi, model.syntax, var.number){
  # this is the path with largest mi
  largest.path <- beta.mi[which(beta.mi$mi == max(beta.mi$mi)), ] 
  
  if (length(largest.path[,1]) > 1)
  {
    print("two paths with equal improvement to model!")
    #If there are two largest and equal improvement of modification indices, we always pick the first one in order, so that we produce stable model estimates every time. 
    largest.path <- largest.path[1,]
  }
  
  path.to.add <- paste(largest.path$lhs, "~", largest.path$rhs, "\n ",sep = "")
  
  # to test if either direction is already included in the following if statement
  path.to.add.reverse <- paste(largest.path$rhs, "~", largest.path$lhs, "\n ",sep = "")
  
  lhs.number <- as.numeric(substr(largest.path$lhs,4,nchar(largest.path$lhs)))
  rhs.number <- as.numeric(substr(largest.path$rhs,4,nchar(largest.path$rhs)))
  
  # this path was already in the model, either direction; then the mi of that path is fixed to 0
  if (!identical(grep(path.to.add.reverse, model.syntax), integer(0)) | lhs.number <= var.number && rhs.number > var.number)
  {
    # make the largest value in the modification indices equal to zero, so that it can search other values
    beta.mi[which(beta.mi$mi == max(beta.mi$mi)), ]$mi <- 0 
    
    # return (find.path.to.free.up(beta.mi, model.syntax, var.number, lag.order))
    return (find.path.to.free.up(beta.mi, model.syntax, var.number))
  }
  else
  {
    return(largest.path)
  }
}

Function2: uSEM

This is how the uSEM model is specified and implemented using package “lavaan” (https://cran.r-project.org/web/packages/lavaan/index.html) and we also adapted code from package “GIMME” (https://cran.r-project.org/web/packages/gimme/index.html), keeping the iterative procedure of model fitting and adding features to ensure the model result is interpretable in the context of our study.

Here are some important parameters in this piece of code:

var.number: number of variables you have per measurement. In our case, it was 13.

lag.order: number of lag in your model. We only tested lag-1, so lag.order = 1. Normally, lag-1 is the most common fitted order, and one can try to increase to higher orders. I have not personally tried second order in this model, and it may or may not yield a good fit.

return value: model fit object output from function “lavaan”

uSEM <- function(var.number, data, lag.order, verbose = FALSE) {
  
  regression.syntax <- ""
  measurement.syntax <- ""
  variance.syntax <- ""
  
  chisq.list <- NULL
  path.list <- NULL
  
  
  # By specification of the uSEM model, the upper half of beta matrix is fixed to 0, because eta(t-1) is not the focus of esitmation here.
  for (i in 1:(lag.order*var.number))
  {
    for (j in 1:((lag.order+1)*var.number))
    {
      regression.syntax <- paste(regression.syntax, 
                                 "eta", i, "~0*","eta", j, "\n ",sep = "") 
    }
  }
  # lower half of beta matrix. for phi matrix (lag-1 temporal relations), open up auto-regression (diagnal line of the phi matrix) in the initial iteration
  for (i in (1+(lag.order*var.number)):((lag.order+1)*var.number))
  {
    regression.syntax <- paste(regression.syntax,
                               "eta", i, "~eta",i-var.number, "\n ",sep = "")
  }
  
  # summary for beta matrix set up:
  # so in uSEM, the beta matrix is fixed at 0 at the upper half (for lag-1),
  # and the diagonal line of southwest block (phi matrix) is freed up
  # everything else is left to be fitted in the automatic search process later
  
  # to write syntax for measurement syntax, fix factor loading to 1 and measurement error to 0
  for (i in 1:(var.number*(lag.order+1)))
  {
    measurement.syntax <- paste(measurement.syntax, 
                                "eta", i, "=~1 * y", i, "\n ",sep = "")
  }
  
  # to write syntax for variance.syntax, part 1, upper left (northwest block) being symmetric
  for (i in 1:(lag.order*var.number)){
    for (j in i:(lag.order*var.number))
    {
      variance.syntax <- paste(variance.syntax, "eta", i, "~~", "eta", j,  "\n ",sep = "")
    }
  }
  # to write syntax for variance.syntax, part 2, lower right (southeast block) being diagonal
  for (i in (lag.order * var.number + 1) : ((lag.order+1) * var.number))
  {
    variance.syntax <- paste(variance.syntax, "eta", i, "~~", "eta", i,  "\n ",sep = "")
  }
  
  # assemble regression syntex, measurement syntax and variance syntax
  # in LISREL, these 3 parts are all you need to specify as well
  model.syntax <- paste(regression.syntax, measurement.syntax, variance.syntax, sep = "")
  
  # iteration of searching modification indices
  fit.criteria <- 1 # initial value
  iteration <- 1
  
  # model syntax explanation see 
  # https://www.rdocumentation.org/packages/lavaan/versions/0.5-14/topics/lavaan
  fit <- tryCatch(lavaan(model.syntax,
                         data=data,
                         model.type      = "sem",
                         # missing         = "fiml",
                         estimator       = "ml"
  ),
  error=function(e) e)
  
  error.message <- tryCatch(inspect(fit, "fit"),error=function(e) e)
  
  if (any(grepl("error", class(error.message)) == TRUE) )
  {
    return (fit)
  }
  else
  {
    mi <- modindices(fit)
    fit.meausres <- fitMeasures(fit)
    fit.criteria <- fit.meausres["rmsea"] 
    model.fit <- fit # global variable of model fit 
    
    
    # update chisq list and path list
    chisq.list <- append(chisq.list, fit.meausres["chisq"])
    chisq.delta <- fit.meausres["chisq"]
    
    while(chisq.delta > 3.841  ) # 1 degree freedom chisq p < 0.05
    {
      # Step 1. print some description of the iteration, this is for debugging purpose
      if (verbose == TRUE)        {
        print (paste("iteration", iteration))
      }
      
      # Step 2. get the largest value modification indices of covariance
      beta.mi <- mi[which(mi$op == "~"),]
      beta.mi$lhs.number <-  as.numeric(gsub("eta","", beta.mi$lhs))
      beta.mi <- beta.mi[beta.mi$lhs.number > lag.order * var.number,]
      beta.mi.ordered <-  beta.mi[order(-beta.mi$mi),]
      # print(beta.mi.ordered[1:5,])
      if (verbose == TRUE)        {
        print(beta.mi.ordered[1:5,])
      }
      
      
      largest.path <- find.path.to.free.up(beta.mi = beta.mi, 
                                           model.syntax = model.syntax, 
                                           var.number = var.number)
      path.to.add<- paste(largest.path$lhs, "~", largest.path$rhs, "\n ",sep = "")
      # print(path.to.add)
      
      if (verbose == TRUE)        {
        print(path.to.add)
      }
      
      lhs.number <- as.numeric(substr(largest.path$lhs,4,nchar(largest.path$lhs)))
      rhs.number <- as.numeric(substr(largest.path$rhs,4,nchar(largest.path$rhs)))
      
      model.syntax <- paste(model.syntax, path.to.add) 
      
      fit <- tryCatch(lavaan::lavaan(model.syntax,
                                     data=data,
                                     model.type      = "sem",
                                     # missing         = "fiml",
                                     estimator       = "ml"
      ),
      error=function(e) e)
      
      
      check.npd            <- any(grepl("error", class(fit)) == TRUE) #  non positive definite
      check.not.identified <- sum(lavInspect(fit,"se")$beta, na.rm = TRUE) == 0 
      singular            <- tryCatch(modindices(fit), error = function(e) e)
      check.singular      <- any(grepl("singular", singular) == TRUE)
      check.error         <- any(grepl("error", class(singular)) == TRUE)
      converge            <- lavInspect(fit, "converged")
      
      if (check.npd | check.not.identified | check.singular | check.error | !converge)
      {
        return (fit)
      }    
      else
      {
        mi <- modindices(fit)
        chisq.model <- fitMeasures(fit)["chisq"] 
        
        # update chisq list and path list
        chisq.list <- append(chisq.list, fitMeasures(fit)["chisq"])
        path.list <- append(path.list, path.to.add)
        chisq.delta <- chisq.list[iteration] - chisq.list[iteration+1]
        
        # update fit criteria, model fit object, model syntax, modification indices
        fit.criteria <- fitMeasures(fit)["rmsea"] 
        # model.fit <- fit
        
        # print(fitMeasures(fit)["chisq"])
        iteration <- iteration + 1
        
        if (verbose == TRUE)
        {
          print(summary(fit))
          print(fitMeasures(fit)["chisq"])
        }
      }
    }
    return (fit)
  }
}

Function3: parse.beta.est

This function is to parse the lavaan output to a list format for beta estimates.

parse.beta.est <- function(var.number, model.fit, lag.order)
{
  all.estimates <- parameterEstimates(model.fit)
  beta.est <- all.estimates[which(all.estimates$op == "~" ),]
  beta.est <- beta.est[which(!is.na(beta.est$pvalue)),]
  return(beta.est)
}

Function4: parse.beta

This function is to parse the lavaan output to a matrix format for beta estimates.

parse.beta <- function(var.number, model.fit, lag.order) # pass the modelfit (lavaan)
{
  all.estimates <- parameterEstimates(model.fit)
  beta.est <- all.estimates[which(all.estimates$op == "~" ),]
  beta.est <- beta.est[which(!is.na(beta.est$pvalue)),]

  # se, beta.est$se to get distribution of beta, can write another function to parse the beta estimates

  beta.matrix <- matrix(rep(0, (var.number*(lag.order + 1))^2), nrow = (var.number*(lag.order+1)), ncol = (var.number*(lag.order+1)), byrow = TRUE)
  lhs.number <- as.numeric(gsub("eta","", beta.est$lhs))
  rhs.number <- as.numeric(gsub("eta","", beta.est$rhs))

  for(i in 1:length(beta.est[,1]))
  {
    # if (beta.est$pvalue[i] <= 0.05)
    # {
    beta.matrix[lhs.number[i], rhs.number[i]] <- beta.est$est[i]

    # }
  }
  return (beta.matrix)
}

Function5: parse.psi

This function is to parse the lavaan output to a matrix format for Psi matrix

parse.psi <- function(var.number, model.fit, lag.order) # pass the modelfit (lavaan)
{
  all.estimates <- parameterEstimates(model.fit)
  psi.est <- all.estimates[which(all.estimates$op == "~~" ),]
  psi.est <- psi.est[which(!is.na(psi.est$pvalue)),]
  
  psi.matrix <- matrix(rep(0, (var.number*(lag.order + 1))^2), nrow = (var.number*(lag.order+1)), ncol = (var.number*(lag.order+1)), byrow = TRUE)
  
  lhs.number <- as.numeric(gsub("eta","", psi.est$lhs))
  rhs.number <- as.numeric(gsub("eta","", psi.est$rhs))
  
  for(i in 1:length(psi.est[,1]))
  {
    psi.matrix[lhs.number[i], rhs.number[i]] <- psi.est$est[i]
    psi.matrix[ rhs.number[i], lhs.number[i]] <- psi.est$est[i]
  }
  return (psi.matrix)
}

Function 6: model.summary

To check model fit and return a list as follows:

beta = beta.matrix,

psi = psi.matrix,

fit.stat = fit.stat,

beta.edge.version = beta.estimates,

psi.edge.version = psi.estimates

model.summary <- function(model.fit) # pass the modelfit (lavaan)
{
  
  # check if model fit successfully
  check.npd            <- any(grepl("error", class(model.fit)) == TRUE) # non positive definite
  check.not.identified <- sum(lavInspect(model.fit,"se")$beta, na.rm = TRUE) == 0 
  singular            <- tryCatch(modindices(model.fit), error = function(e) e)
  check.singular      <- any(grepl("singular", singular) == TRUE)
  check.error         <- any(grepl("error", class(singular)) == TRUE)
  converge            <- lavInspect(model.fit, "converged")
  
  if (check.npd | check.not.identified | check.singular | check.error | !converge)
  {
    if (!converge){
      print(paste(id, burst, "not converged!"))
    } else{
      print(paste(id, burst, "model did not fit!"))
    }
  }else{
    # no trimming, because some cases trimming didn't work
    mi <- modindices(model.fit)
    # mi[mi$op == "~",] # why does MI include the upper right block of beta????
    fit.meausres <- fitMeasures(model.fit)
    
    # summary(fit)
    fit.stat <- fitMeasures(model.fit)
    estimates <- parameterEstimates(model.fit)
    beta.estimates <- estimates[estimates$op == "~", ]
    psi.estimates <- estimates[estimates$op == "~~", ]
    
    beta.matrix <- parse.beta(var.number,model.fit, lag.order)
    # beta.matrix <- beta.matrix[(var.number+1):(var.number*2), 1:(var.number*2)]
    
    psi.matrix <- parse.psi(var.number,model.fit, lag.order)
    result <- list(beta = beta.matrix, 
                   psi = psi.matrix, 
                   fit.stat = fit.stat,
                   beta.edge.version = beta.estimates,
                   psi.edge.version = psi.estimates)
    return(result)
    }
}

Function 7: W2E

To convert adjacency matrix representation to edge representation

W2E <-function(x) cbind(which(x!=0,arr.ind=TRUE),x[x!=0]) 

Step 1: Prepare simulation of time-series data

Since uSEM estimated contemporaneous relations as well as lag-1 relations, we transformed the contemporaneous relations back to a lag-1 format, to facilitate simulation of the time-series. Details of mathematical deduction is shown in Gates et al. (2010).

We simulate a time-series data based on the covariance matrix among variables. we use mvnorm fuction (in MASS package) to simulate multivariate time-series. The covariance matrix is as follows:

\(\Sigma = \Lambda (I-\ B \ )^{-1}\Psi (I- \ B \ )^{-T} \Lambda ^{T} + \Theta\)

where the \(\Lambda\) matrix is identity matrix, and \(\Theta\) matrix is zero matrix. Therefore, we only need to specify the \(\ B\) matrix and the \(\Psi\) matrix.

Number of variables, number of measurments (time) are variables names are as follows:

Note here since the beta matrix is a 26 by 26 matrix. The upper half of beta matrix is always zero, since lag-1 variables are not predicting themselves in this model; and lower half is a combination of phi matrix on the left and A matrix on the right.

n.obs <- 150
p <- 13

# upper block of beta are zeros
true.beta <- c(rep(0,p * (2*p)), 
0.1652, 0,  0,  0,  0,  0,  -0.1447,    0,  0,  0,  0,  0,  0,  0,  0.2083, 0.6099, -0.1015,    0,  0.0898, 0.1359, 0,  0,  0,  0,  0,  0,
0,  0.026,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  -0.3328,    0,  0,  0,  0,  0,  0.1434, 0,  0,  0,
0,  0,  0.3095, 0,  0,  0,  0,  0,  0,  0.159,  0,  0,  0,  0,  0,  0,  0,  0.3552, 0,  0,  0,  0,  0,  0,  0,  0,
0,  0,  0.21,   0.0326, 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0.1373, 0.2069, 0,  0,  0,  0,  0,  0,
0,  0,  0,  0,  0.2063, 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0.311,  0,  0,  0,  0,  0,  0,
0,  0,  0,  0,  0,  0.2261, -0.1514,    0,  0,  0,  0,  0,  0,  0,  0.231,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0.3722,
0,  0,  0.1235, 0,  0,  0,  0.0808, 0,  0,  0,  0,  0,  0,  0,  -0.17,  0,  0,  0,  0,  0,  0.3498, 0,  0,  0,  0,  0.3002,
0,  0,  0,  0,  0,  0,  0,  0.2278, 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  -0.1216,    0.1862, 0.4109,
0,  0,  0,  0,  0,  0,  0,  0,  -0.0593,    0,  0,  0,  0,  0,  0,  0,  -0.139, 0.1734, 0,  0,  0,  0,  -0.2961,    0.6339, 0,  -0.2353,
0,  0,  0,  0,  0,  0,  0,  -0.0858,    0,  -0.0638,    0,  0,  0,  0,  0,  -0.1801,    0.131,  0,  0,  0,  0,  0,  0,  0.7578, 0,  0,
0,  0,  0,  0,  0,  0,  0,  0,  0.1207, 0,  0.153,  0,  0,  0,  0,  0,  0.1371, 0,  0,  0,  0,  0,  0,  0,  -0.1901,    0,
0,  0,  0,  0,  0,  0,  0,  0.1438, 0,  0.1501, 0,  0.3037, 0,  0.1576, 0,  0,  0,  0.2412, 0,  0,  0,  0,  -0.2774,    0,  0,  0,
0,  0,  0,  0,  0,  0,  0,  0,  0,  0.1305, 0,  -0.1441,    0.3868, 0.1098, 0,  0,  0,  0,  0,  0,  0,  0,  -0.3275,    0,  0.3989, 0
)

true.beta <- matrix(true.beta, nrow = p * 2, ncol = p * 2, byrow = TRUE)
  
# process noise vector
psi <- c( 0.46, 0.88, 0.71,
          0.89, 0.76, 0.77,
          0.47, 0.47, 0.72,
          0.33, 0.83, 0.47,
          0.42)
psi <- matrix(psi, nrow = p , ncol = 1, byrow = TRUE, dimnames = NULL)

#simulated process noise for n observations
psi.sim <- matrix(rep(0,p * n.obs), nrow = n.obs , ncol = p, byrow = TRUE, dimnames = NULL)
for (dim in 1:p)
{
  psi.sim[, dim] <- rnorm(n.obs,0, sqrt(psi[dim]))
}


# phi is the lag-1 matrix, alpha is the contemporaneous matrix
phi <- matrix(true.beta[(p+1):(2*p),1:p], nrow = p, ncol = p, byrow = F)
alpha <- matrix(true.beta[(p+1):(2*p),(p+1):(2*p)], nrow = p, ncol = p, byrow = F)

i <- diag(1,p * 2,p * 2)

# simulate time-series based on beta and psi
time.series <- matrix(rep(0, p * n.obs), nrow = n.obs, ncol = p)
time.series[1,] <-solve(diag(1,p, p) - alpha) %*% psi.sim[1, ]

row <- 2
for (row in 2:n.obs)
{
   time.series[row,] <- solve(diag(1,p, p) - alpha) %*% phi %*% time.series[row-1,] +
    solve(diag(1,p, p) - alpha) %*% psi.sim[row, ]
}

Note here, in the simulation, an important step here is to convert the contemporaneous form into the traditional VAR expression (only lag-1 form), so that the simulation can be computed step by step.

\(\eta(t) = (I-\ A \ )^{-1}\Phi\eta(t-1) + (I-\ A \ )^{-1}\zeta(t)\)

This conversion is similarly done in the later section, where impulse response analysis was computed.

When experience sampling data is fed to the model, the lag-1 version is to be added to the time series as follows.

The “time.series” dataset is in long format, where each row is one measurement of p variables.

The lag-1 version is “time.series[1:(nrow(time.series) -1), ]”.

The lag-0 version is “time.series[2:nrow(time.series), ])”.

time.series <- cbind (time.series[1:(nrow(time.series) -1), ], 
                      time.series[2:nrow(time.series), ])
time.series <- data.frame(time.series)
names(time.series) <- c(
  "other.communion.lag",
  "other.agency.lag",
  "self.communion.lag",
  "self.agency.lag",
  "benefit.self.lag",
  "benefit.other.lag",
  "control.lag",
  "esteem.lag",
  "ashamed.lag",
  "anger.lag",
  "sad.lag",
  "happy.lag",
  "proud.lag",
  "other.communion",
  "other.agency",
  "self.communion",
  "self.agency",
  "benefit.self",
  "benefit.other",
  "control",
  "esteem",
  "ashamed",
  "anger",
  "sad",
  "happy",
  "proud")

Then we use “melt” function to change the time-series to “long” format. Then we can plot the simulated data by variable name, using “facet_wrap” function.

time.series.lag0 <- time.series[, (p+1):(2*p)]
time.series.lag0$interaction <- seq(1,length(time.series.lag0[,1]),1)

time.series.long <- melt(time.series.lag0, id="interaction")  

ggplot(data=time.series.long,
       aes(x=interaction, y=value, colour=variable)) +
  geom_line() + 
  facet_wrap( ~ variable  , nrow = p * 2) +
  scale_y_continuous(breaks = seq(0, 100, by = 50)) + 
  theme(
    strip.background = element_blank(),
    strip.text.x = element_blank(),
    strip.text.y = element_blank(),
    axis.text.y = element_text(), 
    axis.title.y = element_blank()
    )

We can also plot the true.beta matrix in the network form, which is also called a graph (Epskamp et al., 2012). Because the beta matrix is equivalent to the transpose of the adjacency matrix that represent network direction, we took the transpose of true.beta matrix here.

econtemporaneous <- W2E(t(true.beta[14:26,14:26]))
elagged <- W2E(t(true.beta[14:26,1:13]))

plot.names <- c(  "other.communion",
                      "other.agency",
                      "self.communion",
                      "self.agency",
                      "benefit.self",
                      "benefit.other",
                      "control",
                      "esteem",
                      "ashamed",
                      "anger",
                      "sad",
                      "happy",
                      "proud")

# indicate which edges are edges that represent lagged relationships, so that in the graph it will be labeled as dotted lines
isLagged               <- c(rep(TRUE, nrow(elagged)), rep(FALSE, nrow(econtemporaneous))) 
curve                  <- rep(1, length(isLagged)) # all edges will be curve

qgraph(rbind(elagged,econtemporaneous),
        layout              = "circle",
                      lty                 = ifelse(isLagged,2, 1),
                      edge.labels         = F,
                      curve               = curve,
                      fade                = FALSE,
                      posCol              = "green",
                      negCol              = "red",
                      labels              = plot.names,
                      label.cex           = 1,
                      label.norm          = "O",
                      label.scale         = FALSE,
                      edge.label.cex      = 1.5,
                      edge.label.position = .3)

Step 2. Apply multivariate time-series method (unified Structural Equation Modeling, uSEM) on the simulated data.

We use the functions defined previously in step 0 to fit the model and parse parameters. The function “uSEM” is the function where the interative procedure of fitting a uSEM model.

The model fit summary will give beta and psi estimates, which are essential information to conduct network analysis later. The model fit summary also gives the fit statistics.

To reiterate the parameters of the “uSEM” function:

var.number: number of variables, 13 in this case.

scaled.data: multivariate time-series data in the long format (every row is a measurement, and every column is a variable). Note: scaling is not absolutely necessary, but it can be helpful when some variables have small variances.

lag.order: number of lags in the uSEM model. Default is 1.

# set up parameters for uSEM
var.number <- 13
lag.order <- 1
data <- time.series # use the generated time-series directly

names.data <- lapply(c(1:((lag.order + 1)*var.number)), function(x)(paste("y",x, sep = ""))) #
names(data) <- names.data
scaled.data <- data # no need to scale the simulated data, but actual data was scaled in manuscript

# call the uSEM function in "usem_tutorial.R"
model.fit <- uSEM(var.number, scaled.data, lag.order)

Now the uSEM model result is in the object “model.fit”, including beta matrix, psi matrix, and fit statistics. Next, we need to parse these information out.

“model.summary” function will return an object that contains beta, psi and fit statistics, shown in the following code chunks.

We can also parse the temporal relations stored in the beta matrix and innovation of the system stored in the psi matrix.

mdl <- model.summary(model.fit)

beta.matrix <- mdl$beta
psi.matrix <- mdl$psi

Model fit should obey a 3 out of 4 rule, which means at least 3 out of the 4 rules should satisfy, and the four rules are CFI and NNFI should be at least .95, and RMSEA and SRMR should be no greater than .08.

mdl$fit.stat["cfi"]
## cfi 
##   1
mdl$fit.stat["nnfi"]
##     nnfi 
## 1.015295
mdl$fit.stat["rmsea"]
## rmsea 
##     0
mdl$fit.stat["srmr"]
##       srmr 
## 0.04413429

Step 3. Apply impulse reponse analysis on person-specific network

Conceptually, impulse response analysis is to perturb the system (the whole psychological network we estimated) by one node (or multiple nodes, but since this model is linear, the multiple nodes scenario is additive of the single node impulse. Thus one can conduct impulse response analysis in either way and get equivalent result). After the system receives such perturbation, or impulse, the impulse will reverbate through the network based on the statistical relationship.

For example, one estimated sadness can be predicted by -0.35 happiness + 0.2 anger - 0.6 self-esteem, if happiness or anger/self-esteem received a perturbation, then one can deduct what is the value of sadness for the next step, and we can also deduct value of other nodes in the same way. So one can have a time profile per node after the perturbation. Time profile is the trajectory of a variable after the system received the perturbation.

This method will give an intuitive view of the dynamic behavior of a network, in complimentary of the static newtork configuration (e.g. density, centrality, clustering, etc).

3.1 Estimated beta

n.obs <- 150 # impuulse response analysis length/time steps
p <- 13 # dimension of the system

beta <- beta.matrix# use estimated matrix to see recovery time
# convert to A matrix and Phi matrix
alpha <- as.matrix(beta[(p+1):(2*p), (p+1):(2*p)], nrow = p, bncol = p, byrow = TRUE)
phi <- as.matrix(beta[(p+1):(2*p), 1:p], nrow = p, bncol = p, byrow = TRUE)
psi <- matrix(psi, nrow = p , ncol = 1, byrow = TRUE)
identity.matrix <- diag(1, p, p)
# by algebra, coefficient matrix is the inverse of (I -A) * Phi
coeff.matrix.eta.lag <- solve(identity.matrix-alpha) %*% phi

# to record the half time and tenth time, simulated by impulse response analysis
recovery.time <- data.frame(matrix(rep(0, 3), nrow = 1, ncol = 3))
names(recovery.time) <- c("id", "burst", "recovery.time")
# to store the impulse response by time steps   
sadness.time.profile <- data.frame(matrix(rep(0, 1 * n.obs), nrow = 1, ncol = n.obs))

3.2 Give impulse to designated node (sadness in this tutorial), anc compute time-series based on network configuration

We used the wording “shock”, “impulse” and “perturbation” interchangably here.

 # specify a pertrubation
node.to.shock <- 11 # here you change it once where to shock, the order is follows, so sad is 11

shock <- rep(0,p)
shock[node.to.shock] <- 1
shock <- as.matrix(shock, nrow = p, ncol = 1)

time.series.shock <- matrix(rep(0, p * n.obs), nrow = p, ncol = n.obs)

start.point <- 1 # at least 2, to give timepoint 1 as initial value
for(col in 1:start.point)
{
   time.series.shock[,col] <- solve(identity.matrix-alpha) %*% shock 
}

# noise come as psi matrix directs
for (time in (start.point+1):n.obs)
{
   # ONLY shock at T =1 
    time.series.shock[,time] <- coeff.matrix.eta.lag %*%
       time.series.shock[,time-1]   
    
}

column.names <- c(
                     "other.communion",
                      "other.agency",
                      "self.communion",
                      "self.agency",
                      "benefit.self",
                      "benefit.other",
                      "control",
                      "esteem",
                      "ashamed",
                      "anger",
                      "sad",
                      "happy",
                      "proud")


time.series.shock <- data.frame(t(time.series.shock))
names(time.series.shock) <- column.names
time.series.shock$id <- 1:n.obs

3.3 Plot the time-series generated by impulse reponse analysis

time.series.shock.melt <- melt(time.series.shock, id.vars ="id")

ggplot(data = time.series.shock.melt,
       aes( x =id, y = value, group = variable, color = variable))+
  geom_line()+
  facet_wrap(~variable) +
  theme(legend.position = "none") +
  xlim(c(0,20))
## Warning: Removed 1690 rows containing missing values (geom_path).

we could see the impulse (sometimes called perturbation or distrubance) given to sadness is eliciting change in emotions (e.g. anger, pride, happiness), as well as aspects from social interaction (e.g. self-agency, other-communion). These curves, also called time profiles (Lutkepohl, 2005), demonstrate perturbation to sadness has various dynamic impact on other nodes in the network, and each node has its own shape of profile. In terms of substantive meaning, this also illustrats at least some of emotion regulation mechasnisms were embedded in this socioemotional network. For example, whenever this person gets sad (a perturbation of higher value in sadness node), he will also becomes more friendly when talking to others (self-communion peaked after the initial shock).

Step 4. Calculate network metric (recovery time)

Recovery time described the duration for the socioemotional network returns back to normal (equilibrium) after perturbation. It is sometimes complicated to derive an analytical form of recovery time, so we used a numerical method to search backward in the simulated impulse response analysis. Specifically, we searched backward in the time profile, and looked for the first time when the absoluate value of sadness exceeded .01, which is the threshold we considered close enough to the asymptote. So the recovery time is the duration from the first timestep to the timestep when time profile returns within a range of +/-0.01.

In the paper, we also included analysis of this recovery time of sadness and depressive symptoms, attempting to link the micro-time dynamic (sadness regulation) to macro-time development (of depressive symptoms). We found recovery time is a significant predictor of levels of depressive symptoms, implying longer recovery time is associated with higher depressive symptom level.

for (row in n.obs:1)
{
  if ((time.series.shock[row, node.to.shock] >= 0.01 ||time.series.shock[row, node.to.shock] <= -0.01)
      &&  recovery.time$recovery.time==0)
  {
    recovery.time$recovery.time <- row + 1
    print("recovery time is:")
    print(recovery.time$recovery.time)
  }
}
## [1] "recovery time is:"
## [1] 5

References

  1. Gates, K. M., Molenaar, P. C. M., Hillary, F. G., Ram, N., & Rovine, M. J. (2010). Automatic search for fMRI connectivity mapping: An alternative to Granger causality testing using formal equivalences among SEM path modeling, VAR, and unified SEM. NeuroImage, 50(3), 1118-1125.

  2. Lütkepohl, H. (2005). New introduction to multiple time-series analysis. Berlin: Springer.

  3. Epskamp, S., Cramer, A. O. J., Waldorp, L.J., Schmittmann, V.D., & Borsboom, D. (2012). qgraph: Network Visualizations of Relationships in Psychometric Data. Journal of Statistical Software, 48(4), 1-18. URL http://www.jstatsoft.org/v48/i04/. R package version 1.3.3. http://sachaepskamp.com/qgraph. doi: 10.18637/jss.v048.i04.