1 Overview
2 Specify Growth Model
- 2.1 Write out Mathematical model
3 Generate Intraindividual Change (prototype)
4 Generate Interindividual Differences in Intraindividual Change
- 4.1 Set parameters for simulation
- 4.2 Generate individual-level coefficients
5 Generate Intraindividual Change Trajectories for Multiple Person Sample

1 Overview

We have forwarded a proposal for Computational Developmental Science (Ram et al., 2015). The basic idea is that theory can be used to generate some formal propositions (equations). Those propositions can then be used in to simualate/generate data. The implications of the theory can then be explored in the simulated world through in silica experiments (e.g., changing parameter values). Theory can be rejected and revised appropriately.

This script is a very basic set-up for simualting growth model data. We specifically generate formal propositions for (1) intraindividual change and (2) interindividual differences in specific features of intraindividual change.

We set up our example using a (1) linear model for intraindividual change and (2) interindividual differences in two features of intraindividual change, intercept and linear slope. The script is meant to be adjusted for examination of other theories (i.e., nonlinear intraindividual change).

1.0.0.1 Preliminary

Libraries used in this script

library(ggplot2)  #for plotting
library(MASS)     #for random generator function
library(psych)    #for general functions

2 Specify Growth Model

Our first task is to write out the theory of interest - a theory of interindividual differences in intraindividual change.

2.1 Write out Mathematical model

The equation for intraindividual change is … (will need to adjust this section with new equation)

\[y_{it} = (b_{0i}) + (b_{1i})x_{it} + e_{it}\] where \[x_{it} = time_{it}\] and \[b_{0i} = g_{00} + u_{0i}\] \[b_{1i} = g_{10} + u_{1i}\] with \[e_{it} \tilde{} N(0,\sigma^{2}_{e})\] and \[[u_{0i},u_{1i}, ...] \tilde{} N(0,\Psi)\]

After algebraic substitution we get … \[y_{it} = (g_{00} + u_{0i}) + (g_{10} + u_{1i})x_{it} + e_{it}\]

This equation can be rearranged to more closely match R code layout of whatever functions will use to estimate. For our purposes here, this form translates in a sufficiently straightforward manner with our coding below.

3 Generate Intraindividual Change (prototype)

3.1 Set up a computation function

Setting up function in R to hold the mathemtical growth model … (will need to adjust this section to match new equation) (For convenience we have created “holders” for 5 features of change.)

growth.fun <- function(x, u_0=0,u_1=0,u_2=0,u_3=0,u_4=0, error=0) {
  y = (g_00 + u_0) + (g_10 + u_1)*x + error   #adjust this line to match equation
  return(y)
}

3.2 Set parameter values for simulation

Set prototypical values for g00, g10, g20, g30, ….(will need to adjust these values with theoretically informed values)

#setting fixed effects parameters
g_00 <- 0
g_10 <- 0.2
g_20 <- NA
g_30 <- NA
g_40 <- NA
#...add more as needed ...

Set number of persons and occasions (here it is for 1 person, 10 occasions).

num_persons <- 1
num_occasions <- 10

3.3 Generate data for prototype trajectory

Generate empty data:

proto_data <- data.frame(
                  id = sort(rep(c(1:num_persons),num_occasions)), 
                  time = rep(c(0:(num_occasions-1)),num_persons), 
                  outcome = NA)
head(proto_data)

id	time	outcome
1	0	NA
1	1	NA
1	2	NA
1	3	NA
1	4	NA
1	5	NA

Fill in ‘outcome’ with prototype trajectory from function (plugging in x = time variable)

proto_data$outcome <- growth.fun(x=proto_data$time)
proto_data

id	time	outcome
1	0	0.0
1	1	0.2
1	2	0.4
1	3	0.6
1	4	0.8
1	5	1.0
1	6	1.2
1	7	1.4
1	8	1.6
1	9	1.8

3.4 Plot prototype trajectory

Plot the prototype …(may need to adjust for aesthetics)

#plot prototype trajectory
ggplot(data = proto_data, aes(x = time, y = outcome, group = id)) +
  ggtitle("Simulated Linear Growth") +
  #geom_point() + 
  geom_line(color="red", size = 2) +
  xlab("Time") + 
  ylab("Outcome") + #ylim(0,100) +
  scale_x_continuous(breaks=seq(0,9,by=1))

We now have the intraindividual change part in place.

4 Generate Interindividual Differences in Intraindividual Change

4.1 Set parameters for simulation

This section is setting the information for the between-person random effects part of the model. \[[u_{0i},u_{1i}, ...] \tilde{} N(0,\Psi)\].

Set the number of persons to simulate: (will need to adjust this to size of desired sample)

num_persons <- 25

Create names for the growth coefficients (gcoeffs):

names_gcoeffs <- c("u_0","u_1","u_2","u_3","u_4")

Note - am creating 5 individual difference coefficients even though our example only uses 2.

Create a vector of means for the individual differences, u variables (typically these will be zero):

#setting means of random effects
u_0 <- 0
u_1 <- 0
u_2 <- 0
u_3 <- 0
u_4 <- 0

Set values for standard deviations for growth coefficients (will need to adjust this with chosen values)

sig_u_0 <- 1.0
sig_u_1 <- 0.3
sig_u_2 <- 1.0
sig_u_3 <- 1.0
sig_u_4 <- 1.0
#...add more as needed ...

Create a symmetric matrix of correlations among the growth coefficients (will need to adjust this with chosen values):

corr_gcoeffs <- matrix(c(1.0,0.2,0.0,0.0,0.0,
                         0.2,1.0,0.0,0.0,0.0,
                         0.0,0.0,1.0,0.0,0.0,
                         0.0,0.0,0.0,1.0,0.0,
                         0.0,0.0,0.0,0.0,1.0),nrow=5,ncol=5,byrow=TRUE)
corr_gcoeffs

##      [,1] [,2] [,3] [,4] [,5]
## [1,]  1.0  0.2    0    0    0
## [2,]  0.2  1.0    0    0    0
## [3,]  0.0  0.0    1    0    0
## [4,]  0.0  0.0    0    1    0
## [5,]  0.0  0.0    0    0    1

Create a vector of means. Create a vector of standard deviations and make it into a diagonal matrix:

#making mean vector
means_gcoeffs <- c(u_0,u_1,u_2,u_3,u_4)
#making sds vector and diagonal matrix
sds_gcoeffs_vector <- c(sig_u_0,sig_u_1,sig_u_2,sig_u_3,sig_u_4)
sds_gcoeffs <- diag(sds_gcoeffs_vector)
sds_gcoeffs

##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1  0.0    0    0    0
## [2,]    0  0.3    0    0    0
## [3,]    0  0.0    1    0    0
## [4,]    0  0.0    0    1    0
## [5,]    0  0.0    0    0    1

Do the matrix multiplication to get a covariance matrix:

cov_gcoeffs <- sds_gcoeffs %*% corr_gcoeffs %*% t(sds_gcoeffs)
cov_gcoeffs

##      [,1] [,2] [,3] [,4] [,5]
## [1,] 1.00 0.06    0    0    0
## [2,] 0.06 0.09    0    0    0
## [3,] 0.00 0.00    1    0    0
## [4,] 0.00 0.00    0    1    0
## [5,] 0.00 0.00    0    0    1

4.2 Generate individual-level coefficients

We use the mvrnorm() function in the MASS package) …

Use the mean vector and variance-covariance matrix to create some multivariate normal data:

#setting the random seed (in order to be able to repeat with same generation)
set.seed(1234) #can change this value if want different data run
#simulate
coeffs_simulate <- mvrnorm(n = num_persons, mu = means_gcoeffs, Sigma = cov_gcoeffs, empirical = TRUE)

Note that makes the data exact (if N is large enough)

Some cleanup into a data-like object.

#making and binding an id variable
id <- c(1:nrow(coeffs_simulate))
indiv_data <- cbind(id,coeffs_simulate)
#assigning the variable names
colnames(indiv_data) <- c("id",names_gcoeffs)
head(indiv_data)

##      id        u_0         u_1         u_2         u_3        u_4
## [1,]  1  1.0883565  0.35493517  0.06978623  1.03101628  1.5681671
## [2,]  2  1.1514923  0.02592345  1.35546492 -0.10400543 -1.1427319
## [3,]  3  1.2717192  0.08035000 -1.10056618  0.60137539 -0.6211519
## [4,]  4  0.4712630 -0.19889661  2.31188787  0.07321659  1.6843328
## [5,]  5  0.6817137 -0.20026783 -0.59006198 -0.70953285 -0.2979141
## [6,]  6 -0.1140907 -0.37390083  1.24890947  0.42056273 -1.5163611

#descriptives
describe(indiv_data)

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
id	1	25	13	7.359801	13.0000000	13.0000000	8.8956000	1.0000000	25.0000000	24.000000	0.0000000	-1.3446646	1.47196
u_0	2	25	0	1.000000	0.4533346	0.0649800	0.9414834	-2.3494809	1.2717192	3.621200	-0.5562613	-0.8521228	0.20000
u_1	3	25	0	0.300000	-0.0559674	-0.0046921	0.2577785	-0.5169619	0.6649941	1.181956	0.3168838	-0.8436492	0.06000
u_2	4	25	0	1.000000	-0.0992872	-0.0440353	0.8244843	-2.2144909	2.3118879	4.526379	0.3678988	0.0888570	0.20000
u_3	5	25	0	1.000000	-0.0062015	0.0257368	0.8268599	-2.3353412	2.2650956	4.600437	-0.1814367	0.1138464	0.20000
u_4	6	25	0	1.000000	0.0631530	0.0133786	0.8978654	-1.9403057	1.6843328	3.624639	-0.1622044	-0.8209951	0.20000

pairs.panels(indiv_data[,-1])

We now have the individual differences part in place.

5 Generate Intraindividual Change Trajectories for Multiple Person Sample

Now lets generate the simulated intraindivdual change trajectories for all those individuals.

5.1 Set additional simulation parameters

Set number of occasions: (will need to adjust this)
Typically is the same as was used earlier for intraindividual change.

num_occasions <- 10

Setting the (time-varying) residual error standard deviation: (will need to adjust this)

\[e_{it} \tilde{} N(0,\sigma^{2}_{e})\].

#set the sd of the error
error_sd <- 0.1

Calculating total number of observations

NobsTotal <- num_persons*num_occasions
NobsTotal

## [1] 250

Generating an empty data frame of appropraite size:

empty_df <- data.frame(
                  id = sort(rep(c(1:num_persons),num_occasions)), 
                  time = rep(c(0:(num_occasions-1)),num_persons), 
                  outcome = NA)
head(empty_df)

id	time	outcome
1	0	NA
1	1	NA
1	2	NA
1	3	NA
1	4	NA
1	5	NA

Merging in the individual differences coefficients:

sim_data <- merge(empty_df,indiv_data,by="id")

Creating a (time-varying) residual error score for each observation:

sim_data$error <- rnorm(nrow(sim_data),mean=0,sd=error_sd)

head(sim_data)

id	time	outcome	u_0	u_1	u_2	u_3	u_4	error
1	0	NA	1.088356	0.3549352	0.0697862	1.031016	1.568167	0.0080060
1	1	NA	1.088356	0.3549352	0.0697862	1.031016	1.568167	-0.0631409
1	2	NA	1.088356	0.3549352	0.0697862	1.031016	1.568167	-0.1513288
1	3	NA	1.088356	0.3549352	0.0697862	1.031016	1.568167	-0.0636100
1	4	NA	1.088356	0.3549352	0.0697862	1.031016	1.568167	0.0226302
1	5	NA	1.088356	0.3549352	0.0697862	1.031016	1.568167	0.1013690

Notice which variables are time-invariant, and which are time-varying.

5.2 Generate longitudinal outcome data

Use the originally specified growth function to generate outcome scores in the long data set:

sim_data$outcome <- growth.fun(x=sim_data$time,
                               u_0=sim_data$u_0,
                               u_1=sim_data$u_1,
                               u_2=sim_data$u_2,
                               u_3=sim_data$u_3,
                               u_4=sim_data$u_4,
                               error=sim_data$error)
head(sim_data)

id	time	outcome	u_0	u_1	u_2	u_3	u_4	error
1	0	1.096362	1.088356	0.3549352	0.0697862	1.031016	1.568167	0.0080060
1	1	1.580151	1.088356	0.3549352	0.0697862	1.031016	1.568167	-0.0631409
1	2	2.046898	1.088356	0.3549352	0.0697862	1.031016	1.568167	-0.1513288
1	3	2.689552	1.088356	0.3549352	0.0697862	1.031016	1.568167	-0.0636100
1	4	3.330727	1.088356	0.3549352	0.0697862	1.031016	1.568167	0.0226302
1	5	3.964401	1.088356	0.3549352	0.0697862	1.031016	1.568167	0.1013690

5.3 Plot trajectory data

Plotting the simulated data (with the prototype as an additional layer) (may need to adjust title and for aesthetics)

ggplot(data = sim_data, aes(x = time, y = outcome, group = id)) +
  ggtitle("Simulated Linear Growth") +
  geom_point() + 
  geom_line() +
  xlab("Time") + 
  ylab("Outcome") + #ylim(0,100) +
  scale_x_continuous(breaks=seq(0,9,by=1)) +
  stat_function(fun=growth.fun, color="red", size = 2)

That’s cool!

A simulation engine that can be used to play with theory and generate hypotheses!

Growth Model Trajectory Generator