Generating and Plotting Data from ODE Models: Part 2

Overview

In Part 1, we generated data from a variety of ODE models. Here in Part 2, we outline the entire fitting process. The code here is, in part, a “brain dump” - so we will look to flesh the details out a bit more later on.

Outline

In this session we cover …

A. Generating Data from a Differential Equation Model
B. Model Fitting: Integrating The Model Approach
C. Model Fitting: Differentiating the Data Approach
D. Coupled Damped Linear Oscillator Example

Prelim - Loading libraries used in this script.

library(deSolve)  #for the differential equation solvers
library(ggplot2)   #for plotting
library(nlme)     #for fitting mixed effets models
library(zoo)       #for making time-series embedded data
library(reshape2)  #for reshaping data
library(tseriesChaos) #for embedd function

Setting a seed becasue we are generating data.

set.seed(1234)

A. Generating Data from a Differential Equation Model

Let’s start with a very simple model - the constant (linear) growth model. In differential form, the model is …
\[\frac{dx}{dt} = a \]

We can generate some data from this model

1. Write the differential equation as a function.

#diffeq model as function
model.constant = function(times, y, parms) {
  ## times = current times
  ## y = current state vector: in this case the 0th derivative, c(x)
  ## parms = vector of model parameters.
  with(as.list(c(y,parms)),{
  derivs = c( 
    dx = a
    )
  ## Return
  list(derivs)
})
}

2. Specify vectors with the initial values, model parameters, and times over which to integrate.

#Initial Values
initvalues.1 <- c(x = 0.1)
#Model Parameters
parms.1 <- c(a = 0.2)
#Times over which to integrate
times <- seq(0, 100, 1)

3. Solve the differential equation using numerical methods.

This is done using the ode() function, which is itself a wrapper of the lsoda() function (and a collection of similar functions for different types of integrations (e.g., Runge-Katta), some of which work better with particular types of models (e.g., stiff and non-stiff systems)).

out <- ode(y = initvalues.1, times = times, func = model.constant, parms = parms.1)

This produces an object of class deSolve that can then be examined, summarized, used for diagnosis, and plotted.

head(out)

##      time   x
## [1,]    0 0.1
## [2,]    1 0.3
## [3,]    2 0.5
## [4,]    3 0.7
## [5,]    4 0.9
## [6,]    5 1.1

summary(out)

##                  x
## Min.      0.100000
## 1st Qu.   5.100000
## Median   10.100000
## Mean     10.100000
## 3rd Qu.  15.100000
## Max.     20.100000
## N       101.000000
## sd        5.860034

diagnostics(out)

## 
## --------------------
## lsoda return code
## --------------------
## 
##   return code (idid) =  2 
##   Integration was successful.
## 
## --------------------
## INTEGER values
## --------------------
## 
##   1 The return code : 2 
##   2 The number of steps taken for the problem so far: 102 
##   3 The number of function evaluations for the problem so far: 103 
##   5 The method order last used (successfully): 2 
##   6 The order of the method to be attempted on the next step: 2 
##   7 If return flag =-4,-5: the largest component in error vector 0 
##   8 The length of the real work array actually required: 36 
##   9 The length of the integer work array actually required: 21 
##  14 The number of Jacobian evaluations and LU decompositions so far: 0 
##  15 The method indicator for the last succesful step,
##            1=adams (nonstiff), 2= bdf (stiff): 1 
##  16 The current method indicator to be attempted on the next step,
##            1=adams (nonstiff), 2= bdf (stiff): 1 
##  
## --------------------
## RSTATE values
## --------------------
## 
##   1 The step size in t last used (successfully): 1 
##   2 The step size to be attempted on the next step: 1 
##   3 The current value of the independent variable which the solver has reached: 100.002 
##   4 Tolerance scale factor > 1.0 computed when requesting too much accuracy: 0 
##   5 The value of t at the time of the last method switch, if any: 0 
##

These diagnostics output may be useful for checking the kind of integration method used (ode() chooses on its own), and if there were problems. (I have not run enough of these yet to know how, specifically, to use the diagnostics)

4. Plot the output trajectory (solution).

plot(out, main="Constant Growth", lwd=2, xlab="Time", ylab="x")

Given our proclivity for plotting using ggplot2 function, we can also make the plot there, after conversion of the deSolve object into a data frame.

outplot=as.data.frame(out)

ggplot(data=outplot, aes(x=time, y=x)) +
  geom_point() + 
  geom_line() +
  xlab("Time") +
  ylab("x") +
  ggtitle("Constant Growth")

Expansion to Multiple Scenarios (Persons)

We can expand by plotting multiple scenarios (e.g., simualted trajectories for a set of selected or randomly chosen parameters (or starting values)).

First, we generate parameter values for a hypothetical sample of 30 persons.

#Creating a list of ids and parameter values 
outlist <- list()
plist <- cbind(id = 1:30,
               a = runif(30,min=0.1,max=.4))

Then, we use the ode() function to generate longitudinal data for those 30 persons

#Loop ode() through the matrix of parameter values and create a data set
#create individaul ids and parameters
plist <- cbind(id = 1:30,
               a = rnorm(30,mean=.25,sd=0.1))
#create a list to hold ode() generated values
outlist <- list()
#Loop through the matrix of parameter values
for(i in 1:nrow(plist)){
  outlist[[i]] <- ode(y = initvalues.1, times = times, func = model.constant, parms = plist[i,"a"])
}

#combne into data frame with original data
#creating data.frame holder 
outdata <- data.frame()
#converting plist and outlist into a combined data frame
for(i in 1:nrow(plist)){
  idata <- (cbind(id = plist[i,"id"], a = plist[i,"a"], outlist[[i]]))
  outdata <- rbind(outdata,idata)
}

And then plot the resulting longitudinal data.

#plotting
ggplot(data = outdata, aes(x = time, y = x, group = id, color=factor(id))) +
  #geom_point() + 
  geom_line() +
  xlab("Time") + 
  ylab("x") +
  scale_x_continuous(breaks=seq(0,100,by=10))

Ok - we have used the differential equation model and solver to generate multi-person, multi-occasion data of the observed output of the dynamic process.

Generally, there are two ways to fit differential equation models.
1. Integrate the model, and fit the integrated model to the observed data.
2. Differentiate the data, and fit the differential model to the estimated derivatives.

We illustrate each of these approaches.

B. Model Fitting: Integrating The Model Approach

Given the differential equation model \[\frac{dx}{dt} = a \] we can integrate to obtain \[\int_{t = 0}^{T} \frac{dx}{dt}(dt) = \int_{t = 0}^{T} a (dt)\] \[x_t = x_0 + at\] where \(x_0\) is a constant of integration (an intercept).

Placing the integrated model into a multilevel modeling framework for fitting to multi-person, multi-occasion data, we obtain \[x_{it} = x_{0i} + a_{i}t_{it} + e_{it}\] where \[x_{0i} = \gamma_{00} + u_{0i}\] \[a_{i} = \gamma_{10} + u_{1i}\]

This is then a typical linear growth model and can be fit using usual procedures (see https://quantdev.ssri.psu.edu/resources/growth-modeling-structural-equation-and-multilevel-modeling-approaches)

As a practical matter in our simulated example here, we first add a bit of noise to the observed scores. This is simply so that we do not have perfect fit (and we can actually estimate the model with an error term).

#copy of data
observedata <- outdata
#creating an error (noise) variable 
observedata$error <- rnorm(length(observedata$x), mean=0,sd=.01)
#creating an observed variable with noise
observedata$x_observed <- observedata$x + observedata$error

#plotting
ggplot(data = observedata, aes(x = time, y = x_observed, group = id, color=factor(id))) +
  #geom_point() + 
  geom_line() +
  xlab("Time") + 
  ylab("x") +
  scale_x_continuous(breaks=seq(0,100,by=10))

We then fit the growth model to thes data in the usual way using lme().

lineargrowth.lme <- lme(fixed= x_observed ~ 1 + time, 
                        random= ~ 1 + time|id, 
                        data=observedata,
                        na.action=na.exclude,
                        control = lmeControl(opt = "optim"))
summary(lineargrowth.lme)

## Linear mixed-effects model fit by REML
##  Data: observedata 
##         AIC       BIC   logLik
##   -18777.41 -18741.31 9394.704
## 
## Random effects:
##  Formula: ~1 + time | id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev       Corr  
## (Intercept) 0.0002349036 (Intr)
## time        0.0934399467 0.471 
## Residual    0.0099741511       
## 
## Fixed effects: x_observed ~ 1 + time 
##                  Value  Std.Error   DF   t-value p-value
## (Intercept) 0.09966287 0.00036227 2999 275.10659       0
## time        0.20586048 0.01705972 2999  12.06705       0
##  Correlation: 
##      (Intr)
## time 0.055 
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.440996242 -0.672290488  0.008798448  0.659228099  3.178890925 
## 
## Number of Observations: 3030
## Number of Groups: 30

We recover the parameters that were used to generate the data pretty well. The model integration approach to fitting works well. However, there are sitations where it is not practical or possible to obtain an integrated model.

C. Model Fitting: Differentiating the Data Approach

An alternative approach is to fit the differential equation model directly after differentiating the data.

We have observed \(x_it\) (as shown in plots above). We need to differentiate the observed time series. That is, to obtain estimates of \(dx{_it}\) for each observation.

One of the most straightforward ways to do this is using a Generalized Local Linear Approximation (GLLA) approach. In a future tutorial we will go into the details. For the moment we refer to materials and code from Steve Boker’s site (http://people.virginia.edu/~smb3u/GLLAfunctions.R)

Here, simply for illustration, we use slightly adapted (short) versions of Boker’s GLLA functions. The more general functions can be adapted for other settings (along with other functions for doing the time-delay embedding, e.g., embedd() in the tseriesChaos package)

The GLLA function specific to this example is.

GLLA <- function(x, embed, tau, deltaT, order=2) {
        L <- rep(1,embed)
        for(i in 1:order) {
            L <- cbind(L,(((c(1:embed)-mean(1:embed))*tau*deltaT)^i)/factorial(i))
            }
        
        W <- solve(t(L)%*%L)%*%t(L)
        #EMat <- Embed(x,embed,1)
        #replacement for Embed for embed = 5
          xzoo <- zoo(x)
      xzooembed <- cbind(lag(xzoo,k=-2), lag(xzoo,k=-1), xzoo, lag(xzoo,k=1), lag(xzoo,k=2))
      EMat <- as.matrix(xzooembed)
    #Continuation
        Estimates <- EMat%*%t(W)

        return(Estimates)
        }

We then loop our “observed” data through the GLLA function to obtain estimates of the “observed” derviatives, that is, to obtain the “differentiated data”.

# This is the for-loop that will go through the data and calculate derivatives.
# it goes through all of the elements of 'curr_data', uses GLLA
data_list <- plist[,"id"]
#create open object
newdata <- data.frame()
    for(i in 1:length(data_list)){
      #i <- 1 #for testing
        curr_data = observedata[which(observedata$id == data_list[i]),]
        derivs_x = GLLA(x = curr_data[,'x_observed'], embed = 5, tau = 1, deltaT = 1, order = 2)


        colnames(derivs_x) = c('x_est', 'dx_est', 'd2x_est')

        derivs_plus = cbind(curr_data, derivs_x)

    newdata <- rbind(newdata,derivs_plus)
    }

#look at data file
head(newdata)

##   id         a time         x        error x_observed     x_est    dx_est
## 1  1 0.2389715    0 0.1000000 -0.009685143 0.09031486        NA        NA
## 2  1 0.2389715    1 0.3389715 -0.011073182 0.32789827        NA        NA
## 3  1 0.2389715    2 0.5779429 -0.012519859 0.56542304 0.5675254 0.2404983
## 4  1 0.2389715    3 0.8169144 -0.005238281 0.81167607 0.8108712 0.2383292
## 5  1 0.2389715    4 1.0558858 -0.004968500 1.05091730 1.0470565 0.2390291
## 6  1 0.2389715    5 1.2948573 -0.018060313 1.27679694 1.2837854 0.2377161
##         d2x_est
## 1            NA
## 2            NA
## 3  0.0017205565
## 4 -0.0043288669
## 5 -0.0004922348
## 6  0.0020365042

Lets plot the derivatives data. These our our “observed” estimates of \(dx_{it}\).

#plotting
ggplot(data = newdata, aes(x = time, y = dx_est, group = id, color=factor(id))) +
  #geom_point() + 
  geom_line() +
  xlab("Time") + 
  ylab("dx") +
  scale_x_continuous(breaks=seq(0,100,by=10))

## Warning: Removed 120 rows containing missing values (geom_path).

Now we have the estimated derivatives in hand, and we can fit the derivative model, \(\frac{dx}{dt} = a\) directly. Again we place the model of interest into a multilevel framework that allows for examination of between-person differences. Specifically, our model becomes \[\left(\frac{dx}{dt} \right)_{it} = a_{i} + e_{it}\] where \[a_{i} = \gamma_{10} + u_{1i}\]

We can then fit this model to the differentiated data using lme().

diffgrowth.lme <- lme(fixed = dx_est ~ 1 , 
                        random= ~ 1|id, 
                        data=newdata,
                        na.action=na.exclude)
summary(diffgrowth.lme)

## Linear mixed-effects model fit by REML
##  Data: newdata 
##         AIC       BIC   logLik
##   -24905.36 -24887.43 12455.68
## 
## Random effects:
##  Formula: ~1 | id
##         (Intercept)    Residual
## StdDev:  0.09356322 0.003155285
## 
## Fixed effects: dx_est ~ 1 
##                 Value  Std.Error   DF  t-value p-value
## (Intercept) 0.2058621 0.01708233 2880 12.05118       0
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.697895286 -0.700117165  0.004873065  0.650118543  3.694738110 
## 
## Number of Observations: 2910
## Number of Groups: 30

And, better for generalization and where we are headed, using nlme().

#fitting differential model
diffgrowth.nlme <- nlme(dx_est ~ a ,
                      fixed =  a ~ 1,
                      random = a ~ 1|id,
                      start = c(a = .2),
                        data=newdata,
                        na.action=na.exclude,
                        verbose=TRUE)

## 
## **Iteration 1
## LME step: Loglik: 12458.84, nlminb iterations: 1
## reStruct  parameters:
##        id 
## -3.372607 
## PNLS step: RSS =  0.02897145 
##  fixed effects: 0.2058621  
##  iterations: 2 
## Convergence crit. (must all become <= tolerance = 1e-05):
##        fixed     reStruct 
## 2.847596e-02 1.104576e-09 
## 
## **Iteration 2
## LME step: Loglik: 12458.84, nlminb iterations: 1
## reStruct  parameters:
##        id 
## -3.372607 
## PNLS step: RSS =  0.02897145 
##  fixed effects: 0.2058621  
##  iterations: 1 
## Convergence crit. (must all become <= tolerance = 1e-05):
##    fixed reStruct 
##        0        0

summary(diffgrowth.nlme)

## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: dx_est ~ a 
##  Data: newdata 
##         AIC       BIC   logLik
##   -24911.68 -24893.75 12458.84
## 
## Random effects:
##  Formula: a ~ 1 | id
##                  a    Residual
## StdDev: 0.09199059 0.003155285
## 
## Fixed effects: a ~ 1 
##       Value Std.Error   DF  t-value p-value
## a 0.2058621 0.0167981 2880 12.25509       0
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.697902321 -0.700119258  0.004866461  0.650136546  3.694732854 
## 
## Number of Observations: 2910
## Number of Groups: 30

Once we have obtained the model parameters, we can use those to obtain predicted trajectories.

We do the four steps as usual.

1. Write the differential equation as a function.

As a check, let us look at the equation used in the fitting process.

#to see equation to put into ODE function
diffgrowth.nlme$call$model

## dx_est ~ a

We then code this equation into a function (by replacing the ~ with an =).

#rewriting diffeq model as function
model.constant = function(times, y, parms) {
  ## times = current times
  ## y = current state vector: in this case the 0th derivative, c(x)
  ## parms = vector of model parameters.
  with(as.list(c(y,parms)),{
  derivs = c( 
    dx = a
    )
  ## Return
  list(derivs)
})
}

2. Specify vectors with the initial values, model parameters, and times over which to integrate.

#Initial Values
initvalues.1 <- c(x = 0.1)
#Model Parameters
#parms.1 <- c(a = 0.2)
#Times over which to integrate
times <- seq(0, 100, 1)

Pulling the estimated Model Parameters directly from the fitted model nlme object

#extracting values of the parameters 
parms.est <- summary(diffgrowth.nlme)$tTable[,"Value"]
#attaching names
names(parms.est) <- rownames(summary(diffgrowth.nlme)$tTable)
parms.est

##         a 
## 0.2058621

3. Solve the differential equation using numerical methods.

In this case we use the parmeters that were extracted from the nlme fitted object.

out <- ode(y = initvalues.1, times = times, func = model.constant, parms = parms.est)

This produces an object of class deSolve that can then be examined, summarized, used for diagnosis, and plotted.

4. Plot the estimated trajectory (given the initial values we supplied).

#convert deSolve object to data.frame
outplot=as.data.frame(out)

ggplot(data=outplot, aes(x=time, y=x)) +
  geom_point() + 
  geom_line() +
  xlab("Time") +
  ylab("Estimated x") +
  ggtitle("Estimated Constant Growth")

Yay!. We have differentiated the data (estiamtes obtained using GLLA), fit the differntial equation model, used the parameter estimates to generate the prototypical trajectory.

D. Coupled Damped Linear Oscillator Example

Now, lets try to do it for a more complicated coupled damped linear oscillator model - a 2nd order differential equation model.

Our coupled equations are …

\[\frac{d^2x}{dt^2} = \eta_x x + \zeta_x\frac{dx}{dt} + \gamma_1z\] \[\frac{d^2z}{dt^2} = \eta_z z + \zeta_z \frac{dz}{dt} + \gamma_2 x\] We do the four steps as usual.

1. specifying the model as a function

#Writing out the model as a function
model.coupleosc <- function(times, y , parms) {
  with(as.list(c(y, parms)), {
    
    derivs = c(
    dx = dx, 
    d2x = eta_x*x + zeta_x*dx + gamma1*z,
    dz = dz,
    d2z = eta_z*z + zeta_z*dz + gamma2*x
    )
    return(list(derivs))
  })
}

2. setting model parameters, initial values, times

parms.coupleosc  <- c(eta_x  = -0.2,    # frequency of x
                      zeta_x  = -.1,    # damping of x
                      eta_z  = -0.05,   # frequency of z
                      zeta_z = -0.3,    # damping of z
                      gamma1 = -.1,   # coupling parameter 1 (z on d2x)
                      gamma2 = +.1)   # coupling parameter 2 (x on d2z)
initvalues.coupleosc <- c(x = 3, dx = 0, z = 2, dz = 0)
times <- seq(0, 100, by = 1)

Actually, instead of generating just one trajectory, we generate trajectories for 30 individuals. Making multi-person data.

#Use a list if many unknown number of outputs
plist <- cbind(id = 1:30,
               eta_x  = rnorm(30,mean=-0.2,sd=.01),    # frequency of x
               zeta_x  = rnorm(30,mean=-0.1,sd=.01),    # damping of x
               eta_z  = rnorm(30,mean=-0.05,sd=.01),   # frequency of z
               zeta_z = rnorm(30,mean=-0.3,sd=.01),    # damping of z
               gamma1 = rnorm(30,mean=-0.0,sd=.00),   # coupling parameter 1 (z on d2x)
               gamma2 = rnorm(30,mean=+0.0,sd=.00))   # coupling parameter 2 (x on d2z))

3. Solving to get trajectories

#creatiing holder object
outlist <- list()
#Loop through the matrix of parameter values
for(i in 1:nrow(plist)){
  outlist[[i]] <- ode(y = initvalues.coupleosc, times = times, func = model.coupleosc, parms = plist[i,-1])
}

4. Plotting the implied trajectory (solution)

#creating data.frame holder 
outdata <- data.frame()
#converting plist and outlist into a combined data frame
for(i in 1:nrow(plist)){
  idata <- (cbind(id = plist[i,"id"], outlist[[i]]))
  outdata <- rbind(outdata,idata)
}
#plotting
ggplot(data = outdata, aes(x = time, group = id)) +
  #geom_point() + 
  geom_line(aes(y=x), color="red") +
  geom_line(aes(y=z), color="black") +
  xlab("Time") + 
  ylab("x") +
  scale_x_continuous(breaks=seq(0,100,by=10)) +
  facet_wrap(~ id)

Yay!

Using these data, we then go ahead and fit the model using the Differentiate The Data approach.

Lets estimate the derivatives.

#copy of data
observedata <- outdata

# This is the for-loop that will go through the data and calculate derivatives.
# it goes through all of the elements of 'curr_data', uses GLLA
data_list <- plist[,"id"]
#create open object
newdata <- data.frame()
    for(i in 1:length(data_list)){
      #i <- 1 #for testing
        curr_data = observedata[which(observedata$id == data_list[i]),]
        #estimate derivs for x variable
        derivs_x = GLLA(x = curr_data[,'x'], embed = 5, tau = 1, deltaT = 1, order = 2)
        colnames(derivs_x) = c('x_est', 'dx_est', 'd2x_est')
        #estimate derivs for z variable
        derivs_z = GLLA(x = curr_data[,'z'], embed = 5, tau = 1, deltaT = 1, order = 2)
        colnames(derivs_z) = c('z_est', 'dz_est', 'd2z_est')

        derivs_plus = cbind(curr_data, derivs_x, derivs_z)

    newdata <- rbind(newdata,derivs_plus)
    }

Ok - now we have recreated the “observed” (estimated) derivatives

head(newdata)

##   id time          x         dx        z          dz      x_est     dx_est
## 1  1    0  3.0000000  0.0000000 2.000000  0.00000000         NA         NA
## 2  1    1  2.7203321 -0.5405122 1.954484 -0.08649864         NA         NA
## 3  1    2  1.9705842 -0.9274688 1.835472 -0.14759779  1.9677048 -0.8117632
## 4  1    3  0.9355407 -1.1054833 1.666521 -0.18699907  0.9363783 -0.9831054
## 5  1    4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913 -0.9570753
## 6  1    5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331 -0.7583599
##       d2x_est    z_est     dz_est      d2z_est
## 1          NA       NA         NA           NA
## 2          NA       NA         NA           NA
## 3 -0.27569742 1.835791 -0.1353033 -0.051002542
## 4 -0.06970947 1.666834 -0.1768011 -0.031149706
## 5  0.12825848 1.467760 -0.2002372 -0.014911318
## 6  0.28275729 1.254750 -0.2091129 -0.002088306

Check that these correlate well with the actual derivatives.

cor(newdata$dz,newdata$dz_est, use = "pairwise.complete.obs")

## [1] 0.9996172

Indeed.

To fit the bivariate model using nlme(), we need to stack the data into double entry and include dummy variables.

names(newdata)

##  [1] "id"      "time"    "x"       "dx"      "z"       "dz"      "x_est"  
##  [8] "dx_est"  "d2x_est" "z_est"   "dz_est"  "d2z_est"

newdata2 <- melt(data=newdata,
               id.vars=c("id","time","x","dx","z","dz",
                         "x_est","dx_est",#"d2x_est",
                         "z_est","dz_est"#"d2z_est"
                         ),
               na.rm=FALSE, value.name="d2_var")


#reordering for convenience 
newdata2 <- newdata2[order(newdata2$id,newdata2$time,newdata2$variable),]

#look at updated data set
head(newdata2, 12)

##      id time          x         dx        z          dz      x_est
## 1     1    0  3.0000000  0.0000000 2.000000  0.00000000         NA
## 3031  1    0  3.0000000  0.0000000 2.000000  0.00000000         NA
## 2     1    1  2.7203321 -0.5405122 1.954484 -0.08649864         NA
## 3032  1    1  2.7203321 -0.5405122 1.954484 -0.08649864         NA
## 3     1    2  1.9705842 -0.9274688 1.835472 -0.14759779  1.9677048
## 3033  1    2  1.9705842 -0.9274688 1.835472 -0.14759779  1.9677048
## 4     1    3  0.9355407 -1.1054833 1.666521 -0.18699907  0.9363783
## 3034  1    3  0.9355407 -1.1054833 1.666521 -0.18699907  0.9363783
## 5     1    4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913
## 3035  1    4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913
## 6     1    5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331
## 3036  1    5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331
##          dx_est    z_est     dz_est variable       d2_var
## 1            NA       NA         NA  d2x_est           NA
## 3031         NA       NA         NA  d2z_est           NA
## 2            NA       NA         NA  d2x_est           NA
## 3032         NA       NA         NA  d2z_est           NA
## 3    -0.8117632 1.835791 -0.1353033  d2x_est -0.275697423
## 3033 -0.8117632 1.835791 -0.1353033  d2z_est -0.051002542
## 4    -0.9831054 1.666834 -0.1768011  d2x_est -0.069709471
## 3034 -0.9831054 1.666834 -0.1768011  d2z_est -0.031149706
## 5    -0.9570753 1.467760 -0.2002372  d2x_est  0.128258481
## 3035 -0.9570753 1.467760 -0.2002372  d2z_est -0.014911318
## 6    -0.7583599 1.254750 -0.2091129  d2x_est  0.282757285
## 3036 -0.7583599 1.254750 -0.2091129  d2z_est -0.002088306

#adding the two dummy indicators 
newdata2$var_x <- ifelse(newdata2$variable=="d2x_est", 1, 0)
newdata2$var_z <- ifelse(newdata2$variable=="d2z_est", 1, 0)
#adding another indicator
newdata2$varnum <- as.numeric(factor(newdata2$variable))

#look at updated data set
head(newdata2,12)

##      id time          x         dx        z          dz      x_est
## 1     1    0  3.0000000  0.0000000 2.000000  0.00000000         NA
## 3031  1    0  3.0000000  0.0000000 2.000000  0.00000000         NA
## 2     1    1  2.7203321 -0.5405122 1.954484 -0.08649864         NA
## 3032  1    1  2.7203321 -0.5405122 1.954484 -0.08649864         NA
## 3     1    2  1.9705842 -0.9274688 1.835472 -0.14759779  1.9677048
## 3033  1    2  1.9705842 -0.9274688 1.835472 -0.14759779  1.9677048
## 4     1    3  0.9355407 -1.1054833 1.666521 -0.18699907  0.9363783
## 3034  1    3  0.9355407 -1.1054833 1.666521 -0.18699907  0.9363783
## 5     1    4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913
## 3035  1    4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913
## 6     1    5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331
## 3036  1    5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331
##          dx_est    z_est     dz_est variable       d2_var var_x var_z
## 1            NA       NA         NA  d2x_est           NA     1     0
## 3031         NA       NA         NA  d2z_est           NA     0     1
## 2            NA       NA         NA  d2x_est           NA     1     0
## 3032         NA       NA         NA  d2z_est           NA     0     1
## 3    -0.8117632 1.835791 -0.1353033  d2x_est -0.275697423     1     0
## 3033 -0.8117632 1.835791 -0.1353033  d2z_est -0.051002542     0     1
## 4    -0.9831054 1.666834 -0.1768011  d2x_est -0.069709471     1     0
## 3034 -0.9831054 1.666834 -0.1768011  d2z_est -0.031149706     0     1
## 5    -0.9570753 1.467760 -0.2002372  d2x_est  0.128258481     1     0
## 3035 -0.9570753 1.467760 -0.2002372  d2z_est -0.014911318     0     1
## 6    -0.7583599 1.254750 -0.2091129  d2x_est  0.282757285     1     0
## 3036 -0.7583599 1.254750 -0.2091129  d2z_est -0.002088306     0     1
##      varnum
## 1         1
## 3031      2
## 2         1
## 3032      2
## 3         1
## 3033      2
## 4         1
## 3034      2
## 5         1
## 3035      2
## 6         1
## 3036      2

Now we are ready to fit the coupled model using nlme().

#the coupled model 
model.coupleosc_2 <- nlme(d2_var ~ eta_x*var_x*x_est + zeta_x*var_x*dx_est + gamma1*var_x*z_est +
                                   eta_z*var_z*z_est + zeta_z*var_z*dz_est + gamma2*var_z*x_est,
                        fixed = eta_x + zeta_x + gamma1 +
                                eta_z + zeta_z + gamma2 ~ 1,
                        random = eta_x + 
                                 eta_z ~ 1|id,
                        weights = varIdent(form = ~ 1|varnum),
                        corr = corSymm(form = ~ varnum|id/time),
                        start = c(eta_x  = -0.2,    # frequency of x
                                  zeta_x  = -.1,    # damping of x
                                  gamma1 = -.1,
                                  eta_z  = -0.05,    # frequency of x
                                  zeta_z  = -.3,    # damping of x
                                  gamma2 = +.1),
                        data=newdata2,
                        na.action = na.exclude,
                        verbose=TRUE)

## 
## **Iteration 1
## LME step: Loglik: 32682.03, nlminb iterations: 32
## reStruct  parameters:
##          id1          id2          id3 
## -1.231235640 -1.334228264 -0.008572995 
## corStruct  parameters:
## [1] 0.339888
## varStruct  parameters:
## [1] -2.224973
## PNLS step: RSS =  0.03990564 
##  fixed effects: -0.1851459  -0.0947971  0.0005576474  -0.04807466  -0.2903971  -1.5371e-05  
##  iterations: 2 
## Convergence crit. (must all become <= tolerance = 1e-05):
##        fixed     reStruct    corStruct    varStruct 
## 6506.7558730    0.2810968    1.0000000    1.0000000 
## 
## **Iteration 2
## LME step: Loglik: 32682.03, nlminb iterations: 1
## reStruct  parameters:
##       id1       id2       id3 
## -1.231236 -1.334228 -0.008573 
## corStruct  parameters:
## [1] 0.339888
## varStruct  parameters:
## [1] -2.224973
## PNLS step: RSS =  0.03990564 
##  fixed effects: -0.1851459  -0.0947971  0.0005576474  -0.04807466  -0.2903971  -1.5371e-05  
##  iterations: 1 
## Convergence crit. (must all become <= tolerance = 1e-05):
##     fixed  reStruct corStruct varStruct 
##         0         0         0         0

summary(model.coupleosc_2)

## Nonlinear mixed-effects model fit by maximum likelihood
##   Model: d2_var ~ eta_x * var_x * x_est + zeta_x * var_x * dx_est + gamma1 *      var_x * z_est + eta_z * var_z * z_est + zeta_z * var_z *      dz_est + gamma2 * var_z * x_est 
##  Data: newdata2 
##         AIC       BIC   logLik
##   -65340.05 -65260.02 32682.03
## 
## Random effects:
##  Formula: list(eta_x ~ 1, eta_z ~ 1)
##  Level: id
##  Structure: General positive-definite, Log-Cholesky parametrization
##          StdDev      Corr 
## eta_x    0.008974380 eta_x
## eta_z    0.009942684 0.033
## Residual 0.002618519      
## 
## Correlation Structure: General
##  Formula: ~varnum | id/time 
##  Parameter estimate(s):
##  Correlation: 
##   1     
## 2 -0.261
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | varnum 
##  Parameter estimates:
##         1         2 
## 1.0000000 0.1080703 
## Fixed effects: eta_x + zeta_x + gamma1 + eta_z + zeta_z + gamma2 ~ 1 
##              Value    Std.Error   DF    t-value p-value
## eta_x  -0.18514592 0.0016414336 5785  -112.7953  0.0000
## zeta_x -0.09479710 0.0002093646 5785  -452.7848  0.0000
## gamma1  0.00055765 0.0001537969 5785     3.6259  0.0003
## eta_z  -0.04807466 0.0018164575 5785   -26.4662  0.0000
## zeta_z -0.29039713 0.0001971931 5785 -1472.6533  0.0000
## gamma2 -0.00001537 0.0000093420 5785    -1.6454  0.0999
##  Correlation: 
##        eta_x  zeta_x gamma1 eta_z  zeta_z
## zeta_x  0.011                            
## gamma1  0.010  0.522                     
## eta_z   0.032  0.002 -0.001              
## zeta_z  0.001  0.108  0.058  0.014       
## gamma2 -0.012 -0.040 -0.044 -0.004 -0.369
## 
## Standardized Within-Group Residuals:
##           Min            Q1           Med            Q3           Max 
## -9.621405e+00 -4.371798e-02  1.735332e-05  4.725185e-02  7.948201e+00 
## 
## Number of Observations: 5820
## Number of Groups: 30

The parameter estiamtes are quite close to the parameters used to generate the data.

Lets now extract the parameters and plot the protypical trajectory.

1. specifying the model as a function

Getting equations from the nlme model.

model.coupleosc_2$call$model

## d2_var ~ eta_x * var_x * x_est + zeta_x * var_x * dx_est + gamma1 * 
##     var_x * z_est + eta_z * var_z * z_est + zeta_z * var_z * 
##     dz_est + gamma2 * var_z * x_est

This is not in the most useful form, but can be adapted for making the differential equation function. We follow the original equations more directly, keeping in mind that the function must be written as a set of first-order equations.

#Writing out the model as a function
model.coupleosc <- function(times, y , parms) {
  with(as.list(c(y, parms)), {
    
    derivs = c(
    dx = dx, 
    d2x = eta_x*x + zeta_x*dx + gamma1*z,
    dz = dz,
    d2z = eta_z*z + zeta_z*dz + gamma2*x
    )
    return(list(derivs))
  })
}

2. setting model parameters, initial values, times

#Pulling the estimated Model Parameters from the nlme object
#extracting values of the parameters 
parms.est <- summary(model.coupleosc_2)$tTable[,"Value"]
#attaching names
names(parms.est) <- rownames(summary(model.coupleosc_2)$tTable)
parms.est

##         eta_x        zeta_x        gamma1         eta_z        zeta_z 
## -0.1851459197 -0.0947971042  0.0005576474 -0.0480746556 -0.2903971324 
##        gamma2 
## -0.0000153710

#Setting initial values
initvalues.coupleosc <- c(x = 3, dx = 0, z = 2, dz = 0)
#setting times to integrate over.
times <- seq(0, 100, by = 1)

3. Solving to get trajectory

out <- ode(y = initvalues.coupleosc, times = times, func = model.coupleosc, parms = parms.est)
#plot(out,outlist, lwd=2)

4. Plotting the implied trajectory (solution for fitted model)

#converting to data.frame 
outdata <- as.data.frame(out)

#plotting
ggplot(data = outdata[which(outdata$time <=100),], aes(x = time)) +
  #geom_point() + 
  geom_line(aes(y=x), color="red") +
  geom_line(aes(y=z), color="black") +
  xlab("Time") + 
  ylab("x=red, z=black") +
  scale_x_continuous(breaks=seq(0,100,by=10))

Cool! - It is possible to set it all up in a way that can do the fitting and plotting in relatively straightforward way. We are making progress, and finding a relatively straightforward path through all the steps in the model estimation and results communication.

Thank you for playing!