In Part 1, we generated data from a variety of ODE models. Here in Part 2, we outline the entire fitting process. The code here is, in part, a “brain dump” - so we will look to flesh the details out a bit more later on.
In this session we cover …
A. Generating Data from a Differential Equation Model
B. Model Fitting: Integrating The Model Approach
C. Model Fitting: Differentiating the Data Approach
D. Coupled Damped Linear Oscillator Example
library(deSolve) #for the differential equation solvers
library(ggplot2) #for plotting
library(nlme) #for fitting mixed effets models
library(zoo) #for making time-series embedded data
library(reshape2) #for reshaping data
library(tseriesChaos) #for embedd function
Setting a seed becasue we are generating data.
set.seed(1234)
Let’s start with a very simple model - the constant (linear) growth model. In differential form, the model is …
\[\frac{dx}{dt} = a \]
We can generate some data from this model
#diffeq model as function
model.constant = function(times, y, parms) {
## times = current times
## y = current state vector: in this case the 0th derivative, c(x)
## parms = vector of model parameters.
with(as.list(c(y,parms)),{
derivs = c(
dx = a
)
## Return
list(derivs)
})
}
#Initial Values
initvalues.1 <- c(x = 0.1)
#Model Parameters
parms.1 <- c(a = 0.2)
#Times over which to integrate
times <- seq(0, 100, 1)
This is done using the ode()
function, which is itself a wrapper of the lsoda()
function (and a collection of similar functions for different types of integrations (e.g., Runge-Katta), some of which work better with particular types of models (e.g., stiff and non-stiff systems)).
out <- ode(y = initvalues.1, times = times, func = model.constant, parms = parms.1)
This produces an object of class deSolve that can then be examined, summarized, used for diagnosis, and plotted.
head(out)
## time x
## [1,] 0 0.1
## [2,] 1 0.3
## [3,] 2 0.5
## [4,] 3 0.7
## [5,] 4 0.9
## [6,] 5 1.1
summary(out)
## x
## Min. 0.100000
## 1st Qu. 5.100000
## Median 10.100000
## Mean 10.100000
## 3rd Qu. 15.100000
## Max. 20.100000
## N 101.000000
## sd 5.860034
diagnostics(out)
##
## --------------------
## lsoda return code
## --------------------
##
## return code (idid) = 2
## Integration was successful.
##
## --------------------
## INTEGER values
## --------------------
##
## 1 The return code : 2
## 2 The number of steps taken for the problem so far: 102
## 3 The number of function evaluations for the problem so far: 103
## 5 The method order last used (successfully): 2
## 6 The order of the method to be attempted on the next step: 2
## 7 If return flag =-4,-5: the largest component in error vector 0
## 8 The length of the real work array actually required: 36
## 9 The length of the integer work array actually required: 21
## 14 The number of Jacobian evaluations and LU decompositions so far: 0
## 15 The method indicator for the last succesful step,
## 1=adams (nonstiff), 2= bdf (stiff): 1
## 16 The current method indicator to be attempted on the next step,
## 1=adams (nonstiff), 2= bdf (stiff): 1
##
## --------------------
## RSTATE values
## --------------------
##
## 1 The step size in t last used (successfully): 1
## 2 The step size to be attempted on the next step: 1
## 3 The current value of the independent variable which the solver has reached: 100.002
## 4 Tolerance scale factor > 1.0 computed when requesting too much accuracy: 0
## 5 The value of t at the time of the last method switch, if any: 0
##
These diagnostics output may be useful for checking the kind of integration method used (ode()
chooses on its own), and if there were problems. (I have not run enough of these yet to know how, specifically, to use the diagnostics)
plot(out, main="Constant Growth", lwd=2, xlab="Time", ylab="x")
Given our proclivity for plotting using ggplot2
function, we can also make the plot there, after conversion of the deSolve object into a data frame.
outplot=as.data.frame(out)
ggplot(data=outplot, aes(x=time, y=x)) +
geom_point() +
geom_line() +
xlab("Time") +
ylab("x") +
ggtitle("Constant Growth")
We can expand by plotting multiple scenarios (e.g., simualted trajectories for a set of selected or randomly chosen parameters (or starting values)).
First, we generate parameter values for a hypothetical sample of 30 persons.
#Creating a list of ids and parameter values
outlist <- list()
plist <- cbind(id = 1:30,
a = runif(30,min=0.1,max=.4))
Then, we use the ode()
function to generate longitudinal data for those 30 persons
#Loop ode() through the matrix of parameter values and create a data set
#create individaul ids and parameters
plist <- cbind(id = 1:30,
a = rnorm(30,mean=.25,sd=0.1))
#create a list to hold ode() generated values
outlist <- list()
#Loop through the matrix of parameter values
for(i in 1:nrow(plist)){
outlist[[i]] <- ode(y = initvalues.1, times = times, func = model.constant, parms = plist[i,"a"])
}
#combne into data frame with original data
#creating data.frame holder
outdata <- data.frame()
#converting plist and outlist into a combined data frame
for(i in 1:nrow(plist)){
idata <- (cbind(id = plist[i,"id"], a = plist[i,"a"], outlist[[i]]))
outdata <- rbind(outdata,idata)
}
And then plot the resulting longitudinal data.
#plotting
ggplot(data = outdata, aes(x = time, y = x, group = id, color=factor(id))) +
#geom_point() +
geom_line() +
xlab("Time") +
ylab("x") +
scale_x_continuous(breaks=seq(0,100,by=10))
Ok - we have used the differential equation model and solver to generate multi-person, multi-occasion data of the observed output of the dynamic process.
Generally, there are two ways to fit differential equation models.
1. Integrate the model, and fit the integrated model to the observed data.
2. Differentiate the data, and fit the differential model to the estimated derivatives.
We illustrate each of these approaches.
Given the differential equation model \[\frac{dx}{dt} = a \] we can integrate to obtain \[\int_{t = 0}^{T} \frac{dx}{dt}(dt) = \int_{t = 0}^{T} a (dt)\] \[x_t = x_0 + at\] where \(x_0\) is a constant of integration (an intercept).
Placing the integrated model into a multilevel modeling framework for fitting to multi-person, multi-occasion data, we obtain \[x_{it} = x_{0i} + a_{i}t_{it} + e_{it}\] where \[x_{0i} = \gamma_{00} + u_{0i}\] \[a_{i} = \gamma_{10} + u_{1i}\]
This is then a typical linear growth model and can be fit using usual procedures (see https://quantdev.ssri.psu.edu/resources/growth-modeling-structural-equation-and-multilevel-modeling-approaches)
As a practical matter in our simulated example here, we first add a bit of noise to the observed scores. This is simply so that we do not have perfect fit (and we can actually estimate the model with an error term).
#copy of data
observedata <- outdata
#creating an error (noise) variable
observedata$error <- rnorm(length(observedata$x), mean=0,sd=.01)
#creating an observed variable with noise
observedata$x_observed <- observedata$x + observedata$error
#plotting
ggplot(data = observedata, aes(x = time, y = x_observed, group = id, color=factor(id))) +
#geom_point() +
geom_line() +
xlab("Time") +
ylab("x") +
scale_x_continuous(breaks=seq(0,100,by=10))
We then fit the growth model to thes data in the usual way using lme()
.
lineargrowth.lme <- lme(fixed= x_observed ~ 1 + time,
random= ~ 1 + time|id,
data=observedata,
na.action=na.exclude,
control = lmeControl(opt = "optim"))
summary(lineargrowth.lme)
## Linear mixed-effects model fit by REML
## Data: observedata
## AIC BIC logLik
## -18777.41 -18741.31 9394.704
##
## Random effects:
## Formula: ~1 + time | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 0.0002349036 (Intr)
## time 0.0934399467 0.471
## Residual 0.0099741511
##
## Fixed effects: x_observed ~ 1 + time
## Value Std.Error DF t-value p-value
## (Intercept) 0.09966287 0.00036227 2999 275.10659 0
## time 0.20586048 0.01705972 2999 12.06705 0
## Correlation:
## (Intr)
## time 0.055
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.440996242 -0.672290488 0.008798448 0.659228099 3.178890925
##
## Number of Observations: 3030
## Number of Groups: 30
We recover the parameters that were used to generate the data pretty well. The model integration approach to fitting works well. However, there are sitations where it is not practical or possible to obtain an integrated model.
An alternative approach is to fit the differential equation model directly after differentiating the data.
We have observed \(x_it\) (as shown in plots above). We need to differentiate the observed time series. That is, to obtain estimates of \(dx{_it}\) for each observation.
One of the most straightforward ways to do this is using a Generalized Local Linear Approximation (GLLA) approach. In a future tutorial we will go into the details. For the moment we refer to materials and code from Steve Boker’s site (http://people.virginia.edu/~smb3u/GLLAfunctions.R)
Here, simply for illustration, we use slightly adapted (short) versions of Boker’s GLLA functions. The more general functions can be adapted for other settings (along with other functions for doing the time-delay embedding, e.g., embedd()
in the tseriesChaos
package)
The GLLA function specific to this example is.
GLLA <- function(x, embed, tau, deltaT, order=2) {
L <- rep(1,embed)
for(i in 1:order) {
L <- cbind(L,(((c(1:embed)-mean(1:embed))*tau*deltaT)^i)/factorial(i))
}
W <- solve(t(L)%*%L)%*%t(L)
#EMat <- Embed(x,embed,1)
#replacement for Embed for embed = 5
xzoo <- zoo(x)
xzooembed <- cbind(lag(xzoo,k=-2), lag(xzoo,k=-1), xzoo, lag(xzoo,k=1), lag(xzoo,k=2))
EMat <- as.matrix(xzooembed)
#Continuation
Estimates <- EMat%*%t(W)
return(Estimates)
}
We then loop our “observed” data through the GLLA function to obtain estimates of the “observed” derviatives, that is, to obtain the “differentiated data”.
# This is the for-loop that will go through the data and calculate derivatives.
# it goes through all of the elements of 'curr_data', uses GLLA
data_list <- plist[,"id"]
#create open object
newdata <- data.frame()
for(i in 1:length(data_list)){
#i <- 1 #for testing
curr_data = observedata[which(observedata$id == data_list[i]),]
derivs_x = GLLA(x = curr_data[,'x_observed'], embed = 5, tau = 1, deltaT = 1, order = 2)
colnames(derivs_x) = c('x_est', 'dx_est', 'd2x_est')
derivs_plus = cbind(curr_data, derivs_x)
newdata <- rbind(newdata,derivs_plus)
}
#look at data file
head(newdata)
## id a time x error x_observed x_est dx_est
## 1 1 0.2389715 0 0.1000000 -0.009685143 0.09031486 NA NA
## 2 1 0.2389715 1 0.3389715 -0.011073182 0.32789827 NA NA
## 3 1 0.2389715 2 0.5779429 -0.012519859 0.56542304 0.5675254 0.2404983
## 4 1 0.2389715 3 0.8169144 -0.005238281 0.81167607 0.8108712 0.2383292
## 5 1 0.2389715 4 1.0558858 -0.004968500 1.05091730 1.0470565 0.2390291
## 6 1 0.2389715 5 1.2948573 -0.018060313 1.27679694 1.2837854 0.2377161
## d2x_est
## 1 NA
## 2 NA
## 3 0.0017205565
## 4 -0.0043288669
## 5 -0.0004922348
## 6 0.0020365042
Lets plot the derivatives data. These our our “observed” estimates of \(dx_{it}\).
#plotting
ggplot(data = newdata, aes(x = time, y = dx_est, group = id, color=factor(id))) +
#geom_point() +
geom_line() +
xlab("Time") +
ylab("dx") +
scale_x_continuous(breaks=seq(0,100,by=10))
## Warning: Removed 120 rows containing missing values (geom_path).
Now we have the estimated derivatives in hand, and we can fit the derivative model, \(\frac{dx}{dt} = a\) directly. Again we place the model of interest into a multilevel framework that allows for examination of between-person differences. Specifically, our model becomes \[\left(\frac{dx}{dt} \right)_{it} = a_{i} + e_{it}\] where \[a_{i} = \gamma_{10} + u_{1i}\]
We can then fit this model to the differentiated data using lme()
.
diffgrowth.lme <- lme(fixed = dx_est ~ 1 ,
random= ~ 1|id,
data=newdata,
na.action=na.exclude)
summary(diffgrowth.lme)
## Linear mixed-effects model fit by REML
## Data: newdata
## AIC BIC logLik
## -24905.36 -24887.43 12455.68
##
## Random effects:
## Formula: ~1 | id
## (Intercept) Residual
## StdDev: 0.09356322 0.003155285
##
## Fixed effects: dx_est ~ 1
## Value Std.Error DF t-value p-value
## (Intercept) 0.2058621 0.01708233 2880 12.05118 0
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.697895286 -0.700117165 0.004873065 0.650118543 3.694738110
##
## Number of Observations: 2910
## Number of Groups: 30
And, better for generalization and where we are headed, using nlme()
.
#fitting differential model
diffgrowth.nlme <- nlme(dx_est ~ a ,
fixed = a ~ 1,
random = a ~ 1|id,
start = c(a = .2),
data=newdata,
na.action=na.exclude,
verbose=TRUE)
##
## **Iteration 1
## LME step: Loglik: 12458.84, nlminb iterations: 1
## reStruct parameters:
## id
## -3.372607
## PNLS step: RSS = 0.02897145
## fixed effects: 0.2058621
## iterations: 2
## Convergence crit. (must all become <= tolerance = 1e-05):
## fixed reStruct
## 2.847596e-02 1.104576e-09
##
## **Iteration 2
## LME step: Loglik: 12458.84, nlminb iterations: 1
## reStruct parameters:
## id
## -3.372607
## PNLS step: RSS = 0.02897145
## fixed effects: 0.2058621
## iterations: 1
## Convergence crit. (must all become <= tolerance = 1e-05):
## fixed reStruct
## 0 0
summary(diffgrowth.nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: dx_est ~ a
## Data: newdata
## AIC BIC logLik
## -24911.68 -24893.75 12458.84
##
## Random effects:
## Formula: a ~ 1 | id
## a Residual
## StdDev: 0.09199059 0.003155285
##
## Fixed effects: a ~ 1
## Value Std.Error DF t-value p-value
## a 0.2058621 0.0167981 2880 12.25509 0
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.697902321 -0.700119258 0.004866461 0.650136546 3.694732854
##
## Number of Observations: 2910
## Number of Groups: 30
Once we have obtained the model parameters, we can use those to obtain predicted trajectories.
We do the four steps as usual.
As a check, let us look at the equation used in the fitting process.
#to see equation to put into ODE function
diffgrowth.nlme$call$model
## dx_est ~ a
We then code this equation into a function (by replacing the ~
with an =
).
#rewriting diffeq model as function
model.constant = function(times, y, parms) {
## times = current times
## y = current state vector: in this case the 0th derivative, c(x)
## parms = vector of model parameters.
with(as.list(c(y,parms)),{
derivs = c(
dx = a
)
## Return
list(derivs)
})
}
#Initial Values
initvalues.1 <- c(x = 0.1)
#Model Parameters
#parms.1 <- c(a = 0.2)
#Times over which to integrate
times <- seq(0, 100, 1)
Pulling the estimated Model Parameters directly from the fitted model nlme object
#extracting values of the parameters
parms.est <- summary(diffgrowth.nlme)$tTable[,"Value"]
#attaching names
names(parms.est) <- rownames(summary(diffgrowth.nlme)$tTable)
parms.est
## a
## 0.2058621
In this case we use the parmeters that were extracted from the nlme fitted object.
out <- ode(y = initvalues.1, times = times, func = model.constant, parms = parms.est)
This produces an object of class deSolve that can then be examined, summarized, used for diagnosis, and plotted.
#convert deSolve object to data.frame
outplot=as.data.frame(out)
ggplot(data=outplot, aes(x=time, y=x)) +
geom_point() +
geom_line() +
xlab("Time") +
ylab("Estimated x") +
ggtitle("Estimated Constant Growth")
Yay!. We have differentiated the data (estiamtes obtained using GLLA), fit the differntial equation model, used the parameter estimates to generate the prototypical trajectory.
Now, lets try to do it for a more complicated coupled damped linear oscillator model - a 2nd order differential equation model.
Our coupled equations are …
\[\frac{d^2x}{dt^2} = \eta_x x + \zeta_x\frac{dx}{dt} + \gamma_1z\] \[\frac{d^2z}{dt^2} = \eta_z z + \zeta_z \frac{dz}{dt} + \gamma_2 x\] We do the four steps as usual.
#Writing out the model as a function
model.coupleosc <- function(times, y , parms) {
with(as.list(c(y, parms)), {
derivs = c(
dx = dx,
d2x = eta_x*x + zeta_x*dx + gamma1*z,
dz = dz,
d2z = eta_z*z + zeta_z*dz + gamma2*x
)
return(list(derivs))
})
}
parms.coupleosc <- c(eta_x = -0.2, # frequency of x
zeta_x = -.1, # damping of x
eta_z = -0.05, # frequency of z
zeta_z = -0.3, # damping of z
gamma1 = -.1, # coupling parameter 1 (z on d2x)
gamma2 = +.1) # coupling parameter 2 (x on d2z)
initvalues.coupleosc <- c(x = 3, dx = 0, z = 2, dz = 0)
times <- seq(0, 100, by = 1)
Actually, instead of generating just one trajectory, we generate trajectories for 30 individuals. Making multi-person data.
#Use a list if many unknown number of outputs
plist <- cbind(id = 1:30,
eta_x = rnorm(30,mean=-0.2,sd=.01), # frequency of x
zeta_x = rnorm(30,mean=-0.1,sd=.01), # damping of x
eta_z = rnorm(30,mean=-0.05,sd=.01), # frequency of z
zeta_z = rnorm(30,mean=-0.3,sd=.01), # damping of z
gamma1 = rnorm(30,mean=-0.0,sd=.00), # coupling parameter 1 (z on d2x)
gamma2 = rnorm(30,mean=+0.0,sd=.00)) # coupling parameter 2 (x on d2z))
#creatiing holder object
outlist <- list()
#Loop through the matrix of parameter values
for(i in 1:nrow(plist)){
outlist[[i]] <- ode(y = initvalues.coupleosc, times = times, func = model.coupleosc, parms = plist[i,-1])
}
#creating data.frame holder
outdata <- data.frame()
#converting plist and outlist into a combined data frame
for(i in 1:nrow(plist)){
idata <- (cbind(id = plist[i,"id"], outlist[[i]]))
outdata <- rbind(outdata,idata)
}
#plotting
ggplot(data = outdata, aes(x = time, group = id)) +
#geom_point() +
geom_line(aes(y=x), color="red") +
geom_line(aes(y=z), color="black") +
xlab("Time") +
ylab("x") +
scale_x_continuous(breaks=seq(0,100,by=10)) +
facet_wrap(~ id)
Yay!
Using these data, we then go ahead and fit the model using the Differentiate The Data approach.
Lets estimate the derivatives.
#copy of data
observedata <- outdata
# This is the for-loop that will go through the data and calculate derivatives.
# it goes through all of the elements of 'curr_data', uses GLLA
data_list <- plist[,"id"]
#create open object
newdata <- data.frame()
for(i in 1:length(data_list)){
#i <- 1 #for testing
curr_data = observedata[which(observedata$id == data_list[i]),]
#estimate derivs for x variable
derivs_x = GLLA(x = curr_data[,'x'], embed = 5, tau = 1, deltaT = 1, order = 2)
colnames(derivs_x) = c('x_est', 'dx_est', 'd2x_est')
#estimate derivs for z variable
derivs_z = GLLA(x = curr_data[,'z'], embed = 5, tau = 1, deltaT = 1, order = 2)
colnames(derivs_z) = c('z_est', 'dz_est', 'd2z_est')
derivs_plus = cbind(curr_data, derivs_x, derivs_z)
newdata <- rbind(newdata,derivs_plus)
}
Ok - now we have recreated the “observed” (estimated) derivatives
head(newdata)
## id time x dx z dz x_est dx_est
## 1 1 0 3.0000000 0.0000000 2.000000 0.00000000 NA NA
## 2 1 1 2.7203321 -0.5405122 1.954484 -0.08649864 NA NA
## 3 1 2 1.9705842 -0.9274688 1.835472 -0.14759779 1.9677048 -0.8117632
## 4 1 3 0.9355407 -1.1054833 1.666521 -0.18699907 0.9363783 -0.9831054
## 5 1 4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913 -0.9570753
## 6 1 5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331 -0.7583599
## d2x_est z_est dz_est d2z_est
## 1 NA NA NA NA
## 2 NA NA NA NA
## 3 -0.27569742 1.835791 -0.1353033 -0.051002542
## 4 -0.06970947 1.666834 -0.1768011 -0.031149706
## 5 0.12825848 1.467760 -0.2002372 -0.014911318
## 6 0.28275729 1.254750 -0.2091129 -0.002088306
Check that these correlate well with the actual derivatives.
cor(newdata$dz,newdata$dz_est, use = "pairwise.complete.obs")
## [1] 0.9996172
Indeed.
To fit the bivariate model using nlme(), we need to stack the data into double entry and include dummy variables.
names(newdata)
## [1] "id" "time" "x" "dx" "z" "dz" "x_est"
## [8] "dx_est" "d2x_est" "z_est" "dz_est" "d2z_est"
newdata2 <- melt(data=newdata,
id.vars=c("id","time","x","dx","z","dz",
"x_est","dx_est",#"d2x_est",
"z_est","dz_est"#"d2z_est"
),
na.rm=FALSE, value.name="d2_var")
#reordering for convenience
newdata2 <- newdata2[order(newdata2$id,newdata2$time,newdata2$variable),]
#look at updated data set
head(newdata2, 12)
## id time x dx z dz x_est
## 1 1 0 3.0000000 0.0000000 2.000000 0.00000000 NA
## 3031 1 0 3.0000000 0.0000000 2.000000 0.00000000 NA
## 2 1 1 2.7203321 -0.5405122 1.954484 -0.08649864 NA
## 3032 1 1 2.7203321 -0.5405122 1.954484 -0.08649864 NA
## 3 1 2 1.9705842 -0.9274688 1.835472 -0.14759779 1.9677048
## 3033 1 2 1.9705842 -0.9274688 1.835472 -0.14759779 1.9677048
## 4 1 3 0.9355407 -1.1054833 1.666521 -0.18699907 0.9363783
## 3034 1 3 0.9355407 -1.1054833 1.666521 -0.18699907 0.9363783
## 5 1 4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913
## 3035 1 4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913
## 6 1 5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331
## 3036 1 5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331
## dx_est z_est dz_est variable d2_var
## 1 NA NA NA d2x_est NA
## 3031 NA NA NA d2z_est NA
## 2 NA NA NA d2x_est NA
## 3032 NA NA NA d2z_est NA
## 3 -0.8117632 1.835791 -0.1353033 d2x_est -0.275697423
## 3033 -0.8117632 1.835791 -0.1353033 d2z_est -0.051002542
## 4 -0.9831054 1.666834 -0.1768011 d2x_est -0.069709471
## 3034 -0.9831054 1.666834 -0.1768011 d2z_est -0.031149706
## 5 -0.9570753 1.467760 -0.2002372 d2x_est 0.128258481
## 3035 -0.9570753 1.467760 -0.2002372 d2z_est -0.014911318
## 6 -0.7583599 1.254750 -0.2091129 d2x_est 0.282757285
## 3036 -0.7583599 1.254750 -0.2091129 d2z_est -0.002088306
#adding the two dummy indicators
newdata2$var_x <- ifelse(newdata2$variable=="d2x_est", 1, 0)
newdata2$var_z <- ifelse(newdata2$variable=="d2z_est", 1, 0)
#adding another indicator
newdata2$varnum <- as.numeric(factor(newdata2$variable))
#look at updated data set
head(newdata2,12)
## id time x dx z dz x_est
## 1 1 0 3.0000000 0.0000000 2.000000 0.00000000 NA
## 3031 1 0 3.0000000 0.0000000 2.000000 0.00000000 NA
## 2 1 1 2.7203321 -0.5405122 1.954484 -0.08649864 NA
## 3032 1 1 2.7203321 -0.5405122 1.954484 -0.08649864 NA
## 3 1 2 1.9705842 -0.9274688 1.835472 -0.14759779 1.9677048
## 3033 1 2 1.9705842 -0.9274688 1.835472 -0.14759779 1.9677048
## 4 1 3 0.9355407 -1.1054833 1.666521 -0.18699907 0.9363783
## 3034 1 3 0.9355407 -1.1054833 1.666521 -0.18699907 0.9363783
## 5 1 4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913
## 3035 1 4 -0.1664204 -1.0630371 1.467465 -0.20842021 -0.1623913
## 6 1 5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331
## 3036 1 5 -1.1266927 -0.8301521 1.254482 -0.21542970 -1.1205331
## dx_est z_est dz_est variable d2_var var_x var_z
## 1 NA NA NA d2x_est NA 1 0
## 3031 NA NA NA d2z_est NA 0 1
## 2 NA NA NA d2x_est NA 1 0
## 3032 NA NA NA d2z_est NA 0 1
## 3 -0.8117632 1.835791 -0.1353033 d2x_est -0.275697423 1 0
## 3033 -0.8117632 1.835791 -0.1353033 d2z_est -0.051002542 0 1
## 4 -0.9831054 1.666834 -0.1768011 d2x_est -0.069709471 1 0
## 3034 -0.9831054 1.666834 -0.1768011 d2z_est -0.031149706 0 1
## 5 -0.9570753 1.467760 -0.2002372 d2x_est 0.128258481 1 0
## 3035 -0.9570753 1.467760 -0.2002372 d2z_est -0.014911318 0 1
## 6 -0.7583599 1.254750 -0.2091129 d2x_est 0.282757285 1 0
## 3036 -0.7583599 1.254750 -0.2091129 d2z_est -0.002088306 0 1
## varnum
## 1 1
## 3031 2
## 2 1
## 3032 2
## 3 1
## 3033 2
## 4 1
## 3034 2
## 5 1
## 3035 2
## 6 1
## 3036 2
Now we are ready to fit the coupled model using nlme()
.
#the coupled model
model.coupleosc_2 <- nlme(d2_var ~ eta_x*var_x*x_est + zeta_x*var_x*dx_est + gamma1*var_x*z_est +
eta_z*var_z*z_est + zeta_z*var_z*dz_est + gamma2*var_z*x_est,
fixed = eta_x + zeta_x + gamma1 +
eta_z + zeta_z + gamma2 ~ 1,
random = eta_x +
eta_z ~ 1|id,
weights = varIdent(form = ~ 1|varnum),
corr = corSymm(form = ~ varnum|id/time),
start = c(eta_x = -0.2, # frequency of x
zeta_x = -.1, # damping of x
gamma1 = -.1,
eta_z = -0.05, # frequency of x
zeta_z = -.3, # damping of x
gamma2 = +.1),
data=newdata2,
na.action = na.exclude,
verbose=TRUE)
##
## **Iteration 1
## LME step: Loglik: 32682.03, nlminb iterations: 32
## reStruct parameters:
## id1 id2 id3
## -1.231235640 -1.334228264 -0.008572995
## corStruct parameters:
## [1] 0.339888
## varStruct parameters:
## [1] -2.224973
## PNLS step: RSS = 0.03990564
## fixed effects: -0.1851459 -0.0947971 0.0005576474 -0.04807466 -0.2903971 -1.5371e-05
## iterations: 2
## Convergence crit. (must all become <= tolerance = 1e-05):
## fixed reStruct corStruct varStruct
## 6506.7558730 0.2810968 1.0000000 1.0000000
##
## **Iteration 2
## LME step: Loglik: 32682.03, nlminb iterations: 1
## reStruct parameters:
## id1 id2 id3
## -1.231236 -1.334228 -0.008573
## corStruct parameters:
## [1] 0.339888
## varStruct parameters:
## [1] -2.224973
## PNLS step: RSS = 0.03990564
## fixed effects: -0.1851459 -0.0947971 0.0005576474 -0.04807466 -0.2903971 -1.5371e-05
## iterations: 1
## Convergence crit. (must all become <= tolerance = 1e-05):
## fixed reStruct corStruct varStruct
## 0 0 0 0
summary(model.coupleosc_2)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: d2_var ~ eta_x * var_x * x_est + zeta_x * var_x * dx_est + gamma1 * var_x * z_est + eta_z * var_z * z_est + zeta_z * var_z * dz_est + gamma2 * var_z * x_est
## Data: newdata2
## AIC BIC logLik
## -65340.05 -65260.02 32682.03
##
## Random effects:
## Formula: list(eta_x ~ 1, eta_z ~ 1)
## Level: id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## eta_x 0.008974380 eta_x
## eta_z 0.009942684 0.033
## Residual 0.002618519
##
## Correlation Structure: General
## Formula: ~varnum | id/time
## Parameter estimate(s):
## Correlation:
## 1
## 2 -0.261
## Variance function:
## Structure: Different standard deviations per stratum
## Formula: ~1 | varnum
## Parameter estimates:
## 1 2
## 1.0000000 0.1080703
## Fixed effects: eta_x + zeta_x + gamma1 + eta_z + zeta_z + gamma2 ~ 1
## Value Std.Error DF t-value p-value
## eta_x -0.18514592 0.0016414336 5785 -112.7953 0.0000
## zeta_x -0.09479710 0.0002093646 5785 -452.7848 0.0000
## gamma1 0.00055765 0.0001537969 5785 3.6259 0.0003
## eta_z -0.04807466 0.0018164575 5785 -26.4662 0.0000
## zeta_z -0.29039713 0.0001971931 5785 -1472.6533 0.0000
## gamma2 -0.00001537 0.0000093420 5785 -1.6454 0.0999
## Correlation:
## eta_x zeta_x gamma1 eta_z zeta_z
## zeta_x 0.011
## gamma1 0.010 0.522
## eta_z 0.032 0.002 -0.001
## zeta_z 0.001 0.108 0.058 0.014
## gamma2 -0.012 -0.040 -0.044 -0.004 -0.369
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -9.621405e+00 -4.371798e-02 1.735332e-05 4.725185e-02 7.948201e+00
##
## Number of Observations: 5820
## Number of Groups: 30
The parameter estiamtes are quite close to the parameters used to generate the data.
Lets now extract the parameters and plot the protypical trajectory.
Getting equations from the nlme model.
model.coupleosc_2$call$model
## d2_var ~ eta_x * var_x * x_est + zeta_x * var_x * dx_est + gamma1 *
## var_x * z_est + eta_z * var_z * z_est + zeta_z * var_z *
## dz_est + gamma2 * var_z * x_est
This is not in the most useful form, but can be adapted for making the differential equation function. We follow the original equations more directly, keeping in mind that the function must be written as a set of first-order equations.
#Writing out the model as a function
model.coupleosc <- function(times, y , parms) {
with(as.list(c(y, parms)), {
derivs = c(
dx = dx,
d2x = eta_x*x + zeta_x*dx + gamma1*z,
dz = dz,
d2z = eta_z*z + zeta_z*dz + gamma2*x
)
return(list(derivs))
})
}
#Pulling the estimated Model Parameters from the nlme object
#extracting values of the parameters
parms.est <- summary(model.coupleosc_2)$tTable[,"Value"]
#attaching names
names(parms.est) <- rownames(summary(model.coupleosc_2)$tTable)
parms.est
## eta_x zeta_x gamma1 eta_z zeta_z
## -0.1851459197 -0.0947971042 0.0005576474 -0.0480746556 -0.2903971324
## gamma2
## -0.0000153710
#Setting initial values
initvalues.coupleosc <- c(x = 3, dx = 0, z = 2, dz = 0)
#setting times to integrate over.
times <- seq(0, 100, by = 1)
out <- ode(y = initvalues.coupleosc, times = times, func = model.coupleosc, parms = parms.est)
#plot(out,outlist, lwd=2)
#converting to data.frame
outdata <- as.data.frame(out)
#plotting
ggplot(data = outdata[which(outdata$time <=100),], aes(x = time)) +
#geom_point() +
geom_line(aes(y=x), color="red") +
geom_line(aes(y=z), color="black") +
xlab("Time") +
ylab("x=red, z=black") +
scale_x_continuous(breaks=seq(0,100,by=10))
Cool! - It is possible to set it all up in a way that can do the fitting and plotting in relatively straightforward way. We are making progress, and finding a relatively straightforward path through all the steps in the model estimation and results communication.
Thank you for playing!