This tutorial walks through the fitting of linear growth modeling in several different frameworks (e.g., multilevel modeling framework, structural equation modeling framework), and demonstrates these models using different R packages (knowing how to fit the models in different packages can be helpful when trying to fit more complex models as each packages as its own advantages and limitations). In this tutorial, we will be using a sample data set that includes repeated measures of individuals’ mathematics achievement through elementary and middle school.
The code and example provided in this tutorial are from Chapter 3 of Grimm, Ram, and Estabrook (2016), with a few additions in code and commentary. Please refer to the chapter for further interpretations and insights about the analyses.
This tutorial provides line-by-line code to
#set filepath
filepath<-"https://quantdev.ssri.psu.edu/sites/qdev/files/nlsy_math_long_R.csv"
#read in the .csv file using the url() function
nlsy_math_long <- read.csv(file=url(filepath),header=TRUE)
#add names the columns of the data set
names(nlsy_math_long) <- c('id', 'female', 'low_birth_weight', 'anti_k1', 'math', 'grade', 'occ', 'age', 'men', 'spring', 'anti')
#view the first few observations in the data set
head(nlsy_math_long)
## id female low_birth_weight anti_k1 math grade occ age men spring anti
## 1 201 1 0 0 38 3 2 111 0 1 0
## 2 201 1 0 0 55 5 3 135 1 1 0
## 3 303 1 0 1 26 2 2 121 0 1 2
## 4 303 1 0 1 33 5 3 145 0 1 2
## 5 2702 0 0 0 56 2 2 100 NA 1 0
## 6 2702 0 0 0 58 4 3 125 NA 1 2
#allows for objects in the data set to be accessed be name
attach(nlsy_math_long)
#calling the libraries we will need throughout this tutorial
library(ggplot2)
library(lavaan)
library(lme4)
library(nlme)
As covered in the Chapter 2 tutorial, it is important to plot the data to obtain a better understanding of the structure and form of the observed phenomenon. Here, we want to examine the data to make sure a growth model would be an appropriate analysis for the data (i.e., we need to check that there is in fact growth to model).
#creating a plot and assigning it to an object
plot_obs <- ggplot(data=nlsy_math_long, #data set
aes(x = grade, y = math, group = id)) + #calling variables
geom_line() + #adding lines to plot
theme_bw() + #changing style/background
scale_x_continuous(breaks = 2:8, name = "Grade") + #creating breaks in the x-axis and labeling the x-axis
scale_y_continuous(name = "PIAT Mathematics") #creating breaks in the y-axis and labeling the y-axis
#print the object (plot)
print(plot_obs)
There appears to be growth in mathematics scores over time, as can be seen in our plot above. To begin the analysis we will fit a no growth model. This no growth model will act as our “null” model to which to compare later models.
#fitting no growth model and assigning it to an object
ng.math.lme <- lme(math ~ 1, random= ~1 |id, data = nlsy_math_long, method="ML")
#obtaining summary of the model using the object we just created
summary(ng.math.lme)
## Linear mixed-effects model fit by maximum likelihood
## Data: nlsy_math_long
## AIC BIC logLik
## 17497.9 17515.02 -8745.952
##
## Random effects:
## Formula: ~1 | id
## (Intercept) Residual
## StdDev: 6.849591 10.80195
##
## Fixed effects: math ~ 1
## Value Std.Error DF t-value p-value
## (Intercept) 45.91468 0.3237961 1289 141.8012 0
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.79834566 -0.57828004 0.05407981 0.57885522 2.52690025
##
## Number of Observations: 2221
## Number of Groups: 932
#obtaining confidence intervals of model parameters
intervals(ng.math.lme)
## Approximate 95% confidence intervals
##
## Fixed effects:
## lower est. upper
## (Intercept) 45.2796 45.91468 46.54977
## attr(,"label")
## [1] "Fixed effects:"
##
## Random Effects:
## Level: id
## lower est. upper
## sd((Intercept)) 6.19195 6.849591 7.577078
##
## Within-group standard error:
## lower est. upper
## 10.39711 10.80195 11.22255
#fitting no growth model and assigning it to an object
ng.math.nlme <- nlme(math ~ beta_1 + d_1i,
data=nlsy_math_long,
fixed=beta_1~1,
random=d_1i~1,
group=~id,
start=c(beta_1=40),
na.action = "na.exclude")
#obtaining summary of the model using the object we just created
summary(ng.math.nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: math ~ beta_1 + d_1i
## Data: nlsy_math_long
## AIC BIC logLik
## 17497.9 17515.02 -8745.952
##
## Random effects:
## Formula: d_1i ~ 1 | id
## d_1i Residual
## StdDev: 6.849581 10.80196
##
## Fixed effects: beta_1 ~ 1
## Value Std.Error DF t-value p-value
## beta_1 45.91468 0.323796 1289 141.8013 0
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.79834502 -0.57827917 0.05408079 0.57885591 2.52690020
##
## Number of Observations: 2221
## Number of Groups: 932
#fitting no growth model and assigning it to an object
ng.math.nlme <- nlme(math~b_1i,
data=nlsy_math_long,
fixed=b_1i~1,
random=b_1i~1|id,
start=c(b_1i=40),
na.action = "na.exclude")
#obtaining summary of the model using the object we just created
summary(ng.math.nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: math ~ b_1i
## Data: nlsy_math_long
## AIC BIC logLik
## 17497.9 17515.02 -8745.952
##
## Random effects:
## Formula: b_1i ~ 1 | id
## b_1i Residual
## StdDev: 6.849582 10.80196
##
## Fixed effects: b_1i ~ 1
## Value Std.Error DF t-value p-value
## b_1i 45.91468 0.323796 1289 141.8013 0
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.79834512 -0.57827919 0.05408079 0.57885592 2.52690029
##
## Number of Observations: 2221
## Number of Groups: 932
#fitting no growth model and assigning it to an object
ng.math.lmer <- lmer(math ~ 1 + (1 | id),
data = nlsy_math_long,
REML = FALSE,
na.action = "na.exclude")
#obtaining summary of the model using the object we just created
summary(ng.math.lmer)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: math ~ 1 + (1 | id)
## Data: nlsy_math_long
##
## AIC BIC logLik deviance df.resid
## 17497.9 17515.0 -8746.0 17491.9 2218
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.79835 -0.57828 0.05408 0.57886 2.52690
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 46.92 6.85
## Residual 116.68 10.80
## Number of obs: 2221, groups: id, 932
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 45.9147 0.3237 141.8
First we need to reshape the data from long to wide.
names(nlsy_math_long)
## [1] "id" "female" "low_birth_weight"
## [4] "anti_k1" "math" "grade"
## [7] "occ" "age" "men"
## [10] "spring" "anti"
# names of variables in data
#c("id","female","low_birth_weight","anti_k1","math","grade","occ","age","men","spring","anti")
nlsy_math_wide <- reshape(data=nlsy_math_long,
timevar=c("grade"),
idvar=c("id"),
v.names=c("math","occ","age","men","spring","anti"),
direction="wide", sep="_")
#view the first few observations in the data set
head(nlsy_math_wide)
## id female low_birth_weight anti_k1 math_3 occ_3 age_3 men_3 spring_3
## 1 201 1 0 0 38 2 111 0 1
## 3 303 1 0 1 NA NA NA NA NA
## 5 2702 0 0 0 NA NA NA NA NA
## 8 4303 1 0 0 41 2 115 0 0
## 10 5002 0 0 4 NA NA NA NA NA
## 13 5005 1 0 0 NA NA NA NA NA
## anti_3 math_5 occ_5 age_5 men_5 spring_5 anti_5 math_2 occ_2 age_2
## 1 0 55 3 135 1 1 0 NA NA NA
## 3 NA 33 3 145 0 1 2 26 2 121
## 5 NA NA NA NA NA NA NA 56 2 100
## 8 1 NA NA NA NA NA NA NA NA NA
## 10 NA NA NA NA NA NA NA NA NA NA
## 13 NA NA NA NA NA NA NA 35 2 93
## men_2 spring_2 anti_2 math_4 occ_4 age_4 men_4 spring_4 anti_4 math_8
## 1 NA NA NA NA NA NA NA NA NA NA
## 3 0 1 2 NA NA NA NA NA NA NA
## 5 NA 1 0 58 3 125 NA 1 2 80
## 8 NA NA NA 58 3 135 0 1 2 NA
## 10 NA NA NA 46 2 117 NA 1 4 66
## 13 0 0 2 50 3 115 0 1 2 59
## occ_8 age_8 men_8 spring_8 anti_8 math_6 occ_6 age_6 men_6 spring_6
## 1 NA NA NA NA NA NA NA NA NA NA
## 3 NA NA NA NA NA NA NA NA NA NA
## 5 4 173 NA 1 2 NA NA NA NA NA
## 8 NA NA NA NA NA NA NA NA NA NA
## 10 4 167 NA 1 5 54 3 145 NA 0
## 13 5 163 1 1 3 60 4 139 0 1
## anti_6 math_7 occ_7 age_7 men_7 spring_7 anti_7
## 1 NA NA NA NA NA NA NA
## 3 NA NA NA NA NA NA NA
## 5 NA NA NA NA NA NA NA
## 8 NA NA NA NA NA NA NA
## 10 1 NA NA NA NA NA NA
## 13 1 NA NA NA NA NA NA
#might engage in a bit of reordering for easy viewing
Now, we set up the no growth model as a lavaan model
#writing out no growth model in full SEM way
ng.math.lavaan_model <- '
# latent variable definitions
#intercept
eta_1 =~ 1*math_2
eta_1 =~ 1*math_3
eta_1 =~ 1*math_4
eta_1 =~ 1*math_5
eta_1 =~ 1*math_6
eta_1 =~ 1*math_7
eta_1 =~ 1*math_8
# factor variances
eta_1 ~~ eta_1
# covariances among factors
#none (only 1 factor)
# factor means
eta_1 ~ start(30)*1
# manifest variances (made equivalent by naming theta)
math_2 ~~ theta*math_2
math_3 ~~ theta*math_3
math_4 ~~ theta*math_4
math_5 ~~ theta*math_5
math_6 ~~ theta*math_6
math_7 ~~ theta*math_7
math_8 ~~ theta*math_8
# manifest means (fixed at zero)
math_2 ~ 0*1
math_3 ~ 0*1
math_4 ~ 0*1
math_5 ~ 0*1
math_6 ~ 0*1
math_7 ~ 0*1
math_8 ~ 0*1
' #end of model definition
And fit the model to the data …
#estimating the model using sem() function
ng.math.lavaan_fit <- sem(ng.math.lavaan_model,
data = nlsy_math_wide,
meanstructure = TRUE,
estimator = "ML",
missing = "fiml")
## Warning in lav_data_full(data = data, group = group, group.label =
## group.label, : lavaan WARNING: due to missing values, some pairwise
## combinations have less than 10% coverage
summary(ng.math.lavaan_fit)
## lavaan (0.5-22) converged normally after 18 iterations
##
## Number of observations 932
##
## Number of missing patterns 60
##
## Estimator ML
## Minimum Function Test Statistic 1759.002
## Degrees of freedom 32
## P-value (Chi-square) 0.000
##
## Parameter Estimates:
##
## Information Observed
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## eta_1 =~
## math_2 1.000
## math_3 1.000
## math_4 1.000
## math_5 1.000
## math_6 1.000
## math_7 1.000
## math_8 1.000
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## eta_1 45.915 0.324 141.721 0.000
## .math_2 0.000
## .math_3 0.000
## .math_4 0.000
## .math_5 0.000
## .math_6 0.000
## .math_7 0.000
## .math_8 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## eta_1 46.917 4.832 9.709 0.000
## .math_2 (thet) 116.682 4.548 25.656 0.000
## .math_3 (thet) 116.682 4.548 25.656 0.000
## .math_4 (thet) 116.682 4.548 25.656 0.000
## .math_5 (thet) 116.682 4.548 25.656 0.000
## .math_6 (thet) 116.682 4.548 25.656 0.000
## .math_7 (thet) 116.682 4.548 25.656 0.000
## .math_8 (thet) 116.682 4.548 25.656 0.000
fitMeasures(ng.math.lavaan_fit)
## npar fmin chisq
## 3.000 0.944 1759.002
## df pvalue baseline.chisq
## 32.000 0.000 862.333
## baseline.df baseline.pvalue cfi
## 21.000 0.000 0.000
## tli nnfi rfi
## -0.347 -0.347 1.000
## nfi pnfi ifi
## -1.040 -1.584 -1.080
## rni logl unrestricted.logl
## -1.053 -8745.952 -7866.451
## aic bic ntotal
## 17497.903 17512.415 932.000
## bic2 rmsea rmsea.ci.lower
## 17502.888 0.241 0.231
## rmsea.ci.upper rmsea.pvalue rmr
## 0.250 0.000 38.332
## rmr_nomean srmr srmr_bentler
## 38.332 0.480 0.480
## srmr_bentler_nomean srmr_bollen srmr_bollen_nomean
## 0.362 0.624 0.339
## srmr_mplus srmr_mplus_nomean cn_05
## 0.675 0.444 25.476
## cn_01 gfi agfi
## 29.339 0.937 0.931
## pgfi mfi ecvi
## 0.856 0.396 NA
#Other summaries
#parameterEstimates(ng.math.lavaan_fit)
#inspect(ng.math.lavaan_fit, what="est")
Next, we fit a linear growth model using and comparing several different functions and packages.
#creating object with what was originally grade 2 as now the zero (starting) point
grade_c2 <- grade - 2
#fitting linear growth model and assigning it to an object
lg.math.lme <- lme(math ~ grade_c2, random= ~ grade_c2 |id, data = nlsy_math_long, method="ML")
#obtaining summary of the model using the object we just created
summary(lg.math.lme)
## Linear mixed-effects model fit by maximum likelihood
## Data: nlsy_math_long
## AIC BIC logLik
## 15949.39 15983.62 -7968.693
##
## Random effects:
## Formula: ~grade_c2 | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 8.035040 (Intr)
## grade_c2 0.855900 -0.026
## Residual 6.019114
##
## Fixed effects: math ~ grade_c2
## Value Std.Error DF t-value p-value
## (Intercept) 35.26736 0.3551568 1288 99.30081 0
## grade_c2 4.33933 0.0873767 1288 49.66231 0
## Correlation:
## (Intr)
## grade_c2 -0.532
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.030277540 -0.525876497 0.001633029 0.540751608 2.542250503
##
## Number of Observations: 2221
## Number of Groups: 932
#fitting linear growth model and assigning it to an object
lg.math.nlme <- nlme(math~(beta_1+d_1i)+(beta_2+d_2i)*(grade-2),
data=nlsy_math_long,
fixed=beta_1+beta_2~1,
random=d_1i+d_2i~1,
group=~id,
start=c(beta_1=35,beta_2=4),
na.action = "na.exclude")
#obtaining summary of the model using the object we just created
summary (lg.math.nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: math ~ (beta_1 + d_1i) + (beta_2 + d_2i) * (grade - 2)
## Data: nlsy_math_long
## AIC BIC logLik
## 15949.39 15983.62 -7968.693
##
## Random effects:
## Formula: list(d_1i ~ 1, d_2i ~ 1)
## Level: id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## d_1i 8.0350382 d_1i
## d_2i 0.8558994 -0.026
## Residual 6.0191140
##
## Fixed effects: beta_1 + beta_2 ~ 1
## Value Std.Error DF t-value p-value
## beta_1 35.26736 0.3551568 1288 99.30082 0
## beta_2 4.33933 0.0873767 1288 49.66231 0
## Correlation:
## beta_1
## beta_2 -0.532
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.030277731 -0.525876489 0.001632883 0.540751587 2.542250204
##
## Number of Observations: 2221
## Number of Groups: 932
#fitting linear growth model and assigning it to an object
lg.math.nlme <- nlme(math~b_1i+b_2i*(grade-2),
data=nlsy_math_long,
fixed=b_1i+b_2i~1,
random=b_1i+b_2i~1|id,
start=c(b_1i=35, b_2i=4))
#obtaining summary of the model using the object we just created
summary(lg.math.nlme)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: math ~ b_1i + b_2i * (grade - 2)
## Data: nlsy_math_long
## AIC BIC logLik
## 15949.39 15983.62 -7968.693
##
## Random effects:
## Formula: list(b_1i ~ 1, b_2i ~ 1)
## Level: id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## b_1i 8.0350377 b_1i
## b_2i 0.8558995 -0.026
## Residual 6.0191139
##
## Fixed effects: b_1i + b_2i ~ 1
## Value Std.Error DF t-value p-value
## b_1i 35.26736 0.3551567 1288 99.30083 0
## b_2i 4.33933 0.0873767 1288 49.66231 0
## Correlation:
## b_1i
## b_2i -0.532
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.030277750 -0.525876492 0.001632883 0.540751591 2.542250220
##
## Number of Observations: 2221
## Number of Groups: 932
#fitting linear growth model and assigning it to an object
lg.math.lmer <- lmer(math ~ 1 + grade_c2 + (1 + grade_c2 | id), data = nlsy_math_long, REML = FALSE)
#obtaining summary of the model using the object we just created
summary(lg.math.lmer)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: math ~ 1 + grade_c2 + (1 + grade_c2 | id)
## Data: nlsy_math_long
##
## AIC BIC logLik deviance df.resid
## 15949.4 15983.6 -7968.7 15937.4 2215
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.03028 -0.52588 0.00163 0.54075 2.54225
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 64.5618 8.0350
## grade_c2 0.7326 0.8559 -0.03
## Residual 36.2297 6.0191
## Number of obs: 2221, groups: id, 932
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 35.26736 0.35500 99.35
## grade_c2 4.33933 0.08734 49.68
##
## Correlation of Fixed Effects:
## (Intr)
## grade_c2 -0.532
#writing out linear growth model in full SEM way
lg.math.lavaan_model <- '
# latent variable definitions
#intercept (note intercept is a reserved term)
eta_1 =~ 1*math_2
eta_1 =~ 1*math_3
eta_1 =~ 1*math_4
eta_1 =~ 1*math_5
eta_1 =~ 1*math_6
eta_1 =~ 1*math_7
eta_1 =~ 1*math_8
#linear slope (note intercept is a reserved term)
eta_2 =~ 0*math_2
eta_2 =~ 1*math_3
eta_2 =~ 2*math_4
eta_2 =~ 3*math_5
eta_2 =~ 4*math_6
eta_2 =~ 5*math_7
eta_2 =~ 6*math_8
# factor variances
eta_1 ~~ eta_1
eta_2 ~~ eta_2
# covariances among factors
eta_1 ~~ eta_2
# factor means
eta_1 ~ start(35)*1
eta_2 ~ start(4)*1
# manifest variances (made equivalent by naming theta)
math_2 ~~ theta*math_2
math_3 ~~ theta*math_3
math_4 ~~ theta*math_4
math_5 ~~ theta*math_5
math_6 ~~ theta*math_6
math_7 ~~ theta*math_7
math_8 ~~ theta*math_8
# manifest means (fixed at zero)
math_2 ~ 0*1
math_3 ~ 0*1
math_4 ~ 0*1
math_5 ~ 0*1
math_6 ~ 0*1
math_7 ~ 0*1
math_8 ~ 0*1
' #end of model definition
And fit the model to the data …
#estimating the model using sem() function
lg.math.lavaan_fit <- sem(lg.math.lavaan_model,
data = nlsy_math_wide,
meanstructure = TRUE,
estimator = "ML",
missing = "fiml")
## Warning in lav_data_full(data = data, group = group, group.label =
## group.label, : lavaan WARNING: due to missing values, some pairwise
## combinations have less than 10% coverage
summary(lg.math.lavaan_fit)
## lavaan (0.5-22) converged normally after 27 iterations
##
## Number of observations 932
##
## Number of missing patterns 60
##
## Estimator ML
## Minimum Function Test Statistic 204.484
## Degrees of freedom 29
## P-value (Chi-square) 0.000
##
## Parameter Estimates:
##
## Information Observed
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|)
## eta_1 =~
## math_2 1.000
## math_3 1.000
## math_4 1.000
## math_5 1.000
## math_6 1.000
## math_7 1.000
## math_8 1.000
## eta_2 =~
## math_2 0.000
## math_3 1.000
## math_4 2.000
## math_5 3.000
## math_6 4.000
## math_7 5.000
## math_8 6.000
##
## Covariances:
## Estimate Std.Err z-value P(>|z|)
## eta_1 ~~
## eta_2 -0.181 1.150 -0.158 0.875
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|)
## eta_1 35.267 0.355 99.229 0.000
## eta_2 4.339 0.088 49.136 0.000
## .math_2 0.000
## .math_3 0.000
## .math_4 0.000
## .math_5 0.000
## .math_6 0.000
## .math_7 0.000
## .math_8 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## eta_1 64.562 5.659 11.408 0.000
## eta_2 0.733 0.327 2.238 0.025
## .math_2 (thet) 36.230 1.867 19.410 0.000
## .math_3 (thet) 36.230 1.867 19.410 0.000
## .math_4 (thet) 36.230 1.867 19.410 0.000
## .math_5 (thet) 36.230 1.867 19.410 0.000
## .math_6 (thet) 36.230 1.867 19.410 0.000
## .math_7 (thet) 36.230 1.867 19.410 0.000
## .math_8 (thet) 36.230 1.867 19.410 0.000
fitMeasures(lg.math.lavaan_fit)
## npar fmin chisq
## 6.000 0.110 204.484
## df pvalue baseline.chisq
## 29.000 0.000 862.333
## baseline.df baseline.pvalue cfi
## 21.000 0.000 0.791
## tli nnfi rfi
## 0.849 0.849 0.828
## nfi pnfi ifi
## 0.763 1.053 0.789
## rni logl unrestricted.logl
## 0.791 -7968.693 -7866.451
## aic bic ntotal
## 15949.386 15978.410 932.000
## bic2 rmsea rmsea.ci.lower
## 15959.354 0.081 0.070
## rmsea.ci.upper rmsea.pvalue rmr
## 0.091 0.000 10.840
## rmr_nomean srmr srmr_bentler
## 10.840 0.121 0.121
## srmr_bentler_nomean srmr_bollen srmr_bollen_nomean
## 0.097 0.174 0.076
## srmr_mplus srmr_mplus_nomean cn_05
## 0.179 0.088 194.967
## cn_01 gfi agfi
## 227.012 0.990 0.988
## pgfi mfi ecvi
## 0.820 0.910 NA
#Other summaries
#parameterEstimates(ng.math.lavaan_fit)
#inspect(ng.math.lavaan_fit, what="est")
We continue using results from our linear growth model using the “nlme” function above to obtain and plot predicted trajectories and estimates of mathematics achievement.
#obtaining individual estimates of the intercept and slope using fixed and random effects estimates and storing this information into objects
b_1i_hat <- ranef(lg.math.nlme)[,1] + fixef(lg.math.nlme)[1]
b_2i_hat <- ranef(lg.math.nlme)[,2] + fixef(lg.math.nlme)[2]
#creating list of ids from row names of the nlme linear model object
child_id <- as.numeric(rownames(ranef(lg.math.nlme)))
#creating new data set that contains information on each child's id, intercept estimate, and slope estimate
estimates <- data.frame(child_id, b_1i_hat, b_2i_hat)
#creating new data set by merging original data set with the estimates data set we just created
estimates1 = merge(x = nlsy_math_long, y = estimates,
by.x = c('id'), by.y = c('child_id'),
all = TRUE)
#creating new variables that calculate the predicted and residual mathematics achievement scores for each child
estimates1$pred = estimates1$b_1i_hat + estimates1$b_2i_hat * (estimates1$grade - 2)
estimates1$resid = estimates1$math - estimates1$pred
#creating a plot of predicted values and assigning it to an object
plot_pred <- ggplot(data=estimates1,
aes(x = grade, y = pred, group = id)) +
geom_line() +
theme_bw() +
scale_x_continuous(breaks = 2:8, name = "Grade") +
scale_y_continuous(name = "PIAT Mathematics - Predictions")
#printing the object (plot)
print(plot_pred)
#creating a plot of residual values and assigning it to an object
plot_resid <- ggplot(data=estimates1,
aes(x = grade, y = resid, group = id)) +
geom_line() +
theme_bw() +
scale_x_continuous(breaks = 2:8, name = "Grade") +
scale_y_continuous(name = "PIAT Mathematics - Predictions")
#printing the object (plot)
print(plot_resid)
detach(nlsy_math_long)
This tutorial has presented several ways to set up and analyze no growth and linear models for longitudinal data, as well as plot the predicted trajectories and residuals of these models.