Longitudinal Factor Analysis - Measurement Invariance

Overview

In this tutorial we walk through the very basics of testing measurement invariance in the context if a longitudinal factor model. This is not a comprehensive coverage, just something to get started. This is a one-off done as part of a guest lecture.

Outline

In this session we cover …

A. Introduction to Factorial Invariance
B. Configural Invariance Model C. Weak Invariance Model D. Strong Invariance Model E. Comparisons

Prelim - Loading libraries used in this script.

library(psych)
library(ggplot2)
library(corrplot) #plotting correlation matrices
library(lavaan)  #for fitting structural equation models
library(semPlot)  #for automatically making diagrams

A. Introduction to the Common Factor Model

Recall from last time that the basic factor analysis model can be written as a matrix equation …

\[ \boldsymbol{Y_{i}} = \boldsymbol{\Lambda}\boldsymbol{F_{i}} + \boldsymbol{U_{i}} \]

where $\boldsymbol{Y_{i}}$ is a $p$ x 1 vector of observed variable scores, $\boldsymbol{\Lambda}$ is a p x q matrix of factor loadings, $\boldsymbol{F_{i}}$ is a $q$ x 1 vector of common factor scores, and $\boldsymbol{U_{i}}$ is a p x 1 vector of unique factor scores.

We can rewrite the model in terms of variance-covariance and mean expectations. For example, the expected covariance matrix, $\boldsymbol{\Sigma} = \boldsymbol{Y}'\boldsymbol{Y}$, becomes

\[ \boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Psi}\boldsymbol{\Lambda}' + \boldsymbol{\Theta} \] where $\boldsymbol{\Sigma}$ is a p x p covariance (or correlation) matrix of the observed variables, $\boldsymbol{\Lambda}$ is a p x q matrix of factor loadings, $\boldsymbol{\Psi}$ is a q x q covariance matrix of the latent factor variables, and $\boldsymbol{\Theta}$ is a diagonal matrix of unique factor variances.

and the expected p x 1 mean vector, $ $ becomes \[ \boldsymbol{\mu} = \boldsymbol{\tau} + \boldsymbol{\Lambda}\boldsymbol{\alpha} \]

where $\boldsymbol{\tau}$ is a p x 1 vector of manifest variable means, $\boldsymbol{\Lambda}$ is a p x q matrix of factor loadings, and $\boldsymbol{\alpha}$ is a q x 1 vector of latent variable means.

We can then extend the model to multipe-group or multiple-occasion settings with a group or occasion-specific subscript, so that (e.g., for the two-occasion case)

\[ \boldsymbol{\Sigma_g} = \boldsymbol{\Lambda_g}\boldsymbol{\Psi_g}\boldsymbol{\Lambda_g}' + \boldsymbol{\Theta_g} \] and \[ \boldsymbol{\mu_g} = \boldsymbol{\tau_g} + \boldsymbol{\Lambda_g}\boldsymbol{\alpha_g} \]

Different levels of measurement invariance are established by testing (or requiring) that various matrices are equal.

Specifically, …
For Configural Invaraince we establish that the structure of $\boldsymbol{\Lambda_g}$ is equivalent across occasions.

For Weak Invariance we additionally establish that the factor loadings are equivalent across occasions, $\boldsymbol{\Lambda_g} = \boldsymbol{\Lambda}$.

For Strong Invariance we additionally establish that the manifest means are equivalent across occasions, $\boldsymbol{\tau_g} = \boldsymbol{\tau}$. and

For Strict Invariance we additionally establish that the residaul variances are also equivalent across occasions $\boldsymbol{\Theta_g} = \boldsymbol{\Theta}$.

Measurement Invariance testsing is usually conduced within a Structural Equation Modeling (SEM) framework. Here we illustrate how this may be done using the lavaan package. Lots of good information and instruction (including about invariance testing - in a multiple-group setting) can be found on the package website … http://lavaan.ugent.be.

Prelim - Reading in Repeated Measures Data

For this example, we use classic data from Osborne & Sudick (1972) that has been the basis for many examples in Jack McArdle and colleagues’ papers, courses, and workshops.
The data were obtained from 204 participants (rows) who completed a vareity of cognitive ability tests that are part of the WISC.

#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/WISC_MIexample.csv"
#read in the .csv file using the url() function
dat <- read.csv(file=url(filepath),header=TRUE)

Prelim - Descriptives

Lets have a quick look at the data file and the descriptives.

#data structure
head(dat,10)

##    id  info1  comp1  simi1  voca1  info6  comp6  simi6  voca6
## 1   1 31.287 25.627 22.932 22.215 69.883 44.424 68.045 51.162
## 2   2 13.801 14.787  7.581 15.373 41.871 44.862 33.897 37.741
## 3   3 34.970 34.675 28.052 26.841 60.424 50.260 35.844 55.477
## 4   4 24.795 31.391  8.208 20.197 52.865 42.669 45.802 35.987
## 5   5 25.263 30.263 15.977 35.417 67.368 86.654 72.368 60.417
## 6   6 15.402 23.399 11.453 20.560 46.437 52.956 22.537 47.716
## 7   7 15.380 -1.253  2.318 13.004 53.977 36.341 39.912 35.373
## 8   8 19.883  6.704 14.160 14.868 26.901 33.020 21.679 22.763
## 9   9 12.632 13.847 10.276 23.224 54.737 40.163 36.591 54.803
## 10 10 23.690 25.446 11.416 21.786 45.119 52.232 53.508 53.929

#descriptives (means, sds)
describe(dat[,-1])

##       vars   n  mean    sd median trimmed   mad   min   max range  skew
## info1    1 204 19.78  6.12  19.13   19.77  5.53  1.04 34.97 33.93 -0.06
## comp1    2 204 21.80  9.74  21.90   21.98 10.32 -1.25 46.49 47.74 -0.13
## simi1    3 204 14.90  7.56  14.54   14.57  7.83 -4.52 37.76 42.28  0.39
## voca1    4 204 20.40  6.29  19.84   20.11  6.52  2.77 39.66 36.89  0.37
## info6    5 204 48.51 12.79  46.34   47.80 13.27 24.76 80.00 55.24  0.47
## comp6    6 204 45.17 12.97  44.53   45.22 12.32  0.19 86.65 86.47 -0.01
## simi6    7 204 41.30 14.52  39.44   40.68 14.42 14.29 77.96 63.67  0.39
## voca6    8 204 44.45 11.05  43.80   44.43 12.48 18.17 69.87 51.70  0.01
##       kurtosis   se
## info1     0.17 0.43
## comp1    -0.49 0.68
## simi1    -0.14 0.53
## voca1     0.02 0.44
## info6    -0.65 0.90
## comp6     0.65 0.91
## simi6    -0.38 1.02
## voca6    -0.63 0.77

#correlation matrix
round(cor(dat[,-1]),2)

##       info1 comp1 simi1 voca1 info6 comp6 simi6 voca6
## info1  1.00  0.51  0.54  0.56  0.47  0.36  0.44  0.49
## comp1  0.51  1.00  0.45  0.58  0.40  0.36  0.40  0.46
## simi1  0.54  0.45  1.00  0.44  0.35  0.29  0.33  0.38
## voca1  0.56  0.58  0.44  1.00  0.55  0.44  0.54  0.60
## info6  0.47  0.40  0.35  0.55  1.00  0.62  0.67  0.75
## comp6  0.36  0.36  0.29  0.44  0.62  1.00  0.53  0.70
## simi6  0.44  0.40  0.33  0.54  0.67  0.53  1.00  0.66
## voca6  0.49  0.46  0.38  0.60  0.75  0.70  0.66  1.00

#visusal correlation matrix
corrplot(cor(dat[,-1]), order = "original", tl.col='black', tl.cex=.75)

B. Configural Invariance Model

Configural invariance model

Model Specification

configural <- '
# Define the latent factors.
verb_1 =~ NA*info1 + lambda1*info1 + voca1 + comp1 + simi1
verb_6 =~ NA*info6 + lambda1*info6 + voca6 + comp6 + simi6

# Intercepts
info1 ~ i1*1
voca1 ~ 1
comp1 ~ 1
simi1 ~ 1
info6 ~ i1*1
voca6 ~ 1
comp6 ~ 1
simi6 ~ 1

# Unique Variances
info1 ~~ info1
voca1 ~~ voca1
comp1 ~~ comp1
simi1 ~~ simi1
info6 ~~ info6
voca6 ~~ voca6
comp6 ~~ comp6
simi6 ~~ simi6

# Latent Variable Means
verb_1 ~ 0*1
verb_6 ~ 1

# Latent Variable Variances and Covariance
verb_1 ~~ 1*verb_1
verb_6 ~~ verb_6
verb_1 ~~ verb_6
'

Model Estimation and Interpretation

#Model fitting
fit_configural <- cfa(configural, data = dat, mimic = "mplus")
summary(fit_configural, fit.measures = TRUE)

## lavaan (0.5-23.1097) converged normally after  99 iterations
## 
##   Number of observations                           204
## 
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic               25.968
##   Degrees of freedom                                19
##   P-value (Chi-square)                           0.131
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic              847.740
##   Degrees of freedom                                28
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.991
##   Tucker-Lewis Index (TLI)                       0.987
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5601.115
##   Loglikelihood unrestricted model (H1)      -5588.131
## 
##   Number of free parameters                         25
##   Akaike (AIC)                               11252.229
##   Bayesian (BIC)                             11335.182
##   Sample-size adjusted Bayesian (BIC)        11255.975
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.042
##   90 Percent Confidence Interval          0.000  0.079
##   P-value RMSEA <= 0.05                          0.588
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.031
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 =~                                           
##     info1   (lmb1)    4.451    0.400   11.137    0.000
##     voca1             5.039    0.396   12.728    0.000
##     comp1             6.850    0.637   10.745    0.000
##     simi1             4.590    0.520    8.821    0.000
##   verb_6 =~                                           
##     info6   (lmb1)    4.451    0.400   11.137    0.000
##     voca6             4.102    0.453    9.057    0.000
##     comp6             4.006    0.489    8.194    0.000
##     simi6             4.551    0.545    8.346    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 ~~                                           
##     verb_6            1.837    0.215    8.558    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1     (i1)   19.776    0.427   46.273    0.000
##    .voca1            20.396    0.439   46.416    0.000
##    .comp1            21.797    0.680   32.036    0.000
##    .simi1            14.903    0.528   28.223    0.000
##    .info6     (i1)   19.776    0.427   46.273    0.000
##    .voca6            17.970    1.844    9.747    0.000
##    .comp6            19.317    2.299    8.404    0.000
##    .simi6            11.922    2.516    4.738    0.000
##     verb_1            0.000                           
##     verb_6            6.455    0.606   10.649    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1            17.448    2.240    7.789    0.000
##    .voca1            13.999    2.133    6.563    0.000
##    .comp1            47.511    5.754    8.257    0.000
##    .simi1            35.810    4.031    8.884    0.000
##    .info6            47.096    6.432    7.322    0.000
##    .voca6            23.267    4.182    5.564    0.000
##    .comp6            73.850    8.388    8.805    0.000
##    .simi6            88.920   10.354    8.588    0.000
##     verb_1            1.000                           
##     verb_6            5.834    1.167    4.997    0.000

#Model Diagram 
semPaths(fit_configural, what="est", 
         sizeLat = 7, sizeMan = 7, edge.label.cex = .75)

## Warning in qgraph(Edgelist, labels = nLab, bidirectional = Bidir, directed
## = Directed, : The following arguments are not documented and likely not
## arguments of qgraph and thus ignored: loopRotation; residuals; residScale;
## residEdge; CircleEdgeEnd

We can also add the possibility that each of the unique factors are correlated across the two occasions

Configural invariance model with correlated uniquenesses

Model Specification

corrunique <- '
# Define the latent factors.
verb_1 =~ NA*info1 + lambda1*info1 + voca1 + comp1 + simi1
verb_6 =~ NA*info6 + lambda1*info6 + voca6 + comp6 + simi6

# Intercepts
info1 ~ i1*1
voca1 ~ 1
comp1 ~ 1
simi1 ~ 1
info6 ~ i1*1
voca6 ~ 1
comp6 ~ 1
simi6 ~ 1

# Unique Variances and Covariances
info1 ~~ info1
voca1 ~~ voca1
comp1 ~~ comp1
simi1 ~~ simi1
info6 ~~ info6
voca6 ~~ voca6
comp6 ~~ comp6
simi6 ~~ simi6
info1 ~~ info6
voca1 ~~ voca6
comp1 ~~ comp6
simi1 ~~ simi6

# Latent Variable Means
verb_1 ~ 0*1
verb_6 ~ 1

# Latent Variable Variances and Covariance
verb_1 ~~ 1*verb_1
verb_6 ~~ verb_6
verb_1 ~~ verb_6
'

Model Estimation and Interpretation

#Model estimation
fit_corrunique <- cfa(corrunique, data = dat, mimic = "mplus")
summary(fit_corrunique, fit.measures = TRUE)

## lavaan (0.5-23.1097) converged normally after 126 iterations
## 
##   Number of observations                           204
## 
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic               24.882
##   Degrees of freedom                                15
##   P-value (Chi-square)                           0.052
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic              847.740
##   Degrees of freedom                                28
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.988
##   Tucker-Lewis Index (TLI)                       0.977
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5600.572
##   Loglikelihood unrestricted model (H1)      -5588.131
## 
##   Number of free parameters                         29
##   Akaike (AIC)                               11259.143
##   Bayesian (BIC)                             11355.369
##   Sample-size adjusted Bayesian (BIC)        11263.488
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.057
##   90 Percent Confidence Interval          0.000  0.095
##   P-value RMSEA <= 0.05                          0.350
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.030
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 =~                                           
##     info1   (lmb1)    4.470    0.401   11.140    0.000
##     voca1             5.000    0.400   12.501    0.000
##     comp1             6.868    0.639   10.749    0.000
##     simi1             4.636    0.524    8.847    0.000
##   verb_6 =~                                           
##     info6   (lmb1)    4.470    0.401   11.140    0.000
##     voca6             4.090    0.447    9.144    0.000
##     comp6             4.026    0.487    8.258    0.000
##     simi6             4.564    0.543    8.406    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##  .info1 ~~                                            
##    .info6             0.979    2.601    0.376    0.707
##  .voca1 ~~                                            
##    .voca6             1.741    2.070    0.841    0.400
##  .comp1 ~~                                            
##    .comp6            -0.211    4.828   -0.044    0.965
##  .simi1 ~~                                            
##    .simi6            -1.448    4.484   -0.323    0.747
##   verb_1 ~~                                           
##     verb_6            1.817    0.212    8.555    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1     (i1)   19.776    0.427   46.267    0.000
##    .voca1            20.396    0.439   46.428    0.000
##    .comp1            21.797    0.680   32.039    0.000
##    .simi1            14.903    0.528   28.207    0.000
##    .info6     (i1)   19.776    0.427   46.267    0.000
##    .voca6            18.151    1.859    9.762    0.000
##    .comp6            19.294    2.301    8.386    0.000
##    .simi6            11.957    2.515    4.754    0.000
##     verb_1            0.000                           
##     verb_6            6.429    0.603   10.658    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1            17.294    2.265    7.636    0.000
##    .voca1            14.374    2.201    6.530    0.000
##    .comp1            47.246    5.785    8.166    0.000
##    .simi1            35.454    4.035    8.786    0.000
##    .info6            46.956    6.519    7.203    0.000
##    .voca6            23.859    4.329    5.512    0.000
##    .comp6            73.489    8.419    8.729    0.000
##    .simi6            88.773   10.405    8.532    0.000
##     verb_1            1.000                           
##     verb_6            5.797    1.150    5.043    0.000

#Model Diagram 
semPaths(fit_corrunique, what="est",
         sizeLat = 7, sizeMan = 7, edge.label.cex = .75)

## Warning in qgraph(Edgelist, labels = nLab, bidirectional = Bidir, directed
## = Directed, : The following arguments are not documented and likely not
## arguments of qgraph and thus ignored: loopRotation; residuals; residScale;
## residEdge; CircleEdgeEnd

C. Weak Invariance Model

Weak invariance model

Model Specification (additional constraints that $\boldsymbol{\Lambda_g} = \boldsymbol{\Lambda}$)

weak <- '
# Define the latent factors.
verb_1 =~ NA*info1 + lambda1*info1 + lambda2*voca1 + lambda3*comp1 + lambda4*simi1
verb_6 =~ NA*info6 + lambda1*info6 + lambda2*voca6 + lambda3*comp6 + lambda4*simi6

# Intercepts
info1 ~ i1*1
voca1 ~ 1
comp1 ~ 1
simi1 ~ 1
info6 ~ i1*1
voca6 ~ 1
comp6 ~ 1
simi6 ~ 1

# Unique Variances
info1 ~~ info1
voca1 ~~ voca1
comp1 ~~ comp1
simi1 ~~ simi1
info6 ~~ info6
voca6 ~~ voca6
comp6 ~~ comp6
simi6 ~~ simi6

# Latent Variable Means
verb_1 ~ 0*1
verb_6 ~ 1

# Latent Variable Variances and Covariance
verb_1 ~~ 1*verb_1
verb_6 ~~ verb_6
verb_1 ~~ verb_6
'

Model Estimation and Interpretation

#Model Estimation
fit_weak <- cfa(weak, data = dat, mimic = "mplus")
summary(fit_weak, fit.measures = TRUE)

## lavaan (0.5-23.1097) converged normally after  72 iterations
## 
##   Number of observations                           204
## 
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic               41.897
##   Degrees of freedom                                22
##   P-value (Chi-square)                           0.006
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic              847.740
##   Degrees of freedom                                28
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.976
##   Tucker-Lewis Index (TLI)                       0.969
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5609.079
##   Loglikelihood unrestricted model (H1)      -5588.131
## 
##   Number of free parameters                         22
##   Akaike (AIC)                               11262.158
##   Bayesian (BIC)                             11335.157
##   Sample-size adjusted Bayesian (BIC)        11265.454
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.067
##   90 Percent Confidence Interval          0.035  0.097
##   P-value RMSEA <= 0.05                          0.173
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.076
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 =~                                           
##     info1   (lmb1)    4.933    0.339   14.562    0.000
##     voca1   (lmb2)    4.865    0.336   14.492    0.000
##     comp1   (lmb3)    5.172    0.429   12.052    0.000
##     simi1   (lmb4)    5.072    0.397   12.762    0.000
##   verb_6 =~                                           
##     info6   (lmb1)    4.933    0.339   14.562    0.000
##     voca6   (lmb2)    4.865    0.336   14.492    0.000
##     comp6   (lmb3)    5.172    0.429   12.052    0.000
##     simi6   (lmb4)    5.072    0.397   12.762    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 ~~                                           
##     verb_6            1.558    0.136   11.439    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1     (i1)   19.776    0.445   44.430    0.000
##    .voca1            20.396    0.436   46.803    0.000
##    .comp1            21.797    0.629   34.633    0.000
##    .simi1            14.903    0.544   27.411    0.000
##    .info6     (i1)   19.776    0.445   44.430    0.000
##    .voca6            16.111    1.765    9.130    0.000
##    .comp6            15.049    2.299    6.547    0.000
##    .simi6            11.756    2.212    5.315    0.000
##     verb_1            0.000                           
##     verb_6            5.824    0.429   13.591    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1            16.079    2.188    7.349    0.000
##    .voca1            15.075    2.109    7.149    0.000
##    .comp1            54.055    6.012    8.992    0.000
##    .simi1            34.578    3.941    8.775    0.000
##    .info6            49.748    6.494    7.661    0.000
##    .voca6            22.022    4.032    5.462    0.000
##    .comp6            72.254    8.353    8.650    0.000
##    .simi6            91.610   10.487    8.736    0.000
##     verb_1            1.000                           
##     verb_6            4.240    0.546    7.759    0.000

#Model Diagram 
semPaths(fit_weak, what="est",
          sizeLat = 7, sizeMan = 7, edge.label.cex = .75)

## Warning in qgraph(Edgelist, labels = nLab, bidirectional = Bidir, directed
## = Directed, : The following arguments are not documented and likely not
## arguments of qgraph and thus ignored: loopRotation; residuals; residScale;
## residEdge; CircleEdgeEnd

D. Strong Invariance Model

Strong invariance model

Model Specification (additional constraints on manifest intercepts, $\boldsymbol{\tau_g} = \boldsymbol{\tau}$)

strong <- '
# Define the latent factors.
verb_1 =~ NA*info1 + lambda1*info1 + lambda2*voca1 + lambda3*comp1 + lambda4*simi1
verb_6 =~ NA*info6 + lambda1*info6 + lambda2*voca6 + lambda3*comp6 + lambda4*simi6

# Intercepts
info1 ~ i1*1
voca1 ~ i2*1
comp1 ~ i3*1
simi1 ~ i4*1
info6 ~ i1*1
voca6 ~ i2*1
comp6 ~ i3*1
simi6 ~ i4*1

# Unique Variances
info1 ~~ info1
voca1 ~~ voca1
comp1 ~~ comp1
simi1 ~~ simi1
info6 ~~ info6
voca6 ~~ voca6
comp6 ~~ comp6
simi6 ~~ simi6

# Latent Variable Means
verb_1 ~ 0*1
verb_6 ~ 1

# Latent Variable Variances and Covariance
verb_1 ~~ 1*verb_1
verb_6 ~~ verb_6
verb_1 ~~ verb_6
'

Model Estimation & Interpretation

fit_strong <- cfa(strong, data = dat, mimic = "mplus")
summary(fit_strong, fit.measures = TRUE)

## lavaan (0.5-23.1097) converged normally after  61 iterations
## 
##   Number of observations                           204
## 
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic               53.723
##   Degrees of freedom                                25
##   P-value (Chi-square)                           0.001
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic              847.740
##   Degrees of freedom                                28
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.965
##   Tucker-Lewis Index (TLI)                       0.961
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5614.992
##   Loglikelihood unrestricted model (H1)      -5588.131
## 
##   Number of free parameters                         19
##   Akaike (AIC)                               11267.984
##   Bayesian (BIC)                             11331.028
##   Sample-size adjusted Bayesian (BIC)        11270.830
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.075
##   90 Percent Confidence Interval          0.047  0.103
##   P-value RMSEA <= 0.05                          0.067
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.087
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 =~                                           
##     info1   (lmb1)    5.270    0.331   15.917    0.000
##     voca1   (lmb2)    4.547    0.291   15.634    0.000
##     comp1   (lmb3)    4.525    0.312   14.487    0.000
##     simi1   (lmb4)    4.960    0.328   15.112    0.000
##   verb_6 =~                                           
##     info6   (lmb1)    5.270    0.331   15.917    0.000
##     voca6   (lmb2)    4.547    0.291   15.634    0.000
##     comp6   (lmb3)    4.525    0.312   14.487    0.000
##     simi6   (lmb4)    4.960    0.328   15.112    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 ~~                                           
##     verb_6            1.608    0.142   11.316    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1     (i1)   19.929    0.457   43.623    0.000
##    .voca1     (i2)   20.311    0.421   48.252    0.000
##    .comp1     (i3)   21.459    0.598   35.901    0.000
##    .simi1     (i4)   14.882    0.529   28.146    0.000
##    .info6     (i1)   19.929    0.457   43.623    0.000
##    .voca6     (i2)   20.311    0.421   48.252    0.000
##    .comp6     (i3)   21.459    0.598   35.901    0.000
##    .simi6     (i4)   14.882    0.529   28.146    0.000
##     verb_1            0.000                           
##     verb_6            5.337    0.354   15.077    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1            15.124    2.238    6.757    0.000
##    .voca1            16.339    2.114    7.729    0.000
##    .comp1            57.819    6.150    9.402    0.000
##    .simi1            34.209    3.869    8.841    0.000
##    .info6            45.360    6.338    7.157    0.000
##    .voca6            24.586    3.951    6.223    0.000
##    .comp6            74.230    8.352    8.887    0.000
##    .simi6            89.572   10.110    8.860    0.000
##     verb_1            1.000                           
##     verb_6            4.557    0.585    7.791    0.000

#Model Diagram 
semPaths(fit_strong, what="est",
         sizeLat = 7, sizeMan = 7, edge.label.cex = .75)

## Warning in qgraph(Edgelist, labels = nLab, bidirectional = Bidir, directed
## = Directed, : The following arguments are not documented and likely not
## arguments of qgraph and thus ignored: loopRotation; residuals; residScale;
## residEdge; CircleEdgeEnd

E. Strict Invariance Model

Strict invariance model

Model Specification (additional constraints on manifest intercepts, $\boldsymbol{\Theta_g} = \boldsymbol{\Theta}$)

strict <- '
# Define the latent factors.
verb_1 =~ NA*info1 + lambda1*info1 + lambda2*voca1 + lambda3*comp1 + lambda4*simi1
verb_6 =~ NA*info6 + lambda1*info6 + lambda2*voca6 + lambda3*comp6 + lambda4*simi6

# Intercepts
info1 ~ i1*1
voca1 ~ i2*1
comp1 ~ i3*1
simi1 ~ i4*1
info6 ~ i1*1
voca6 ~ i2*1
comp6 ~ i3*1
simi6 ~ i4*1

# Unique Variances
info1 ~~ u1*info1
voca1 ~~ u2*voca1
comp1 ~~ u3*comp1
simi1 ~~ u4*simi1
info6 ~~ u1*info6
voca6 ~~ u2*voca6
comp6 ~~ u3*comp6
simi6 ~~ u4*simi6

# Latent Variable Means
verb_1 ~ 0*1
verb_6 ~ 1

# Latent Variable Variances and Covariance
verb_1 ~~ 1*verb_1
verb_6 ~~ verb_6
verb_1 ~~ verb_6
'

Model Estimation & Interpretation

#Model estimation
fit_strict <- cfa(strict, data = dat, mimic = "mplus")
summary(fit_strict, fit.measures = TRUE)

## lavaan (0.5-23.1097) converged normally after  60 iterations
## 
##   Number of observations                           204
## 
##   Number of missing patterns                         1
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic              134.559
##   Degrees of freedom                                29
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic              847.740
##   Degrees of freedom                                28
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.871
##   Tucker-Lewis Index (TLI)                       0.876
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -5655.410
##   Loglikelihood unrestricted model (H1)      -5588.131
## 
##   Number of free parameters                         15
##   Akaike (AIC)                               11340.820
##   Bayesian (BIC)                             11390.592
##   Sample-size adjusted Bayesian (BIC)        11343.067
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.134
##   90 Percent Confidence Interval          0.111  0.157
##   P-value RMSEA <= 0.05                          0.000
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.169
## 
## Parameter Estimates:
## 
##   Information                                 Observed
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 =~                                           
##     info1   (lmb1)    5.083    0.364   13.977    0.000
##     voca1   (lmb2)    4.358    0.317   13.765    0.000
##     comp1   (lmb3)    4.309    0.333   12.944    0.000
##     simi1   (lmb4)    4.785    0.356   13.447    0.000
##   verb_6 =~                                           
##     info6   (lmb1)    5.083    0.364   13.977    0.000
##     voca6   (lmb2)    4.358    0.317   13.765    0.000
##     comp6   (lmb3)    4.309    0.333   12.944    0.000
##     simi6   (lmb4)    4.785    0.356   13.447    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   verb_1 ~~                                           
##     verb_6            1.812    0.168   10.812    0.000
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1     (i1)   20.019    0.509   39.339    0.000
##    .voca1     (i2)   20.313    0.433   46.915    0.000
##    .comp1     (i3)   21.513    0.621   34.648    0.000
##    .simi1     (i4)   14.805    0.617   23.988    0.000
##    .info6     (i1)   20.019    0.509   39.339    0.000
##    .voca6     (i2)   20.313    0.433   46.915    0.000
##    .comp6     (i3)   21.513    0.621   34.648    0.000
##    .simi6     (i4)   14.805    0.617   23.988    0.000
##     verb_1            0.000                           
##     verb_6            5.557    0.415   13.396    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .info1     (u1)   28.657    2.958    9.689    0.000
##    .voca1     (u2)   20.878    2.142    9.746    0.000
##    .comp1     (u3)   68.013    5.253   12.946    0.000
##    .simi1     (u4)   61.387    4.897   12.535    0.000
##    .info6     (u1)   28.657    2.958    9.689    0.000
##    .voca6     (u2)   20.878    2.142    9.746    0.000
##    .comp6     (u3)   68.013    5.253   12.946    0.000
##    .simi6     (u4)   61.387    4.897   12.535    0.000
##     verb_1            1.000                           
##     verb_6            5.056    0.692    7.304    0.000

#Model Diagram 
semPaths(fit_strict, what="est",
         sizeLat = 7, sizeMan = 7, edge.label.cex = .75)

## Warning in qgraph(Edgelist, labels = nLab, bidirectional = Bidir, directed
## = Directed, : The following arguments are not documented and likely not
## arguments of qgraph and thus ignored: loopRotation; residuals; residScale;
## residEdge; CircleEdgeEnd

E. Evluation of Invariance Models

Comparison of Model fits side-by-side

Model comparison…
Looking at fit statistics

#Compare model fit statistics
round(cbind(configural=inspect(fit_configural, 'fit.measures'), 
            weak=inspect(fit_weak, 'fit.measures'),
            strong=inspect(fit_strong, 'fit.measures'),
            strict=inspect(fit_strict, 'fit.measures')),3)

##                     configural      weak    strong    strict
## npar                    25.000    22.000    19.000    15.000
## fmin                     0.064     0.103     0.132     0.330
## chisq                   25.968    41.897    53.723   134.559
## df                      19.000    22.000    25.000    29.000
## pvalue                   0.131     0.006     0.001     0.000
## baseline.chisq         847.740   847.740   847.740   847.740
## baseline.df             28.000    28.000    28.000    28.000
## baseline.pvalue          0.000     0.000     0.000     0.000
## cfi                      0.991     0.976     0.965     0.871
## tli                      0.987     0.969     0.961     0.876
## nnfi                     0.987     0.969     0.961     0.876
## rfi                      0.955     0.937     0.929     0.847
## nfi                      0.969     0.951     0.937     0.841
## pnfi                     0.658     0.747     0.836     0.871
## ifi                      0.992     0.976     0.965     0.871
## rni                      0.991     0.976     0.965     0.871
## logl                 -5601.115 -5609.079 -5614.992 -5655.410
## unrestricted.logl    -5588.131 -5588.131 -5588.131 -5588.131
## aic                  11252.229 11262.158 11267.984 11340.820
## bic                  11335.182 11335.157 11331.028 11390.592
## ntotal                 204.000   204.000   204.000   204.000
## bic2                 11255.975 11265.454 11270.830 11343.067
## rmsea                    0.042     0.067     0.075     0.134
## rmsea.ci.lower           0.000     0.035     0.047     0.111
## rmsea.ci.upper           0.079     0.097     0.103     0.157
## rmsea.pvalue             0.588     0.173     0.067     0.000
## rmr                      3.162     8.288     8.402    11.125
## rmr_nomean               3.162     8.288     8.402    11.125
## srmr                     0.031     0.076     0.087     0.169
## srmr_bentler             0.031     0.072     0.086     0.133
## srmr_bentler_nomean      0.034     0.080     0.094     0.147
## srmr_bollen              0.031     0.069     0.079     0.133
## srmr_bollen_nomean       0.034     0.054     0.068     0.088
## srmr_mplus               0.031     0.076     0.087     0.169
## srmr_mplus_nomean        0.034     0.065     0.079     0.145
## cn_05                  237.800   166.180   143.977    65.519
## cn_01                  285.306   197.171   169.273    76.178
## gfi                      0.995     0.991     0.989     0.976
## agfi                     0.988     0.983     0.981     0.963
## pgfi                     0.430     0.496     0.562     0.643
## mfi                      0.983     0.952     0.932     0.772
## ecvi                        NA        NA        NA        NA

Overall, a Strong Invariance model fits the data reasonably well.

We can also do formal model tests.

#Chi-square difference test for nested models 
anova(fit_configural, fit_weak)

## Chi Square Difference Test
## 
##                Df   AIC   BIC  Chisq Chisq diff Df diff Pr(>Chisq)   
## fit_configural 19 11252 11335 25.968                                 
## fit_weak       22 11262 11335 41.897     15.929       3   0.001173 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

anova(fit_weak, fit_strong)

## Chi Square Difference Test
## 
##            Df   AIC   BIC  Chisq Chisq diff Df diff Pr(>Chisq)   
## fit_weak   22 11262 11335 41.897                                 
## fit_strong 25 11268 11331 53.723     11.826       3   0.008005 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

anova(fit_strong, fit_strict)

## Chi Square Difference Test
## 
##            Df   AIC   BIC   Chisq Chisq diff Df diff Pr(>Chisq)    
## fit_strong 25 11268 11331  53.723                                  
## fit_strict 29 11341 11391 134.559     80.836       4  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In all cases, we reject the null hypothesis that the models fit the same.

We might alternatively conclude, based on the statistical information, that there is not measurement invariance.

Remember that we often have incentives to go on towards the examination of change. Given our goals, we may accept the strong invariance model as sufficient and carry on with the analysis of factor-level change.

Be careful and Thanks for playing!