By and large, structural equation models are used to model the covariance matrix of the observed variables in a dataset. But in some applications, it is useful to bring in the means of the observed variables too. One way to do this is to explicitly refer to intercepts in the lavaan syntax. This can be done by including ‘intercept formulas’ in the model syntax. An intercept formula has the following form:

``````variable ~ 1
``````

The left part of the expression contains the name of the observed or latent variable. The right part contains the number `1`, representing the intercept. For example, in the three-factor H&S CFA model, we can add the intercepts of the observed variables as follows:

``````# three-factor model
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
# intercepts
x1 ~ 1
x2 ~ 1
x3 ~ 1
x4 ~ 1
x5 ~ 1
x6 ~ 1
x7 ~ 1
x8 ~ 1
x9 ~ 1
``````

However, it is more convenient to omit the intercept formulas in the model syntax (unless you want to fix their values), and to add the argument `meanstructure = TRUE` in the fitting function. For example, we can refit the three-factor H&S CFA model as follows:

``````fit <- cfa(HS.model,
data = HolzingerSwineford1939,
meanstructure = TRUE)
summary(fit)
``````
``````lavaan 0.6-11 ended normally after 35 iterations

Estimator                                         ML
Optimization method                           NLMINB
Number of model parameters                        30

Number of observations                           301

Model Test User Model:

Test statistic                                85.306
Degrees of freedom                                24
P-value (Chi-square)                           0.000

Parameter Estimates:

Standard errors                             Standard
Information                                 Expected
Information saturated (h1) model          Structured

Latent Variables:
Estimate  Std.Err  z-value  P(>|z|)
visual =~
x1                1.000
x2                0.554    0.100    5.554    0.000
x3                0.729    0.109    6.685    0.000
textual =~
x4                1.000
x5                1.113    0.065   17.014    0.000
x6                0.926    0.055   16.703    0.000
speed =~
x7                1.000
x8                1.180    0.165    7.152    0.000
x9                1.082    0.151    7.155    0.000

Covariances:
Estimate  Std.Err  z-value  P(>|z|)
visual ~~
textual           0.408    0.074    5.552    0.000
speed             0.262    0.056    4.660    0.000
textual ~~
speed             0.173    0.049    3.518    0.000

Intercepts:
Estimate  Std.Err  z-value  P(>|z|)
.x1                4.936    0.067   73.473    0.000
.x2                6.088    0.068   89.855    0.000
.x3                2.250    0.065   34.579    0.000
.x4                3.061    0.067   45.694    0.000
.x5                4.341    0.074   58.452    0.000
.x6                2.186    0.063   34.667    0.000
.x7                4.186    0.063   66.766    0.000
.x8                5.527    0.058   94.854    0.000
.x9                5.374    0.058   92.546    0.000
visual            0.000
textual           0.000
speed             0.000

Variances:
Estimate  Std.Err  z-value  P(>|z|)
.x1                0.549    0.114    4.833    0.000
.x2                1.134    0.102   11.146    0.000
.x3                0.844    0.091    9.317    0.000
.x4                0.371    0.048    7.779    0.000
.x5                0.446    0.058    7.642    0.000
.x6                0.356    0.043    8.277    0.000
.x7                0.799    0.081    9.823    0.000
.x8                0.488    0.074    6.573    0.000
.x9                0.566    0.071    8.003    0.000
visual            0.809    0.145    5.564    0.000
textual           0.979    0.112    8.737    0.000
speed             0.384    0.086    4.451    0.000
``````

As you can see in the output, the model includes intercept parameters for both the observed and latent variables. By default, the `cfa()` and `sem()` functions fix the latent variable intercepts (which in this case correspond to the latent means) to zero. Otherwise, the model would not be estimable. Note that the chi-square statistic and the number of degrees of freedom is the same as in the original model (without a mean structure). The reason is that we brought in some new data (a mean value for each of the 9 observed variables), but we also added 9 additional parameters to the model (an intercept for each of the 9 observed variables). The end result is an identical fit. In practice, the only reason why a user would add intercept-formulas in the model syntax, is because some constraints must be specified on them. For example, suppose that we wish to fix the intercepts of the variables `x1`, `x2`, `x3` and `x4` to, say, 0.5. We would write the model syntax as follows:

``````# three-factor model
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
# intercepts with fixed values
x1 + x2 + x3 + x4 ~ 0.5*1
``````

where we have used the left-hand side of the formula to ‘repeat’ the right-hand side for each element of the left-hand side.