1. Heywood Cases on the Manifest Variables
The last section covered how to fix a Heywood case in which correlations were extremely high or over 1. This section explores negative error variances and how to fix them.
2. Why Negative Variances?
We know that a variance cannot be negative because it is computed from squared deviations. However, when you estimate a model, you will sometimes get a negative estimate. You might have a model that is not identified or is misspecified, as we described in the last section. Small samples are harder to estimate because of sampling fluctuation, where estimates vary from sample to sample. The problem could also be in the data. Manifest variables do not need to have exactly the same scale, but their scales should not be wildly different, like using both a 1 to 7 scale and a 0 to 10,000 scale. For example, you would not want to include a quality rating for a wine along with its cost if very expensive wines were included. Lastly, skewed data can cause problems with estimation.
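As a quick check on the scale issue, here is a minimal sketch in base R. The wine data frame, its column names, and the rescaling to thousands of dollars are all hypothetical and only illustrate the idea of comparing variances before fitting a model.

```r
# Hypothetical data: a 1 to 7 quality rating and a price from 0 to 10,000
set.seed(42)
wine <- data.frame(
  quality = sample(1:7, 200, replace = TRUE),
  price   = round(runif(200, min = 0, max = 10000))
)

# Compare the variances of the manifest variables
variances <- sapply(wine, var)
print(variances)

# Ratio of the largest to the smallest variance;
# a very large ratio signals wildly different scales
scale_ratio <- max(variances) / min(variances)
print(scale_ratio)

# One possible remedy (an assumption, not the only option):
# express price in thousands so both variables are on a similar scale
wine$price_k <- wine$price / 1000
rescaled_variances <- sapply(wine[, c("quality", "price_k")], var)
print(rescaled_variances)
```

Rescaling a variable by a constant does not change its correlations with the other variables, so it is a low-risk first step when the variances differ by several orders of magnitude.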
3. Negative Variance Example
In this example, I simulated a two-factor model that would produce a negative error variance. Remember to look for the blue warning message after you run the cfa() function. For problems with the correlations, we saw a not positive definite error. The error message for negative variances or inappropriate estimates is that the model has not converged. You will want to run the summary() function to see where the problem lies.
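The simulated data from the video is not reproduced here, so the sketch below builds its own. The population model, loadings, seed, and sample size are assumptions chosen for illustration, and this particular sample may or may not trigger a warning. The point is the pattern: collect whatever cfa() warns about, then look for negative variance estimates.

```r
library(lavaan)

# Hypothetical population model, used only to simulate illustrative data
pop_model <- '
  F1 =~ 0.7*V1 + 0.8*V2 + 0.6*V3
  F2 =~ 0.6*V4 + 0.8*V5 + 0.5*V6
  F1 ~~ 0.3*F2
'
set.seed(123)
# A small sample, since small samples are more prone to Heywood cases
sim_data <- simulateData(pop_model, sample.nobs = 50)

# The two-factor model we actually fit
fit_model <- '
  F1 =~ V1 + V2 + V3
  F2 =~ V4 + V5 + V6
'

# Collect any warnings raised by cfa() so they cannot be missed
fit_warnings <- character(0)
fit <- withCallingHandlers(
  cfa(fit_model, data = sim_data),
  warning = function(w) {
    fit_warnings <<- c(fit_warnings, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(fit_warnings)

# Keep only variance estimates (lhs ~~ lhs) that are negative
pe <- parameterEstimates(fit)
neg_var <- pe[pe$op == "~~" & pe$lhs == pe$rhs & pe$est < 0, ]
print(neg_var)
```

An empty neg_var table means this sample did not produce a negative variance; a non-empty one names the variables to investigate.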
4. Summarize to View Heywood Case
A new argument we can use in the summary() function is rsquare = TRUE. This option shows the estimated proportion of variance accounted for in each manifest variable, which should not be negative or over 1. In the summary() output, we first see more lavaan warnings. If we look at the variances section, we can see that the variance estimate for variable 2 is negative, and the one for variable 5 is very large.
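Here is a minimal sketch of that summary() call. The data is simulated from a hypothetical population model (the loadings, seed, and sample size are assumptions), so the numbers will not match the output shown in the video.

```r
library(lavaan)

# Hypothetical population model, used only to simulate illustrative data
pop_model <- '
  F1 =~ 0.7*V1 + 0.8*V2 + 0.6*V3
  F2 =~ 0.6*V4 + 0.8*V5 + 0.5*V6
  F1 ~~ 0.3*F2
'
set.seed(123)
sim_data <- simulateData(pop_model, sample.nobs = 500)

fit_model <- '
  F1 =~ V1 + V2 + V3
  F2 =~ V4 + V5 + V6
'
fit <- cfa(fit_model, data = sim_data)

# rsquare = TRUE adds an R-Square section after the Variances section
summary(fit, standardized = TRUE, rsquare = TRUE)
```

When reading the output, check the Variances section for negative or very large estimates first, then the R-Square section for values that are missing or out of range.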
5. Investigate R-Square Output
In the R-square part of the output, we see that variable 2 is NA because the variance was negative, and several variables have negative estimates of R-square.
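To see why a negative variance breaks R-square, it can help to rebuild R-square by hand. The sketch below assumes R-square for a manifest variable is 1 minus its residual variance divided by its model-implied total variance, so a negative residual variance pushes the value above 1. The simulated data and model are hypothetical and are chosen so that the estimates are well behaved.

```r
library(lavaan)

# Hypothetical population model, used only to simulate illustrative data
pop_model <- '
  F1 =~ 0.7*V1 + 0.8*V2 + 0.6*V3
  F2 =~ 0.6*V4 + 0.8*V5 + 0.5*V6
  F1 ~~ 0.3*F2
'
set.seed(123)
sim_data <- simulateData(pop_model, sample.nobs = 500)

fit_model <- '
  F1 =~ V1 + V2 + V3
  F2 =~ V4 + V5 + V6
'
fit <- cfa(fit_model, data = sim_data)

# Residual (error) variances of the manifest variables
pe <- parameterEstimates(fit)
manifest <- paste0("V", 1:6)
resid_rows <- pe[pe$op == "~~" & pe$lhs == pe$rhs & pe$lhs %in% manifest, ]
resid_var <- setNames(resid_rows$est, resid_rows$lhs)

# Model-implied total variances of the same variables
implied_var <- diag(fitted(fit)$cov)[names(resid_var)]

# R-square by hand, next to the values lavaan reports
r2_manual <- 1 - resid_var / implied_var
r2_lavaan <- lavInspect(fit, "rsquare")
print(round(cbind(manual = r2_manual, lavaan = r2_lavaan[names(r2_manual)]), 3))
```

With this formula, any variable whose residual variance estimate is negative ends up with an R-square above 1, which is not a valid proportion.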
To troubleshoot this problem, I am going to calculate the sample variance of variable 2 with the var() function. You should edit one variable at a time, since fixing variable 2 alone may resolve our Heywood case.
6. Update the Model
Using the model specification rules you learned before, we are going to fix the variance of variable 2 to the output from the var() function by adding V2 ~~ variance * V2 to the model, where variance stands for that number. This code is similar to setting a correlation between latents to zero, but now we are setting the variance of V2 to a specific number.
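The update can be sketched as follows. The simulated data, the seed, and the rounding to three decimals are assumptions made for illustration; with your own data you would paste in the number that var() returns for your variable.

```r
library(lavaan)

# Hypothetical population model, used only to simulate illustrative data
pop_model <- '
  F1 =~ 0.7*V1 + 0.8*V2 + 0.6*V3
  F2 =~ 0.6*V4 + 0.8*V5 + 0.5*V6
  F1 ~~ 0.3*F2
'
set.seed(123)
sim_data <- simulateData(pop_model, sample.nobs = 500)

# Step 1: sample variance of the problem variable
v2_fixed <- round(var(sim_data$V2), 3)
print(v2_fixed)

# Step 2: add a line that fixes the variance of V2 to that number
updated_model <- paste0('
  F1 =~ V1 + V2 + V3
  F2 =~ V4 + V5 + V6
  V2 ~~ ', v2_fixed, ' * V2
')
cat(updated_model)

# Step 3: refit and confirm the variance of V2 is held at the fixed value
fit_updated <- cfa(updated_model, data = sim_data)
pe <- parameterEstimates(fit_updated)
v2_row <- pe[pe$op == "~~" & pe$lhs == "V2" & pe$rhs == "V2", ]
print(v2_row)
```

Because the value is fixed rather than estimated, the V2 row should show the number you supplied with no standard error of its own.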
7. New Updated Output
Looking at the variance output, our negative values have gone away, and we did not see a lavaan warning when running the cfa() function.
8. Negative Variance Example
Our estimates for R-square are also all positive and below 1. We can interpret these numbers like R-square in regression. V2 and V6 are not well explained by this model, but V5 is.
9. Let's practice!
In the exercises, you will find and fix Heywood variance cases in the adoption model you created before.