
1. Measure features: correlations and reliability

By this point, you've looked at basic descriptive statistics of your dataset and learned how to split the data into random halves. When writing a description of your measure, you'll also want to include some more detailed information.

2. Correlations

Correlations are the standard way of reporting relationships between variables. The lowerCor() function provides this data in a more reader-friendly format than base R's cor() function. The diagonal of ones represents the perfect correlation between each item and itself, and the other values are the correlations between each pair of items. lowerCor() displays only the lower triangle of the correlation matrix, so each pair's correlation is only displayed once. This correlation matrix is your first clue about factor structure. Groups of items that are more strongly correlated typically load onto the same factor.
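A minimal sketch of this step, assuming the psych package is installed and a data frame of scale items named `gcbs` is already loaded in your workspace:

```r
# Load the psych package, which provides lowerCor()
library(psych)

# Display the lower triangle of the inter-item correlation
# matrix, rounded for readability (gcbs is assumed to be a
# data frame containing only the scale's items)
lowerCor(gcbs)
```

Compare this with `cor(gcbs)`, which prints the full, unrounded matrix with every correlation shown twice.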

3. Testing correlations' significance: p-values

Once you've used lowerCor() to find the correlations between items, you will likely also want to report their significance and confidence intervals. corr.test() can be used to generate both of these metrics for inter-item correlations. corr.test() generates a lot of output when you run it, and results are given as a full matrix instead of just the lower half like lowerCor(). Its result object is a list, so you can specify named list elements to get only the information you want to view. In this example, we are accessing the 'p' list element to get the p-values for each of the correlations. This slide displays the p-values for the correlations of the items. All those zeroes are p-values small enough to round to zero, well below the conventional 0.05 cutoff, so every correlation is statistically significant. This is unsurprising given that the gcbs dataset has over 2,000 cases, since statistical significance is strongly affected by sample size.
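The p-value step above can be sketched as follows, again assuming psych is installed and `gcbs` holds the items:

```r
library(psych)

# Run the full correlation test; the result is a list
# containing (among other things) r, n, and p matrices
results <- corr.test(gcbs)

# Pull out just the p-value matrix instead of printing
# the entire result object
results$p
```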

4. Testing correlations' significance: confidence intervals

You can also use corr.test() to view confidence intervals for each of the correlations. By default, corr.test() calculates 95% confidence intervals around the correlation value, r. This means that if we repeated the experiment many times with datasets drawn from the same population, the calculated confidence intervals would contain the true value 95% of the time. These confidence intervals are important to report for many types of publications. The output above shows the results for the first item, Q1.
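Accessing the confidence intervals works the same way as accessing the p-values: they live in a named element of the result list. A sketch, assuming `gcbs` is loaded:

```r
library(psych)

results <- corr.test(gcbs)

# 95% confidence intervals for each item pair:
# columns give the lower bound, r, upper bound, and p-value
results$ci
```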

5. Coefficient alpha

Coefficient alpha, also called Cronbach's alpha, is another important statistic to report during measure development. This statistic is a measure of the internal consistency of your measure, which is also called reliability. Most fields of research prefer measures whose alpha is greater than 0.8. Using the alpha() function, you can see that the gcbs items have a coefficient alpha of 0.93, which suggests excellent reliability.
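A minimal sketch of this call, assuming psych is installed and `gcbs` is loaded:

```r
library(psych)

# Compute Cronbach's alpha for the full scale;
# printing the result shows the overall reliability
# along with per-item statistics
alpha(gcbs)
```

Note that `alpha()` can be masked by functions of the same name in other packages (for example, ggplot2); if that happens, call it as `psych::alpha(gcbs)`.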

6. Coefficient alpha

The output from the alpha() function will also tell you some basic stats for each item, as well as how the overall alpha value would be affected if an item were dropped. If dropping an item would cause alpha to increase, that's an indicator that the item isn't performing as well as the others.
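Because the result of alpha() is a list, you can pull out just the item-level pieces rather than printing everything. A sketch, assuming `gcbs` is loaded:

```r
library(psych)

results <- alpha(gcbs)

# Reliability estimates if each item were dropped in turn
results$alpha.drop

# Basic per-item statistics (item-total correlations, etc.)
results$item.stats
```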

7. Split-Half reliability

Split-half reliability is another common statistic showing internal consistency. It reflects how well two halves of the test relate to each other. The splitHalf() function displays several common split-half statistics. You will likely want to report the average split-half reliability, which happens to be 0.93: the same value as coefficient alpha! The exact match is a coincidence, but the two reliability metrics are conceptually similar, so similar values are not surprising.
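A minimal sketch of this step, assuming psych is installed and `gcbs` is loaded:

```r
library(psych)

# Summarize split-half reliabilities across many random
# splits of the items; the printed output includes the
# average split-half reliability along with the maximum
# and minimum splits
splitHalf(gcbs)
```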

8. Let's practice!

Now, let's put these functions into practice.