1. Spearman-rank correlation
The Spearman rank correlation is the nonparametric equivalent of the Pearson correlation.
2. Spearman correlation assumptions
When data is not normally distributed, we can use the Spearman-rank correlation, or Spearman correlation, to investigate variable relationships across or within AB groups. Spearman correlation assesses the strength and direction of the relationship between two rank-ordered variables, meaning the data is ordered by position. A high Spearman correlation is found when the observations have a similar rank, or position label such as first, second, third, in both variables.
Continuous and discrete
ordinal, interval, and ratio variables are appropriate.
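To make the idea of rank order concrete, here is a minimal sketch in R. The enjoyment and time values are made up for illustration and are not the course data. It shows that Spearman's rho is the Pearson correlation computed on the ranks.

```r
# Hypothetical toy data: enjoyment scores and eating times for five subjects.
enjoyment <- c(2, 5, 7, 8, 10)
time      <- c(4, 9, 6, 15, 30)

# rank() replaces each value with its position in the sorted order.
rank(enjoyment)
rank(time)

# Spearman's rho is the Pearson correlation of the two rank vectors.
rho_from_ranks <- cor(rank(enjoyment), rank(time))
rho_builtin    <- cor(enjoyment, time, method = "spearman")
```

The two results agree because only the second and third subjects swap positions between the variables, so the ranks are similar but not identical.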
3. Monotonic relationships
Spearman’s rho assesses the monotonic relationship of the variables, meaning one variable consistently increases or consistently decreases as the other increases, though not necessarily at a constant rate.
In an increasing monotonic relationship, variable Y never decreases as variable X increases; it does not both increase and decrease.
A monotonically decreasing relationship occurs when variable Y never increases as variable X increases.
Monotonicity is a less restrictive requirement than linearity, so when the relationship is in fact linear, Pearson is the stronger and more accurate choice.
Scatter plots can assess if a monotonic relationship may be present within or across groups.
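As a rough illustration of monotonic versus linear, the sketch below builds a made-up curve that always increases but is not a straight line, then compares the two coefficients. The variables x and y are hypothetical and stand in for any pair of measurements.

```r
# Hypothetical data: y always increases with x, but along a curve.
x <- 1:20
y <- exp(x / 3)

# The ranks of x and y match exactly, so Spearman's rho is 1.
spearman_rho <- cor(x, y, method = "spearman")

# Pearson measures linear association, so the curvature lowers it.
pearson_r <- cor(x, y, method = "pearson")

# A scatter plot makes the shape of the relationship visible.
plot(x, y, main = "Monotonic but not linear")
```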
4. Hypothesis and sample size
Suppose a monotonic relationship is expected between subjects’ time to eat and enjoyment of the pizza across our Cheese and Pepperoni groups. Our
null hypothesis is there is no monotonic association between time and enjoyment of pizza.
In determining the sample size needed, use the same pwr-dot-r-dot-test from the pwr package as for the Pearson correlation.
To detect an effect of point-three with a power of point-eight and an alpha of point-zero-five, we need 85 data points.
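The sample size calculation just described might look like the following sketch. It assumes the pwr package is installed; the object names size and n_needed are ours.

```r
library(pwr)  # assumes the pwr package is installed

# Leave n unspecified so pwr.r.test() solves for the sample size.
size <- pwr.r.test(r = 0.3, power = 0.8, sig.level = 0.05)

# Round up to a whole number of data points.
n_needed <- ceiling(size$n)
n_needed
```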
5. Spearman ignoring groups
We use cor-dot-test for a Spearman correlation. The formula is tilde x plus y, or tilde enjoyment plus time. Setting exact to FALSE suppresses the warning that an exact p-value cannot be computed when there are identical, or tied, rank values.
Specify the data frame and set method to spearman. The output gives the test statistic S, the sum of the squared rank pair differences, and a p-value. Together with the positive rho, a small p-value indicates that time to eat is positively correlated with enjoyment of the pizza across the toppings.
It also provides the correlation coefficient rho, equivalent to r with a range of negative-one to positive-one, which can be squared to estimate the proportion of variation in the ranks of the dependent variable, y, time, that is attributed to the independent variable, x, enjoyment. Degrees of freedom are not output and not needed, as the significance is determined with S, not t.
We can check the number of points by calling length on data frame dollar-sign variable, or pizza dollar-sign time.
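Written out, the call described above could look like this sketch. The pizza data frame here is a small made-up stand-in with the same column names, not the course data set.

```r
# Hypothetical stand-in for the course's pizza data frame.
pizza <- data.frame(
  topping   = rep(c("Cheese", "Pepperoni"), each = 6),
  enjoyment = c(3, 5, 5, 7, 8, 9, 2, 4, 6, 6, 8, 10),
  time      = c(12, 15, 14, 20, 22, 30, 10, 13, 16, 18, 25, 28)
)

# Spearman correlation across both toppings; exact = FALSE avoids the
# tie warning, since enjoyment contains repeated values.
rhotest <- cor.test(~ enjoyment + time, data = pizza,
                    method = "spearman", exact = FALSE)
rhotest

# Number of data points used.
length(pizza$time)
```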
6. Spearman within groups
Spearman correlation can be run within each AB group, subsetting the data frame to only the group of interest, Cheese, for example. The correlation shape, linear or monotonic, should be reassessed in each group separately.
A monotonic relationship may appear across groups, but the relationship could be linear within groups, for which a Pearson correlation will be stronger and more accurate. Notice the monotonic relationship when ignoring groups in this plot but each AB group individually shows a linear relationship.
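A sketch of the within-group version follows, again using a made-up pizza data frame rather than the course data. The subset() call and the cheese object name are our choices; any way of filtering rows to one group would do.

```r
# Hypothetical stand-in for the course's pizza data frame.
pizza <- data.frame(
  topping   = rep(c("Cheese", "Pepperoni"), each = 6),
  enjoyment = c(3, 5, 5, 7, 8, 9, 2, 4, 6, 6, 8, 10),
  time      = c(12, 15, 14, 20, 22, 30, 10, 13, 16, 18, 25, 28)
)

# Keep only the group of interest.
cheese <- subset(pizza, topping == "Cheese")

# Reassess the shape of the relationship within this group.
plot(cheese$enjoyment, cheese$time)

# Spearman correlation within the Cheese group only.
cheese_test <- cor.test(~ enjoyment + time, data = cheese,
                        method = "spearman", exact = FALSE)
cheese_test
```

If the scatter plot for a single group looks linear, swapping method to pearson in the same call gives the Pearson version.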
7. Spearman power analysis
The power analysis for a Spearman correlation can be run using pwr-dot-r-dot-test. Specify the correlation coefficient with r, the number of samples with n, and the significance level with sig-dot-level, for which we supply the p-value from the test.
8. Referring to the output
Rather than hard coding the argument values, we can save the cor-dot-test output to an object, rhotest, and the length, or number of rows, to the variable samp. Now call the values from the output object using dollar-sign.
Estimate stores the correlation coefficient and p-dot-value contains the p-value. Use samp for n. Our Spearman correlation across the AB groups is reliable.
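Put together, the steps narrated here might look like the sketch below. It assumes the pwr package is installed and uses a made-up pizza data frame in place of the course data; rhotest and samp follow the names in the narration, while rho_power is our own name.

```r
library(pwr)  # assumes the pwr package is installed

# Hypothetical stand-in for the course's pizza data frame.
pizza <- data.frame(
  topping   = rep(c("Cheese", "Pepperoni"), each = 6),
  enjoyment = c(3, 5, 5, 7, 8, 9, 2, 4, 6, 6, 8, 10),
  time      = c(12, 15, 14, 20, 22, 30, 10, 13, 16, 18, 25, 28)
)

# Save the test output and the sample size instead of hard coding them.
rhotest <- cor.test(~ enjoyment + time, data = pizza,
                    method = "spearman", exact = FALSE)
samp <- length(pizza$time)

# Pull rho and the p-value from the saved object with $.
rho_power <- pwr.r.test(r = unname(rhotest$estimate),
                        n = samp,
                        sig.level = rhotest$p.value)
rho_power$power
```

Passing the observed p-value to sig.level follows the narration; supplying a fixed alpha such as 0.05 is the other common choice.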
9. Let's practice!
Let’s practice.