
Non-parametric tests

1. Non-parametric tests

Recall that non-parametric tests remove the assumption of normality from our data, and thus are often more broadly applicable than parametric tests.

2. Non-parametric tests

While non-parametric tests may have their own assumptions, the key is that they do not require our data to be normally distributed. Therefore, we can apply non-parametric tests to a broader range of data than parametric tests. However, nothing is free. Non-parametric tests often have lower power, meaning a lower chance of detecting an effect. In addition, parametric and non-parametric tests often test different things. Non-parametric tests are especially applicable in situations where our data takes on a ranked order, such as star ratings for restaurants, where the assumption of normality will almost never be satisfied.

3. Examples

There are many different non-parametric tests, just as there are many different parametric tests. You may have seen the Wilcoxon-Mann-Whitney U test, a non-parametric analogue of the independent sample t-test. Similarly, the Kruskal-Wallis test is a non-parametric analogue of an ANOVA test. But what if your data consists of paired measurements, such as university rankings from different rankers? In that case, Mood's median test can serve as a non-parametric alternative to the paired sample t-test. Finally, we've worked with Pearson's correlation coefficient. But what if your data is ranked, such as university rankings? In that case, the assumption that the data is approximately normally distributed is almost certainly violated, so we need a different measure of correlation.
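As a quick sketch, here is how the first two of these analogues look in SciPy. The star-rating samples below are made up for illustration; they are not the course's dataset.

```python
from scipy import stats

# Hypothetical star ratings (1-5) for three restaurants
ratings_a = [5, 4, 4, 5, 3, 4, 5, 2, 4, 5]
ratings_b = [3, 2, 4, 3, 3, 2, 1, 3, 4, 2]
ratings_c = [4, 3, 3, 4, 5, 3, 4, 4, 3, 4]

# Wilcoxon-Mann-Whitney U test: non-parametric analogue
# of the independent sample t-test (two groups)
u_stat, u_p = stats.mannwhitneyu(ratings_a, ratings_b)

# Kruskal-Wallis test: non-parametric analogue
# of one-way ANOVA (two or more groups)
h_stat, h_p = stats.kruskal(ratings_a, ratings_b, ratings_c)

print(u_p, h_p)
```

Both functions return a test statistic and a p-value, so they slot into the same workflow as their parametric counterparts.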

4. Mood's median test

Mood's median test compares the medians from two paired measurements. For example, let's consider university scores from two different organizations. Here we can see a small sample of scores from two different rankers - the Times Higher Education World University Ranking, or THEW, and the Academic Ranking of World Universities, or ARW. Each scores a university from 0 to 100. However, there is no reason to believe these scores should be normally distributed. After all, these organizations largely work with top universities, so we would expect most scores to be relatively high. This is something we'll investigate in the exercises. We would like to compare the median scores from each ranker. We can do so using the median_test function from SciPy, which takes in the samples of scores. It then returns the test statistic s, the p-value, the overall median score m, and a table showing how many scores are above and below this overall median for each ranker.
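A minimal sketch of this call, using made-up scores for eight universities (the variable names and values below are illustrative, not the course's actual data):

```python
from scipy.stats import median_test

# Hypothetical 0-100 scores for the same eight universities
# from two rankers (values invented for illustration)
thew_scores = [92.0, 88.5, 85.1, 90.3, 78.2, 82.7, 95.4, 80.9]
arw_scores = [76.3, 70.1, 68.8, 74.5, 62.0, 66.4, 81.2, 64.7]

# Returns the test statistic, the p-value, the grand median of the
# pooled scores, and a 2x2 contingency table counting how many scores
# from each ranker fall above and below that grand median
stat, p, grand_median, table = median_test(thew_scores, arw_scores)

print(p, grand_median)
print(table)
```

The contingency table is what the test is actually built on: if the two rankers shared a median, each ranker's scores should split roughly evenly above and below the grand median.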

5. Mood's median test

We can see that the p-value is significant at the five percent level, indicating that the two organizations have different median scores. While this may seem identical to how we have conducted paired t-tests, the important difference is that t-tests assume normality. If our data is not normal, the results from a t-test are invalid. So while both tests may give us a p-value, only one accurately reflects the situation. Using the right tool for the job is key!

6. Kendall's tau

In addition, when working with ranked, ordinal data, Pearson's correlation coefficient is not the right tool for the job. Pearson's r assumes both sets of data are approximately normally distributed. With rankings, each rank occurs exactly once, so the data will never be normal! Instead, we can use Kendall's tau. It takes values between negative one and one, where negative one indicates complete disagreement, one indicates complete agreement, and zero indicates no correlation. Running the kendalltau function from SciPy on our rankings shows a fairly high degree of correlation, and the null hypothesis of no correlation is rejected.
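Here is a small sketch of that call. The ranks below are invented: two hypothetical rankers who mostly agree on the order of eight universities, with a couple of adjacent swaps.

```python
from scipy.stats import kendalltau

# Hypothetical ranks assigned to the same eight universities
# by two rankers (made-up data; rankers 1 and 2 swap two
# adjacent pairs but otherwise agree)
ranker_1 = [1, 2, 3, 4, 5, 6, 7, 8]
ranker_2 = [1, 3, 2, 4, 6, 5, 7, 8]

# Returns Kendall's tau and the p-value for the null
# hypothesis of no correlation
tau, p = kendalltau(ranker_1, ranker_2)

print(tau, p)
```

With 26 concordant pairs and 2 discordant pairs out of 28, tau works out to (26 - 2) / 28, roughly 0.86: strong agreement, and a p-value small enough to reject the null hypothesis of no correlation.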

7. Let's practice!

Now, let's start working with the data.