
Parametric tests

1. Parametric tests

Parametric tests are a class of hypothesis tests that assume the data follow a particular distribution. For the tests we'll look at, that means some aspect of our data must be normally distributed.

2. ANOVA

In this video we'll start by discussing a one-way analysis of variance, or "ANOVA", test. While this is far from the only parametric test, understanding this one test will make understanding other parametric tests simpler. ANOVA compares a mean value, or "response", across a number of different groups, or "factors". Compare this to a t-test, which works with only one or two groups. For example, suppose we were looking at venture capital funding for companies in different market segments. We would likely see differences in amounts funded. Are these differences just due to random chance, or do they reflect an underlying difference between industries?

3. ANOVA

Here we see the first few markets, along with their average funding. Some markets seem to have similar average funding, such as advertising and analytics, while others like biotechnology are quite different. We want to determine if the differences are statistically significant, so we'll conduct an ANOVA test.

4. Assumptions of ANOVA

Since ANOVA is a parametric test, it has normality assumptions. First, the response in each factor must be normally distributed. This means that, for each market segment, the populations from which the samples are drawn must be normally distributed. Second, each market must have the same variance. That is, how spread out the different funding amounts are must match between segments. Let's check these assumptions.

5. Normally distributed response

Let's investigate the normality claim first. If we plot the data from the health and wellness market, we see it's definitely not normally distributed, and has a heavy tail. This is because most companies had only small amounts of funding, with a few companies getting significantly higher funding.
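As a rough sketch of what a heavy right tail looks like numerically, we can compare the mean to the median and compute the skewness. The funding amounts below are simulated from a lognormal distribution purely for illustration; they are an assumption and not the course data, and the name `funding` is hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for one market's funding amounts (NOT the course data):
# lognormal draws give many small values and a few very large ones.
rng = np.random.default_rng(1)
funding = rng.lognormal(mean=14.0, sigma=1.2, size=500)

mean_amount = funding.mean()
median_amount = np.median(funding)
skewness = stats.skew(funding)  # positive values indicate a long right tail

print(f"mean:     {mean_amount:,.0f}")
print(f"median:   {median_amount:,.0f}")
print(f"skewness: {skewness:.2f}")
```

A mean well above the median, together with a large positive skewness, is the numeric counterpart of the heavy tail seen in the plot.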

6. Log-transformations and normality

But all hope is not lost! A common trick when dealing with data of this sort is to take the logarithm. We refer to this as a "log-transformation". Doing so often brings the data much closer to a normal distribution. Here we see exactly that. In fact, if we were to follow this up with an Anderson-Darling test, we would find no evidence against normality for the log-transformed data. Therefore, we'll use the log-transformed amounts for this variable moving forward.
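Here is a minimal sketch of the log-transformation followed by an Anderson-Darling test, using the `anderson` function from SciPy-dot-stats. The data are simulated (an assumption, not the course data), and the variable names are hypothetical. The test reports a statistic and a set of critical values, so we compare the statistic with the critical value at the five percent level.

```python
import numpy as np
from scipy import stats

# Hypothetical right-skewed funding amounts (NOT the course data).
rng = np.random.default_rng(42)
funding = rng.lognormal(mean=14.0, sigma=1.2, size=500)

# Log-transformation: take the natural logarithm of each amount.
log_funding = np.log(funding)

# Anderson-Darling test against the normal distribution, before and after.
raw_result = stats.anderson(funding, dist="norm")
log_result = stats.anderson(log_funding, dist="norm")

# Look up the critical value that corresponds to the 5% significance level.
idx = list(log_result.significance_level).index(5.0)
critical_5pct = log_result.critical_values[idx]

print(f"raw statistic: {raw_result.statistic:.2f}")
print(f"log statistic: {log_result.statistic:.2f}")
print(f"5% critical value: {critical_5pct:.3f}")
# A statistic above the critical value rejects normality at that level.
print("reject normality of log data:", log_result.statistic > critical_5pct)
```

Note that a statistic below the critical value means we fail to reject normality; it does not prove the data are normal.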

7. Equal variance

Let's investigate the assumption of equal variance. One way to do this would be to look at the standard deviation of each industry and see if they're similar. However, just like with anything statistical, there is always the question of "how close is close enough." To test if the differences in variance we see are statistically significant, we can use a Levene test of equal variance. The null hypothesis is that the populations from which the samples are drawn have equal variance. The alternative is that the populations have different variances.

8. Equal variance

The Levene test tests the null hypothesis of equal population variance. We can do the Levene test using the "levene" function from SciPy-dot-stats by supplying the samples. Since the p-value is not less than five percent, we fail to reject the null hypothesis: we have no evidence that the variances differ between markets, so we can proceed as though they are equal. We're now ready to apply an ANOVA test!
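A call to `levene` might look like the sketch below. The three samples are simulated log-funding amounts with the same spread; they are an assumption for illustration, and the `samples` dictionary is a hypothetical name, not the course data.

```python
import numpy as np
from scipy import stats

# Hypothetical log-transformed funding for three markets (NOT the course data).
# All three are simulated with the same standard deviation.
rng = np.random.default_rng(7)
samples = {
    "Advertising": rng.normal(loc=14.0, scale=1.0, size=200),
    "Analytics": rng.normal(loc=14.0, scale=1.0, size=200),
    "Biotechnology": rng.normal(loc=15.5, scale=1.0, size=200),
}

# Each sample is passed as a separate positional argument.
# center="median" is SciPy's default and is robust to skewed data.
levene_result = stats.levene(*samples.values(), center="median")

alpha = 0.05
print(f"statistic: {levene_result.statistic:.3f}")
print(f"p-value:   {levene_result.pvalue:.3f}")
if levene_result.pvalue < alpha:
    print("Reject the null: the variances appear to differ.")
else:
    print("Fail to reject the null: no evidence of unequal variances.")
```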

9. ANOVA in SciPy

We'll use the f_oneway function from SciPy-dot-stats by supplying the samples. The test rejects the null hypothesis of equal mean funding at the five percent level, so we conclude that funding differs between markets.
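As a sketch, a one-way ANOVA with `f_oneway` could be run as follows. The samples are simulated log-funding amounts in which one market has a higher mean; this is an assumption for illustration and not the course data.

```python
import numpy as np
from scipy import stats

# Hypothetical log-transformed funding for three markets (NOT the course data).
# Biotechnology is simulated with a higher mean than the other two.
rng = np.random.default_rng(7)
samples = {
    "Advertising": rng.normal(loc=14.0, scale=1.0, size=200),
    "Analytics": rng.normal(loc=14.0, scale=1.0, size=200),
    "Biotechnology": rng.normal(loc=15.5, scale=1.0, size=200),
}

# One-way ANOVA: the null hypothesis is that all group means are equal.
anova_result = stats.f_oneway(*samples.values())

alpha = 0.05
print(f"F statistic: {anova_result.statistic:.2f}")
print(f"p-value:     {anova_result.pvalue:.3g}")
if anova_result.pvalue < alpha:
    print("Reject the null: at least one market mean differs.")
else:
    print("Fail to reject the null: no evidence of a difference in means.")
```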

10. Inference based on ANOVA

ANOVA only tests whether all means are identical. If we reject the null, all we can conclude is that at least one mean is different; the test does not tell us which one. Note that, whenever performing inference, a five percent cutoff is not set in stone. If you are willing to accept a higher risk of a false positive, you could consider increasing your cutoff to ten percent when running your tests.
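To see how the cutoff changes the decision, consider this small sketch. The p-value of 0.07 is a made-up number chosen for illustration, and `reject_null` is a hypothetical helper, not a SciPy function.

```python
def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the p-value falls below the chosen cutoff."""
    return p_value < alpha


# Hypothetical p-value, chosen to sit between the two cutoffs.
p_value = 0.07

for alpha in (0.05, 0.10):
    decision = "reject" if reject_null(p_value, alpha) else "fail to reject"
    print(f"alpha = {alpha:.2f}: {decision} the null hypothesis")
```

The same result is not significant at five percent but is significant at ten percent, which is why the cutoff should be chosen before running the test.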

11. Let's practice!

Now, let's practice!
