The "There is only one test" framework

1. The "There is only one test" framework

Traditional hypothesis tests make assumptions about the sample dataset. Here, we'll look at what to do when those assumptions don't hold.

2. Imbalanced data

Sometimes you can't get a sample representing all parts of your population. Consider this subset of the Stack Overflow survey. There are fewer users with hobbyist "No" than "Yes", and fewer users with "At least 30" ages than "Under 30". In fact, there are no users with hobbyist "No" and age "At least 30". This is a problem for a proportion test, since it requires that every group has at least ten observations. I set count's .drop argument to FALSE to include subgroups with a zero count. A dataset where some groups' counts are much larger than others is called "imbalanced".
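Here's a minimal sketch of that counting step. The toy_survey data frame and its values are hypothetical stand-ins, not the real survey subset; the column names hobbyist and age_cat are assumptions too. The grouping columns need to be factors so that count knows which empty combinations exist.

```r
library(dplyr)

# Hypothetical toy data; these are not the real survey counts.
toy_survey <- tibble(
  hobbyist = factor(c(rep("Yes", 6), rep("No", 2)), levels = c("No", "Yes")),
  age_cat = factor(
    c(rep("Under 30", 4), rep("At least 30", 2), rep("Under 30", 2)),
    levels = c("At least 30", "Under 30")
  )
)

# .drop = FALSE keeps factor-level combinations that have no rows,
# so the empty "No" / "At least 30" subgroup appears with n = 0.
counts <- toy_survey %>%
  count(hobbyist, age_cat, .drop = FALSE)

print(counts)
```

With the default .drop = TRUE, the empty subgroup would simply be missing from the result, which makes the imbalance easy to overlook.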

3. Hypotheses

We can declare hypotheses to test. I'll set a significance level of 0.1.

4. Proceeding with a proportion test regardless

Let's ignore the small sample problems and run a proportion test. The p-value is 0.09, which is below the significance level of 0.1. This test suggests that there is enough evidence to reject the null hypothesis. We'll revisit this later.
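As a rough sketch, a two-sample proportion test with infer's prop_test might look like this. The toy_survey data is hypothetical and does not reproduce the 0.09 p-value from the slide; the column names, the order of the age groups, and turning off the continuity correction are all assumptions made for illustration.

```r
library(infer)

# Hypothetical toy data; these are not the real survey counts.
toy_survey <- tibble::tibble(
  hobbyist = factor(rep(c("Yes", "No", "Yes", "No"), times = c(30, 10, 20, 20))),
  age_cat = factor(rep(c("Under 30", "At least 30"), times = c(40, 40)))
)

alpha <- 0.1  # significance level from the transcript

# Compare the proportion of hobbyist == "Yes" between the two age groups.
result <- prop_test(
  toy_survey,
  hobbyist ~ age_cat,
  order = c("At least 30", "Under 30"),
  success = "Yes",
  correct = FALSE  # assumption: no continuity correction
)

# Reject the null hypothesis when the p-value falls below alpha.
reject_null <- result$p_value < alpha
print(result)
```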

5. A grammar of graphics

Traditionally, every type of plot has a name, like scatter plot or line plot. In base R, you decide the plot type by specifying arguments to the plot function, or by calling a dedicated function like hist. This is fine until you want to draw a custom plot like a histogram with polar coordinates. If a function to draw that doesn't exist, you are stuck. The beauty of the grammar of graphics system that ggplot2 implements is that plots are broken down into components that can be combined in many ways. That's more flexible than creating a new function for every plot type you can think of. What if there was a grammar of hypothesis tests?
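To make the histogram-with-polar-coordinates idea concrete, here's a small sketch in ggplot2. The data is random and purely hypothetical; the point is only that the histogram layer and the coordinate system are separate components combined with +.

```r
library(ggplot2)

# Hypothetical data: 200 random draws, only here to have something to plot.
set.seed(1)
toy <- data.frame(x = rnorm(200))

# A histogram layer combined with a polar coordinate system.
# Neither component knows about the other; the grammar composes them.
p <- ggplot(toy, aes(x = x)) +
  geom_histogram(bins = 20) +
  coord_polar()

print(p)
```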

6. A grammar of hypothesis tests

The good news is that there is a grammar of hypothesis tests. It was invented by DataCamp instructor Allen Downey, and is called the "There is only one test" framework. The framework was implemented in R in the infer package that you've used for proportion and chi-square tests. All hypothesis tests can be performed with this code flow, based on specify, hypothesize, generate, and calculate. By changing the arguments to those functions, you can run standard tests like t-tests, or devise your own custom tests. We'll cover specify and hypothesize now, and the others in the next video. generate is worth mentioning briefly: it creates simulated data reflecting the null hypothesis. Using simulation rather than arithmetic equations to get the test statistic is computationally expensive, so it has only become accessible with modern computing power. However, simulation has the benefit that it works well even when you have small samples or imbalanced data.
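For orientation, the whole four-step flow can be sketched as follows. The toy_survey data is hypothetical, and the specific choices here (permutation as the generate type, 500 replicates, a difference in proportions as the statistic, and the order of the groups) are assumptions for illustration rather than the settings used in the video.

```r
library(infer)

# Hypothetical toy data; these are not the real survey counts.
toy_survey <- tibble::tibble(
  hobbyist = factor(rep(c("Yes", "No", "Yes", "No"), times = c(30, 10, 20, 20))),
  age_cat = factor(rep(c("Under 30", "At least 30"), times = c(40, 40)))
)

set.seed(42)  # make the simulation reproducible

null_distn <- toy_survey %>%
  # 1. Pick the response and explanatory variables.
  specify(hobbyist ~ age_cat, success = "Yes") %>%
  # 2. Declare the type of null hypothesis.
  hypothesize(null = "independence") %>%
  # 3. Simulate datasets that reflect the null hypothesis.
  generate(reps = 500, type = "permute") %>%
  # 4. Compute the test statistic on each simulated dataset.
  calculate(stat = "diff in props", order = c("At least 30", "Under 30"))

print(null_distn)
```

The result has one row per simulated dataset, so the statistic column can be treated as a null distribution.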

7. Specifying the variables of interest

The Stack Overflow survey contains many variables, but for this test, we only care about two of them. specify works like dplyr's select, returning only the response and explanatory columns.

8. specify()

To call specify, you pipe from the dataset, and pass a formula with the response on the left and the explanatory variable on the right. In the one-sample case, use NULL in place of the explanatory variable. Just as when you called prop_test, you need to tell specify which response value is regarded as a success. specify returns a tibble with attributes noting the response and explanatory variable names.
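Here's a minimal sketch of both forms of the call, again on hypothetical toy data with assumed column names hobbyist and age_cat.

```r
library(infer)

# Hypothetical toy data; these are not the real survey counts.
toy_survey <- tibble::tibble(
  hobbyist = factor(rep(c("Yes", "No", "Yes", "No"), times = c(30, 10, 20, 20))),
  age_cat = factor(rep(c("Under 30", "At least 30"), times = c(40, 40))),
  other_column = seq_len(80)  # a column the test does not need
)

# Two-sample case: response on the left, explanatory variable on the right.
specified <- toy_survey %>%
  specify(hobbyist ~ age_cat, success = "Yes")

# One-sample case: NULL takes the place of the explanatory variable.
one_sample <- toy_survey %>%
  specify(hobbyist ~ NULL, success = "Yes")

# Only the variables named in the formula are kept.
print(names(specified))
print(names(one_sample))

# The response variable is recorded as an attribute of the result.
print(attr(specified, "response"))
```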

9. hypothesize()

hypothesize declares the type of null hypothesis. Proportion tests are a special case of the chi-square independence test, so we choose "independence". hypothesize doesn't calculate anything; it simply adds another attribute to the dataset.
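A short sketch of this step on hypothetical toy data shows that the rows pass through unchanged and only the metadata grows. Reading the "null" attribute directly is an inspection trick assumed here for illustration, not something the video relies on.

```r
library(infer)

# Hypothetical toy data; these are not the real survey counts.
toy_survey <- tibble::tibble(
  hobbyist = factor(rep(c("Yes", "No", "Yes", "No"), times = c(30, 10, 20, 20))),
  age_cat = factor(rep(c("Under 30", "At least 30"), times = c(40, 40)))
)

specified <- toy_survey %>%
  specify(hobbyist ~ age_cat, success = "Yes")

hypothesized <- specified %>%
  hypothesize(null = "independence")

# The data itself is untouched: same number of rows, same columns.
print(dim(specified))
print(dim(hypothesized))

# The declared null hypothesis type is stored as an attribute.
print(attr(hypothesized, "null"))
```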

10. Let's practice!

Let's begin the infer pipeline.
