Hypothesis Testing

1. Hypothesis Testing

Now we ease into NULL hypothesis testing. It can be intimidating, and even I get mixed up sometimes. Be patient; it will come together.

2. Anatomy of hypothesis testing

At the heart of hypothesis testing is the hypothesis. Let's define what a hypothesis means in stats. Simply put, a hypothesis is a *testable* claim. Stating that the price of a Ferrari is high isn't testable, because "high" isn't defined. To make it testable, you could state a hypothesis such as: the average Ferrari price is higher than the average sports car price.

3. Null Vs Alternate Testable Hypotheses

Let's talk about two types of hypotheses: the NULL & the alternate. The NULL hypothesis represents the status quo or the accepted fact. A handy way to remember it is to think of NULL as short for NULLIFY, because your statistical test seeks to nullify, or reject, the statement. The alternate hypothesis is the challenger statement, meaning everything else not represented in the NULL hypothesis. Generally H0 denotes the NULL while H1 denotes the alternate. In our example, the NULL is that the average Ferrari price is equal to the average sports car price. The challenger, the alternate hypothesis, is that the average Ferrari price is greater or less than the average sports car price. The status quo is that there is no difference, but you want to test if there is a difference.
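In symbols, the two hypotheses for the Ferrari example could be written as below. The Greek letter mu for a population average is notation assumed here for illustration; it is not used elsewhere in this lesson.

```latex
H_0:\ \mu_{\text{Ferrari}} = \mu_{\text{sports car}}
\qquad
H_1:\ \mu_{\text{Ferrari}} \neq \mu_{\text{sports car}}
```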

4. Common sense testing

Taking a step back from math & the technical execution of a hypothesis test, suppose you wanted to examine the question of sports car prices given the NULL & alternate. You sampled 50 Ferraris to find an average of $252K. Another 50 non-Ferraris average $85K. With that information alone, would you agree that the status quo, H0, is correct? There is a large difference between the two averages, so it's more likely you would not believe the prices are equal. In stats lingo you REJECT the NULL hypothesis. The sample data doesn't support it.

5. Another common sense test

This time, the NULL hypothesis is that the average Toyota price is equal to the average Honda price. The status quo, H0, is that pricing would be equal. Sampling 50 of each gives averages of $23,845 & $23,720. The values are so close, you are thinking the difference may only be from sampling. So you would stick with the NULL hypothesis. In stats, you don't say "accept"; you say "fail to reject the NULL hypothesis." That is, the status quo holds up given your data.

6. Removing subjectivity in a test

In the previous tests you made a judgment call to "reject" or "fail to reject" H0. To further avoid subjectivity, hypothesis testing uses a test statistic to measure how well the data agree with H0. The car price experiment compares the averages of two independent samples, so you use a t-test. The t-test produces a p-value: the probability of seeing a difference at least as large as the one in your samples if H0 were true. You have to predetermine a p-value cutoff for your test. The cutoff, say 1% or 5%, is the chance you are willing to accept of rejecting H0 when it is actually true. Use a cutoff of 5% in most cases. If the p-value is less than the cutoff, 5%, then you will "REJECT the NULL hypothesis" and conclude there *is* a difference between the samples. Let me repeat that: if the p-value from your test is less than .05, then you reject the NULL hypothesis.
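The decision rule is mechanical once the cutoff is fixed, which a few lines of Python can show. This is only a sketch: the function name `decide` and the example p-values are made up for illustration and do not come from the car price data.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with a predetermined cutoff (alpha)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Strictly below the cutoff: the data are unlikely under H0.
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# Hypothetical p-values, chosen only to show both outcomes.
print(decide(0.003))             # below the 5% cutoff
print(decide(0.41))              # above the 5% cutoff
print(decide(0.03, alpha=0.01))  # a stricter 1% cutoff changes the call
```

Notice that the same p-value of 0.03 leads to different decisions under a 5% and a 1% cutoff, which is why the cutoff has to be chosen before looking at the result.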

7. T-Tests in Google Sheets

In Sheets, calculate the p-value with T.TEST. It accepts range1 and range2, followed by the number of tails being tested & the type of test. If you're testing whether the difference is *only* greater than or *only* less than, the T.TEST has 1 tail. In our case, the H0 operator is equals, so one sample's average can be above *or* below the other's. Thus the test has 2 tails, one in each direction. Next, if you measure the same observations at different times, the type equals 1. If you measure different observations with the same variance, the type is 2. And to be the most strict, if you measure different observations with different variances, then use type 3.
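To see what the three types change, here is a rough Python sketch of the textbook t statistic usually associated with each one: paired for type 1, pooled variance for type 2, and Welch's unequal-variance form for type 3. The function names and the tiny price lists are hypothetical, and the sketch stops at the t statistic; turning it into a p-value requires the t distribution, which is the part the spreadsheet function handles for you.

```python
from math import sqrt
from statistics import mean, stdev, variance


def t_paired(first, second):
    """Type 1: the same observations measured at two different times."""
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def t_equal_variance(x, y):
    """Type 2: different observations, variances assumed equal (pooled)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))


def t_unequal_variance(x, y):
    """Type 3: different observations, variances not assumed equal (Welch)."""
    nx, ny = len(x), len(y)
    return (mean(x) - mean(y)) / sqrt(variance(x) / nx + variance(y) / ny)


# Made-up prices in thousands of dollars, for illustration only.
toyota = [23.1, 24.0, 23.9, 24.4, 23.6, 23.8]
honda = [23.5, 23.8, 23.2, 24.1]

print("type 2 t statistic:", t_equal_variance(toyota, honda))
print("type 3 t statistic:", t_unequal_variance(toyota, honda))
```

When both samples have the same number of observations, the type 2 and type 3 statistics come out identical; they differ only in the degrees of freedom used to look up the p-value. With unequal sample sizes, as in the made-up lists above, the two statistics can differ as well.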

8. Let's practice!

Go slowly to make sure hypothesis testing sinks in.