1. P-values, alpha, and errors
In this video, we'll deepen our understanding of p-values, alpha levels, and experimental errors. This will prepare us for the next video, where we'll tackle a key concept in experimental design called power analysis!
2. P-values and alpha
P-values and alpha can be viewed as a game. Think of conducting a scientific experiment where we are trying to determine whether a certain strategy (our hypothesis) leads to winning (or a significant result) more often than just by chance.
P-values tell us how likely it is to observe data at least as extreme as ours if the null hypothesis were true. That is, they serve as the scoreboard of the game.
Setting an alpha level, often 0.05, allows us to determine the threshold at which we consider our results statistically significant, akin to setting the rules of a game before playing. Alpha is like establishing a rule for what counts as a "remarkable" win in this game.
If our p-value is below this alpha level, it's as if we've achieved a high score or a remarkable performance in the game, leading us to conclude that our strategy (the alternative hypothesis) might indeed be effective, and it's not just the luck of the draw.
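The decision rule described above can be sketched in a few lines of Python. The function name `is_significant` and the example p-values are hypothetical, chosen only to illustrate the comparison against alpha; they do not come from any real experiment.

```python
def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the p-value falls strictly below the alpha threshold."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return p_value < alpha


# Hypothetical p-values, invented purely for illustration.
for p in (0.003, 0.04, 0.20):
    verdict = "reject the null" if is_significant(p) else "fail to reject the null"
    print(f"p = {p:.3f} -> {verdict}")
```

Note that alpha is fixed before looking at the data, just as the rules of a game are set before playing.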
3. The dataset: crop yields
We'll work with a dataset of crop yields from different fields, where each field was treated with either organic or synthetic fertilizer. Our goal is to analyze this data to determine if there's a significant difference in crop yields between the two fertilizer types.
4. Visualizing the data
It's helpful to visualize the crop yields for each fertilizer type. By plotting the kernel density estimates (KDE), we get a sense of how the two fertilizers might differ in terms of their effect on crop yields and whether there's an overlap between their effects.
It appears that Organic tends to produce a higher yield than Synthetic, with some overlap between the two distributions.
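A plot like this could be produced roughly as follows. This is a minimal sketch, not the course's own code: the yields are simulated stand-ins (the means, spread, and sample size are assumptions), and it uses `scipy.stats.gaussian_kde` with matplotlib, whereas a plotting library such as seaborn would work just as well.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)

# Simulated stand-in data; these values are assumptions, not the real dataset.
organic = rng.normal(loc=55.0, scale=5.0, size=200)
synthetic = rng.normal(loc=52.0, scale=5.0, size=200)

# Evaluate both density estimates on a shared grid so the curves are comparable.
grid = np.linspace(
    min(organic.min(), synthetic.min()),
    max(organic.max(), synthetic.max()),
    300,
)

fig, ax = plt.subplots()
for label, values in (("Organic", organic), ("Synthetic", synthetic)):
    density = gaussian_kde(values)(grid)  # smooth estimate of the distribution
    ax.plot(grid, density, label=label)

ax.set_xlabel("Crop yield")
ax.set_ylabel("Estimated density")
ax.legend()
# Call plt.show() in an interactive session, or fig.savefig(...) in a script.
```

The region where the two curves overlap corresponds to yields that either fertilizer could plausibly produce.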
5. Conducting an independent samples t-test
We set our alpha to the standard five-percent level.
To compare the effectiveness of organic versus synthetic fertilizers, we perform a t-test on the crop yields from the two groups.
The p-value is smaller than alpha, suggesting that fertilizer type has a statistically significant impact on crop yield.
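The test described here can be sketched with `scipy.stats.ttest_ind`, which runs an independent samples t-test. The two arrays below are simulated stand-ins for the organic and synthetic yields (their means and spread are assumptions), so the numbers will not match the real dataset.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated stand-in data; these values are assumptions, not the real dataset.
organic = rng.normal(loc=55.0, scale=5.0, size=200)
synthetic = rng.normal(loc=52.0, scale=5.0, size=200)

alpha = 0.05  # significance level, fixed before running the test

# Independent samples t-test (assumes equal variances by default).
t_stat, p_value = stats.ttest_ind(organic, synthetic)

print(f"t-statistic: {t_stat:.3f}, p-value: {p_value:.4g}")
if p_value < alpha:
    print("Reject the null: the mean yields differ significantly.")
else:
    print("Fail to reject the null: no significant difference detected.")
```

If the two groups may have unequal variances, passing `equal_var=False` switches to Welch's t-test.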
6. Exploring experimental errors
In experimental design, we encounter two main types of errors.
Type I errors occur when we incorrectly reject a true null hypothesis, akin to a false alarm.
Type II errors happen when we fail to reject a false null hypothesis, similar to a missed detection.
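Both error types can be made concrete with a small simulation. The sketch below repeatedly runs a t-test on simulated data, first when there is truly no difference (any rejection is a Type I error) and then when a real difference exists (any failure to reject is a Type II error). The helper `rejection_rate` and all of its parameter values are hypothetical choices for illustration.

```python
import numpy as np
from scipy import stats


def rejection_rate(true_diff, n=30, sd=5.0, alpha=0.05, n_sims=2000, seed=1):
    """Fraction of simulated experiments in which the t-test rejects the null."""
    rng = np.random.default_rng(seed)
    # Each row is one simulated experiment with n observations per group.
    group_a = rng.normal(50.0, sd, size=(n_sims, n))
    group_b = rng.normal(50.0 + true_diff, sd, size=(n_sims, n))
    _, p_values = stats.ttest_ind(group_a, group_b, axis=1)
    return float(np.mean(p_values < alpha))


# Null is true (no difference): every rejection is a false alarm.
type_i_rate = rejection_rate(true_diff=0.0)

# Null is false (real difference of 3): every non-rejection is a missed detection.
type_ii_rate = 1.0 - rejection_rate(true_diff=3.0)

print(f"Estimated Type I error rate:  {type_i_rate:.3f}")
print(f"Estimated Type II error rate: {type_ii_rate:.3f}")
```

Under these assumptions the estimated Type I rate should land close to the chosen alpha, while the Type II rate depends on the effect size, the noise, and the sample size.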
7. More on alpha
Alpha, or the significance level, is crucial in hypothesis testing; it is the probability of a Type I error, that is, of rejecting a null hypothesis that is actually true.
Common alpha levels include 0.05, 0.01, and 0.10, representing risks of 5%, 1%, and 10%, respectively, for such errors.
Selecting an alpha hinges on the study's context and on balancing the risk of a Type I error against the risk of overlooking a true effect, known as a Type II error. The choice should align with the study's goals and the implications of potential errors.
Conventionally, 0.05 is the standard for statistical significance across many disciplines. For more rigorous scrutiny, particularly where the cost of a Type I error is high, an alpha of 0.01 is preferred. In preliminary studies, where a higher error tolerance is permissible, an alpha of 0.10 may be utilized, allowing for a broader exploration of potential effects with subsequent validation through more stringent testing.
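The trade-off between the two error types across these alpha levels can be illustrated with a simulation. This is a sketch under stated assumptions: the group means, standard deviation, sample size, and number of simulated experiments are all invented for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_sims, n = 2000, 30  # simulated experiments, observations per group

# Assumed populations: "same" matches the control, "shifted" has a real effect.
control = rng.normal(50.0, 5.0, size=(n_sims, n))
same = rng.normal(50.0, 5.0, size=(n_sims, n))
shifted = rng.normal(53.0, 5.0, size=(n_sims, n))

_, p_null = stats.ttest_ind(control, same, axis=1)       # null is true
_, p_effect = stats.ttest_ind(control, shifted, axis=1)  # null is false

results = {}
for alpha in (0.01, 0.05, 0.10):
    type_i = float(np.mean(p_null < alpha))      # false alarms
    type_ii = float(np.mean(p_effect >= alpha))  # missed detections
    results[alpha] = (type_i, type_ii)
    print(f"alpha = {alpha:.2f}: Type I = {type_i:.3f}, Type II = {type_ii:.3f}")
```

Because the same p-values are compared against each threshold, a stricter alpha can only lower the Type I rate and raise the Type II rate, which is the balance described above.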
8. Let's practice!
Time for some exercises!