Exercise

# Calculating the F statistic

Now that we have calculated the between-groups and the within-groups variability, we can calculate their ratio. The ratio of between-groups to within-groups variability produces an F statistic. See the following formula: $$F = \frac{\text{between-groups variability}}{\text{within-groups variability}}$$

The F statistic becomes larger if the between-groups variability rises while the within-groups variability stays the same. It becomes smaller if the within-groups variability rises while the between-groups variability stays the same. The F statistic follows an F sampling distribution. When the null hypothesis is true, this distribution is approximately centered around F = 1. The larger the F statistic, the stronger the evidence against the null hypothesis.
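To make the behaviour of the ratio concrete, here is a minimal sketch in R. The variability values and variable names are hypothetical, chosen only for illustration, and are not the values from this exercise.

```r
# Hypothetical variability values, chosen only for illustration
between_var <- 12
within_var  <- 4

# Baseline ratio: 12 / 4
f_base <- between_var / within_var

# Larger between-groups variability, same within-groups variability: 24 / 4
f_more_between <- (between_var * 2) / within_var

# Larger within-groups variability, same between-groups variability: 12 / 8
f_more_within <- between_var / (within_var * 2)

print(c(f_base, f_more_between, f_more_within))
```

Doubling the numerator doubles the ratio, while doubling the denominator halves it, which matches the description above.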

The F distribution has two different degrees of freedom: df1 and df2. The formula for df1 is \(df1 = g - 1\), where g is the number of groups. The formula for df2 is \(df2 = N - g\), where N is the sample size of all groups combined and g is the number of groups. These degrees of freedom come in handy when we want to calculate a p value for our obtained F statistic. To calculate a p value for our F statistic, we can use the `pf()` function. This function works similarly to the `pnorm()` and `pbinom()` functions that you may have come across in the course on basic statistics. The `pf()` function takes our F statistic as its first argument, our df1 as its second argument and our df2 as its third argument.
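As an illustration of the degrees of freedom and the `pf()` call, the following sketch uses a hypothetical design (3 groups, 30 observations in total) and a hypothetical F statistic of 3.5. None of these numbers or variable names come from the exercise. Note that `pf()` returns the lower-tail probability by default, and its `lower.tail` argument switches to the upper tail.

```r
# Hypothetical design, for illustration only: 3 groups, 30 observations in total
g <- 3
N <- 30
f_example <- 3.5  # hypothetical F statistic

# Degrees of freedom following df1 = g - 1 and df2 = N - g
df1_example <- g - 1
df2_example <- N - g

# pf() gives the lower tail P(F <= f) by default;
# lower.tail = FALSE gives the upper tail P(F > f)
p_lower <- pf(f_example, df1_example, df2_example)
p_upper <- pf(f_example, df1_example, df2_example, lower.tail = FALSE)

# The two tails are complementary and sum to 1
print(c(lower = p_lower, upper = p_upper))
```

Because large F values count as evidence against the null hypothesis, the upper-tail probability is the one that serves as the p value here.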

Instructions

- The variables `between_group_variance` and `within_group_variance` are available in your console. Use these variables to calculate the F statistic and store the result in a variable called `f_stat`. Round the result to two digits.
- Calculate the degrees of freedom df1 and df2 and store them in the variables `df1` and `df2`.
- Using the `pf()` function, calculate the p value and store this in the variable `p_value`. Round the result to two digits. Make sure to calculate the p value associated with the upper tail.