Expected Frequencies

We can also check for independence or dependence in the data using frequencies rather than probabilities. In our chi-square analysis we compare the observed values to the values expected under the null hypothesis. One way to calculate the expected frequency of a cell is \((\text{row marginal frequency} \times \text{column marginal frequency}) / \text{sample size}\).
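To see where this formula comes from, it may help to write it in symbols. The notation here is our own, not the course's: \(R_i\) is the total of row \(i\), \(C_j\) is the total of column \(j\), and \(N\) is the sample size. Under independence, the probability of falling in cell \((i, j)\) is the product of the row and column probabilities, and multiplying by \(N\) turns that probability into a frequency:

\[
E_{ij} = N \times \frac{R_i}{N} \times \frac{C_j}{N} = \frac{R_i \, C_j}{N}
\]

For instance, with a hypothetical row total of 60, a column total of 40, and a sample size of 200, the expected frequency is \((60 \times 40) / 200 = 12\).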

Here is an example of what we would tell R to find the expected frequency for the top left cell: (sum(data[1,]) * sum(data[,1]))/sum(data). Here, data[1,] refers to the first row of data, and data[,1] refers to the first column of data. So we are telling R to "take the sum of the first row, multiply it by the sum of the first column, and divide this by the total sample size". We can then input this into a new table of expected values!
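As a minimal sketch of that calculation, the snippet below builds a small made-up table of counts and computes the expected frequency for its top left cell. The numbers are hypothetical and are not the course data; the object is named data only to match the expression in the text, and the helper names row_total, col_total, and sample_size are ours.

```r
# Hypothetical 3 x 3 table of observed counts (NOT the course data)
data <- matrix(c(20, 10, 30,
                  5, 25, 10,
                 15, 45, 40),
               nrow = 3, byrow = TRUE)

row_total   <- sum(data[1, ])  # first row:    20 + 10 + 30 = 60
col_total   <- sum(data[, 1])  # first column: 20 +  5 + 15 = 40
sample_size <- sum(data)       # all nine cells: 200

# Expected frequency for the top left cell under independence
expected_top_left <- (row_total * col_total) / sample_size
expected_top_left  # 12
```

The same expression with different indices, for example data[2,] and data[,3], would give the expected frequency for any other cell.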

We have another loop to find the expected values for us. See if you can fill in the last line.

Have a look at the loop in your script. It first makes an empty data frame called expDat to hold the expected values, then loops three times. First, i = 1, so positions [1,1], [1,2], and [1,3] in expDat take their expected values based on the frequencies from data.
When i = 2, the second row positions [2,1], [2,2], [2,3] are filled.
In your script, write the code that adds the correct expected value for the third column.
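Once your loop runs, one way to check the table it produces is to compare it with a vectorised calculation. This is a sketch rather than the course's solution. It uses a hypothetical table of counts (not the course data), and the names expected_outer and expected_builtin are ours. outer() multiplies every row total by every column total, and chisq.test() stores its own expected frequencies in the $expected element.

```r
# Hypothetical 3 x 3 table of observed counts (NOT the course data)
data <- matrix(c(20, 10, 30,
                  5, 25, 10,
                 15, 45, 40),
               nrow = 3, byrow = TRUE)

# Every row total times every column total, divided by the sample size
expected_outer <- outer(rowSums(data), colSums(data)) / sum(data)

# Expected frequencies as computed by R's built-in chi-square test
expected_builtin <- chisq.test(data)$expected

# Compare the two tables, ignoring names and other attributes
all.equal(expected_outer, expected_builtin, check.attributes = FALSE)
```

If the table filled in by your loop matches either of these, converted with as.matrix() if it is a data frame, the loop is doing what the formula asks.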
