Marginal Probabilities
To check whether person group and ad type are independent, we can find the expected probabilities under the null hypothesis (i.e. what you would expect to see if the variables are indeed independent), and then compare these values to your actual observations.
The observed frequencies from your advertising experiment are still saved in your console as data. To find the observed probabilities, we divide each value by the total number of observations, which is 76 here. We can do this by entering data/76 into the console.
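As a rough sketch of this step, the code below divides a table of counts by its total. The counts in obs are invented for illustration and are not the course's data; they were only chosen so that they sum to 76. The names obs, total and obsProb are likewise assumptions.

```r
# Hypothetical 3x3 table of counts (invented for illustration, NOT the course data).
# Rows stand for person groups, columns for ad types; the counts sum to 76.
obs <- matrix(c(10,  9,  5,
                 8, 12,  9,
                 6,  7, 10),
              nrow = 3, byrow = TRUE,
              dimnames = list(group = c("g1", "g2", "g3"),
                              ad    = c("a1", "a2", "a3")))

# Total number of observations
total <- sum(obs)

# Observed probability of each cell: count divided by the total
obsProb <- obs / total
obsProb
```

Dividing by sum(obs) rather than a typed-in 76 gives the same result here and keeps working if the counts change.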
To find the expected probabilities for each cell, we need to find the marginal probabilities for each row and each column, and then, for each cell of our data table, multiply its row's marginal probability by its column's marginal probability.
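One way to picture this calculation is the sketch below, which uses R's outer() to multiply every row marginal by every column marginal in one step. This is a vectorised alternative shown for comparison, not the loop used in the exercise. The counts in obs are invented for illustration and are not the course's data.

```r
# Hypothetical 3x3 table of counts (invented for illustration, NOT the course data)
obs <- matrix(c(10,  9,  5,
                 8, 12,  9,
                 6,  7, 10),
              nrow = 3, byrow = TRUE,
              dimnames = list(group = c("g1", "g2", "g3"),
                              ad    = c("a1", "a2", "a3")))

# Marginal probabilities: row and column totals divided by the grand total
margrow <- rowSums(obs) / sum(obs)
margcol <- colSums(obs) / sum(obs)

# Expected joint probability under independence:
# cell [r, c] = margrow[r] * margcol[c]
expProb <- outer(margrow, margcol)
expProb
```

Because the marginal probabilities each sum to 1, the expected probabilities also sum to 1 across the whole table.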
We can do this manually - but why would we do that when we have R to help us!? We're going to have a go at using a loop that does this for us. A loop works by repeating the same task as many times as you ask it to. In our case we can loop our expected probability calculation, so that it runs through each cell of our contingency table.
Don't worry if it seems a little daunting - the loop is just a tool for doing the calculations quickly. You just need to understand what the loop is doing, not how it works!
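If loops are new to you, the toy example below may help. It is unrelated to the advertising data, and the vector name squares is made up for illustration. The loop repeats the same task three times, with i taking the values 1, 2 and 3 in turn.

```r
# Toy example (hypothetical, unrelated to the course data)
# Start with a vector of three zeros to hold the results
squares <- numeric(3)

# Repeat the same task for i = 1, 2, 3: store i squared in position i
for (i in 1:3) {
  squares[i] <- i^2
}

squares
# [1] 1 4 9
```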
This exercise is part of the course
Inferential Statistics
Exercise instructions
- We have already calculated the marginal probabilities and saved them as margcol and margrow, and made an empty data frame called expProb to hold the expected probabilities.
- Have a look at the loop in your script. It runs three times, each time inserting expected probabilities into expProb. First i = 1, so positions [1,1], [2,1], and [3,1] in expProb should take their expected values based on the marginal probabilities in margcol and margrow.
- When i = 2, the second-column positions [1,2], [2,2], and [3,2] are filled.
- In your script, write the code for adding the correct expected value for the third row.
- In your script, add a line of code to print the observed probabilities.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Calculate marginal probabilities
margcol <- colSums(data) / sum(data)
margrow <- rowSums(data) / sum(data)
# Empty data frame for holding expected probabilities
expProb <- data.frame()
# Loop over the columns to fill in the data frame
for (i in 1:3){
  # Set row 1, column i to the expected joint probability based on the marginal probabilities
  expProb[1,i] <- (margcol[i] * margrow[1])
  # Set row 2, column i to the expected joint probability based on the marginal probabilities
  expProb[2,i] <- (margcol[i] * margrow[2])
  # Add code to set row 3, column i to the expected joint probability based on the marginal probabilities
  expProb[3,i] <-
}
# Print expected probabilities
expProb
# Print observed probabilities