Exercise

# Selling Old Phones: Identifying Imbalance

Now that we are familiar with Laurel's WePhones dataset, we need to do some balance checks between the treatment and control groups. To do that, let’s try something clever: let’s start by assuming our variables are balanced across the groups, and then look for any variables that are statistically *unbalanced*.

We’ll use a t-test to measure the difference in means for a variable between the two groups, and we’ll use its p-value to check whether any difference is statistically significant. The t-test is useful because it accounts for the variation within each group, and the p-value tells us how likely it is that a difference at least this large would arise by chance alone if the groups were truly balanced. Let’s use the standard assumption that we’ll believe our result if it has less than a 5% chance of being randomly generated. In this case, we declare a variable to be imbalanced if it shows a difference between groups and has a p-value smaller than .05. If there is a difference but the p-value is larger than .05, we’ll say that any imbalance is not statistically significant and we’ll ignore it.
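
As a rough illustration of that decision rule, here is a minimal sketch in R. It runs on simulated toy data, not the WePhones dataset: the data frame `toy` and its columns `group` and `black` are invented for the example. Only `t.test` and the .05 threshold come from the text.

```r
# Toy data (assumption): a 0/1 indicator for phone color in two hypothetical groups.
set.seed(42)
toy <- data.frame(
  group = rep(c("Treatment", "Control"), each = 200),
  black = rbinom(400, size = 1, prob = 0.3)  # same true rate in both groups
)

# Welch two-sample t-test (R's default) comparing the mean of `black` by group.
result <- t.test(black ~ group, data = toy)

result$estimate  # the two group means
result$p.value   # chance of a gap at least this large if the groups were truly balanced

# Decision rule from the text: call the variable imbalanced only when p < .05.
imbalanced <- result$p.value < 0.05
imbalanced
```

Because `black` is coded 0/1, each group mean is the share of black phones in that group, so the test compares proportions between the groups.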

We’ll start by checking whether the groups are balanced on the colors of the phones, then we’ll explore the rest of the data frame.

Using the data frame `WePhones`:

Instructions

- 1) Use the `t.test` function to find out if the `Treatment` and `Control` groups have the same number of `Black` WePhones.
- 2) If that is statistically balanced, examine the data frame to find any variable that is imbalanced between the `Treatment` and `Control` groups.
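
One way to approach the second step is to run the same test on every variable and collect the p-values in one table. The sketch below assumes a stand-in data frame `phones` with invented columns (`group`, `black`, `age`, `storage`). The real `WePhones` column names are not shown here, so substitute them as needed.

```r
# Hypothetical stand-in for WePhones: column names and values are invented.
set.seed(1)
n <- 300
phones <- data.frame(
  group   = sample(c("Treatment", "Control"), n, replace = TRUE),
  black   = rbinom(n, 1, 0.4),
  age     = rnorm(n, mean = 24, sd = 6),
  storage = sample(c(16, 32, 64), n, replace = TRUE)
)

# Split rows by group once, then t-test each remaining column.
is_treated <- phones$group == "Treatment"
covariates <- setdiff(names(phones), "group")
p_values <- sapply(covariates, function(v) {
  t.test(phones[[v]][is_treated], phones[[v]][!is_treated])$p.value
})

# One row per variable, flagged when p < .05.
balance <- data.frame(
  variable   = covariates,
  p_value    = unname(p_values),
  imbalanced = unname(p_values) < 0.05
)
balance
```

When many variables are checked at the .05 level, an occasional flag can appear by chance even if the groups are truly balanced, so a single flagged variable is worth a closer look before drawing conclusions.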