Exercise

Practice Using Instrumental Variables: CreditCo

If you took our Experiments course, you assessed the validity of a natural experiment using data from CreditCo. Let's revisit that analysis, framing it in the logic of instrumental variables.

As you may recall, CreditCo mailed offers to increase its customers' credit limits. We are worried that the customers who took up the offer did not do so at random, which creates an endogeneity problem. We proposed to solve this problem by using rainy weather on the day the credit offers were delivered as an instrument for treatment. Specifically, you noticed that heavy storms across the country hit half the zip codes in the treatment group, while the other half of customers experienced sunny weather the day the offer arrived. This variation in the weather may have made customers in the rainy zones feel depressed and therefore ignore the offer, while those in the sunny zones felt cheerful and, because of their good mood, took up the offer. Because the weather is out of anyone's control, we can argue it is exogenous to any other factors related to credit card offers, and it also seems to be the exclusive source of variation between the treatment and control groups. Let's try using IV analysis to see if it provides a convincing causal explanation.

Included in the workspace is a dataset from CreditCo. The data are a sample of CreditCo customers who were offered the credit limit increase. The stormy weather gives us two candidate instruments, the variables windy and rainy. Let's see which one we can use as an instrument for opt_in, which we suspect is endogenous to our outcomes of interest.

Instructions
  • 1) Explore the dataset CreditCo on your own with basic tools like head(), str(), and summary().
  • 2) Using a linear regression, compute the correlation between opt_in and windy.
  • 3) Test whether windy satisfies the relevance assumption (i.e. whether its correlation with opt_in is significantly different from zero).
  • 4) Using a linear regression, compute the correlation between opt_in and rainy.
  • 5) Test whether rainy satisfies the relevance assumption (i.e. whether its correlation with opt_in is significantly different from zero).
  • 6) What is the sign of the correlation between rainy and opt_in?
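
The relevance checks in steps 2 through 6 could be sketched roughly as follows. This is a minimal illustration on simulated data, not the CreditCo dataset: the data frame sim_credit, the object names first_stage_windy and first_stage_rainy, and the opt-in probabilities are all assumptions made up for the example. The simulation assumes rain lowers take-up and wind has no effect, so it says nothing about what the real data will show.

```r
# Simulated stand-in for the CreditCo data (hypothetical, NOT the real dataset).
set.seed(42)
n <- 2000
rainy <- rbinom(n, 1, 0.5)   # 1 = rain on the delivery day
windy <- rbinom(n, 1, 0.5)   # assumed here to be unrelated to opt_in
# Assumed take-up rates: 30% in rainy zones, 60% in sunny zones.
opt_in <- rbinom(n, 1, ifelse(rainy == 1, 0.3, 0.6))
sim_credit <- data.frame(opt_in, rainy, windy)

# Regress the suspected endogenous variable on each candidate instrument.
first_stage_windy <- lm(opt_in ~ windy, data = sim_credit)
first_stage_rainy <- lm(opt_in ~ rainy, data = sim_credit)

# Relevance check: is the slope on the instrument significantly different from zero?
windy_row <- coef(summary(first_stage_windy))["windy", ]
rainy_row <- coef(summary(first_stage_rainy))["rainy", ]

print(windy_row[c("Estimate", "Pr(>|t|)")])
print(rainy_row[c("Estimate", "Pr(>|t|)")])

# The sign of the slope gives the direction of the relationship.
print(sign(rainy_row["Estimate"]))
```

With the real data, the same pattern would apply by replacing sim_credit with CreditCo: a candidate instrument passes the relevance check when the p-value on its slope is small, and the sign of the slope answers step 6.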