Exercise

Common Issues in Interpretation III: Outliers

Outliers are another common source of causal misinterpretation and can confound results. An outlier is an observation whose parameter or outcome value lies far from those of the other observations. In this exercise, we will learn how to identify and handle outliers.
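
To make the definition concrete, here is a minimal sketch of how a single distant observation can pull a group mean away from the typical value. The prices are made up for illustration and are not taken from the eGulf experiment.

```python
from statistics import mean, median

# Hypothetical sales prices in dollars (illustrative only, not eGulf data).
prices = [200, 205, 210, 215, 220]

# The same sample with one extreme observation added.
with_outlier = prices + [2000]

# Without the extreme value, the mean and median agree.
print(mean(prices), median(prices))

# With it, the mean jumps while the median barely moves.
print(mean(with_outlier), median(with_outlier))
```

Because the mean is sensitive to extreme values while the median is not, a large gap between the two is often a first hint that outliers are present.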

In a previous exercise, we examined an experiment conducted by the popular online auctioneer, eGulf, where a random sample of dedicated sellers of used WePhones was drawn. This sample only included eGulf sellers who typically post 10 pictures of their used WePhones. As an experimental treatment, eGulf temporarily allowed some of these sellers to post up to 15 pictures for each auction of their WePhones, and measured whether posting more than 10 pictures affected final WePhone sales prices. In this exercise, eGulf conducted their experiment again. Their sample size seems sufficiently large (N=350), and the mean sales price in the treatment group also seems substantially higher than in the control group. But is it? Follow the code in the workspace to identify and correct for outliers.

Instructions
  • Create boxplots of the final sales prices for the treatment and control groups.
  • Identify and remove the outliers from the sample.
  • Examine the mean difference between the treatment and control groups.
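
The workspace code is not reproduced here, so the following is only a sketch of the three steps in Python, using made-up prices rather than the eGulf sample. It assumes the common 1.5 × IQR rule (Tukey's fences), which is the convention most boxplot implementations use to draw individual points beyond the whiskers. The helper names `tukey_fences` and `drop_outliers` are hypothetical, and plotting is omitted so the example needs only the standard library.

```python
from statistics import mean, quantiles

def tukey_fences(values, k=1.5):
    """Return the (lower, upper) cutoffs a boxplot typically uses for outliers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outliers(values, k=1.5):
    """Keep only the observations that fall inside the fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if low <= v <= high]

# Hypothetical final sales prices in dollars (illustrative only).
control = [195, 200, 205, 210, 215, 220, 225]
treatment = [200, 205, 210, 215, 220, 225, 1500]

# Mean difference before handling outliers.
raw_diff = mean(treatment) - mean(control)

# Remove outliers within each group, then recompute the difference.
clean_control = drop_outliers(control)
clean_treatment = drop_outliers(treatment)
clean_diff = mean(clean_treatment) - mean(clean_control)

print("fences (treatment):", tukey_fences(treatment))
print("raw mean difference:", round(raw_diff, 2))
print("mean difference without outliers:", round(clean_diff, 2))
```

In this toy sample a single extreme sale accounts for almost all of the apparent treatment effect. Quartile conventions differ slightly between tools, so the exact fences may not match those produced by another environment's boxplot function.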