
Robustness to outliers

Measures of central tendency attempt to describe the middle or center point of a distribution. In the presence of outliers, or extreme values, the median is usually preferred over the mean. The reason is that every value contributes to the mean, so a single extreme value can "drag" it up or down. The median is just the middle value of the sorted data, so an outlier moves it by at most one position in the ordering and typically changes it very little, if at all.
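To make the "dragging" effect concrete, here is a minimal sketch in R. The ratings below are invented purely for illustration and are not taken from the course data; `ratings` and `ratings_outlier` are hypothetical names.

```r
# Hypothetical ratings, invented for illustration only
ratings <- c(6, 7, 7, 8, 9)

# The same ratings plus one extreme value
ratings_outlier <- c(ratings, 100)

mean(ratings)            # 7.4
mean(ratings_outlier)    # 22.83333

median(ratings)          # 7
median(ratings_outlier)  # 7.5
```

In this made-up example a single extreme value roughly triples the mean, while the median moves by only half a point.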

A person who does not like wine at all takes part in the wine ratings survey and makes a statement by giving the Shiraz the lowest possible score, zero. Let's see how this affects the mean and median of the score distribution.

This exercise is part of the course

Intro to Statistics with R: Introduction


Exercise instructions

Two objects are available to you: red_wine, which holds the original ratings, and red_wine_extreme, which contains the original ratings plus the new extreme rating.

  • Calculate the change in mean rating after adding the new extreme value. Use the mean() function and save the result to diff_mean.
  • Calculate the change in median rating after adding the new extreme value. Use the median() function and save the result to diff_median.
  • Print both differences to see which measure of central tendency is less affected by the addition of the extreme rating.
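The steps above can be sketched on stand-in data as follows. This is only an illustration under stated assumptions: it assumes the ratings are plain numeric vectors, it uses invented values and hypothetical `_demo` names rather than the course's red_wine objects, and it takes "change" to mean the new value minus the original value.

```r
# Hypothetical stand-in for the original ratings (values are invented)
ratings_demo <- c(80, 85, 85, 90, 95)

# Stand-in for the ratings after one respondent gives a score of zero
ratings_demo_extreme <- c(ratings_demo, 0)

# Change in mean: mean with the extreme value minus the original mean
diff_mean_demo <- mean(ratings_demo_extreme) - mean(ratings_demo)

# Change in median, computed the same way
diff_median_demo <- median(ratings_demo_extreme) - median(ratings_demo)

# Print both differences
print(diff_mean_demo)    # -14.5
print(diff_median_demo)  # 0
```

With these invented numbers the mean drops by 14.5 points while the median does not move. The real data may behave differently in magnitude.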

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Calculate the change in mean
diff_mean <- ___

# Calculate the change in median
diff_median <- ___

# Print both differences
