Exercise

# Univariate outlier detection: the IQR rule

Outlier detection is an important step in your exploratory data analysis. Anomalous observations (also known as **outliers**), if not properly handled, can skew your analysis and produce misleading conclusions.

Box plots help visually identify potential outliers as they summarize the distribution of a numerical variable. A commonly accepted rule of thumb is that an outlier is any value below \(Q1 - 1.5\times IQR\) or above \(Q3 + 1.5\times IQR\), where \(Q1\) and \(Q3\) are the first and third quartiles, respectively, of the variable distribution and \(IQR=Q3-Q1\) is the *interquartile range*.
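
As a worked illustration of the rule, assume made-up quartiles \(Q1 = 10\) and \(Q3 = 14\) (hypothetical values, not taken from the `cars` data):

\[
IQR = Q3 - Q1 = 14 - 10 = 4, \qquad Q1 - 1.5\times IQR = 10 - 6 = 4, \qquad Q3 + 1.5\times IQR = 14 + 6 = 20
\]

Under these assumed values, an observation of 22 would be flagged as a potential outlier, while an observation of 19 would not.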

In this exercise, you will apply the IQR rule to spot outliers in car fuel consumption. The `cars` dataset is already loaded. The `quantile()` function can be used to calculate \(Q1\) and \(Q3\).
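
The following is a minimal sketch of how the IQR rule could be applied with `quantile()`. The `consumption` vector and its values are hypothetical, chosen only for illustration; they are not the exercise's `cars` data, whose column names are not shown here.

```r
# Hypothetical fuel-consumption values (illustrative only, not the `cars` data)
consumption <- c(5.9, 6.4, 6.8, 7.1, 7.3, 7.6, 8.0, 8.4, 9.1, 19.5)

# First and third quartiles (quantile() uses its default interpolation, type = 7)
q1 <- quantile(consumption, 0.25, names = FALSE)
q3 <- quantile(consumption, 0.75, names = FALSE)

# Interquartile range; base R's IQR(consumption) gives the same value
iqr <- q3 - q1

# Fences defined by the 1.5 x IQR rule of thumb
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

# Flag values that fall outside the fences
is_outlier <- consumption < lower_fence | consumption > upper_fence
consumption[is_outlier]
```

With these made-up values, only the observation 19.5 falls outside the fences. On real data, the same steps would be applied to whichever column holds the fuel consumption.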

Instructions 1/4

#### Question

Why is it important to detect and treat outliers?

##### Possible Answers

- Because they are not representative of the data distribution.
- Because they can drastically bias/change the fit estimates and predictions.
- Because they make exploratory data analysis more difficult.
- Because they tend to group together.