Exercise

# Handle outliers with winsorization

You are given a `basetable` with two variables: `"sum_donations"` and `"donor_id"`. `"sum_donations"` can contain outliers when donors have donated exceptional amounts. Therefore, you want to winsorize this variable such that the 5% highest amounts are replaced by the 95th percentile value.

Instructions

- Print the minimum value of `sum_donations` and verify that it is at least 0. Then print the maximum value of `sum_donations`.
- Fill out the appropriate lower limit percentile. As all values higher than 0 are realistic and occur often, it is not necessary to replace values lower than the lower limit percentile value.
- Create a new variable `sum_donations_winsorized` that is a winsorized version of the `sum_donations` variable.
- Print the maximum value of `sum_donations_winsorized`.
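
The steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution. It assumes that `winsorize` from `scipy.stats.mstats` is the intended function (the exercise text does not name one), and it builds a small synthetic `basetable` because the real data is not shown here. The `lower_limit` name and the donation amounts are made up for the example.

```python
import numpy as np
import pandas as pd
from scipy.stats.mstats import winsorize

# Hypothetical stand-in for the exercise's basetable:
# 95 ordinary donation totals plus 5 exceptional ones.
rng = np.random.default_rng(0)
ordinary = rng.uniform(0, 100, size=95)
exceptional = np.array([500.0, 1000.0, 2000.0, 5000.0, 10000.0])
basetable = pd.DataFrame({
    "donor_id": np.arange(1, 101),
    "sum_donations": np.concatenate([ordinary, exceptional]),
})

# Step 1: inspect the range before winsorizing.
print(basetable["sum_donations"].min())
print(basetable["sum_donations"].max())

# Step 2: nothing is replaced at the low end, so the lower limit is 0.
lower_limit = 0

# Step 3: replace the 5% highest values; limits=[lower, upper] are
# fractions of observations to winsorize at each end.
winsorized = winsorize(
    basetable["sum_donations"].to_numpy(),
    limits=[lower_limit, 0.05],
)
# winsorize returns a masked array; convert it to a plain ndarray.
basetable["sum_donations_winsorized"] = np.asarray(winsorized)

# Step 4: the maximum is now capped at the largest non-replaced value.
print(basetable["sum_donations_winsorized"].max())
```

With a lower limit of 0, the minimum of the winsorized column is the same as in the original column; only the top 5% of values change.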