Multivariate outlier detection
One hundred people living in the same area have filed a claim because their houses were damaged by hail from Sunday night's storm. The dataset hailinsurance
contains 100 observations and 2 variables. The first column contains the payment made by the insurance company to each customer, whereas the second column contains the most recent house price.
In this exercise, you're first going to use classical estimators on the dataset. You will then compare the results with those of robust estimators.
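To see why the comparison matters, here is a minimal sketch on synthetic data rather than the real hailinsurance dataset. The object `toy`, its column names and all its values are made up for illustration. The sketch assumes the MCD estimator from the MASS package (`cov.mcd`) as one possible robust estimator; the course may use a different one.

```r
library(MASS)  # ships with R; provides mvrnorm() and cov.mcd()

set.seed(1)

# Hypothetical stand-in data: 95 regular rows plus 5 planted unusual rows
clean <- mvrnorm(95, mu = c(5, 300), Sigma = matrix(c(1, 8, 8, 400), 2))
odd   <- mvrnorm(5, mu = c(12, 250), Sigma = diag(c(0.1, 10)))
toy   <- rbind(clean, odd)
colnames(toy) <- c("claim", "price")

# Classical estimates: every row, including the planted ones, has equal weight
classical <- list(center = colMeans(toy), cov = cov(toy))

# Robust estimates: MCD fits location and scatter to the most concentrated rows
robust <- cov.mcd(toy)

# Compare the two sets of estimates side by side
rbind(classical = classical$center, robust = robust$center)
```

In a setup like this, the planted rows pull the classical mean towards them and inflate the classical covariance, while the robust estimates should stay closer to the bulk of the data.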
This exercise is part of the course
Fraud Detection in R
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create a scatterplot
plot(hailinsurance, xlab = "price house", ylab = "claim")
# Compute the sample mean and sample covariance matrix
clcenter <- colMeans(___)
clcov <- cov(___)
# Add 97.5% tolerance ellipsoid
rad <- sqrt(qchisq(___, ___))
# ellipse() is assumed here to be car::ellipse(), with the car package already loaded
ellipse(center = clcenter, shape = clcov, radius = rad, col = "blue", lty = 2)
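As a hedged illustration of how the blanks in the template might be filled, the sketch below runs the same steps on synthetic data. The object `toy` and its values are hypothetical, not the course data. The arguments `0.975` and `2` are inferred from the 97.5% level in the comment and from the dataset having two variables. The Mahalanobis distance check is an addition for illustration and is not part of the original template.

```r
# Hypothetical stand-in for hailinsurance (made-up values)
set.seed(42)
toy <- data.frame(claim = rnorm(100, mean = 5, sd = 1),
                  price = rnorm(100, mean = 300, sd = 20))

# Classical estimates of location and scatter
clcenter <- colMeans(toy)
clcov <- cov(toy)

# Radius of the 97.5% tolerance ellipse: square root of the 0.975 quantile
# of a chi-squared distribution with one degree of freedom per variable
rad <- sqrt(qchisq(0.975, df = ncol(toy)))

# mahalanobis() returns squared distances, so take the square root;
# a point lies outside the ellipse when its distance exceeds rad
md <- sqrt(mahalanobis(toy, center = clcenter, cov = clcov))
flagged <- which(md > rad)
```

With two variables, `rad` is about 2.716. The indices in `flagged` are the rows that would fall outside the blue dashed ellipse if `clcenter`, `clcov` and `rad` were passed to `ellipse()` as in the template.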