
Multivariate outlier detection

One hundred people living in the same area have filed claims because their houses were damaged by hail during Sunday night's storm. The dataset hailinsurance contains 100 observations and 2 variables. The first column contains the payment made by the insurance company to each customer, and the second column contains the most recent house price.

In this exercise, you're first going to use classical estimators on the dataset. You will then compare the results with those of robust estimators.
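For reference, the classical estimators here are the sample mean and the sample covariance matrix, and the 97.5% tolerance ellipsoid used in the exercise is built from them. The notation below is an assumption introduced for illustration (it does not come from the course), and the chi-squared cutoff rests on the usual assumption that the data are approximately bivariate normal:

$$
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\hat{\Sigma} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \hat{\mu})(x_i - \hat{\mu})^{\top}
$$

$$
\mathrm{MD}(x) = \sqrt{(x - \hat{\mu})^{\top}\hat{\Sigma}^{-1}(x - \hat{\mu})}, \qquad
\text{ellipsoid} = \left\{ x : \mathrm{MD}(x) \le \sqrt{\chi^2_{p,\,0.975}} \right\}, \quad p = 2
$$

Observations whose Mahalanobis distance exceeds the radius fall outside the ellipsoid and are candidates for outliers.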

This exercise is part of the course

Fraud Detection in R


Interactive exercise

Complete the sample code to successfully finish this exercise.

# Create a scatterplot
plot(hailinsurance, xlab = "price house", ylab = "claim")

# Compute the sample mean and sample covariance matrix
clcenter <- colMeans(___)
clcov <- cov(___)

# Add 97.5% tolerance ellipsoid
rad <- sqrt(qchisq(___, ___))
ellipse(center = clcenter, shape = clcov, radius = rad, col = "blue", lty = 2)
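
The same steps can be sketched on simulated data using base R only. This is a minimal illustration under stated assumptions, not the course solution: the data frame `sim` and its values are hypothetical stand-ins for hailinsurance, and the check uses `mahalanobis()` instead of drawing the ellipse. The `ellipse()` call in the template takes `center`, `shape`, and `radius` arguments, which matches the function of that name in the car package, though the page does not say which package is loaded.

```r
# Hypothetical stand-in for hailinsurance (simulated, not the course data)
set.seed(1)
n <- 100
price <- rnorm(n, mean = 300000, sd = 40000)
claim <- 0.05 * price + rnorm(n, mean = 0, sd = 1500)
sim <- cbind(claim = claim, price = price)

# Classical estimates: sample mean and sample covariance matrix
clcenter <- colMeans(sim)
clcov <- cov(sim)

# Radius of the 97.5% tolerance ellipsoid:
# square root of the 0.975 chi-squared quantile with df = number of variables
rad <- sqrt(qchisq(0.975, df = ncol(sim)))

# mahalanobis() returns squared distances, so take the square root
md <- sqrt(mahalanobis(sim, center = clcenter, cov = clcov))

# Observations outside the ellipsoid are flagged as potential outliers
outliers <- which(md > rad)
```

Comparing `md` with `rad` gives the same in-or-out decision as checking visually whether a point lies inside the drawn ellipse. Because the sample mean and covariance are themselves pulled toward extreme points, real outliers can inflate the ellipse and hide inside it, which is the motivation for the comparison with robust estimators.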