
Choosing contamination

Although the implementation takes only a few lines of code, choosing a suitable contamination value requires care.

Recall that the contamination parameter only affects how IForest labels its results, not the raw scores themselves. Once IForest generates raw anomaly scores, contamination is used to choose the top n% of anomaly scores as outliers. For example, 5% contamination will mark the observations with the highest 5% of anomaly scores as outliers.
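The thresholding idea described above can be sketched with plain NumPy. This is a minimal illustration, not PyOD's actual internals: the helper `flag_top_fraction` and the randomly generated `scores` array are hypothetical stand-ins for the raw anomaly scores an estimator would produce.

```python
import numpy as np


def flag_top_fraction(scores, contamination):
    """Label the highest-scoring fraction of observations as outliers.

    Returns an array of 0/1 labels (1 = outlier) and the score cutoff used.
    Hypothetical helper for illustration only.
    """
    scores = np.asarray(scores, dtype=float)
    # The cutoff is the score at the (1 - contamination) quantile.
    threshold = np.percentile(scores, 100 * (1 - contamination))
    # Anything strictly above the cutoff is marked as an outlier.
    labels = (scores > threshold).astype(int)
    return labels, threshold


# Hypothetical raw anomaly scores (higher = more anomalous)
rng = np.random.default_rng(0)
scores = rng.normal(size=1000)

labels, threshold = flag_top_fraction(scores, contamination=0.05)
print(f"cutoff: {threshold:.3f}, flagged: {labels.sum()} of {labels.size}")
```

Changing `contamination` moves only the cutoff; the scores themselves stay the same, which is why the parameter can be adjusted without changing how observations are ranked.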

We will discuss some tuning methods in the following video; for now, you will practice setting an arbitrary value for the parameter.

The data is loaded as big_mart.

This exercise is part of the course Anomaly Detection in Python.

Exercise instructions

  • Instantiate an IForest() estimator with 5% contamination.
  • Fit the instance to the Big Mart sales data.

Hands-on interactive exercise

Complete this sample code to finish the exercise.

from pyod.models.iforest import IForest

# Instantiate an instance with 5% contamination
iforest = ____

# Fit IForest to Big Mart sales data
____