Choosing n_estimators
n_estimators is the parameter that influences model performance the most. Building IForest with enough trees ensures that the algorithm has enough generalization power to isolate the outliers from normal data points. The optimal number of trees depends on dataset size, and any number that is too high or too low will lead to inaccurate predictions.
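To make the role of the tree count more concrete, here is a minimal sketch in pure Python. It is a simplified, one-dimensional stand-in for an isolation forest, not pyod's implementation: the helper names (`path_length`, `mean_path_length`, `score_spread`), the synthetic data, and the settings `sample_size=64` and `max_depth=8` are all assumptions made for illustration. It averages isolation depths over a forest and compares how much that average varies between forests of 5 and 200 trees.

```python
import random
import statistics


def path_length(point, sample, rng, depth=0, max_depth=8):
    """Number of random splits needed to isolate `point` within `sample`."""
    if len(sample) <= 1 or depth >= max_depth:
        return depth
    lo, hi = min(sample), max(sample)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Follow only the side of the split that contains the point.
    if point < split:
        side = [v for v in sample if v < split]
    else:
        side = [v for v in sample if v >= split]
    return path_length(point, side, rng, depth + 1, max_depth)


def mean_path_length(point, data, n_trees, rng, sample_size=64):
    """Average isolation depth of `point` over `n_trees` random trees."""
    lengths = []
    for _ in range(n_trees):
        sample = rng.sample(data, sample_size)  # each tree sees its own subsample
        lengths.append(path_length(point, sample, rng))
    return statistics.mean(lengths)


def score_spread(point, data, n_trees, repeats=20):
    """Standard deviation of the forest average across differently seeded forests."""
    means = [
        mean_path_length(point, data, n_trees, random.Random(seed))
        for seed in range(repeats)
    ]
    return statistics.stdev(means)


# Synthetic data: 500 values from a standard normal distribution.
data_rng = random.Random(0)
data = [data_rng.gauss(0, 1) for _ in range(500)]
outlier, inlier = 6.0, 0.0

# A shorter average path means the point is easier to isolate.
outlier_depth = mean_path_length(outlier, data, 200, random.Random(1))
inlier_depth = mean_path_length(inlier, data, 200, random.Random(1))
print("mean depth, outlier vs inlier:", outlier_depth, inlier_depth)

# Compare how stable the outlier's average depth is for small and large forests.
spread_few = score_spread(outlier, data, n_trees=5)
spread_many = score_spread(outlier, data, n_trees=200)
print("spread with 5 trees vs 200 trees:", spread_few, spread_many)
```

In this toy setting the outlier is isolated in fewer splits than the inlier, and the average depth fluctuates less from forest to forest when more trees are used. The sketch says nothing about the right value for a real dataset, which still has to be chosen by experiment.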
Practice setting n_estimators on the big_mart dataset, which has been loaded for you along with IForest from pyod.
This exercise is part of the course Anomaly Detection in Python.
Exercise instructions
- Create an IForest() estimator with 300 iTrees.
- Fit the instance to big_mart.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create an IForest with 300 trees
iforest = ____
# Fit to the Big Mart sales data
____