Exercise

Choosing n_estimators

n_estimators, the number of trees in the forest, is one of the parameters that most strongly influences IForest performance. Building IForest with enough trees helps the algorithm reliably isolate outliers from normal data points. The optimal number of trees depends on dataset size: too few trees tend to give unstable, inaccurate anomaly scores, while too many mainly add training time with little or no gain in accuracy.

Practice setting n_estimators on the big_mart dataset, which has been loaded for you along with IForest from pyod.

Instructions

  • Create an IForest() estimator with 300 iTrees.
  • Fit the instance to big_mart.