Exercise

A more complex bagging model

Having explored the semiconductor data, let's now build a bagging classifier to predict the 'Pass/Fail' label from the input features.

The preprocessed dataset is available in your workspace as uci_secom, and training and test sets have been created for you.

As the target has a high class imbalance, use a "balanced" logistic regression as the base estimator here.
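To see what "balanced" means in practice, here is a minimal sketch of the weighting heuristic that scikit-learn documents for class_weight='balanced', namely n_samples / (n_classes * count of each class). The helper name balanced_class_weights and the 90/10 toy labels are assumptions for illustration only; they are not part of the exercise workspace.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy imbalanced target (assumption): 90 samples of class 0, 10 of class 1.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

# The rare class gets the larger weight: class 0 is about 0.556, class 1 is 5.0.
print(weights)
```

Errors on the minority class therefore count roughly nine times as much as errors on the majority class in this toy case, which keeps the model from simply predicting the majority label.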

We will also reduce the computation time for LogisticRegression by setting solver='liblinear', an optimizer that is usually faster than the default on small datasets such as this one.

Instructions

  • Instantiate a logistic regression to use as the base classifier with the parameters: class_weight='balanced', solver='liblinear', and random_state=42.
  • Build a bagging classifier using the logistic regression as the base estimator, specifying the maximum number of features as 10, and including the out-of-bag score.
  • Print the out-of-bag score so you can compare it with the accuracy.
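The three steps above might be sketched as follows. Because uci_secom and its predefined train/test split exist only in the exercise workspace, this sketch substitutes a synthetic imbalanced dataset from make_classification; the variable names clf_lr and clf_bag, the bagging random_state, and n_estimators=50 are assumptions rather than values given in the instructions. A larger n_estimators is used here only so that every training sample is left out of at least one bootstrap and the out-of-bag score is well defined.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for uci_secom (assumption): 30 features, about 7% positives.
X, y = make_classification(
    n_samples=1500, n_features=30, n_informative=8,
    weights=[0.93, 0.07], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Step 1: balanced logistic regression as the base classifier.
clf_lr = LogisticRegression(
    class_weight="balanced", solver="liblinear", random_state=42
)

# Step 2: bagging classifier, 10 features per estimator, with the OOB score.
# The base estimator is passed positionally because its keyword name differs
# between scikit-learn versions (base_estimator vs. estimator).
clf_bag = BaggingClassifier(
    clf_lr, n_estimators=50, max_features=10,
    oob_score=True, random_state=42,
)
clf_bag.fit(X_train, y_train)

# Step 3: print the OOB score next to the test-set accuracy.
test_accuracy = accuracy_score(y_test, clf_bag.predict(X_test))
print("OOB score: {:.3f}".format(clf_bag.oob_score_))
print("Test accuracy: {:.3f}".format(test_accuracy))
```

The out-of-bag score is an accuracy estimate computed on the training samples each estimator did not see, so a value close to the test-set accuracy suggests the model is not badly overfitting. With a strongly imbalanced target, accuracy alone can be misleading, so it is worth checking per-class metrics as well.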