Tuning bagging hyperparameters

While you can easily build a bagging classifier with the default parameters, tuning them is highly recommended to achieve optimal performance. Ideally, they should be optimized using K-fold cross-validation.
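
As a point of reference, here is a minimal sketch of what such a search can look like with scikit-learn's GridSearchCV. The grid values below are illustrative assumptions, not part of this exercise, and X_train and y_train are assumed to be the training data from the exercise environment.

# Sketch: tune a bagging classifier with 5-fold cross-validation.
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Illustrative grid; values are assumptions, not the exercise's settings
param_grid = {
    'n_estimators': [10, 20, 50],     # number of base estimators
    'max_samples': [0.5, 0.65, 1.0],  # fraction of training samples per estimator
    'max_features': [0.5, 1.0],       # fraction of features per estimator
}

grid = GridSearchCV(
    BaggingClassifier(LogisticRegression(solver='liblinear')),
    param_grid,
    cv=5,                 # K-fold cross-validation with K=5
    scoring='accuracy',
)
grid.fit(X_train, y_train)
print(grid.best_params_)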

In this exercise, let's see if we can improve model performance by modifying the parameters of the bagging classifier.

Here we also pass solver='liblinear' to LogisticRegression to reduce computation time.

Exercise instructions

  • Build a bagging classifier that uses the logistic regression as its base estimator, with 20 base estimators, 10 maximum features, 65% maximum samples (max_samples=0.65), and sampling without replacement (one possible completion is shown after the scaffold below).
  • Use clf_bag to predict the labels of the test set, X_test.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Imports required by this scaffold (typically preloaded in the exercise session)
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, classification_report

# Build a balanced logistic regression
clf_base = LogisticRegression(class_weight='balanced', solver='liblinear', random_state=42)

# Build and fit a bagging classifier with custom parameters
clf_bag = ____(____, ____, ____, ____, ____, random_state=500)
clf_bag.fit(X_train, y_train)

# Calculate predictions and evaluate the accuracy on the test set
y_pred = ____
print('Accuracy:  {:.2f}'.format(accuracy_score(y_test, y_pred)))

# Print the classification report
print(classification_report(y_test, y_pred))
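
For reference, here is one possible completion of the blanks above, following the exercise instructions. Note that newer scikit-learn releases rename BaggingClassifier's first parameter from base_estimator to estimator, so it is safest to pass the base model positionally.

# One possible completion: 20 base estimators, at most 10 features and
# 65% of the samples per estimator, drawn without replacement.
clf_bag = BaggingClassifier(
    clf_base,
    n_estimators=20,
    max_features=10,
    max_samples=0.65,
    bootstrap=False,      # sample without replacement
    random_state=500,
)
clf_bag.fit(X_train, y_train)

# Predict the labels of the test set
y_pred = clf_bag.predict(X_test)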