Tuning bagging hyperparameters
While you can easily build a bagging classifier with the default parameters, tuning them is highly recommended for optimal performance. Ideally, they should be optimized using k-fold cross-validation.
In this exercise, let's see if we can improve model performance by modifying the parameters of the bagging classifier.
Here we are also passing solver='liblinear' to LogisticRegression to reduce the computation time.
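Parameter values like the ones in this exercise are typically found by searching over a grid of candidates with cross-validation. The following is a minimal sketch of that workflow using scikit-learn's GridSearchCV; the grid values and the 5-fold split are illustrative assumptions, not part of this exercise, and X_train and y_train are assumed to be the same training arrays used below.

# Tune a bagging classifier with 5-fold cross-validation (illustrative grid)
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

param_grid = {
    'n_estimators': [10, 20, 50],     # number of base estimators
    'max_samples': [0.5, 0.65, 1.0],  # fraction of samples drawn per estimator
    'max_features': [5, 10],          # number of features drawn per estimator
}
bag = BaggingClassifier(LogisticRegression(solver='liblinear'), random_state=42)
grid = GridSearchCV(bag, param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)
print(grid.best_params_)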
Exercise instructions
- Build a bagging classifier using logistic regression as the base estimator, with 20 base estimators, 10 maximum features, 0.65 (65%) maximum samples (max_samples), and sampling without replacement.
- Use clf_bag to predict the labels of the test set, X_test.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import the estimator, ensemble, and metrics used below
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, classification_report

# Build a balanced logistic regression
clf_base = LogisticRegression(class_weight='balanced', solver='liblinear', random_state=42)
# Build and fit a bagging classifier with custom parameters
clf_bag = ____(____, ____, ____, ____, ____, random_state=500)
clf_bag.fit(X_train, y_train)
# Calculate predictions and evaluate the accuracy on the test set
y_pred = ____
print('Accuracy: {:.2f}'.format(accuracy_score(y_test, y_pred)))
# Print the classification report
print(classification_report(y_test, y_pred))
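For reference, one completion of the scaffold consistent with the instructions above might look as follows. It assumes BaggingClassifier has been imported from sklearn.ensemble and that X_train, y_train, X_test, and y_test are available in the session; bootstrap=False is what makes the classifier sample without replacement.

# Build a balanced logistic regression
clf_base = LogisticRegression(class_weight='balanced', solver='liblinear', random_state=42)

# Build and fit a bagging classifier with custom parameters
# (bootstrap=False draws samples without replacement)
clf_bag = BaggingClassifier(clf_base, n_estimators=20, max_features=10, max_samples=0.65, bootstrap=False, random_state=500)
clf_bag.fit(X_train, y_train)

# Calculate predictions and evaluate the accuracy on the test set
y_pred = clf_bag.predict(X_test)
print('Accuracy: {:.2f}'.format(accuracy_score(y_test, y_pred)))

# Print the classification report
print(classification_report(y_test, y_pred))

Passing the base estimator positionally keeps this sketch compatible both with older scikit-learn versions, where the first argument is named base_estimator, and with newer ones, where it is named estimator.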