Gradient boosting ensemble
Boosting is an ensemble technique in which predictors are trained sequentially, with each new predictor fitted to correct the errors of the ones before it. Gradient Boosting frames this as gradient descent on a loss function: weak decision trees are added one at a time, each fitted to reduce the current ensemble's loss, which for classification is typically the log loss. Gradient Boosting for regression works the same way but minimizes a regression loss such as mean squared error.
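To make the "each tree corrects the previous errors" idea concrete, here is a minimal hand-rolled sketch of gradient boosting for regression: with mean squared error, the negative gradient is simply the residual, so each small tree is fitted to the residuals of the running prediction. The synthetic data and hyperparameter values are illustrative choices, not part of the exercise.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Illustrative synthetic regression data
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)

learning_rate = 0.5
pred = np.full_like(y, y.mean())  # start from a constant prediction
errors = []
for _ in range(20):
    residuals = y - pred  # negative gradient of MSE w.r.t. predictions
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    pred += learning_rate * tree.predict(X)  # shrink each tree's contribution
    errors.append(np.mean((y - pred) ** 2))

# Training error shrinks as weak trees are added sequentially
```

Each individual depth-2 tree is a weak model on its own; the ensemble improves because every new tree is trained on what the current ensemble still gets wrong.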
In this exercise, you will create a Gradient Boosting Classifier model and compare its performance to the Random Forest from the previous exercise, which had an accuracy score of 72.5%.
The loan_data DataFrame has already been split and is available in your workspace as X_train, X_test, y_train, and y_test.
This exercise is part of the course
Practicing Machine Learning Interview Questions in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import
from sklearn.____ import ____
from sklearn.____ import ____, ____, ____, ____, ____
# Instantiate
gb_model = ____(____=____, learning_rate=____, random_state=123)
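A possible completed version is sketched below. Since the workspace's loan_data split isn't available here, a synthetic classification set stands in for it, and the n_estimators and learning_rate values shown are common illustrative defaults, not the exercise's required answers.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Stand-in for the loan_data split provided in the course workspace
X, y = make_classification(n_samples=1000, n_features=10, random_state=123)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=123)

# Instantiate and fit the Gradient Boosting Classifier
gb_model = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, random_state=123
)
gb_model.fit(X_train, y_train)

# Score on the held-out set to compare against the Random Forest's 72.5%
y_pred = gb_model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(acc)
```

With the real loan_data split, the resulting accuracy is what you would compare against the Random Forest baseline from the previous exercise.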