Using a pipeline
Now that you have defined the pipeline, which combines a logistic regression model with SMOTE resampling, let's run it on the data. You can treat the pipeline as if it were a single machine learning model. The data X and y are already defined, and the pipeline was defined in the previous exercise. Are you curious to find out what the model results are? Let's give it a try!
This exercise is part of the course Fraud Detection in Python.
Exercise instructions
- Split the data X and y into a training and a test set. Set aside 30% of the data for the test set, and set random_state to zero.
- Fit your pipeline on the training data and obtain the predictions by running pipeline.predict() on the X_test dataset.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Split your data X and y into a training and a test set
X_train, X_test, y_train, y_test = ____
# Fit your pipeline on the training set and obtain predictions by calling predict on the test data
pipeline.fit(____, ____)
predicted = pipeline.____(____)
# Obtain the results from the classification report and confusion matrix
print('Classification report:\n', classification_report(y_test, predicted))
conf_mat = confusion_matrix(y_true=y_test, y_pred=predicted)
print('Confusion matrix:\n', conf_mat)
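As an aid to reading the output, the sketch below shows how precision and recall for the positive (fraud) class follow from the four cells of a binary confusion matrix. The labels are a small hand-made example, not the exercise's data, and the names toy_true and toy_pred are hypothetical.

```python
from sklearn.metrics import confusion_matrix

# Hand-made labels for illustration only (1 = fraud, 0 = non-fraud)
toy_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
toy_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

# For binary labels, rows are true classes and columns are predicted classes,
# so ravel() yields the cells in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true=toy_true, y_pred=toy_pred).ravel()

precision = tp / (tp + fp)  # share of flagged cases that are truly fraud
recall = tp / (tp + fn)     # share of true fraud cases that were flagged

print("tn, fp, fn, tp:", tn, fp, fn, tp)
print("precision:", precision, "recall:", recall)
```

In fraud detection, recall on the positive class is often the number to watch first, since each false negative is a fraud case that slipped through.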