
Using ML classification to catch fraud

In this exercise, you'll see what happens when you apply a simple machine learning model to the credit card data instead of the earlier approach.

Do you think you can beat the earlier results? Remember, you caught 22 out of 50 fraud cases and had 16 false positives.
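To make the comparison concrete, the baseline figures can be turned into recall and precision, the same metrics a classification report shows. This is a small sketch using only the numbers quoted above; the variable names are illustrative, not part of the course workspace.

```python
# Baseline figures quoted in the text: 22 of 50 fraud cases caught, 16 false positives
true_positives = 22
total_fraud = 50
false_positives = 16

# Fraud cases that were missed
false_negatives = total_fraud - true_positives

# Recall: share of actual fraud cases that were caught
recall = true_positives / total_fraud

# Precision: share of flagged cases that were actually fraud
precision = true_positives / (true_positives + false_positives)

print(f"missed fraud cases: {false_negatives}")
print(f"recall={recall:.2f}, precision={precision:.2f}")
# prints: recall=0.44, precision=0.58
```

A model beats the baseline if it improves one of these numbers for the fraud class without giving up too much of the other.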

With that in mind, let's implement a logistic regression model. If you have taken the course on supervised learning in Python, you should be familiar with this model. If not, you may want to refresh your knowledge at this point. Don't worry, though: you'll be guided through the structure of the machine learning model.

The X and y variables are available in your workspace.

This exercise is part of the course

Fraud Detection in Python


Exercise instructions

  • Split X and y into training and test data, keeping 30% of the data for testing.
  • Fit your model to your training data.
  • Obtain the model's predicted labels by running model.predict on X_test.
  • Obtain a classification report comparing y_test with predicted, and use the given confusion matrix to check your results.
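The steps above can be sketched end to end on stand-in data. This is a minimal illustration, not the course solution: X_demo and y_demo are synthetic data generated with scikit-learn's make_classification (an assumption, with roughly 5% positives to mimic class imbalance), and max_iter=1000 is set only to avoid convergence warnings on this toy data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the course's X and y (assumption: about 5% positives)
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=10, weights=[0.95, 0.05], random_state=0
)

# Hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0
)

# Fit the model on the training portion only
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Predict labels for the held-out rows
predicted = model.predict(X_test)

# Compare the predictions against the true test labels
print('Classification report:\n',
      classification_report(y_test, predicted, zero_division=0))
conf_mat = confusion_matrix(y_true=y_test, y_pred=predicted, labels=[0, 1])
print('Confusion matrix:\n', conf_mat)
```

The numbers this prints describe the synthetic data only and say nothing about the credit card data used in the exercise.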

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

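# Import the scikit-learn helpers used below (can be skipped if your workspace already provides them)
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

# Create the training and testing sets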
X_train, X_test, y_train, y_test = train_test_split(____, ____, test_size=____, random_state=0)

# Fit a logistic regression model to our data
model = LogisticRegression()
model.fit(____, ____)

# Obtain model predictions
predicted = model.predict(____)

# Print the classification report and confusion matrix
print('Classification report:\n', classification_report(____, ____))
conf_mat = confusion_matrix(y_true=y_test, y_pred=predicted)
print('Confusion matrix:\n', conf_mat)
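When checking your results, it may help to unpack the confusion matrix into its four cells. For binary labels 0 and 1, scikit-learn puts true labels on the rows and predicted labels on the columns, so ravel() yields true negatives, false positives, false negatives, and true positives in that order. The labels below are a tiny hand-made example, not results from the credit card data.

```python
from sklearn.metrics import confusion_matrix

# Hand-made toy labels (illustrative only): 1 = fraud, 0 = not fraud
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 0, 1, 0]

conf_mat = confusion_matrix(y_true=y_true, y_pred=y_pred)

# Row = true label, column = predicted label, so the flattened order is tn, fp, fn, tp
tn, fp, fn, tp = conf_mat.ravel()

print(f"caught fraud (tp): {tp}")
print(f"missed fraud (fn): {fn}")
print(f"false alarms (fp): {fp}")
print(f"correctly cleared (tn): {tn}")
```

The tp and fp cells are the ones to compare against the 22 caught cases and 16 false positives mentioned earlier.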