Elbow method

In the previous exercise you implemented MiniBatch K-means with 8 clusters without checking what the right number of clusters should be. For our first fraud detection approach, it is important to get the number of clusters right, especially when you want to use the outliers of those clusters as fraud predictions. To decide how many clusters to use, let's apply the Elbow method and see what the optimal number of clusters is according to it.

X_scaled is again available for you to use and MiniBatchKMeans has been imported from sklearn.

Define the range to be between 1 and 5 clusters.
Create a MiniBatch K-means model for each number of clusters in the range, using a list comprehension.
Fit each model on the scaled data and obtain its score on the same scaled data.
Plot the cluster numbers against their respective scores. This will take a few seconds to run.
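
The steps above can be sketched roughly as follows. This is only an illustration, not the exercise's official solution: the real X_scaled comes from the course data, so a synthetic stand-in is generated here with make_blobs and StandardScaler (an assumption), and the names clustno, kmeans and score are hypothetical. The range is written as range(1, 6) on the assumption that "between 1 and 5" is meant inclusively.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for the exercise's X_scaled
X, _ = make_blobs(n_samples=1000, centers=3, n_features=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Cluster counts to try: 1 through 5 (the upper bound of range is exclusive)
clustno = range(1, 6)

# One MiniBatch K-means model per cluster count
kmeans = [MiniBatchKMeans(n_clusters=k, n_init=3, random_state=0) for k in clustno]

# Fit each model and score it on the same data.
# score() returns the negative of the K-means objective, so values
# closer to zero mean tighter clusters.
score = [model.fit(X_scaled).score(X_scaled) for model in kmeans]

# Plot the elbow curve: look for the point where the gain levels off
plt.plot(clustno, score)
plt.xlabel("Number of clusters")
plt.ylabel("Score")
plt.title("Elbow curve")
plt.show()
```

Because the score is a negated objective, the curve rises towards zero as clusters are added. The "elbow" is the cluster count after which additional clusters bring only a small improvement.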
