Calculate sum of squared errors
In this exercise, you will calculate the sum of squared errors (SSE) for different numbers of clusters, ranging from 1 to 15. This example uses a custom-created dataset to give a cleaner elbow reading.
We have loaded the normalized version of the data as data_normalized. The KMeans class from scikit-learn is already imported. We have also initialized an empty dictionary, sse = {}, to store the sum of squared errors.
Feel free to explore the data in the console.
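As a reminder of what the sum of squared errors measures, here is a minimal pure-Python sketch. The function name sum_of_squared_errors and the toy points and centroids are made up for illustration and are not part of the exercise. Each point contributes its squared Euclidean distance to the nearest centroid.

```python
def sum_of_squared_errors(points, centroids):
    """Sum, over all points, of the squared distance to the nearest centroid."""
    total = 0.0
    for point in points:
        # Squared Euclidean distance to every centroid; keep the smallest one
        nearest = min(
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in centroids
        )
        total += nearest
    return total


# Hypothetical toy data: two tight groups of two points each
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# One centroid at the overall mean versus one centroid per group
sse_one = sum_of_squared_errors(points, [(5.0, 1.0)])
sse_two = sum_of_squared_errors(points, [(0.0, 1.0), (10.0, 1.0)])

print(sse_one, sse_two)
```

On this toy data, going from one centroid to two cuts the SSE sharply. A drop like that, followed by a flattening curve, is what an elbow plot is meant to reveal.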
This exercise is part of the course
Customer Segmentation in Python
Instructions
- Fit KMeans and calculate SSE for each k with a range between 1 and 15.
  - Initialize KMeans with k clusters and random state 1.
  - Fit KMeans on the normalized dataset.
  - Assign sum of squared distances to k element of sse dictionary.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Fit KMeans and calculate SSE for each k
for k in range(____, ____):

    # Initialize KMeans with k clusters
    kmeans = ____(n_clusters=____, random_state=1)

    # Fit KMeans on the normalized dataset
    kmeans.____(data_normalized)

    # Assign sum of squared distances to k element of dictionary
    sse[____] = kmeans.____
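
For comparison, the same pattern can be sketched on synthetic data. This is an illustrative, self-contained example and not the exercise's reference solution. The names X, X_normalized, and sse_demo, the make_blobs dataset, and the explicit n_init=10 setting are all assumptions. The only scikit-learn feature it relies on beyond the exercise text is that a fitted KMeans exposes the sum of squared distances to the closest cluster center as its inertia_ attribute.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the exercise data: 300 points around 3 centers
X, _ = make_blobs(n_samples=300, centers=3, n_features=2, random_state=1)

# Normalize each feature to zero mean and unit variance
X_normalized = StandardScaler().fit_transform(X)

sse_demo = {}
for k in range(1, 16):  # k = 1, 2, ..., 15
    # n_init is set explicitly so the result does not depend on the
    # library's version-specific default
    kmeans = KMeans(n_clusters=k, random_state=1, n_init=10)
    kmeans.fit(X_normalized)
    # inertia_ holds the sum of squared distances to the closest center
    sse_demo[k] = kmeans.inertia_

for k, value in sse_demo.items():
    print(k, round(value, 2))
```

Plotting sse_demo against k would give the elbow curve. Note that range(1, 16) is used here so that k = 15 is included, since the upper bound of range is exclusive.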