Calculate sum of squared errors
In this exercise, you will calculate the sum of squared errors (SSE) for different numbers of clusters, ranging from 1 to 15. This example uses a custom-created dataset to get a cleaner elbow read.
We have loaded the normalized version of the data as data_normalized. The KMeans class from scikit-learn is already imported. We have also initialized an empty dictionary to store the sum of squared errors as sse = {}.
Feel free to explore the data in the console.
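To make the quantity concrete: for k-means, the sum of squared errors is the sum, over all points, of the squared Euclidean distance from each point to the center of its assigned cluster. The sketch below computes it by hand in plain Python for a tiny made-up dataset. The function name sse_for_assignment and the sample points are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def sse_for_assignment(points, labels):
    """Sum of squared distances from each point to its cluster's mean.

    Hypothetical helper for illustration; not part of the exercise.
    """
    # Group points by cluster label
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    total = 0.0
    for members in clusters.values():
        dims = len(members[0])
        # Cluster center = coordinate-wise mean of its members
        center = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        # Add each member's squared distance to the center
        for p in members:
            total += sum((p[d] - center[d]) ** 2 for d in range(dims))
    return total

# Made-up sample: two tight pairs of points, far apart
points = [(0, 0), (0, 2), (10, 0), (10, 2)]

sse_two_clusters = sse_for_assignment(points, [0, 0, 1, 1])
sse_one_cluster = sse_for_assignment(points, [0, 0, 0, 0])
print(sse_two_clusters, sse_one_cluster)
```

Splitting the points into two clusters gives a much smaller SSE than treating them as one, which is the kind of drop the elbow read looks for.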
This exercise is part of the course Customer Segmentation in Python
Exercise instructions
- Fit KMeans and calculate SSE for each k in a range between 1 and 15.
- Initialize KMeans with k clusters and random state 1.
- Fit KMeans on the normalized dataset.
- Assign the sum of squared distances to the k element of the sse dictionary.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Fit KMeans and calculate SSE for each k
for k in range(____, ____):

    # Initialize KMeans with k clusters
    kmeans = ____(n_clusters=____, random_state=1)

    # Fit KMeans on the normalized dataset
    kmeans.____(data_normalized)

    # Assign sum of squared distances to k element of dictionary
    sse[____] = kmeans.____
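
One possible way to fill in the template is sketched below; it is not the course's official solution. Because data_normalized only exists in the course environment, a synthetic stand-in is built here with make_blobs and StandardScaler (both are assumptions for illustration). The sketch also assumes range(1, 16) to cover k = 1 through 15 inclusive, and sets n_init=10 explicitly, which the exercise does not specify. The fitted model's inertia_ attribute holds the sum of squared distances of samples to their closest cluster center.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the course's data_normalized (assumption)
raw, _ = make_blobs(n_samples=300, centers=3, n_features=2, random_state=1)
data_normalized = StandardScaler().fit_transform(raw)

# Empty dictionary to store SSE per number of clusters
sse = {}

# Fit KMeans and calculate SSE for each k from 1 to 15
for k in range(1, 16):
    # Initialize KMeans with k clusters (n_init=10 is an assumption)
    kmeans = KMeans(n_clusters=k, random_state=1, n_init=10)

    # Fit KMeans on the normalized dataset
    kmeans.fit(data_normalized)

    # inertia_ is the sum of squared distances to the closest center
    sse[k] = kmeans.inertia_

for k, value in sse.items():
    print(k, round(value, 2))
```

Plotting the values of sse against k would then show where the decrease in SSE flattens out, which is the elbow the exercise refers to.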