Calculate sum of squared errors

In this exercise, you will calculate the sum of squared errors (SSE) for different numbers of clusters, ranging from 1 to 15. This example uses a custom-created dataset to give a cleaner elbow read.
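
For reference, the sum of squared errors of a fitted k-means model is the sum of squared distances from each sample to its assigned cluster center. The following is a minimal sketch on hypothetical random data (toy_data is not the exercise dataset) that computes this quantity by hand and compares it with the inertia_ attribute that scikit-learn's KMeans exposes after fitting. The n_init=10 argument is an assumption added here so the behaviour is the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: 200 samples, 2 features (not the exercise dataset)
rng = np.random.default_rng(0)
toy_data = rng.normal(size=(200, 2))

# n_init=10 is an assumption, set explicitly to avoid version-dependent defaults
kmeans = KMeans(n_clusters=3, random_state=1, n_init=10)
kmeans.fit(toy_data)

# Center assigned to each sample, looked up by its cluster label
assigned_centers = kmeans.cluster_centers_[kmeans.labels_]

# Sum of squared distances from each sample to its assigned center
manual_sse = ((toy_data - assigned_centers) ** 2).sum()

# The two values should agree up to floating-point rounding
print(manual_sse, kmeans.inertia_)
```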

We have loaded the normalized version of the data as data_normalized. The KMeans class from scikit-learn is already imported. We have also initialized an empty dictionary to store the sum of squared errors as sse = {}.

Feel free to explore the data in the console.

This exercise is part of the course

Customer Segmentation in Python

Instructions

  • Fit KMeans and calculate the SSE for each k in the range from 1 to 15.
  • Initialize KMeans with k clusters and a random state of 1.
  • Fit KMeans on the normalized dataset.
  • Assign the sum of squared distances to key k of the sse dictionary.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Fit KMeans and calculate SSE for each k
for k in range(____, ____):
  
    # Initialize KMeans with k clusters
    kmeans = ____(n_clusters=____, random_state=1)
    
    # Fit KMeans on the normalized dataset
    kmeans.____(data_normalized)
    
    # Assign sum of squared distances to k element of dictionary
    sse[____] = kmeans.____
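
The steps in the template above can be sketched end to end as follows. This is one possible completion under stated assumptions, not the exercise's official solution: data_normalized is replaced by a hypothetical synthetic stand-in (three blobs from make_blobs, scaled with StandardScaler), the range is assumed to include 15 (so range(1, 16); the exercise checker may expect a different upper bound), and n_init=10 is added to keep results consistent across scikit-learn versions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for data_normalized: 3 synthetic blobs, standardized
raw, _ = make_blobs(n_samples=300, centers=3, n_features=2, random_state=1)
data_normalized = StandardScaler().fit_transform(raw)

# Empty dictionary to store the sum of squared errors per k
sse = {}

# Fit KMeans and calculate SSE for each k (assumes k = 1..15 inclusive)
for k in range(1, 16):
    # Initialize KMeans with k clusters and a fixed random state
    kmeans = KMeans(n_clusters=k, random_state=1, n_init=10)

    # Fit KMeans on the normalized dataset
    kmeans.fit(data_normalized)

    # inertia_ holds the sum of squared distances to the closest center
    sse[k] = kmeans.inertia_

for k, value in sse.items():
    print(k, round(value, 2))
```

Plotting the keys of sse against its values gives the elbow plot; the k at which the curve stops dropping sharply is a reasonable candidate for the number of clusters.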