Exercise

Calculate sum of squared errors

In this exercise, you will calculate the sum of squared errors (SSE) for different numbers of clusters, ranging from 1 to 15. This example uses a custom-built dataset to give a cleaner elbow reading.
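
For reference, the sum of squared errors of a k-cluster solution is commonly defined as the total squared distance from each point to the center of its assigned cluster. The symbols C_j (the set of points in cluster j) and \mu_j (the center of cluster j) are notation assumed here, not taken from the exercise:

```latex
\mathrm{SSE}(k) = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^2
```

With k = 1 this is simply the total squared distance of all points from the overall mean, and it generally shrinks as k grows, which is what produces the elbow shape.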

We have loaded the normalized version of the data as data_normalized. The KMeans class from scikit-learn is already imported. We have also initialized an empty dictionary, sse = {}, to store the sum of squared errors.

Feel free to explore the data in the console.

Instructions

  • Fit KMeans and calculate SSE for each k from 1 to 15, inclusive.
  • Initialize KMeans with k clusters and a random state of 1.
  • Fit KMeans on the normalized dataset.
  • Assign the sum of squared distances to the k element of the sse dictionary.
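
The steps above can be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution: the exercise's data_normalized is not available here, so the sketch builds a stand-in with make_blobs and StandardScaler (both assumptions), and it sets n_init=10 explicitly, which the exercise does not mention. It relies on the fitted model's inertia_ attribute, which scikit-learn documents as the sum of squared distances of samples to their closest cluster center.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Stand-in for the exercise's preloaded data (assumption: synthetic blobs).
raw, _ = make_blobs(n_samples=300, centers=4, n_features=2, random_state=1)
data_normalized = StandardScaler().fit_transform(raw)

# Empty dictionary to hold the SSE for each k, as in the exercise.
sse = {}

# range(1, 16) covers k = 1 through 15 inclusive.
for k in range(1, 16):
    # n_init=10 is an assumption, set explicitly for consistent behavior
    # across scikit-learn versions.
    kmeans = KMeans(n_clusters=k, random_state=1, n_init=10)
    kmeans.fit(data_normalized)
    # inertia_ is the sum of squared distances to the closest center.
    sse[k] = kmeans.inertia_

for k, value in sse.items():
    print(k, round(value, 2))
```

Plotting the values of sse against k would then show where the curve bends, which is the usual way to read off an elbow.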