Exercise

Evaluating clustering methods

Clustering results are assessed with both human intuition and validity indices. These indices aim to capture desirable properties of a clustering, such as its compactness (how tightly points within a cluster sit together) and separation (how far apart different clusters are). In a Machine Learning interview, it is important to apply multiple indices when evaluating a clustering result, because each index measures a different property.

In this exercise, you will use the clValid package to evaluate the clusters produced by the DIvisive ANAlysis (DIANA) hierarchical clustering method using three internal indices: Dunn, Connectivity and Silhouette. You will try cluster counts ranging from 2 to 10. DIANA will cluster 200 mall customers described by three attributes.

The original data has been pre-processed and is available as mall_scaled. The clValid and dplyr packages are already loaded.
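To make the three indices concrete, here is a minimal sketch that computes them by hand for a single cluster count. It assumes the clValid and cluster packages are installed. The data frame `mall_demo` is a hypothetical synthetic stand-in for mall_scaled (200 rows, three scaled columns), and the choice of k = 3 is arbitrary, so the values it produces say nothing about the real customers.

```r
library(cluster)   # diana(), silhouette()
library(clValid)   # dunn(), connectivity()

# Hypothetical stand-in for mall_scaled: 200 rows, 3 scaled numeric columns
set.seed(42)
mall_demo <- as.data.frame(scale(matrix(rnorm(600), ncol = 3)))

# Pairwise Euclidean distances between customers
d <- dist(mall_demo, method = "euclidean")

# Divisive hierarchical clustering, then cut the tree into k groups
dv <- diana(d)
k <- 3  # arbitrary choice for illustration
clusters <- cutree(as.hclust(dv), k = k)

# Dunn index: higher suggests compact, well-separated clusters
dunn_index <- dunn(distance = d, clusters = clusters)

# Connectivity: lower suggests neighbours share a cluster
conn <- connectivity(distance = d, clusters = clusters)

# Average silhouette width: ranges from -1 to 1, higher is better
sil <- silhouette(clusters, d)
avg_sil <- mean(sil[, "sil_width"])

print(c(dunn = dunn_index, connectivity = conn, silhouette = avg_sil))
```

Note that the indices point in different directions: Dunn and Silhouette are to be maximized, while Connectivity is to be minimized.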

Instructions 1/3
  • Glimpse at the mall_scaled data.
  • Run the clValid() function on mall_scaled with the number of clusters ranging from 2 to 10, using the "diana" clustering method and "internal" validation measures. Save the output as results.
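
One possible way to carry out these two steps is sketched below. Since mall_scaled is only available inside the exercise session, the sketch builds a hypothetical synthetic stand-in called `mall_demo` with the same shape (200 rows, three scaled columns); in the exercise itself you would pass mall_scaled instead.

```r
library(clValid)
library(dplyr)

# Hypothetical stand-in for mall_scaled (not the real customer data)
set.seed(42)
mall_demo <- as.data.frame(scale(matrix(rnorm(600), ncol = 3)))

# Step 1: glimpse at the data
glimpse(mall_demo)

# Step 2: evaluate DIANA for 2 to 10 clusters with internal measures
results <- clValid(
  mall_demo,
  nClust = 2:10,
  clMethods = "diana",
  validation = "internal"
)

# Inspect the scores per cluster count and the best score per index
summary(results)
```

Here `nClust` sets the range of cluster counts, `clMethods` selects the clustering algorithm, and `validation = "internal"` requests the Connectivity, Dunn and Silhouette measures.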