k-means clustering and comparing results

As you now know, there are two main types of clustering: hierarchical and k-means.

In this exercise, you will create a k-means clustering model on the Wisconsin breast cancer data and compare the results to the actual diagnoses and the results of your hierarchical clustering model. Take some time to see how each clustering model performs in terms of separating the two diagnoses and how the clustering models compare to each other.

This exercise is part of the course Unsupervised Learning in R.

Exercise instructions

wisc.data, diagnosis, and wisc.hclust.clusters are still available.

Create a k-means model on wisc.data, assigning the result to wisc.km. Be sure to create 2 clusters, corresponding to the actual number of diagnoses. Also, remember to scale the data and repeat the algorithm 20 times to find a well-performing model.
Use the table() function to compare the cluster membership of the k-means model to the actual diagnoses contained in the diagnosis vector. How well does k-means separate the two diagnoses?
Use the table() function to compare the cluster membership of the k-means model to that of the hierarchical clustering model. Recall that the cluster membership of the hierarchical clustering model is contained in wisc.hclust.clusters.
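
The three steps above can be sketched on a small synthetic stand-in, since wisc.data is not reproduced here. This is only an illustration under stated assumptions: demo.data, demo.diagnosis, demo.km, and demo.hclust.clusters are hypothetical names, the two well-separated groups are simulated with rnorm(), and complete linkage is an assumed choice for the hierarchical model. It is not the exercise solution on the real data.

```r
# Synthetic stand-in for wisc.data and diagnosis (NOT the real dataset):
# two simulated groups whose features sit on very different scales.
set.seed(42)
n <- 100
demo.data <- rbind(
  cbind(rnorm(n, mean = 10, sd = 1), rnorm(n, mean = 500,  sd = 50)),
  cbind(rnorm(n, mean = 30, sd = 1), rnorm(n, mean = 1500, sd = 50))
)
colnames(demo.data) <- c("feature1", "feature2")  # hypothetical feature names
demo.diagnosis <- rep(c("B", "M"), each = n)       # stand-in labels

# k-means on scaled data: 2 centers, 20 random starts (the best run is kept)
demo.km <- kmeans(scale(demo.data), centers = 2, nstart = 20)

# Hierarchical clustering on the same scaled data, cut into 2 clusters
# (complete linkage is an assumption made for this sketch)
demo.hclust <- hclust(dist(scale(demo.data)), method = "complete")
demo.hclust.clusters <- cutree(demo.hclust, k = 2)

# Cross-tabulate k-means cluster membership against the stand-in labels
km.vs.diagnosis <- table(demo.km$cluster, demo.diagnosis)
print(km.vs.diagnosis)

# Cross-tabulate the two clusterings against each other
km.vs.hclust <- table(demo.km$cluster, demo.hclust.clusters)
print(km.vs.hclust)
```

When reading either table, keep in mind that cluster numbers are arbitrary labels: cluster 1 in one model need not correspond to cluster 1 in the other, or to any particular diagnosis. A good separation shows up as each row having most of its count in a single column, whichever column that is.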

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create a k-means model on wisc.data: wisc.km


# Compare k-means to actual diagnoses


# Compare k-means to hierarchical clustering
