Occupational wage data

1. Occupational wage data

Many types of problems are suitable for cluster analysis. In the last three chapters you encountered two common types. With the soccer lineup data, you clustered observations based on spatial data. With the wholesale spending data, you segmented customers into clusters. In this chapter you will encounter a third type of problem: you will use the tools you have learned so far to explore data that changes over time, known as time-series data.

2. Occupational wage data

You will work with data consisting of the average incomes for 22 occupations in the United States, collected from 2001 to 2016. This corresponds to a matrix in which the observations (rows) are the 22 occupations and the features (columns) are the average-income measurements for each year.
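To make that layout concrete, here is a minimal sketch of the matrix shape in plain Python. The occupation names and the `oes_like` variable are placeholders invented for illustration, and the sketch assumes exactly one measurement per year with no gaps; no real wage values are shown.

```python
# Columns: one feature per year, 2001 through 2016 inclusive
# (assumes one measurement per year, with no missing years).
years = list(range(2001, 2017))

# Rows: one observation per occupation.
# These names are placeholders, not the real occupation labels.
occupations = [f"occupation_{i}" for i in range(1, 23)]

# Placeholder matrix: None stands in for each average-income measurement.
oes_like = [[None for _ in years] for _ in occupations]

n_rows, n_cols = len(oes_like), len(oes_like[0])
print(n_rows, n_cols)  # 22 16
```

Each row is therefore one occupation's income trajectory over time, which is what the clustering will compare.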

3. Occupational wage data

This data is stored in a data matrix called oes.

4. Occupational wage data

This plot shows the trend of each occupation over time. The question we must ask ourselves is: which occupations cluster together? Or, to put it another way, are there distinct groups of occupations whose wage trends are similar?

5. Next steps: hierarchical clustering

In the next series of exercises you will go through the steps needed to analyze this data using hierarchical clustering. As we discussed in chapters 1 and 2, you will first determine whether any pre-processing steps, such as scaling or imputation, are needed for this data. Next, you will use the pre-processed data to create a distance matrix with an appropriate distance metric. Then you will use the distance matrix to build a dendrogram using a chosen linkage criterion. From this dendrogram you will select an appropriate height at which to cut the tree and extract the cluster assignments. Finally, and most importantly, you will explore the resulting clusters to determine whether they make sense and what conclusions can be drawn from them.
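The steps above can be sketched end to end in a few lines. This is an illustrative outline in plain Python, not the course's own tooling: the `wages` data is a made-up toy example (four hypothetical occupations over three years), and Euclidean distance, complete linkage, and a cut height of 5 are assumptions chosen only to keep the example small.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical toy data: 4 made-up occupations x 3 years (NOT the real oes values).
wages = {
    "A": [10.0, 11.0, 12.0],
    "B": [10.5, 11.5, 12.5],
    "C": [30.0, 32.0, 34.0],
    "D": [31.0, 33.0, 35.0],
}

def has_missing(rows):
    """Step 1: check whether imputation is needed (None marks a missing value)."""
    return any(v is None for row in rows.values() for v in row)

def distance_matrix(rows):
    """Step 2: pairwise Euclidean distances between observations."""
    return {frozenset(p): dist(rows[p[0]], rows[p[1]]) for p in combinations(rows, 2)}

def complete_linkage(c1, c2, d):
    """Distance between clusters = largest distance between any two members."""
    return max(d[frozenset((a, b))] for a in c1 for b in c2)

def agglomerate(rows, d):
    """Step 3: merge the two closest clusters repeatedly, recording each height."""
    clusters = [(name,) for name in rows]
    merges = []
    while len(clusters) > 1:
        h, i, j = min(
            (complete_linkage(clusters[i], clusters[j], d), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        c1, c2 = clusters[i], clusters[j]
        merges.append((h, c1, c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

def cut_tree(rows, merges, height):
    """Step 4: replay merges up to the cut height and label each observation."""
    clusters = [(name,) for name in rows]
    for h, c1, c2 in merges:
        if h > height:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return {name: k for k, members in enumerate(clusters, start=1) for name in members}

assert not has_missing(wages)             # no imputation needed for the toy data
d = distance_matrix(wages)
merges = agglomerate(wages, d)
labels = cut_tree(wages, merges, height=5.0)

# Step 5: explore the clusters by listing the members of each one.
for k in sorted(set(labels.values())):
    print(k, [name for name, lab in labels.items() if lab == k])
```

In this toy data the features share the same units and a similar range, so no scaling is applied. Whether scaling is appropriate for the real data is exactly what the first step asks you to decide.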

6. Let's practice!

Let's give it a shot.